This is one of my dream projects but I could never figure out how to expose the site to the public internet. I will try the port forwarding thing but I have a feeling my Xfinity router has locked that down (please correct me if I’m wrong)
I think it works on Xfinity?
https://www.xfinity.com/support/articles/port-forwarding-xfinity-wireless-gateway
Yes, that looks promising! Thanks for the research
Got it working with like 10 minutes of effort, lol. Don’t know why I struggled previously so much
One thing I previously was stumped about was getting my public IP address. Kinda surprising that I just go to a site like whatsmyip.com and get the value from there. I thought that wouldn’t work because Xfinity always rotates my public IP
Try Cloudflare tunnel (like a comment above suggested).
It creates a private connection between your home network and cloudflare, which won’t expose your home IP or network to the outside.
It’s a compromise to have cloudflare MITM your self-hosted website, but it’s better than burning through your (very generous—sarcastic) xfinity monthly cap.
Does Cloudflare Tunnel help with your bandwidth cap? Do they offer caching or something?
In theory, yes. You get access to their CDN when using a tunnel. You can set up custom caching rules to serve content from their edge network and reduce your outgoing bandwidth.
In practice, no one visits my site so I can’t test it. lol.
Definitely an interesting tool.
I’m wondering if you have any plans to add a history viewer on top of the git log to see the various experiments and how they differ from each other. I’m thinking about your “regular Joe Reviewer #2” use case, where they might be interested in the development of the work but might not be able to use git log to track it. What I’m thinking of is an HTML overlay over git log that shows cards for each experiment, maybe all the commits that went into it (if they are tagged somehow to a specific experiment), and then the performance stats under each one. This would allow you to publish the log somewhere easy (like GitHub Pages/OSF.io/your university static HTML host) so that non-technical people can interact with the work.
An additional bonus would be to add “traceability” to the work by linking experiments to each other, to visualize how each experiment builds on the others, kind of like a network.
In short, yes. Part of the reason I built logis was to support further tools I want to build on top of the structured data in the commit log. I’m more interested in collaboration tools, though, rather than just observability into experiments.
Someone exploiting an unprotected open port on the public internet is an expected outcome. Also, aren’t most databases supposed to run behind a firewall anyways? I wonder if they also never bothered changing the default PG admin password.
Just setting up a wireguard tunnel (e.g., with Tailscale or headscale) would have prevented this while still allowing remote access to the database service.
I would focus on the why of docker and on a hands-on example that can get people working with it. I wouldn’t go into depth on more advanced topics (e.g., networking, volumes) or other technology built on docker (e.g., kubernetes), as it could become an unproductive bird walk.
As far as the workshop outline, I would start with a motivating example (e.g., a simple web app with a database and a Redis data store) and compare running it on bare metal and on docker (compose).
The rest of the workshop could cover Dockerfile and docker compose syntax, potential uses, and future topics (here you can talk about the more advanced topics like the underlying docker tech, kube, and others).
Wouldn’t the total energy draw be the area under the curve here? If that’s the case, the graphs show that serving TLS actually uses about 40% more energy (eyeballing it) than regular HTTP requests. Am I missing something?
You really don’t.
The point of TLS everywhere is clear from the security and labor-economic perspectives. However, it was bewildering to read that some people believe it is also zero energy overhead.
The author now appears to stress how ISP content injection could nullify the difference, but it doesn’t sound credible. Such bottom-feeder providers were niche even in their heyday and petered out long before the Let’s Encrypt movement picked up. Also, a significant portion of worldwide HTTPS traffic is intra-app/services.
No, the difference in the total area under the curve mainly comes from the SoC clocking down. The actual difference in power usage due to “additional silicon” being active is the gap between the two curves during the period in which both are doing work.
Sounds a bit like they didn’t believe in the project, but maybe this is bitter hindsight
Even from the first posts about it, announcing the funding for it, it was very much framed as an experiment, though. Maybe it failed in a surprising place, I guess.
The way I imagine it happened is that the curl/libcurl maintainers got asked (probably insistently) to switch to Rust. They got a grant to get started and made good progress on the project, while also not being as proficient in Rust as they are in C. But the project never really reached feature parity, and people didn’t really care to push it over the finish line. I would be bitter too after investing time into something because people asked for it, only for them not to show up.
The author of Hyper created a C API for libcurl to use, so I’m not sure how much Rust the curl devs had to interact with. In libcurl, work had to be done to allow switching between HTTP backends.
Note that the article mentions two other experimental curl components implemented in Rust. They might also be on the way out, but I didn’t get that impression.
Absolutely, I’m not saying they don’t have a right to be bitter
I mostly write SQL nowadays. What I don’t get about these critiques is the problem that they are trying to solve.
SQL is quirky and sometimes opinionated. But it works well enough most of the time. A query statement is self-contained, separated into digestible chunks that map to the data extraction logic, and readable enough (if someone formats the query in a readable way).
The proposed concatenation of functions is more confusing: it is closer to how R or pandas manipulate data using objects than to how SQL builds the query from tables. I guess that if you come from an imperative language that heavily uses objects, it might be difficult to start writing SQL right away. But after having worked with data for some time, I would choose a long SQL query over a concatenation of 10-20 functions.
If you’re trying to programmatically generate queries (for example, if you’re making a generic business intelligence tool that aspires to query across databases with different schemas) it’s really painful because SQL doesn’t compose very well, and if you find a way to compose SQL queries it frequently results in really stupid query plans even though you didn’t meaningfully change the semantics of the query.
But even if you’re just using a database “normally”–writing application queries against a known schema, even those of us who have been doing this for decades still have to reference the documentation for banal things because there’s very little syntactic or grammatical consistency with respect to how you express similar features. Sometimes a subquery has to be aliased, sometimes it doesn’t. A CTE is pretty similar to subqueries, but the syntax for defining and referencing them is different (e.g., a subquery can be referenced by its alias directly, but a CTE alias has to be selected from). One can pretty easily imagine a more consistent language than SQL.
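To make the inconsistency concrete, here’s a runnable sqlite3 sketch (the table and column names are made up for illustration): the same aggregation written once as a derived table referenced by its alias, and once as a CTE that has to be selected from.

```python
# Sketch of the subquery-vs-CTE inconsistency; schema is hypothetical.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 60), (1, 70), (2, 40);
""")

# Derived table: in many dialects the subquery must be aliased (AS t),
# and you then reference the alias directly.
derived = """
    SELECT t.customer_id, t.total
    FROM (SELECT customer_id, SUM(amount) AS total
          FROM orders GROUP BY customer_id) AS t
    WHERE t.total > 100
"""

# CTE: the same logic, but defined up front with WITH and then
# SELECTed FROM by name -- a different shape for the same idea.
cte = """
    WITH totals AS (
        SELECT customer_id, SUM(amount) AS total
        FROM orders GROUP BY customer_id
    )
    SELECT customer_id, total FROM totals WHERE total > 100
"""

print(con.execute(derived).fetchall())  # [(1, 130.0)]
print(con.execute(cte).fetchall())      # [(1, 130.0)]
```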
Rationally, going to a casino or playing the lotto doesn’t make sense. As you show, you lose money in the long term.
But people play for the short term possibility to win. People win the lotto (6 right numbers) every once in a while and they haven’t been playing the lotto for 70,000 years.
Also, isn’t there another theorem that shows that the probabilities of independent events (like lotto numbers) don’t stack? Meaning, if I lose this week, I don’t have a different probability of winning next week, because the probability of losing twice in a row is lower than losing and winning (if that makes sense). I always get tripped up between the binomial theorem and this independence theorem.
Not a theorem; it is the very nature, and the definition, of independent events.
“People” don’t win the lotto every now and then. A tiny number of people win the lotto across all of history. That is a huge difference, to the point of meaning the opposite.
There is no such thing as a “short term possibility to win”; it is just a chance. Sure, you can win, but you can also be struck by lightning or bitten by a rabid dog while walking down the street.
This is a common illusion. That’s why some people wait in the casino until black comes out five times in a row and then they bet on red.
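In symbols (a sketch, with p the chance of winning a single draw and draws independent):

$$P(\text{win next draw} \mid \text{lost this draw}) = P(\text{win next draw}) = p, \qquad P(\text{lose twice in a row}) = (1-p)^2.$$

The second equation describes a pair of draws viewed before either happens; it says nothing about the next draw once you have already lost. That is all independence means here, not a separate theorem.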
I think lotto discourse tends to ignore two things:
Entertainment: The loss in expected value could easily be covered by the entertainment gained
Disposable costs: For some people that play, the loss of $1 or $2 is insignificant, but the prize money is life changing. If they saved up all the dollars from occasionally buying lotto tickets, it wouldn’t add up to all that much anyways. So they are happy making that trade - even if it’s a negative value exchange, because they don’t value what they are giving up - and that doesn’t have to be irrational.
Now of course, these arguments no longer apply when people are gambling significant amounts of money (or at least significant for them).
What’s “rational” is non-obvious to me. Maximizing expected value seems obviously rational but stuff like the St. Petersburg Paradox shows what kind of weird and undesirable consequences it has.
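For reference, the paradox in one line: a game pays 2^n if the first heads appears on toss n, so its expected value is

$$\mathbb{E} = \sum_{n=1}^{\infty} \frac{1}{2^n} \cdot 2^n = \sum_{n=1}^{\infty} 1 = \infty,$$

yet hardly anyone would pay more than a few dollars to play.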
I used to play the lottery weekly until I realised it makes more sense to buy all the unique lottery tickets I’d ever buy at once to maximise my odds. Is that accurate?
Rationally it’s a bad idea to go to the casino [if you want to maximise your savings] (or if you’re genetically predisposed to addictive behaviour). The casino is fun. Now I play roulette for fun now and again, because winning feels better than losing feels bad when playing with money I’d otherwise waste on penny whistles and moon pies, and it’s a blast. When you hit, you hit, and that rush of brain chemicals is like nothing else.
There have been isolated cases where the design of particular lottery systems could under certain circumstances have positive expected value. Because it involved MIT-affiliated people, this instance is probably one of the more famous/infamous:
https://alum.mit.edu/slice/calculated-approach-winning-lottery
But it’s worth noting that for their biggest wins that group was buying huge quantities of tickets for a single targeted drawing, with a careful distribution of number selections. And in that specific case there was no way to avoid doing so. The details, for those who are interested, were:
In a normal drawing for this lottery (“Cash WinFall”), there was a large prize for a perfect six-number match, and lesser prizes for matching fewer than six numbers.
If no ticket matched all six numbers, the top prize fund would grow for the next drawing, until it reached a set cap.
If the top prize reached the cap, it would reset by redistributing money to the lesser prizes (a “rolldown event”).
In a drawing with a “rolldown”, the expected value went positive due to the increased fund for the lesser prizes, but actually taking advantage of this required purchasing enough tickets with enough combinations of numbers to guarantee wins on the lesser prizes. For this particular lottery, the necessary quantity of tickets was in the hundreds of thousands, requiring an investment group and non-trivial amounts of money both to cover the direct purchase cost and to cover the logistical costs of selecting the numbers for and actually obtaining that many lottery tickets in a short time frame.
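Schematically (the actual figures varied per drawing, so none are given here):

$$\mathbb{E}[\text{ticket}] = \sum_{k} p_k \cdot \text{prize}_k - \text{cost},$$

summing over the prize tiers. A rolldown inflates the lesser-tier prizes enough that the sum exceeds the ticket cost, and buying enough distinct combinations is what turned that positive expectation into guaranteed wins rather than a gamble.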
The problem with box plots is that they only represent a summary of the distribution, tied to the five-number summary. As far as summary statistics go, the five-number summary is not the best, but it’s not the worst. It just gives you an idea of how the data is distributed, whether it has a long tail or a peak in the middle. But there is some data loss in the “compression” of the distribution into quartiles, especially when 25% of the distribution is clumped together in one of the quartiles, like the linked article shows.
Using histograms (or other plots like in the linked page) is another solution. They increase the complexity of the plot, so they are a little bit harder to read and interpret.
At the end of the day, most work just compares means and standard deviations, and these statistics “compress” the information about the distribution even more.
At the end of the day, use graphs that help you tell the story that you want to tell about your data. If you are interested in exploring why there is a hole in your distribution (see example in the link), then use a histogram (or a similar graph). If you are interested in discussing the shape of the distribution (e.g., the first example in the link is similar to all income distribution graphs), then a box plot might work.
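If you want to see the “compression” for yourself, here’s a quick sketch with made-up bimodal data (needs numpy and matplotlib); the box plot looks unremarkable while the histogram shows the hole:

```python
# Box plot vs. histogram on the same bimodal sample (synthetic data).
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-3, 1, 500), rng.normal(3, 1, 500)])

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.boxplot(data)        # five-number summary: the hole is invisible
ax1.set_title("box plot")
ax2.hist(data, bins=40)  # two clear peaks with a gap in the middle
ax2.set_title("histogram")
plt.show()
```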
So…they’ve reinvented inline css?
Absolutely not:
padding:12px_6px is “atomic”. Not to confuse with inline CSS:
padding: 12px 6px
There is too much whitespace in inline CSS, just unreadable.
What does atomic mean in this context, and how is the inline css not atomic?
This is what they have on their help file:
Mojo follows the Atomic CSS approach, which means that it focuses on providing a set of low-level CSS utilities that can be used to build custom user interfaces.
Maybe they are inspired by this?
My understanding is that their CSS framework is a middleware between low-level CSS (native? actual?) and high-abstraction frameworks (tailwind? bootstrap?).
I’m not sold on the value-added of something like what they are proposing over just writing CSS directly.
They also reinvented CSS classes, but they call them patterns. https://mojocss.com/docs/guide/component-abstraction#using-patterns
Mojo is one of many atomic (a.k.a. utility-first) CSS frameworks. It didn’t invent the concept. According to Let’s Define Exactly What Atomic CSS is, the term “Atomic CSS” was coined in 2013. Mojo was released a month ago.
How is atomic/utility-first CSS different from inline styles?
Next, you might wonder why anyone would use utility-first CSS over semantic CSS. The Case for Atomic / Utility-First CSS links to some arguments. (I don’t claim that utility-first CSS is always better – I think it has upsides and downsides. And of course, any specific implementation of utility-first CSS such as Mojo also has pros and cons.)
Ok but in this case the name of the class is the css to be generated.
Not really. This Mojo CSS example (source) demonstrates two things Mojo does that inline styles can’t:
There is no way to write that hover style as an inline style. As my comment’s second link said, “Inline styles can’t target states like hover or focus”.
Mojo class names such as px-5 can be much shorter than equivalent CSS rules such as padding-left: 5; padding-right: 5;.
Being able to set padding-left and padding-right to the same number makes consistency easier to achieve.
I haven’t found explicit reasons on Mojo’s website for the use of abbreviations in simpler classes such as pl-* (padding-left), but the arguments I’ve heard from Tailwind users are that you can not only type the rule faster, you can see more such rules on the screen at a time, requiring less horizontal or vertical scrolling when navigating. Some Tailwind users claim that after you learn the abbreviations they aren’t slower than regular CSS.
I really like projects like just or makesure in that they’re trying to take what is good in make and leave the clunkiness aside. However, both oversimplify the problem space in my opinion, making it hard to imagine how they would scale to a whole-program build system: how do you build, run, test, package, provision, deploy, release, decommission, etc. in a reproducible way, all within a consistent system. So far I have only found make (with all its idiosyncrasies and pain points) to fit the bill, but am really hoping there will be viable alternatives for medium scale projects.
I’m pretty sure Just is explicitly not designed to serve as the platform for a build system. The resemblance to Make is incidental.
Just is great: not a build system, it looks pretty, supports inline scripts in other languages, and most of all it doesn’t try to give you enough rope to do whatever enough rope is for.
I mostly used make to run commands for a project without using the build system stuff. Just replaced it pretty much all the time because it’s easier to use.
Yes it’s the first listed feature in the README:
Do you use “make” to store and provision an application? If yes, that’s a new one for me.
Usually make will handle dependencies and builds. The rest is handled by one or more CI/CD systems (storing the artifact, deploying the artifact, etc.).
Yes, at least in the sense that it uses Nix to provision an env/tools and Pulumi to provision cloud resources and deploy the artifacts. The key benefit is that I don’t have a distinction between CI and local: I can run everything on my machine, and it will run just the same on CI/CD. Another benefit (at least for me) is no YAML.
Make doesn’t need to store the artifact, that’s what the file system is for.
What is the common mental model that people have of JOINs?
For me, I put together the left table first. This is the data “core” that I’m interested in. Then I join stuff to this core table, and it will always be the left table.
I use WHERE clauses to filter the “core” data, and secondary conditions on the right table when joining. I don’t find adding conditions on the “core” table in ON statements to be very maintainable in the future, and I like to have the conditions on the right table in the ON statement so I don’t have to go find them at the bottom of the query if it’s a long one.
As @squadette just clarified in another comment, the oversimplified thing in that model is that the rows in the left table can “multiply” if there are multiple matching rows in the right table. If you just think of it as attaching rows from the right table to rows from the left table, you don’t get the case where the left table ends up “bigger” due to this duplication.
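A minimal runnable sketch of that multiplication effect (hypothetical tables):

```python
# One "core" left row, two matching right rows -> two result rows.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE users (id INTEGER, name TEXT);
    CREATE TABLE orders (id INTEGER, user_id INTEGER);
    INSERT INTO users VALUES (1, 'ada');
    INSERT INTO orders VALUES (10, 1), (11, 1);
""")

rows = con.execute("""
    SELECT u.name, o.id
    FROM users u
    LEFT JOIN orders o ON o.user_id = u.id
""").fetchall()

print(rows)  # [('ada', 10), ('ada', 11)] -- the single user row got duplicated
```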
If they ban “end to end” (?) encryption, they have to ban HTTPS.
TLS is for point-to-point encryption.
who says a point can’t be an end
A point can be an end, sure. You can say TLS implements “end to end encryption” between the browser and the server.
You would of course be ignoring what e2e means in practice: encrypting communications between two individuals. TLS is not e2e, as typically the website or service used to facilitate communication is not the intended recipient of said communication.
Furthermore, HTTPS typically only authenticates the server and not the client. e2e encryption authenticates both ends in practice.
Finally, legislation targeting e2e encryption, e.g. in the UK, specifically forces the companies hosting communications platforms to be able to decrypt communications going through them. None of these companies would be using TLS to implement e2e as that would only encrypt traffic to and from the services. Therefore saying “if they ban e2e they have to ban HTTPS” is false.
but if the requirement is only for communication platform companies then they’re not really banning end-to-end encryption. you can obviously use HTTPS for communication between individuals, even if it wasn’t designed for that.
You’re focusing on semantics and missing the bigger picture. Sure they’re not literally banning all end to end encryption. They’re “only” banning by far the most common deployments of e2e today, e.g. Signal, WhatsApp, etc, thereby harming the privacy of ordinary people and leaving determined criminals unaffected.
our conversation has naturally been about semantics because your initial comment was “TLS is for point-to-point encryption.” the bigger picture is another topic.
sometimes a point is an end. that’s one of the issues here: who/what an “end” is is a fuzzy concept
if i host a text file, and i use HTTPS, is that end-to-end, between me and you?
what if the url is a random ID, and i only tell you what it is?
what if i use client certs?
what if i allow you to post a comment, so it’s bidirectional?
if you ban it without answering exactly what it is, you could end up either not banning anything, or banning the whole internet
(there are other issues with banning e2ee, but one of them is defining it in the first place)
Regarding E2E vs P2P. It feels unclean to take on the terminology of the enemy, who has specifically crafted it to aid them in disempowering people. So I gather that the idea is not to ban encryption, but to ban people from using encryption with each other, while it would still be allowed to use encryption between your computer and a web server that is run by people who are approved known good guys, or something.
This doesn’t make much sense, and the posted article seems to just be writing some code using a crypto library. I don’t think the fact that you can do that is particularly relevant to the idea of this ban. The idea that you could circumvent a ban by just illegally using cryptography is obvious. I don’t mean to be negative; it’s cool how short you can write something like that (although it is not well engineered, primarily for reasons you mentioned in the bullet list).
I don’t think this terminology was first coined by “the enemy”, but as a way for chat application vendors to distinguish between apps that can encrypt messages between peers whilst still being routed over a central server versus apps that only have an encrypted connection to the central server(s). As in, “this app actually provides the security you might be looking for”.
Especially after the Snowden revelations it became quite obvious that merely having a secure connection to the intermediate server where things can be decrypted is not enough, and we need end to end encryption, even if we’re not using a true peer to peer system (which would be strictly less usable due to parties having to be online at the same time in order to be able to exchange messages).
e2ee usually refers to communication between two people, let’s call them Alice and Bob. Let’s also say that Bob gets in trouble with the police and his messages to Alice become of interest. With e2ee, the police can’t read the messages because they don’t have the keys, unless they ask Alice or Bob for their devices and read them that way. Apple’s on-device CSAM scanner was a way around weakening e2ee while still meeting the law’s intended purpose to “protect the children”.
https is different because the police can directly go to the server operator and ask for Bob’s logs, without risking tipping off Bob (or Alice) about the request (see Warrant canary for more reading).
Somewhat off topic. Does anyone know where to find a shapefile editor? I would like to set up a locator for work but I need to draw the boundary shapes first.
I agree with Simon that the impact of LLMs is akin to that of the introduction of mainframes on clerical work in the 50s (think of Hidden Figures).
I can see two different things happening. First, just like a calculator, an LLM will be able to enhance the work of a skilled person, the way the calculator/mainframe/computer helped streamline engineering work. Moreover, I am excited to see what new avenues these models could open up in the future.
On the other hand, the introduction of mainframes made certain positions completely redundant (like human computers). It will be interesting to see how LLMs will impact positions that we believed were safe from automation, like copywriters or other creative writing work. Also, it will be interesting to see how it will change/impact English Language Arts instruction. Just think about your math teacher telling you that you had to learn multiplication tables because you were not supposed to carry a calculator with you all the time. The idea of the 5-paragraph essay is probably out the window with LLMs, as most of these models can write an essay at the college level.
The things that make me excited about LLMs are the projects that integrate them with other services. LLMs make fantastic UIs for low-stakes interactions. I remember when the semantic web hype was at its peak, the vision was that companies would provide web services to interact with their models and anyone would be able to build UIs that connect to them and, especially, write tools that query dozens of data sources and merge the responses. Some of the things I’ve seen where LLMs translate natural-language queries into JSON API requests make it seem that we might be close to actually providing this kind of front end.
The hallucination problem means that I wouldn’t want them to be responsible for every step (sure, go and plan a trip for me querying a load of different airline and hotel company data sources, but please let me check that you actually got the dates right), but there’s a lot of potential here.
I think that hallucination/confabulation is inherent in the framing of the tool. Decoding notes from a few days ago, I think that there are formal reasons why a model must predict a superposition of possible logical worlds, and there will always be possible logical worlds which are inconsistent with our empirical world.
Briefly, think of text-generation pipelines as mapping from syntax to semantics (embedding text in a vector space), performing semantic transformations (predicting), and then mapping from semantics to syntax (beam-searching decoded tokens). Because syntax and semantics are adjoint, with syntax left of semantics, our primary and final mappings are multi-valued; syntax maps to an entire subspace of possible semantics, and semantics maps to a superposition of possible text completions.
I think this is precisely the kind of thing that would benefit from proper integration with a query API. If the LLM is able to do a rough mapping to some plausible API names and then query the docs to find if they’re real and refine its response based on that then it doesn’t matter if they hallucinate, the canonical data source is the thing that fixes them. If the LLM can then generate examples, then feed them into a type checker for the language and retry until they pass, then it could be great for exploring APIs.
i am an expert on ruby and gave it a small prompt and the answer was something akin to a highschooler bullshitting their way through a book report instead of a cogent answer to the question. it will probably get better but right now it’s not there for that case. i read the docs for the item and the answer was clear.
Yeah for topics I know well, I get the same thing. Sometimes it’s pretty good, but pretty often it either
produces a very wordy and shallow answer
outright lies / makes stuff up
But I was thinking about the case where you train it on Ruby docs specifically. I think the hallucinations become much rarer then, which is mentioned in the OP’s blog post.
There is apparently a way to fine tune the model based on a particular corpus.
Either way, I still think people will end up using it for programming language docs. I’ve tried it on stuff I don’t know as well, and then I google the answer afterward, and it saves me time
I think this is also possibly because Google is pretty bad now. But even though there’s a lot of annoying hype, and huge flaws, it seems inevitable.
The good thing is that there will be a lot of open source models. People are working on that and they will train it on PL docs, I’m sure
Critically, it means the Bayesian–subjectivist can make decisions as if the same hypothesis – based on the same evidence – is both true and false, depending on the reward structure of the decision ahead of them. In competition for limited resources, the Bayesian–subjectivist is going to crush the frequentist–objectivist.
At this point I’ve read dozens of Bayesian articles trash-talking frequentists. I have not once met a self-identified frequentist. I don’t believe they’re real. If they are, I have no idea what they look like. It’s like I’m reading articles about how bigfoot is an antivaxxer.
under the frequentism–objectivism school, we cannot make decisions within the framework of probability. We have to step out of this framework to make any decisions.
But didn’t you have to step out of the framework with “bayesian-subjectivism” school too? To think “there’s a 77 % chance variant A is actually better than B”, you had to have picked a prior beforehand.
In the spectrum of things, I’d say I identify as a frequentist. I find all these articles pretty tiresome, though. The connections are much stronger than articles like this acknowledge…
Generally, I find Bayesian stuff often does a poor job characterizing or thinking about important things like sampling error or data-driven analysis decisions (though of course it is possible to handle those things).
Also, I’d challenge one to come up with a useful interpretation of specific probability values without eventually appealing to some frequentist scenario like betting…
When you say you’re a frequentist, do you mean you won’t use bayesian methods? Most of the people I know are either capital-B Bayesians who think that they should never use frequentist methods, or “eh I use both whatever works” people.
Well, yeah… whatever works. I’d say I’m frequentist in that I believe frequentist methods are valid and useful, which seems to be the point of contention of the never-ending pro-Bayesian posts?
I think very few people identify as frequentists for two reasons:
(1) frequentist hypothesis testing is the norm in statistics. It’s less common for people who belong to the majority group to actively identify with it, because most people will just assume that they belong to it. For example, if you read a paper that reports results of a t-test (or whatever else), you can safely assume that they did it within a frequentist framework (i.e., null hypothesis, alternative hypothesis, reject/not reject the null hypothesis)
(2) most people think about probability in a Bayesian way regardless of whether their statistical framework is frequentist or Bayesian. That is, people think about probability as the likelihood of something being true (e.g., a p-value of 0.01 means that they are 99% sure that the null hypothesis is false). This is how probabilities should be interpreted within a Bayesian framework. It is also the wrong interpretation of probabilities in a frequentist framework, where the interpretation relies on sampling from a population and the frequency of samples with a statistic equal to or more extreme than the one observed under the null hypothesis.
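In symbols (T the test statistic, t_obs the value observed in your sample):

$$p = P\big(T \geq t_{\mathrm{obs}} \mid H_0\big),$$

i.e., the long-run frequency of samples at least as extreme as yours when the null is true, which is not the same thing as the probability that the null is false.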
The trade-off between Bayesian and frequentist statistics is between frontend and backend complexity. Bayesian gives you a clean interpretation of results at the cost of complexity in modeling and program specification. Frequentist statistics gives you a wonky interpretation of results but has more straightforward modeling/testing.
This is one of a bunch of different accounts by #mastoadmins. I find it funny that in the past month, scaling mastodon has gone from a problem no-one ever really had, to a problem that is well-documented and understood.
This is a particularly interesting example, because @nova@hachyderm.io was running the instance in her basement, and it went from 700 users (a fairly small instance) to 30,000 users (one of the biggest) in a month. It started out as a pet, had to quickly scale up to a herd of cattle, and it’s a fascinating account of how they did that.
I dunno; it was pretty obvious past a certain point, and that certain point was much, much lower than 30k.
You get to ten thousand users, you’re seeing one new signup every 90 seconds, what exactly do you think is going to happen next week? You get to 20,000, and what, you think it’s just going to stop for some reason? Doesn’t make any sense to me.
I think they address that in the article, adding more users didn’t necessarily correlate to increased load on the system. I can see there might be some extra load (new users following other new users that you weren’t already federating with), but they likely would’ve had similar size issues if they’d capped it at 10k users rather than hitting 30k by the sounds of it.
That’s an interesting thought. Instance load should correlate with the size of the network that instance follows. As you point out, there should be a point where adding an extra user makes only minimal impact on the overall network, as they will follow people who are already known by the instance. I wonder where that point is.
Timeline generation in Mastodon is surprisingly expensive; it can be a major contributor to the total load of the server. IIRC Mastodon has an optimization to stop timeline generation for inactive users, because mastodon.social had a ton of inactive users and the timeline generation for them was contributing a lot of the total load.
Sure, but even ignoring the moderation issues, it would have been much easier to move to new disks if the number of users were reasonable. You don’t have to move all the remote media, just the stuff owned by the actual local users.
Here’s a thread from mastodon (link) from the hachyderm admin team that addresses this with more of their thinking:
Kris Nóva — @nova
Was finally able to write-up a little more detail about the production problems we faced with #Hachyderm last week. This covers the migration out of our basement and into Digital Ocean and Hetzner.
I can’t say enough about the team who worked so hard on this. I shared some screenshots in the blog, but you would have to have been there to really see how powerful this group of wonderful people really is.
@recursive
It’s very impressive that you didn’t close new user registrations the whole time. Nice work, y’all deserve a rest.
Kris Nóva — @nova
We closed them for a few hours the night before the final cutover. I made this decision. I really don’t think it would have made a difference with our performance either way. The only reason for considering it was because we were worried about first impressions with users.
I’m not sure where the belief that new users are somehow impacting performance is coming from. We would need to be talking hundreds of thousands of users.
Hazel Weakly — @hazelweakly
you need to calculate a person’s home feed for the first time when an account is created and that can be expensive computationally.
In theory a huge spike of signups might knock you over, and depending on how the feed recalculation logic works, you could wind up with a potential thundering herd type of issue.
But… It’s gotta be a lot of signups. Fast enough that the server can’t keep up. It wasn’t the issue or even noticeable for us :)
my theory is that the older accounts that follow more users across instances are the bigger issue than any local-only home feed. those jobs that push and pull across the federation are the expensive ones to run, database wise, and more likely to require retries due to failures from network issues or other instance scaling issues.
Hazel Weakly — @hazelweakly
right, yeah. That’s one of the reasons we felt that new user signups are a red herring for performance concerns. There’s a lot of other more likely causes
Michael Fisher — @mjf_pro
One day I have to tell you a story from earlier in my career about an EMC VMAX 40K SAN that had 28 of 32 disk directors go offline due to a bug in the replication firmware. It wrecked a weekend and a few months beyond. What you had going on with the old NFS (can we call it “Hachyd’oh!”?) didn’t look like it had the data loss risk/impact that our incident did at least. But yeah, storage channel problems can REALLY mess up a day.
I’m still convinced that NFS would work with our setup :)
The root cause of our problems was the bad disks. We had fast response times for all GET requests, as the pg database fits into memory. But the POSTs that result in pg updates/inserts were super slow. This way we had lots of requests in flight, resulting in a high number of concurrent disk accesses - also for the sidekiqs.
Hazel Weakly — @hazelweakly
I think it would work too under ideal conditions, but I particularly didn’t like how brittle it was to tune and how easy it was to have multiple different facets of the system cause each other to fail.
Maybe it would’ve been the same with object storage? But at least failing disks on one machine wouldn’t have cascaded into making other systems slow and unresponsive
We basically needed to increase the nfsd thread count to be able to serve all concurrent accesses.
The main problem here was really the poor performance of our disks.
But you’re right, NFS is backed by one server. When that server has a problem, the whole application has a problem.
I don’t get this post. I think the author might be missing the point of collaboration.
If I’m stuck on a coding problem, I might go do something else to “reset” and usually I come back to my desk with a solution. I wouldn’t want to have a group brainstorm on it.
But, if my team is working on a feature roadmap for the next couple of years, I believe that a brainstorm might be the best way to get it outlined. I don’t believe my team will buy a roadmap that I came up with while mowing my lawn. The brainstorm session is more about buy-in and team building than pure creativity.
Not just that, together you might come to a much better design than alone. There’ve been plenty of cases where I got stuck on a legitimately difficult issue and wanted to hash out our options with a colleague, and then together we came to insights I probably wouldn’t have come up with myself, and vice versa.
Sometimes one person’s “obvious observation” is another’s “genius insight”.
And besides that, most of the typical industry work isn’t like that at all - most of it is boring plumbing work that just requires moving stuff around, reading up on APIs and trying out incremental changes.
I think you’ve got it - and while the author does seem to want to smoosh “brainstorming,” “collaboration,” and “innovation” all into one thing that can’t happen in a group of people - I don’t think the author misses the point of collaboration.
The core of it is - well, at the core of the article:
But don’t equate seeing collaboration with the outcomes of meaningful, collaborative work.
I think the author is writing for folks that think collaboration is the whole team sitting down, whipping out whiteboards AND that exact activity is the answer to every problem, cultural, technical, or otherwise.
While we’re throwing around anecdotes:
For better or for worse, at my job, collaboration is seen this way. Like the article says, because they plan the big brainstorming sessions, people show up, and then a little while later, big, innovative things happen.
But what really happened is that those big things were long in the making (and the result of one, maybe two people thinking about it away from a computer… in the shower, before falling asleep, zoning out at the aquarium) and when management threw everyone in the room together, those one or two people already knew how to argue their case, convince the rest of the folks on the team that just wanted an easy solution, and lo-and-behold - the whole team came up with an innovative solution.
Management sees collaboration and the few folks that ideated have no reason to challenge the system because it works and makes everyone’s lives easier and better.
The author would have management (The Business) wake up and realize what’s going on - but I don’t think that would help much.
Management wants successful teams at the end of the day (who knows why, really) and those one or two people that actually ideate want to continue ideating and leading by example — at least until they are under-compensated, taken for granted, or overruled too many times for less-than-optimal solutions to problems.
I think of a spreadsheet as a way to represent your data. Each data token is the intersection of a variable (usually a column in the spreadsheet) and an observation (usually a row in the spreadsheet). The intersection of variable and observation becomes a cell.
When you want to interact with your data, you can think of most operations either working with variables/columns or within an observation/row. Most spreadsheets implement this with cell-level formulas, but you can think about those formulas as either column- or row-wise operations.
Some advanced formulas (vlookup comes to mind) can work on whole columns instead of only cells.
As far as programming goes, you can think of a spreadsheet as a matrix, where you can do column, row, or cell operations as appropriate.
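To make that concrete, a toy sketch (column names made up):

```python
# Rows are observations, dict keys are variables, a cell is their intersection.
rows = [
    {"name": "a", "price": 2.0, "qty": 3},
    {"name": "b", "price": 5.0, "qty": 1},
]

total_qty = sum(r["qty"] for r in rows)   # column-wise: one variable, all rows
for r in rows:                            # row-wise: like a per-row formula
    r["subtotal"] = r["price"] * r["qty"]
first_price = rows[0]["price"]            # cell: one variable, one observation

print(total_qty, rows[0]["subtotal"], first_price)  # 4 6.0 2.0
```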
I haven’t tried Matrix/Element threads but I too prefer Zulip’s threading model to Slack’s.
You can reply to a Slack message in a thread, and the replies will be hidden from the main channel - the original message will have an extra line saying “10 replies” which you can click to view the thread in a sidebar. If you reply to a thread (or specifically subscribe to it) you get notifications for new replies, otherwise the thread is pretty much invisible once the original message scrolls out of view. I’m on some channels where people try to keep all conversations in threads, but the natural thing to do is to simply reply in the channel. If you want to add to a specific discussion once the main conversation has moved to another topic, you can either reply in a thread (and no-one will notice) or reply in the main channel (which means you are trying to change the subject back).
In Zulip, all messages in a channel (called a “stream”) belong to threads (“topics”) but the messages are also shown in the stream, ordered by time. Topics have names, kind of like email subject lines, and they are visible in a side bar with separate unread counts. You can click on a message to zoom into that topic. If you accidentally attach a message to a wrong topic, you can move it to another one.
I believe the difference stems from how the conversation is conceptualized and represented in the chat platform.
Slack (and now Element) conceptualizes threads as “secondary” to the main conversation. Maybe you want to reply to an older message after the main conversation has moved on, or you have a one-off comment on a message that doesn’t need to break up the main conversation. In this model, there is a 1:1 relationship between the channel topic and the conversation within. Threads give the option to branch off from this in a somewhat clunky way.
Zulip takes a different approach in conceptualizing conversations and threads. In their model, each conversation revolves around a topic (within a channel). This way to organize conversations is more akin to how mailing lists or forums organize content. This breaks up conversations into subtopic containers.
Both approaches have merits and drawbacks. But the main difference stems from wanting to model in the chat platform a “water cooler” conversation vs a bulletin board.
This is one of my dream projects but I could never figure out how to expose the site to the public internet. I will try the port forwarding thing but I have a feeling my Xfinity router has locked that down (please correct me if I’m wrong)
I think it works on Xfinity?
https://www.xfinity.com/support/articles/port-forwarding-xfinity-wireless-gateway
Yes, that looks promising! Thanks for the research
Got it working with like 10 minutes of effort, lol. Don’t know why I struggled previously so much
One thing I previously was stumped about was getting my public IP address. Kinda surprising that I just go to a site like whatsmyip.com and get the value from there. I thought that wouldn’t work because Xfinity always rotates my public IP
Try Cloudflare tunnel (like a comment above suggested).
It creates a private connection between your home network and cloudflare, which won’t expose your home IP or network to the outside.
It’s a compromise to have cloudflare MITM your self-hosted website, but it’s better than burning through your (very generous—sarcastic) xfinity monthly cap.
Does Cloudflare Tunnel help with your bandwidth cap? Do they offer caching or something?
In theory, yes. You get access to their CDN when using a tunnel. You can setup custom caching rules to serve content from their edge network and reduce your outgoing bandwidth.
In practice, no one visits my site so I can’t test it. lol.
Definitely an interesting tool.
I’m wondering if you have any plans to add a history viewer on top of the git log to see the various experiments and how they are different from each other. I’m thinking about your “regular Joe Reviewer #2” use case where they might be interested in the development of the work but might not be able to use git log to track it. What I’m thinking about is an HTML overlay over git log that shows cards for each experiment, maybe all the commits that went into it (if they are tagged somehow to a specific experiment), and then the performance stats under each one. This would allow you to publish the log somewhere easy (like GitHub Pages/OSF.io/your university static HTML host) so that non-technical people can interact with the work.
An additional bonus would also be to add “traceability” of the work by linking experiments to each other, so to visualize how each experiment builds onto other ones, kind of in a network.
In short, yes. Part of the reason I built logis was to support further tools I want to build on-top of the structured data in the commit log. I’m more interested in collaboration tools though, rather than just observability into experiments.
Someone exploiting an unprotected open port on the public internet is an expected outcome. Also, aren’t most databases supposed to run behind a firewall anyways? I wonder if they also never bothered changing the default PG admin password.
Just setting up a wireguard tunnel (e.g., with Tailscale or headscale) would have prevented this while still allowing remote access to the database service.
I would focus on the why of docker and on an hands-on example that can get people working with it. I wouldn’t go to I depth on more advanced topics ( e.g., networking, volumes) or other technology built of docker (e.g., kubernetes) as it could become an unproductive bird walk.
As far as the workshop outline, I would start with a motivating example (e.g., simple web app with a database and a redid data store) and compare running it on bare metal and on docker (compose).
The rest of the workshop could cover dockerfile and docker compose syntax, potential uses, and future topics (here you can talk about the more advanced topics like the underlying docker tech, kube, and other).
Wouldn’t the total energy draw be the area under the curve here? If that’s the case, the graphs show that serving TLS actually uses about 40% more energy (eyeballing it), that regular http requests. Am I missing something?
You really don’t.
The point of TLS everywhere is clear from the security and labor-economic perspectives. However it was bewildering to read some people believe it is also zero energy overhead.
The author now appears to stress how ISP content injection could nullify the difference but it doesn’t sound credible. Such bottom feeder providers were niche even in their heyday and petered out long before letsencrypt movement picked up. Also a significant portion of worldwide HTTPS traffic is intra-app/services.
No, the difference in the total area under the curve is mainly made up from the SoC clocking down. The actual difference in power usage due to “additional silicon” being active is the difference when both curves are in the period in which they do work.
Sounds a bit like they didn’t believe in the project, but maybe this is bitter hindsight
Even from the first posts about it, announcing the funding for it, it was very much framed as an experiment though. Although maybe it failed in a surprising place I guess.
The way I imagine it happened is that the curl/libcurl maintainers got asked (probably insistently) to switch to Rust. They got a grant to get started and did some good progress on the project, while also not being as proficient in Rust as they are in C. But the project never really reached feature parity, and people didn’t really care to push it over the finish line. I would be bitter too after investing time into something because people asked for it and didn’t show up.
The author of Hyper created a C API for libcurl to use, so I’m not sure how much Rust the Curl devs had to interact with. In libcurl work had to be done to allow for switching between HTTP backends.
Note that the article mentions two other experimental curl components implemented in Rust. They might also be on the way out, but I didn’t get that impression.
Absolutely, I’m not saying they don’t have a right to be bitter
I mostly write SQL nowadays. What I don’t get about these critiques is the problem that they are trying to solve.
SQL is quirky and sometimes opinionated. But it works well enough most of the time. A query statement is self contained, separated in digestible chunks that map to the data extraction logic, and is readable enough (if someone formats the query in a readable way).
The proposed concatenation of functions is more confusing, which is more similar to how R or pandas manipulates data using objects rather than how SQL builds the query from tables. I guess that if you come from an imperative language that heavily uses objects, it might be difficult to start writing SQL right away. But after having worked with data for sometimes, I would choose a long SQL query over a concatenation of 10-20 functions.
If you’re trying to programmatically generate queries (for example, if you’re making a generic business intelligence tool that aspires to query across databases with different schemas) it’s really painful because SQL doesn’t compose very well, and if you find a way to compose SQL queries it frequently results in really stupid query plans even though you didn’t meaningfully change the semantics of the query.
But even if you’re just using a database “normally”–writing application queries against a known schema, even those of us who have been doing this for decades still have to reference the documentation for banal things because there’s very little syntactic or grammatical consistency with respect to how you express similar features. Sometimes a subquery has to be aliased, sometimes it doesn’t. A CTE is pretty similar to subqueries, but the syntax for defining and referencing them is different (e.g., a subquery can be referenced by its alias directly, but a CTE alias has to be selected from). One can pretty easily imagine a more consistent language than SQL.
Rationally, going to a casino or playing the lotto doesn’t make sense. As you show, you lose money in the long term.
But people play for the short term possibility to win. People win the lotto (6 right numbers) every once in a while and they haven’t been playing the lotto for 70,000 years.
Also, isn’t there another theorem that shows that the probabilities of independent events (like lotto numbers) don’t stack? Meaning, I I lose this week, I don’t have a different probability of winning this week, because the probability of losing twice in a row is lower than losing and winning (if it makes sense). I always get tripped up between the binomial theorem and this independence theorem.
Not a theorem, it is the very nature, and the definition, of independent events.
“People” dont win the lotto every now and then. A tiny amount of people wins the lotto throughout times. That is a a huge difference to the point of meaning the opposite.
There is no such thing as “short term possibility to win”, it is just a chance. Sure, you can win, but you can also be hit by thunder or bit by a rabid dog while walking on the street.
I think lotto discourse tends to ignore two things:
Now of course, these arguments no longer apply when people are gambling significant amount of money (or at least significant for them).
What’s “rational” is non-obvious to me. Maximizing expected value seems obviously rational but stuff like the [Petersburg Paradox] shows what kind of weird and undesirable consequences it has.
What’s “rational” is non-obvious to me. Maximizing expected value seems obviously rational but stuff like the St. Petersburg Paradox shows what kind of weird and undesirable consequences it has.
I used to play the lottery weekly until I realised it makes more sense to buy all the unique lottery tickets I’d ever buy at once to maximise my odds. Is that accurate?
Rationally it’s a bad idea to go to the casino [if you want to maximise your savings] (or you’re genetically predisposed to addictive behaviour). The casino is fun. Now I play roulette for fun now and again, because winning feels better than losing feels bad when playing with money I’d otherwise waste on penny whistles and moon pies, and it’s a blast. When you hit, you hit and that rush of brain chemicals is like nothing else.
There have been isolated cases where the design of particular lottery systems could under certain circumstances have positive expected value. Because it involved MIT-affiliated people, this instance is probably one of the more famous/infamous:
https://alum.mit.edu/slice/calculated-approach-winning-lottery
But it’s worth noting that for their biggest wins that group was buying huge quantities of tickets for a single targeted drawing, with a careful distribution of number selections. And in that specific case there was no way to avoid doing so. The details, for those who are interested, were:
This is a common illusion. That’s why some people wait in the casino until black comes out five times in a row and then they bet on red.
The problem with box plots is that they only represent a summary of the distribution, linked to the 5 number summary. As far as summary statistics go, the 5 number summary is not the best but it’s not the worse. It just gives you an idea of how the data is distributed, if it has a long tail or a peak in the middle. But there is some data loss in the “compression” of the distribution in quartiles, especially in the case that 25% of the distribution is clumped together in one of the quartiles, like the linked article shows.
Using histograms (or other plots like in the linked page) is another solution. It increases the complexity of the plot, so they are a little bit harder to read and interpret.
At the end of the day, most work just compares means and standard deviations, and these statistics “compress” the information about the distribution even more.
At the end of the day, use graphs that help you tell the story that you want to tell about your data. If you are interested in exploring why there is a whole in your distribution (see example in the link), then use a histogram (or a similar graph). If you are interested in discussing the shape of the distribution (e.g., the first example in the link is similar to all income distribution graphs), then a box plot might work.
So…they’ve reinvented inline css?
Absolutely not:
padding:12px_6pxis “atomic”. Not to confuse with inline CSS:
padding: 12px 6pxThere is too much whitespace in inline CSS, just unreadable.
What does atomic mean in this context, and how is the inline css not atomic?
This is what they have on their help file:
Maybe they are inspired by this?
My understanding is that their CSS framework is a middleware between low-level CSS (native? actual?) and high-abstraction frameworks (tailwind? bootstrap?).
I’m not sold on the value-added of something like what they are proposing over just writing CSS directly.
They also reinvented CSS classes, but they call them patterns. https://mojocss.com/docs/guide/component-abstraction#using-patterns
Mojo is one of many atomic (a.k.a. utility-first) CSS frameworks. It didn’t invent the concept. According to Let’s Define Exactly What Atomic CSS is, the term “Atomic CSS” was coined in 2013. Mojo was released a month ago.
How is atomic/utility-first CSS different from inline styles?
Next, you might wonder why would anyone would use utility-first CSS over semantic CSS. The Case for Atomic / Utility-First CSS links to some arguments. (I don’t claim that utility-first CSS is always better – I think it has upsides and downsides. And of course, any specific implementation of utility-first CSS such as Mojo also has pros and cons.)
Ok but in this case the name of the class is the css to be generated.
Not really. This Mojo CSS example (source) demonstrates two things Mojo does that inline styles can’t:
hoverstyle as an inline style. As my comment’s second link said, “Inline styles can’t target states like hover or focus”.px-5can be much shorter than equivalent CSS rules such aspadding-left: 5; padding-right: 5;.padding-leftandpadding-rightto the same number makes consistency easier to achieve.pl-*(padding-left), but the arguments I’ve heard from Tailwind users are that you can not only type the rule faster, you can see more such rules on the screen at a time, requiring less horizontal or vertical scrolling when navigating. Some Tailwind users claim that after you learn the abbreviations they aren’t slower than regular CSS.I really like projects like just or makesure in that they’re trying to take what is good in make, and leave the clubkyness aside. However, both oversimplify the problem space in my opinion, making it hard to imagine how they would scale to a whole program build system: how do you build, run, test, package, provision, deploy, release, decomission, etc in a reproducible way all within a consistent system. So far I have only found make (with all it’s idiosyncrasies and pain points) to fit the bill, but am really hoping there will be viable alternatives for medium scale projects.
I’m pretty sure Just is explicitly not designed to serve as the platform for a build system. The resemblance to Make is incidental.
Just is great: not a build system, it looks pretty, supports inline scripts in other languages, and most of all it doesn’t try to give you enough rope to do whatever enough rope is for.
I mostly used make to run commands for a project without using the build-system stuff. Just has pretty much replaced it for me because it’s easier to use.
Yes it’s the first listed feature in the README:
Do you use “make” to store and provision an application? If yes, that’s a new one for me.
Usually make will handle dependencies and builds. The rest is handled by one or more CI/CD systems (storing the artifact, deploying the artifact, etc).
Yes, at least in the sense that it uses Nix to provision an env/tools and Pulumi to provision cloud resources and deploy the artifacts. The key benefit is that I don’t have a distinction between CI and local: I can run everything on my machine, and it will run just the same on CI/CD. Another benefit (at least for me) is no YAML.
Make doesn’t need to store the artifact, that’s what the file system is for.
What is the common mental model that people have of JOINs?
For me, I put together the left table first. This is the data “core” that I’m interested in. Then I join stuff to this core table and it will always be the left table.
I use WHERE clauses to filter the “core” data, and secondary conditions on the right table when joining. I don’t find adding conditions on the “core” table in ON statements to be very maintainable, and I like to have the conditions on the right table in the ON statement so I don’t have to go find them at the bottom of the query if it’s a long one.
As @squadette just clarified in another comment, the oversimplified thing in that model is that the rows in the left table can “multiply” if there are multiple matching rows in the right table. If you just think of it as attaching rows from the right table to rows from the left table, you don’t get the case where the left table ends up “bigger” due to this duplication.
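A quick pandas sketch of that "multiplying" behavior (my example with invented data; merge plays the role of a SQL left join):

  # How a left join can "multiply" rows in the left table
  # when the right table has several matching rows.
  import pandas as pd

  core = pd.DataFrame({"user_id": [1, 2]})                        # left/"core" table
  orders = pd.DataFrame({"user_id": [1, 1, 1], "total": [5, 7, 9]})

  joined = core.merge(orders, on="user_id", how="left")
  print(len(core), "->", len(joined))   # 2 -> 4: user 1 now appears 3 times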
If they ban “end to end” (?) encryption, they have to ban HTTPS.
TLS is for point-to-point encryption.
who says a point can’t be an end
A point can be an end, sure. You can say TLS implements “end to end encryption” between the client and the server.
You would of course be ignoring what e2e means in practice: encrypting communications between two individuals. TLS is not e2e, as typically the website or service used to facilitate communication is not the intended recipient of said communication.
Furthermore, HTTPS typically only authenticates the server and not the client. e2e encryption authenticates both ends in practice.
Finally, legislation targeting e2e encryption, e.g. in the UK, specifically forces the companies hosting communications platforms to be able to decrypt communications going through them. None of these companies would be using TLS to implement e2e as that would only encrypt traffic to and from the services. Therefore saying “if they ban e2e they have to ban HTTPS” is false.
but if the requirement is only for communication platform companies then they’re not really banning end-to-end encryption. you can obviously use HTTPS for communication between individuals, even if it wasn’t designed for that.
You’re focusing on semantics and missing the bigger picture. Sure they’re not literally banning all end to end encryption. They’re “only” banning by far the most common deployments of e2e today, e.g. Signal, WhatsApp, etc, thereby harming the privacy of ordinary people and leaving determined criminals unaffected.
our conversation has naturally been about semantics because your initial comment was “TLS is for point-to-point encryption.” the bigger picture is another topic.
i have no idea what you two are talking about.
sometimes a point is an end. that’s one of the issues here: who/what an “end” is is a fuzzy concept
if you ban it without answering exactly what it is, you could end up either not banning anything, or banning the whole internet (there are other issues with banning e2ee, but one of them is defining it in the first place)
Regarding E2E vs P2P: it feels unclean to take on the terminology of the enemy, who has specifically crafted it to aid them in disempowering people. So I gather the idea is not to ban encryption, but to ban people from using encryption with each other, while it would still be allowed to use encryption between your computer and a web server that is run by people who are approved known good guys, or something.
This doesn’t make much sense, and the posted article seems to be writing some code using a crypto library. I don’t think the fact that you can do that is particularly relevant to the idea of this ban. The idea that you could circumvent a ban by just illegally using cryptography is obvious. I don’t mean to be negative; it’s cool how short you can write something like that (although it is not well engineered, primarily for reasons you mentioned in the bullet list).
I don’t think this terminology was first coined by “the enemy”, but as a way for chat application vendors to distinguish between apps that can encrypt messages between peers whilst still being routed over a central server versus apps that only have an encrypted connection to the central server(s). As in, “this app actually provides the security you might be looking for”.
Especially after the Snowden revelations it became quite obvious that merely having a secure connection to the intermediate server where things can be decrypted is not enough, and we need end to end encryption, even if we’re not using a true peer to peer system (which would be strictly less usable due to parties having to be online at the same time in order to be able to exchange messages).
e2ee usually refers to communication between two people, let’s call them Alice and Bob. Let’s also say that Bob gets in trouble with the police and his messages to Alice become of interest. With e2ee, the police can’t read the messages because they don’t have the keys, unless they ask Alice or Bob for their devices and read them that way. Apple’s on-device CSAM scanner was a way around weakening e2ee while meeting the law’s intended purpose to “protect the children”.
https is different because the police can go directly to the server operator and ask for Bob’s logs, without risking tipping off Bob (or Alice) about the request (see Warrant canary for more reading).
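For anyone who wants to see the Alice/Bob model in code, here's a minimal sketch using PyNaCl (my illustration, not from the posted article; assumes pip install pynacl):

  # Minimal e2e encryption sketch between Alice and Bob with PyNaCl.
  from nacl.public import PrivateKey, Box

  alice_sk = PrivateKey.generate()
  bob_sk = PrivateKey.generate()

  # Alice encrypts with her private key and Bob's public key.
  alice_box = Box(alice_sk, bob_sk.public_key)
  ciphertext = alice_box.encrypt(b"meet at noon")

  # Any server relaying `ciphertext` can't read it; only Bob's private
  # key (plus Alice's public key) opens the box.
  bob_box = Box(bob_sk, alice_sk.public_key)
  assert bob_box.decrypt(ciphertext) == b"meet at noon"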
Somewhat off topic: does anyone know where to find a shapefile editor? I would like to set up a locator for work, but I need to draw the boundary shapes first.
There are some great JavaScript libraries that can draw shapes as GeoJSON - I like leaflet-freedraw for example, which I wrote about here: https://simonwillison.net/2021/Jan/24/drawing-shapes-spatialite/
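If the end goal is a shapefile specifically, one way (a sketch, assuming geopandas is installed; boundaries.geojson is a hypothetical file you drew) is to convert the drawn GeoJSON:

  # Convert drawn GeoJSON boundaries into an ESRI shapefile with geopandas.
  import geopandas as gpd

  gdf = gpd.read_file("boundaries.geojson")  # hypothetical file from the drawing tool
  gdf.to_file("boundaries.shp")              # default driver writes an ESRI shapefile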
I agree with Simon that the impact of LLMs is akin to that of the introduction of mainframes on clerical work in the 50s (think of Hidden Figures).
I can see two different things happening. First, just like a calculator, LLMs will be able to enhance the work of a skilled person, the way the calculator/mainframe/computer streamlined engineering work. Moreover, I am excited to see what new avenues these models could open up in the future.
On the other hand, the introduction of mainframes made certain positions completely redundant (like human computers). It will be interesting to see how LLMs will impact positions that we believed were safe from automation, like copywriters or other creative-writing work. It will also be interesting to see how they change English Language Arts instruction. Just think about your math teacher telling you that you had to learn multiplication tables because you were not supposed to carry a calculator with you all the time. The idea of the 5-paragraph essay is probably out the window with LLMs, as most of these models can write an essay at the college level.
The things that make me excited about LLMs are the projects that integrate them with other services. LLMs make fantastic UIs for low-stakes interactions. I remember when the semantic web hype was at its peak, the vision was that companies would provide web services to interact with their models and anyone would be able to build UIs that connect to them and, especially, write tools that query dozens of data sources and merge the responses. Some of the things I’ve seen where LLMs translate natural-language queries into JSON API requests make it seem that we might be close to actually providing this kind of front end.
The hallucination problem means that I wouldn’t want them to be responsible for every step (sure, go and plan a trip for me querying a load of different airline and hotel company data sources, but please let me check that you actually got the dates right), but there’s a lot of potential here.
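To make that concrete, a hedged sketch of the pattern: have the model emit JSON only, validate it, and keep the human in the loop before executing. call_llm is a hypothetical stand-in for whatever model client you'd use, and the field names are invented.

  # Sketch: LLM as a natural-language frontend to a JSON API.
  import json

  SYSTEM = (
      "Translate the user's request into JSON for a flight-search API "
      "with exactly the fields origin, destination, depart_date. "
      "Reply with JSON only."
  )

  def to_api_request(query: str) -> dict:
      raw = call_llm(SYSTEM + "\n\nRequest: " + query)  # hypothetical model client
      req = json.loads(raw)           # raises if the model didn't return pure JSON
      extra = set(req) - {"origin", "destination", "depart_date"}
      if extra:                       # guard against hallucinated fields
          raise ValueError(f"unexpected fields: {extra}")
      return req                      # let the user confirm the dates before querying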
It’s probably going to end up being used a lot for programming language docs.
I know how to get through https://www.python.org/doc/ by this point
But if I pick an unfamiliar language like https://ruby-doc.org/ , I would probably prefer to query it in natural language
Though I also hesitate because of the hallucination issue. I know teams have been working on attribution
If they can solve that problem it would be great
My hunch is though that it breaks some properties of the model, and blows up the model sizes
I think that hallucination/confabulation is inherent in the framing of the tool. Decoding notes from a few days ago, I think that there are formal reasons why a model must predict a superposition of possible logical worlds, and there will always be possible logical worlds which are inconsistent with our empirical world.
Briefly, think of text-generation pipelines as mapping from syntax to semantics (embedding text in a vector space), performing semantic transformations (predicting), and then mapping from semantics to syntax (beam-searching decoded tokens). Because syntax and semantics are adjoint, with syntax left of semantics, our primary and final mappings are multi-valued; syntax maps to an entire subspace of possible semantics, and semantics maps to a superposition of possible text completions.
I think this is precisely the kind of thing that would benefit from proper integration with a query API. If the LLM is able to do a rough mapping to some plausible API names and then query the docs to find if they’re real and refine its response based on that then it doesn’t matter if they hallucinate, the canonical data source is the thing that fixes them. If the LLM can then generate examples, then feed them into a type checker for the language and retry until they pass, then it could be great for exploring APIs.
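Something like this loop (a sketch; generate_example and type_check are hypothetical stand-ins for the LLM call and the target language's checker, e.g. ruby -c, tsc, or mypy):

  # Generate -> check against the canonical source -> feed errors back -> retry.
  def grounded_example(api_name: str, max_tries: int = 3) -> str:
      feedback = ""
      for _ in range(max_tries):
          code = generate_example(api_name, feedback)  # LLM draft (hypothetical)
          ok, errors = type_check(code)                # canonical source of truth
          if ok:
              return code                              # hallucinations filtered out
          feedback = errors                            # retry with checker errors
      raise RuntimeError(f"no valid example for {api_name} after {max_tries} tries")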
i am an expert on ruby and gave it a small prompt and the answer was something akin to a highschooler bullshitting their way through a book report instead of a cogent answer to the question. it will probably get better but right now it’s not there for that case. i read the docs for the item and the answer was clear.
Yeah for topics I know well, I get the same thing. Sometimes it’s pretty good, but pretty often it either
But I was thinking about the case where you train it on Ruby docs specifically. I think the hallucinations become much rarer then, which is mentioned in the OP’s blog post.
There is apparently a way to fine tune the model based on a particular corpus.
Either way, I still think people will end up using it for programming language docs. I’ve tried it on stuff I don’t know as well, and then I google the answer afterward, and it saves me time
I think this is also possibly because Google is pretty bad now. But even though there’s a lot of annoying hype, and huge flaws, it seems inevitable.
The good thing is that there will be a lot of open source models. People are working on that and they will train it on PL docs, I’m sure
At this point I’ve read dozens of Bayesian articles trash-talking frequentists. I have not once met a self-identified frequentist. I don’t believe they’re real. If they are, I have no idea what they look like. It’s like I’m reading articles about how bigfoot is an antivaxxer.
But didn’t you have to step out of the framework with “bayesian-subjectivism” school too? To think “there’s a 77% chance variant A is actually better than B”, you had to have picked a prior beforehand.
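For concreteness, that kind of statement typically comes out of something like a Beta-Bernoulli A/B model, where the Beta(1, 1) prior is exactly the "picked beforehand" part (a sketch, all counts invented):

  # P(variant A beats B) under a Beta-Bernoulli model.
  import numpy as np

  rng = np.random.default_rng(0)
  a_conv, a_n = 120, 1000   # conversions / trials for A (made up)
  b_conv, b_n = 100, 1000   # conversions / trials for B (made up)

  # Posterior = Beta(prior_alpha + successes, prior_beta + failures); prior is Beta(1, 1)
  a_samples = rng.beta(1 + a_conv, 1 + a_n - a_conv, size=100_000)
  b_samples = rng.beta(1 + b_conv, 1 + b_n - b_conv, size=100_000)

  print("P(A > B) ~", (a_samples > b_samples).mean())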
In the spectrum of things, I’d say I identify as a frequentist. I find all these articles pretty tiresome, though. The connections are much stronger than articles like this acknowledge…
Generally, I find Bayesian stuff often does a poor job characterizing or thinking about important things like sampling error or data-driven analysis decisions (though of course it is possible to handle those things).
Also, I’d challenge one to come up with a useful interpretation of specific probability values without eventually appealing to some frequentist scenario like betting…
When you say you’re a frequentist, do you mean you won’t use bayesian methods? Most of the people I know are either capital-B Bayesians who think that they should never use frequentist methods, or “eh I use both whatever works” people.
Well, yeah… whatever works. I’d say I’m frequentist in that I believe frequentist methods are valid and useful, which seems to be the point of contention of the never-ending pro-Bayesian posts?
It’s unlikely that you’re actually a frequentist. Try this quiz to find out https://www.youtube.com/watch?v=GEFxFVESQXc
What’s the probability given that I’ve never seen a Black Swan?
I think very few people identify as frequentists for two reasons:
(1) frequentist hypothesis testing is the norm in statistics. It’s less common for people in the majority group to actively identify with it, because most people will just assume they belong to it. For example, if you read a paper that reports the results of a t-test (or whatever else), you can safely assume it was done within a frequentist framework (i.e., null hypothesis, alternative hypothesis, reject/fail to reject the null hypothesis)
(2) most people think about probability in a Bayesian way regardless of whether their statistical framework is frequentist or Bayesian. That is, people think of a probability as the likelihood of something being true (e.g., a p-value of 0.01 means they are 99% sure the null hypothesis is false). That is how probabilities should be interpreted within a Bayesian framework, but it is the wrong interpretation of probabilities in a frequentist framework, where the interpretation relies on repeated sampling from a population: the frequency of samples with a statistic equal to or more extreme than the one observed, under the null hypothesis.
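To make the frequentist reading concrete, a small simulation sketch (all numbers invented): the p-value is just the fraction of samples drawn under the null whose statistic is at least as extreme as the observed one.

  # Frequentist p-value as a frequency over repeated null samples.
  import numpy as np

  rng = np.random.default_rng(0)
  observed_mean = 0.25          # made-up observed statistic
  n, sims = 50, 100_000

  # Null hypothesis: data ~ Normal(0, 1); draw many samples of size n.
  null_means = rng.normal(0, 1, size=(sims, n)).mean(axis=1)
  p_value = (np.abs(null_means) >= observed_mean).mean()  # two-sided

  # Note this is NOT "P(null is false) = 1 - p"; it's a statement
  # about the sampling distribution under the null.
  print(p_value)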
The trade-off between Bayesian and frequentist statistics is between frontend and backend complexity. Bayesian gives you a clean interpretation of results at the cost of complexity in modeling and program specification. Frequentist statistics gives you a wonky interpretation of results but has more straightforward modeling/testing.
This is one of a bunch of different accounts by #mastoadmins. I find it funny that in the past month, scaling mastodon has gone from a problem no-one ever really had, to a problem that is well-documented and understood.
This is a particularly interesting example, because @nova@hachyderm.io was running the instance in her basement, and it went from 700 users (a fairly small instance) to 30,000 (one of the biggest) in a month. It started out as a pet, and had to quickly scale up to a herd of cattle, and it’s a fascinating account of how they did that.
The money quote of this account:
I believe it’s a paraphrase of the famous quote of John Gall: https://www.goodreads.com/quotes/9353506-a-complex-system-that-works-is-invariably-found-to-have
It didn’t have to, by the way. Leaving sign-ups open up to 30k users, all while hosting in the basement, was a choice.
A choice that was clear in hindsight, not clear during the initial investment.
I dunno; it was pretty obvious past a certain point, and that certain point was much, much lower than 30k.
You get to ten thousand users, you’re seeing one new signup every 90 seconds, what exactly do you think is going to happen next week? You get to 20,000, and what, you think it’s just going to stop for some reason? Doesn’t make any sense to me.
I think they address that in the article, adding more users didn’t necessarily correlate to increased load on the system. I can see there might be some extra load (new users following other new users that you weren’t already federating with), but they likely would’ve had similar size issues if they’d capped it at 10k users rather than hitting 30k by the sounds of it.
That’s an interesting thought. Instance load should correlate with the size of the network that instance follows. As you point out, there should be a point where adding an extra user makes only minimal impact on the overall network, as they will follow people who are already known by the instance. I wonder where that point is.
Timeline generation in Mastodon is surprisingly expensive; it can be a major contributor to the total load of the server. IIRC Mastodon has an optimization to stop timeline generation for inactive users, because mastodon.social had a ton of inactive users whose timeline generation was contributing a lot of the total load.
From TFA:
Sure, but even ignoring the moderation issues, it would have been much easier to move to new disks if the number of users were reasonable. You don’t have to move all the remote media, just the stuff owned by the actual local users.
Pretty bold choice, too.
Here’s a thread from mastodon (link) from the hachyderm admin team that addresses this with more of their thinking:
[Embedded thread between Kris Nóva (@nova), Hazel Weakly (@hazelweakly), dominic (@dma), Michael Fisher (@mjf_pro), and Malte Janduda (@malte), replying to @recursive; message text not reproduced here.]
I don’t get this post. I think the author might be missing the point of collaboration.
If I’m stuck on a coding problem, I might go do something else to “reset” and usually I come back to my desk with a solution. I wouldn’t want to have a group brainstorm on it.
But, if my team is working on a feature roadmap for the next couple of years, I believe a brainstorm might be the best way to get it outlined. I don’t believe my team will buy into a roadmap that I came up with while mowing my lawn. The brainstorm session is more about buy-in and team building than pure creativity.
Not just that, together you might come to a much better design than alone. There’ve been plenty of cases where I got stuck on a legitimately difficult issue and wanted to hash out our options with a colleague, and then together we came to insights I probably wouldn’t have come up with myself, and vice versa.
Sometimes one person’s “obvious observation” is another’s “genius insight”.
And besides that, most of the typical industry work isn’t like that at all - most of it is boring plumbing work that just requires moving stuff around, reading up on APIs and trying out incremental changes.
I think you’ve got it - and while the author does seem to want to smoosh “brainstorming,” “collaboration,” and “innovation” all into one thing that can’t happen in a group of people - I don’t think the author misses the point of collaboration.
The core of it is - well, at the core of the article:
I think the author is writing for folks that think collaboration is the whole team sitting down, whipping out whiteboards AND that exact activity is the answer to every problem, cultural, technical, or otherwise.
While we’re throwing around anecdotes:
For better or for worse, at my job, collaboration is seen this way. Like the article says, because they plan the big brainstorming sessions, people show up, and then a little while later, big, innovative things happen.
But what really happened is that those big things were long in the making (and the result of one, maybe two people thinking about it away from a computer… in the shower, before falling asleep, zoning out at the aquarium) and when management threw everyone in the room together, those one or two people already knew how to argue their case, convince the rest of the folks on the team that just wanted an easy solution, and lo-and-behold - the whole team came up with an innovative solution.
Management sees collaboration and the few folks that ideated have no reason to challenge the system because it works and makes everyone’s lives easier and better.
The author would have management (The Business) wake up and realize what’s going on - but I don’t think that would help much.
Management wants successful teams at the end of the day (who knows why, really) and those one or two people that actually ideate want to continue ideating and leading by example — at least until they are under-compensated, taken for granted, or overruled too many times for less-than-optimal solutions to problems.
I think of a spreadsheet as a way to represent your data. Each data token is the intersection of a variable (usually a column in the spreadsheet) and an observation (usually a row in the spreadsheet). The intersection of variable and observation becomes a cell.
When you want to interact with your data, you can think of most operations either working with variables/columns or within an observation/row. Most spreadsheets implement this with cell-level formulas, but you can think about those formulas as either column- or row-wise operations.
Some advanced formulas (vlookup comes to mind) can work on whole columns instead of only cells.
As far as programming goes, you can think of a spreadsheet as a matrix, where you can do column, row, or cell operations as appropriate.
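A rough sketch of that model in pandas (my example with invented data; merge plays the role of vlookup):

  # Spreadsheet-as-matrix: column, row, and cell operations.
  import pandas as pd

  df = pd.DataFrame({"name": ["a", "b"], "price": [10, 20], "qty": [3, 1]})

  df["total"] = df["price"] * df["qty"]    # column-wise (per-variable) operation
  row = df.loc[0]                          # an observation/row
  cell = df.at[0, "price"]                 # a single cell

  # vlookup-style: match on a key column across tables
  rates = pd.DataFrame({"name": ["a", "b"], "tax": [0.1, 0.2]})
  df = df.merge(rates, on="name", how="left")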
Great that threads were added. A shame that they are slack-threads and not zulip/email/reddit-threads
I haven’t used Slack in years, what is the difference between their threading model and others?
IIUC, Slack threads branch only from a top-level message in a channel. Subthreads aren’t really a thing.
I haven’t tried Matrix/Element threads but I too prefer Zulip’s threading model to Slack’s.
You can reply to a Slack message in a thread, and the replies will be hidden from the main channel - the original message will have an extra line saying “10 replies” which you can click to view the thread in a sidebar. If you reply to a thread (or specifically subscribe to it) you get notifications for new replies, otherwise the thread is pretty much invisible once the original message scrolls out of view. I’m on some channels where people try to keep all conversations in threads, but the natural thing to do is to simply reply in the channel. If you want to add to a specific discussion once the main conversation has moved to another topic, you can either reply in a thread (and no-one will notice) or reply in the main channel (which means you are trying to change the subject back).
In Zulip, all messages in a channel (called a “stream”) belong to threads (“topics”) but the messages are also shown in the stream, ordered by time. Topics have names, kind of like email subject lines, and they are visible in a side bar with separate unread counts. You can click on a message to zoom into that topic. If you accidentally attach a message to a wrong topic, you can move it to another one.
I believe the decision stems from how the conversation is conceptualized to be represented in the chat thread.
Slack (and now Element) conceptualize threads as “secondary” to the main conversation. Maybe you want to reply to an older message after the main conversation has moved on, or you have a one-off comment on a message that doesn’t need to break up the main conversation. In this model, there is a 1:1 relationship between the channel topic and the conversation within it. Threads give the option to branch off from this in a somewhat clunky way.
Zulip takes a different approach in conceptualizing conversations and threads. In their model, each conversation revolves around a topic (within a channel). This way to organize conversations is more akin to how mailing lists or forums organize content. This breaks up conversations into subtopic containers.
Both approaches have merits and drawbacks. But the main difference stems from wanting the chat platform to model a “water cooler” conversation vs. a bulletin board.
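A rough sketch of the difference as data (my simplification, not either product's actual schema):

  # Slack: threads are an optional branch off one message in the channel.
  # Zulip: every message must live in a named topic within a stream.
  from dataclasses import dataclass
  from typing import Optional

  @dataclass
  class SlackMessage:
      channel: str
      text: str
      thread_parent: Optional[int] = None  # None = main conversation;
                                           # set = reply tucked into a thread

  @dataclass
  class ZulipMessage:
      stream: str
      topic: str   # required, like an email subject line or forum thread title
      text: str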