1.

The solution is to kill the gnome-software process if you want to use ‘Software’ more than once per session. Apparently this has been an issue for at least 12 months

Feel like this was an issue for me back in 2018 and stuck with me until I left fedora.

1. 6

Starts off pretentious and goes to the moon. Pretty woodworking, though.

1. 6

I was that age once. I thought I knew it all.

Time has certainly humbled me.

1. 6

Modern software engineering is rotten. I should know — it’s been my livelihood since I graduated from college in late 2019.

Yeah sheesh.

1.

tbf he’s about the same age I was when I first discovered formal methods, so sometimes 25-yo energy leads to good things when you age a bit

1. 10

The only conference I truly love is !!con, about the joy and whimsy of computing. Where else can you see someone build a Rube Goldberg of web services for his daughter at college so that when she:

1. sends a certain text message, it
2. triggers a tone at home that the family pets have learned means to come running, and then
3. a 3d printed treat dispenser gives them treats in a certain spot, so that
4. they’re in the right position for the automatic camera, which
5. snaps a shot of them and texts it back to her, so she can see her pets
1.

Awww dang I’m gonna be on flights all of !!con :(

1.

We haven’t announced anything for 2023 yet so there’s still a chance :)

1. 10

(on stuff existing forever) Did it ever work for anyone? — No…. But it might work for us!

The hubris

1. 7

Unicode contains the letter 𓂺, an Egyptian hieroglyphic invented thousands of years ago. Unexpected things can endure for a long time.

1. 8

I feel like you chose this particular example of a hieroglyphic letter for a specific reason but I can’t put my finger on it.

1.

Shows up as a blank box to me, which I feel is meaningful in this context.

1.

That’s because a lot of default OS fonts leave that symbol, and only that symbol, out. Shows up fine on Android though.

1.

It’s shown for me in Chrome on macOS, but not in Chrome on Windows.

For those who can’t see it, it’s a penis, shown in the act of either micturition or ejaculation.

1.

I can’t imagine a more appropriate single-character interpretation of human history.

2.

That’s deflating.

1. 6

Is it natural for me to have skeptical feelings for something like this?… The reasoning being, there has already been so many man hours poured into FP languages, by various experts, and it hasn’t been “paradigm changing”. On top of that, optimizing FP seems to take a lot of work and knowledge. You’re telling me they’ve got all the experts, and hammered something out (which itself sounds inadequate to achieve something great here), which is better than GHC or similar? It’s incredible if true. Like insanely incredible. I would love for someone to explain to me why Verse has a chance at all.

1. 22

… The reasoning being, there has already been so many man hours poured into FP languages, by various experts, and it hasn’t been “paradigm changing”.

It definitely it has been, look around! Languages have been copying lots of things from FP tools. Algebraic data types, higher-order functions everywhere, pattern matching, immutable by default, futures and promises, generators, etc. These features are many decades old in Haskell, OCaml, etc.

You’re telling me they’ve got all the experts, and hammered something out (which itself sounds inadequate to achieve something great here), which is better than GHC or similar?

No, it’s completely different. Verse is absolutely definitely not Haskell and never will be. The goal of GHC is totally different to what Epic Games is doing with Verse.

People in Epic Games have been thinking about this for decades, recently invested a bunch of time, hired a bunch of experts, made a language, decided to rewrite one of the most popular games in the world using the language they just made. So yeah, I think “Verse has a chance” at doing what it was exactly designed to do. Why not?

1.

decided to rewrite one of the most popular games in the world using the language they just made

Verse’s current implementation compiles to Unreal Engine’s Blueprint VM, so I imagine the integration into Fortnite was simple and not a complete rewrite. The /Fortnite.com/ package contains a bunch of Fornite specific APIs that FII out to Fortnite code iiuc

1. 5

Yes, the initial integration uses existing tools, but from Tim Sweeney (just after 33 minute mark):

We see ourselves at the very beginning of a 2 or 3 year process of rewriting all of the game play logic of Fortnite Battle Royale (which is now C++) in Verse…

1.

Verse’s current implementation compiles to Unreal Engine’s Blueprint VM

I really hope the Open™ implementation is implemented on top of something a lot more efficient. Blueprint is horribly wasteful. The VM and bytecode has been largely the same since the UnrealScript days - for the curious, it’s essentially a glorified tree-walk interpreter (but with compact bytecode instead of heap-allocated nodes.) It really makes me wonder why Epic decided to continue using the same inefficient tech for UE4’s Blueprint.

1.

The guy who presented the language on stage built a lot of JavaScriptCore / WebKit internals, so they certainly have the right people to build a very fast VM.

2.

Languages have been copying lots of things from FP tools.

I should’ve been clearer in what I meant, but this is definitely true! Definitely true… What I meant was any particular FP language hooking everyone into the FP paradigm (or changing the FP paradigm to a degree of making it globally attractive).

Secondly, if it’s already changed then by the right people in various ways - what the heck does Verse have to offer? I think your insight further drives home the idea that Verse is just a “fun company project”, at least to me!

Thanks for the comment :)

1.

To be pedantic, generators are from CLU, which is where a lot of languages got them. I’d have to check my research notes but I think pattern matching is from Prolog? Definitely popularized by FP though.

2.

Verse is a domain specific scripting language for UEFN, the Unreal Editor for Fortnite. It’s the language used to describe game logic. The goal is to allow game logic to run in a distributed fashion across many servers, where there could be a hundred thousand players in the same game.

This is a pretty extreme requirement, I don’t actually know how you can reasonably do this in a conventional language. It makes sense that they’ve chosen a logic programming language, because that allows you to express a set of requirements at a high level, without writing a ton of code to specifically take into account the fact that your logic is running in a distributed fashion. According to what I’ve read about Verse, the existing logic languages they looked at have non-deterministic evaluation, which make code unpredictable and hard to debug. So Verse is a deterministic logic language, and that is called out as a specific contribution in the research paper about the Verse calculus.

As for whether Verse will “succeed”. The only job it has to do is to describe game logic in UEFN, and the criteria for success is that it scales to allow very large game environments, in a way that Epic hasn’t succeeded in doing using C++. There’s lots of money available and they’ve hired the best people. so I think they have a good chance of succeeding.

1. 6

I think, paradoxically, we’re going to see more ultra-terse languages, so that the AI can store more context and you can save money on tokens.

1. 3

That might not require changing the languages themselves!

For example, if you can have a $LANG <->$L translation, where $L is a “compressed” version of$LANG optimized for model consumption, but which can be losslessly re-expanded into $LANG for human consumption, that might get you close enough to what you’d get from a semantically terser language that you’d rather continue to optimize for human consumption in$LANG.

1. 1

So all those years of golfing in esolangs will pay off?? I’ve thought about this too, and you might be able to store more code in your context window if the embeddings are customized for your language, like a Python specific model compressing all Python keywords and common stdlib words to 1 or 2 bytes. TabNine says they made per-language models, so they may already exhibit this behavior.

1. 3

Or perhaps there will be huge investment in important language models like python, and none for Clojure. I have a big fear around small languages going away - already it’s hard to find SDKs and standard tools for them.

1.

I don’t think small languages will be going anywhere. For one thing, they’re small, which means the level of effort to keep them up and going isn’t nearly as large as popular ones. For another, FFI exists, which means that you often have access to either the C ecosystem, or the system of the host language, and a loooot of SDKs are open source these days, so you can peer into their guts and pull them into your libraries as needed.

Small languages are rarely immediately more productive than more popular ones, but the sort of person that would build or use one (hi!) isn’t going to disappear because LLMs are making the larger language writing experience more automatic. Working in niche programming spaces does mean you have to sometimes bring or build your own tools for things. When I was writing a lot of Janet, I ended up building various things for myself.

2. 1

Timely that I’ve started learning https://mlochbaum.github.io/BQN :)

1. 1

Perl’s comeuppance!

1. 3

I’ve been doing the same thing! ChatGPT is really exciting.

There’s no API for this [chatgpt] at the moment, but if you open up the browser DevTools and watch the network tab you can see it retrieving JSON any time you navigate to an older conversation.

Yes there is :P

1. 3

I meant an API for retrieving my saved conversations, not creating new ones. I just updated that paragraph to clarify.

1. 3

I’d still recommend using the API, because you can also tweak the system prompt, which makes it easier to control how data is formatted. Instead of “how do I do XYZ? Provide python code”, you chat “how do I do XYZ” and put “you only output python code” in the system prompt.

(Also the playground has a “download history as JSON/CSV” option. Maybe you could script against that somehow?)

1. 1

1. 1

I just used GPT4 to design a Tasker task I’ve wanted for months. God I love this thing

1. 29

Is this tag going to survive a hype cycle?

LLM-based tools are currently quickly evolving. What if the next step in their evolution will go beyond language-level representation? The “LLM” will become only a step or component, like Deep Learning is now, in something using a new acronym.

1. 12

I’ll counter with: if it doesn’t, why are we allowing the hype to build up here? Are these stories worth submitting?

Either the tag needs to be here or we need to, as a community, screen for these and flag them appropriately.

1. 9

The purpose of a tag is to allow people to hide stuff, so if we think they need screening that’s an argument for a tag IMHO.

1. 3

for what it’s worth I don’t hide anything, I use tags to find articles in specific subjects :)

and an llm tag would mean I can find articles specifically about large language models without having to read through all the computer vision and other stuff

however, I think llm might be too specific, it’s a very particular type of AI so maybe we just need a more general “language models” or “text ai” tag or something

1. 2

In general usage I think llm has come to stand in for all large statistical models even if they don’t model language. I hope our tags don’t echo that mistake.

An nlp tag would work if we’re restricting this to language models. If this is for statistical generative algorithms, including non-language models, a broader ai tag is probably more appropriate than llm.

2. 1

My point exactly! :)

2. 1

I think it will; GPT4 has been incredibly useful to me, in my personal non-software research, already. Even if the technology never gets past this point there’s still a lot to write on how we can use it as it is.

1. 4

I think the idea is more along the lines of “what if a new architecture, which happens not to be a large language model, blows GPT4 out of the water at tasks which GPT is good at?”

LLMs are only here to stay if nothing else usurps their throne.

Edit: Indeed, one of OPs proposed links is barely about a LLM. It’s about an image model being integrated into emacs (sharing infrastructure with an existing LLM integration).

• DALL-E now supported in Emacs chatgpt-shell

Which isn’t to say that we’ve moved on from LLMs yet. But is to suggest that it’s likely too narrow a topic to make a good tag.

1. 2

Counter-point – bike shedding the topic name has costs, too, and just picking one and rolling with it is likely a better use of everyone’s time than trying to get the language exactly right up front.

1. 11

What are some ai-tagged posts that wouldn’t be tagged llm? As I understand, there’s a lot of discourse around about whether AI is actually anything more than LLMs, so it seems like there isn’t even consensus that the two are separable categories. That would make it hard for the tag to be used correctly, even if it were a sound distinction in theory.

1. 20

Non-LLM machine learning, traditional symbolic AI, gamedev behavior trees.

1. 5

My understanding, basically like the other comments I didn’t see until I finished writing my own, is that Language Models are specifically a kind of statistical model trained on words, and the large ones are implied to have been trained on substantial scrapes of the internet, or some similarly Chunky dataset. The original ai tag proposal has a couple examples of technology under the umbrella of “ai” which appear to me to not be about Language Models in particular:

…Although some are also about technology demonstrated in their application to building language models:

While others seem to be language models…but not “Large” (?) ones:

Also, recent ai submissions are likely heavily weighted more into the specific technology of LLM’s because of their abrupt, recent rise in popularity, which I think justifies separating them from other ai submissions.

1. 4

Anything that doesn’t use LLM, which is for example anything not made for text input.

1. 3

The one post a year celebrating genetic algorithms.

1. 2

DALL-E and its ilk are trendy AIs that aren’t LLMs

1. 5

Surely there is an interesting new discussion of the nature of “rationality” happening somewhere, now that we have an example system that can “explain its chain of thought” in a (usually plausible) way when asked, that we know for a fact is entirely based on probabilistic associations between word fragment sequences and no specific “reasoning” mechanism. Does anyone know where to find this?

1. 7

It kinda goes the other way around. When prompted to reason step by step LLM answers are more accurate. If you ask afterwards it’ll start doing mental gymnastics to defend whatever answer it gave.

1. 2

Interesting. I would have thought this commitment bias to be limited to humans. I would have thought ChatGPT rather immune to this, for I have seen it change its mind¹ rather easily when corrected.

1. 8

The model behind ChatGPT has been primed to do chain of thought reasoning. When asking complex question you’ll often notice it’ll first give an explanation and then the answer. The model didn’t think of the answer first and then thought it would be useful to lead with some introductory information, it’s how it gets to the answer in the first place.

Furthermore, ChatGPT is not a chatbot whom you ask questions to, it is a text generator simulating a chat log. It takes quite a bit of postprocessing work to stop an LLM from completing the “human” part of the conversation. It’ll take what you say for granted because it doesn’t know the difference between what you type or what it generated, it’s all just text to complete. What you type becomes part of the reasoning, so to speak.

I don’t think it’s too far fetched to draw parallels to human reasoning. When you go over an idea in your head you’re not doing deductive reasoning. Unlike LLMs though we can do that if we decide to, or we can test in the real world what we came up with, and we can generally tell the difference between things someone just told us and our own opinions and knowledge.

1. 4

It’s emulating a human’s commitment bias, because it was trained on examples that show commitment bias.

1. 2

I think it’s more because of how it works: it’s an autocomplete based on previous tokens, so when adding text after an answer, it already has it as a reference and it influences the prediction. Not that it causes itself to emulate humans explaining themselves, but rather that there are already topics and statements to talk about.

2. 2

Here’s an example I had of ChatGPT getting stuck on answer and doing a bad job of justifying it:

Me: What has three legs in the morning, two legs in the afternoon, and four legs at night?

ChatGPT: The answer to this classic riddle is “a human being.”

The riddle is a reference to the stages of life. In the morning of life, which is considered to be childhood, a human crawls on hands and knees, hence having three “legs.” In the afternoon of life, which represents adulthood, a human walks upright on two legs. Finally, in the evening of life, which represents old age, a human may require the assistance of a cane, hence having four “legs.”

This riddle has been around for centuries and was famously posed by the Sphinx in Greek mythology to travelers who sought to enter the city of Thebes. Oedipus was able to solve the riddle by answering “man,” causing the Sphinx to destroy itself in frustration.

It’s not weird to get stuck on thinking I asked the riddle of sphinx when I didn’t, but it is weird (from a human POV) to misexplain the riddle.

1. 5

GPT4 seems to handle it:

Me: What has three legs in the morning, two legs in the afternoon, and four legs at night?

GPT4: This riddle is a variation of the famous riddle of the Sphinx from Greek mythology. The original riddle goes: “What has four legs in the morning, two legs in the afternoon, and three legs at night?” The answer to the original riddle is “a human being,” which refers to the stages of a person’s life. In the morning (infancy), a human crawls on all fours; in the afternoon (adulthood), a human walks on two legs; and at night (old age), a human walks with the aid of a cane, which represents the third leg.

However, your riddle has a different sequence of leg numbers. While it may be an intentional twist on the original riddle, there isn’t a clear answer that fits the description.

GPT4: Ah, I see what you’re getting at. Benjamin Button is a fictional character from the short story “The Curious Case of Benjamin Button” by F. Scott Fitzgerald, which was later adapted into a movie. The character ages in reverse, being born as an old man and growing younger throughout his life.

Using the Benjamin Button scenario in the context of your riddle, it could work as follows:

In the morning (old age), Benjamin would have three legs, including a cane for support. In the afternoon (adulthood), he would have two legs, as he would no longer need the cane. At night (infancy), he would have four legs, because he would crawl like a baby.

In this context, the answer to the riddle could be “Benjamin Button.”

1. 3

Humanity is doomed. :-)

1. 2

…well that’s alarming.

1. 1

Try prompting it with some other answers and see if it justifies them all

2. 1

Relatable.

3. 7

It’s a pretty old idea that humans spontaneously generate stories justifying their actions or beliefs on the spot instead of those pre-existing justifications motivating the action or belief. I think there’s a lot of credence to this idea, I often discover my own opinions about things in the course of writing them down in comments or blog posts or a journal or whatever else. Writing is nature’s way of letting you know how sloppy your thinking is and all that.

1. 15

The proposed answer is actually a data input to another program that will output the digits. In this respect it is no different than the “input to bc” proposal that they discarded early on. I have to classify this as clickbait, though it is a higher class of clickbait than the “One weird trick” type of trash that one usually finds floating on the internet.

1. 5

I agree. One could easily ‘win’ this challenge by simply stating that the ‘representation’

 \infty


(which is 6 bytes and thus only 48 bits long) input into LaTeX exceeds all real numbers, given it prints out the infinity symbol.

1. 10

like the image compression algorithm which interprets a file of zero length as Lenna, and anything else using PNG, thus achieving great compression on synthetic benchmarks

1. 3
1. 2

Yeah, this. I swore xkcd did it, but I guess not.

2. 6

This isn’t in the spirit of the game, which is too find the largest finite number. It’s like saying you can run a marathon in 15 minutes by using a car.

1. 2

\infty isn’t a number, though.

2. 2

Yes, this kind of question only makes sense if you pick a fixed language for the representation. However, as long as you pick something turing-complete, it doesn’t matter too much which one you pick, it’s mostly just a constant factor on how many bits are required to write something.

Back in 2001, somebody held a contest challenging people to express the biggest integer they could in 512 bytes of C. The results were quite interesting, and the best programs so advanced that it wasn’t even obvious how to prove which one would produce the biggest output.

1. 2

Knee-deep in an Alloy specification for a custom ERP system, but I’ll make some time to take my son to the beach.

1. 3

I wrote some of the docs for that! Feel free to ping me if you have any alloy questions :)

1. 1

Thanks!

Today I was refering to your post introducing Alloy 6. I thought I would write the spec using only the old-style, but it got cumbersome. Thanks for the great docs and articles.

1. 7

Just like humans. Only 10% of the llm’s brain is actually used.

1. 18

That’s a myth. Using 100% of your brain at the same time is called a “seizure”.

1. 10

This. It’s like saying that your 4-cylinder car engine only uses 25% of its capacity ’cause only one cylinder fires at a time. Try firing all 4 cylinders at the same time all the time and see how well it works out.

1. 3

I believe the origin of the myth is that fewer than 10% of your neurones are firing at any given time. This is somewhat intrinsic to how neural networks work: if every neurone fires at the same time then you will always have the same output signal. Imagine an LLM where every single value in every matrix is one. It probably won’t give interesting output.

Perhaps equally importantly, your brain has to dissipate heat from the neurones firing. In normal use, this is close to 100 W. If it has to dissipate 1 kW, then the blood in your brain would boil quite quickly. Not so much a seizure as cooking.

2. 2

hahaha

1. 18

It’s a great pity to see somebody I really admire allow himself to be dazzled by this technology’s ability to mimic competence by pattern matching. An autocomplete with the entire corpus of stackoverflow will come up with some brilliant answers for some things, some subtly wrong answers for others, some absolutely wrong answers for yet others and utterly garbled output for more.

The whole issue is that you need to be able to differentiate between all of them. And again, it is all entirely based on you stealing the work of others with an advanced search tool which has absolutely no idea as to the veracity or quality of what you’re stealing.

I’m sorry Stevey, this is the next crypto. And it’s funny he mentions the whole not sticking with amazon thing because he was skeptical or k8s or whatever, because surely that argument equally applies to crypto? It’s survivorship bias, you regret not betting big on the things that turned out to work, then decide because this seems like a crazy claim like some other things that succeeded, this one must be true.

The good thing with LLM type solutions is that you can go and try and see for yourself how wrong it gets it. Picking and choosing some lisp example that happens to work is exactly how this stuff gets you.

I genuinely find this whole LLM craze extremely depressing. No waiting for the results to come in, no considering what the limits of such technology might be (yes LLMs will be VERY useful for some things, just not all). It’s just like the self driving nonsense. If you take time to think about how these algorithms work (essentially pattern matching with better results at higher data density) it becomes really obvious that it won’t work well for fields that require precise answers and which quickly get novel (driving, coding).

It’s the emperor’s new clothes, and god’s sake Stevey I thought you were wiser than this (this is really about wisdom more so than intelligence, many smart people I know have been sucked in).

I think this whole phenomenon is well described by Filip at https://blog.piekniewski.info/2023/02/07/ai-psychosis/

1. 5

A point I haven’t seen discussed yet is that, right now the stuff seems great because the majority of the content on the internet is training content: writing by people. But in a year or two when the internet is flooded with ChatGPT crap, the training data will be full of ChatGPT crap too. The algorithm can only do so much to compensate for the fact that soon it will be metaphorically eating its own shit.

1. 3

IMO there’s one very big difference between the radiology and self-driving stuff, and what we have now. Radiology was big companies making shrinkedwrapped products that they sold to hospitals, ChatGPT is an AI anybody can try to use for anything they want.

So to finish this essay on a bit more positive note, here are some professions which in my opinion may actually get displaced with existing tech: … Pharmacist. It is absolutely possible today to build a machine that will automatically dose and pack custom prescriptions. A drug consultation could be done remotely via a video call before releasing the product to the customer.

Okay this is totally insane.

1. 7

All it takes is a (few) high-profile case(s) where someone got killed by an LLM that got the dosage wrong (grams instead of milligrams seems like a likely mistake) and let’s see how quickly politicians will outlaw this kind of tech for serious use.

1. 1

I think the author does not refer to LLM in this section. It’s introduced with (highlight by me):

here are some professions which in my opinion may actually get displaced with existing tech

If he was refering to LLMs, I would have expected “with LLMs” or “with the new tech” or at least “with the existing tech” (even if existing was a bit weird to refer to something new). But written like this to me this is a reference to a broad spectrum of technology.

I understand it so that he means existing technology in general, so in most cases probably something more tailored to the use case (e.g. bureaucrat: web application, cashier: self-service cashiers, bank clerks: online banking). But all of this already exists to more or less extent in different parts of the world.

1. 2

This unity of error types across completely different codebases doing similar things is fascinating to me. I wonder whether there have been studies about the same bugs manifesting in complete rewrites of software. All of us here have probably come to understand the wisdom of regression tests, so my hypothesis is that bugs would recur. Some parts of a problem are just tricky, and humans tend to fall into the same traps. A project’s bug database is a valuable mine of information about those tricky parts!

1. 2

The legendary/notorious Leveson and Knight paper showed that different people working on the same spec often made the same kinds of mistakes: http://sunnyday.mit.edu/critics.pdf

1. 3

Python’s string processing APIs are actually quite fast and optimized. In some cases they might actually do better than Rust’s. For example, since the string representation CPython uses differs based on the string contents, it’s possible for CPython to use more optimized implementations specifically for ASCII.

This is something you should be able to check: how many names per second can Rust process if you’re not using Python?

We could instead pass the full list of petition names into the function we are optimizing. If we were to write it in Rust, it might be slower, or no faster—but we could release the GIL for much of the processing, which would allow us to take advantage of multiple cores. So while it wouldn’t save CPU time, it might save us clock time by using multiple cores at once.

There’s another reason this could be faster: you could build the list without the function or return overheads. Even if that’s only saves 10 ns per name, that adds up when you’re calling it 5 million times a second!

1. 3

This is something you should be able to check: how many names per second can Rust process if you’re not using Python?

Seconded. It feels like there’s limited value in including Rust in this exercise if you’re not going to profile/experiment to understand what’s making the Rust code slow.

P.S. Am I missing a link to the code?

1. 1

Yeah, and if this was real code I’d dig deeper, it’s possible with enough effort Rust could be much faster for example (Rust actually has an ASCII version of upperspacing so you could imagine checking which representatio Python is using, etc.).

But since it’s just thinly veiled propaganda… :)

1. 12

I guess if you have to do a project announcement, getting Steve Yegge to do a fifteen page rant is the best way to do it!

1. 12

What’s the end of a Yegge post like? I’ve never made it that far. The tops are always very interesting though!

1. 5

Yup, I was always big fan of his blog, e.g. “Size is Code’s Worst Enemy” and the like. This one didn’t disappoint either, he always keeps it real, even when hyping AI:

Entire industries are being built on Kubernetes, and it’s not even very good either :)

A weird thing is that I JUST discovered he did a “Stevey Tech Talk” YouTube channel during the pandemic:

Somebody pointed to his Emacs videos, and I watched a few others. There’s some technical stuff, but also a lot of tech industry / career stuff.

The last video is him “unretiring” and going to SourceGraph. The quality is hit and miss, and he admits as much, but I watched more than a few episodes! (not always to completion)

FWIW Kubernetes was developed in the Seattle office of Google, where many Microsoft people were hired starting ~2004 or so, and where Google Compute Engine started. Steve worked at Amazon not Microsoft, and then worked at the Google Seattle office starting in 2005. Hence the little tidbits in the blog post about seeing an early demo of Kubernetes.

So Kubernetes to me has a big Microsoft flavor (Fire and Motion, etc.), which to me contrasts with the Unix / Sun / DEC / Xerox PARC flavor of the original Google systems, developed in the Bay Area (where I worked, also starting 2005). Not that they were perfect – they also had big flaws.

Also, I’ve been a “meh” person on LLMs. This post and Bellard’s work (https://bellard.org/ts_server/) makes me a little more interested.

I’m wondering if a LLM can automatically add static types to Python code, and REFACTOR it to be statically type-able. I did this by hand for years in Oil. Some of it is straightforward, but some of it requires deep knowledge of the code.

My feeling is that they can’t, but I haven’t actually tried and verified. When I tried ChatGPT for non-programming stuff, it got things hilariously wrong, but I was purposely trying to push its limits. I wonder if it will be helpful if I take a less adversarial approach.

Though writing code isn’t a bottleneck in creating software: https://twitter.com/oilsforunix/status/1600181755478147073

Writing code faster creates a testing burden (which Yegge alludes to). If a large portion of programmers end up spending most of their time testing code created by LLMs, that will be an interesting outcome. I guess their counterpoint is that many programmers will WANT this – it will enable them to do stuff they couldn’t do before. It’s possible. Doesn’t sound that appealing to me, but it’s possible

I will say it’s true that sometimes I just type stuff in from a book or from a codebase I know is good, and I understand it AFTERWARD (by testing and refactoring!). So yes probably LLMs can accelerate that, but NOT if most of the code they’re trained on is bad. Somebody has got to write code – it can’t be everyone using LLMs.

1. 7

If a large portion of programmers end up spending most of their time testing code created by LLMs, that will be an interesting outcome. I guess their counterpoint is that many programmers will WANT this

And that is my nightmare scenario (okay, one of my nightmare scenarios), as it reduces us from code monkeys (bad enough) to test monkeys.

1. 5

Yeah it’s a crazy thing to think about … I’m thinking of a recent PR to Oil, where we spent a lot of time on testing, and a lot of time on the code structure as well. I think LLMs might have some bearing on both parts, but will fall down in different ways for each.

The testing can be very creative, and I enjoy acting like an adversary for myself. A “test monkey”, but in a good way – I let go of my preconceptions of the implementation, become a blank slate, and just test the outer interface. I think about the test matrix and the state space.

We also did a lot of iteration on the code structure. After you get something passing tests, you have to structure your code in a way so that you can still add features. For this specific example, we separate out setpgid() calls from the low level process code, so that they’re only relevant when shell job control is on, which it isn’t always. We also referred to zsh code implementing the same thing, but it’s structured totally differently.

Basically the process code isn’t littered with if statements for job control – it’s factored out. I think LLMs are and will be quite bad at that kind of “factoring”. They are kind of “throw it against the wall” types.

You could copy some code from an LLM one time. But then the next week, when you need to add a feature ON TOP of that code, the LLM isn’t going to be able to help you. It won’t even understand the code it told you to put in :)

I’m also thinking that testing is a specification activity. It’s the art of thinking clearly. Easily half of the work of Oil is coming up with HOW it should behave, which I first encode in tests, e.g. https://www.oilshell.org/release/0.14.2/quality.html

So yeah I think it may be something like “autocomplete”. Interestingly some people seem to find autocomplete more useful than others. I am mostly a Ctrl-N vim person. I think if you’re using too much autocomplete, it could be a sign the language is repetitive / “uncompressed” / not properly “Huffman coded”.

1. 2

It reminds me of the different levels of self driving cars: at intermediate levels they become more dangerous because humans only need to respond to exceptional circumstances. Humans are responsive to novel stimulus, not routine stimulus. Therefore, they will stop paying attention and disaster will strike.

2. 1

The problem is getting an llm to read a non-trivial amount of code. Which I assume is basically a problem with the hosted systems and not a technological limitation

1. 2

With 32K token context models coming soon in GPT4 (and larger elsewhere) This is likely not the main problem (soon at least)

1. 2

This reads AI-written. Two things stand out:

• There’s no “memory” between any of the three frameworks. Article never draws comparisons between them, and when it talks about how Fastify follows 6.12, it doesn’t remember what it said about 6.12 earlier in the piece.
• This section in “Express.js”:

Node.js best practice 6.16 recommends the use of third-party regex validation packages such as validator.js and safe-regex to detect vulnerable patterns. It discourages implementing your own Regex patterns, as poorly written regexes are susceptible to regex DoS attacks and can be easily exploited to block event loops completely.