Thanks for writing this. There’s a severe shortage of this kind of concrete, detailed example in discussions about how people are using these tools.
One thing that strikes me about the examples in the article is that they’re all small, self-contained, greenfield projects with no existing coding conventions or architecture. I’m far from the first to observe this, but my hunch is that one of the reasons I haven’t had as much success with these tools is that very little of my work fits that description. The vast majority of my day-to-day coding involves modifying an existing code base with a lot of moving parts whose behaviors are mostly governed by a messy pile of domain-specific business rules that have accumulated for years.
My experience with various tools (disclaimer: I have only this week started to try out Claude Code, so maybe it’ll be better) is that they can be okay at following low-level technical requirements, but they struggle with higher-level concepts and make lots of fundamental mistakes based on not having a big-picture understanding of how all the pieces fit together.
They’ll know that the data model has entity types A, B, and C, but won’t know that an instance of C can only exist if there’s a corresponding B. So they’ll write code that creates a new C in isolation. The agentic tools will then spin their wheels trying over and over again to fix the fact that the “create C” function is bombing out, but they’ll never notice the problem is actually the lack of a B. I then have to spend time iteratively nudging them in the right direction and it quickly gets into “would have been faster to just write the code myself” territory.
On some level, I suspect that this is a documentation failure. It takes a new engineer on my team a bit of time to come up to speed, and they spend much of that time pairing with or picking the brains of the existing team members to learn things they couldn’t find in our internal docs. Some of the things that take new team members the most time to learn are the same ones LLMs struggle with. Maybe if our docs were better and talked in depth about those things and were kept up to date, both LLMs and new engineers would struggle less.
However, to the tools’ credit, I’ll say my team has seen much more impressive results in UI code. Our web app is fairly standard React/Redux code and LLM tools seem to cope with it a lot better. I don’t work on UI code too much, but I know our frontend engineers have been saving time asking tools to write new components for them and such. Even for them, it’s not a “3x more productive” level of gain, but it seems significant.
A big issue with these tools is that they are supposedly “intelligent” and therefore should be capable of doing everything. Products like Claude Code, Cursor, etc. reinforce that impression. But it is the wrong way to go about it. They are not intelligent, and the task quickly becomes too big for them.
An LLM is a tool, like a drill or a screwdriver. But what they try to sell you is a machine that can hang a painting on your wall. That machine will inevitably make a mess in any situation that is not ideal, but drills and screwdrivers are quite useful.
My number one use of LLMs with coding is this: write a // TODO comment, pipe the function (or the file) to a local, not brilliant LLM, and replace it with the response. That’s it.
This is a wonderful idea. Apart from the nostalgia factor, I think newsreaders are much better equipped to deal with big streams of information than any algorithmic time.
Yes and no. After making and using Illuminant for a while, I feel like I want to read everything, which is not the usual way to consume “Twitter-likes”, so I can’t follow too many people, or the newsreader-interface feels overwhelming.
It might just be me, though, back when Twitter existed I also wanted to read everything, and was annoyed by the interface, having to remember how far to (doom)scroll down.
Can you elaborate the “laziness” part? I have a very narrow view of EU, but I read somewhere that EU does not encourage risk taking like the US does, so most large successful tech companies are American and not European.
But we sure are more conservative in technology. We let others do the pioneering work and then just pick established technologies, instead of trying or building new stuff. We are scared of being wrong and of failing.
There are exceptions of course, but they are too rare.
Though I think the difference may be more cultural - in general EU has a low risk taking culture (scared of being wrong and failing), which possibly also means the society in general frowns upon those who fail. Unlike in the US, where every failure (that does not ruin you financially) is celebrated and risk taking is rewarded.
European culture is different, but from what I understand, there are also very practical reasons that we don’t have a VC culture like the US. For all the talk about harmonization, single market, etc., I was surprised to learn that it is apparently not possible that a VC in one EU country can fund a startup in another one. Or at least not without jumping through countless bureaucratic hoops. Effectively, each country needs to take on Silicon Valley on its own.
No,. there’s a lot of policy discretion. The US government has access to any data stored in the US belonging to non-US persons without basic due process like search warrants. The data they choose to access is a policy question. The people being installed in US security agencies have strong connections to global far right movements.
In 2004 servers operated by Rackspace in the UK on behalf of Indymedia were handed over to the American authorities with no consideration of the legal situation in the jurisdiction where they were physically located.
/Any/ organisation- governmental or otherwise- that exposes themselves to that kind of risk needs to be put out of business.
I seem to remember an incident where instapaper went offline. The FBI raided a data centre and took a blade machine offline containing blade servers they had warrants for, and instapapers, which they didn’t. So accidents happen.
Yes, but in that case the server was in an American-owned datacenter physically located in America (Virginia), where it was within the jurisdiction of the FBI.
That is hardly the same as a server in an American-owned datacenter physically located in the UK, where it was not within the jurisdiction of the FBI.
Having worked for an American “multinational” I can see how that sort of thing can happen: a chain of managers unversed in the law assumes it is doing “the right thing”. Which makes it even more important that customers consider both the actual legal situation and the cost of that sort of foulup when choosing a datacenter.
The US government has access to any data stored in the US belonging to non-US persons without basic due process like search warrants.
Serious question, who’s putting data in us-west etc when there is eu data centres? And does that free rein over data extend to data in European data centres? I was under the impression that safe harbour regs protected it? But it’s been years since I had to know about this kind of stuff and it’s now foggy.
It does not matter where the data is stored. Using EU datacenters will help latency if that is where your users are, but it will not protect you from warrants. The author digs into this in this post, but unfortunately, it is in Dutch: https://berthub.eu/articles/posts/servers-in-de-eu-eigen-sleutels-helpt-het/
Serious question, who’s putting data in us-west etc when there is eu data centres?
A lot of non-EU companies. Seems like a weird question, not everyone is either US or EU. Almost every Latin American company I’ve worked for uses us-east/west, even if it has no US customers. It’s just way cheaper than LATAM data centers and has better latency than EU.
Obviously the world isn’t just US/EU, I appreciate that. This article is dealing with the trade agreements concerning EU/US data protection though so take my comment in that perspective.
I haven’t personally made up my mind on this, but one piece of evidence in the “it’s dramatically different (in a bad way)” side of things would be the usage of unvetted DOGE staffers with IRS data. That to me seems to indicate that the situation is worse than before.
Not sure what you mean—Operational Desert Storm and the Cold War weren’t initiated by the US nor were Iraq and the USSR allies in the sense that the US is allied with Western Europe, Canada, etc (yes, the US supported the USSR against Nazi Germany and Iraq against Islamist Iran, but everyone understood those alliances were temporary—the US didn’t enter into a mutual defense pact with Iraq or USSR, for example).
they absolutely 100% were initiated by the US. yes the existence of a mutual defense pact is notable, as is its continued existence despite the US “seeking to harm” its treaty partners. it sounds like our differing perceptions of whether the present moment is “dramatically different” come down to differences in historical understanding, the discussion of which would undoubtedly be pruned by pushcx.
This isn’t true, as the US has been the steward of the Internet and its administration has turned hostile towards US’s allies.
In truth, Europe already had a wake-up call with Snowden’s revelations, the US government spying on non-US citizens with impunity, by coercing private US companies to do it. And I remember the Obama administration claiming that “non-US citizens have no rights”.
But that was about privacy, whereas this time we’re talking about a far right administration that seems to be on a war path with US’s allies. The world today is not the same as it was 10 years ago.
hm, you have a good point. I was wondering why now it would be different but “privacy” has always been too vague a concept for most people to grasp/care about. But an unpredictable foreign government which is actively cutting ties with everyone and reneging on many of its promises with (former?) allies might be a bigger warning sign to companies and governments world wide.
I mean, nobody in their right mind would host stuff pertaining to EU citizens in, say, Russia or China.
I would like to represent the delegation of broke people in their 20s whose tech salaries are efficiently absorbed by their student loans:
You don’t need a smart bed. My mattress cost $200 and my bedframe cost <50. I sleep fine. I know as people age they need more back support but you do NOT need this. $250 worth of bed is FINE. You will survive!!
I’m not sure I agree. Like if you are living paycheck-to-paycheck then yeah, probably don’t drop $2k on a mattress. But I believe pretty strongly in spending good money on things you use every day.
The way it was explained to me that aligned with my frugal-by-nature mindset was basically an amortization argument. You (hopefully) use a bed every single day. So even if you only keep your bed for a single year (maybe these newfangled cloud-powered beds will also have planned obsolescence built-in, but the beds I know of should last at least a decade), that’s like 5 bucks a day. Which is like, a coffee or something in this economy. I know my colleagues and I will sometimes take an extra coffee break some days, which could be a get up and walk break instead.
You might be young now, but in your situation I would rather save for my old age than borrow against my youth. And for what it’s worth I have friends in their 20s with back problems.
(of course, do your own research to figure out what sort of benefits a mattress will give to your sleep, back, etc. my point is more that even if the perceived benefits feel minimal, so too do the costs when you consider the usage you get)
Mattresses are known to have a rather high markup, and the salesmen have honed the arguments you just re-iterated to perfection. There are plenty of items I’ve used nearly daily for a decade or more. Cutlery, pots, my wallet, certain bags, my bike, etc. None of them cost anywhere near $2000. Yes, amortized on a daily basis, their cost comes to pennies, which is why life is affordable.
Yes, there are bad mattresses that will exacerbate bad sleep and back problems. I’ve slept on some of them. When you have one of those, you’ll feel it. If you wake up rested, without pains or muscle aches in the morning, you’re fine.
I too lament that there are things we buy which have unreasonable markups, possibly without any benefits from the markups at all. I guess my point is more that I believe – for the important things in life – erring on the side of “too much” is fine. I personally have not been grifted by a $2k temperature-controlled mattress, but if it legitimately helped my sleep I wouldn’t feel bad about the spend. So long as I’m not buying one every month.
I think one point you’re glossing over is that sometimes you have to pay an ignorance tax. I know about PCs, so I can tell you that the RGB tower with gaming branding plastered all over it is a grift [1]. And I know enough about the purpose my kitchen knife serves to know that while it looks cool, the most that the $1k chef’s knife could get me is faster and more cleanly cut veggies [2].
You sound confident in your understanding of mattresses, and that’s a confidence I don’t know if I share. But if I think of a field I am confident in, like buying PCs, I would rather end the guy who buys the overly marked-up PC that works well for him than the one who walks a way with a steal that doesn’t meet his needs. Obviously we want to always live in the sweet spot of matching spend to necessity, but I don’t know if it’s always so easy.
[1] except for when companies are unloading their old stock and it’s actually cheap.
[2] but maybe, amortized, that is worth it to you. I won’t pretend to always be making the right decisions.
I personally have not been grifted by a $2k temperature-controlled mattress, but if it legitimately helped my sleep I wouldn’t feel bad about the spend.
Note, because it’s not super obvious from the article: the $2k (or up to about 5k EUR for the newest version) is only the temperature-control, the mattress is extra.
All that said: having suffered from severe sleep issues for a stretch of years, I can totally understand how any amount of thousands feels like a steal to make them go away.
One of the big virtues of the age of the internet is that you can pay your ignorance tax with a few hours of research.
In any case, framing it as ‘$5 a day’ doesn’t make it seem like a lot until you calculate your daily take-home pay. For most people, $5 is like 10% of their daily income. You can probably afford being ignorant about a few purchases, but not about all of them.
One of the big virtues of the age of the internet is that you can pay your ignorance tax with a few hours of research.
Maybe I would have agreed with you five years ago, but I don’t feel the same way today. Even for simple factual things I feel like the amount of misinformation and slop has gone up, much less things for which we don’t have straight answers.
For most people, $5 is like 10% of their daily income. You can probably afford being ignorant about a few purchases, but not about all of them.
Your point is valid. I agree that we can’t 5-bucks-of-coffee-a-day away every purchase we make. Hopefully the ignorance tax we pay is much less than 10% of our daily income.
I think smart features and good quality are completely separate issues. When I was young, I also had a cheap bed, cheap keyboard, cheap desk, cheap chair, etc. Now that I’m older, I kinda regret that I didn’t get better stuff at a younger age (though I couldn’t really afford it, junior/medior Dutch/German IT jobs don’t pay that well + also a sizable student loan). More ergonomic is better long-term and generally more expensive.
Smart features on the other hand, are totally useless. But unfortunately, they go together a bit. E.g. a lot of good Miele washing machines (which do last longer if you look at statistics of repair shops) or things like the non-basic Oral-B toothbrushes have Bluetooth smart features. We just ignore them, but I’d rather have these otherwise good products without she smart crap.
Also, while I’m on a soapbox – Smart TVs are the worst thing to happen. I have my own streaming box, thank you. Many of them make screenshots to spy on you (the samba.tv crap, etc).
Also, while I’m on a soapbox – Smart TVs are the worst thing to happen. I have my own streaming box, thank you. Many of them make screenshots to spy on you (the samba.tv crap, etc).
Yes, absolutely! Although it would be cool to be able to run a mainline kernel and some sort of Kodi, cutting all the crap…
You don’t need a smart bed. My mattress cost $200 and my bedframe cost <50. I sleep fine. I know as people age they need more back support but you do NOT need this. $250 worth of bed is FINE. You will survive!!
I guess you never experienced a period with serious insomnia. It can make you desperate. Your whole life falls in to shambles, you’ll become a complete wreck, and you can’t resolve the problem while everybody else around seems to be able to just go to bed, close their eyes and sleep.
There is so much more to sleep than whether your mattress can support your back. While I don’t think I would ever buy such a ludicrous product, I have sympathy for the people who try this out of sheer desperation. At the end of the day, having Jeff Bezos in your bed and some sleep is actually better than having no sleep at all.
You make some good points why this kind of product shouldn’t exist and anything but a standard mattress should be a matter of medical professionals and sleep studies. When people are delirious from a lack of sleep and desperate, these options shouldn’t be there to take advantage of them. I’m surprised at the crazy number of mattress stores out there in the age of really-very-good sub-$1,000 mattresses you can have delivered to your door. I think we could do more to protect people from their worn out selves.
None of the old people in my family feel the need for an internet connected bed (that stops working during an internet or power outage). Also, I imagine that knowing you are being spied on in your sleep by some creepy amoral tech company does not improve sleep quality.
I do know that creepy amoral tech companies collect tons of personal data so that they can monetize it on the market (grey or otherwise). Knowing that you didn’t use your bed last night would be valuable information for some grey market data consumers I imagine. This seems like a ripe opportunity for organized crime to coordinate house breakins using an app.
I believe the people who buy this want to basically experience the most technological “advanced” thing they can pay for. They don’t “need” it. It’s more about the experience and the bragging rights, but I could be wrong.
I’m sorry to somewhat disagree. The reason I would buy this (not at that price tag, I had actually looked into this product) is because I am a wildly hot person/sleeper. I have just a flat sheet on and I am still sweating. I have ceiling fans running additional fans added. This is not only about the experience unless a good night sleep is now considered “an experience”. I legitimately wear shorts even in up to a foot of snow.
Ouch… Please do not follow this piece of advice. A lot of cheap mattresses contain “cancer dust”[1] that you just breath in when you sleep. You most likely don’t want to buy the most expensive mattress either, because many of the very expensive mattresses are just cheap mattresses made overseas with expensive marketing.
The best thing to do is to look at your independent consumer test results for your local market. (In Germany where I live it’s “Stiftugn Warentest” and in France where I’m from it’s “60 millions de consommateurs (fr).” I don’t know what it is in the US.)
A good mattress is not expensive, but it’s not cheap either. I spend 8 hours sleeping on this every day, I don’t want to cheap out.
[1] I don’t mean literal cancer dust. It’s usually just foam dust created when the mattress foam was cut, or when it rubs against the cover. People jokingly call it “cancer dust”
Agreed. One of the great features of LLMs is that you can ask it to elaborate, clarify, give examples, etc. If you like, you can learn much more from an LLM than from a single StackOverflow answer. The author mentions a particular detailed answer. But now every answer can be like that.
As always, some people put in the work, and others don’t.
I don’t have a great deal of experience with LLMs, but typically as soon as I ask them more than two follow-up questions, they begin hallucinating.
I was recently working with go-git and decided to ask ChatGPT how I could upload packs via SSH as the documentation didn’t make it seem immediately obvioys, it kept trying to use nonexistent HTTP transport functions over SSH even though I explicitly provided the entire godoc documentation for the SSH transports and packfiles. Granted, the documentation was lacking, but all I needed to do at last was to thoroughly digest the documentation, which ChatGPT is evidently unable to do. In another scenario, it also suggested ridiculous things like “Yes, you can use sendfile() for zero-copy transfers between a pipe and a socket”.
Anyhow, at least for the fields that I encounter, ChatGPT is way worse than asking on SO or just asking in an IRC channel.
Unfortunately, one needs to develop some kind of intuition of what an LLM is capable of. And with that, I don’t just mean LLMs in general and the types of questions they can handle, but also the different models and the way you feed them your documentation.
Some are better at technical questions than others, some are better at processing large text input. I prefer to use local models, but they can’t handle long conversations. I almost automatically start a new one after two follow-ups because I know Qwen will get confused. On the other hand, I know that if I were to use Gemini on NotebookLM, I could throw a whole book in it, and it would find the right part without breaking a sweat.
Using the right LLM-equipped tool is just as an important choice as the model itself. For understanding codebases, aider is the best in my experience. It adds the information git has about your project automatically to the context. For more general learning and research, I like to create Knowledge Stacks in Msty.
Fair point. If I’m ever slightly in doubt about its competence, I push back on LLMs vigorously. I include in my pre-instructions that I’m not looking for a “yes man”. (At the same time, I don’t want pedantic disagreement. Ah, the complexity of it all!)
How does a company get so backwards? They went with utmost haste from leading to following. Where once I thought the sky was the limit for this product, now I just think they’re stuck on the exact same plateau as everyone else.
I think they have some definite polish on aspects of their editor, but I was (and still am) put off by the funding model for the editor. Good quality engineering isn’t free and I remain unconvinced that hockey stick growth style models are practical for guiding reasonable product development.
Another thing that gets me is marrying a reasonably fast process (the editor’s insertion speed and search speed) with a much slower process (LLM inference).
According to the Minimizing Latency: Serving The Model section, they are sending the text to online services to get predictions? Apart from the privacy concerns, I wonder who is paying for the GPU cost, and how does that factor in their business model.
FWIW I should be a little more precise and say “either LLM inference or network requests” since there’s clearly capacity to send over the net. Both seem slower (and more variable) than local editor business.
What’s backwards here? The presence of LLM at all?
I’ve come to consider this kind of LLM tab completion essential. It’s the only piece of LLM software I use and I find it saves me a lot of time, at least the implementation in Cursor. It often feels like having automatic vim macros over the semantics of the code rather than the syntax of the code. Like if I’m refactoring a few similar functions, I do the first one and then magically I can just press tab a few times to apply the spirit of the same refactor to the rest of the functions in the file.
My question is: why is that good? “Magical” is one of those words in programming that usually means something has gone horribly wrong.
Don’t get me wrong: I want my tool to make it easy to make mechanical changes that touch a bunch of code. I just don’t want the process to do it to be a magical heuristic.
You’re the one saying that the company is doing something backwards, I think it’s on you to justify that when asked, not to come back with a question tbh.
“Magical” is one of those words in programming that usually means something has gone horribly wrong.
Statements like these are just dogma/ rhetoric. Words like “magical” are just like “simplicity” or “ugly”, they mean something different to everyone.
I just don’t want the process to do it to be a magical heuristic.
Why not? What if it’s a problem best suited by heuristics?
Don’t get me wrong: I want my tool to make it easy to make mechanical changes that touch a bunch of code. I just don’t want the process to do it to be a magical heuristic.
I’m in a similar boat.
I’m less bullish on having the LLM do a large-scale refactoring than I am on using an LLM to generate a codemod that I can use to do the large-scale refactoring in a deterministic fashion.
But for small-scale changes—I wouldn’t even necessarily call them “refactorings”—like adding a new field to a struct and then threading that all way through, I’ve found that our edit predictions can cut down on a lot of the mundanity of a change like that.
The big question is: what environment does that codemod target?
For a system like this to work well there has to be a consistent high-level way of defining transformations that many people will use and write about so that models will understand it well. For that to happen you need an abstraction over the idea of a syntax node.
I’m not sure about Zed but you can do that with Cursor. The tab model is very small and fast, but Cursor has a few options ranging from “implicit inline completion suggestions with tab” to “long form agent instructions and review loop” similar to what you describe - you ask it to do stuff in a chat like interface, it proposes diffs, you can accept the diffs or request adjustments. But, I find explicitly talking to the AI much slower and more flow interrupting compared to tab completion.
I do use a mode that’s in between the two where I can select some text, press cmd-k, describe the edit and it will propose the diff inline with the document. Usually my prompt is very terse, like “fix”, “add tests”, “implement interface”, “use X instead of Y”, “handle remaining cases” that sort of thing.
I use plenty of heuristics in my editor already, like I appreciate fuzzy-file-find remembering my most opened files and up-weighting them, same with LSP suggestions and auto-imports. The AI tab completion experience is a more magical layer on top, but after using it for about an hour it starts to feel just like regular tab completion that provides “insert function name”, it’s just providing more possible edits. Another time saver I appreciate is when it suggests an edit to balance some parenthesis/braces for a long nested structure that I’m struggling to wrangle in my own.
I do use a mode that’s in between the two where I can select some text, press cmd-k, describe the edit and it will propose the diff inline with the document. Usually my prompt is very terse, like “fix”, “add tests”, “implement interface”, “use X instead of Y”, “handle remaining cases” that sort of thing.
These days, my favorite use of LLMs is to write a // TODO comment at the appropriate place, send a snippet with the lines to be changed to the LLM, and replace the selection with the response. With the right default prompt, this works really well with the pipe command in editors like Neovim, Kakoune, etc. and a command line client like llm, or smartcat.
The place I miss an LLM the most is in my shell. I’d love to be able to fall back to llm to construct a pipeline rather than needing to read 6 different manpages and iterate through trial and error. Do you have a setup for ZSH/bash/etc that’s lightweight? I haven’t seen anything inspiring in this area yet outside proprietary terminal emulators (I’m not interested)
I’m spoiled because I can’t do anything like that. I’m inventing a genuinely new technology, so I always have to think for myself because there’s no one to follow or imitate. I’m sure it sounds weird to hear me be excited about building my internal model for where changed requirements will manifest as need for changed code, but my mental model of that is razor sharp, and thinking about where changes are needed myself gives me leave to think about whether my code is expressive enough and has strong architecture.
But yeah, I know I’m the weird one. I’m the kid that retyped the red-underlined word instead of right clicking to correct spelling, the idea being that I wanted learn how to spell and spot/correct spelling mistakes instead of the machine.
Once the diff gets big enough it starts to have problem of its own. How will you know if it’s all correct without redoing all the work? What if the diff is stale by the time it is reviewed and approved? Generating a script instead of a diff solves those problems, and incidentally has another property that I prize very highly: it is just as useful to humans as it is to LLMs. Once you can define large changes as small scripts typing will no longer be the odious part of making changes that touch a lot of code.
when “ollama” uses the model id “mistral-small:24b”, what exactly is the model and what quantization does it use? Does it use a single GPU or does it use both?
apparently a single or fp16/bf16 or even the 8bit 70B is not gonna fit in 2x24GB, what exactly is used here and how?
mistral-small:24b points to 24b-instruct-2501-q4_K_M, so definitely not full precision. It is only 14GB and fits well on one card. You can find the available versions here: https://ollama.com/library/mistral-small/tags
I love going back to basics and cutting away as much cruft as possible, but blogs in raw HTML or text generally don’t have an RSS feed, probably because it requires more work to maintain. As a result, I inevitably visit them only once and then lose track of them.
That’s the first time someone mention RSS feed when I share something about my small txt “blog”, thank you for your comment.
I need to think a bit about it before actually solving you problem, but I think that the bash script I use to create new txt files (with BOM) might be extended to create/update a rss feed.
Bit of a self-promo I guess but I wrote a tool to deal with hand-managed RSS feeds. I didn’t end up using it for much, but that’s a symptom of me not blogging much more than anything. If it works for other folks, I’d probably be up for spending a couple cycles updating it. https://codeberg.org/klardotsh/kaboom
That’s the beauty of a static site generator. You get nice formatting, maybe also RSS feeds, with minimal per-post effort. (e.g. write in maybe Markdown). That said, been a while since I’ve added a post myself. ;)
$1700 is quite a large budget. If the total cost were halved, that would still be a sizeable budget. I feel like tech writers these days are forgetting what the phrase “on a budget” implies.
I agree, it is a big sum of money. I interpret “on a budget” as “relatively cheap”, not as “nearly free”. I think it is pretty cheap compared to what one normally needs to pay for that amount of VRAM. To me, the term is more justified here than in a post where someone buys a second-hand Apple laptop for $1100 and claims it is a cheap solution to browse the web.
I really hope AMD catches up and prices come down, because AI-capable hardware is not nearly as accessible as it should be.
Maybe it would be more accurate to say “on a budget” is a form of weasel word. Its interpretation depends on your familiarity with current prices and your socioeconomic status.
From my very subjective (and probably outdated) PoV…
$1000 is a fancy high-end laptop
$2000 buys a laptop only extremely well-paid people can justify
$800 buys a high-end desktop
You can imagine the surprise I felt (or was that shame?) when seeing a $1700 price tag on a “budget” desktop PC that can do AI.
“Building a personal, private AI computer for $1700” would communicate the intent a little better, without suggesting to the reader anything about their ability to afford it.
Btw, I don’t mean to imply any wrong was committed. I’m just pointing out that the wording on the post had some unintended effects on at least this reader. To a large degree that is unavoidable, no matter what a person publishes on the web.
Agreed. I’m running local models with an RTX 3060 12 GB that costs about $330 on NewEgg or 320€ new at ebay.de, and it’s actually useful. The context sizes must be kept tiny but even then it can provide basic code completion and short chats.
The code they write is riddled with subtle bugs but making my computer program itself seems to never get old. Luckily they also make it quicker to write throwaway unit tests. The small chat models are useful for language tasks such as translating unstructured article title + URL snippets to clean markdown links. They also act as a thesaurus, very useful for naming things, and can unblock progress when you’re really stuck with some piece of code (rubberduck debugging). Usually the model just tells me to “double-check you didn’t make any mistakes” though :)
On the software side I use ollama for running the models, continue.dev for programming (it’s really janky), and the PageAssist Firefox extension for chat.
If you look at “modern” gaming graphics cards, that is so cheap I was actually surprised. (even compared to a 30xx from some years ago)
If the median price of a thing is high, then absolute values don’t matter. A new car for under 10k EUR would still be “on a budget”, even if it’s a lot of money.
How about a good priced consumer grade card that gives similar performance? Is there any option to the Nvidia Tesla P40? Slightly more modern, less power without all the hacky stuff?
You can run inference using CPU only, but you’ll have to use smaller models since it’s slower. But the P40 is the best value right now given the amount of VRAM it has.
There are several options for a consumer grade card, but it all gets incredibly expensive really fast. I just checked for my country (The Netherlands) and the cheapest 24GB card is 949 euros new. And that is an AMD card, not an Nvidia. While I am sure the hardware is just as good, the fact is that the software support for AMD is currently not at the same level as Nvidia.
Second-hand, one can look for RTX 3090’s and RTX 4090’s. But a quick check shows that a single second-hand 3090 would cost over 600 euros at minimum here. And this does not consider that those cards are really power hungry and often take up 3 PCIe slots, to make space for the cooling, which would have been an issue in this workstation.
Since I could only accommodate speeds to what PCIe 3.0 offers anyway, a limitation of the workstation, this seemed the best option to me. But of course, check the markets that are available to you to see if there are better deals to be made for your particular situation.
I sometimes wonder if it’s just the aspect of getting older that makes software seem like it’s getting worse. Software was certainly simpler in my time, but it didn’t do a lot of the things that software does today. Both bad and good. I couldn’t have imagined having 32 processors available on a user system back in 1980. Now it’s commonplace. Kind of like the car, things have advanced to the point where things are better and worse.
I don’t have a full answer for you, but one of the subjective aspects of software that I find gets worse over time is that I feel that there is a “greater distance” between myself, the software, and my data.
For example, if I want to count how many pdf files I have in my home directory on my home computer, it’s a simple incantation: find ~ -name '*.pdf' | wc -l. It’s easy, fast, and it uses the great capability of Unix pipelines to pass data through multiple filters.
If I wanted to do the same, but for files in an S3 bucket, I need to use the web UI and hope that I can do a search or filter and that the UI will show how many objects matched. I could also try to use the awscli tool to have a stream of bytes that I can pipe into grep | wc -l to make the task more Unixy. In either case, I can’t start obtaining the information until I’ve authenticated with MFA.
In this example, going from “I have a question” to “I have an answer” is much faster and much less of a hassle in the desktop scenario than in the web service scenario, because the “distance” between myself and my data is much shorter. If the S3 approach is annoying enough, I might avoid finding answers to my questions.
In the past 10-15 years, as software has migrated from the desktop to the web, our ability to manipulate our data quickly, easily, and flexibly has diminished greatly. Even on our phones, managing simple files and their content can feel out of reach. And unfortunately, it seems that more of our daily experience must be done with software where it feels like you are not the owner of your data, but just a customer who can do with his data what the SaaS provider has deemed you can do.
I can’t start obtaining the information until I’ve authenticated with MFA.
Surely you also sign in to the Unix-like OS on your home computer before running find? Likely you use fewer factors of authentication than for AWS, but I guess it’s a difference between two factors and one factor rather than two and zero.
Modern software seems much more difficult to automate. Web/cloud are the worse for this. The default way to repeat anything is to manually perform the action again, and again, and again…
APIs are all very well, but typically take significant programming to use. Oh for an equivalent to “find ~ -name '*.pdf' | wc -l” for normal (or unusual) actions.
If I wanted to do the same, but for files in an S3 bucket, I need to use the web UI and hope that I can do a search or filter and that the UI will show how many objects matched. I could also try to use the awscli tool to have a stream of bytes that I can pipe into grep | wc -l to make the task more Unixy. In either case, I can’t start obtaining the information until I’ve authenticated with MFA.
In my experience, one needs to push a bit for things to become simple again, but it is often possible. Another option in your example is to use s3fs-fuse, which will let you mount a bucket just like any other external storage. Then find ~ -name '*.pdf' | wc -l simply works as it did before.
It is a simple one-time install. You do have to set up the MFA, that is true. But I would argue that security is one of the things that has gotten better over the years.
I guess cuz I kind of divorced myself from most of the cloud stuff. Although I do it at work, most of my personal stuff is just standard documents that I sell post and backup both on the cloud and locally. I do understand that sentiment though. For most normal folk it’s probably true.
It’s the same complaint about bloat, needless complexity, and inefficient fancy new language features, but being an artifact of its time, it complains about software needing 1MB of RAM when back in his days 32KB would suffice.
Mmm maybe but, you are talking with somebody very enthusiastic with AI developments for instance. I’m very interested in new things and new technologies. The complexity I’m referring here, like the one of web frameworks to take an example, or the absurd dependencies chain, has nothing to do with cool new things, it is just “free” complexity, where you do what you could easily do without such a mess.
Popularity has always been self-perpetuating, LLMs don’t make that any different. Before them, there already were fewer fora, fewer Stackoverflow answers and fewer Youtube tutorials for more obscure tools and languages. Many took that into account when picking something new to use or learn.
But I do think that LLMs exaggerate the problem in another way. LLMs have the tendency to confidently paper over the gaps of their knowledge, and the paper they use is that of the well-known solutions. As an example, often when I ask an LLM on how to do something in Kakoune, a modal editor, it will respond (partly) with Vim knowledge, the dominant modal editor. This can be thoroughly confusing, and can be discouraging to venture outside the common paths, which is bad for innovation.
Another example. A while ago, I was brainstorming with an LLM about a solution to a problem I needed to solve. I had an idea and asked if it could validate that by generating examples of the implementation. It cheerfully responded that it was indeed an excellent solution, and then demonstrated that by generating examples of another solution. Apparently, the internet had solved this before and their solution was actually better than mine. So this turned out well for me, but one can easily imagine a case where this behavior hinders the development of novel ideas.
One should be cautious with LLMs that operate on untrusted input like emails. This is waiting to be attacked with prompt injection. For now, the worst that could happen is probably that emails end up in the wrong folder, but one could easily see where this goes. Spammers will try to trick the LLM into letting their email in Inbox, hackers will try to suppress security warnings by moving it to Junk before you can see it, etc.
Enjoy the brief period of silence before the world has caught up.
Once such a tool gains the ability to delete, forward or reply to emails, all bets are off though, and you will get hacked.
One should be cautious with LLMs that operate on untrusted input like emails. This is waiting to be attacked with prompt injection. For now, the worst that could happen is probably that emails end up in the wrong folder, but one could easily see where this goes. Spammers will try to trick the LLM into letting their email in Inbox, hackers will try to suppress security warnings by moving it to Junk before you can see it, etc.
This all applies to any spam filtering, such as SpamAssassin too. Does any of this uniquely impact LLMs? (The classic equivalent to prompt injection might be exploiting a parser bug and then getting RCE to influence other spam filtering, or worse.)
Enjoy the brief period of silence before the world has caught up.
+1. I guess LLM spam filtering will soon be the minimum.
Once such a tool gains the ability to delete, forward or reply to emails, all bets are off though, and you will get hacked.
I see there’s a plausible risk, but it feels a bit much to say it’ll definitely happen. Is there any reason to believe this won’t just be the normal “arms race between security engineers and hackers”? As with most tech, some will get hacked.
This all applies to any spam filtering, such as SpamAssassin too. Does any of this uniquely impact LLMs? (The classic equivalent to prompt injection might be exploiting a parser bug and then getting RCE to influence other spam filtering, or worse.)
I think in its current form, the damage that could be done is the same: misfiled emails. However, what makes LLMs worse is that they are fundamentally flawed and unfixable. A parser bug can be patched. And, sure, there will be a next bug, but eventually the thing will be hardened and safe to use.
Guarding an LLM against prompt injection, on the other hand, is a hopeless task, as there is no way to reliably separate user input from trusted input. This is the reason that projects like Operator from OpenAI come with a hefty disclaimer. The big players still haven’t figured it out. And I don’t think they ever will.
Well, if they ever start caring, I expect they will figure out how to have instructions as a separate out-of-band input. I find it likely that they will manage to do this within a year, using the already accumulated data sets and also the existing models. Although as an economic claim that they won’t start caring, you are probably right.
There are few independent LLM lines, and LLMs are complicated — thus there is a larger risk of a wide-spectrum attack, and bigger interest in finding it. Personally-tuned bayesian spamfilters can be less of a monoculture risk if a parser is either very well polished or written in a non-RCE-friendly language.
I just set my browser default download dir to /tmp and to ask me for every file so it’s an easy opt-out. Simple and effective :)
I still have a ~/Downloads dir for files I opt-in to saving but don’t want to put in a tidier place.
Re the thread: I don’t have anything I do manually every day. Maybe refreshing lobsters counts?
I need to do that, but I tend to want things from 6 months or a year ago out of it sometimes, so my Downloads folder is an ever-increasing list that I only occasionally go erase big files from(sort by largest, delete stuff I know I don’t care about anymore).
Of course I also dump screenshots and every other thing that would have historically gone in /tmp there too.
Was in exactly in that position, and than I added ~/tmp (just a folder in home) dir for screenshots, random git clones and other waste. That way, Downloads only stores stuff from the internet, so it’s always safe to wipe, as, worst-case, I’ll just re-download.
I configured it so that my desktop shows my Home folder, not a dedicated Desktop folder, and let everything save there (downloads, screenshots, etc.)
This forces me to clean up old stuff because I don’t like noisy desktops. It is like the zero inbox policy some people use for email, powered by my irritation.
Everything for me goes into ~/Downloads: movies, DMGs, PDFs, ePubs, images. Once in a while I’ll clean up a category which has a destination somewhere else on the file system (most things these days) or delete them.
That’s what I’ve been doing, except I’m lazy and what ends up happening is I wait until I get a low space warning, then I sort by largest and delete a bunch until I’m bored or I have enough space to do what I need doing at the moment.
So I’m going to try they ~/tmp/ and put everything except downloads there, and then rotate ~/Downloads like matklad, see how that works out.
Same, I would very much like the text to stay crisp. I can get less contrast by dimming my screen, whereas increasing brightness can’t recover contrast that isn’t there.
It still exists on LCDs… but not everyone can afford to take the approach I took for brute-forcing around various multi-monitor Linux gaming annoyances by just moving the games to a whole other PC and adding a KVM switch.
(No, I’m not crazy… just disinterested enough in AAA games that a hand-me-down PC from 2012 upgraded with a few other hand-me-down parts will suffice.)
If you change the monitor’s contrast for text, you also change it for image editing and video playback and so on.
Not the main point, but fading out the comments like that is IMO the worst thing you can do in your colour scheme. Jim Fisher argues it more completely, but nothing is more frustrating than a colleague completely failing to see exactly the information they needed, which was right there next to the very line of code they were editing, because of washed out colours.
Sometimes I’ve been so confused about why someone wasn’t finding the information they needed in a module that I knew had a long explanatory comment, until I watched them (when screen sharing) immediately and repeatedly scroll past every wall of light grey text - carefully constructed comments that their colour theme was telling them were irrelevant.
That is not “comments that their colour theme was telling them were irrelevant”, that is just a mistake from that colleague. If you are in a situation that you are actually looking for information, and you can’t see it when it is right in front of you, then it is simply time to take a break. Unless the comments were in the same colour as the background, your colleague could still easily see that something was there.
I believe the associations Jim Fisher writes about are highly subjective, and personally, I have very different ones than he describes.
Comments are not washed out because they are less important, but because they are something different than code. They are more like footnotes. When reading code, I want to follow the trail that the computer takes when executing the statements, so I wish to be able to easily follow those. Comments that are formatted as shown in the article would be likes obstacles that I need to jump over, before I can continue to follow the trail.
Remember, code tells what happens, comments tell why it happens. If at some point I get interested in the why, I “readjust” my focus to get the comments into view.
As for the red and green coloured diffs. I really don’t feel them as value judgements. I think it is just what you are accustomed to. They could be yellow and blue, for all I care.
Edit: maybe an editor should just have two different colour schemes, tuned for the work you are doing. One for detailed inspection of what the computer does, and another for a broader, more holistic view of what the code is about.
carefully constructed comments that their colour theme was telling them were irrelevant
Perhaps they scrolled past because they’ve read too many articles declaring that comments are always out-of-date, always a failure, always bad, etc. ;)
I don’t like gray comments either, though more because I spend a lot of time editing them, sometimes screenfuls of them. If I were editing docs in a Markdown file I’d never make the main text color light gray.
I’m less convinced by the article’s other argument, against green insertions and red deletions. If you think deleting code is good, your brain will quickly adjust to that context. When you buy puts you don’t need to physically or mentally invert the chart colors, you just want to see the other color.
With a machine this affordable, you sort of reprogram yourself to live within its boundaries.
I think this belies the post’s title (“best laptop ever”). The author is clearly happy with his Air, but someone with a tighter budget and slightly different requirements could have written the same post about a even cheaper $500 Windows laptop.
I don’t know, the M-series are just nothing like previous laptops, which were often insanely overpriced for the terrible specs they came with - so I would still probably go for a (possibly used) M1 over.. pretty much anything else, even on a semi-tight budget.
I don’t know about the M-series, never used one, but the article says they use it for:
Granted, I only use it for light web development, browsing, emails, and occasionally running a small Docker container
You can do that with just about any laptop that was released in the last five years, provided you have enough memory. I regularly use a ten-year-old Dell that works just fine. The only trick is that it has 16Gb of RAM.
Assuming you want to stay constantly plugged in. I’ve got two Dell XPS 13s (one 2017, one 2020) and they both have atrocious battery life. And the 2020 one was barely used for 4 years so the battery is still very healthy.
I’ve got an XPS 13 (9300, from 2020) that has a pretty good battery, but I wasn’t expecting that. PC laptop batteries feel like a lottery, especially as they age. ThinkPads in my experience have been the worst for battery health, with massive capacity drop-off in a few years. Meanwhile, my 12 year old MacBook has pretty reasonable battery health considering its age. (It’s not just an Apple thing too; I’m told Panasonic Let’s Notes are also really good with aging batteries.)
Transforming government services isn’t as easy as the tech bros and billionaires make it out to be.
But it often is simple. Don’t trust the people who are paid to “solve” a problem when they it is unsolvable. Organizations created to solve an issue may make it worse because their existence depends on the issues continued existence. I’d rather trust an outsider with a proven track record of simplifying processes and a general trail of success rather than someone whose employment and/or empire may well rely on a problem appearing unsolvable.
The worst thing that could happen to governments and politicians is people realizing we don’t really need them for most things where they have inserted themselves into our lives, so they keep certain problems going and unsolved, and sometimes actually cause a crisis and sell themselves as the solution.
Rest of the article was fine, idk why this obvious and absolutely misplaced shade throwing at the beginning was necessary.
Heh, yeah, I had the same thought. I worked at a startup that:
Had excellent staff who understood fast delivery and had an excellent track record of delivering excellent results in large bureaucratic organizations
Had direct access to the Digital Transformation group in our local state-level government as Paying Customer #1. They had a mandate to deliver solutions quickly and a team that was quite excellent. We had a fantastic working relationship with them.
Was continuously dragged down by other “stakeholders” within the government that, in my opinion, in many cases had very little business being involved but managed to convince the appropriate people that our projects were in-scope for them and that they needed to sign off on everything before each deployment.
One of my “favourite” moments was when the Director of Central IT raised a “security issue” very late in the process for a given deployment. This was in a very large stakeholder meeting.
Director: “Me and my staff have some significant concerns about the security of the deployment you’re about to do.”
Me: “Oh! Could you elaborate on that?”
Director: “I’ll follow up with an email. We will be voting to block the deployment until these concerns are addressed.”
After not getting any follow-up for a few days I started chasing him (because we were currently blocked from deployment…) and finally he responds with a PDF called “McAfee Top 10 Security Vulnerabilities in Web Applications”
Me: “I’ve reviewed the document you sent me and, from my point of view, we’re not deficient on any of those 10 items. Is there a specific item you’re concerned about that we could address for you?”
Director: “If you can’t see where your deficiencies are based on that document I’m not sure I can help you.”
In the end? I had to put him on the spot in another stakeholder meeting a few weeks later and get him to say “there are no specific security concerns we have at this time” after he, several times, raised these vague concerns as a blocker but couldn’t articulate any specifics. Deployment delayed for weeks over nothing.
I’ve been involved in quite a few government modernization projects. I’ve also run into people with similar dispositions as this security director. In my experience these interactions have been a cultural mismatch rather than the rent seeking behavior described by the post you’re responding to. There has been a culture in most government organizations, particularly around IT security and other risk categories, that revolves around checklists. They will have concerns unless you’ve come to the review meeting with some positive evidence of your diligence in detecting errors in a bunch of different areas. So his concerns weren’t so much that his staff identified specific vulnerabilities but that you’ve asserted you’re ready to deploy without furnishing evidence to support the claim. It all comes down to whether the leader of their department has something to provide to legislative committees, IGs, and other oversight groups to prove they weren’t negligent if/when something goes awry. I’ve had quite a bit of success helping these folks get comfortable with how modern development practices embed a lot of their checklists in the earlier stages of development.
If this happened in India I’d assume the person was looking for a bribe. The culture is that people get on committees with the mindset of feudal guards: they’re there not as guardians of the process but as extortionists. Society suffers.
I have heard the same sort of stories from people trying to sell products and services to any large organization, whether in the public or private sector. It sticks out because for public organizations they’re wasting the public’s time and money, instead of the shareholder’s.
IME, engineers rarely say a problem like this is “unsolvable”. It almost always boils down to people in charge not wanting to pay to do it the right way.
Don’t trust the people who are paid to “solve” a problem when they it is unsolvable. Organizations created to solve an issue may make it worse because their existence depends on the issues continued existence. I’d rather trust an outsider with a proven track record of simplifying processes and a general trail of success rather than someone whose employment and/or empire may well rely on a problem appearing unsolvable.
How would that even work? This outsider also depends on the issues for their existence, so the same incentives apply.
Outsiders can bring a fresh perspective, insiders have intimate domain knowledge that is not easily replicated. Both are valuable, and that has nothing to do with governments per se.
Thanks for writing this. There’s a severe shortage of this kind of concrete, detailed example in discussions about how people are using these tools.
One thing that strikes me about the examples in the article is that they’re all small, self-contained, greenfield projects with no existing coding conventions or architecture. I’m far from the first to observe this, but my hunch is that one of the reasons I haven’t had as much success with these tools is that very little of my work fits that description. The vast majority of my day-to-day coding involves modifying an existing code base with a lot of moving parts whose behaviors are mostly governed by a messy pile of domain-specific business rules that have accumulated for years.
My experience with various tools (disclaimer: I have only this week started to try out Claude Code, so maybe it’ll be better) is that they can be okay at following low-level technical requirements, but they struggle with higher-level concepts and make lots of fundamental mistakes based on not having a big-picture understanding of how all the pieces fit together.
They’ll know that the data model has entity types A, B, and C, but won’t know that an instance of C can only exist if there’s a corresponding B. So they’ll write code that creates a new C in isolation. The agentic tools will then spin their wheels trying over and over again to fix the fact that the “create C” function is bombing out, but they’ll never notice the problem is actually the lack of a B. I then have to spend time iteratively nudging them in the right direction and it quickly gets into “would have been faster to just write the code myself” territory.
On some level, I suspect that this is a documentation failure. It takes a new engineer on my team a bit of time to come up to speed, and they spend much of that time pairing with or picking the brains of the existing team members to learn things they couldn’t find in our internal docs. Some of the things that take new team members the most time to learn are the same ones LLMs struggle with. Maybe if our docs were better and talked in depth about those things and were kept up to date, both LLMs and new engineers would struggle less.
However, to the tools’ credit, I’ll say my team has seen much more impressive results in UI code. Our web app is fairly standard React/Redux code and LLM tools seem to cope with it a lot better. I don’t work on UI code too much, but I know our frontend engineers have been saving time asking tools to write new components for them and such. Even for them, it’s not a “3x more productive” level of gain, but it seems significant.
A big issue with these tools is that they are supposedly “intelligent” and therefore should be capable of doing everything. Products like Claude Code, Cursor, etc. reinforce that impression. But it is the wrong way to go about it. They are not intelligent, and the task quickly becomes too big for them.
An LLM is a tool, like a drill or a screwdriver. But what they try to sell you is a machine that can hang a painting on your wall. That machine will inevitably make a mess in any situation that is not ideal, but drills and screwdrivers are quite useful.
My number one use of LLMs with coding is this: write a
// TODOcomment, pipe the function (or the file) to a local, not brilliant LLM, and replace it with the response. That’s it.This is a wonderful idea. Apart from the nostalgia factor, I think newsreaders are much better equipped to deal with big streams of information than any algorithmic time.
Yes and no. After making and using Illuminant for a while, I feel like I want to read everything, which is not the usual way to consume “Twitter-likes”, so I can’t follow too many people, or the newsreader-interface feels overwhelming.
It might just be me, though, back when Twitter existed I also wanted to read everything, and was annoyed by the interface, having to remember how far to (doom)scroll down.
It’s no longer safe to have anything in US clouds. But we in Europe have been massively lazy in some aspects.
Can you elaborate the “laziness” part? I have a very narrow view of EU, but I read somewhere that EU does not encourage risk taking like the US does, so most large successful tech companies are American and not European.
I wouldn’t be sure that the difference is political. For instance, https://www.heritage.org/index/pages/all-country-scores shows that 14 countries in Europe are more free than USA in terms of economic freedom.
But we sure are more conservative in technology. We let others do the pioneering work and then just pick established technologies, instead of trying or building new stuff. We are scared of being wrong and of failing.
There are exceptions of course, but they are too rare.
Thanks for the Freedom Index, useful data.
Though I think the difference may be more cultural - in general EU has a low risk taking culture (scared of being wrong and failing), which possibly also means the society in general frowns upon those who fail. Unlike in the US, where every failure (that does not ruin you financially) is celebrated and risk taking is rewarded.
European culture is different, but from what I understand, there are also very practical reasons that we don’t have a VC culture like the US. For all the talk about harmonization, single market, etc., I was surprised to learn that it is apparently not possible that a VC in one EU country can fund a startup in another one. Or at least not without jumping through countless bureaucratic hoops. Effectively, each country needs to take on Silicon Valley on its own.
It’s just as safe as it’s always been.
No,. there’s a lot of policy discretion. The US government has access to any data stored in the US belonging to non-US persons without basic due process like search warrants. The data they choose to access is a policy question. The people being installed in US security agencies have strong connections to global far right movements.
In 2004 servers operated by Rackspace in the UK on behalf of Indymedia were handed over to the American authorities with no consideration of the legal situation in the jurisdiction where they were physically located.
/Any/ organisation- governmental or otherwise- that exposes themselves to that kind of risk needs to be put out of business.
I seem to remember an incident where instapaper went offline. The FBI raided a data centre and took a blade machine offline containing blade servers they had warrants for, and instapapers, which they didn’t. So accidents happen.
Link: https://blog.instapaper.com/post/6830514157
Yes, but in that case the server was in an American-owned datacenter physically located in America (Virginia), where it was within the jurisdiction of the FBI.
That is hardly the same as a server in an American-owned datacenter physically located in the UK, where it was not within the jurisdiction of the FBI.
Having worked for an American “multinational” I can see how that sort of thing can happen: a chain of managers unversed in the law assumes it is doing “the right thing”. Which makes it even more important that customers consider both the actual legal situation and the cost of that sort of foulup when choosing a datacenter.
The FBI has offices around the world.
https://www.fbi.gov/contact-us/international-offices
Serious question, who’s putting data in
us-westetc when there is eu data centres? And does that free rein over data extend to data in European data centres? I was under the impression that safe harbour regs protected it? But it’s been years since I had to know about this kind of stuff and it’s now foggy.It does not matter where the data is stored. Using EU datacenters will help latency if that is where your users are, but it will not protect you from warrants. The author digs into this in this post, but unfortunately, it is in Dutch: https://berthub.eu/articles/posts/servers-in-de-eu-eigen-sleutels-helpt-het/
I re-read the English article a bit better and see he addresses it with sources and linked articles. Saturday morning, what can I say.
A lot of non-EU companies. Seems like a weird question, not everyone is either US or EU. Almost every Latin American company I’ve worked for uses us-east/west, even if it has no US customers. It’s just way cheaper than LATAM data centers and has better latency than EU.
Obviously the world isn’t just US/EU, I appreciate that. This article is dealing with the trade agreements concerning EU/US data protection though so take my comment in that perspective.
I don’t see how this is at odds with the parent comment?
That is the one good thing. It has always been unsafe, but now people are finally starting to understand that.
Because it’s dramatically less safe. Everyone saying “it’s the same as before” has no clue what is happening in the US government right now.
And everyone saying it’s dramatically different has no clue what has happened in the US government in the past.
I haven’t personally made up my mind on this, but one piece of evidence in the “it’s dramatically different (in a bad way)” side of things would be the usage of unvetted DOGE staffers with IRS data. That to me seems to indicate that the situation is worse than before.
yeah could be
You’re incorrect. The US has never had a government that openly seeks to harm its own allies.
What do you mean? Take Operation Desert Storm. Or the early Cold War.
Not sure what you mean—Operational Desert Storm and the Cold War weren’t initiated by the US nor were Iraq and the USSR allies in the sense that the US is allied with Western Europe, Canada, etc (yes, the US supported the USSR against Nazi Germany and Iraq against Islamist Iran, but everyone understood those alliances were temporary—the US didn’t enter into a mutual defense pact with Iraq or USSR, for example).
they absolutely 100% were initiated by the US. yes the existence of a mutual defense pact is notable, as is its continued existence despite the US “seeking to harm” its treaty partners. it sounds like our differing perceptions of whether the present moment is “dramatically different” come down to differences in historical understanding, the discussion of which would undoubtedly be pruned by pushcx.
My gut feeling says that you’re right, but actually I think practically nobody knows whether you are or not. To take one example, it’s not clear whether the US government is going to crash its own banking system: https://www.crisesnotes.com/how-can-we-know-if-government-payments-stop-an-exploratory-analysis-of-banking-system-warning-signs/ . The US governmant has done plenty of things that BAD before but it doesn’t often do anything that STRANGE. I think.
the reply was to me
Oh, yeah. Clearly I’m bad at parsing indentation on mobile.
Just because it was not safe before, doesn’t mean it cannot be (alarmingly) less safe now.
And just because it logically can be less safe now doesn’t mean it is.
It is not. Not anymore. But I don’t want to get into political debate here.
I suspect parent meant it has never been safe
This isn’t true, as the US has been the steward of the Internet and its administration has turned hostile towards US’s allies.
In truth, Europe already had a wake-up call with Snowden’s revelations, the US government spying on non-US citizens with impunity, by coercing private US companies to do it. And I remember the Obama administration claiming that “non-US citizens have no rights”.
But that was about privacy, whereas this time we’re talking about a far right administration that seems to be on a war path with US’s allies. The world today is not the same as it was 10 years ago.
hm, you have a good point. I was wondering why now it would be different but “privacy” has always been too vague a concept for most people to grasp/care about. But an unpredictable foreign government which is actively cutting ties with everyone and reneging on many of its promises with (former?) allies might be a bigger warning sign to companies and governments world wide.
I mean, nobody in their right mind would host stuff pertaining to EU citizens in, say, Russia or China.
Which is to say: its not safe at all and never has been a good idea.
I would like to represent the delegation of broke people in their 20s whose tech salaries are efficiently absorbed by their student loans:
You don’t need a smart bed. My mattress cost $200 and my bedframe cost <50. I sleep fine. I know as people age they need more back support but you do NOT need this. $250 worth of bed is FINE. You will survive!!
I’m not sure I agree. Like if you are living paycheck-to-paycheck then yeah, probably don’t drop $2k on a mattress. But I believe pretty strongly in spending good money on things you use every day.
The way it was explained to me that aligned with my frugal-by-nature mindset was basically an amortization argument. You (hopefully) use a bed every single day. So even if you only keep your bed for a single year (maybe these newfangled cloud-powered beds will also have planned obsolescence built-in, but the beds I know of should last at least a decade), that’s like 5 bucks a day. Which is like, a coffee or something in this economy. I know my colleagues and I will sometimes take an extra coffee break some days, which could be a get up and walk break instead.
You might be young now, but in your situation I would rather save for my old age than borrow against my youth. And for what it’s worth I have friends in their 20s with back problems.
(of course, do your own research to figure out what sort of benefits a mattress will give to your sleep, back, etc. my point is more that even if the perceived benefits feel minimal, so too do the costs when you consider the usage you get)
Mattresses are known to have a rather high markup, and the salesmen have honed the arguments you just re-iterated to perfection. There are plenty of items I’ve used nearly daily for a decade or more. Cutlery, pots, my wallet, certain bags, my bike, etc. None of them cost anywhere near $2000. Yes, amortized on a daily basis, their cost comes to pennies, which is why life is affordable.
Yes, there are bad mattresses that will exacerbate bad sleep and back problems. I’ve slept on some of them. When you have one of those, you’ll feel it. If you wake up rested, without pains or muscle aches in the morning, you’re fine.
I too lament that there are things we buy which have unreasonable markups, possibly without any benefits from the markups at all. I guess my point is more that I believe – for the important things in life – erring on the side of “too much” is fine. I personally have not been grifted by a $2k temperature-controlled mattress, but if it legitimately helped my sleep I wouldn’t feel bad about the spend. So long as I’m not buying one every month.
I think one point you’re glossing over is that sometimes you have to pay an ignorance tax. I know about PCs, so I can tell you that the RGB tower with gaming branding plastered all over it is a grift [1]. And I know enough about the purpose my kitchen knife serves to know that while it looks cool, the most that the $1k chef’s knife could get me is faster and more cleanly cut veggies [2].
You sound confident in your understanding of mattresses, and that’s a confidence I don’t know if I share. But if I think of a field I am confident in, like buying PCs, I would rather end the guy who buys the overly marked-up PC that works well for him than the one who walks a way with a steal that doesn’t meet his needs. Obviously we want to always live in the sweet spot of matching spend to necessity, but I don’t know if it’s always so easy.
[1] except for when companies are unloading their old stock and it’s actually cheap.
[2] but maybe, amortized, that is worth it to you. I won’t pretend to always be making the right decisions.
Note, because it’s not super obvious from the article: the $2k (or up to about 5k EUR for the newest version) is only the temperature-control, the mattress is extra.
All that said: having suffered from severe sleep issues for a stretch of years, I can totally understand how any amount of thousands feels like a steal to make them go away.
One of the big virtues of the age of the internet is that you can pay your ignorance tax with a few hours of research.
In any case, framing it as ‘$5 a day’ doesn’t make it seem like a lot until you calculate your daily take-home pay. For most people, $5 is like 10% of their daily income. You can probably afford being ignorant about a few purchases, but not about all of them.
Maybe I would have agreed with you five years ago, but I don’t feel the same way today. Even for simple factual things I feel like the amount of misinformation and slop has gone up, much less things for which we don’t have straight answers.
Your point is valid. I agree that we can’t 5-bucks-of-coffee-a-day away every purchase we make. Hopefully the ignorance tax we pay is much less than 10% of our daily income.
I think smart features and good quality are completely separate issues. When I was young, I also had a cheap bed, cheap keyboard, cheap desk, cheap chair, etc. Now that I’m older, I kinda regret that I didn’t get better stuff at a younger age (though I couldn’t really afford it, junior/medior Dutch/German IT jobs don’t pay that well + also a sizable student loan). More ergonomic is better long-term and generally more expensive.
Smart features on the other hand, are totally useless. But unfortunately, they go together a bit. E.g. a lot of good Miele washing machines (which do last longer if you look at statistics of repair shops) or things like the non-basic Oral-B toothbrushes have Bluetooth smart features. We just ignore them, but I’d rather have these otherwise good products without she smart crap.
Also, while I’m on a soapbox – Smart TVs are the worst thing to happen. I have my own streaming box, thank you. Many of them make screenshots to spy on you (the samba.tv crap, etc).
Yes, absolutely! Although it would be cool to be able to run a mainline kernel and some sort of Kodi, cutting all the crap…
I guess you never experienced a period with serious insomnia. It can make you desperate. Your whole life falls in to shambles, you’ll become a complete wreck, and you can’t resolve the problem while everybody else around seems to be able to just go to bed, close their eyes and sleep.
There is so much more to sleep than whether your mattress can support your back. While I don’t think I would ever buy such a ludicrous product, I have sympathy for the people who try this out of sheer desperation. At the end of the day, having Jeff Bezos in your bed and some sleep is actually better than having no sleep at all.
You make some good points why this kind of product shouldn’t exist and anything but a standard mattress should be a matter of medical professionals and sleep studies. When people are delirious from a lack of sleep and desperate, these options shouldn’t be there to take advantage of them. I’m surprised at the crazy number of mattress stores out there in the age of really-very-good sub-$1,000 mattresses you can have delivered to your door. I think we could do more to protect people from their worn out selves.
None of the old people in my family feel the need for an internet connected bed (that stops working during an internet or power outage). Also, I imagine that knowing you are being spied on in your sleep by some creepy amoral tech company does not improve sleep quality.
I do know that creepy amoral tech companies collect tons of personal data so that they can monetize it on the market (grey or otherwise). Knowing that you didn’t use your bed last night would be valuable information for some grey market data consumers I imagine. This seems like a ripe opportunity for organized crime to coordinate house breakins using an app.
I believe the people who buy this want to basically experience the most technological “advanced” thing they can pay for. They don’t “need” it. It’s more about the experience and the bragging rights, but I could be wrong.
I’m sorry to somewhat disagree. The reason I would buy this (not at that price tag, I had actually looked into this product) is because I am a wildly hot person/sleeper. I have just a flat sheet on and I am still sweating. I have ceiling fans running additional fans added. This is not only about the experience unless a good night sleep is now considered “an experience”. I legitimately wear shorts even in up to a foot of snow.
As the article says, you can get the same cooling effect with an aquarium chiller for that purpose. You don’t need a cloud-only bed cooler.
Ouch… Please do not follow this piece of advice. A lot of cheap mattresses contain “cancer dust”[1] that you just breath in when you sleep. You most likely don’t want to buy the most expensive mattress either, because many of the very expensive mattresses are just cheap mattresses made overseas with expensive marketing.
The best thing to do is to look at your independent consumer test results for your local market. (In Germany where I live it’s “Stiftugn Warentest” and in France where I’m from it’s “60 millions de consommateurs (fr).” I don’t know what it is in the US.)
A good mattress is not expensive, but it’s not cheap either. I spend 8 hours sleeping on this every day, I don’t want to cheap out.
[1] I don’t mean literal cancer dust. It’s usually just foam dust created when the mattress foam was cut, or when it rubs against the cover. People jokingly call it “cancer dust”
source?
https://www.everydayhealth.com/healthy-home/does-your-mattress-contain-fiberglass-how-to-know-and-why-its-dangerous/
wait… is it carcinogenic? Now I’m concerned lol
I wouldn’t know. Because it depends on what the “dust” is. It just lead most reviewer to say “this can’t be healthy”
This article claims that it just lead to lung irritation. But again, I’m just paranoid, with asbestos we started having concerns way too late.
If used in certain ways, sure.
To be fair, there are people who also copy-pasta from Stack Overflow.
If you use your tools mindfully, actively seek understanding, and pay attention to self-improvement, language models can serve a useful role.
Agreed. One of the great features of LLMs is that you can ask it to elaborate, clarify, give examples, etc. If you like, you can learn much more from an LLM than from a single StackOverflow answer. The author mentions a particular detailed answer. But now every answer can be like that.
As always, some people put in the work, and others don’t.
I don’t have a great deal of experience with LLMs, but typically as soon as I ask them more than two follow-up questions, they begin hallucinating.
I was recently working with go-git and decided to ask ChatGPT how I could upload packs via SSH as the documentation didn’t make it seem immediately obvioys, it kept trying to use nonexistent HTTP transport functions over SSH even though I explicitly provided the entire godoc documentation for the SSH transports and packfiles. Granted, the documentation was lacking, but all I needed to do at last was to thoroughly digest the documentation, which ChatGPT is evidently unable to do. In another scenario, it also suggested ridiculous things like “Yes, you can use
sendfile()for zero-copy transfers between a pipe and a socket”.Anyhow, at least for the fields that I encounter, ChatGPT is way worse than asking on SO or just asking in an IRC channel.
Unfortunately, one needs to develop some kind of intuition of what an LLM is capable of. And with that, I don’t just mean LLMs in general and the types of questions they can handle, but also the different models and the way you feed them your documentation.
Some are better at technical questions than others, some are better at processing large text input. I prefer to use local models, but they can’t handle long conversations. I almost automatically start a new one after two follow-ups because I know Qwen will get confused. On the other hand, I know that if I were to use Gemini on NotebookLM, I could throw a whole book in it, and it would find the right part without breaking a sweat.
Using the right LLM-equipped tool is just as an important choice as the model itself. For understanding codebases, aider is the best in my experience. It adds the information git has about your project automatically to the context. For more general learning and research, I like to create Knowledge Stacks in Msty.
As a test, I cloned the
go-gitrepository and asked aider your question. It pointed me to this file: https://github.com/go-git/go-git/blob/main/plumbing/transport/ssh/upload_pack_test.go and then proceeded to write come code.I am not familiar with go-git, but would that have been a helpful answer to you?
It’s much closer than what I got, but the actual solution is slightly more complex. (Though I stopped using go-git for remote operations due to bugs)
Fair point. If I’m ever slightly in doubt about its competence, I push back on LLMs vigorously. I include in my pre-instructions that I’m not looking for a “yes man”. (At the same time, I don’t want pedantic disagreement. Ah, the complexity of it all!)
How does a company get so backwards? They went with utmost haste from leading to following. Where once I thought the sky was the limit for this product, now I just think they’re stuck on the exact same plateau as everyone else.
I think they have some definite polish on aspects of their editor, but I was (and still am) put off by the funding model for the editor. Good quality engineering isn’t free and I remain unconvinced that hockey stick growth style models are practical for guiding reasonable product development.
Another thing that gets me is marrying a reasonably fast process (the editor’s insertion speed and search speed) with a much slower process (LLM inference).
According to the Minimizing Latency: Serving The Model section, they are sending the text to online services to get predictions? Apart from the privacy concerns, I wonder who is paying for the GPU cost, and how does that factor in their business model.
FWIW I should be a little more precise and say “either LLM inference or network requests” since there’s clearly capacity to send over the net. Both seem slower (and more variable) than local editor business.
What’s backwards here? The presence of LLM at all?
I’ve come to consider this kind of LLM tab completion essential. It’s the only piece of LLM software I use and I find it saves me a lot of time, at least the implementation in Cursor. It often feels like having automatic vim macros over the semantics of the code rather than the syntax of the code. Like if I’m refactoring a few similar functions, I do the first one and then magically I can just press tab a few times to apply the spirit of the same refactor to the rest of the functions in the file.
My question is: why is that good? “Magical” is one of those words in programming that usually means something has gone horribly wrong.
Don’t get me wrong: I want my tool to make it easy to make mechanical changes that touch a bunch of code. I just don’t want the process to do it to be a magical heuristic.
You’re the one saying that the company is doing something backwards, I think it’s on you to justify that when asked, not to come back with a question tbh.
Statements like these are just dogma/ rhetoric. Words like “magical” are just like “simplicity” or “ugly”, they mean something different to everyone.
Why not? What if it’s a problem best suited by heuristics?
I’m in a similar boat.
I’m less bullish on having the LLM do a large-scale refactoring than I am on using an LLM to generate a codemod that I can use to do the large-scale refactoring in a deterministic fashion.
But for small-scale changes—I wouldn’t even necessarily call them “refactorings”—like adding a new field to a struct and then threading that all way through, I’ve found that our edit predictions can cut down on a lot of the mundanity of a change like that.
The big question is: what environment does that codemod target?
For a system like this to work well there has to be a consistent high-level way of defining transformations that many people will use and write about so that models will understand it well. For that to happen you need an abstraction over the idea of a syntax node.
can the LSP protocol married somehow to tree-sitter be the answer here?
Tree sitter is far closer to being the answer than LSP is
My ideal interaction would be something like, “an LLM writes a script that modifies code and I decide whether I want to run that script”.
I’m not sure about Zed but you can do that with Cursor. The tab model is very small and fast, but Cursor has a few options ranging from “implicit inline completion suggestions with tab” to “long form agent instructions and review loop” similar to what you describe - you ask it to do stuff in a chat like interface, it proposes diffs, you can accept the diffs or request adjustments. But, I find explicitly talking to the AI much slower and more flow interrupting compared to tab completion.
I do use a mode that’s in between the two where I can select some text, press cmd-k, describe the edit and it will propose the diff inline with the document. Usually my prompt is very terse, like “fix”, “add tests”, “implement interface”, “use X instead of Y”, “handle remaining cases” that sort of thing.
I use plenty of heuristics in my editor already, like I appreciate fuzzy-file-find remembering my most opened files and up-weighting them, same with LSP suggestions and auto-imports. The AI tab completion experience is a more magical layer on top, but after using it for about an hour it starts to feel just like regular tab completion that provides “insert function name”, it’s just providing more possible edits. Another time saver I appreciate is when it suggests an edit to balance some parenthesis/braces for a long nested structure that I’m struggling to wrangle in my own.
These days, my favorite use of LLMs is to write a
// TODOcomment at the appropriate place, send a snippet with the lines to be changed to the LLM, and replace the selection with the response. With the right default prompt, this works really well with the pipe command in editors like Neovim, Kakoune, etc. and a command line client like llm, or smartcat.The place I miss an LLM the most is in my shell. I’d love to be able to fall back to llm to construct a pipeline rather than needing to read 6 different manpages and iterate through trial and error. Do you have a setup for ZSH/bash/etc that’s lightweight? I haven’t seen anything inspiring in this area yet outside proprietary terminal emulators (I’m not interested)
I’m spoiled because I can’t do anything like that. I’m inventing a genuinely new technology, so I always have to think for myself because there’s no one to follow or imitate. I’m sure it sounds weird to hear me be excited about building my internal model for where changed requirements will manifest as need for changed code, but my mental model of that is razor sharp, and thinking about where changes are needed myself gives me leave to think about whether my code is expressive enough and has strong architecture.
But yeah, I know I’m the weird one. I’m the kid that retyped the red-underlined word instead of right clicking to correct spelling, the idea being that I wanted learn how to spell and spot/correct spelling mistakes instead of the machine.
Once the diff gets big enough it starts to have problem of its own. How will you know if it’s all correct without redoing all the work? What if the diff is stale by the time it is reviewed and approved? Generating a script instead of a diff solves those problems, and incidentally has another property that I prize very highly: it is just as useful to humans as it is to LLMs. Once you can define large changes as small scripts typing will no longer be the odious part of making changes that touch a lot of code.
when “ollama” uses the model id “mistral-small:24b”, what exactly is the model and what quantization does it use? Does it use a single GPU or does it use both?
apparently a single or fp16/bf16 or even the 8bit 70B is not gonna fit in 2x24GB, what exactly is used here and how?
mistral-small:24bpoints to24b-instruct-2501-q4_K_M, so definitely not full precision. It is only 14GB and fits well on one card. You can find the available versions here: https://ollama.com/library/mistral-small/tagsI have not tried the other versions yet.
I love going back to basics and cutting away as much cruft as possible, but blogs in raw HTML or text generally don’t have an RSS feed, probably because it requires more work to maintain. As a result, I inevitably visit them only once and then lose track of them.
That’s the first time someone mention RSS feed when I share something about my small txt “blog”, thank you for your comment.
I need to think a bit about it before actually solving you problem, but I think that the bash script I use to create new txt files (with BOM) might be extended to create/update a rss feed.
Bit of a self-promo I guess but I wrote a tool to deal with hand-managed RSS feeds. I didn’t end up using it for much, but that’s a symptom of me not blogging much more than anything. If it works for other folks, I’d probably be up for spending a couple cycles updating it. https://codeberg.org/klardotsh/kaboom
It turns out that maintaining an RSS feed by hand is exactly as much work as linking to the pages in an index.
(I also blog in raw html on https://orib.dev, but I don’t make much noise about it; it’s just the least fussy way for me to type shit out.)
That’s the beauty of a static site generator. You get nice formatting, maybe also RSS feeds, with minimal per-post effort. (e.g. write in maybe Markdown). That said, been a while since I’ve added a post myself. ;)
$1700 is quite a large budget. If the total cost were halved, that would still be a sizeable budget. I feel like tech writers these days are forgetting what the phrase “on a budget” implies.
I agree, it is a big sum of money. I interpret “on a budget” as “relatively cheap”, not as “nearly free”. I think it is pretty cheap compared to what one normally needs to pay for that amount of VRAM. To me, the term is more justified here than in a post where someone buys a second-hand Apple laptop for $1100 and claims it is a cheap solution to browse the web.
I really hope AMD catches up and prices come down, because AI-capable hardware is not nearly as accessible as it should be.
Maybe it would be more accurate to say “on a budget” is a form of weasel word. Its interpretation depends on your familiarity with current prices and your socioeconomic status.
From my very subjective (and probably outdated) PoV…
You can imagine the surprise I felt (or was that shame?) when seeing a $1700 price tag on a “budget” desktop PC that can do AI.
“Building a personal, private AI computer for $1700” would communicate the intent a little better, without suggesting to the reader anything about their ability to afford it.
Btw, I don’t mean to imply any wrong was committed. I’m just pointing out that the wording on the post had some unintended effects on at least this reader. To a large degree that is unavoidable, no matter what a person publishes on the web.
A few weeks ago I saw a reference to someone on ex-Twitter speccing an LLM workstation for $6,000, so $1,700 is a on a budget compared to that.
They get 5 tokens per second on the 70b models
Agreed. I’m running local models with an RTX 3060 12 GB that costs about $330 on NewEgg or 320€ new at ebay.de, and it’s actually useful. The context sizes must be kept tiny but even then it can provide basic code completion and short chats.
The code they write is riddled with subtle bugs but making my computer program itself seems to never get old. Luckily they also make it quicker to write throwaway unit tests. The small chat models are useful for language tasks such as translating unstructured article title + URL snippets to clean markdown links. They also act as a thesaurus, very useful for naming things, and can unblock progress when you’re really stuck with some piece of code (rubberduck debugging). Usually the model just tells me to “double-check you didn’t make any mistakes” though :)
On the software side I use ollama for running the models, continue.dev for programming (it’s really janky), and the PageAssist Firefox extension for chat.
Apparently the Commodore Amiga 500 was introduced at 699 USD in 1987 - just shy of 2 000 USD inflation adjusted.
Guess that says more about how much prices for computers have come down, than anything.
If you look at “modern” gaming graphics cards, that is so cheap I was actually surprised. (even compared to a 30xx from some years ago)
If the median price of a thing is high, then absolute values don’t matter. A new car for under 10k EUR would still be “on a budget”, even if it’s a lot of money.
How about a good priced consumer grade card that gives similar performance? Is there any option to the Nvidia Tesla P40? Slightly more modern, less power without all the hacky stuff?
the RTX series have a desktop form factor and comparable memory, but ain’t exactly cheap.
You can run inference using CPU only, but you’ll have to use smaller models since it’s slower. But the P40 is the best value right now given the amount of VRAM it has.
There are several options for a consumer grade card, but it all gets incredibly expensive really fast. I just checked for my country (The Netherlands) and the cheapest 24GB card is 949 euros new. And that is an AMD card, not an Nvidia. While I am sure the hardware is just as good, the fact is that the software support for AMD is currently not at the same level as Nvidia.
Second-hand, one can look for RTX 3090’s and RTX 4090’s. But a quick check shows that a single second-hand 3090 would cost over 600 euros at minimum here. And this does not consider that those cards are really power hungry and often take up 3 PCIe slots, to make space for the cooling, which would have been an issue in this workstation.
Since I could only accommodate speeds to what PCIe 3.0 offers anyway, a limitation of the workstation, this seemed the best option to me. But of course, check the markets that are available to you to see if there are better deals to be made for your particular situation.
I sometimes wonder if it’s just the aspect of getting older that makes software seem like it’s getting worse. Software was certainly simpler in my time, but it didn’t do a lot of the things that software does today. Both bad and good. I couldn’t have imagined having 32 processors available on a user system back in 1980. Now it’s commonplace. Kind of like the car, things have advanced to the point where things are better and worse.
I don’t have a full answer for you, but one of the subjective aspects of software that I find gets worse over time is that I feel that there is a “greater distance” between myself, the software, and my data.
For example, if I want to count how many pdf files I have in my home directory on my home computer, it’s a simple incantation:
find ~ -name '*.pdf' | wc -l. It’s easy, fast, and it uses the great capability of Unix pipelines to pass data through multiple filters.If I wanted to do the same, but for files in an S3 bucket, I need to use the web UI and hope that I can do a search or filter and that the UI will show how many objects matched. I could also try to use the
awsclitool to have a stream of bytes that I can pipe intogrep | wc -lto make the task more Unixy. In either case, I can’t start obtaining the information until I’ve authenticated with MFA.In this example, going from “I have a question” to “I have an answer” is much faster and much less of a hassle in the desktop scenario than in the web service scenario, because the “distance” between myself and my data is much shorter. If the S3 approach is annoying enough, I might avoid finding answers to my questions.
In the past 10-15 years, as software has migrated from the desktop to the web, our ability to manipulate our data quickly, easily, and flexibly has diminished greatly. Even on our phones, managing simple files and their content can feel out of reach. And unfortunately, it seems that more of our daily experience must be done with software where it feels like you are not the owner of your data, but just a customer who can do with his data what the SaaS provider has deemed you can do.
Surely you also sign in to the Unix-like OS on your home computer before running
find? Likely you use fewer factors of authentication than for AWS, but I guess it’s a difference between two factors and one factor rather than two and zero.Modern software seems much more difficult to automate. Web/cloud are the worse for this. The default way to repeat anything is to manually perform the action again, and again, and again…
APIs are all very well, but typically take significant programming to use. Oh for an equivalent to “
find ~ -name '*.pdf' | wc -l” for normal (or unusual) actions.In my experience, one needs to push a bit for things to become simple again, but it is often possible. Another option in your example is to use s3fs-fuse, which will let you mount a bucket just like any other external storage. Then
find ~ -name '*.pdf' | wc -lsimply works as it did before.It is a simple one-time install. You do have to set up the MFA, that is true. But I would argue that security is one of the things that has gotten better over the years.
Do you also find that kids are worse behaved and have stupider haircuts? I know I do, but so has almost every generation of middle aged adult.
I guess cuz I kind of divorced myself from most of the cloud stuff. Although I do it at work, most of my personal stuff is just standard documents that I sell post and backup both on the cloud and locally. I do understand that sentiment though. For most normal folk it’s probably true.
Check out A Plea For Lean Software by Wirth: https://ia801608.us.archive.org/8/items/pdfy-PeRDID4QHBNfcH7s/LeanSoftware_text.pdf
It’s the same complaint about bloat, needless complexity, and inefficient fancy new language features, but being an artifact of its time, it complains about software needing 1MB of RAM when back in his days 32KB would suffice.
Mmm maybe but, you are talking with somebody very enthusiastic with AI developments for instance. I’m very interested in new things and new technologies. The complexity I’m referring here, like the one of web frameworks to take an example, or the absurd dependencies chain, has nothing to do with cool new things, it is just “free” complexity, where you do what you could easily do without such a mess.
Popularity has always been self-perpetuating, LLMs don’t make that any different. Before them, there already were fewer fora, fewer Stackoverflow answers and fewer Youtube tutorials for more obscure tools and languages. Many took that into account when picking something new to use or learn.
But I do think that LLMs exaggerate the problem in another way. LLMs have the tendency to confidently paper over the gaps of their knowledge, and the paper they use is that of the well-known solutions. As an example, often when I ask an LLM on how to do something in Kakoune, a modal editor, it will respond (partly) with Vim knowledge, the dominant modal editor. This can be thoroughly confusing, and can be discouraging to venture outside the common paths, which is bad for innovation.
Another example. A while ago, I was brainstorming with an LLM about a solution to a problem I needed to solve. I had an idea and asked if it could validate that by generating examples of the implementation. It cheerfully responded that it was indeed an excellent solution, and then demonstrated that by generating examples of another solution. Apparently, the internet had solved this before and their solution was actually better than mine. So this turned out well for me, but one can easily imagine a case where this behavior hinders the development of novel ideas.
One should be cautious with LLMs that operate on untrusted input like emails. This is waiting to be attacked with prompt injection. For now, the worst that could happen is probably that emails end up in the wrong folder, but one could easily see where this goes. Spammers will try to trick the LLM into letting their email in Inbox, hackers will try to suppress security warnings by moving it to Junk before you can see it, etc.
Enjoy the brief period of silence before the world has caught up.
Once such a tool gains the ability to delete, forward or reply to emails, all bets are off though, and you will get hacked.
That is a very good point, and I actually tried as a test to send some emails like “This is definitely not a spam” :) For fun mostly.
Tool only allows to move to spam or not. Does not have any other options.
This all applies to any spam filtering, such as SpamAssassin too. Does any of this uniquely impact LLMs? (The classic equivalent to prompt injection might be exploiting a parser bug and then getting RCE to influence other spam filtering, or worse.)
+1. I guess LLM spam filtering will soon be the minimum.
I see there’s a plausible risk, but it feels a bit much to say it’ll definitely happen. Is there any reason to believe this won’t just be the normal “arms race between security engineers and hackers”? As with most tech, some will get hacked.
I think in its current form, the damage that could be done is the same: misfiled emails. However, what makes LLMs worse is that they are fundamentally flawed and unfixable. A parser bug can be patched. And, sure, there will be a next bug, but eventually the thing will be hardened and safe to use.
Guarding an LLM against prompt injection, on the other hand, is a hopeless task, as there is no way to reliably separate user input from trusted input. This is the reason that projects like Operator from OpenAI come with a hefty disclaimer. The big players still haven’t figured it out. And I don’t think they ever will.
Well, if they ever start caring, I expect they will figure out how to have instructions as a separate out-of-band input. I find it likely that they will manage to do this within a year, using the already accumulated data sets and also the existing models. Although as an economic claim that they won’t start caring, you are probably right.
There are few independent LLM lines, and LLMs are complicated — thus there is a larger risk of a wide-spectrum attack, and bigger interest in finding it. Personally-tuned bayesian spamfilters can be less of a monoculture risk if a parser is either very well polished or written in a non-RCE-friendly language.
I rotate downloads folder! That is, move whatever is currently in downloads to downloads/.old, deleting the old old.
I just set my browser default download dir to
/tmpand to ask me for every file so it’s an easy opt-out. Simple and effective :)I still have a
~/Downloadsdir for files I opt-in to saving but don’t want to put in a tidier place.Re the thread: I don’t have anything I do manually every day. Maybe refreshing lobsters counts?
I need to do that, but I tend to want things from 6 months or a year ago out of it sometimes, so my Downloads folder is an ever-increasing list that I only occasionally go erase big files from(sort by largest, delete stuff I know I don’t care about anymore).
Of course I also dump screenshots and every other thing that would have historically gone in /tmp there too.
Was in exactly in that position, and than I added
~/tmp(just a folder in home) dir for screenshots, randomgit clones and other waste. That way, Downloads only stores stuff from the internet, so it’s always safe to wipe, as, worst-case, I’ll just re-download.~/tmp I wipe manually once in a while.
That totally makes sense! Good idea! promptly steals said idea
I configured it so that my desktop shows my Home folder, not a dedicated Desktop folder, and let everything save there (downloads, screenshots, etc.)
This forces me to clean up old stuff because I don’t like noisy desktops. It is like the zero inbox policy some people use for email, powered by my irritation.
I have the same setup except ~/tmp gets wiped on login. That way I definitely don’t depend on anything in there.
Everything for me goes into ~/Downloads: movies, DMGs, PDFs, ePubs, images. Once in a while I’ll clean up a category which has a destination somewhere else on the file system (most things these days) or delete them.
That’s what I’ve been doing, except I’m lazy and what ends up happening is I wait until I get a low space warning, then I sort by largest and delete a bunch until I’m bored or I have enough space to do what I need doing at the moment.
So I’m going to try they ~/tmp/ and put everything except downloads there, and then rotate ~/Downloads like matklad, see how that works out.
And there was me quite enjoying the extra contrast!
Same, I would very much like the text to stay crisp. I can get less contrast by dimming my screen, whereas increasing brightness can’t recover contrast that isn’t there.
Sounds like it’s time for some user css 8)
Back in the days of CRTs, we had a button on the monitor that you could turn to adjust the contrast. It was wonderful.
It still exists on LCDs… but not everyone can afford to take the approach I took for brute-forcing around various multi-monitor Linux gaming annoyances by just moving the games to a whole other PC and adding a KVM switch.
(No, I’m not crazy… just disinterested enough in AAA games that a hand-me-down PC from 2012 upgraded with a few other hand-me-down parts will suffice.)
If you change the monitor’s contrast for text, you also change it for image editing and video playback and so on.
Not the main point, but fading out the comments like that is IMO the worst thing you can do in your colour scheme. Jim Fisher argues it more completely, but nothing is more frustrating than a colleague completely failing to see exactly the information they needed, which was right there next to the very line of code they were editing, because of washed out colours.
Sometimes I’ve been so confused about why someone wasn’t finding the information they needed in a module that I knew had a long explanatory comment, until I watched them (when screen sharing) immediately and repeatedly scroll past every wall of light grey text - carefully constructed comments that their colour theme was telling them were irrelevant.
That is not “comments that their colour theme was telling them were irrelevant”, that is just a mistake from that colleague. If you are in a situation that you are actually looking for information, and you can’t see it when it is right in front of you, then it is simply time to take a break. Unless the comments were in the same colour as the background, your colleague could still easily see that something was there.
I believe the associations Jim Fisher writes about are highly subjective, and personally, I have very different ones than he describes.
Comments are not washed out because they are less important, but because they are something different than code. They are more like footnotes. When reading code, I want to follow the trail that the computer takes when executing the statements, so I wish to be able to easily follow those. Comments that are formatted as shown in the article would be likes obstacles that I need to jump over, before I can continue to follow the trail.
Remember, code tells what happens, comments tell why it happens. If at some point I get interested in the why, I “readjust” my focus to get the comments into view.
As for the red and green coloured diffs. I really don’t feel them as value judgements. I think it is just what you are accustomed to. They could be yellow and blue, for all I care.
Edit: maybe an editor should just have two different colour schemes, tuned for the work you are doing. One for detailed inspection of what the computer does, and another for a broader, more holistic view of what the code is about.
Washed out text is inherently harder to read, so if comments are meant to be read, putting them in a washed out color is a bad idea.
Perhaps they scrolled past because they’ve read too many articles declaring that comments are always out-of-date, always a failure, always bad, etc. ;)
I don’t like gray comments either, though more because I spend a lot of time editing them, sometimes screenfuls of them. If I were editing docs in a Markdown file I’d never make the main text color light gray.
I’m less convinced by the article’s other argument, against green insertions and red deletions. If you think deleting code is good, your brain will quickly adjust to that context. When you buy puts you don’t need to physically or mentally invert the chart colors, you just want to see the other color.
related article (just for record): Don’t Use Session (Signal Fork)
They go over that the original post in such detail, but they won’t link to it. That is a bit weak.
I think this belies the post’s title (“best laptop ever”). The author is clearly happy with his Air, but someone with a tighter budget and slightly different requirements could have written the same post about a even cheaper $500 Windows laptop.
I don’t know, the M-series are just nothing like previous laptops, which were often insanely overpriced for the terrible specs they came with - so I would still probably go for a (possibly used) M1 over.. pretty much anything else, even on a semi-tight budget.
Do you have a recommendation perhaps?
I use a Thinkpad X1 Carbon Gen6 for like 300€ refurbished a year ago and it’s fine: Intel i7, 16GB, 256GB, Intel graphics, Linux Mint Debian Edition.
Just good enough to run a tiny LLM like phi3 slowly. An M1 would be much faster, I guess.
What’s the battery life like? I’d expect to be able to use the M1 away from power most of the day.
With the M2 MBP, I no longer take a power adaptor on day trips, even if I expect to be working on it all day.
I never really tried, but I doubt that it would work out.
I don’t know about the M-series, never used one, but the article says they use it for:
You can do that with just about any laptop that was released in the last five years, provided you have enough memory. I regularly use a ten-year-old Dell that works just fine. The only trick is that it has 16Gb of RAM.
Assuming you want to stay constantly plugged in. I’ve got two Dell XPS 13s (one 2017, one 2020) and they both have atrocious battery life. And the 2020 one was barely used for 4 years so the battery is still very healthy.
I’ve got an XPS 13 (9300, from 2020) that has a pretty good battery, but I wasn’t expecting that. PC laptop batteries feel like a lottery, especially as they age. ThinkPads in my experience have been the worst for battery health, with massive capacity drop-off in a few years. Meanwhile, my 12 year old MacBook has pretty reasonable battery health considering its age. (It’s not just an Apple thing too; I’m told Panasonic Let’s Notes are also really good with aging batteries.)
Ah, yeah, I am on my third battery. I should have mentioned that too.
It is okay, though. I use it regularly for work on a two-hour train ride and I don’t get into trouble.
But it often is simple. Don’t trust the people who are paid to “solve” a problem when they it is unsolvable. Organizations created to solve an issue may make it worse because their existence depends on the issues continued existence. I’d rather trust an outsider with a proven track record of simplifying processes and a general trail of success rather than someone whose employment and/or empire may well rely on a problem appearing unsolvable. The worst thing that could happen to governments and politicians is people realizing we don’t really need them for most things where they have inserted themselves into our lives, so they keep certain problems going and unsolved, and sometimes actually cause a crisis and sell themselves as the solution.
Rest of the article was fine, idk why this obvious and absolutely misplaced shade throwing at the beginning was necessary.
Also don’t trust people who say it is simple when they don’t understand the whole problem, and don’t have the time and budget constraints.
Heh, yeah, I had the same thought. I worked at a startup that:
One of my “favourite” moments was when the Director of Central IT raised a “security issue” very late in the process for a given deployment. This was in a very large stakeholder meeting.
Director: “Me and my staff have some significant concerns about the security of the deployment you’re about to do.”
Me: “Oh! Could you elaborate on that?”
Director: “I’ll follow up with an email. We will be voting to block the deployment until these concerns are addressed.”
After not getting any follow-up for a few days I started chasing him (because we were currently blocked from deployment…) and finally he responds with a PDF called “McAfee Top 10 Security Vulnerabilities in Web Applications”
Me: “I’ve reviewed the document you sent me and, from my point of view, we’re not deficient on any of those 10 items. Is there a specific item you’re concerned about that we could address for you?”
Director: “If you can’t see where your deficiencies are based on that document I’m not sure I can help you.”
In the end? I had to put him on the spot in another stakeholder meeting a few weeks later and get him to say “there are no specific security concerns we have at this time” after he, several times, raised these vague concerns as a blocker but couldn’t articulate any specifics. Deployment delayed for weeks over nothing.
I’ve been involved in quite a few government modernization projects. I’ve also run into people with similar dispositions as this security director. In my experience these interactions have been a cultural mismatch rather than the rent seeking behavior described by the post you’re responding to. There has been a culture in most government organizations, particularly around IT security and other risk categories, that revolves around checklists. They will have concerns unless you’ve come to the review meeting with some positive evidence of your diligence in detecting errors in a bunch of different areas. So his concerns weren’t so much that his staff identified specific vulnerabilities but that you’ve asserted you’re ready to deploy without furnishing evidence to support the claim. It all comes down to whether the leader of their department has something to provide to legislative committees, IGs, and other oversight groups to prove they weren’t negligent if/when something goes awry. I’ve had quite a bit of success helping these folks get comfortable with how modern development practices embed a lot of their checklists in the earlier stages of development.
If this happened in India I’d assume the person was looking for a bribe. The culture is that people get on committees with the mindset of feudal guards: they’re there not as guardians of the process but as extortionists. Society suffers.
I have heard the same sort of stories from people trying to sell products and services to any large organization, whether in the public or private sector. It sticks out because for public organizations they’re wasting the public’s time and money, instead of the shareholder’s.
IME, engineers rarely say a problem like this is “unsolvable”. It almost always boils down to people in charge not wanting to pay to do it the right way.
How would that even work? This outsider also depends on the issues for their existence, so the same incentives apply.
Outsiders can bring a fresh perspective, insiders have intimate domain knowledge that is not easily replicated. Both are valuable, and that has nothing to do with governments per se.