You may even get to the point that you believe it is impossible to avoid half-constructed objects in imperative code. Which is why I am again going to bang on the fact that functional programs prove that it is in fact possible, because they give their programmers no choice, and there isn’t any program that functional programmers are particularly stymied by as a result of this constraint on them.
My experience differs from this. I find that in functional programs I frequently encounter the need (or at least the desire) to create a FooBuilder object to collect the information which I will later turn into a Foo, but which I can’t create all in one go because the process of collecting all the fields is lengthy or difficult.
For example, I may want to build a tree from the top down: find the information that goes in the root node, then using that information go obtain the data for each child of the root, one-by-one, and recurse downward. BUT, if the tree itself has invariants that are only true when all child nodes are properly populated, then I may need to build a type called “PartialTree” which is a tree-like structure that does not guarantee the invariants, then build a PartialTree recursively and finally use it to construct the Tree.
If I do this, the half-constructed objects ARE needed, it’s just that I can choose to use different types for the half-constructed object and the fully-constructed object to clearly denote to the type system when one can and can’t rely on the invariants that come with being fully constructed.
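A minimal sketch of that split, with hypothetical types: PartialTree makes no promises, and the only way to obtain a Tree is through a single conversion that checks that every child is present.

```haskell
-- Hypothetical sketch of the PartialTree / Tree split described above.
module Tree (PartialTree (..), Tree, finish) where

-- Built top-down; children may still be missing while data is gathered.
data PartialTree a = PartialNode a [Maybe (PartialTree a)]

-- Constructors are not exported, so holders of a Tree can rely on the
-- invariant "every node has fully populated children".
data Tree a = Node a [Tree a]

-- The single checkpoint where the invariant is established.
finish :: PartialTree a -> Either String (Tree a)
finish (PartialNode x children) = Node x <$> traverse finishChild children
  where
    finishChild Nothing      = Left "missing child"
    finishChild (Just child) = finish child
```

Because the Tree constructors stay private, downstream code cannot fabricate a tree that skips the check.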
I think this could be only a semantic disagreement about what a half-constructed object is—I think if you use a separate partial type, you’ve avoided half-constructed values of any type. Maybe it’s better to say that stepwise initialization is not a good reason to loosen the final type.
It is sometimes a problem. If it is too expensive to construct two trees, you are left with the option of constructing the partial tree only, or relying on some uncommon language features for subtyping.
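Performance is always the exception and beats every other rule.
This is yet another pro/con thing where I lean back in my chair, look into the distance, and say “Mhm yep can’t disagree”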
It is implied that every solution should be a technological one, should be sellable and should be intuitive. That’s it: you should not think too much about a problem but instead build blindly whatever solution comes to mind using the currently trending technological stack.
I’m reminded of simple vs. easy: Every solution framed this way is of the easy kind. It increments complexity yet again.
This isn’t just a microservices issue: it’s a broader problem in our industry. We throw around big words that sound impressive but mean wildly different things to different people.
Yup. See also “agents” in the LLM space at the moment - spend any time in those conversations and it quickly becomes clear that everyone involved has a slightly (or sometimes wildly) different definition that they incorrectly assume is shared by everyone else.
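“agentic” is the new buzz word around executives at the moment. My team doesn’t even know what agentic means.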
My favorite answers to the question “what does agent mean?” are the ones that involve “agentic”, because they do precisely nothing to progress that conversation!
You really have to ask every person what they mean. A lot of them mean a background service with user rights, like a macOS launchd agent, but this time with LLMs. The other thing they may mean is a chat assistant with a grab bag of side-effecting tools to use, like Siri, but this time with LLMs. And sometimes it’s the Siri part plus the background part too, and often the tool use is a very handwaved feature set.
agreed. IMO the LLM space suffers from a terminology vacuum - outside of academic terms (i.e. all these models are autoregressive generative text-to-text transformers) there really isn’t much terminology to describe how to use models in the real world. so people come up with terms and definitions that spread very quickly because of the extreme level of interest in the area.
When you build a crawler for a large swath of the web, is it so inconvenient to avoid making your traffic to any given site a single intense burst? A shared work queue and distributed BFS seems like the least they could do.
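Even a simple per-host delay would go a long way. A sketch of such a politeness policy (the ten-second figure and all the names here are invented, not anyone’s real crawler):

```haskell
import qualified Data.Map.Strict as Map
import Data.Time.Clock (NominalDiffTime, UTCTime, diffUTCTime)

type Host = String
type LastFetch = Map.Map Host UTCTime

-- Illustrative assumption: at most one request per host every 10 seconds.
minDelayPerHost :: NominalDiffTime
minDelayPerHost = 10

-- 0 means "fetch now"; otherwise sleep this long, or pull work for another host.
waitBefore :: UTCTime -> Host -> LastFetch -> NominalDiffTime
waitBefore now host lastFetch =
  case Map.lookup host lastFetch of
    Nothing   -> 0
    Just prev -> max 0 (minDelayPerHost - diffUTCTime now prev)

-- Record a completed fetch so later scheduling respects the delay.
recordFetch :: Host -> UTCTime -> LastFetch -> LastFetch
recordFetch = Map.insert
```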
I do occasionally wonder whether the bots themselves are built at least partially with LLM-generated code, or by engineers who lean heavily on such tools. It would explain the incredible lack of, uh, common decency on the part of the bots and their owners. I don’t have any hard evidence though so it’s all speculative ¯\_(ツ)_/¯
If the people who are trying their best to hoover up the entire internet in their quest for a better ELIZA gave a single damn about whether they inconvenience others they’d find a different line of work.
I vaguely recall hearing that there is an open crawling initiative that aims to basically crawl the Internet and make the results available to anyone who wants them. I’m not sure if they charge for the results or not, but one can imagine an initiative that makes the results available via torrent or similar so that the service itself is cheaper than crawling the Internet on one’s own.
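Common Crawl? Many early LLMs were trained on that.
Yes, thank you. That’s what I was thinking about.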
I grew up as a Python programmer and type-driven programming feels like such absurd tedium. Also, a lot of the time, even if you know what you want to do, the syntax will be godawful or the language so limited that it’s not possible.
Moved from a C# company to a Python company and most of the bugs seem to be type errors. Even misspelled class members slip through. Not that we have more bugs here, but this company is very dedicated to quality and I can only imagine what we could do with a typed language.
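Basic typing is fine but once you give people a typed language, they try doing all kinds of fuckery with it.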
I did my thesis on category theory so I’m sure I’ve seen the worst of it. Didn’t see much of that in our million line code base. Most people like stability after all.
I come from the typed world but then spent a lot of time with Lisp and Clojure, so now I love an interactive development workflow. If I’ve succeeded at writing a short program with obviously no bugs instead of no obvious bugs, types would only have slowed me down. And I have very little patience anymore with slow compilers.
However, I still want a typed language for a large codebase. Type checks are tests. They’re the real bottom of the test pyramid, finer grained and more numerous than unit tests. When you “make illegal states unrepresentable,” types do that work.
Gradual typing makes a lot of sense to me, except the part where it’s not eventually required.
Comparing data that is in a strong type and JSON I just received over the wire, one of them has easy guarantees while the other needs runtime validation or else. I don’t want all data to be the second kind. In a big enough system you need more reasons to trust the validity of data you receive from elsewhere in the program. Strong types raise that baseline.
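A sketch of that boundary in code, assuming aeson; the field names and the port-range rule are made up for illustration. Everything past the decode call only ever sees the already-validated type:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (FromJSON (..), eitherDecode, withObject, (.:))
import qualified Data.ByteString.Lazy.Char8 as BL

newtype Port = Port Int deriving Show

data ServerConfig = ServerConfig
  { host :: String
  , port :: Port
  } deriving Show

-- Validation happens once, while parsing the wire format.
instance FromJSON ServerConfig where
  parseJSON = withObject "ServerConfig" $ \o -> do
    h <- o .: "host"
    p <- o .: "port"
    if p >= 1 && p <= 65535
      then pure (ServerConfig h (Port p))
      else fail ("port out of range: " ++ show p)

main :: IO ()
main =
  case eitherDecode (BL.pack "{\"host\":\"example.com\",\"port\":8080}") of
    Left err  -> putStrLn ("rejected at the boundary: " ++ err)
    Right cfg -> print (cfg :: ServerConfig) -- downstream code can trust this value
```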
I struggle with getting past basic impatience to establish a TDD or BDD habit. There are also so many takes about the right way to do those things that I don’t know which are missing the point. I wouldn’t mind a recommendation if you know a really well-informed guide.
I consider myself a pretty test-focused developer and I don’t know that I would say I subscribe to any TDD or BDD habit. If you’re writing tests, you’re getting value from those tests, and you know when/what types of tests to write, then you’re fine.
I think the unfortunate reality is that this really comes down to taste and experience. Factoring your code so that it’s easily testable and then writing those critical tests is a skill that you develop over time – I really don’t think it’s the kind of thing that can be distilled into a couple of talking points. But if I had to try, I would say:
Keep IO separate from computation
Those computation functions are the ones that are worth testing
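A toy version of that split (the file name and function are invented):

```haskell
-- The pure core: this is the part worth unit-testing.
-- (Assumes a non-empty list; a real version would return a Maybe.)
summarize :: [Double] -> (Double, Double)
summarize xs = (minimum xs, maximum xs)

-- The thin IO shell: read input, call the pure function, print the result.
main :: IO ()
main = do
  contents <- readFile "measurements.txt"
  let values = map read (lines contents) :: [Double]
  print (summarize values)
```

A test for the core is then just an equality check like summarize [3,1,2] == (1,3), with no files or mocks involved.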
This is all familiar and I feel I write testable code along these lines, but what I’d like to try is not writing code without a test waiting to be turned green, that kind of thing. I imagine the best final workflow is not so extreme but I won’t really know the balance until I overdo it.
The times where it’s easiest to do this are when you have a well-defined feature set already written out. From there, it’s a matter of converting each individual requirement into a test case and using the test itself to define the interfaces you expect you’ll need to implement that requirement. When you don’t have good requirements, this style of test writing is basically impossible.
Fair enough! Personally I say if you’re already writing testable code you don’t stand to get much benefit from test-first: its main plus is that it forces you to think about these things upfront.
Recently I had a mini-epiphany about how to choose testing style. I was writing a compiler with a REST frontend, so you need both property tests (for codecs) and unit tests to pin down examples you want to work. But property tests are fastest at sampling typical values, whereas unit tests are for “zero-probability” configurations. So you need both, in different circumstances.
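As a concrete illustration (the encode/decode pair here is a toy stand-in, not a real codec): the property samples lots of typical strings, while the unit test pins one specific case we always want to work.

```haskell
import Test.QuickCheck (quickCheck)

-- Toy codec: escape backslashes on the way out, unescape on the way in.
encode :: String -> String
encode = concatMap escape
  where
    escape '\\' = "\\\\"
    escape c    = [c]

decode :: String -> Maybe String
decode []                   = Just []
decode ('\\' : '\\' : rest) = ('\\' :) <$> decode rest
decode (c : rest)           = (c :) <$> decode rest

-- Property test: decoding an encoded string gives back the original.
prop_roundtrip :: String -> Bool
prop_roundtrip s = decode (encode s) == Just s

-- Unit test: one pinned example involving the escape character.
unit_escapedBackslash :: Bool
unit_escapedBackslash = decode "\\\\" == Just "\\"

main :: IO ()
main = do
  quickCheck prop_roundtrip
  print unit_escapedBackslash
```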
One tip that helped for me: if you can hook your test system up to something like entr, getting immediate feedback makes a huge difference. I learned about it from Julia Evans’s excellent blog post.
The default IDE experience in something like Xcode with running tests is slow and can make going through TDD a drag.
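Yeah entr is excellent!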
What do fans of (different kinds of) tiling window managers really like about them? I’m earnestly curious since I like my windows to float, pile/overlap somewhat, and be their own best sizes. Is it about putting all pixels to work? Keyboard finesse?
For me it’s an accessibility thing. It’s the best way I have found to use a desktop computer with keyboard only.
I turn off tabs in all programs that have them, so that I can navigate all windows in the same way, using at most one key per hand and keyboard-half.
I do have a nice pointing device for emergencies (drawing memes in Krita or so) but I can’t use it regularly.
There’s a big difference between tools that are merely possible to use with a keyboard, and tools that are easy to use with a keyboard, but I think it’s less obvious to users that can fall back to pointing.
It’s hard to put into words but I’ll try. This is strictly about a work setup with 3 screens, 2 big important ones and 1 smaller laptop screen (on the right). This also assumes xmonad or mimicking its workspace switching in i3.
So I usually have my browser up on the left screen (sometimes 50:50 split with another browser or a shell or whatever), my IDE up on the middle one and then emails or slack or whatever on the right one (all usually in fullscreen).
Now let’s say I share my screen for pairing, then I also want my partner/team visible, so I need Video+IDE+Slack - instead of fumbling around I’d just press win-5 or whatever on the left one and have that up there. Pressing win-5 on the middle one now will switch the IDE workspace with the video one (because my cam is on the right, so I’m more looking into the cam), and so on.
But a key point is that I am sending all specific apps to their one fixed workspace on open (something like 1=shell,2=ide,3=browser,4=email,….) so I never have to alt-tab to find anything because I just know which app I will get, and I can still temporarily move e.g. the video window to the browser workspace for a split view (or alt-tab in a single workspace, between slack and emails)
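For reference, the “send each app to its fixed workspace” rule is only a few lines in an xmonad config; a sketch with made-up window class names (i3 has the analogous assign directive):

```haskell
import XMonad

-- Illustrative class names; check the real ones with xprop.
myManageHook :: ManageHook
myManageHook = composeAll
  [ className =? "Alacritty"      --> doShift "1"
  , className =? "jetbrains-idea" --> doShift "2"
  , className =? "firefox"        --> doShift "3"
  , className =? "thunderbird"    --> doShift "4"
  ]

main :: IO ()
main = xmonad $ def { manageHook = myManageHook <+> manageHook def }
```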
So yes, it might sound nitpicky but it’s like laying out stuff on a workbench. I don’t think about it anymore and I don’t even need to look there. Just push the mouse somewhere onto that screen and press an easy key combo and I have everything where I want it, and still dynamically move it around. If I have a second browser window open I can either put that on workspace 3 (so I alt-tab between exactly the 2 open tabs) or put it next to the other one, OR I put it on one of my undefined workspaces (e.g. 8) and have that up somewhere.
If you are a heavy window overlapper then this might not make a ton of sense, from my experience. But even on non-tiling WMs I often do 1:1 or 1:1:1 or 1:1:1:1 splits but then I have to alt-tab a lot. But yes, it’s also very much about keyboard shortcuts, which you can somewhat recreate in e.g. hammerspoon but imho not as nicely.
I like the workbench analogy. I use i3 in the same way: one workspace for terminals with various stuff (email, IRC, current project, maybe a grep/git/logtail/whatever), one with a browser window on the company chat, one with a private chat, one with a “main browser window” for everything else, one with my editor and occasionally I’ll spawn more workspaces for one-off tasks or other things I need open all the time. Right now I’m typing this, I have 7 workspaces active and I know exactly which workspace contains which windows. “A place for everything and everything in its place”, basically.
About ten years before I learned about the existence of tiling window managers, I already knew the bits of window management that I hated:
having important information in one window covered by another window
being forced to constantly rearrange the window stack to see what I needed
each window opening in the wrong size and needing to manually fix it
The first time I encountered a tiling window manager, I instantly saw that it solved these problems, switched, and never looked back.
So I guess that I have the reverse question: what do people really like about floating window managers? Do the annoyances I listed above not bother you or are you getting some benefit from the floating layout that’s worth the price?
Not to steer us off topic, but since you asked: I must not have much frustration about rearranging windows, because I get satisfaction from arranging them.
I think you can avoid needed information being unduly covered by making windows as small as they can reasonably be. It’s not necessary to ensure they never overlap anywhere. For example, my TextEdit windows are about the size of my phone in portrait, which is about 5% of the monitor’s physical area.
I generally use one 27” screen. When idle, I have several stacks of windows. Browsers on the left, terminals in the lower right corner, that kind of thing. I keep each stack neat in sort of a cascade arrangement. With an AppleScript on a key shortcut I can make a window my favorite size, for example, I can reset a terminal to 80x25 or a browser to 1024x768.
The center of the screen is available for a main app like an IDE. Such a window will overlap the side stacks, but won’t cover them.
For a task, I’ll set up my workspace in a second or two, mainly by bringing forward the specific open windows I’ll need. Being in separate stacks, they already don’t overlap. I may hide unrelated apps, too. And then I’ll launch whatever main app I need in the center, or maybe move a browser window there, or just enlarge it from the corner and reset its size later.
During the task, if a center app is frontmost and I need a side window, they’re only half-covered by it anyway, so I have a huge click target to bring that other window to the front. Fitts’s Law is my friend. Command-tab feels fast but is imprecise; I probably don’t want all windows of an app. I’d usually rather click the specific window. And if a window is covered, that’s what exposé is for. It’s on my mouse’s thumb button.
It’s not like I never get caught on this and I do need a unique setup from time to time. On my laptop screen I run a different scheme just due to real estate. But for the most part I have my stuff laid out neatly, or I have in mind a neat arrangement to return to, and I think a mouse is just good at this. (Trackpads are comparatively clumsy.)
For me it’s a mix of those (both what was already mentioned and what you suggested):
I really don’t want my hands to leave the keyboard, so my wm has to be usable without a pointing device
I don’t like big high-end screens: they’re expensive, take up space, and don’t look much better with the color settings I use (quite dim, with few contrasting colors), so I do want windows to take all the space. To the point that I don’t even have a wallpaper; it’s niri’s default mid grey.
I’m much more comfortable seeing only the windows I’m working on, and I can’t deal with a bunch of windows piled together (it feels like writing at a desk on top of a messy stack of paper). So even in a non-tiling WM I’m a heavy user of workspaces.
It turns out the year of the Linux desktop won’t be a result of free desktop environments catching up to macOS; it’ll be a result of macOS regressing below the level of free desktops ;-P
(Seriously: this already happened with Windows many years ago. I switched from Windows NT (!) to Linux (Red Hat, IIRC?) at work and it was faster, smoother, more stable (!), and more feature-rich.)
I can’t find the source of it but there was a line floating around like “They promised me that they would make computers as easy to use as making a telephone call. They’ve succeeded! I can no longer figure out how to make a telephone call.”
I want an insanely great system to exist again, and I’m open to suggestions.
It looks to me like no one attempts to compete with Apple at their user experience game—consistent behavior, minimum surprise, things just working. Enjoying popularity and a lack of opposition in the space they’ve carved out, they no longer have to make their systems insanely great for many of us users to continue thinking they’re the best. Eventually that has meant they’re merely the least bad. A lot of the time I’m happy enough, but sometimes I feel stuck with all this. I wonder what battles they fight internally to keep the dream of the menu bar alive. But dammit, the same key shortcuts copy and paste in every app.
The last time I used Gnome, I mouse-scrolled through a settings screen, snagged on a horizontal slider control, adjusted the setting with no clue what the original value was, and found there’s no setting I could use to avoid that. The last time I used Windows, I was again in the system settings app but found it didn’t have a setting I remembered. I learned Control Panel still exists too, and half the settings still live there. My Mac, on the other hand, is insanely OK.
If you’re open to suggestions, have you tried Haiku before? It, too, has the same key shortcuts copy/pasting in every app (even when you use “Windows” shortcuts mode, i.e. Ctrl not Alt as the main accelerator key). We’re not quite as full-featured or polished yet as macOS was/is, but we’d like to think the potential is there in ways it’s not for the “Linux desktop” :)
Thanks; I have eyed Haiku with interest from time to time! It does strike me as in line with some of my values. Maybe I’ll give it a more earnest try.
Update, impressions after a few hours of uptime: This is nice. I could get comfortable here. It’s really cohesive and all the orthogonal combinable parts feel well chosen. The spatial Finder is alive and well. I was especially impressed when I figured out how a plain text file was able to have rich text styling, while remaining perfectly functional with cat. Someone really must be steering this ship because it emphatically doesn’t suck.
Thanks for your kind words! Some of the “orthogonal combinable” parts (“styled text in extended attributes” and spatial Tracker included) are ideas we originally inherited from BeOS (but of course we’ve continued that philosophy). And we actually don’t have a single “project leader” role; the project direction is determined by the development team (with the very occasional formal vote to make final decisions, but more often there is sufficient consensus that this simply is not needed). But we definitely do have a real focus on cohesiveness and “doing things well”, which sometimes leads to strife and development taking longer than it does with other projects (I wrote about this a few years back in another comment on Lobsters – the article that comment was in a discussion thread for, about Haiku’s package manager, is also excellent and worth a read), but in the end it leads to a much more cohesive and holistic system design and implementation, which makes it all worth it, I think.
But dammit, the same key shortcuts copy and paste in every app.
So much this. I am continually disappointed that both Gnome and KDE managed to fuck this up. Given that they both started as conceptual clones of Windows (more-or-less) I guess it isn’t surprising, but still
A minimal Linux setup can be great. You have few components that are really well maintained, so nothing breaks. It also moves slowly. My setup is essentially the same as back in 2009: A WM (XMonad), a web browser (Firefox), and an editor (Emacs). When I need some other program for a one-off task, I launch an ephemeral Nix shell. If you prefer Wayland, Hyprland can give you many niceties offered by desktop environments, like gestures, but it is still really minimal.
Nix is really cool for keeping a system clean like that. I bounced off NixOS because persisting the user settings I cared about started to remind me of moving a cloud service definition to Terraform. I do love the GC’ed momentary tool workspaces.
If Emacs was my happy place, I think your setup would be really pleasing. But I am a GUI person, to the degree that tiling window managers make my nose wrinkle. Windows are meant to breathe and crowd, I think. That’s related to the main reason I want apps to work similarly, because I’ll be gathering several of them around a task.
I want to believe there is a world in which a critical mass of FOSS apps are behaviorally cohesive and free of pesky snags, but I have my doubts that I’d find it in Linux world, simply because the culture biases toward variety and whim, and away from central guidance and restraint. Maybe I just don’t know where to look. BSDs look nice this way in terms of their base systems, but maybe not all the way to their GUIs.
My MacBook Pro is nagging me to upgrade to the new OS release. It lists a bunch of new features that I don’t care about. In the meantime, the following bugs (which are regressions) have gone unfixed for multiple major OS versions:
When a PDF changes, Preview reloads it. It remembers the page you were on (it shows it in the page box) but doesn’t jump there. If you enter the page in the page box, it doesn’t move there because it thinks you’re there already. This worked correctly for over a decade and then broke.
The calendar service fails to sync with a CalDAV server if you have groups in your contacts. This stopped working five or so years ago, I think.
Reconnecting an external monitor used to be reliable and move all windows that were there last time it was connected back there. Now it works occasionally.
There are a lot of others; these are the first that come to mind. My favourite OS X release was 10.6: no new user-visible features, just a load of bug fixes and infrastructure improvements (this one introduced libdispatch, for example).
It’s disheartening to see core functionality in an “abandonware” state while Apple pushes new features nobody asked for. Things that should be rock-solid, just… aren’t.
It really makes you understand why some people avoid updates entirely. Snow Leopard’s focus on refinement feels like a distant memory now.
The idea of Apple OS features as abandonware is wild, and yet here we are. The external monitor issue is actually terrible. I have two friends who work at Apple (neither in OS dev) and both have said that they experience the monitor issue themselves.
I was thinking about this not too long ago; there are macOS features (ex the widgets UI) that don’t seem to even exist anymore. So many examples of features I used to really like that are just abandoned.
Reconnecting an external monitor used to be reliable and move all windows that were there last time it was connected back there. Now it works occasionally.
This works flawlessly for me every single time, I use Apple Studio Display at home and a high end Dell at the office.
On the other hand, activating iMessage and FaceTime on a new MacBook machine has been a huge pain for years on end…
On the other hand, activating iMessage and FaceTime on a new MacBook machine has been a huge pain for years on end…
I can second that, though not with my Apple account but with my brother’s. Coincidentally, he had fewer problems activating iMessage/FaceTime on a Hackintosh machine.
A variation on that which I’ve run in to is turning the monitor off and putting the laptop to sleep, and waking without moving or disconnecting it.
To avoid all windows ending up stuck on the laptop display, I have to sleep the laptop, then power off the monitor. To restore: power on the monitor, then wake the laptop. Occasionally (1 in 10 times?) it still messes up and I have to manually move windows back to the monitor display.
(This is when using dual-head mode with both the external monitor and laptop display in operation)
iCloud message sync with “keep messages” set to forever seems to load so much that, on my last laptop, typing long messages (more than one sentence) directly into the text box was so awful that I started writing messages outside the application, then copy/pasting and sending them. The delay was in the seconds for me.
I’m really heartened by how many people agree that OS X 10.6 was the best.
Edited to add … hm - maybe you’re not saying it was the best OS version, just the best release strategy? I think it actually was the best OS version (or maybe 10.7 was, but that’s just a detail).
It was before Apple started wanting to make it more iPhone-like and slowly doing what Microsoft did with Windows 8 (who did it in a ‘big bang’) by making Windows Phone and Windows desktop almost indistinguishable. After Snow Leopard, Apple became a phone company and very iPhone-centric and just didn’t bother with the desktop - it became cartoonish and all flashy, not usable. That’s when I left MacOS and haven’t looked back.
Recently, Disk Utility has started showing a permissions error when I click unmount or eject on SD cards or their partitions, if the card was inserted after Disk Utility started. You have to quit and re-open Disk Utility for it to work. It didn’t use to be like that, but it is now, on two different Macs. This is very annoying for embedded development where you need to write to SD cards frequently to flash new images or installers. So unmounting/ejecting drives just randomly broke one day and I’m expecting it won’t get fixed.
Another forever-bug: the animation to switch workspaces takes more time on higher refresh rate screens. This has forced me to completely change how I use macOS to de-emphasise workspaces, because the animation is just obscenely long after I got a MacBook Pro with a 120Hz screen in 2021. Probably not a new bug, but an old bug that new hardware surfaced, and I expect it will never get fixed.
I’m also having issues with connecting to external screens only working occasionally, at least through USB-C docks.
The hardware is so damn good. I wish anyone high up at Apple cared at all about making the software good too.
Oh, there’s another one: the fstab entries to not mount partitions that match a particular UUID no longer work and there doesn’t appear to be any replacement functionality (which is annoying when it’s a firmware partition that must not be written to except in a specific way, or it will soft-brick the device).
Oh, fun! I’ve tried to find a way to disable auto mount and the only solution I’ve found is to add individual partition UUIDs to a block list in fstab, which is useless to me since I don’t just re-use the same SD card with the same partition layout all the time; I would want to disable auto mounting completely. But it’s phenomenal to hear that they broke even that sub-par solution.
Maybe, but we’re talking about roughly 1.2 seconds from the start of the gesture until keyboard input starts going to an app on the target workspace. That’s an insane amount of delay to just force the user to sit through on a regular basis… On a 60Hz screen, the delay is less than half that (which is still pretty long, but much much better)
Not a fix, but as a workaround have you tried Accessibility > Display > Reduce Motion?
I can’t stand the normal desktop switch animation even when dialed down all the way. With that setting on, there’s still a very minor fade-type effect but it’s pretty tolerable.
Sadly, that doesn’t help at all. My issue isn’t with the animation, but with the amount of time it takes from I express my intent to switch workspace until focus switches to the new workspace. “Reduce Motion” only replaces the 1.2 second sliding animation with a 1.2 second fading animation, the wait is exactly the same.
Don’t update/downgrade to Sequoia! It’s the Windows ME of macOS releases. After the Apple support person couldn’t resolve any of the issues I had, they told me to reinstall Sequoia and then gave me instructions to upgrade to Ventura/Sonoma.
I thought Big Sur was the Windows ME of (modern) Mac OS. I have had a decent experience in Sequoia. I usually have Safari, Firefox, Chrome, Mail, Ghostty, one JetBrains thing or another (usually PyCharm Pro or Clion), Excel, Bitwarden, Preview, Fluor, Rectangle, TailScale, CleanShot, Fantastical, Ice and Choosy running pretty much constantly, plus a rotating cast of other things as I need them.
Aside from Apple Intelligence being hot garbage (I just turn that off anyway), my main complaint about Sequoia is that sometimes, after a couple dozen dock/undock cycles (return to my desk, connect to my docking station with a 30” non-hidpi monitor, document scanner, time machine drive, smart card reader, etc.), the windows that were on my MacBook’s high resolution screen and move to my 30” when docked don’t re-scale appropriately, and I have to reboot to address that. That seems to happen every two weeks or so.
Like so many others here, I miss Snow Leopard. I thought Tiger was an excellent release, Leopard was rough, and Snow Leopard smoothed off all the rough edges of Tiger and Leopard for me.
I’d call Sequoia “subpar” if Snow Leopard is your “par”. But I don’t find that to be the case compared to Windows 11, KDE or GNOME. It mostly just stays out of my way.
Apple’s bug reporting process is so opaque it feels like shouting into the void.
And, Apple isn’t some little open source project staffed by volunteers. It’s the richest company on earth. QA is a serious job that Apple should be paying people for.
Apple’s bug reporting process is so opaque it feels like shouting into the void.
Yeah. To alleviate that somewhat (for developer-type bugs) when I was making things for Macs and iDevices most of the time, I always reported my bugs to openradar as well:
which would at least net me a little bit of feedback (along the lines of “broken for everyone or just me?”) so it felt a tiny bit less like shouting into the void.
I can’t remember on these. The CalDAV one is well known. Most of the time when I’ve reported bugs to Apple, they’ve closed them as duplicates and given no way of tracking the original bug.
No. I tried being a good user in the past but it always ended up with “the feature works as expected”. I won’t do voluntary work for a company which repeatedly shits on user feedback.
10.6 “Snow Leopard” was the last Mac OS that I could honestly say I liked. I ran it on a cheap mini laptop (a Dell I think) as a student, back when “hackintoshes” were still possible.
Computers are so ridiculously powerful these days that it’s so weird we still have CI/CD pipelines that take tens of minutes. … Maybe if everyone wasn’t busy building ad tech and chat bots, we’d get somewhere.
I feel like it’s nearly always been like this. We rarely get to optimize or simplify our systems just because smaller and lighter is faster, cheaper, and longer-lived. Usually we do it to remove outright blockage. When the pain is relieved, we stop. When the blockage is gone, some other problem is now a bigger deal—the feature that could make money doesn’t exist yet. So we grow insensitive to lesser pains. Then one day we see 1,000 cuts, not necessarily because we noticed them ourselves, but because we found a new (or old) alternative with eye-opening tradeoffs.
Every time this topic comes up I post a similar comment about how hallucinations in code really don’t matter because they reveal themselves the second you try to run that code.
That’s fascinating. I’d really enjoy hearing some more about that. Was this a team project? Were there tests? I feel like this would be really valuable as a sort of post mortem.
Lots of different teams and projects. I am talking about 30% of a 1,000-engineer department being feature-frozen for months to try to dig out of the mess.
And yes there were tests. Tests do not even start to cut it. We are talking death by a thousand deep cuts.
This is btw not a single anecdote. My network of “we are here to fix shit” people is flooded with these cases. I expect the tech industry’s output to plummet starting soon.
Again, really interesting and I’d love more details. I am at a company that has adopted code editors with AI and we have not seen anything like that at all.
That just sounds so extreme to me. Feature frozen for months is something I’ve personally never even heard of, I’ve never experienced anything like that. It feels kind of mind boggling that AI would have done that.
Nope. They had tested it. But to test, you have to be able to understand the failure cases, which you have heuristics for based on how humans write code.
These things are trained exactly to avoid this detection. This is how they get good grades. Humans supervising them is not a viable strategy.
I’d like to understand this better. Can you give an example of something a human reviewer would miss because it’s the kind of error a human code author wouldn’t make but an LLM would?
I’m with @Diana here. You test code, but testing does not guarantee the absence of bugs. Testing guarantees the absence of a specific bug that is tested for. LLM-generated code has a habit of failing in surprising ways that humans fail to account for.
I used AI primarily for generating test cases, specifically prompting for property tests to check the various properties we expect the cryptography to uphold. A test case found a bug.
“Those are scary things, those gels. You know one suffocated a bunch of people in London a while back?”
Yes, Joel’s about to say, but Jarvis is back in spew mode. “No shit. It was running the subway system over there, perfect operational record, and then one day it just forgets to crank up the ventilators when it’s supposed to. Train slides into station fifteen meters underground, everybody gets out, no air, boom.”
Joel’s heard this before. The punchline’s got something to do with a broken clock, if he remembers it right.
“These things teach themselves from experience, right?” Jarvis continues. “So everyone just assumed it had learned to cue the ventilators on something obvious. Body heat, motion, CO2 levels, you know. Turns out instead it was watching a clock on the wall. Train arrival correlated with a predictable subset of patterns on the digital display, so it started the fans whenever it saw one of those patterns.”
“Yeah. That’s right.” Joel shakes his head. “And vandals had smashed the clock, or something.”
Hallucinated methods are such a tiny roadblock that when people complain about them I assume they’ve spent minimal time learning how to effectively use these systems—they dropped them at the first hurdle.
You imply that because one kind of hallucination is obvious, all hallucinations are so obvious that (per your next 3 paragraphs) the programmer must have been 1. trying to dismiss the tool, 2. inexperienced, or 3. irresponsible.
You describe this as a failing of the programmer that has a clear correction (and elaborate a few more paragraphs):
You have to run it yourself! Proving to yourself that the code works is your job.
It is, and I do. Even without LLMs, almost every bug I’ve ever committed to prod has made it past “run it yourself” and the test suite. The state space of programs is usually much larger than we intuit and LLM hallucinations, like my own bugs, don’t always throw exceptions on the first run or look wrong when read.
I think you missed the point of this post. It tells the story of figuring out where one hallucination comes from and claims LLMs are especially prone to producing hallucinations about niche topics. It’s about trying to understand in depth how the tool works and the failure mode where it produces hallucinations that look plausible to inexperienced programmers; you’re responding with a moral dictum that the user is at fault for not looking at it harder. It strongly reminds me of @hwayne’s rebuttal of “discipline” advice (discussion).
Just because code looks good and runs without errors doesn’t mean it’s actually doing the right thing. No amount of meticulous code review—or even comprehensive automated tests—will demonstrably prove that code actually does the right thing. You have to run it yourself!
What does “running” the code prove?
Proving to yourself that the code works is your job. This is one of the many reasons I don’t think LLMs are going to put software professionals out of work.
So LLMs leave the QA to me, while automating the parts that have a degree of freedom and creativity to them.
Can you at least understand why some people are not that excited about LLM code assistants?
In a typed system, it proves that your code conforms to the properties of its input and output types, which is nice. In a tested system it proves whatever properties you believe your tests uphold.
So LLMs leave the QA to me, while automating the parts that have a degree of freedom and creativity to them.
QA was always on you. If you don’t enjoy using one, don’t. If you feel that it takes your freedom and creativity away, don’t use it. I don’t use LLMs for a ton of my work, especially the creative stuff.
In a typed system, it proves that your code conforms to the properties of its input and output types, which is nice. In a tested system it proves whatever properties you believe your tests uphold.
Which is at odds with the claim in the same sentence, that ‘comprehensive automated tests’ will not prove that code does the right thing. And yes, you can argue that the comprehensive tests might be correct, but do not evaluate the properties you expect the results to have, if you want to split hairs.
Evaluating code for correctness is the hard problem in programming. I don’t think anyone expected LLMs to make that better, but there’s a case to be made that LLMs will make it harder. Code-sharing platforms like Stack Overflow or Github at least provide some context about the fitness of the code, and facilitate feedback.
The article is supposed to disprove that, but all it does is make some vague claims about “running” the code (while simultaneously questioning the motives of people who distrust LLM-generated code). I don’t think it’s a great argument.
What did you think my article was trying to disprove?
It’s an article that’s mainly about all the ways LLMs can mislead you that aren’t as obvious as hallucinating a method that doesn’t exist. Even the title contains an implicit criticism of LLMs: “Hallucinations in code are the least dangerous form of LLM mistakes”.
If anything, this is a piece about why people should “distrust LLM-generated code” more!
Can you at least understand why some people are not that excited about LLM code assistants?
Because they don’t enjoy QA.
I don’t enjoy manual QA myself, but I’ve had to teach myself to get good at it - not because of LLMs, but because that’s what it takes to productively ship good software.
I actually disagree a little bit here. QA’ing every bit of functionality you use is never going to scale. At some level you have to trust the ability of your fellow human beings to fish out bugs and verify correctness. And yes, it’s easy for that trust to be abused, by supply chain attacks and even more complicated “Jia Tan”-like operations.
But just like LLMs can be said to do copyright laundering, they also launder trust, because it’s impossible for them to distinguish example code from working code, let alone vulnerable code from safe code.
What I meant was something slightly different. Almost every piece of software that’s not a bootloader runs on a distributed stack of trust. I might trust a particular open source library, I might trust the stdlib, or the operating system itself. Most likely written by strangers on the internet. It’s curl | sudo bash all the way down.
The action of importing code from github, or even copy-pasting it from stack overflow, is qualitatively different from that of trusting the output of an LLM, because an LLM gives you no indication as to whether the code has been verified.
I’d go so far as to say the fact that an LLM emitted the code gives you the sure indication it has not been verified and must be tested—the same as if I wrote quicksort on a whiteboard from memory.
I think this post is more than just another “LLMs bad” post, though I did enjoy your response post as a standalone piece. The author’s co-worker figured out pretty quickly that it didn’t work. It’s more interesting to me that the author found the source of the hallucination, and that it was a hypothetical that the author themselves had posed.
That’s why I didn’t link to the “Making o1, o3, and Sonnet 3.7 Hallucinate for Everyone” post from mine - I wasn’t attempting a rebuttal of that, I was arguing against a common theme I see in discussions any time the theme of hallucinations in code is raised.
I turned it into a full post when I found myself about to make the exact same point once again.
And that’s fair enough - in context I read your comment as a direct reply. I appreciate all the work you’ve been doing on sharing your experience, Simon!
The shorthand syntax is comfortable. I may end up using the llm schemas DSL all by itself—thanks for including it. For local models and those whose services don’t support a schema, I imagine the user could construct a suitable prompt template with that ingredient.
I’m not sold on the claim in the title, but this notion of intellectual control aptly names what we’re fighting for when we simplify. This is a heuristic for how easy the software is to change and how confident we can be that it works.
The first thing I took away from this is that if I’m ever in a debate with John Ousterhout, he’ll put words in my mouth, accusing me of feeling and believing things I don’t. I’ve never seen Bob Martin so diplomatic.
As for the goal of fearless refactoring, we now can choose languages that inherently provide an awful lot of that. Besides particularly complex functions, I would rather most explicit testing effort be spent at the feature scope.
Your response interests me because I didn’t notice that while I was reading the dialog. For my edification, would you mind pointing out one or two parts that generated those feelings?
I disagree; this illustrates your bias against comments.
Martin had just finished saying some comments are good and he then had to reinforce that he doesn’t hate them.
He has to correct Ousterhout again later:
Sorry to interrupt you; but I think you are overstating my position. I certainly never said that comments can never be helpful…
A few paragraphs later:
Given your fundamental disbelief in comments…
I was also put off by several uses of the word “Unfortunately”. They often read as jabs, positioning Ousterhout’s own opinion as if it’s ground truth, like “It’s too bad you’re wrong about that.”
In the same theme:
I think what happened here is that you were so focused on something that isn’t actually all that important (creating the tiniest possible methods) that you dropped the ball on other issues that really are important.
It just comes off as dismissive to me. However, it’s unlikely it was really that bad for the participants interacting live. Text makes a lot of things sound worse.
I think it’s ok to believe somebody thinks something that they don’t believe they think. In other words, if someone proposes a set of ideas, and those ideas have a logical conclusion, but they disavow the conclusion (while holding onto their ideas), you can justifiably ignore their disavowal.
That all holds up logically, but emotionally, I wouldn’t want to be on the receiving end of that vote of low confidence. What’s suitable to believe is not always appropriate to say.
I also switched to a Mac for the first time in October. It’s mostly working out, but a couple of things drive me nuts currently:
Home does not mean Home like on Windows/Linux
going left or right word by word with alt instead of ctrl
activating an app with several windows brings all its windows to the foreground
Also while my employer lets me do 99% of the things on this machine, Karabiner seems to need a driver I’m not allowed to install, so unfortunately I can’t remap capslock, but that’s not a huge problem.
activating an app with several windows brings all its windows to the foreground
This behavior was kept for consistency with classic Mac OS, which had this behavior due to some limitations in global data structures that were not designed to support a multitasking OS.
You’ll find on the Mac that the main functionality, while opinionated, is often complemented by one or more for-pay, well-crafted independent tools for folks who need specific functionality.
Er, not really, not if you were already using Macs when OS X came in. It behaves the same way Classic did, and Classic did that for good reasons.
TBH I never really noticed it and don’t consider it to be an inconvenience, but now it’s been spelled out to me, I can see how it might confuse those used to other desktop GUIs.
FWIW, being used to OS X being just a Mac and working like a Mac, I found the article at the top here a non-starter for me. On the other hand, I absolutely detest KDE, and I would have liked something going the other way: how to make KDE usable if you are familiar with macOS or Windows.
I should note that I was born after the year 2000 and bought my first Mac computer in 2021, so it’s perhaps mostly amazing because of my relatively small scope.
I’m not understanding how that is consistent. When you click a modern macOS window, that window alone comes to the front, unless you are using Front and Center as another mentioned. When you click a Dock icon, then all the app’s windows come forward, but classic Mac OS didn’t have a Dock. I am out of my depth, though, as an OS X-era switcher.
In the very first versions of Mac OS, there was only one program running at a time (plus desk accessories), so only that application’s windows were visible.
Then came MultiFinder and Switcher and whatever. In the earliest versions, you could switch programs: all of one program’s windows would vanish, and the next’s would appear.
Eventually you had all your windows on screen at once and two ways of switching: a menu of applications in the menu bar, and clicking on a window. If you clicked on a window in classic Mac OS, all of that application’s windows would be raised. Until Mac OS X (maybe OS 9?), it was not officially possible to interleave windows of different applications, and Mac OS to this day still raises all windows when you select an application in the Dock, just as it did when you selected an application in the switcher menu in the days of yore.
This behavior started because certain data structures in classic Mac OS were global, and the behavior stuck around for backwards compatibility reasons.
activating an app with several windows brings all its windows to the foreground
Perspective may or may not help muscle memory, and you probably know this already: This is because applications and their menu bars are the top level UI objects in macOS, whereas windows are one level down the tree. ⌘-Tab or a dock icon click will switch apps. The menu bar appears, and so do windows if they exist, but there may be none. (Some apps then create one.) If what you want is to jump to a window, Mission Control is your friend.
I suppose it’s my (weird?) setup: I have a firefox window on the (smaller) laptop screen to the right that’s always open but less used - but also one with “current tabs” on one of the two main screens.
The odd thing to me is that alt-tab gives me both firefox windows (and thus hides e.g. my IDE) and not the most recently used window like on Windows (and most linux WMs, I guess - but I use tiling there most of the time). I guess I ruined everything else by using a tiling WM for years where every window opens exactly on the screen I want it on and I never had to alt-tab in the first place :P
You went past the good point (good commit messages for important changes bring a lot of value) and into unnecessarily judgy territory. Some commits don’t need anything more than autogenerated summaries. I’d say lots of them don’t. For example
Three of those commit messages clearly cover the why, rather than the what. (That is good!) That’s important context that I don’t think an LLM could come up with based on the change.
I disagree that any of those say “why”. Why update the version (because that’s the point of the distro), why more alignment options are needed (improve compatibility slightly, but don’t provide all variants at this point), why switch to new Ubuntu (likely the old one is EOL). Only the last one touches slightly on why (satisfy rubocop), but… why?
They’re simple summaries of what has actually changed.
No one said it’s too much effort. Automating what you can doesn’t imply or entail a change in standards. My phone’s typing autocomplete is pretty good now, and I still edit.
Looking for our own weaknesses is regular practice, not the exception.
So, parse, don’t validate all your data, not just wire formats. If you practice making illegal states unrepresentable, don’t you already have this covered?
I think part of the point is that zero values make it hard to make illegal states unrepresentable, because not every thing has a valid zero state.
My experience differs from this. I find that in functional programs I frequently encounter the need (or at least the desire) to create a FooBuilder object to collect the information which I will later turn into a Foo, but which I can’t create all in one go because the process of collecting all the fields is lengthy or difficult.
For example, I may want to build a tree from the top down: find the information that goes in the root node, then using that information go obtain the data for each child of the root, one-by-one, and recurse downward. BUT, if the tree itself has invariants that are only true when all child nodes are properly populated, then I may need to build a type called “PartialTree” which is a tree-like structure that does not guarantee the invariants, then build a PartialTree recursively and finally use it to construct the Tree.
If I do this, the half-constructed objects ARE needed, it’s just that I can choose to use different types for the half-constructed object and the fully-constructed object to clearly denote to the type system when one can and can’t rely on the invariants that come with being fully constructed.
I think this could be only a semantic disagreement about what a half-constructed object is—I think if you use a separate partial type, you’ve avoided half-constructed values of any type. Maybe it’s better to say that stepwise initialization is not a good reason to loosen the final type.
It is sometimes a problem. If it is too expensive to construct two trees, you are left with the option of constructing the partial tree only, or relying on some uncommon language features for subtyping.
Performance is always the exception and beats every other rule.
This is yet another pro/con thing where I lean back in my chair, look into the distance, and say “Mhm yep can’t disagree”
I’m reminded of simple vs. easy: Every solution framed this way is of the easy kind. It increments complexity yet again.
Yup. See also “agents” in the LLM space at the moment - spend any time in those conversations and it quickly becomes clear that everyone involved has a slightly (or sometimes wildly) different definition that they incorrectly assume is shared by everyone else.
“agentic” is the new buzz word around executives at the moment. My team doesn’t even know what agentic mean.
My favorite answers to the question “what does agent mean?” are the ones that involve “agentic”, because they do precisely nothing to progress that conversation!
You really have to ask every person what they mean. A lot of them mean a background service with user rights, like a macOS launchd agent, but this time with LLMs. The other thing they may mean is a chat assistant with a grab bag of side-effecting tools to use, like Siri, but this time with LLMs. And sometimes it’s the Siri part plus the background part too, and often the tool use is a very handwaved feature set.
agreed. IMO the LLM space suffers from a terminology vacuum - outside of academic terms (i.e. all these models are autoregressive generative text-to-text transformers) there really isn’t much terminology to describe how to use models in the real world. so people come up with terms and definitions that spread very quickly because of the extreme level of interest in the area.
When you build a crawler for a large swath of the web, is it so inconvenient to avoid making your traffic to any given site a single intense burst? A shared work queue and distributed BFS seems like the least they could do.
I do occasionally wonder whether the bots themselves are built at least partially with LLM-generated code, or by engineers who lean heavily on such tools. It would explain the incredible lack of, uh, common decency on part of the bots and their owners. I don’t have any hard evidence though so it’s all speculative ¯\(ツ)/¯
If the people who are trying their best to hoover up the entire internet in their quest for a better ELIZA gave a single damn about whether they inconvenience others they’d find a different line of work.
I vaguely recall hearing that there is an open crawling initiative that aims to basically crawl the Internet and make the results available to anyone who wants them. I’m not sure if they charge for the results or not, but one can imagine an initiative that makes the results available via torrent or similar so that the service itself is cheaper than crawling the Internet on one’s own.
Common Crawl? Many early LLMs were trained on that.
Yes, thank you. That’s what I was thinking about.
I grew up as a Python programmer and type driven programming feels like such absurd tedium. Also a lot of the time even if you know what you want to do but the syntax will be godawful or the language so limited that it’s not possible.
Moved from a C# company to a Python company and most of the bugs seem to be type errors. Even misspelled class members slip through. Not that we have more bugs here, but this company is very dedicated to quality and I can only imagine what we could do with a typed language.
Basic typing is fine but once you give people a typed language, they try doing all kinds of fuckery with it.
I did my thesis on category theory so I’m sure I’ve seen the worst of it. Didn’t see much of that in our million line code base. Most people like stability after all.
I come from typed world but then spent a lot of time with Lisp and Clojure, so now I love an interactive development workflow. If I’ve succeeded at writing a short program with obviously no bugs instead of no obvious bugs, types would only have slowed me down. And I have very little patience anymore with slow compilers.
However, I still want a typed language for a large codebase. Type checks are tests. They’re the real bottom of the test pyramid, finer grained and more numerous than unit tests. When you “make illegal states unrepresentable,” types do that work.
Gradual typing makes a lot of sense to me, except the part where it’s not eventually required.
Comparing data that is in a strong type and JSON I just received over the wire, one of them has easy guarantees while the other needs runtime validation or else. I don’t want all data to be the second kind. In a big enough system you need more reasons to trust the validity of data you receive from elsewhere in the program. Strong types raise that baseline.
I struggle with getting past basic impatience to establish a TDD or BDD habit. There are also so many takes about the right way to do those things that I don’t know which are missing the point. I wouldn’t mind a recommendation if you know a really well-informed guide.
I consider myself a pretty test-focused developer and I don’t know that I would say I subscribe to any TDD or BDD habit. If you’re writing tests, you’re getting value from those tests, and you know when/what types of tests to write, you’re fine.
I think the unfortunate reality is that this really comes down to taste and experience. Factoring your code so that it’s easily testable and then writing those critical tests is a skill that you develop over time – I really don’t think it’s the kind of thing that can be distilled into a couple of talking points. But if I had to try, I would say:
This is all familiar and I feel I write testable code along these lines, but what I’d like to try is not writing code without a test waiting to be turned green, that kind of thing. I imagine the best final workflow is not so extreme but I won’t really know the balance until I overdo it.
The times where it’s easiest to do this are when you have a well-defined feature set already written out. From there, it’s a matter of converting each individual requirement into a test case and using the test itself to define the interfaces you expect you’ll need to implement that requirement. When you don’t have good requirements, this style of test writing is basically impossible.
Fair enough! Personally I say if you’re already writing testable code you don’t stand to get much benefit from test-first: its main plus is that it forces you to think about these things upfront.
recently I had a mini-epiphany about how to choose testing style. I was writing a compiler with a REST frontend, so you need both property tests (for codecs) and unit tests to pin down examples you want to work. But property tests are fastest at sampling typical values, whereas unit tests are for “zero-probability” configurations. So you need both, in different circumstances
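Roughly what that split looks like in Python, with a stand-in codec rather than the actual compiler, using hypothesis for the property side:

```python
import json
from hypothesis import given, strategies as st

def encode(ast: dict) -> str:
    # Stand-in codec for the REST frontend.
    return json.dumps(ast, sort_keys=True)

def decode(payload: str) -> dict:
    return json.loads(payload)

# Property test: samples lots of typical values and checks the roundtrip law.
@given(st.dictionaries(st.text(), st.integers()))
def test_roundtrip(ast):
    assert decode(encode(ast)) == ast

# Unit test: pins down one specific example you want to keep working,
# a configuration the sampler above would essentially never produce.
def test_nested_program_example():
    program = {"op": "add", "args": [{"op": "lit", "value": 1},
                                     {"op": "lit", "value": 2}]}
    assert decode(encode(program)) == program
```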
One tip that helped for me: hooking my test system up to something like entr, so I get immediate feedback, has made a huge difference. I learned about it from Julia Evans’s excellent blog post.
The default IDE experience in something like Xcode with running tests is slow and can make going through TDD a drag.
Yeah entr is excellent!
What do fans of (different kinds of) tiling window managers really like about them? I’m earnestly curious since I like my windows to float, pile/overlap somewhat, and be their own best sizes. Is it about putting all pixels to work? Keyboard finesse?
For me it’s an accessibility thing. It’s the best way I have found to use a desktop computer with keyboard only.
I turn off tabs in all programs that have them, so that I can navigate all windows in the same way, using at most one key per hand and keyboard-half.
I do have a nice pointing device for emergencies (drawing memes in Krita or so) but I can’t use it regularly.
There’s a big difference between tools that are merely possible to use with a keyboard, and tools that are easy to use with a keyboard, but I think it’s less obvious to users that can fall back to pointing.
It’s hard to put into words but I’ll try. This is strictly about a work setup with 3 screens, 2 big important ones and 1 smaller laptop screen (on the right). This also assumes xmonad or mimicking its workspace switching in i3.
So I usually have my browser up on the left screen (sometimes 50:50 split with another browser or a shell or whatever), my IDE up on the middle one and then emails or slack or whatever on the right one (all usually in fullscreen).
Now let’s say I share my screen for pairing, then I also want my partner/team visible, so I need Video+IDE+Slack - instead of fumbling around I’d just press win-5 or whatever on the left one and have that up there. Pressing win-5 on the middle one now will switch the IDE workspace with the video one (because my cam is on the right, so I’m more looking into the cam), and so on.
But a key point is that I am sending all specific apps to their one fixed workspace on open (something like 1=shell,2=ide,3=browser,4=email,….) so I never have to alt-tab to find anything because I just know which app I will get, and I can still temporarily move e.g. the video window to the browser workspace for a split view (or alt-tab in a single workspace, between slack and emails)
So yes, it might sound nitpicky but it’s like laying out stuff on a workbench. I don’t think about it anymore and I don’t even need to look there. Just push the mouse somewhere onto that screen and press an easy key combo and I have everything where I want it, and still dynamically move it around. If I have a second browser window open I can either put that on workspace 3 (so I alt-tab between exactly the 2 open tabs) or put it next to the other one, OR I put it on one of my undefined workspaces (e.g. 8) and have that up somewhere.
If you are a heavy window overlapper then this might not make a ton of sense, from my experience. But even on non-tiling WMs I often do 1:1 or 1:1:1 or 1:1:1:1 splits but then I have to alt-tab a lot. But yes, it’s also very much about keyboard shortcuts, which you can somewhat recreate in e.g. hammerspoon but imho not as nicely.
I like the workbench analogy. I use i3 in the same way: one workspace for terminals with various stuff (email, IRC, current project, maybe a grep/git/logtail/whatever), one with a browser window on the company chat, one with a private chat, one with a “main browser window” for everything else, one with my editor and occasionally I’ll spawn more workspaces for one-off tasks or other things I need open all the time. Right now I’m typing this, I have 7 workspaces active and I know exactly which workspace contains which windows. “A place for everything and everything in its place”, basically.
Thank you for all the details!
About ten years before I learned about the existence of tiling window managers, I already knew the bits of window management that I hated:
The first time I encountered a tiling window manager, I instantly saw that it solved these problems, switched, and never looked back.
So I guess that I have the reverse question: what do people really like about floating window managers? Do the annoyances I listed above not bother you or are you getting some benefit from the floating layout that’s worth the price?
Not to steer us off topic, but since you asked: I must not have much frustration about rearranging windows, because I get satisfaction from arranging them.
I think you can avoid needed information being unduly covered by making windows as small as they can reasonably be. It’s not necessary to ensure they never overlap anywhere. For example, my TextEdit windows are about the size of my phone in portrait, which is about 5% of the monitor’s physical area.
I generally use one 27” screen. When idle, I have several stacks of windows. Browsers on the left, terminals in the lower right corner, that kind of thing. I keep each stack neat in sort of a cascade arrangement. With an AppleScript on a key shortcut I can make a window my favorite size, for example, I can reset a terminal to 80x25 or a browser to 1024x768.
The center of the screen is available for a main app like an IDE. Such a window will overlap the side stacks, but won’t cover them.
For a task, I’ll set up my workspace in a second or two, mainly by bringing forward the specific open windows I’ll need. Being in separate stacks, they already don’t overlap. I may hide unrelated apps, too. And then I’ll launch whatever main app I need in the center, or maybe move a browser window there, or just enlarge it from the corner and reset its size later.
During the task, if a center app is frontmost and I need a side window, they’re only half-covered by it anyway, so I have a huge click target to bring that other window to the front. Fitts’s Law is my friend. Command-tab feels fast but is imprecise; I probably don’t want all windows of an app. I’d usually rather click the specific window. And if a window is covered, that’s what exposé is for. It’s on my mouse’s thumb button.
It’s not like I never get caught on this and I do need a unique setup from time to time. On my laptop screen I run a different scheme just due to real estate. But for the most part I have my stuff laid out neatly, or I have in mind a neat arrangement to return to, and I think a mouse is just good at this. (Trackpads are comparatively clumsy.)
For me it’s a mix of those (both what was already mentioned and what you suggested):
I would guess:
Keyboard finesse mostly follows from that IMO.
Would someone please explain “Stygian Blue/Reddish Green” with respect to Scheme? Pretty sure we’re not talking about Pokémon games.
It’s just versioning. See https://github.com/johnwcowan/r7rs-work/blob/master/R7RSHomePage.md
It turns out the year of the Linux desktop won’t be a result of free desktop environments catching up to macOS; it’ll be a result of macOS regressing below the level of free desktops ;-P
(Seriously: this already happened with Windows many years ago. I switched from Windows NT (!) to Linux (Red Hat, IIRC?) at work many years ago and it was faster, smoother, more stable (!), and more feature-rich.)
I can’t find the source of it but there was a line floating around like “They promised me that they would make computers as easy to use as making a telephone call. They’ve succeeded! I can no longer figure out how to make a telephone call.”
I think that’s Bjarne Stroustrup
I want an insanely great system to exist again, and I’m open to suggestions.
It looks to me like no one attempts to compete with Apple at their user experience game—consistent behavior, minimum surprise, things just working. Enjoying popularity and a lack of opposition in the space they’ve carved out, they no longer have to make their systems insanely great for many of us users to continue thinking they’re the best. Eventually that has meant they’re merely the least bad. A lot of the time I’m happy enough, but sometimes I feel stuck with all this. I wonder what battles they fight internally to keep the dream of the menu bar alive. But dammit, the same key shortcuts copy and paste in every app.
The last time I used Gnome, I mouse-scrolled through a settings screen, snagged on a horizontal slider control, adjusted the setting with no clue what the original value was, and found there’s no setting I could use to avoid that. The last time I used Windows, I was again in the system settings app but found it didn’t have a setting I remembered. I learned Control Panel still exists too, and half the settings still live there. My Mac, on the other hand, is insanely OK.
If you’re open to suggestions, have you tried Haiku before? It, too, has the same key shortcuts copy/pasting in every app (even when you use “Windows” shortcuts mode, i.e. Ctrl not Alt as the main accelerator key). We’re not quite as full-featured or polished yet as macOS was/is, but we’d like to think the potential is there in ways it’s not for the “Linux desktop” :)
Thanks; I have eyed Haiku with interest from time to time! It does strike me as in line with some of my values. Maybe I’ll give it a more earnest try.
Update, impressions after a few hours of uptime: This is nice. I could get comfortable here. It’s really cohesive and all the orthogonal combinable parts feel well chosen. The spatial Finder is alive and well. I was especially impressed when I figured out how a plain text file was able to have rich text styling, while remaining perfectly functional with cat. Someone really must be steering this ship because it emphatically doesn’t suck.
Thanks for your kind words! Some of the “orthogonal combinable” parts (“styled text in extended attributes” and spatial Tracker included) are ideas we originally inherited from BeOS (but of course we’ve continued that philosophy). And we actually don’t have a single “project leader” role; the project direction is determined by the development team (with the very occasional formal vote to make final decisions; but more often there is sufficient consensus that this simply is not needed.) But we definitely do have a real focus on cohesiveness and “doing things well”, which sometimes leads to strife and development taking longer than it does with other projects (I wrote about this a few years back in another comment on Lobsters – the article that comment was posted under, about Haiku’s package manager, is also excellent and worth a read), but in the end it leads to a much more cohesive and holistic system design and implementation, which makes it all worth it, I think.
I’ll read up, thanks!
So much this. I am continually disappointed that both Gnome and KDE managed to fuck this up. Given that they both started as conceptual clones of Windows (more-or-less) I guess it isn’t surprising, but still
On the other hand, having an entire Super key available for essentially whatever you want to do with it is… super.
A minimal Linux setup can be great. You have few components that are really well maintained, so nothing breaks. It also moves slowly. My setup is essentially the same as back in 2009: A WM (XMonad), a web browser (Firefox), and an editor (Emacs). When I need some other program for a one-off task, I launch an ephemeral Nix shell. If you prefer Wayland, Hyprland can give you many niceties offered by desktop environments, like gestures, but it is still really minimal.
Nix is really cool for keeping a system clean like that. I bounced off NixOS because persisting the user settings I cared about started to remind me of moving a cloud service definition to Terraform. I do love the GC’ed momentary tool workspaces.
If Emacs was my happy place, I think your setup would be really pleasing. But I am a GUI person, to the degree that tiling window managers make my nose wrinkle. Windows are meant to breathe and crowd, I think. That’s related to the main reason I want apps to work similarly, because I’ll be gathering several of them around a task.
I want to believe there is a world in which a critical mass of FOSS apps are behaviorally cohesive and free of pesky snags, but I have my doubts that I’d find it in Linux world, simply because the culture biases toward variety and whim, and away from central guidance and restraint. Maybe I just don’t know where to look. BSDs look nice this way in terms of their base systems, but maybe not all the way to their GUIs.
My MacBook Pro is nagging me to upgrade to the new OS release. It lists a bunch of new features that I don’t care about. In the meantime, the following bugs (which are regressions) have been unfixed for multiple major OS versions:
There are a lot of others, these are the first that come to mind. My favourite OS X release was 10.6: no new user-visible features, just a load of bug fixes and infrastructure improvements (this one introduced libdispatch, for example).
It’s disheartening to see core functionality in an “abandonware” state while Apple pushes new features nobody asked for. Things that should be rock-solid, just… aren’t.
It really makes you understand why some people avoid updates entirely. Snow Leopard’s focus on refinement feels like a distant memory now.
The idea of Apple OS features as abandonware is a wild idea, and yet here we are. The external monitor issue is actually terrible. I have two friends who work at Apple (neither in OS dev) and both have said that they experience the monitor issue themselves.
It is one thing when a company ignores bugs reported by its customers.
It is another thing when a company ignores bugs reported by its own employees that are also customer-facing.
When I worked for a FAANG, they released stuff early internally as part of dogfooding programs to seek input and bug reports before issues hit users.
Sounds good, just that “you’re not the target audience” became a meme because so many bug reports and concerns were shut down with that response.
I was thinking about this not too long ago; there are macOS features (ex the widgets UI) that don’t seem to even exist anymore. So many examples of features I used to really like that are just abandoned.
This works flawlessly for me every single time, I use Apple Studio Display at home and a high end Dell at the office.
On the other hand, activating iMessage and FaceTime on a new MacBook machine has been a huge pain for years on end…
I can attest to that, though not with my Apple account but with my brother’s. Coincidentally, he had fewer problems activating iMessage/FaceTime on a Hackintosh machine.
A variation on that which I’ve run into is turning the monitor off and putting the laptop to sleep, and waking without moving or disconnecting it.
To avoid all windows ending up stuck on the laptop display, I have to sleep the laptop, then power off the monitor. To restore: power on the monitor, then wake the laptop. Occasionally (1 in 10 times?) it still messes up and I have to manually move windows back to the monitor display.
(This is when using dual-head mode with both the external monitor and laptop display in operation)
iCloud message sync with the message history set to keep forever seems to load so much that, on my last laptop, typing long messages (more than one sentence) directly into the text box was awful. I started writing messages outside the application, then copy/pasting them in to send. The delay was multiple seconds for me.
I’m really heartened by how many people agree that OS X 10.6 was the best.
Edited to add … hm - maybe you’re not saying it was the best OS version, just the best release strategy? I think it actually was the best OS version (or maybe 10.7 was, but that’s just a detail).
Lion was hot garbage. It showed potential (if you ignored the workflow regressions) but it was awful.
10.8 fixed many of lion’s issues and was rather good.
Snow Leopard was definitely peak macOS.
Are there people who still use 10.6? I wonder what would be missing compared to current MacOS. Can it run a current Firefox? Zoom?
It would be pretty hard to run 10.6 for something other than novelty: the root certs are probably all expired, and you definitely can’t run any sort of modern Firefox on it. The last version of FF to support 10.6 was ESR 45, released in 2016: https://blog.mozilla.org/futurereleases/2016/04/29/update-on-firefox-support-for-os-x/
I know there are people keeping Windows 7 usable despite lack of upstream support; it would be cool if that existed for 10.6 but it sounds like no.
Maybe 10.6 could still be useful for professional video/audio/photo editing software, the type that wasn’t subscription based.
It was before Apple started wanting to make it more iPhone-like and slowly doing what Microsoft did with Windows 8 (which did it in a ‘big bang’) by making Windows Phone and Windows desktop almost indistinguishable. After Snow Leopard, Apple became a phone company and very iPhone-centric and just didn’t bother with the desktop - it became cartoonish and all flashy, not usable. That’s when I left MacOS and haven’t looked back.
Recently, Disk Utility has started showing a permissions error when I click unmount or eject on SD cards or their partitions, if the card was inserted after Disk Utility started. You have to quit and re-open Disk Utility for it to work. It didn’t use to be like that, but it is now, on two different Macs. This is very annoying for embedded development where you need to write to SD cards frequently to flash new images or installers. So unmounting/ejecting drives just randomly broke one day and I’m expecting it won’t get fixed.
Another forever-bug: the animation to switch workspaces takes more time on higher refresh rate screens. This has forced me to completely change how I use macOS to de-emphasise workspaces, because the animation is just obscenely long after I got a MacBook Pro with a 120Hz screen in 2021. Probably not a new bug, but an old bug that new hardware surfaced, and I expect it will never get fixed.
I’m also having issues with connecting to external screens only working occasionally, at least through USB-C docks.
The hardware is so damn good. I wish anyone high up at Apple cared at all about making the software good too.
Oh, there’s another one: the fstab entries to not mount partitions that match a particular UUID no longer work, and there doesn’t appear to be any replacement functionality (which is annoying when it’s a firmware partition that must not be written to except in a specific way, or it will soft-brick the device).
Oh, fun! I’ve tried to find a way to disable auto mount and the only solution I’ve found is to add individual partition UUIDs to a block list in fstab, which is useless to me since I don’t just re-use the same SD card with the same partition layout all the time; I would want to disable auto mounting completely. But it’s phenomenal to hear that they broke even that sub-par solution.
Maybe it’s an intended “feature”, because 120Hz enabled iPhones and iPads have the same behavior.
Maybe, but we’re talking about roughly 1.2 seconds from the start of the gesture until keyboard input starts going to an app on the target workspace. That’s an insane amount of delay to just force the user to sit through on a regular basis… On a 60Hz screen, the delay is less than half that (which is still pretty long, but much much better)
Not a fix, but as a workaround have you tried Accessibility > Display > Reduce Motion?
I can’t stand the normal desktop switch animation even when dialed down all the way. With that setting on, there’s still a very minor fade-type effect but it’s pretty tolerable.
Sadly, that doesn’t help at all. My issue isn’t with the animation, but with the amount of time it takes from I express my intent to switch workspace until focus switches to the new workspace. “Reduce Motion” only replaces the 1.2 second sliding animation with a 1.2 second fading animation, the wait is exactly the same.
Don’t update/downgrade to Sequoia! It’s the Windows ME of macOS releases. After the Apple support person couldn’t resolve any of the issues I had, they told me to reinstall Sequoia and then gave me instructions to upgrade to Ventura/Sonoma.
I thought Big Sur was the Windows ME of (modern) Mac OS. I have had a decent experience in Sequoia. I usually have Safari, Firefox, Chrome, Mail, Ghostty, one JetBrains thing or another (usually PyCharm Pro or Clion), Excel, Bitwarden, Preview, Fluor, Rectangle, TailScale, CleanShot, Fantastical, Ice and Choosy running pretty much constantly, plus a rotating cast of other things as I need them.
Aside from Apple Intelligence being hot garbage (I just turn that off anyway), my main complaint about Sequoia is that sometimes, after a couple dozen dock/undock cycles (return to my desk, connect to my docking station with a 30” non-hidpi monitor, document scanner, time machine drive, smart card reader, etc.), the windows that were on my MacBook’s high resolution screen and moved to my 30” when docked don’t re-scale appropriately, and I have to reboot to address that. That seems to happen every two weeks or so.
Like so many others here, I miss Snow Leopard. I thought Tiger was an excellent release, Leopard was rough, and Snow Leopard smoothed off all the rough edges of Tiger and Leopard for me.
I’d call Sequoia “subpar” if Snow Leopard is your “par”. But I don’t find that to be the case compared to Windows 11, KDE or GNOME. It mostly just stays out of my way.
Have you ever submitted these regressions to Apple through a support form or such?
Apple’s bug reporting process is so opaque it feels like shouting into the void.
And, Apple isn’t some little open source project staffed by volunteers. It’s the richest company on earth. QA is a serious job that Apple should be paying people for.
Yeah. To alleviate that somewhat (for developer-type bugs) when I was making things for Macs and iDevices most of the time, I always reported my bugs to openradar as well:
https://openradar.appspot.com/page/1
which would at least net me a little bit of feedback (along the lines of “broken for everyone or just me?”) so it felt a tiny bit less like shouting into the void.
I can’t remember on these. The CalDAV one is well known. Most of the time when I’ve reported bugs to Apple, they’ve closed them as duplicates and given no way of tracking the original bug.
No. I tried being a good user in the past but it always ended up with “the feature works as expected”. I won’t do voluntary work for a company which repeatedly shits on user feedback.
I wonder if this means that tests have been red for years, or that there are no tests for such core functionality.
Sometimes we are the tests, and yet the radars go unread
10.6 “Snow Leopard” was the last Mac OS that I could honestly say I liked. I ran it on a cheap mini laptop (a Dell I think) as a student, back when “hackintoshes” were still possible.
I feel like it’s nearly always been like this. We rarely get to optimize or simplify our systems just because smaller and lighter is faster, cheaper, and longer-lived. Usually we do it to remove outright blockage. When the pain is relieved, we stop. When the blockage is gone, some other problem is now a bigger deal—the feature that could make money doesn’t exist yet. So we grow insensitive to lesser pains. Then one day we see 1,000 cuts, not necessarily because we noticed them ourselves, but because we found a new (or old) alternative with eye-opening tradeoffs.
Every time this topic comes up I post a similar comment about how hallucinations in code really don’t matter because they reveal themselves the second you try to run that code.
This time I’ve turned that into a blog post: https://simonwillison.net/2025/Mar/2/hallucinations-in-code/
As someone who saw 6 months of Copilot kill so many systems due to the accumulation of latent hallucinations… Yeah. No.
That’s fascinating. I’d really enjoy hearing some more about that. Was this a team project? Were there tests? I feel like this would be really valuable as a sort of post mortem.
Lots of different teams and projects. I am talking 30% of a 1k-engineer department being feature frozen for months to try to dig out of the mess.
And yes, there were tests. Tests do not even start to cut it. We are talking death by a thousand deep cuts.
This is btw not a single anecdote. My network of “we are here to fix shit” people are flooded with these cases. I expect the tech industry output to plummet starting soon.
Again, really interesting and I’d love more details. I am at a company that has adopted code editors with AI and we have not seen anything like that at all.
That just sounds so extreme to me. Feature frozen for months is something I’ve personally never even heard of, I’ve never experienced anything like that. It feels kind of mind boggling that AI would have done that.
Did developers spend six months checking in code that they hadn’t tested? Because yeah, that’s going to suck. That’s the premise of my post.
Nope. They had tested it. But to test, you have to be able to understand the failure cases, which you have heuristics for based on how humans write code.
These things are trained exactly to avoid this detection. This is how they get good grades. Humans supervising them is not a viable strategy.
I’d like to understand this better. Can you give an example of something a human reviewer would miss because it’s the kind of error a human code author wouldn’t make but an LLM would?
I’m with @Diana here. You test code, but testing does not guarantee the absence of bugs. Testing guarantees the absence of a specific bug that is tested for. LLM-generated code has a habit of failing in surprising ways that humans fail to account for.
This isn’t really my experience unless you just say “write tests”. ex: https://insanitybit.github.io/2025/02/11/i-rolled-my-own-crypto
I used AI primarily for generating test cases, specifically prompting for property tests to check the various properties we expect the cryptography to uphold. A test case found a bug.
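For anyone curious what that looks like in practice, a minimal sketch along those lines, using hypothesis and the cryptography package’s Fernet as a stand-in (the post’s actual code and properties differ):

```python
from hypothesis import given, strategies as st
from cryptography.fernet import Fernet

fernet = Fernet(Fernet.generate_key())

@given(st.binary())
def test_decrypt_inverts_encrypt(plaintext):
    # Roundtrip property: decryption recovers exactly the original bytes.
    assert fernet.decrypt(fernet.encrypt(plaintext)) == plaintext

@given(st.binary(min_size=1))
def test_ciphertext_differs_from_plaintext(plaintext):
    # Sanity property: the token never equals the message it encodes.
    assert fernet.encrypt(plaintext) != plaintext
```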
“Those are scary things, those gels. You know one suffocated a bunch of people in London a while back?”
Yes, Joel’s about to say, but Jarvis is back in spew mode. “No shit. It was running the subway system over there, perfect operational record, and then one day it just forgets to crank up the ventilators when it’s supposed to. Train slides into station fifteen meters underground, everybody gets out, no air, boom.”
Joel’s heard this before. The punchline’s got something to do with a broken clock, if he remembers it right.
“These things teach themselves from experience, right?,” Jarvis continues. “So everyone just assumed it had learned to cue the ventilators on something obvious. Body heat, motion, CO2 levels, you know. Turns out instead it was watching a clock on the wall. Train arrival correlated with a predictable subset of patterns on the digital display, so it started the fans whenever it saw one of those patterns.”
“Yeah. That’s right.” Joel shakes his head. “And vandals had smashed the clock, or something.”
You imply that because one kind of hallucination is obvious, all hallucinations are so obvious that (per your next 3 paragraphs) the programmer must have been 1. trying to dismiss the tool, 2. inexperienced, or 3. irresponsible.
You describe this as a failing of the programmer that has a clear correction (and elaborate a few more paragraphs):
It is, and I do. Even without LLMs, almost every bug I’ve ever committed to prod has made it past “run it yourself” and the test suite. The state space of programs is usually much larger than we intuit and LLM hallucinations, like my own bugs, don’t always throw exceptions on the first run or look wrong when read.
I think you missed the point of this post. It tells the story of figuring out where one hallucination comes from and claims LLMs are especially prone to producing hallucinations about niche topics. It’s about trying to understand in depth how the tool works and the failure mode where it produces hallucinations that look plausible to inexperienced programmers; you’re responding with a moral dictum that the user is at fault for not looking at it harder. It strongly reminds me of @hwayne’s rebuttal of “discipline” advice (discussion).
What does “running” the code prove?
So LLMs leave the QA to me, while automating the parts that have a degree of freedom and creativity to them.
Can you at least understand why some people are not that excited about LLM code assistants?
In a typed system, it proves that your code conforms to the properties of its input and output types, which is nice. In a tested system it proves whatever properties you believe your tests uphold.
QA was always on you. If you don’t enjoy using one, don’t? If you feel that it takes your freedom and creativity away, don’t use it. I don’t use LLMs for a ton of my work, especially the creative stuff.
Which is at odds with the claim in the same sentence, that ‘comprehensive automated tests’ will not prove that code does the right thing. And yes, you can argue that the comprehensive tests might be correct, but do not evaluate the properties you expect the results to have, if you want to split hairs.
Evaluating code for correctness is the hard problem in programming. I don’t think anyone expected LLMs to make that better, but there’s a case to be made that LLMs will make it harder. Code-sharing platforms like Stack Overflow or Github at least provide some context about the fitness of the code, and facilitate feedback.
The article is supposed to disprove that, but all it does is make some vague claims about “running” the code (while simultaneously questioning the motives of people who distrust LLM-generated code). I don’t think it’s a great argument.
Ah, I see what you mean. Yes, I don’t think that “running” code is sufficient testing for hallucinations.
What did you think my article was trying to disprove?
It’s an article that’s mainly about all the ways LLMs can mislead you that aren’t as obvious as hallucinating a method that doesn’t exist. Even the title contains an implicit criticism of LLMs: “Hallucinations in code are the least dangerous form of LLM mistakes”.
If anything, this is a piece about why people should “distrust LLM-generated code” more!
Ah, if you restrict ‘hallucinations’ to specifically mean non-extant functions or variables, then I can see where you’re coming from.
Because they don’t enjoy QA.
I don’t enjoy manual QA myself, but I’ve had to teach myself to get good at it - not because of LLMs, but because that’s what it takes to productively ship good software.
I actually disagree a little bit here. QA’ing every bit of functionality you use is never going to scale. At some level you have to trust the ability of your fellow human beings to fish out bugs and verify correctness. And yes, it’s easy for that trust to be abused, by supply chain attacks and even more complicated “Jia Tan”-like operations.
But just like LLMs can be said to do copyright laundering, they also launder trust, because it’s impossible for them to distinguish example code from working code, let alone vulnerable code from safe code.
That’s fair - if you’re working on a team with shared responsibility for a codebase you should be able to trust other team members to test their code.
You can’t trust an LLM to test its code.
What I meant was something slightly different. Almost every piece of software that’s not a bootloader runs on a distributed stack of trust. I might trust a particular open source library, I might trust the stdlib, or the operating system itself. Most likely written by strangers on the internet. It’s curl | sudo bash all the way down.
The action of importing code from github, or even copy-pasting it from stack overflow, is qualitatively different from that of trusting the output of an LLM, because an LLM gives you no indication as to whether the code has been verified.
I’d go so far as to say the fact that an LLM emitted the code gives you the sure indication it has not been verified and must be tested—the same as if I wrote quicksort on a whiteboard from memory.
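And the verification step looks the same either way; a sketch of what testing the whiteboard quicksort might look like, checking it against an oracle:

```python
from hypothesis import given, strategies as st

def quicksort(xs: list[int]) -> list[int]:
    # The kind of function you might dash off from memory, and then test anyway.
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    return (quicksort([x for x in rest if x < pivot])
            + [pivot]
            + quicksort([x for x in rest if x >= pivot]))

@given(st.lists(st.integers()))
def test_agrees_with_builtin_sort(xs):
    # Oracle test: the untrusted implementation must agree with sorted().
    assert quicksort(xs) == sorted(xs)
```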
I think this post is more than just another “LLMs bad” post, though I did enjoy your response post as a standalone piece. The author’s co-worker figured out it didn’t work pretty quickly. It’s more interesting to me that the author found the source of the hallucination, and that it was a hypothetical that the author themselves had posed.
That’s why I didn’t link to the “Making o1, o3, and Sonnet 3.7 Hallucinate for Everyone” post from mine - I wasn’t attempting a rebuttal of that, I was arguing against a common theme I see in discussions any time the theme of hallucinations in code is raised.
I turned it into a full post when I found myself about to make the exact same point once again.
And that’s fair enough - in context I read your comment as a direct reply. I appreciate all the work you’ve been doing on sharing your experience, Simon!
The shorthand syntax is comfortable. I may end up using llm schemas dsl all by itself—thanks for including it. For local models and those whose services don’t support a schema, I imagine the user could construct a suitable prompt template with that ingredient.
I’m not sold on the claim in the title, but this notion of intellectual control aptly names what we’re fighting for when we simplify. This is a heuristic for how easy the software is to change and how confident we can be that it works.
Oo, plenty of these are deep cuts I have never heard of. That’s a bookmark.
I submitted this because my list of submissions and upvotes are a better list of bookmarks than any bookmark manager I’ve tried to commit to using!
Same. I feel like I learned something new about macOS key commands.
The first thing I took away from this is that if I’m ever in a debate with John Ousterhout, he’ll put words in my mouth, accusing me of feeling and believing things I don’t. I’ve never seen Bob Martin so diplomatic.
As for the goal of fearless refactoring, we now can choose languages that inherently provide an awful lot of that. Besides particularly complex functions, I would rather most explicit testing effort be spent at the feature scope.
Your response interests me because I didn’t notice that while I was reading the dialog. For my edification, would you mind pointing out one or two parts that generated those feelings?
Sure. For example,
Martin had just finished saying some comments are good and he then had to reinforce that he doesn’t hate them.
He has to correct Ousterhout again later:
A few paragraphs later:
I was also put off by several uses of the word “Unfortunately”. They often read as jabs, positioning Ousterhout’s own opinion as if it’s ground truth, like “It’s too bad you’re wrong about that.”
In the same theme:
It just comes off as dismissive to me. However, it’s unlikely it was really that bad for the participants interacting live. Text makes a lot of things sound worse.
Thanks for the insights. I appreciate it.
I think it’s ok to believe somebody thinks something that they don’t believe they think. In other words, if someone proposes a set of ideas, and those ideas have a logical conclusion, but they disavow the conclusion (while holding onto their ideas), you can justifiably ignore their disavowal.
That all holds up logically, but emotionally, I wouldn’t want to be on the receiving end of that vote of low confidence. What’s suitable to believe is not always appropriate to say.
I also switched to a Mac for the first time in October; it’s mostly working out, but a couple of things drive me nuts currently:
Also while my employer lets me do 99% of the things on this machine, Karabiner seems to need a driver I’m not allowed to install, so unfortunately I can’t remap capslock, but that’s not a huge problem.
Some amount of remapping, e.g. caps-lock -> ctrl, can be done via the “customize modifier keys” option in the settings menu itself.
This is the way. You have to do it for every new keyboard that gets plugged in, but I have all capslock keys assigned to escape. On Sequoia it’s:
System Settings -> Keyboard -> Keyboard Shortcuts -> Modifier Keys
You have to go through the drop-down menu on the top to change the setting for each keyboard individually.
This behavior was kept for consistency with classic Mac OS, which worked that way due to limitations in global data structures that were not designed to support a multitasking OS.
This app might help with that:
https://hypercritical.co/front-and-center/
You’ll find on the Mac that the main functionality, while opinionated, is often complemented by one or more for-pay, well-crafted independent tools for folks who need specific functionality.
For long-time Mac people, a fun bit of trivia is that it’s written by John Siracusa.
This is an amazing fact - where did you learn about this peculiarity?
Not sure where I first read about it, but it goes back to Switcher/MultiFinder on classic MacOS in the mid 80’s.
Er, not really, not if you were already using Macs when OS X came in. It behaves the same way Classic did, and Classic did that for good reasons.
TBH I never really noticed it and don’t consider it to be an inconvenience, but now it’s been spelled out to me, I can see how it might confuse those used to other desktop GUIs.
FWIW, being used to OS X being just a Mac and working like a Mac, I found the article at the top here a non-starter for me. On the other hand, I absolutely detest KDE, and I would have liked something going the other way: how to make KDE usable if you are familiar with macOS or Windows.
I should note that I was born after the year 2000 and bought my first Mac computer in 2021, so it’s perhaps mostly amazing because of my relatively small scope.
Oh my word!
Fair enough, then…
(I can, just barely, remember the 1960s. Right now I am resisting the urge to crumble into dust and blow away on the breeze with all my might.)
I’m not understanding how that is consistent. When you click a modern macOS window, that window alone comes to the front, unless you are using Front and Center as another mentioned. When you click a Dock icon, then all the app’s windows come forward, but classic Mac OS didn’t have a Dock. I am out of my depth, though, as an OS X-era switcher.
In the very first versions of Mac OS, there was only one program running at a time (plus desk accessories), so only that application’s windows were visible.
Then came MultiFinder and Switcher and whatever. In the earliest versions, you could switch programs: all of one program’s windows would vanish, and the next’s would appear.
Eventually you had all your windows on screen at once and two ways of switching: a menu of applications in the menu bar, and clicking on a window. If you clicked on a window in classic Mac OS, all of that application’s windows would be raised. Until Mac OS X (maybe OS 9?), it was not officially possible to interleave windows of different applications, and Mac OS to this day still raises all windows when you select an application in the Dock, just as it did when you selected an application in the switcher menu in the days of yore.
This behavior started because certain data structures in classic Mac OS were global, and the behavior stuck around for backwards compatibility reasons.
Perspective may or may not help muscle memory, and you probably know this already: This is because applications and their menu bars are the top level UI objects in macOS, whereas windows are one level down the tree. ⌘-Tab or a dock icon click will switch apps. The menu bar appears, and so do windows if they exist, but there may be none. (Some apps then create one.) If what you want is to jump to a window, Mission Control is your friend.
I know, but thanks.
I suppose it’s my (weird?) setup: I have a Firefox window on the (smaller) laptop screen to the right that’s always open but less used - but also one with “current tabs” on one of the two main screens.
The odd thing to me is that alt-tab gives me both Firefox windows (and thus hides e.g. my IDE) and not the most recently used window like on Windows (and most Linux WMs, I guess - but I use tiling there most of the time). I guess I ruined everything else by using a tiling WM for years, where every window opens exactly on the screen I want it to be and I never had to alt-tab in the first place :P
I went over this and I dislike how cumbersome Karabiner feels. You can remap keys pretty easily with a custom Launchd user agent. Some resources:
Obviously you should try the hidutil command line by itself before creating the service.
As an example, here’s what I have in ~/Library/LaunchAgents/org.custom.kbremap.plist (working on Sequoia 15.3.1): https://x0.at/BNtu.txt
I hope this helps.
I can try again, thanks - but I’m pretty sure I spent some hours researching and couldn’t get it to work as a non-modifier in a way I need it.
I have a custom app I wrote to remap Command to Escape when pressed without being held, for the exact same reason that I couldn’t install Karabiner on a work laptop (those MacBook Pros with no physical escape key).
You could probably tweak it for your own purposes, hopefully without too much difficulty.
If writing a commit message is too much effort for you, you need to entertain seriously the notion that you’re bad at your job.
You went past the good point (good commit messages for important changes bring a lot of value) and into the unnecessarily judgy territory. Some commits don’t need anything more than autogenerated summaries. I’d say lots of them don’t. For example
Three of those commit messages clearly cover the why, rather than the what. (That is good!) That’s important context that I don’t think an LLM could come up with based on the change.
I disagree any of those say “why”. Why update the version (because that’s the point of the distro), why more alignment options are needed (improve compatibility slightly, but don’t provide all variants at this point), why switch to new Ubuntu (likely the old one is EOL). Only the last one touches slightly on why (satisfy rubocop), but… why?
They’re simple summaries of what has actually changed.
Agreed… those are subpar commit messages. That doesn’t mean that we “don’t need anything more”.
No one said it’s too much effort. Automating what you can doesn’t imply or entail a change in standards. My phone’s typing autocomplete is pretty good now, and I still edit.
Looking for our own weaknesses is regular practice, not the exception.