Can we use Determinate Nix without nixd? My reason is that I assume things like parallel evaluation will be implemented in Nix and not nixd. I have zero interest in nixd, but I would really appreciate parallel evaluation.
Sure, you can get the source from https://github.com/DeterminateSystems/nix-src. It is fully functional on its own. Obviously we don’t support using it in that fashion, but you probably didn’t care about that anyway.
The main argument against is that even if you assume good intentions, it won’t be as close to production as a hosted CI (e.g. database version, OS type and version, etc.).
Lots of developers develop on macOS and deploy on Linux, and there are tons of subtle differences between the two systems, such as filesystem case sensitivity and default sort ordering, just to give a couple of examples.
To me the point of CI isn’t to ensure devs ran the test suite before merging. It’s to provide an environment that will catch as many things as possible that a local run wouldn’t be able to catch.
To me the point of CI isn’t to ensure devs ran the test suite before merging.
I’m basically repeating my other comment, but I’m amped up about how much I dislike this idea, probably because it would tank my productivity, and this was too good an example to pass up: the point of CI isn’t (just) to ensure I ran the test suite before merging - although that’s part of it, because what if I forgot? The bigger point, though, is to run the test suite so that I don’t have to.
I have a very, very low threshold for what’s acceptably fast for a test suite. Probably 5-10 seconds or less. If it’s slower than that, I’m simply not going to run the entire thing locally, basically ever. I’m gonna run the tests I care about, and then I’m going to push my changes and let CI either trigger auto-merge, or tell me if there are other tests I should have cared about (oops!). In the meantime, I’m fully context-switched away, not even thinking about that PR, because the work is being done for me.
You’re definitely correct here but I think there are plenty of applications where you can like… just trust the intersection between app and os/arch is gonna work.
But now that I think about it, this is such a GH-bound project and like… any such app small enough in scope or value for this to be worth using can just use the free Actions minutes. Doubt they’d go over.
any such app small enough in scope or value for this to be worth using can just use the free Actions minutes.
Yes, that’s the biggest thing that doesn’t make sense to me.
I get the argument that hosted runners are quite weak compared to many developer machines, but if your test suite is small enough to be run on a single machine, it can probably run about as fast if you parallelize your CI just a tiny bit.
With a fully containerized dev environment, yes, that pretty much abolishes the divergence in software configuration.
But there are more concerns than just that. Does your app rely on some caches? Dependencies?
Were they in a clean state?
I know it’s a bit of an extreme example, but I spend a lot of time using bundle open and editing my gems to debug stuff, and it’s not rare that I forget to run gem pristine after an investigation.
This can lead me to have tests that pass on my machine, and will never work elsewhere. There are millions of scenarios like this one.
I was once rejected from a job (partly) because the Dockerfile I wrote for my code assignment didn’t build on the assessor’s Apple Silicon Mac. I had developed and tested on my x86-64 Linux device. Considering how much server software is built with the same pair of configurations just with the roles switched around, I’d say they aren’t diminished enough.
Was just about to point this out. I’ve seen a lot of bugs in aarch64 Linux software that don’t exist in x86-64 Linux software. You can run a container built for a non-native architecture through Docker’s compatibility layer, but it’s a pretty noticeable performance hit.
One of the things that I like about having CI is the fact that it forces you to declare your dev environment programmatically. It means that you avoid the famous “works on my machine” issue, because if tests work on your machine but not in CI, something is missing.
There are of course ways to avoid this issue, maybe if they enforced that all dev tests also run in a controlled environment (either via Docker or maybe something like testcontainers), but it needs more discipline.
This is by far the biggest plus side to CI. Missing external dependencies have bitten me before, but without CI, they’d bite me during deploy, rather than as a failed CI run. I’ve also run into issues specifically with native dependencies on Node, where it’d fetch the correct native dependency on my local machine, but fail to fetch it on CI, which likely means it would’ve failed in prod.
This is something “local CI” can check for. I’ve wanted this, so I added it to my build server tool (which normally runs on a remote machine) called ding. I’ll run something like “ding build make build”, where “ding build” is the CI command and “make build” is what it runs. It clones the current git repo into a temporary directory and runs the command “make build” in it, sandboxed with bubblewrap.
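The core idea is small enough to sketch. Here’s a hypothetical, stripped-down version of the “clone, then run sandboxed” flow (not the actual ding code; the bwrap bind mounts are just a guess at a minimal toolchain):

```go
// Hypothetical sketch of a "local CI" runner: clone the current repo into
// a temp dir, then run the given command there inside a bubblewrap sandbox.
package main

import (
	"log"
	"os"
	"os/exec"
)

func main() {
	if len(os.Args) < 2 {
		log.Fatal("usage: localci <command> [args...]")
	}

	// A fresh clone in a temp dir, so untracked files and local hacks
	// can't leak into the build.
	tmp, err := os.MkdirTemp("", "localci-")
	if err != nil {
		log.Fatal(err)
	}
	defer os.RemoveAll(tmp)
	if out, err := exec.Command("git", "clone", "--quiet", ".", tmp).CombinedOutput(); err != nil {
		log.Fatalf("git clone: %v\n%s", err, out)
	}

	// Run the command inside a bubblewrap sandbox rooted at the clone.
	// The read-only binds are a guess at a minimal toolchain; adjust
	// them for your distro.
	args := []string{
		"--ro-bind", "/usr", "/usr",
		"--ro-bind", "/bin", "/bin",
		"--ro-bind", "/lib", "/lib",
		"--bind", tmp, tmp,
		"--dev", "/dev",
		"--proc", "/proc",
		"--unshare-net", // no network: flushes out hidden download steps
		"--chdir", tmp,
	}
	cmd := exec.Command("bwrap", append(args, os.Args[1:]...)...)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}
```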
The point still stands that you can forget to run the local CI.
It is sad that Tk doesn’t support Wayland, and there doesn’t seem to be any effort to do so except for an Android port that doesn’t seem to be well supported anywhere.
It would make for a really good cross-platform UI toolkit even if it looks slightly strange. One of my favorite Git UIs (gitk) is written in it and I still use it every other week, but it looks really ugly on Wayland because of the non-integer scaling I use on my monitors.
Not really related to the topic, but one of the things that I most dislike in Go is the heavy usage of magic comments.
I understand the idea that they want new code to be parsable by older versions of the language, but considering how heavily the language relies on magic comments I can’t see why it didn’t get a dedicated marker, like C’s preprocessor #, instead.
I wouldn’t say it’s “heavy”, more like a sprinkling. There are only three magic comments that are generally used (build, embed, generate), and even then they’re used sparingly. Grafana’s Tempo has ~170k lines of Go and only 5 of those lines are //go: directives. A very large class of programs will never use any of them.
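For reference, here’s a contrived single file showing all three (the update-assets.sh script and the static/ directory are made up for the example):

```go
//go:build linux

// Package assets is a contrived example showing the three directives
// that come up most often: build constraints, go:generate, and go:embed.
package assets

import "embed"

// Running "go generate ./..." executes the command below from this
// package's directory (update-assets.sh is a made-up helper script).
//go:generate ./update-assets.sh

// The embed directive bakes everything under static/ (assumed to exist
// next to this file) into the compiled binary.
//
//go:embed static
var Static embed.FS
```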
It really depends. For example, I have a small project of around 2k lines of Go, and I had to use multiple build constraints since the code is for a desktop application and needs different code paths for Linux/macOS/Windows.
It is even worse if you need to use CGo, since if you need to embed C code you basically have to write the C code as comments. And since they’re comments, you don’t get syntax highlighting at all (or at least I don’t know of a text editor that supports it; my Neovim does support embedded code in general, since it works fine in Nix).
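For anyone who hasn’t run into it, a minimal cgo file looks roughly like this; the entire C side lives in the comment block directly above import "C", which is exactly the part editors treat as opaque comment text:

```go
// Contrived cgo example: the C source sits inside the comment that must
// immediately precede `import "C"`.
package main

/*
#include <stdio.h>
#include <stdlib.h>

static void greet(const char *name) {
	printf("hello, %s\n", name);
}
*/
import "C"

import "unsafe"

func main() {
	name := C.CString("gopher") // allocated with C's malloc
	defer C.free(unsafe.Pointer(name))
	C.greet(name)
}
```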
Now keep in mind that in general I don’t think it is a huge issue, but I still think the language would have been more elegant if it had something else to do those transformations. But again, this is Go, so maybe being ugly is better.
Quick reply, but it’s more about being simple rather than ugly. A preprocessor is even more magical and unwieldy than magic comments. C code often has endless runs of #ifdef ... #ifndef ... to separate build platforms making the code mostly unreadable. Not to mention the magic macros that change the code from underneath your feet so you don’t even know what you’re reading.
Even Rust’s macros allow one to create their own little language that no one else understands. Given that the express purpose of Go was to be easy to read for anyone not familiar with the codebase, that’d have been a non-starter.
I figure the Go authors wanted, rightfully, to steer clear of the mess that is preprocessors.
Maybe I was not clear: I was not suggesting that Go have a preprocessor, just that this kind of magic be indicated with a symbol distinct from comments. I suggested that it could be C’s preprocessor marker (i.e. #), but not that it would be a preprocessor per se.
The idea is that this would create a clean separation between what is a comment and what is something special interpreted by the compiler. Something visibly different would make it easier to see that it is doing something, e.g. editors could give it different syntax highlighting.
Looking at the comments, it seems that it is the time of the month to complain about open-source desktop stacks. Let me add my own complaint: why aren’t “window manager” and “desktop environment” separate things in practice? I’m using Gnome with keybinding hacks to feel somewhat like a tiling WM. I would prefer to use a proper WM, but I want the “desktop environment” part of Gnome more: providing me with configuration screens to decide the display layout when plugging in an external monitor, having plugging in a USB disk just work, having configuration screens to configure bluetooth headsets, having an easy time setting up a printer, having a secrets manager handle my SSH connections, etc.
None of this should be intrinsically coupled with window management logic (okay, maybe the external-monitor configuration one), yet for some reason I don’t know of any project that has succeeded in taking the “desktop environment” part of Gnome or KDE or XFCE and swapping the window manager for something nice. (There have been hacks on top of KWin or gnome-shell, some of them like PaperWM are impressive, but they feel like piling complexity and corner cases on top of a mess rather than a proper separation of concerns.)
The alternative that I know of currently is to spend days reading the Arch Linux wiki to find out how to set up a systray on your tiling WM to get the NetworkManager applet (for some reason the NetworkManager community can’t be bothered to come up with a decent TUI, although it would clearly be perfectly appropriate for its configuration), re-learn about another system-interface layer to get USB keys to automount, figure out which bluetooth daemon to run manually, etc. (It may be that Nix or other declarative-minded systems make this easier than old-school distributions.) This is also relevant for the Wayland discussion, because Wayland broke things for several of these subsystems and forced people to throw away decades of such manual configuration to rebuild it in various ways.
Another approach, of course, would be to have someone build a pleasant, consistent “desktop environment” experience that is easy to reuse on top of independent WM projects. But I suspect that this is actually the more painful and less fun part of the problem (this plumbing gets ugly fast), so it may be that only projects that envision themselves with a large userbase of non-expert users can be motivated enough to pull it off. Maybe this would have more chances of succeeding if we had higher-level abstractions to talk to these subsystems (maybe Syndicate and its system-layer project, which discusses exactly this, maybe Goblins, whatever) that the various subsystem owners would be willing to adopt, and that would make it easier to have consistent tools to manipulate and configure them.
The latest versions of lxqt and xfce support running on pretty much any compositor that supports xdg-layer-shell (and in fact, neither lxqt nor xfce ship a compositor of their own). Cosmic also has some support for running with other compositors, although it does ship its own. There’s definitely room for other desktop environments to support this, too.
Another approach, of course, would be to have someone build a pleasant, consistent “desktop environment” experience that is easy to reuse on top of independent WM projects.
I think this is the main reason I use NixOS nowadays: you configure things the way you want, and they will be there even if you reinstall the system. In some ways I think NixOS is more of a meta-distro that you customize the way you want, and to make that easier there are lots of modules for configuring things like audio or a systray.
You will still need to spend days reading documentation and code to get there, but once it is working it rarely breaks (of course it does break eventually, but generally it is only one thing instead of several of them, so it is relatively easy to get it working again).
What you describe is a declarative configuration of the hodgepodge of services that form a “desktop environment” today, which is easy to transfer to new systems and to tweak as things change. This is not bad (and I guess most tiling-WM-with-not-much-more users have a version of this); it is a way to manage the heterogeneity that exists today.
But I had something better in mind. Those services could support a common format/protocol to expose their configuration capabilities, and it would then be easy for user-facing systems to build unified configuration tools for them (in your favorite GUI toolkit, as a TUI, whatever). systemd standardized a lot of things about actually running small system services, but not much about exposing their options/configuration to users.
One of the main obstacles in using Nix for development environments is mastering the language itself. It takes time to become proficient writing Nix.
How about using AI to generate it instead?
This just sounds like a really bad idea. If the language is unapproachable, change the language or help people learn it. Requiring an LLM for generating configuration will just make the problem worse over time.
Let me rephrase: If the path of least resistance is generating configuration with an LLM, most people will follow this path, and this path doesn’t aid in learning in any way.
Also, it will paper over the language’s complexity problems, potentially making them worse over time.
The learning path is never followed, and the complexity isn’t tackled. Hence: The LLM becomes a de-facto requirement.
It helps with generating configuration without thinking about it or understanding it. The configuration becomes something obscure and assumed to work well that only gets updated by LLMs and no one else. There’s no incentive for the average person to understand what they’re doing.
But this is already possible, right? I can ask any LLM right now to generate a shell.nix for project X, and it will generate it. While I don’t like the idea of auto-generating code via LLM, I can understand how having something to scaffold code can be nice in some cases.
Heck, even before LLMs people would have things like snippets to generate code, and we also had things like rails generate to generate boilerplate code for you. This only goes one step further.
Yes, and we don’t want to make it worse or pretend that it’s acceptable.
snippets to generate code, and […] rails generate to generate boilerplate code for you. This only goes one step further.
I have the opinion that boilerplate generators (LLM or not) are a symptom of a problem, not a proper solution. Ignoring that, at least a regular generator:
Can impose limits. Meaning: Providing something very basic that should always work and is easy to understand, requiring the user to learn a bit in order to go further.
The output of the generator is deterministic and controlled by the framework authors, so you can impose those limits and make sure that the output is reasonably safe.
LLMs are not good learning tools because they cannot say “no”. You need to be experienced enough to ask reasonable questions in order to get reasonable answers. Portraying an LLM as an alternative to learning for newcomers is counter-productive.
This is an optional feature though; you can use it or not. If your argument is that “this makes people lazy”, well, they can already be lazy by opening ChatGPT or any other LLM and doing the same.
Portraying an LLM as an alternative to learning for newcomers is counter-productive.
While the post seems to suggest this is for newcomers, that is not necessarily true. I could see myself using it, considering that in the past I have copied and pasted my Nix configuration from some random project to start a new one.
They added it to their CLI. They published a blog post about it. They set up a dedicated marketing website. They made sure it’s literally the first CTA you see on their home page.
They set up a dedicated marketing website. They made sure it’s literally the first CTA you see on their home page.
I just went to their homepage and I see no mention of this feature. But even if it did, as long as it sits alongside the manual way I wouldn’t say it is encouragement; it is an alternative.
Encouragement would be if they removed all mentions of manual methods or buried them in the documentation. That is not what is happening here; if I go to their documentation they still have lots of guides on how everything works. Here, just go to: https://devenv.sh/basics/.
Do we just live in completely separate universes?
I could ask the same of you. Maybe you’re seeing a different version of the homepage, or maybe in your universe a blog post is the same as a home page.
It helps with generating configuration without thinking about it or understanding it. The configuration becomes something obscure and assumed to work well that only gets updated by LLMs and no one else.
There is a perfect phrase for this: it is basically a “cargo cult”.
Thanks to LLMs I now use a huge array of DSL- and configuration-based technologies that I previously didn’t use, because I didn’t have the time and mental capacity to learn hundreds of different custom syntaxes.
Just a few examples: jq, bash, AppleScript, GitHub Actions YAML, Dockerfile are all things that I used to mostly avoid (unless I really needed them) because I knew it would take me 30+ minutes to spin back up on the syntax… and now I use them all the time because I don’t have to do that any more.
I would not feel confident trusting some config that an LLM spits out. I would have to check whether it does what it’s supposed to do, and I’d lose more time than I’d gain.
If I cannot scale the amount of different technologies, I use less or simplify. Example: Bash is used extensively in CI. GitHub Actions just calls bash scripts.
It only takes me a few seconds to confirm that what an LLM has written for me works: I try it out, and if it does the thing then great! If it spits out an error I loop that through the LLM a couple of times, if that doesn’t get me to a working solution I ditch the LLM and figure it out by myself.
The productivity boost I get from working like this is enormous.
I’m wondering: Doesn’t that make your work kinda un-reproducible?
I spend a lot of time figuring out why something in a codebase is like it is or does what it does. (And the answers are often quite surprising.)
“Because an LLM said so, at this point in time” is almost never what I’m looking for. It’s just as bad as “The person who implemented (and never got around to documenting) this moved to France and became a Trappist monk”.
I’d have to completely reconstruct the code in both cases.
In my experience it’s way more frustrating and erratic. Good that it works for you.
Apart from that, I think there is value in facing repetitive and easy tasks. Eventually you get tired of it, build a better solution, and learn along the way.
For non repetitive and novel tasks, I just want to learn it myself. Productivity is a secondary concern.
The fact of the matter is that the argument for this feature works regardless of the underlying system.
Your unspoken premise appears to me to be that we should all become masters of all our tools. There was a time I agreed with that premise, but now I think we are so thoroughly surrounded by tools we have to be selective about which ones to master. For the most part with devenv, you set it up once and get on with your life, so there isn’t the same incentive to master the tool or the underlying technology as there is with your primary programming language. I’m using Nix flakes and direnv on several projects at my work; my coworkers who use Nix are mostly way less literate in it than I am and it isn’t a huge obstacle to their getting things done with the benefit of it. Very few people do a substantial amount of Nix programming.
Your unspoken premise appears to me to be that we should all become masters of all our tools.
No, it’s not.
You don’t need to master every tool you use, just a basic understanding, a sense of boundaries and what it can or can’t do.
It doesn’t matter if your objective is “mastering” or “basic understanding”, both things require some learning, and LLMs do not provide that. That’s the main premise in my argument.
I don’t use a tool if I don’t know anything about it.
You don’t need to master every tool you use, just a basic understanding, a sense of boundaries and what it can or can’t do.
I could not agree more with that. LLMs have accelerated me to that point for so many new technologies. They help me gain exactly that understanding, while saving me from having to memorize the syntax.
If you’re using LLMs to avoid learning what tools can and cannot do then you’re not taking full advantage of the benefits they can bring.
My experience with LLMs is having to re-check everything in case it’s a hallucination (it often is) and ending up checking the docs anyway.
The syntax is easy to remember for me from tool to tool. Most projects tend to have examples on their website and that helps to remember the details. I stick to that.
At a technical level, while I understand the appeal of sticking to DEFLATE compression, the more appealing long term approach is probably to switch to zstd–it offers much better compression without slowdowns. It’s a bigger shift, but it’s a much clearer win if you can make it happen.
I admit to being a bit disappointed by the “no one will notice” line of thinking. It’s probably true for the vast majority of users, but this would rule out a lot of useful performance improvements. The overall bandwidth used by CI servers and package managers is really tremendous.
Took me a minute to realize that by “on JS” you meant, on the contents of .js/.mjs files. At first I thought you meant, to be implemented in JS. Very confusing :D
the more appealing long term approach is probably to switch to zstd–it offers much better compression without slowdowns.
Yes, especially since the change can’t recompress older versions anyway because of the checksum issue. Having a modern compression algorithm could result in smaller packages AND faster/equivalent performance (compression/decompression).
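For what it’s worth, the compression call itself is the easy part to swap; here’s a minimal sketch of producing a zstd tarball using the github.com/klauspost/compress/zstd package (the registry protocol and ecosystem compatibility are the actual hard parts, and the file names here are made up):

```go
// Minimal sketch: wrap an already-built package tarball in zstd instead
// of gzip, using github.com/klauspost/compress/zstd.
package main

import (
	"io"
	"log"
	"os"

	"github.com/klauspost/compress/zstd"
)

func main() {
	in, err := os.Open("package.tar")
	if err != nil {
		log.Fatal(err)
	}
	defer in.Close()

	out, err := os.Create("package.tar.zst")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	// SpeedBetterCompression trades a bit of CPU at publish time for a
	// noticeably smaller archive; decompression stays fast either way.
	enc, err := zstd.NewWriter(out, zstd.WithEncoderLevel(zstd.SpeedBetterCompression))
	if err != nil {
		log.Fatal(err)
	}
	if _, err := io.Copy(enc, in); err != nil {
		log.Fatal(err)
	}
	if err := enc.Close(); err != nil { // flush the final zstd frame
		log.Fatal(err)
	}
}
```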
I agree. Gzip is just about as old as it gets. Surely npm can push for progress (I’m a gzip hater, I guess). That said,
Dictionaries can have a large impact on the compression ratio of small files, so Zstandard can use a user-provided compression dictionary.
I do wonder if npm could/would come up with a custom dictionary that would be optimized for, well, anything at all, be it the long tail of small packages or a few really big cornerstones.
I agree a better compression algorithm is always nice, but here back-compat is really important given there are lots of tools and users.
It’s a whole other level of pain to add support for a format existing tools won’t understand, and it’s not even certain the NPM protocol was built with that in mind. A non-backwards-compatible compression change might even make things worse in the grand scheme of things: you need two versions of each package, so more storage space, and if you can’t add metadata to list the available formats you get clients trying more than one URL, increasing server load.
This is false, completely false, but repeated over and over and over and over and over and over and over again, then upvoted over and over and over again.
It can be fixed, and the Wayland developers said it can be fixed, they just didn’t want to.
And that’s a good enough reason. As far as I know, none of us are paying them for their work. Since we benefit from what they (or their employers) are freely giving, we have no right to complain.
I’m not an X11 partisan, it seems plausible enough that Wayland is a good idea, but you’re just begging the question that we’re benefiting from what they’re freely giving.
We’re using what they’re freely giving, but it’s not guaranteed that we’re benefiting.
At the risk of engaging with someone whose words read as an emotional defence…
What do you suppose the benefits of sticking with X.ORG were?
I know of no person who could reasonably debug it- even as a user, and the code itself had become incredibly complex over time, accumulating a vast array of workarounds.
The design (I’m told) was gloriously inefficient but hacked over time to be performant - often violating the principles in which it existed in the first place.
It also forced single-threaded compositing due to the client-server architecture…
Fixing these issues in X11 would have been incredibly disruptive, and they chose to do the python3 thing of a clean break, which must have been refreshing.
The “Hate” seems to be categorisable in a few ways:
“My global hotkeys don’t work” - which is something your window manager should have had a say in, not any random program.
“Screensharing is broken” - which hasn’t been an issue in 7 years, with the xdg-portal system.
“My program does not support Wayland” - which is odd, because every UI toolkit supports it (the exception being Java’s old SWIG, and even that has supported it for 3 years now); it’s just waiting for projects to get on board.
“It has bugs”; like all software, it matures over time. It has made tremendous strides in recent years and is very mature today.
“X11 wasn’t broke”; but it was broke, the authors said as much based on what I said above.
“Wayland is a black box”; this, is probably the only truly fair criticism of all the online vitriol that I see. It works more often out of the box without any tweaking or hacks by distro makers, but, when it doesn’t work, it’s completely opaque. I levy this same criticism against systemd.
Do you have something else to add to this list, or do you think I’ve mischaracterised anything?
What do you suppose the benefits of sticking with X.ORG were?
It just works for tons of people. X runs the software I want it to run, and it does it today. Wayland does not.
I think I could make it work with probably hundreds of hours of effort but…. why? I have better things to do than run on the code treadmill.
I know of no person who could reasonably debug it
I’ve encountered one bug in X over the last decade. I made a repro script, git cloned the X server, built it, ran my reproduction in gdb, wrote a fix, and submitted it upstream in about 30 minutes of work. (Then subsystem maintainer Peter Hutterer rewrote it, since my fix didn’t get to the root cause of the issue: my fix was to fail the operation when the pointer was null, his fix ensured the pointer was never null in the first place. So the total time was more than the 30 minutes I spent, but even if my original patch had been merged unmodified, it would have solved the issue from my perspective.)
After hearing all the alleged horror stories, I thought it’d be harder than it was! But it wasn’t that bad at all. Maybe other parts of the code are worse, I don’t know. But I also haven’t had a need to know, since it works for me (and also for a lot of other people).
It’s also important to realize that huge parts of the X ecosystem are outside the X server itself. The X server is, by design, hands-off about a lot of decisions, allowing independent innovation. You can make a major, meaningful contribution to the core user experience without ever touching the core server code, by working on inter-client protocols, window managers, input methods, toolkits, compositors, desktop panels, etc., etc., etc.
clean break, which must have been refreshing.
Every programmer likes the idea of a clean break. Experienced programmers usually know that the grass might be greener now if you rip out Chesterton’s fence, but it won’t stay that way for long.
With all the posturing you’d think someone would have stepped up to maintain it;
Like I said before, I had one issue with it recently, last year, and the bug report got a quick reply and it was fixed.
I don’t know what you expect from an open source maintainer, but replying promptly to issues, reviewing and merging merge requests, and fixing bugs is what I expect from them. And that’s what I got out of the upstream X team, so I can’t complain. Maybe I’m the lucky one, idk, but worst case, if it does ever actually get abandoned, yeah, compiling it was way easier than I thought it would be, so I guess I could probably do it myself. But as of this writing, there’s no need to.
I know of no person who could reasonably debug it- even as a user, and the code itself had become incredibly complex over time, accumulating a vast array of workarounds.
Yes, I remember looking at the code of xinit (which is supposed to be a small shell script just to start the X server) and it was… bad, to say the least. I also have some experience implementing a protocol extension (XRandR) in python-xlib, and while I was surprised it worked first try with me just following the X11 documentation at the time, it was really convoluted to implement, even considering that the python-xlib code base already abstracted a lot for me.
I don’t know whether, if I designed a replacement for X.org, it would look like Wayland. I think the least X.org needed was a full rewrite with lots of testing, but that wouldn’t happen considering even the most knowledgeable people in its codebase don’t touch parts of it out of fear. If anything, Wayland is good because there are people who have enthusiasm for hacking on its code base and constantly improving it, something that couldn’t be said about X.org (which, even before Wayland existed, had pretty slow development).
I think a lot of the hate happens because “it has bugs” magnifies every other complaint.
The xdg-portal system certainly addresses things like screensharing on a protocol level. But pretty much every feature that depends on it – screen sharing, recording etc. – is various kinds of broken at an application level.
E.g. OBS, which I have to use like twice a year at most, currently plays a very annoying trick on me where it will record something in a Wayland session, but exactly once. In slightly different ways – under labwc the screen video input only shows up the first time, then it’s gone; on KDE it will produce a completely blank video.
I’m sure there’s something wrong with my system but a) I’ve done my five years of Gentoo back in 2005, I am not spending any time hunting bugs and misconfiguration until 2035 at the very least and b) that just works fine on X11.
If one’s goal is to hate Wayland, b) is juuuust the kind of excuse one needs.
I understand the frustration, but I was raised on X11 and these bugs sound extremely milquetoast compared to the graphical artifacting, non-starting and being impossible to debug, mesa incompatibility, proprietary extensions not driving the screen and the fucking “nvidia-xconfig” program that worked 20% of the time.
X11 is not flawless either; distro maintainers got better at handling the various hacks needed to make it function well enough, especially out of the box, but it remains one of the more brittle components of desktop Linux by a pretty wide margin.
Oh, no, I’m not disagreeing, that was meant as a sort of “I think that’s why Wayland gets even more hate than its design warrants”, not as a “this is why I hate Wayland”. I mean, I use labwc, that occasional OBS recording is pretty much the only reason why I keep an X11 WM installed.
I’m old enough for my baseline in terms of quirks to be XFree86, nvidia-xconfig felt like graduating from ed to Notepad to me at some point :-D.
But then they’d be maintaining this new model and the rest of X.Org, which the maintainers could not do. It might not matter that it’s technically possible if it’s not feasible.
Wayland is the new model plus xwayland so that the useful parts of X11 keep working without them having to maintain all the old stuff.
But then they’d be maintaining this new model and the rest of X.Org, which the maintainers could not do.
Again, provably false: they are maintaining the rest of X.Org. First, like you said, XWayland is a thing… and that’s all the parts they said were obsolete and wanted to eliminate! But also, from the same FAQ link:
Why duplicate all this work?
Wayland is not really duplicating much work. Where possible, Wayland reuses existing drivers and infrastructure. One of the reasons this project is feasible at all is that Wayland reuses the DRI drivers, the kernel side GEM scheduler and kernel mode setting. Wayland doesn’t have to compete with other projects for drivers and driver developers, it lives within the X.org, mesa and drm community and benefits from all the hardware enablement and driver development happening there.
A lot of the code is actually shared, and much of the rest needs very little work. So compared to what is actually happening today, not developing Wayland would have been less maintenance work, not more. All that stuff in the kernel, the DRI stuff, even the X server via XWayland, is maintained either way. The only difference is that they’ve now spent 15 years (poorly) reinventing every wheel to barely regain usability parity, because it turns out most of the stuff they asserted was useless actually was useful and in demand.
I like to say Graphics are, at best, 1/3 of GUI. OK, let’s assume you achieved “every frame is perfect”. You still have a long way to go before you have a usable UI.
That’s like saying that we are maintaining both English and German language. Who are “they”? And X is very much in life support mode only - take a look at the commit logs.
XWayland is much smaller than the whole of X; it’s just the API surface necessary to keep existing apps working. That’s like saying that a proxy is a web browser.
Code sharing:
The DRM subsystem and GPU code is “shared”, as in it is properly modularized in Linux and both make use of it. If anything, Wayland compositors make much better use of the actual Linux kernel APIs, and are not huge monoliths with optional proprietary binary blobs. It’s pretty trivial to write a wayland compositor from scratch with no external libraries at all - where is the shared X code?
There are also backends for other X servers (xnest), Macs (xquartz), Windows (xwin), and, of course, the big one for physical hardware (xfree86, which is much smaller than it used to be - over 100,000 lines of code were deleted from there around 2008. git checkout a commit from 2007 and you find … 380k lines according to a crude find | wc for .c files; git checkout master and you find 130k lines by the same measure in that hw/xfree86 folder. Yeah, that’s still a lot of code, but much less than it was, because yes, some of it was simply deleted, but also a lot of it was moved - so nowadays both the X server and Wayland compositors can use it).
But outside of the hw folder, notice how much of the X server core code is shared among all these implementations, including pretty much every user-facing api and associated implementation bookkeeping in the X server.
XWayland not only is a whole X server, it is a build of the same X.org code.
“Maintaining” can mean different things. The X server is “maintained” in the sense that it’s kept working for the benefit of XWayland, but practically nobody is adding new features any more. Not having to maintain the “bare metal” backend also removes a lot of the workload in practice.
This is much simpler and less work than continuing to try to add features to keep the X protocol up to date with the expectations of a modern desktop.
In other words, yes, the X server is maintained, in the sense that it’s in maintenance mode, but there’s very little active development outside of XWayland-specific stuff.
If you look at the Git tags on that repo, you’ll see that XWayland is also on a completely different release cadence now, with a different version number and more frequent releases. So even though it’s the same repo, it’s a different branch.
XWayland is not a mandatory part of the Wayland protocol, though. Of course they chose the easiest/most compatible way to implement the functionality, which will be to build on the real thing, but it’s a bit dishonest on your part to say that an optional part, meant to provide backwards compatibility, could be considered “shared code”.
I have to admit that I don’t know much about X.Org’s internals, but I think a lot of that stuff extracted to libraries and into the kernel is in addition to, not replacing, the old stuff.
For example, X.Org still has to implement its own modesetting in addition to KMS. It has to support two font systems in addition to whatever the clients use. It has to support the old keyboard configuration system and libxkbcommon. It has to support evdev and libinput. It has to implement a whole graphics API in addition to kernel DRM.
Wayland can drop all this old stuff and just use the new. Xwayland isn’t X.Org, I don’t think it has to implement any of this. It’s “just” a translation layer for the protocol.
not developing Wayland would have been less maintenance work, not more.
Please be careful about assuming what somebody else will find easy / less work. If the maintainers said they can’t support it anymore, I’m inclined to believe them. Sometimes cutting your losses is the easier option.
Only difference is now they’ve spent 15 years (poorly) reinventing every wheel to barely regain usability parity because it turns out most the stuff they asserted was useless actually was useful and in demand.
I think this is unfair to the people working on Wayland. I know you don’t like it, but it’s an impressive project that works really well for many people.
I don’t think anybody was “asserting [features] are useless”, they just needed the right person to get involved. I’m assuming you mean things like screen sharing, remote forwarding, and colour management. People do this work voluntarily, it might not happen immediately if at all, and that’s fine.
XWayland does have to implement all the font/drawing/XRender/etc. stuff, since X11 clients need that to work. That’s part of its job as a “translation” layer.
(I should know, Xorg/XWayland does some unholy things with OpenGL in its GLAMOR graphics backend that no other app does, and it has tripped up our Apple Mesa driver multiple times!)
But the most important part is that XWayland doesn’t have to deal with the hardware side (including modesetting, but also multi screen, input device management, and more), which is where a lot of the complexity and maintenance workload of Xorg lies.
The core Wayland protocol is indeed just buffer exchange and update based on a core Linux subsystem. It’s so lean that it is used in car entertainment systems.
And HTTP 1.0 is another protocol that could also be added to other programs. These protocols are pretty trivial; obviously they could be incorporated into X. I can also attach an electric plug to my water plumbing, but that wouldn’t make much sense either. Having two ways to do largely overlapping things that would just overstep each other’s boundaries would be bad design.
Wayland has a ready-made answer to “every frame is perfect” – adding that to X would just make some parts of a frame perfect. Wayland also supports multiple displays that have different DPIs, which is simply not possible at all in X.
GPUs weren’t even a thing when X was designed - it had a good run, but let it rest now. I really don’t support reinventing the wheel unnecessarily, that seems all too common in IT, but there are legitimate points where we have to ask if we really are heading in the direction we want to go. It is the correct decision in case of X vs Wayland, as seen by the display protocols of literally every other OS, whose internals are pretty similar to it.
EDIT: I posted a video comparison in this comment because I’m tired of arguing in text about something that is obvious when you see it in real life. X11 and Wayland are not the same, and only Wayland can seamlessly handle mixed DPI.
We already went over this in another thread here in the past. X11 does not implement different DPIs for different monitors today, and it doesn’t work out of the box (where you say “janky”, what you really mean is “needs manual hacks and configuration and cannot handle hotplug properly”).
Even if you did add the metadata to the protocol (which is possible, but hasn’t been done), it’s only capable of sudden DPI switching when you move windows from one screen to another.
X11 cannot do seamless DPI transitions across monitors, or drawing a window on two monitors at once with the correct DPI on both, the way macOS and KDE Wayland can, because its multi-monitor model, which is based on a layout using physical device pixels, is incompatible with that. There’s no way to retroactively fix that without breaking core assumptions of the X11 protocol, which would break backwards compatibility. At that point you might as well use Wayland.
The links posted by @adam_d_ruppe are good, but I think the oldest post talking about it is oblomov’s blog post. He even had patches that implement mixed DPI for GTK on X11, but obviously the GTK folks would never merge something that would improve their X11 backend (mind you, they are still stuck on xlib).
It gets even funnier, because proper fractional mixed DPI scaling was possible in X11 long before Wayland got the fractional scaling protocol, so there was ironically a brief period not long ago where scaling was working better in XWayland than in native Wayland (that particular MR was a prime example of Wayland bikeshedding practice, being blocked for an obnoxious amount of time mostly because GTK didn’t even support fractional scaling itself, even quite some time after the protocol was added).
X11 cannot do seamless DPI transitions across monitors
Yes it can, and it works out of the box with any proper toolkit (read: not GTK), rendering at the native DPI of each monitor and switching seamlessly in between. We don’t even have to lock ourselves to Qt; even ancient toolkits like wxWidgets support it.
drawing a window on two monitors at once with the correct DPI on both
This doesn’t work on either Wayland or X11, and not a single major toolkit supports this use case, for good reason: the complexity needed would be insane, and you can basically forget about any optimizations across the whole rendering stack.
Also, the whole use case is literally an edge case in its own right; I don’t think many people are masochistic enough to keep a window permanently straddling the edge of two different-DPI monitors for a long time.
the way macOS and KDE Wayland can
It doesn’t work on KDE Wayland the way you think it does, you get the same sudden switch as soon as you move more than half over to the other monitor. If you don’t believe me, try setting one monitor to a scale factor that corresponds to a different physical size. Obviously you will not notice any sudden DPI changes if the scale factors end up as the same physical size, but that works in X11 too.
At this point the only difference is that X11 does not have native per-window atoms for DPI, so every toolkit adds its own variables, but that hardly makes any difference in practice. And to get a bit diabolical here: since Wayland is so keen on shoehorning every use case they missed (i.e. everything going beyond a basic kiosk use case) through a sideband dbus API, wp-fractional-scale-v1 might as well have become a portal API that would have worked the same way on X11.
After a decade of “Wayland is the future” it is quite telling that all the Wayland arguments are still mostly based on misconceptions like this (or the equally common “the Wayland devs are the Xorg devs” - not realizing that all of the original Wayland devs have long jumped the burning ship), while basic features such as relative-window positioning or god forbid I mention network transparency are completely missing.
I’m tired of pointless arguing, so here’s a video. This is what happens on X11, with everything perfectly manually configured through environment variables (which took some experimentation because the behavior wrt the global scale is a mess and unintuitive):
In both tests the screen configuration, resolution, relative layout (roughly*), and scale factors are the same (1.5 left, 2.0 right).
These two videos are not the same.
I guess I should also point out the tearing on the left display and general jank in the X11 video, which are other classic symptoms of how badly X11 works for some systems. This is with the default modesetting driver, which is/was supposed to be the future of X11 backends and is the only one that is driver-independent and relies on the Linux KMS backend exclusively, but alas… it still doesn’t work well. Wayland doesn’t need hardware-specific compositor backends to work well.
Also, the reason why I used a KDialog window is that it only works as expected with fixed size windows. With resizeable windows, when you jump from low scale to high scale, the window expands to fit the (now larger) content, but when you jump back, it keeps the same size in pixels (too large relative to the content now), which is even more broken. That’s something that would need even more window manager integration to make work as intended on X11. This is all a consequence of the X11 design that uses physical pixels for all window management. Wayland has no issue since it deals in logical/scale-independent units only for this, which is why everything is seamless.
Also, note how the window decorations are stuck at 2.0 scale in X11, even more jank.
* X11 doesn’t support the concept of DPI-independent layout of displays with different DPI, so it’s impossible to literally achieve the same layout anyway. It just works completely differently.
Funnily enough on my system the KDE Wayland session behaves exactly like what you consider so broken in the X11 session.
There is a lot to unpack here, but let’s repeat the obvious since you completely ignored my comment and replied with basically a video version of “I had a hard time figuring out the variables, let’s stick with the most broken one and just dump it as a mistake of X11”: What’s happening in the Wayland session is most certainly not what you think it is, Qt (or any toolkit for that matter) is not capable of rendering at multiple DPIs at the same time in the same window. As I stated before, something like that would require ridiculous complexity in the entire render path. Imagine drawing an arc somewhere in a low-level library and you suddenly have to change all your math, because you cross a monitor boundary. Propagating that information alone is a huge effort and let’s not even start with more advanced details, like rotated windows.
The reason why you see no jump in the Wayland session is because you chose the scaling factor quite conveniently so that it is identical in physical size on both monitors (and that would work on X11 too). Instead of doubling down, it would have taken you 5 seconds to try out what I suggested, i.e. set one of the scale factors to one with a different physical size (maybe 1.0 left, 2.0 right) and you will observe that indeed also Wayland cannot magically render at different DPIs at the same time and yes you will observe those jumps.
Now obviously your scaling factors of 1.5 vs 2.0 should produce the same result on X11. I don’t know your exact configuration, so I can only reach for my magic crystal ball, but since you already said you had a hard time figuring out the variables, a configuration error is not far-fetched: maybe you are applying scaling somewhere twice, e.g. from leftover hacks with xrandr scaling or hardcoding font DPI, or KWin is interfering in a weird way (set PLASMA_USE_QT_SCALING to prevent that). But honestly, given that you start your reply with “Sigh” and proceed to ignore my entire comment, I don’t think you are really interested in finding out what was wrong to begin with. If you are, though, feel free to elaborate.
In my case I have the reverse setup of yours, i.e. my larger monitor is 4k and my laptop is 1080p, so I apply the larger scale factor to my larger screen (I don’t even know my scaling factors, I got lucky in the EDID lotto and just use QT_USE_PHYSICAL_DPI). So yes this means also on Wayland I get the “jump” once a window is halfway over to the next monitor. The size explosions are not permanent as you say though, neither on Wayland nor on X11.
What’s happening in the Wayland session is most certainly not what you think it is, Qt (or any toolkit for that matter) is not capable of rendering at multiple DPIs at the same time in the same window.
She obviously doesn’t think that that is what’s happening. In the comment you yourself linked, she explains how it works:
KWin Wayland is different. It arranges monitors in a logical coordinate space that is DPI-independent. That means that when you move a window from a 200 scale monitor to a 150 scale monitor, it’s already being scaled down to 150 scale the instant it crosses into the second monitor, only on that monitor. This doesn’t require applications to cooperate in any way, and it even works for X11 applications with XWayland, and has worked that way for over two years. Windows never “jump” size or appear at the wrong DPI, partially or fully, on any monitor. It’s completely seamless, like macOS.
What you need application cooperation for is to then adjust the window buffer scale and re-render the UI to optimize for the monitor the window is (mostly) on. That’s more recent functionality, and only works for Wayland apps that implement the fractional scaling protocol, not X11 apps. For X11 apps on XWayland, KWin chooses the highest screen scale, and scales down on other screens. The only visual difference is that, for Wayland apps with support, the rendering remains fully 1:1 pixel perfect and sharp once a window is moved to a new screen. The sizing behavior doesn’t change.
With this correct understanding, we can see that the rest of your comment is incorrect:
The reason why you see no jump in the Wayland session is because you chose the scaling factor quite conveniently so that it is identical in physical size on both monitors (and that would work on X11 too). Instead of doubling down, it would have taken you 5 seconds to try out what I suggested, i.e. set one of the scale factors to one with a different physical size (maybe 1.0 left, 2.0 right) and you will observe that indeed also Wayland cannot magically render at different DPIs at the same time and yes you will observe those jumps.
I just tried using the wrong DPI and there was no jump (I’m on Sway). On one screen, the window I moved was much bigger, and on the other, it was much smaller. But it never changed in size. The only thing that changed was the DPI it was rendering at, while it seamlessly occupied the exact same space on each monitor as it did so. This works because Wayland uses logical coordinates instead of physical pixels to indicate where windows are located and how big they are. So when a window is told to render at a different scale, it remains in the same logical position, at the same logical size.
There is a noticeable change, but it’s just the rendering scale adjustment kicking in causing the text on the monitor the window is being moved into to become pixel sharp, and the text in the old monitor getting a bit of a blur.
This changed a few years ago to allow per-display spaces (what Linux calls “workspaces”) - presumably they decided they didn’t want to deal with edge cases where a window is on an active space on one monitor and an inactive space on another. (Or what happens if you move a space containing half a window across monitors?)
You can get the old behavior back by turning off Settings > Desktop & Dock > Mission Control > Displays have separate spaces.
I’m tired of arguing in text about something that is completely obvious when you see it in real life.
Indeed - the fact that it does actually work, albeit with caveats, proves that it is not, in fact, “simply not possible at all”.
We can discuss the pros and cons of various implementations for various use cases. There are legitimate shortcomings in Qt and KWin, some of which are easy to fix* (e.g. the configuration UI, hotplugging different configurations), some of which are not (the window shape straddling monitor boundaries), and there are some advantages to it too (possibly better performance and visual fidelity). But a prerequisite to a productive technical discussion is to do away with the blatant falsehoods that universally start these threads.
“You can do it, but….” is something reasonable people can discuss.
“It is simply not possible at all” is provably flat-out false.
I know these are easy to fix because they do work on my computer with my toolkit. But I prefer to work from verifiable primary sources that interested people can try for themselves - notice how most of my comments here have supporting links - and Qt/KDE is mainstream enough that you can try it yourself, likely using programs you already have installed, so you don’t have to take my word for it.
I appreciate that you’ve now tried it yourself. I hope you’ll never again repeat the false information that it is impossible.
The use of quote marks here implies that the commenter you’re replying to used this exact term in their comment, but the only hit for searching the string is your comment.
I’m flagging this comment as unkind, because my reading of this and other comments by you in this thread is that you are arguing in bad faith.
The use of quote marks here implies that the commenter you’re replying to used this exact term in their comment, but the only hit for searching the string is your comment.
Try not to accuse people of personal attacks - which is itself a personal attack, you’re calling me an unkind liar - without being damn sure you have your facts right.
That’s a different person. The person you replied to did not say that.
This is what the person you replied to actually said:
X11 cannot do seamless DPI transitions across monitors, or drawing a window on two monitors at once with the correct DPI on both, the way macOS and KDE Wayland can, because its multi-monitor model, which is based on a layout using physical device pixels, is incompatible with that.
Looking at the two videos it’s pretty obvious that they are not doing the same thing at all. That dialog window is not being drawn with the correct DPI on each monitor, it’s either one or the other. “Mixed” is sufficiently elastic a word that I’m sure some semantic tolerance helps but I’m not exactly inclined to call that behaviour “mixed”, just like I also can’t point at the bottle of Coke in my fridge, the stack of limes in my kitchen and the bottle of rum on my shelf and claim that what I actually have is a really large Cuba Libre. (Edit:) I.e. because they’re not mixed, they’re obviously exclusive.
I don’t know if that’s all that X11 can do, or if it’s literally impossible to achieve what @lina is showing in the second video – at the risk of being an embarrassment to nerddom everywhere, I stopped caring a few years back and I’m just happy if the pixels are pixeling. But from what I see in the video, that’s not false information at all.
100% the same thing as Wayland is impossible in X. It can’t handle arranging mixed DPI monitors in a DPI-independent coordinate space, such that rendering is still pixel perfect on every monitor (for windows that don’t straddle monitors). X11 has no concept of window buffer scale that is independent of window dimensions.
The closest you can get is defining the entire desktop as the largest DPI and all monitors in that unit, then having the image scaled down for all the other monitors. This means you’re rendering more pixels though, so it’s less efficient and makes everything slightly blurry on the lower DPI monitors. It’s impossible to have pixel perfect output of any window on those monitors in this setup, and in practice, depending on your hardware, it might perform very poorly. It’s basically a hacky workaround.
This is actually what XWayland fakes when you use KDE. If you have mixed DPI monitors, it sets the X11 DPI to the largest value. Then, in the monitor configuration presented via fake XRandR to X11 clients, all monitors with a lower DPI have their pixel dimensions scaled up to what they would be at the max DPI. So X11 sees monitors with fake, larger resolutions, and that allows the relative layout to be correct and the positioning to work well. If I had launched KDialog under KDE Wayland with the backend forced to X11, it would have looked the same as Wayland in the video in terms of window behavior. It also wouldn’t have any tearing or glitches, since the Wayland compositor behind the scenes is doing atomic page flips for presentation properly, unlike Xorg. The only noticeable difference would have been that it’s slightly less sharp on the left monitor, since the window would be getting downscaled there.
That all works better than trying to do it in a native X11 session, because XWayland is just passing the window buffers to Wayland so only the X11 windows get potentially scaled down during compositing, not the entire screen.
Where it falls apart is hotplug and reconfiguration. There’s no way to seamlessly transition the X11 world to a higher DPI, since you have to reset all window positions, dimensions, monitor dimensions and layout, and client DPI, to new numbers. X11 can’t do that without glitches. In fact, in general, changing DPI under X11 requires restarting apps for most toolkits. So that’s where the hacky abstraction breaks, and where the proper Wayland design is required. X11 also doesn’t have any way for clients to signal DPI awareness and can’t handle mixed DPI clients either, so any apps that aren’t DPI aware end up tiny (in fact, at less than 1.0 scale on monitors without the highest DPI). This affects XWayland too and there’s no real way around it.
At best, in XWayland, you could identify which clients aren’t DPI aware somehow (like manual user config) and give them a different view of the X11 world with 1.0 monitor scales. That would mostly work as long as X11 windows from both “worlds” don’t try to cooperate/interact in some way. KDE today just gives you two global options, either what I described or just always using 1.0 scale for X11 (which makes everything very blurry on HiDPI monitors, but all apps properly scaled).
The closest you can get is defining the entire desktop as the largest DPI and all monitors in that unit, then having the image scaled down for all the other monitors [without hotplug or per-client DPI awareness].
That’s what I thought was happening, too, but I wasn’t really sure if my knowledge was up-to-date here. Like I said, I’m just happy if my pixels are pixeling, and I didn’t want to go down a debate where I’d have to read my way through source code. This end of the computing stack just isn’t fun for me.
I’m not exactly inclined to call that behaviour “mixed”,
This isn’t a term we invented in this thread, it is very common, just search the web for “mixed dpi” and you’ll find it, or click the links elsewhere in this thread and see how it is used.
A blog posted in a cousin comment sums it up pretty well: “A mixed-DPI configuration is a setup where the same display server controls multiple monitors, each with a different DPI.”
(DPI btw formally stands for “dots per inch”, but in practice, it refers to a software scaling factor rather than the physical size because physical size doesn’t take into account the distance the user’s eyes are from the display. Why call it DPI then? Historical legacy!)
Or, if that’s too far, go back to the grandfather post that spawned this very thread:
Wayland also supports multiple displays that have different DPIs, which is simply not possible at all in X.
“displays that have different DPIs”, again, the common definition spelled out.
What, exactly, happens when a window straddles two different monitors is implementation-dependent. On Microsoft Windows and most X systems, the window adopts the scaling factor for the monitor under its center point, and uses that across the whole window. If the monitors are right next to each other, this may cause the window to appear non-rectangular and larger on one monitor than the other. This is satisfactory for millions of people. (I’d be surprised if many people actually commonly straddle windows between monitors at all, since you still have the screen bezel at least right down the middle of it… I’d find that annoying. It is common for window managers to try to snap to monitor boundaries to avoid this, and some versions of Apple MacOS (including the Monterey 12.7.6 I have on my test computer) will not even allow you to place a window between monitors! It makes you choose one or the other.)
edit: just was reminded of this comment: https://lobste.rs/s/oxtwre/hard_numbers_wayland_vs_x11_input_latency#c_1f0zhn and yes that setting is available on my mac version, but it requires a log out and back in. woof. not worth it for a demo here, but interesting that Apple apparently also saw fit to change their default behavior to prohibit straddling windows between monitors! They apparently also didn’t see much value in this rare use case. /edit
On Apple operating systems and most (perhaps all?) Wayland implementations… and some X installs, using certain xrandr settings (such as described here https://blog.summercat.com/configuring-mixed-dpi-monitors-with-xrandr.html), they do it differently: the window adopts the highest scaling factor among the monitors it appears on (or that is present in the configuration? tbh I’m not exactly sure), using a virtual coordinate space, and then the system downscales that to the target area on screen. This preserves its rectangular appearance - assuming the monitors are physically arranged next to each other, the software config mirrors that physical arrangement… and the OS lets you place it there permanently (but you can still see it while dragging at least) - but has its own trade-offs; it has a performance cost and can lose visual fidelity (e.g. blurriness), especially if the scale factors are not integer multiples of each other, but sometimes even if they are, because the application is drawing to a virtual screen that is then scaled by a generic algorithm which knows little about the content it is scaling.
In all these cases, there is just one scale factor per window. Doing it below that level is possible, but so horribly messy to implement, massive complexity for near zero benefit (again, how often do people actually straddle windows between monitors?), so nobody does it irl. The difference is the Mac/Wayland approach makes it easier to pretend this works… but it is still pretending. The illusion can be pretty convincing a lot of the time though, like I said in that whole other lobsters link with lina before, I can understand why people like this experience, even if it doesn’t matter to me.
The question isn’t if the abstraction leaks. It is when and where it leaks.
This isn’t a term we invented in this thread, it is very common, just search the web for “mixed dpi” and you’ll find it, or click the links elsewhere in this thread and see how it is used.
I tried to before posting that since it’s one of those things that I see people talking past each other about everywhere, and virtually all the results I get are… both implementation-specific and kind of useless, because the functional boundary is clearly traced somewhere and different communities seem to disagree on where.
A single display server controlling multiple monitors, each with a different DPI, is something that X11 has basically always supported, I’m not sure how that’s controversial. Even before Xinerama (or if your graphics card didn’t work with Xinerama, *sigh*) you could always just set up two X screens, one for each monitor. Same display server, two monitors, different DPIs – glorious, I was doing mixed DPI before everyone was debating it, and all thanks to shitty S3 drivers and not having money to buy proper monitors.
But whenever this is discussed somewhere, it seems that there’s a whole series of implicit “but also” attached to it, having to do with fractional scaling, automatic configuration, what counts as being DPI-aware and whatnot.
So it’s not just something we invented in this thread, it’s something everyone invents in their own thread. In Windows land, for example (where things like getting WM_DPICHANGED when the window moves between monitors are a thing, you can query DPI per window, and you can set DPI awareness mode per thread), I’m pretty sure you’ll find developers who will argue that the xrandr-based global DPI + scaling system we’ve all come to know and love isn’t mixed-DPI, either.
(Edit:) To be clear – I haven’t used that system in a while, but as I recall, the way it worked was it set a global DPI, and you relied on the display server for scaling to match the viewports’ sizes. There was no way for an application to “know” what DPI/scaling factor combination it was working with on each monitor so it could adjust its rendering for whatever monitor it was on (for their implementation-defined definition of “on”, sure – midpoint, immediate transition, complete transition, whatever). Toolkits tried to shoehorn that in, but that, too, was weird in all sorts of ways and assumed a particular setup, at least back in 2016-ish or however long ago it was.
I’m pretty sure you’ll find developers who will argue that the xrandr-based global DPI + scaling system we’ve all come to know and love isn’t mixed-DPI, either.
Well, I wouldn’t call that not mixed dpi, but I would call it suboptimal. So it seems you’re familiar with the way it works on Windows: move between monitors or change the settings in display properties, and the system broadcasts the WM_DPICHANGED message to top level windows that opted into the new protocol. Other windows are bitmap scaled to the new factor, as needed.
Applications use the current DPI for their monitor in their drawing commands - some of this is done automatically by the system APIs, others you multiply out yourself. You need to use some care not to double multiply - do it yourself, then the system api does it again - so it is important to apply it at the right places.
Your window is also automatically resized, as needed, as it crosses scaling boundaries, by the system.
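Roughly, to make that flow concrete, the Win32 side looks something like the sketch below (RelayoutForDpi is a placeholder for application-specific layout code, not a real API; assumes reasonably recent SDK headers):

#include <windows.h>

// Placeholder: whatever the application does to recompute layout, fonts, etc.
void RelayoutForDpi(HWND hwnd, UINT dpi) { /* application-specific */ }

LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam) {
    switch (msg) {
    case WM_DPICHANGED: {
        UINT newDpi = HIWORD(wParam);  // X and Y DPI are reported as equal
        const RECT* suggested = reinterpret_cast<RECT*>(lParam);
        // Adopt the rectangle the system suggests for the new monitor, so the
        // window keeps roughly the same physical size after the scale change.
        SetWindowPos(hwnd, nullptr,
                     suggested->left, suggested->top,
                     suggested->right - suggested->left,
                     suggested->bottom - suggested->top,
                     SWP_NOZORDER | SWP_NOACTIVATE);
        // Apply the new factor exactly once, at the app/toolkit level, so the
        // system's automatic bitmap scaling doesn't get multiplied on top.
        RelayoutForDpi(hwnd, newDpi);
        return 0;
    }
    }
    return DefWindowProc(hwnd, msg, wParam, lParam);
}

// Opt-in, done once at startup; windows that don't opt in are bitmap scaled:
// SetProcessDpiAwarenessContext(DPI_AWARENESS_CONTEXT_PER_MONITOR_AWARE_V2);

The nice part is that the system hands you both the new factor and a suggested geometry, so there is exactly one obvious place to apply it.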
Qt/KDE tries to apply similar rules… but they half-assed it. Instead of sending a broadcast message (a PropertyChange notification would be about the same in the X world), they settled for an environment variable. (The reason I know where that is in the source is that I couldn’t believe that’s the best they did…. for debugging, sure, but shipping that to production? Had to verify but yes, that’s what they shipped :( the XWAYLAND extension has proposed a property - see here https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/1197 - but they couldn’t agree on the details and dropped it, alas) There’s also no standard protocol for opting out of auto scaling, though I think XWAYLAND proposed one too, I can’t find that link in my browser history so I might be remembering wrong.
The KDE window manager, KWin, tries to apply scale as it crosses monitor boundaries right now, just like Windows does, but it seems to only ever scale up, not back down. I don’t know why it does this, could be a simple bug. Note that this is KWin’s doing, not Qt’s, since the same application in a different window manager does not attempt to resize the window at all, it just resizes the contents of the window.
But, even in the half-assed impl, it works; the UI content is automatically resized for each monitor’s individual scale factor. The user provides each monitor’s scale factor, either by position or by port name (again, half-assed, it should have used some other identifier that would work better with hotplugging, but it does still work). If a monitor configuration changes, xrandr sends out a notification. The application queries the layout to determine which scaling factor applies to which bounding box in the coordinate space, then listens to ConfigureNotify messages from the window manager to learn where it is. A quick check of rectangle.contains(window_coordinate) tells it what scale it has, and this fires off the internal dpi changed event, if necessary. At that point, the codepaths between X and Windows merge as the toolkit applies the new factor. The actual scaling is done client side, and the compositor should not double scale it… but whether this works or not is hit and miss, since there’s no standardization! (The one nice thing about xwayland is they’re finally dismissing the utter nonsense that X cannot do this and dealing with reality - if the standard comes from wayland, i don’t really care, i just want something defined!)
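In Xlib/XRandR terms, that lookup could look roughly like the sketch below (the per-output scale table, the struct and the helper names are all mine, purely illustrative of the scheme just described):

#include <X11/Xlib.h>
#include <X11/extensions/Xrandr.h>
#include <map>
#include <string>
#include <vector>

struct MonitorRect { int x, y, w, h; double scale; };

// "The user provides each monitor's scale factor": a user-supplied table,
// keyed here by output name (hotplug caveats apply, as noted above).
std::map<std::string, double> user_scales = { {"DP-1", 2.0}, {"HDMI-1", 1.0} };

// Re-run this whenever an RRScreenChangeNotify arrives for the root window.
std::vector<MonitorRect> queryLayout(Display* dpy, Window root) {
    std::vector<MonitorRect> monitors;
    XRRScreenResources* res = XRRGetScreenResources(dpy, root);
    for (int i = 0; i < res->noutput; ++i) {
        XRROutputInfo* out = XRRGetOutputInfo(dpy, res, res->outputs[i]);
        if (out->crtc) {
            XRRCrtcInfo* crtc = XRRGetCrtcInfo(dpy, res, out->crtc);
            auto it = user_scales.find(out->name);
            double scale = (it != user_scales.end()) ? it->second : 1.0;
            monitors.push_back({crtc->x, crtc->y,
                                (int)crtc->width, (int)crtc->height, scale});
            XRRFreeCrtcInfo(crtc);
        }
        XRRFreeOutputInfo(out);
    }
    XRRFreeScreenResources(res);
    return monitors;
}

// Called on every ConfigureNotify: which monitor rectangle contains us now?
double scaleForWindow(Display* dpy, Window win, const std::vector<MonitorRect>& mons) {
    Window child;
    int rx = 0, ry = 0;
    XTranslateCoordinates(dpy, win, DefaultRootWindow(dpy), 0, 0, &rx, &ry, &child);
    for (const MonitorRect& m : mons)
        if (rx >= m.x && rx < m.x + m.w && ry >= m.y && ry < m.y + m.h)
            return m.scale;
    return 1.0;
}

If the value returned here differs from the last one seen, the toolkit fires its internal “dpi changed” event, and from there the codepath is the same as on Windows.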
A better way would be if the window manager sent the scale factor as a ClientMessage (similar to other EMWH messages) as it crosses the boundary, so the application need not look it up itself, which would also empower the user (through the window manager) to change the scale factor of individual windows on-demand - a kind of generic zoom functionality - and to opt some individual windows out of automatic bitmap scaling, even if the application itself isn’t written to support it. I haven’t actually implemented this in my window manager or toolkit; the thought actually just came to mind a few weeks ago in the other thread with lina, but I’d like to, I think it would be useful and a nice little innovation.
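Purely to illustrate the shape of that idea (nothing here is standardized and the atom name is invented for the example):

#include <X11/Xlib.h>

// Hypothetical: the window manager pushes a scale factor to a client as a
// ClientMessage, the same delivery mechanism used for WM_DELETE_WINDOW and
// the EWMH messages. "_EXAMPLE_WINDOW_SCALE" is an invented atom name.
void sendScaleHint(Display* dpy, Window client, double scale) {
    XEvent ev = {};
    ev.xclient.type = ClientMessage;
    ev.xclient.window = client;
    ev.xclient.message_type = XInternAtom(dpy, "_EXAMPLE_WINDOW_SCALE", False);
    ev.xclient.format = 32;
    ev.xclient.data.l[0] = static_cast<long>(scale * 1000.0); // fixed point, 1000 = 1.0x
    XSendEvent(dpy, client, False, NoEventMask, &ev);
    XFlush(dpy);
}

A client that understands the message applies the factor itself; for anything else the window manager could keep bitmap scaling, which is the opt-out half of the idea.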
As a practical matter, even if the window manager protocol is better, applications would probably still want to fall back to doing it themselves if there is no window manager support; probably query _NET_SUPPORTED, and if it’s absent, keep the DIY impl.
None of this is at all extraordinary. Once I implement mine, I might throw it across the freedesktop mailing list, maybe even the xwayland people, to try to get some more buy-in. Working for me is great - and I’ll take it alone - but would be even nicer if it worked for everybody.
If a framework has to go way out of its way to implement some hack to make it work despite X’s shortcomings, but all the other frameworks don’t support it at all, then X simply doesn’t support this feature.
I want to preface this by saying: 1) I run neither X nor Wayland 99% of the time; the kernel’s framebuffer is usually enough for my personal needs; 2) it’s been months since I tried Wayland on any hardware.
That said, the one thing I seemed to notice in my “toe dip” into the Wayland world was pretty problematic to me. When I would start X to run some graphical config tool on a remote machine with no GPU and low-power CPU, it seemed to me that “ssh -X” put almost no load on the remote computer; however, attempting to run the same via waypipe put a lot of load on the remote machine, making it essentially unusable for the only things I ever needed a graphical interface for.
If I’ve incorrectly understood the root cause here, I’d love to have someone explain it better. While I don’t use either one very often, it’s clear that X11 is not getting the development resources wayland is, and I’d like to be able to continue using my workflow decades into the future…
The primary reason for Wayland was the security model, was it not? That I believe is truly unfixable. And if you’ve decided to prioritize that, then it makes sense to stop working on X.
No, the primary reason for Wayland is, in Kristian Høgsberg’s words, the goal of “every frame is perfect”.
And even if security was the priority… X also “fixed” it (shortly before Wayland was born; X Access Control Extension 2.0 released 10 Mar 2008, Wayland initial release 30 September 2008), but it never caught on (I’d argue because it doesn’t solve a real world problem, but the proximate cause is probably that the XACE is all code hooks, no user-friendly part. But that could be solved if anybody actually cared.)
Worth noting that Microsoft had many of the same questions with Windows: they wanted to add a compositor, elevated process isolation, per-monitor fractional scaling, all the stuff people talk about, and they successfully did it with near zero compatibility breaks, despite Win32 and X sharing a lot of functionality. If it were fundamentally impossible for X, it would have been fundamentally impossible for Windows too.
Security can’t be solved retroactively. You can’t plug all the holes of a swiss cheese.
“Solutions” were nesting X servers into one another and such, at that point I might as well run a whole other VM.
And good for Windows. Maybe if we had the legacy X API available for use under Wayland, so that Wayland’s security benefits could apply while also not losing decades of programs already written, we could have that for Linux too… maybe we could call it WaylandX! [1]
“Solutions” were nesting X servers into one another and such, at that point I might as well run a whole other VM
No they weren’t. X11 is a client-server protocol. The only things that a client (app) sees are messages sent from the server (X.org). The default policy was that apps were trusted or untrusted. If they were untrusted, they couldn’t connect to the server. If they were trusted, they could do anything.
The security problems came from the fact that ‘anything’ meant read any key press, read the mouse location, and inspect the contents of any other window. Some things needed these abilities. For example, a compositing window manager needed to be able to redirect window contents and composite them. A window manager that did focus-follows-mouse needed to be able to read all mouse clicks to determine which window was the current active one and tell the X server to send keyboard events there. A screenshot or screen sharing app needed to be able to see the rendered window or screen contents. Generally, these were exceptions.
The X Access Control Extensions provided a general mechanism (with pluggable policies) to allow you to restrict which messages any client could see. This closed the holes for things like key loggers, while allowing you to privilege things like on-screen keyboards and screenshot tools without needing them to be modified. In contrast, Wayland just punted on this entirely and made it the compositor’s problem to solve all of these things.
XFS is quite popular in the server space if I’m not mistaken. At least at GitLab I believe it was the filesystem we ran for everything, though perhaps that has changed since I left.
Around 10 years ago, I chose XFS because it had features I needed that ext4 did not at the time. I don’t recall exactly what those were (64-bit inodes maybe?), but it also performed better with lots of small files and doesn’t require an fsck at pre-determined intervals. And it’s just been rock-solid. It’s like the Debian of filesystems.
It’s solid, stable and fast. It’s boring, but in a good way.
Compared to ext4, you far more rarely hit a disk check that delays the boot. It did not have the raid issues btrfs had. And it’s not as experimental as bcachefs.
I’ve started using it for NixOS because it seems to cope better with its high demand for inodes than ext4 does. It also seems to be faster than btrfs, particularly in VMs for some reason.
My anecdotal evidence as an XFS user since XFSv4 (probably the last 6 years? I’ve lost count, to be honest):
XFS used to be a filesystem only recommended for servers and other systems that had some kind of backup power to ensure a clean shutdown. I used it on a desktop for a few months, until a forced reboot left my system mounted read-only and xfs_repair completely corrupted the file system. But even before that I had lost a few files thanks to forced shutdowns. So I went back to Ext4, and stayed there for a few years.
After trying btrfs and getting frustrated with performance (this was before NVMe was common, so I was using a SATA SSD), I decided to go back to XFS, and this time not only did it solve my performance issues, I haven’t had problems with corruption or missing files anymore. The file system is simply rock solid. So I still use it by default, unless I want some specific feature (like compression) that is not supported in XFS.
It’s a popular filesystem in some low-latency storage situations such as with Seastar-based software like ScyllaDB here and Redpanda.
We use it at work for our Clickhouse disks. If we could start over I’d have probably gone with ext4 instead as that’s what Clickhouse is mainly tested on. There was some historic instability with XFS but it seems to have gotten better (partly with updates, partly with tuning on our end to minimise situations where the disk is under high load). Like most things XFS is a good choice if your software is explicitly tested against it.
At the time I chose XFS several years ago, I wanted to be able to use things like reflinks without needing to use btrfs (which is pretty stable these days but I wasn’t very confident in it back then). I can certainly say that it’s been quite resilient, even with me overflowing my thin pool multiple times (I am very good at disk accounting /s) and throwing a bunch of hard shutoffs at it.
If you often have problems filling up your disk, you are going to have a very, VERY bad time on btrfs. Low disk space handling is NOT stable in btrfs. In fact it is almost non-existent.
After 15 years, btrfs can still get into situations where you’re getting screwed by its chunk handling and have to plug a USB drive in to give it enough free space to deallocate some stuff. Even though df / reports (<10 but >1) gigabytes of free storage. This blog post was 9 years old when I consulted it, and I still needed all the tips in it.
I find it unconscionable that Fedora made btrfs the default with this behavior still not fixed. I will never, ever be putting a new system on btrfs again.
I find it unconscionable that Fedora made btrfs the default with this behavior still not fixed. I will never, ever be putting a new system on btrfs again.
100% this.
I have had openSUSE self-destruct 5 or 6 times in a few years because snapper filled the disks with snapshots and Btrfs self-destructed.
For me the killer misfeatures are these:
Self-destructs if the volume fills up
Volumes are easy to fill by accident because df does not give accurate or valid numbers on Btrfs
There is no working equivalent of fsck and the existing Btrfs-repair tool routinely destroys damaged volumes.
Any one of those alone would be a deal-breaker. Two would rule it straight out. All 3 means it’s off the table.
Note: I have not yet even mentioned the multiple problems with multi-disk Btrfs volumes.
I have raised these issues internally and externally at SUSE; they were dismissed out of hand, without discussion.
Thank you! Count me in the camp of “btrfs is great except for when you really need it to be” — low disk being one of those times (high write load being my personal burn moment)
I wanted bcachefs to work but this and related articles are keeping me away from it too.
I force my Fedora installs to ext4 (sometimes atop lvm) and move on with my life :shrug:
This is why I bite the out of tree bullet and just use ZFS. People tell me I’m crazy for running ZFS instead of Btrfs on single disk systems like my laptop, but like, no! I cannot consider Btrfs reliable in any scenario.
100% agree. I have found DKMS ZFS to be more stable than in-tree btrfs. Other than one nasty deadlock problem years ago it’s been rock solid. (Just some memory accounting weirdness…)
Yeah, I still get bitten by that one once or twice a year. I find btrfs useful for container pools, etc, but I still don’t use it for stuff I can’t easily regenerate.
Count me among the XFS users, albeit only on one machine at this point. I think I set up my current home server (running Fedora) around the same time Red Hat made XFS the default for RHEL, and I wanted to be E N T E R P R I S E. I’ll likely use Btrfs for my next build, as I have for all my laptops and random desktop machines in recent years. Transparent compression is very nice to have.
EDIT: I believe Fedora Server also defaults to XFS, or at least it did at some point.
Last time I mkfs’d (going back a few years now) it had dynamic-sized xattr support, whereas ext4 set a fixed size at creation time. This was important for me at the time for preserving macOS metadata.
I do hope this unified treatment of code generation, generics and type inference under “comptime” becomes standard practice, as it looks very reasonable to the user.
In Haskell we have typed TH which comes close, but also has some limitations in the types of terms that can be type-inferred (if you’re into experimental type systems, that is).
As a non-Zig user, my impression is that using comptime as generics has the exact same limitations as C++ templates: the generic code (or at least the usages of the generic type parameter) is not type-checked until it is actually used somewhere in the program, and this means that when you write generic libraries you don’t get static guarantees until you use them with example clients. This will make the experience much worse than proper generics, at scale. I am also worried about the quality of the editor-level type feedback in presence of heavy generics usage, for similar reasons.
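To make the C++ half of that comparison concrete, here is a toy illustration of use-site checking (not taken from any real library):

#include <string>

// Nothing in this declaration states what T must support; the body is only
// checked once the template is instantiated with a concrete type.
template <typename T>
std::string describe(const T& value) {
    return value.name() + " (" + std::to_string(value.size()) + ")";
}

struct Widget {
    std::string name() const { return "widget"; }
    int size() const { return 3; }
};

int main() {
    Widget w;
    describe(w);      // fine: Widget happens to satisfy the implicit assumptions
    // describe(42);  // only this call would reveal that the assumptions exist,
                      // and the error points into describe()'s body
}

The library author can ship describe() without ever spelling out that it assumes .name() and .size(); those assumptions only surface when some caller violates them.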
(I’ve said this in the past and some Zig maintainers pointed out that Zig works hard to partially type-check code with comptime arguments and that it probably works fine in practice. My intuition rather stubbornly tells me that this will be very annoying in practice when used at scale.)
The problem is that when you say comptime T : type, you don’t give any static information about what the code below actually assumes about T. If it handles T as a completely generic/opaque type on which nothing is known, this is fine. But in practice most code like this will assume things about T, that it has certain fields, support certain operations, etc., and it will work fine because it will be used by callers with types that match these assumptions. But these assumptions are not made explicit in the generic function, and thus they cannot be reasoned about statically.
What makes generics hard in most languages is the desire to type-check assumptions about them statically. For example, if a function is generic over a type-former (a parametrized type) such as List, maybe you want to use subtyping in the body of the function, and so the type-system designers have to come up with a small static language to express variance assumptions about generic type-former parameters – it is one of the complex and annoying parts of Java generics, for example. They could also give up and say “well let’s just check at each use site that the subtyping assumption is in fact correct”; this would be much simpler design-wise, but the ergonomics would be much worse.
Maybe “worse is better” and having a simple type system with worse ergonomics is indeed a good idea that will become standard practice. (It certainly helps in lowering the entry barrier to designing type systems, and it possibly makes it easier for programmers to be confident about what is going on.) But I remain skeptical of such claims, especially when they are formulated without acknowledging the notable downsides of this approach.
As a Zig user, fully agree with all of the above! Some extra thoughts:
While I am 0.9 sure that for simple-to-medium cases, declaration-side type-checking leads to better ergonomics, I am maybe at 0.5 that there’s a complexity tipping point, where call-site checking becomes easier to reason about for the user. In other words, I observe that in languages with expressive generics, some libraries will evolve to try to encode everything in the type-signatures, leading to a programming style where most of the code written manipulates types, instead of doing the actual work. I’ve certainly seen a number of head-scratching Rust signatures. Here’s a recent relatively tame example of this sort of dynamic playing out: https://github.com/rust-lang/rust/pull/107122#issuecomment-2385640802.
I am not sure that just doing what Zig does would magically reduce the overall complexity here, but it seems at least plausible that, at the point where you get into the Turing tarpit when specifying function signatures, it might be better to just use the base imperative language for types?
When comparing with C++, it’s worth noting that you get both instantiation-time type-checking and a Turing tarpit. A big part of perceived C++ complexity is due to the fact that the tools of expressiveness are overloading, ADL, and SFINAE. Zig keeps instantiation-time checking (or rather, dials it up to 11, as even non-generic functions are checked at call-site), but also simplifies everything else a lot.
Another dimension to think about here is crates.io style packages. It seems that declaration-checking plays a major role in SemVer — semantic versioning starts with defining what is and what is not your API. But, at the same time, the resulting ecosystem also depends on the culture of making changes, not only on technical means to enforce it. And Zig’s package manager/build system is shaping up to be the best-in-class general purpose small-scale dependency management solution. I am extremely curious what the ecosystem ends up looking like, after the language stabilizes.
And Zig’s package manager/build system is shaping up to be the best-in-class general purpose small-scale dependency management solution. I am extremely curious what the ecosystem ends up looking like, after the language stabilizes
Could you say a few words (or point us to some documentation) on what you think makes Zig’s package manager/build system the best?
There are no docs! As a disclaimer, Zig is a work-in-progress. If you want to just use the thing, it’s much too early for that, come back five years later!
That being said, why I am excited about a hypothetical Zig ecosystem:
First, Zig aims to be dependency zero. One problem that is traditionally hard in this space is how do you get the environment that can execute the build/packaging logic? There are a lot of tools that, eg, depend on Python, which makes building software at least as hard as provisioning Python. Another common gratuitous dependency is sh/bash and core utils. Yet another option is the JVM (gradle, bazel).
In contrast, zig is a statically linked binary that already can execute arbitrary scripts (via zig run) and can download stuff from the internet (via zig fetch). That is big! If you can run stuff, and can download stuff to run from the internet, you can do anything with no headache. What’s more, it’s not confined to your build system, you can write normal software in Zig too (though, tbh, I am personally still pretty skeptical about the viability of an only-spatially-memory-safe language for general purpose stuff).
Second, I think Zig arrived at the most useful general notion of what is a dependency — a directory of files identified by a hash. From the docs:
This field (hash) is the source of truth; packages do not come from a url; they come from a hash. url is just one of many possible mirrors for how to obtain a package matching this hash.
There’s no special casing for “Zig” dependencies. You use the same mechanism to fetch anything (eg, in TigerBeetle we use this to fetch a prebuilt copy of llvm-objcopy). I expanded on this a bit in https://matklad.github.io/2024/12/30/what-is-dependency.html. inb4 someone mentions nix: nix can do some of this, but it is not a good dependency zero, because it itself depends on posix.
Third, the build system is adequate. It uses general purpose imperative code to generate a static build graph which is then incrementally executed. This feels like the least sour spot for general purpose build systems. While you get some gradle-vibes from the requirement to explicitly structure your build as two phases, the fact that it all is simple procedural code in a statically-typed language, rather than a DSL, makes the end result much more understandable. Similarly, while a static build graph can’t describe every imaginable build, some builds (at least at a medium scale) are better left to imagination.
Fourth, Zig is serious about avoiding dependencies. For example, cross compilation works. From Windows, you can build software that dynamically links a specific version of glibc, because the Zig folks did the work of actually specifying the ABI of specific glibc versions. This, combined with the fact that Zig is also a C/C++ compiler, makes it possible to produce good builds for existing native software.
I like the theoretical idea of Zig being dependency zero, but in practice this ends up being horrible: if your toolchain is your bootstrap point, you’re chained at the waist to whatever version of the compiler you happen to have installed. Compare this to rustup, which allows a single installation but will auto-detect and install the toolchain version required for each project. It’s not just rustup either: there’s a reason that games (Veloren for example) separate the main body of the code from their installer/launcher: it allows the former to have a higher update cadence than the latter without enormous annoyance for the user.
One elegant solution is to not install anything! I don’t have zig in my PATH, I always use ./zig/zig to run stuff. For example, hacking on TigerBeetle is
git clone https://github.com/tigerbeetle/tigerbeetle && cd tigerbeetle
./zig/download.sh
./zig/zig build
Having a tiny .sh/.bat to download the thing is not elegant, but is not too bad. Certainly simpler than the rustup installer!
Actually, I should have added “Zig promotes local installs” to the list above: Zig’s pretty clear that make install is generally a bad idea, and that you should install stuff locally more often.
And then, as kristoff says and Go demonstrates, nothing prevents the toolchain from updating itself. Zig already JITs some lesser-used commands (the compiler ships its components in the form of source code), so it certainly can learn to upgrade itself.
build.zig.zon (the Zig equivalent of package.json) supports specifying a minimum supported version of the compiler toolchain. This field is currently not used by the toolchain, but there’s a proposal to have Zig download another copy of Zig when the installed version doesn’t satisfy the constraint declared in build.zig.zon.
But even without adding this support to the toolchain itself, you could have a zigup project responsible for auto-detecting and installing the toolchain version required by each project.
In other words, I observe that in languages with expressive generics, some libraries will evolve to try to encode everything in the type-signatures, leading to a programming style where most of the code written manipulates types, instead of doing the actual work. I’ve certainly seen a number of head-scratching Rust signatures.
That’s part of the issue with what Wedson presented, and that Ts’o reacted to. Wedson had a very nice slide with a call that returned a very extensive type definition and claimed that was good.
Don’t get me wrong, Ts’o’s reaction was very bad. He should have behaved better. But on the technical merit, I think Wedson missed the mark.
I can’t see a similar thing happening with Zig (at least not anytime soon) - not because you can’t do it, but because the ecosystem around the language seems to be allergic to cramming all things into the type system. Comptime presents itself as an easy escape valve to keep things simple.
But then, it needs to be compared with safe transmute, which is quite an infra…
I am still not sure what I think here. The signature is quite impenetrable either way! But then, the fact that I can just write the logic for “is this reasonable to transmute?” like this is neat:
As a Go user, a lot of problems other languages solve with complex type constraints I see get solved with func f(x any) { if !matchConstraint(x) { panic("type must foo and bar") …. (I do it in my own code here.) In practice it usually isn’t a problem because you catch the panics with even minimal testing. It is unsatisfying though.
Using any has lots of downsides though, for one you lose the help of the compiler and your tooling to e.g. auto-complete a method. It is fine for a self contained method in a relatively small code base, but starts to get hairy as your code increases in complexity.
it seems at least plausible that, at the point where you get into the Turing tarpit when specifying function signatures, it might be better to just use the base imperative language for types?
I feel like this doesn’t entirely preclude better ergonomics though, at least if there were something like C++ concepts. Then you’d at least be able to observe the signatures to see what the expectations of the type are.
When comparing with C++, it’s worth noting that you get both instantiation-time type-checking and a Turing tarpit. A big part of perceived C++ complexity is due to the fact that the tools of expressiveness are overloading, ADL, and SFINAE. Zig keeps instantiation-time checking (or rather, dials it up to 11, as even non-generic functions are checked at call-site), but also simplifies everything else a lot.
IME this actually doesn’t affect the day-to-day ergonomics that much; ADL fails are usually pretty obvious in client code unless you’re doing some REALLY cursed library stuff, and SFINAE errors are actually pretty decent these days. The big thing that wasn’t fixed until concepts was just goofs like “I thought I had a map and not a vector so I passed a value type as the allocator” and suddenly you have like a zillion errors and need to fish out the actual goof. Zig…well it kinda fixes that w/ generally shorter instantiation traces, due to enhancements like “actually having loops”, but it’s still not super great.
This suggests that your programming style around generics is fairly simple, and therefore easy to test – there is not a lot of conditional logic in your generics that would require several distinct tests, etc. You would also do just fine in Zig if you were to write similar code. This is good news!
But several members of the C++ design community have spent a decade of their life working on C++ Concepts to solve these issues (the first proposal started in 2005-2006 I believe, it was planned in C++0x that became C++11, and then dropped because it was too complex, and then “Concepts Lite” appeared in 2016 but was rejected from C++17 and finally merged in C++20). I believe that this dedication comes from real user-stories about the perils of these aspects of C++ templates – which are largely documented online; there was a clear perceived need within the C++ community that comes from the fact that a lot of template code that many people are using was in fact much more complex than yours and suffered from these scaling issues.
Yeah definitely, I think it’s best to keep to simple generic code.
I don’t find the philosophy of “maxing out” compile time in C++ to be effective, and I don’t see the programmers I admire using it a lot, with maybe a few exceptions. (e.g. Carmack, Jeff Dean, Bellard, DJB, don’t really care about compile time programming as far as I can tell. They just get a lot of work done) There was also a recent (troll-ish) post by Zed Shaw saying that C++ is fun once you push aside the TMP stuff
All of Oils is written with Python as the metaprogramming language for C++, with textual code gen. Textual code gen takes some work, but it’s easy and simple to reason about.
IMO, it’s nicer than using the C preprocessor or using the C++ template system. (Although we also use the C++ template system for a few things – notably the compiler is the only thing that has access to certain info, like sizeof() and offsetof() )
The main thing that would make it better is if the C++ type system didn’t have all these HOLES due to compatibility with C! I mentioned that here:
(although I also forgot to mention that the C++ type system is extremely expressive too, what I called “hidden static expressiveness”)
The other main downside is that you need a good build system to handle code gen, which is why I often write about Ninja!
So I might think of comptime as simply using the same language, rather than having the Python/C++ split. I can see why people might not like that solution, and there are downsides, but I think it works fine. The known alternatives have steep tradeoffs.
Metaprogramming has a pretty common taxonomy where you decide which part of the compiler pipeline you hook into:
1. textual source code – the kind of metaprogramming that is supported by every language!
2. the lexer - the C preprocessor has its own C-like lexer, and hooks in here
3. the reader/parser - Lisp-like macros - for “bicameral syntax”
4. the runtime - I think Lua/Terra is more in this category. I think you can create Terra data structures “directly” in Lua, with an API
I don’t think any way is strictly better than the others – they all have tradeoffs.
But we are doing #1 and I think Lua/Terra is more like #4, or #3.
But spiritually you can write the same kinds of programs. It just means that we end up generating the source code of C++ functions rather than having some kind of API to C++.
Of course you often write little “runtime” shims to make this easier – that is sort of like your API to C++. The garbage collected data structures are the biggest runtime shim!
I do still think we need a “proper” language that supports this model.
It could be YSH – What if the shell and the C preprocessor were the same language? :-)
the generic code (or at least the usages of the generic type parameter) is not type-checked until it is actually used somewhere in the program
comptime extends this to all code: as a general rule, anything not reachable from an export isn’t type-checked at all. It’s what allows for the dependent-esque decision making, cross-compilation/specializing using normal control flow, and general reflection. Without it, other langs default to a secondary declarative system like Rust’s #[cfg()] or C’s #ifdef. It’s a bit restrictive though, so a higher level build script (like proc-macros in Rust) is used for similar comptime effect.
But these assumptions are not made explicit in the generic function, and thus they cannot be reasoned about statically.
They technically can given comptime reflection (i.e. @typeInfo) is available. It just ends up being quite verbose so in practice most rely on duck typing instead.
My intuition rather stubbornly tells me that this will be very annoying in practice when used at scale. […] the ergonomics would be much worse
Ergonomics are OK for now given you can write x: anytype. What sort of environments do you see it causing the most annoyance? I’m thinking maybe for cases where people learn exclusively through an LSP.
Without it, other langs default to a secondary declarative system like Rust’s #[cfg()] or C’s #ifdef.
I’m not saying that all uses of comptime are bad, and maybe it’s nice when it replaces macros for conditional compilation. I was pointing out that it is probably not the magic answer to all problems about “generics and code inference” that should become “standard practice”.
What sort of environments do you see it causing the most annoyance?
I would expect the usual downsides of late-type-checking of C++ templates to show up:
The validity of the template code is only checked at callsites. Library authors often write generic code without having full coverage of all possible configuration in their testsuite, and they will commit changes that break in practice because they didn’t have user code checking a particular configuration. This problem gets worse as templates get more elaborate, with conditional logic etc., and an exponential blowup in the number of different scenarios to test.
Error messages can be quite poor because the type-checker does not know whether to blame the author of the template or the caller of the template. The type T provided does not offer operation fobar: is it a typo in the template code or a mistake of the caller? If you write generic code with, say, type-classes or traits (where the expected operations are explicitly listed in the class constraint / trait bound present in the generic code), you can tell when there is a typo in the generic code and not blame the caller (see the sketch below).
Compilation times can become quite large because each callsite needs to be re-checked for validity. This can become an issue or not depending on the amount of generic logic used by the programming community, but these conventions typically change over time and there can be surprisingly exponential cliffs.
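As a small illustration of the error-attribution point above: with C++20 concepts (the closest C++ analogue to the trait bounds mentioned), the expected operations are at least spelled out at the declaration, so a bad call fails with “constraint not satisfied” at the call site instead of deep inside the template body. (Unlike traits or type classes, C++ still does not check the template definition against the concept, so only the caller-blame half of the problem is addressed.)

#include <concepts>
#include <cstddef>
#include <string>

// The assumptions are now part of the declaration instead of being implicit.
template <typename T>
concept Describable = requires(const T& t) {
    { t.name() } -> std::convertible_to<std::string>;
    { t.size() } -> std::convertible_to<std::size_t>;
};

template <Describable T>
std::string describe(const T& value) {
    std::string out = value.name();  // conversion guaranteed by the concept
    out += " (" + std::to_string(static_cast<std::size_t>(value.size())) + ")";
    return out;
}

// describe(42) now fails immediately with "constraint Describable not
// satisfied", pointing at the caller, rather than with an error about a
// missing .name() somewhere inside describe()'s body.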
Maybe Zig has (technical or social) solutions to some or all of these problems, and/or maybe the people explaining how great comptime is are too naive about this. If there is some secret sauce that makes this all work well, then it should be carefully explained and documented along the explanation of how simple comptime is; this is important in the context of encouraging other languages to adopt this approach (as done in the post I was replying to), they need to also understand the pitfalls and how to avoid them.
The first point is sometimes an issue, especially for cross-compilation. Zig’s ability to cross-compile from any machine (-target whatever) makes it easier to test locally, but in practice this kind of error is often caught by CI.
Error messages are surprisingly readable; “Type does not provide operation” is usually the caller’s fault (genuinely never seen it be the callee’s - what’s an example of that?) and can be figured out through docs or variable naming. A good example of this is forgetting to wrap the format args parameter for std.fmt.format in a tuple.
Comptime speed does indeed become noticeable in larger projects. But it’s primarily due to reflection and constexpr execution rather than type-checking (AFAIK that part alone is always fast even for multiple nested instantiations).
I don’t think there’s a secret sauce. You tend to either 1. not run into these, 2. figure them out intuitively / with a little help (given the lacking documentation), or 3. not be able to adjust / not prefer it, having come from other langs. After the first or second time hitting them, it becomes a non-issue (like eager-PRO). It’s similar to how zig-folk recommend reading the stdlib to learn the language; it wouldn’t really be a good idea in other languages like C++, Rust, etc. but makes perfect sense (and works) in Zig.
I once tried using the stdlib JSON decoder to decode some structure that contained std.hash_map.HashMap. Let’s just say the error wasn’t at all clear about why it was happening or how I could resolve it. It is especially painful when it happens deep within some obscure internals of something that is mostly out of your control.
Zig is nice, but yeah, its generic errors remind me of old C++ templating failures.
Curious what it looked like. The worst cases IME are when it doesn’t print the trace due to compiler bugs or when it gives error: expected type 'T', found 'T', which is pretty unclear. Or the slightly easier (but still annoying) case of expected T(..., a, ...), found T(..., b, ...).
“Type does not provide operation” is usually the caller’s fault (genuinely never seen it be the callee’s - what’s an example of that?)
Imaginary scenario: for scientific programming, there is a naming convention for types that satisfy a certain set of operations, that comes from the important (imaginary) library ZigNum. Your project is using library Foo, which implements datatype-generic algorithms you care about, and also ZigNum directly – calling generic functions from both on the same types. The next version of ZigNum decides, for consistency reasons, to rename one of the operations (their previous name for the cartesian product was a poor choice). Then your code starts breaking, and there are two kinds of errors that are displayed in exactly the same way:
For the errors in calls to functions from ZigNum, the code break because the operation names assumed by ZigNum have changed, you want to follow best practices so you update your own code; in a sense it was “the caller’s fault”, or at least the “right fix” is to change the caller.
But now you start getting errors in calls from Foo as well, because Foo hasn’t been updated and is using the previous name. In this case you have decided to stick to the naming conventions of the ZigNum project, so for those errors the library Foo is to blame, it should be updated with the new names.
If ZigNum would export a [type-class / module interface / concept / trait] that describes the operation names it expects, then compiling Foo against the new version of ZigNum would have failed, blaming the code (in Foo) that needs to be updated. Instead the error occurs in your own user code. If the person encountering this failure happens to be familiar with ZigNum and Foo, they will figure things out. Otherwise they may find this all fairly confusing.
Is ZigNum here an interface or an implementation? I’m having difficulty following the confusion: “A uses B. I just updated B and the compiler says A started breaking on B stuff. I probably need to update A too.” seems like a fair reaction.
The error would occur in Foo, but zig errors are stack traces so the trace would lead back down to your code. This scenario however still looks like its the caller’s fault for passing incompatible types to a library. Something similar can also happen when you compile a zig 0.11.0 codebase (with dependencies also expecting a 0.11.0 stdlib) using a zig 0.13.0 compiler.
Well articulated and leads to a fascinating bit of reasoning.
First thought is, “well, you need multiple phases then.” Execute the comptime code, settle on what it produces, and then typecheck the code that relies on it.
But a moment’s thought shows that:
(a) you’d eventually need 3 levels, sometimes 4, etc.
(b) … and this is really just plain old code generation!
So we of course face the same fundamental issue as always. Wording it in terms of Lisp macros, you need to be able to flip back and forth between using the macro, and seeing/understanding the full expansion of the macro.
What we need is an outstanding solution to that fundamental problem.
C++ is gradually adding things that you’re permitted to do in constexpr contexts. You can now allocate memory, though it must be destroyed at the end of the constant evaluation. Generalising this much further is hard because the language needs to be able to track pointers and replace them with relocations, which is not possible in the general case for C/C++ (pointers can be converted to integers and back and so on). That’s mostly solvable, but there are also problems with things like trees that use addresses as identity, because even the relationship between the addresses at compile time isn’t constant: if I create two globals and put them in a tree, after COMDAT merging and linking I don’t know which will be at a lower address.
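A small example of where that currently stands (C++20, needs a fairly recent compiler; the allocation is “transient”, meaning it must not outlive the constant evaluation):

#include <numeric>
#include <vector>

// The vector is allocated, used and destroyed entirely inside the constant
// evaluation, which is what the current rules allow.
constexpr int sum_of_first(int n) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 1);              // 1, 2, ..., n
    return std::accumulate(v.begin(), v.end(), 0);
}

static_assert(sum_of_first(4) == 10);

// Not allowed: letting the storage escape into the runtime image, e.g.
//   constexpr std::vector<int> v{1, 2, 3};  // ill-formed: the allocation would
//                                           // have to persist past the evaluation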
Being able to merge the expression and type languages is very nice, but you either need to be able to create types at run time, or have a subset of the expression language that is only available at compile time. Smalltalk takes the former approach, C++ is quite heavily entrenched in the latter.
We are going to be increasingly inclined to not accept new proposed features in the library core.
Alex/Carson – you’ve been implicitly pushing the idea that “complete” is a worthy goal, and this finally makes it explicit. Software driven by a cohesive philosophy always shines, and htmx is no exception. I’m very appreciative of what you are doing, along with the rest of the htmx contributors.
I don’t know if it was ever made formal policy, but I seem to remember that at one point a lead maintainer of RSpec opined that version 3 was basically the final form of the project. Upgrades have been pleasantly unexciting for about a decade now.
It’s not that they never add features. It’s that, at least since sqlite 3 (so for the past 20 years):
I never need to upgrade sqlite unless I’m feeling pain from a specific bug or want a specific new feature.
Code that I wrote against sqlite in 2004/2005 works just as well against sqlite now as it did against sqlite then.
I have no worries about reading a database now that I wrote in 2005. And I’m confident that I could write against 2025 sqlite and, if I’m careful about what features I use, read it with code that I wrote in 2005.
I think sqlite is very much an example of software driven by the cohesive philosophy that @jjude asked about. As @adriano said, it’s not necessarily feature complete, but features are added very carefully and deliberately. There aren’t many things I’m as confident using as I am in sqlite. It makes me happy that htmx (another thing I like a lot) aspires to that, but it’s got to keep going a long time to prove it’s in the same league. (I suspect it will.)
I’m not sure whether sqlite has a cohesive philosophy, but note that @jjude’s question, as I understand it, is about software with a cohesive philosophy; not necessarily software with feature completeness as its philosophy.
If I were to guess what the sqlite authors’ philosophy might be, it’s that the world needs a high quality SQL implementation that remains in the public domain.
Thus highlighting the issues with “feature-complete for stability’s sake”.
Things change. It is the one constant. If the software is static in a sense, then rolling a new major version or forking with this kind of “fix” is both reasonable and necessary for the long term needs.
I also have rosy memories of the last good Python in my mind (2.5), but realistically, it was always accreting features at a pretty fast rate. It’s just that a lot of people got stuck on 2.7 for a long time and didn’t observe it.
You have to go back further than that IMO. Pre-2.0 was when you could argue that Python was more or less adhering to its own Zen. To me the addition of list comprehensions marks the time when that ship sailed.
One thing that I found interesting is that many changes in each Ruby release would be considered a big no for any other language (for example the “Keyword splatting nil” change) since the possibility of breaking existing code is huge, but Ruby community seems to just embrace those changes.
I always think about the Python 2 to 3 transition: the major change was adopting UTF-8, and everyone lost their minds thanks to the breakage, but Ruby did a similar migration in version 2.0 and I don’t remember anyone complaining.
I am not sure if this is just because the community is smaller, if the developers of Ruby are just better at deprecating features, or something else. But I still find it interesting.
Ruby’s version of the Python 2 to 3 experience (by my memory) came years earlier, going from 1.8 to 1.9. It certainly still wasn’t as big of an issue as Python’s long-lingering legacy version, but it was (again, my perception at the time) the Ruby version that had the most lag in adoption.
Yes, and it was very well managed. For example, some changes were deliberately chosen in a way that you had to take care, but you could relatively easily write Ruby 1.8/1.9 code that worked on both versions.
The other part is that Ruby 1.8 got a final release that implemented as much of the 1.9 stdlib as possible. Other breaking things, like the default file encoding and so on, were gradually introduced. A new Ruby version is always some work, but not too terrible. It was always very user centric.
It was still a chore, but the MRI team was pretty active at making it less of a chore and getting important community members on board to spread knowledge and calm the waves.
Honestly, I think Ruby is not getting enough cred for its change management. I wish Python had learned from it; the mess of 2 vs 3 could have been averted.
Interesting POV. As a long-time Rubyist, I’ve often felt that Ruby-core was too concerned with backwards compatibility. For instance, I would have preferred a more aggressive attempt to minimize the C extension API in order to make more performance improvements via JIT. I’m happy to see them move down the path of frozen strings by default.
One thing that I found interesting is that many changes in each Ruby release would be considered a big no for any other language (for example the “Keyword splatting nil” change) since the possibility of breaking existing code is huge, but Ruby community seems to just embrace those changes.
Like others already said, the Ruby core team stance is almost exactly the opposite: it is extremely concerned with backward compatibility and not breaking existing code (to the extent that during discussion of many changes, some of the core team members run grep through the codebase of all existing gems to confirm or refute an assumption of the required change scale).
As an example, the string literal freezing was discussed for many years and attempted before Ruby 3.0, but was considered too big a change (despite the major version change); only a pragma for opt-in was introduced, and now the deprecation is being introduced on the assumption that the existence of the pragma prepared most codebases for the future change. This assumption was recently challenged, though, and the discussion is still ongoing.
Keyword splatting nil change might break only the code that relies on the impossibility of the nil splatting, which is quite a stretch (and the one that is considered acceptable in order to make any progress).
Keyword splatting nil change might break only the code that relies on the impossibility of the nil splatting, which is quite a stretch (and the one that is considered acceptable in order to make any progress).
This seems like really easy code to write and accidentally rely on.
def does_stuff(argument)
  output = do_it(argument)
  run_output(output) # now `output` might be `{}`
rescue StandardError => e
  handle(e)
end

def do_it(arg)
  splats(arg) # e.g. calls some_method(**arg); `**nil` used to raise, and the rescue above caught it
end
If nil was expected but was just rolled up into the general error handling, this code feels very easy to write.
Well… it is relatively easy to write, yes, but in practice, this exact approach (blanket error catching as a normal flow instead of checking the argument) is relatively rare—and would rather be a part of an “unhappy” path, i.e., “something is broken here anyway” :)
But I see the point of view from which this change might be considered too brazen. It never came up during the discussion of the feature. (And it was done in the most localized way: instead of defining nil.to_hash, which might have behaved unexpectedly in other contexts, it is just support for **nil on its own.)
I have to doubt that. It’s extremely common in Python, for example, to catch ‘Exception’ and I know myself when writing Ruby I’ve caught StandardError.
I don’t mean catching StandardError is rare, I mean the whole combination of circumstances that will lead to “nil was frequently splatted there and caught by rescue, and now it is not raising, and the resulting code is not producing an exception that would be caught by rescue anyway, but is broken in a different way”.
Like others already said, the Ruby core team stance is almost exactly the opposite: it is extremely concerned with backward compatibility and not breaking existing code (to the extent that during discussion of many changes, some of the core team members run grep through the codebase of all existing gems to confirm or refute an assumption of the required change scale).
But this doesn’t really matter, because there are always huge proprietary codebases that are affected by every change, and you can’t run grep on them for obvious reasons. And those are the people who generally complain the most about breaking changes.
Well, it matters in the sense that the set of code from all existing gems covers a large share of the possible approaches and views on how Ruby code might be written. Though, of course, it doesn’t exclude some “fringe” approaches that never see the light of day outside corporate dungeons.
So, well… From inside the community, the core team’s stance feels pretty cautious/conservative, but I believe it might not seem so compared to other communities.
It doesn’t seem anything special really. Of course Python 2 to 3 was a much bigger change (since they decided “oh, we are going to do breaking changes anyway, let’s fix all those small things that were bothering us for a while”), but at the tail end of the migration most of the hold ups were random scripts written by a Ph. D. trying to run some experiments. I think if anything, it does seem to me that big corporations were one of the biggest pushers for Python 3 once it became clear that Python 2 was going to go EOL.
I’d say that the keyword splatting nil change is probably not as breaking as the frozen string literal or even the it change (though I do not know the implementation details of the latter, so it might not be as breaking as I think). And for frozen string literals, they’ve been trying to make it happen for years now. It was scheduled to be the default in 3 and was put off for 4 whole years because they didn’t want to break existing code.
Over the years I feel like Ruby shops have been dedicated to keeping the code tidy and up-to-date. Every Ruby shop I’ve been at has had linting fail the build. Rubocop (probably the main linter now) is often coming out with rule adjustments, and often they have an autocorrect as well making it very easy to update the code. These days I just write the code and rubocop formats and maybe adjusts a few lines, I don’t mind.
I always think about how, in the transition between Python 2 and 3, the major change was adopting UTF-8 and everyone lost their minds thanks to the breakage, yet Ruby did a similar migration in version 2.0 and I don’t remember anyone complaining.
From what I remember, UTF-8 itself wasn’t the problem: most code was essentially compatible with it. The problem was that in Python 2 you marked unicode literals with u"a u prefix", and Python 3 initially made that a syntax error. This meant a lot of safe Python 2 code had to be made unsafe in Python 2 in order to run in Python 3. Python 3.3 re-added the u"" literal syntax just to make migrations possible.
On top of that, Python 3 had a lot of other breaking changes, like making print() a function and changing the type signatures of many of the list functions.
As someone who was maintaining a python package and had to make it compatible with 2 and 3, it was a nightmare. For instance the try/except syntax changed.
Python 2
try:
    something
except ErrorClass, error:
    pass
Python 3
try:
    something
except ErrorClass as error:
    pass
Basically the same thing, but each is a syntax error in the other version, which was a nightmare to handle. You can argue the version 3 form is more consistent with other constructs, but it’s hard to believe it would have been particularly hard to support both syntaxes for a while to ease the transition.
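For what it’s worth, one portable trick from that era (sketched here with throwaway names, not code from any real package) was to skip the incompatible binding syntax entirely and pull the current exception out of sys.exc_info(), which behaves the same on both versions:
import sys

class ErrorClass(Exception):
    pass

def something():
    raise ErrorClass("boom")

try:
    something()
except ErrorClass:
    # Neither `except E, e:` (Python 2 only) nor `except E as e:` (missing in
    # older Python 2): just fetch the in-flight exception from sys.exc_info().
    error = sys.exc_info()[1]
    print(error)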
Ruby changed way more things, but tried its best to support old and new code for a while to allow a smooth transition. It’s still work to keep up, but it’s spread out over time, making it acceptable to most users.
It’s been a while, and I was just starting out with Python at the time, so take this with a grain of salt, but I think the problem was deeper than that. Python 2’s unicode handling worked differently to Python 3, so even when Python 3 added unicode literals, that didn’t solve the problem because the two string types would still behave differently enough that you’d run into compatibility issues. Certainly I remember reading lots of advice to just ignore the unicode literal prefix because it made things harder than before.
Googling a bit, I think this was because of encoding issues — in Python 2, you could just wrap things in unicode() and the right thing would probably happen, but in Python 3 you had to be more explicit about the encoding when using files and things. But it’s thankfully been a while since I needed to worry about any of this!
My recollection at Dropbox was that UTF8 was the problem and the solution was basically to use mypy everywhere so that the code could differentiate between utf8 vs nonutf8 strings.
In my experience the core issue was unicode strings and the removal of implicit encoding/decoding, as well as updating a bunch of APIs to try and clean things up (not always successfully). This was full of runtime edge cases as it’s essentially all dynamic behaviour.
Properly doing external IO was of some concern but IME pretty minor.
On top of that, Python 3 had a lot of other breaking changes, like making print() a function and changing the type signatures of many of the list functions.
This is why I said the “major” change was UTF-8. I remember lots of changes were trivial (like making print a function, you could run 2to3 and it would mostly fix it except for a few corner cases).
To me, the big problem wasn’t so much converting code from 2 to 3, but making code run on both. Many of the “trivial” syntax changes were actually very challenging to make work on both versions with the same codebase.
It was a challenge early on, after ~3.3 it was mostly a question of having a few compatibility shims (some very cursed, e.g. if you used exec) and a bunch of lints to prevent incompatible constructs.
The string model change and APIs moving around both physically and semantically were the big-ticket items which kept lingering, and 2to3 (and later modernize/futurize) did basically nothing to help there.
It wasn’t an easy transition. As others said, you’re referring to the 1.8-1.9 migration. It was a hard migration. It took around 6-7 years. An entirely new VM was developed. It took several releases until there was a safe 1.9 to migrate to, which was 1.9.3. Before that, there were memory leaks, random segfaults, and one learned to avoid APIs which caused them. Because of this, a big chunk of the community didn’t even try 1.9 for years. It was going so poorly that github maintained a fork called “ruby enterprise edition”, 1.8 with a few GC enhancements.
In the end, the migration was successful. That’s because, once it stabilised, 1.9 was significantly faster than 1.8, which offset the incompatibilities. That’s why the Python migration stalled for so long: all work and no carrot. For years, Python 3 had the same order of performance as Python 2, or worse. That only changed around 3.5 or 3.6.
Fwiw the ruby core team learned to never do that again, and ruby upgrades since 1.9 are fairly uneventful.
Ruby 2 was a serious pain for many large projects. Mainly with extensions behaving slightly differently with the encoding. I remember being stuck on custom builds of 1.9 for ages at work.
I have really mixed feelings about the built-in TOML support in Python. It came from the tomli project that, for reasons beyond my understanding, was split into two sub-projects: tomli itself for parsing and tomli-w for writing.
The now-official tomllib module only contains a parser. You still need to install tomli-w if you want to write TOML, so the module doesn’t follow the familiar load(s) and dump(s) API: it only provides the load(s) part.
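To make the asymmetry concrete, a minimal sketch (tomllib is in the stdlib from Python 3.11 and only reads; tomli_w is the separate tomli-w package mentioned above):
import tomllib   # stdlib since Python 3.11: load()/loads() only
import tomli_w   # third-party: the dump()/dumps() half

data = tomllib.loads('name = "example"\nversion = 1\n')
print(data)                 # {'name': 'example', 'version': 1}
print(tomli_w.dumps(data))  # round-trips the data, but not formatting or comments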
Well, I have a mixed feeling about TOML itself. It’s neither obvious nor minimal in its current version, it has really weird ideas — my pet peeve is the requirement not to have line breaks in inline tables (foo = { bar = 1,\n baz = 2 } must cause a parse error). But from widely-known and widely-supported formats, I’ll take it over YAML any time.
I wish UCL had seen more love. There are a few things I don’t like. In particular, it’s tied to the JSON model, so things like durations and sizes are just lowered to numbers, which is an annoying loss of units (if you write 1 kb in a field that expects a duration, you will read it back as 1024 seconds).
The thing UCL gets right is having a composition model built in. You can define two UCL files and how to combine them, so you can have things like defaults, per-system default config, and per-user config with well-defined merging rules. This makes it very easy to build immutable system images: user config is not part of the system image and can be in a completely separate path, but the defaults are all visible in the default config file.
The implementation is not the best. I’d love to see Rust and Go reimplementations, and C bindings for the Rust version, and a v2 UCL that has a richer data model (dates, durations, file sizes, and 64-bit integers as first-class citizens, in particular).
The biggest problem now seems to be the coordination problem of how to get off of YAML, which everyone agrees is bad but no one can agree what to replace it with. Maybe some large project could create SYAML (Sane subset of YAML) and get it widely adopted, but I’m not hopeful.
Apple open sourced something recently which has a good set of data types (a superset of the property list types) but doesn’t do the composition thing as well as UCL. My requirements for the format are:
Unambiguous and easy-to-write syntax.
Comments.
Tree structure.
Rich data types (including things like URL, ISO time stamps, and so on that can be validated early).
Well-defined composition that allows replacing or merging nodes based on different files.
A fully specified grammar that admits multiple implementations.
Public key signatures (one of the really nice things with UCL is that you can specify a URL for an inclusion and validate it against a signature, so you can provision VMs that grab their config from some remote thing and get end-to-end integrity checks).
A simple schema format that lets you validate configs without loading them into their final application.
Well-defined information-preserving round tripping through JSON (or CBOR/NVList/whatever) so that you can build the config parser externally to the main application and provide something that’s easier to parse to the more-privileged part of the program.
Implementations in multiple languages, with a conformance test suite that they can all parse.
A lot of things give me 60-80% of this. I’d love to get a group of people who care about config files together to define something that meets 100% of them plus the important ones that I’ve missed. None of the existing formats can win because they’re all only solving part of the problem and so do not work well in the situations where the other parts of the problem are more important.
I feel that TOML, which does not support nested data structures well, gets more traction, though I agree that many applications do not need nested data structures.
The reason the stdlib has no support for writing is that it is not obvious what to do with e.g. comments and how to preserve style (should it keep the current indentation of the file? should it reformat?). Here is the PEP discussing why writing isn’t supported.
And the reason for read support is pretty obvious: pyproject.toml. Before tomllib there was no way to parse pyproject.toml using the stdlib (while parts of the Python tooling had to parse TOML already, e.g. pip). The fact that you can also use tomllib to parse other TOML files is a plus.
If the decision whether to preserve the current indentation of original JSON files in json.dumps() was never a question, I don’t get how it’s suddenly more important for TOML. ;)
Printing Python data structures that don’t contain objects or functions as TOML is exactly as straightforward as printing them as JSON. Style-preserving libraries are a whole different genre anyway. *.dumps() has no concept of an “original file”; its input is a Python data structure.
If the decision whether to preserve the current indentation of original JSON files in json.dumps() was never a question, I don’t get how it’s suddenly more important for TOML. ;)
Because TOML is supposed to be edited by humans, while JSON isn’t. Opening a TOML file that you wrote with Python, just to see your comments vanish, would be bad.
The use cases are different so there are different considerations.
It would be bad if a developer of an application that is supposed to preserve comments and style when handling TOML files used a library that wasn’t designed to do that. It would also be bad if a library didn’t document that.
The load(s) part of tomli doesn’t preserve semantically-insignificant information so it’s a limited implementation as well. And that’s fine because for the purpose of mapping TOML data to Python values, comments and formatting are not important. But they are also unimportant for mapping Python types to TOML files because Python values have no concept of comments or formatting attached to them.
All good points, but this doesn’t change the fact that someone needs to work through those concerns and someone needs to want to implement a library that addresses them. And if you look at the PEP link that I posted, nobody wants to discuss it or implement it. So I find the current solution perfectly fine.
The author proposes that JSON parsers should “accept comments”, without specifying what that means. Should we care about interoperability between different JSON parsers?
I did not find two libraries that exhibit the very same behaviour. Moreover, I found that edge cases and maliciously crafted payloads can cause bugs, crashes and denial of services.
My contribution is that non-standard extensions like “comments”, especially in the absence of a precise specification, make this problem worse. Different parsers that “accept comments” will probably not agree on comment syntax, and this may cause problems or even open security holes.
For example, consider a Javascript “//” comment that continues to the end of the line. If the comment contains a CR character (aka ‘\r’ or U+000D), and the CR is not immediately followed by NL (aka ‘\n’ or U+000A), then does the CR terminate the comment, or does the comment continue to the next NL character?
One way you could react to this question is: whatever, it doesn’t matter, my app will never generate JSON that looks like this.
But to a person who crafts security exploits, this looks like an opportunity to add “cloaked” JSON elements which are processed by some JSON parsers and ignored by other JSON parsers. That kind of difference in interpretation can be the beginnings of a security exploit.
It’s not just theoretical. I skimmed the code – the linked parsers seem to have exploitable differences in how they handle line endings. In addition to \r and \n, JSON5 supports \u2028 and \u2029 to end lines, dkjson.lua does not. Therefore:
will be interpreted differently by the two parsers. Swift Foundation implements a 3rd behavior; it assumes ‘\n’ and ‘\r\n’ are the only valid line endings, so:
{
    // maybe a 2-line comment \r "user_id": 2
}
will be a 1 entry object when parsed by JSON5, dkjson.lua, and Sqlite, but empty when parsed by Swift.
I couldn’t be bothered to dig through the Jackson code.
I think there are 3 different use cases for JSON here:
One is serialization. In this case, yes, comments are bad because they’re complicated and not defined in the spec. But also, why would you need comments in serialization?
Another is interop between different programs, which is a similar case to the above. Here you may want comments, but it is best to avoid them (unless you’re using something like _comment: "my comment"), since you never know if the other implementations will behave similarly.
A third use case is something more interactive, e.g. configuration files. In this case you probably don’t care about other consumers, and having comments can be really useful. If we assume those files are edited by humans, trailing commas are also useful for similar reasons.
But to a person who crafts security exploits, this looks like an opportunity to add “cloaked” JSON elements which are processed by some JSON parsers and ignored by other JSON parsers. That kind of difference in interpretation can be the beginnings of a security exploit.
This also depends on the context. For the third use case, in many scenarios, if an attacker already has the privilege of writing your configuration file, you have already lost.
Using comments in your JSON doesn’t mean including comments in spec-compliant JSON!
The gift the JSON spec is giving you is that by not allowing comments it’s telling you what to do with the comments: throw them in the garbage in some step before they get to the real JSON parser. In this way, many kinds of comments can be supported (properly, without vulnerability) because once they’re gone they really can’t possibly mean anything (solving the security problem).
JSON is a very forgiving language syntactically, so %, #, //, /* */ and <!-- --> are all equally valid ways to add unambiguous comments to your JSON. That is to say, since these syntaxes don’t conflict with JSON’s own syntax, there’s no chance of a real confusion where a comment is interpreted as valid data or vice versa. The fact that JSON absolutely rejects comments and other unexpected syntax is, paradoxically, what means it works with every flavor of comments, so long as you can define what it means to strip them out.
I think you missed the point, which was that anyone implementing such a lax parser, whether that’s via a “stripping function” or a parser which skips them natively, will do it in a slightly different way and this can cause conflicts which may be exploited. This would simply add more problems to the existing security issues JSON already has, like where parsers disagree on what to do with non-unique keys.
Can you provide a concrete example? In my mind the vulnerability is completely addressed by a stripping preparser (in a way that it is very much not addressed by a lax parser)
orib’s example of existing JSON5 parsers parsing comments differently could just as easily happen with multiple comment-stripping JSON pre-parsers. The pre-parsers could disagree on whether to strip text after a // and after one of \u2028 or \r but before \n.
To be specific, the reason the problem goes away is that after the stripping pass is done, the 100% formal, very standard JSON parser takes over, bringing with it the exact same set of security guarantees that we have currently.
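As a rough illustration of that two-phase idea, a minimal sketch of a stripping pass that only understands //-to-end-of-line comments and leaves string contents alone, with the stock json module doing all of the real parsing (pinning down exactly which line terminators end a comment is the part the sibling comments are worried about):
import json

def strip_line_comments(text: str) -> str:
    # Drop // comments (up to the next \n) outside of string literals; everything
    # else is passed through untouched and handed to the standard JSON parser.
    out = []
    i, in_string, escaped = 0, False, False
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

doc = '{\n  // not part of the data\n  "user_id": 2\n}'
print(json.loads(strip_line_comments(doc)))  # {'user_id': 2}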
I’d rather have the implementation be a little more aggressive, like re-wrapping text after removing syntax markers (soft-wrap source, hard-wrap output, like [sometimes] markdown), and providing a way to line up the second line of the initial arguments breakdown that looks good on both ends – but these are minor nits and all the more reason to support the creation of small, low-dep, opinionated tools. More like this please! And it’s great to see someone taking the time to highlight a tool that brought them joy.
It actually rewraps the text; I am doing line breaks at 80 columns for legibility, but it doesn’t need to be so (actually I am not completely sure about this; I read somewhere that the format expects a break every 80 columns, but I remember going past it a few times without issue).
I have a similar experience that Python type hints improve code quality even when sometimes it feels you’re writing code only to make the mypy happy.
Maybe it is because I was writing code that was too smart for its own good, making the type checker unhappy, and I ended up refactoring it in some way that is less smart but passes the type checks; in most cases I end up thinking “ok, this is probably for the better”.
It may be because I was making some assumptions (e.g. a regex match that can return either a match or None, where in my particular case it can’t ever be None, but the type checker can’t know this), so I end up adding an assert in the code that expresses “if this ever happens, something went terribly wrong” (a small made-up example follows after this comment).
It may be some explicit conversion from Path to str that at least makes me think “huh, this could go wrong if the user has a non-UTF-8 encoding set in their filesystem”; that may well be something I am ignoring for now, but if it ever causes issues I know where to look for fixes.
mypy definitely is far from perfect, but for me it is a net improvement for anything that is non-trivial.
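A tiny made-up example of the regex-plus-assert pattern mentioned above (the function and the tag format are hypothetical):
import re

def extract_version(tag: str) -> int:
    match = re.match(r"v(\d+)", tag)  # typed as re.Match[str] | None
    # The tags are validated upstream, so this "can't" be None, but the type
    # checker can't know that; the assert documents the assumption and narrows the type.
    assert match is not None, f"unexpected tag format: {tag!r}"
    return int(match.group(1))

print(extract_version("v42"))  # 42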
For those starting out on NixOS (for Nix on macOS or other distros this doesn’t make much sense since you can just use plain VSCode), there is also vscode-fhs, which sets up VSCode inside a buildFHSUserEnv that “emulates” a traditional distro filesystem and lets you use technically any extension from the Marketplace without patching.
Keep in mind that this is impure, so it goes kinda against the principles of NixOS (and it means you can’t e.g. rely on external CIs to build your configuration for caching), but it works, and to get someone up to speed this may well be a good use of your time until you understand how to do packaging, etc.
We’d really prefer not to have to add typecasts everywhere we’re calling these methods. We’d also really prefer to not have to verify the runtime type within the SetMap function. Even if we’re 100% confident that these checks and casts will never fail, the whole purpose of having a type system is so that the compiler checks this for us.
This is precisely where the conventional type safety wisdom conflicts with the apparent Go philosophy. The “whole purpose of having a type system” is not for having the compiler check things for you. That’s one purpose. But it certainly isn’t the only one and isn’t necessarily the most important one either. There is a balance between what’s worth checking or not vs other concerns.
The “simplest” approach to solving the problem discussed in this article is to add some type casts and move on. Avoid attempting to solve a puzzle you’ve created for yourself.
I’m curious what you think the point of a type system is. Because when I use a dynamically typed language, what I miss is the automatic checking. When I’ve seen people just try to use type annotations as documentation in, say, Python and then later switch to actually checking them with mypy, it often turns out they got the annotation wrong and mypy is effectively catching a documentation bug. So I really don’t see the value without the checking part.
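A contrived example of that “documentation bug” case (hypothetical function, not from any particular codebase): read as documentation the annotation looks fine, but a checker flags it immediately.
def parse_port(value: str) -> int:
    # The annotation promises an int, but the body returns a str; mypy reports
    # something like: Incompatible return value type (got "str", expected "int")
    return value.strip()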
I would emphasize two of those points as being more significant than they sound from that list.
First, function type signatures aren’t merely a useful thing to read, they’re computer verified documentation. For example:
If a function in JS takes a string as an argument, its docs need to state that that argument is a string. If the same function is written in Java, the docs don’t need to say that because the type signature already does.
If a function in C++ takes two pointers and returns a pointer, the docs need to say how long the returned pointer lives for: does it live for the lifetime of the first argument, or the second argument, or some other duration? If the same function is written in Rust, the docs don’t need to say that because the type signature already does.
And you can’t forget to update these docs because the compiler checks them for you.
Second, the sorts of errors that type systems eliminate aren’t just null pointers and trivial errors like typos. The sorts of bugs that a type system prevents can be arbitrarily complex. It only seems like it’s preventing local bugs because the type system makes them local. Without type signatures, they aren’t local.
The sorts of bugs that a type system prevents can be arbitrarily complex.
What I am saying is that there are diminishing returns here: most of the value is in detecting simple bugs fast, not in preventing complex bugs.
I really need to get around to writing a proper post here, but, generally, a type system as a bug-prevention mechanism only works for code that you haven’t run even once; so, error handling and concurrency. For normal code, the added safety is marginal, and is dwarfed by dev tooling improvements.
Based on the most common use of generics in the most commonly used language with generics (Java) I would say this is empirically incorrect. The top two motivations are clearly: tooling assistance, and preventing putting values of the wrong type inside generic collections.
But Java only got generics in version 5! Hence it follows that the things Java-style generics are useful for are not the top priorities for a type system!
Also, your other list of priorities seems rather subjective. I feel that checking many different properties at compile time is by far the biggest benefit of a type system, far above performance under most circumstances.
Counterargument: Typescript, Mypy, Sorbet, even Gleam all provide type systems that don’t and cannot make the code faster. I think there’s also an important case between “detect nulls” and “units of measure” which is features like ADTs that allow for the whole “make invalid states impossible” thing to happen.
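For what it’s worth, even plain Python can approximate that style (the names below are invented): a union of small dataclasses plus match gives a checker enough to flag access to fields that only exist on some variants.
from dataclasses import dataclass

@dataclass
class Loading:
    pass

@dataclass
class Loaded:
    data: bytes

@dataclass
class Failed:
    error: str

State = Loading | Loaded | Failed  # no "loaded but also failed" value can exist

def render(state: State) -> str:
    match state:
        case Loading():
            return "spinner"
        case Loaded(data=data):
            return f"{len(data)} bytes"
        case Failed(error=error):
            return f"error: {error}"
    raise AssertionError("unreachable")

print(render(Loaded(data=b"hello")))  # 5 bytes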
More fundamentally, I think this is a roughly accurate list of how much different features of a type system are used, but I think it’s not necessarily right to call that importance. For example, Typescript is definitely designed to support LSP features in Javascript even when the type system isn’t fully there, and it often gets sold on the typo handling stuff — these are key features to Typescript’s success. But Mypy can do this stuff as well, yet my impression is Mypy hasn’t found half as much success in Python codebases as Typescript has in Javascript codebases.
I suspect this is because Typescript does a much better job of modelling the implicit types that already existed in Javascript codebases than Mypy does for Python. (This is partly because Typescript is more powerful, but I suspect also because Javascript is less dynamic, and so easier to model.) This, then, is the more important feature of a type system: it can model the problems that developers are using it for. If it can’t do that, then it is not fit for purpose and will be adjusted or replaced. But if it is sufficient for that purpose, then people will use it in roughly the order you describe in your list.
That said, I do think “the problems that developers are using it for” is such a broad statement that it will look differently for different languages. For example, you probably don’t want to model complex concepts in C, so C’s type system can remain relatively simple. Whereas modelling functions of complex states in a way that prevents errors feels like a definitional statement for functional programming, so MLs and friends will typically have much more complex type systems.
How this applies to Go, though, I’m not sure. Go’s designers definitely want their types and modelled concepts to be as simple as possible, but the way they keep adding more complex features to the language suggests that they’ve not found quite the right local maximum yet.
Yes, they go after the second priority — dev tooling.
But the extent to which they are successful at it depends — in my experience — more on the later priorities than the earlier ones. That is to say: dev tooling by itself is a high priority. But to get that dev tooling, you need to be able to correctly model the concepts that your users want to model. Be that complex dynamic types as in Python or Typescript, data types as in many functional languages, or lifetimes as @withoutboats points out in a sibling comment. If you can’t model that (and I don’t think e.g. mypy does model that very well), then the type system isn’t very useful.
This is why I think your list matches what users of a type system want to use, but doesn’t necessarily match the priorities from a language designer perspective.
You don’t need types here! Erlang has sum types, and it doesn’t have a null problem, but it also doesn’t have types.
I’m a bit sceptical here, but I admit I have almost no practical experience with Erlang & friends. I’ve used sum types plenty in Javascript, though, and in my experience they work okay up to a point, but they’re so much more usable with a type system to validate things.
Erlang’s type annotations support union types but not sum types. It’s a dynamic language, so you can pass nil to a function that expects a tuple, which is just like the null problem – though I suspect that Erlang’s usual coding style makes it less of a problem than in other languages. I don’t know if Dialyzer is strict enough that you can use it to ensure invalid states are unrepresentable.
I would also consider type checking of “ownership patterns” (whether through monads, substructural types, lifetime analysis, or whatever else) to be a very important property of type systems, just one that most current type systems don’t give you much help with. Bugs in this area (especially in the case of concurrency) are demonstrably common and difficult to diagnose and repair.
On the other hand, I also think there are diminishing returns to trying to encode increasingly complex correctness contracts into the type system. Quickly it seems to me the difficulty of understanding how the contract has been encoded and how to use it properly outweighs the difficulty of avoiding the error yourself. The cost can be worth it in safety critical systems, I assume.
I periodically attempt to take some of my Python libraries – which do use type hints as documentation – and get mypy to run on them.
I have never had mypy catch an actual type error when doing this (meaning, a situation where an attempt was made to use a value of a type incompatible with the expected one described in the annotation). I have, however, gone down far more rabbit holes than I’d care to in an attempt to figure out how to express things in a way mypy will understand.
My most recent attempt at this involved a situation where mypy’s type-narrowing facilities left a lot to be desired. I am going to describe this here so you can see what I mean, and so you can get a sense of the frustration I felt. And to be clear: I am not a newbie to either Python (which I’ve been doing professionally for decades) or static typing (I first learned Java and C back in the mid-2000s).
So. The real code here isn’t particularly relevant. What you need to know is that it’s a list of values each of which (because they’re being read from environment variables which might or might not be set) is initially str | None. The code then did an if not all(the_list): check and would bail out with an exception in that branch. Which, crucially, means that all code past that point can safely assume all the values have been narrowed to str (since if any of them were None, the all() check would have failed).
Later code would start checking to see if the values were URL-like, because ultimately that’s what they’re supposed to be. So imagine some code like this for a simplified example:
items: list[str | None]
# Now imagine some code that fills in the list...
if all(items):
    print(item.startswith("http://") for item in items)
But mypy looks at this perfectly idiomatic and perfectly safe Python code and says error: Item "None" of "str | None" has no attribute "startswith" [union-attr]. Because although we humans can clearly see that the type of items must have been narrowed, mypy can’t. OK, mypy’s documentation suggests an is not None check will narrow an optional type:
if all(item is not None for item in items):
    print(item.startswith("http://") for item in items)
But no, that gets the same error from mypy. So does this, though mypy says isinstance() checks can be used for narrowing:
if all(isinstance(item, str) for item in items):
    print(item.startswith("http://") for item in items)
The actual problem, of course, is mypy doesn’t understand that all() would return False if any of the values actually were None, and so cannot infer from the return value of all() that the type has in fact been narrowed from str | None to just str. We have to help it. If you’re actually reading mypy’s type-narrowing docs, the next thing it will suggest is writing a guard function with TypeGuard. OK:
from typing import TypeGuard

def guard_str(value: list[str | None]) -> TypeGuard[list[str]]:
    """
    Narrowing type guard which indicates whether the given value
    is a list of strings.
    """
    return all(isinstance(v, str) for v in value)

if guard_str(items):
    print(item.startswith("http://") for item in items)
And mypy actually accepts this! But there’s a problem: remember I wanted to do an if not all(items): to bail out with an error, and then have a clean path beyond that where the type has been narrowed from str | None to str? Well, turns out TypeGuard can only narrow the “true” branch of a conditional. To narrow both branches, you need to use TypeIs instead. OK, so here’s the TypeIs version:
from typing import TypeIs

def typeis_str(value: list[str | None]) -> TypeIs[list[str]]:
    """
    Narrowing type guard which indicates whether the given value
    is a list of strings.
    """
    return all(isinstance(v, str) for v in value)

if typeis_str(items):
    print(item.startswith("http://") for item in items)
So naturally mypy accepts that, right?
Haha, just kidding:
typeis.py:9: error: Narrowed type "list[str]" is not a subtype of input type "list[str | None]" [narrowed-type-not-subtype]
typeis.py:19: error: "Never" has no attribute "__iter__" (not iterable) [attr-defined]
You see, TypeGuard doesn’t care about generic type variance, but TypeIs does! And it turns out list is defined by the bolted-on Python “static type” system to be invariant. So now we have to go redefine everything to use a different generic type. Probably the best choice here is Sequence, which is covariant.
from collections.abc import Sequence
from typing import TypeIs

items: Sequence[str | None]
# Now imagine some code that fills in the list...

def typeis_str(value: Sequence[str | None]) -> TypeIs[Sequence[str]]:
    """
    Narrowing type guard which indicates whether the given value
    is a sequence of strings.
    """
    return all(isinstance(v, str) for v in value)

if typeis_str(items):
    print(item.startswith("http://") for item in items)
This, finally, will do the correct thing. It satisfies the type narrowing problem in both branches of a conditional, which is what’s ultimately wanted, and does so in a way that makes the narrowing obvious to mypy. And it only took, what, half a dozen tries and a bunch of frustrating errors? Again, I’m not new to Python or to static typing, and even though I actually understood the reason for most of the errors after a quick skim, this was still an incredible amount of pointless and frustrating busy-work just to satisfy mypy of something that mypy should have been able to figure out from the initial idiomatic implementation. And, worse, this has introduced expensive runtime isinstance() checks as a cost of making the “static” type checker happy!
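A cheaper (if less principled) escape hatch along the same lines is typing.cast, which trades the runtime isinstance() sweep for an unchecked promise to the checker; a sketch reusing the shape of the example above:
from typing import cast

items: list[str | None] = ["http://a", "http://b"]

if not all(items):
    raise ValueError("some of the values are missing")

# cast() does nothing at runtime; it just tells the type checker to treat
# items as list[str], on the strength of the all() check above.
checked = cast(list[str], items)
print([item.startswith("http://") for item in checked])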
All of which is just the most recent of many examples of why I continue to add type hints to my code as documentation, because I do find them useful for that purpose, but also continue not to run mypy as part of my linting suite and why I do not include a py.typed file in my own packages.
My experience is with Typescript vs Javascript, but I’ve converted two or three codebases over to Typescript at this point, and each time the act of defining types for functions where the types weren’t entirely clear has helped me find bugs. It’s also allowed me to remove overly-defensive code and make the code easier to read overall.
I think part of this is that Typescript has a more powerful type system that better models real-world Javascript code (whereas Mypy’s always feels like Java with some light sugar on top). But I also suspect that Javascript is a lot easier to model than Python, as there are fewer opportunities for wild, dynamic code, and even when there are, most Javascript developers tend to avoid that style unless it’s really useful.
For your example specifically, Array#every, which is roughly the equivalent of all(...), does include the requisite type information to correctly handle this case:
declare const items: Array<string | null>;

// `.every(...)` always requires a predicate to be passed in, unlike `all(...)`, which
// means this is probably exactly how you'd write this in Javascript without types
if (items.every(item => item !== null)) {
    for (const item of items) {
        console.log(item.startsWith("http://"));
    }
}
JavaScript is kind of an interesting example to bring up, because people love to talk about making Python “strongly typed” when it already is. You could make a sort of grid of languages to show the difference, with strong vs. weak typing on one axis and static vs. dynamic checking on the other.
I can see how the number of implicit type conversion behaviors in JavaScript, which you mostly have to just know and remember not to trip over, would lead to a desire to work in something a bit stricter, and how doing so could yield benefits in code quality.
And TypeScript is also kind of a different example because it’s not required to remain syntactically compatible with JavaScript. “Typed Python”, on the other hand, does have to maintain syntactic compatibility with plain Python (which is why several things from the static-type-checking Python world have had to be imported into Python itself).
But I also stand by the fact that mypy has never uncovered an actual bug in Python code I’ve run it on. It’s only ever uncovered weird limitations of mypy requiring workarounds like the ones described in my comment above.
I absolutely agree with your last part. My experience with Python has been very similar, and I’ve stopped using Mypy much these days, even when working with Python, because there are too many cases that it found but weren’t actual bugs, and too many actual bugs that it didn’t catch for one reason or another.
But I think that’s largely because Mypy isn’t very good at typechecking idiomatic Python code, and not because the concept as a whole is flawed.
I do wonder, though, if Mypy would have worked better from the start if annotations had always been lazy — or even if they’d only been available at runtime as strings. This would have given the type checkers more chances to experiment with syntax without needing changes to the runtime.
I think it is important to remember that Mypy is one of the implementations of PEP-484, and while it is probably the most famous and popular, it is not the only one. It is also important to note that PEP-484 does not define how the types should be enforced (it defines a few things to allow interoperability, but leaves the majority of behaviors to the implementors). Heck, they’re essentially comment strings for the interpreter (especially after PEP-563).
For example, Pyright explicitly tries to infer more things than Mypy, while the fact that Mypy doesn’t is a design choice.
This is true, but the fact that until PEP563 the type annotations were interpreted and therefore needed to be valid, and the way that after 563 they’re still semi-interpretable (IIRC the __future__ annotations import is now discouraged because it has other weird side effects and they’re looking in a different direction, but I’ve not hugely been following the discussion) — all that means that annotations are still very much tied to Python-defined semantics. You couldn’t easily, for example, define a custom mapped type syntax using dict expressions (similar to Typescript’s mapped types), because a lot of stuff would break in weird and wonderful ways.
Like I say, I’ve given up on this stuff for now, and I’m hoping that it might get better at some point, but last time I used it Pyright was more strict, but only in the sense that I needed to contort my code more aggressively to make it work. (IIRC, the last problem I ran into with Pyright was that it inferred a value as having type Unknown, and then started raising angry type errors, even though I never interacted with the value, and therefore it being Unknown was of no consequence.)
I am using Pyright as my LSP while using mypy --strict in this project, and they never disagree. But to be clear, this is a small project with ~2000 lines of code that also doesn’t interact with external APIs, so I think this avoids the weird corner cases of Mypy.
I have never had mypy catch an actual type error when doing this (meaning, a situation where an attempt was made to use a value of a type incompatible with the expected one described in the annotation). I have, however, gone down far more rabbit holes than I’d care to in an attempt to figure out how to express things in a way mypy will understand.
My experience is quite the opposite: mypy did catch lots of typing errors that would otherwise just cause issues at runtime (or maybe not, but the code was definitely doing something other than what I wanted to express). One example from yesterday: I was using subprocess.run (actually a wrapper around it, more details below) and wanted to pass an env parameter (which is a Mapping), and I did something like:
subprocess.run(["foo"], env={"ENV", "foo"})
Do you see the error? Probably yes, since this is only one line out of the code I was working on, but I didn’t, and mypy gladly caught the issue. And even better, this was before I tested the code, and once I ran it for the first time (after fixing all the mypy errors), the code Just Worked (TM).
The other day I was refactoring code in the same project and was not yet running mypy on the tests. I decided to set up mypy for the tests just to be sure and, boom, I had forgotten to update a caller in the tests. Since they’re mocked, they still passed, but the tests didn’t make any sense anymore. While it was annoying to type the test methods themselves (which I don’t think makes much sense), after that experience I decided that I definitely need mypy running on my tests too.
I concur that mypy is weird sometimes, and far from perfect. For example, I got some really strange issues when trying to write a wrapper for another method (again, subprocess.run) and typing its keyword arguments. Something like:
import os
import subprocess
from typing import TypedDict, Unpack

class RunArgs(TypedDict, total=False):
    ...
    env: dict[str, str] | None  # actually a Mapping[StrOrBytesPath] | None, but decided to simplify here

def run_wrapper(args: list[str], extra_env: dict[str, str] | None = None, **kwargs: Unpack[RunArgs]):
    env = kwargs.get("env")
    if extra_env:
        env = (env or os.environ) | extra_env
    subprocess.run(args, env=env)
And this caused some weird issues. Moving env to the method parameter instead worked and even simplified the code, but yes, far from ideal.
But even with the somewhat strange issues, I still think that mypy was a huge plus in this project. The fact that I can refactor code and be confident that, if the tests and mypy pass, the code is very likely still correct is worth the work that I sometimes have to do to “please” mypy.
I’ve definitely found errors when converting python programs to have type hints, but not as many afterwards; perhaps it’s just that having the type annotations makes it harder for me (and other people working on it) to write incorrect code? Either way, I fully agree that mypy/pyright work in ways that are just too annoying and I disable them in IDE and linting unless I’m required to have them.
One place I found mypy valuable is during refactoring. I just refactored a function parameter from bool to str | None, and while this didn’t break any tests (because in the function I was doing something like if v: something, so it doesn’t really matter if I am passing False or None as v), mypy correctly identified the issue.
You could argue that it doesn’t matter since this didn’t break the logic, but I think it does. Having tests with incorrect parameters, even if they work, is confusing because you lose the context of what the tests are actually trying to test.
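A minimal reconstruction of that kind of catch (hypothetical names, and the exact mypy wording may differ):
def fetch(url: str, verbose: str | None = None) -> None:  # was: verbose: bool = False
    if verbose:
        print(f"fetching {url} ({verbose})")

# Old call sites keep passing a bool; the test still passes at runtime because
# False is simply falsy, but mypy reports something like:
#   Argument 2 to "fetch" has incompatible type "bool"; expected "str | None"
fetch("https://example.com", False)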
I never understood the desire for building external tools using the go.mod of the repo you use them in. I’d rather just pin and build tools I used based on the released versions with same dependencies that the author builds, tests, and releases with.
While this is a good point, it doesn’t cover the fact that Go was one of the only languages I know of that didn’t have a good way to pin tool versions, which ends up creating the unfortunate problem of how you declare those tools at the specific versions you need and keep them up to date.
This almost always involved a sub-optimal solution, either tools.go (with all its issues) or having some other way to declare it (it could be in README.md of the project or a Makefile), but you ended up needing something.
I usually just write go run tool@version in scripts, but I see the issue. This tools solution would be fine to me UX-wise, but I’d always want the tools built with the go.mod of their own main module!
True, but it’s not enough that your source code is POSIX shell. You also have to run it with a shell that’s not vulnerable! For example, this is a POSIX shell program:
x=$1
echo $(( x + 1 ))
And it’s vulnerable when you run it under bash/ksh, but not when you run it under dash, busybox ash, or OSH!
the reason that some OS’s decide that when i say “/bin/sh” what i really meant was some other shell that’s not sh is why i have trust issues. i’ve been told that it’s fine though, and to stop being pedantic.
I think this is a good reason not to use shell, or at least to avoid it as much as possible in any context where the user can control the inputs, because it is so easy to make mistakes in shell, and as another commenter said below you can’t even trust that your POSIX-compliant script will run in a shell that handles this (and other issues) correctly.
I think this is a great idea, but I am anticipating folks explaining why it isn’t.
I wonder if those differences are diminished if everything runs on Docker
Was just about to point this out. I’ve seen a lot of bugs in aarch64 Linux software that don’t exist in x86-64 Linux software. You can run a container built for a non-native architecture through Docker’s compatibility layer, but it’s a pretty noticeable performance hit.
One of the things that I like about having CI is the fact that it forces you to declare your dev environment programmatically. It means that you avoid the famous “works on my machine” issue, because if tests work on your machine but not in CI, something is missing.
There are of course other ways to avoid this issue, e.g. enforcing that all dev tests also run in a controlled environment (either via Docker or maybe something like testcontainers), but that needs more discipline.
This is by far the biggest plus side to CI. Missing external dependencies have bitten me before, but without CI, they’d bite me during deploy, rather than as a failed CI run. I’ve also run into issues specifically with native dependencies on Node, where it’d fetch the correct native dependency on my local machine, but fail to fetch it on CI, which likely means it would’ve failed in prod.
Here’s one: if you forget to check in a file, this won’t catch it.
It checks if the repo is not dirty, so it shouldn’t.
This is something “local CI” can check for. I’ve wanted this, so I added it to my build server tool (that normally runs on a remote machine) called ding. I’ll run something like “ding build make build” where “ding build” is the ci command, and “make build” is what it runs. It clones the current git repo into a temporary directory, and runs the command “make build” in it, sandboxed with bubblewrap.
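Not the actual ding tool, but a rough sketch of the clone-into-a-temp-dir idea (sandboxing omitted); because only committed files make it into the clone, a forgotten git add shows up as a failed build:
import subprocess
import sys
import tempfile

def run_in_clean_clone(cmd: list[str]) -> int:
    # Clone the current repository's HEAD into an empty temp dir and run the
    # build command there, so uncommitted or untracked files can't leak in.
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.run(["git", "clone", "--local", ".", tmp], check=True)
        return subprocess.run(cmd, cwd=tmp).returncode

if __name__ == "__main__":
    sys.exit(run_in_clean_clone(sys.argv[1:] or ["make", "build"]))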
The point still stands that you can forget to run the local CI.
What’s to stop me from lying and making the gh api calls manually?
It is sad that Tk doesn’t support Wayland and there doesn’t seem to be any effort to do so except for a port for Android that doesn’t seem to be much supported anywhere.
It would make for a really good cross-platform UI toolkit even if it looks slightly strange. One of my favorite Git UIs (gitk) is written in it and I still use it every other week, but it looks really ugly on Wayland because of the non-integer scaling I use on my monitors.
Not really related to the topic, but one of the things that I most dislike in Go is the heavy usage of magic comments.
I understand the idea that they want new code to be parsable by older versions of the language, but considering how heavily the language relies on magic comments I can’t see why it didn’t use a preprocessor-style marker like C’s # instead. But anyway, I know that the ship has already sailed.
I wouldn’t say it’s “heavy”, more like a sprinkling. There are only three magic comments that are generally used (build, embed, generate) and even then it’s sparingly. Grafana’s Tempo has ~170k lines of Go and only 5 of those lines are //go: directives. A very large class of programs will never use any of them. I think that’s a pretty good deal.
It really depends. For example, I have a small project of around 2k lines of Go and I had to use multiple build tags, since the code is for a desktop application and I had to create different code paths for Linux/macOS/Windows.
It is even worse if you need to use cgo, since if you need to embed C code you basically have to write the C code as comments. And since they’re comments, you don’t get syntax highlighting at all (or at least I don’t know of a text editor that supports it, and my Neovim does support embedded code in general, since it works fine in Nix).
Now keep in mind that in general I don’t think it is a huge issue, but I still think the language would have been more elegant if it had something else to do those transformations. But again, this is Go, so maybe being ugly is better.
Quick reply, but it’s more about being simple rather than ugly. A preprocessor is even more magical and unwieldy than magic comments. C code often has endless runs of #ifdef ... #ifndef ... to separate build platforms, making the code mostly unreadable. Not to mention the magic macros that change the code from underneath your feet so you don’t even know what you’re reading. Even Rust’s macros allow one to create their own little language which no one else understands. Given how the express purpose of Go was to be easy to read for anyone not familiar with the codebase, that’d have been a non-starter.
I figure the Go authors wanted, rightfully, to steer clear of the mess that is preprocessors.
Maybe I was not clear: I was not suggesting that Go should have a preprocessor, just that this kind of magic be indicated with a symbol different from comments. I suggested that it could be C’s preprocessor marker (e.g. #), but not that it would be a preprocessor per se. The idea is that this creates a clean separation between what is a comment and what is something special interpreted by the compiler: something visually different, so it is easier to see that it is doing something, e.g. by giving it different syntax highlighting.
Looking at the comments, it seems that it is the time of the month to complain about open-source desktop stacks. Let me add my own complaint: why aren’t “window manager” and “desktop environment” separate things in practice? I’m using Gnome with keybinding hacks to feel somewhat like a tiling wm. I would prefer to use a proper wm, but I want the “desktop environment” part of Gnome more: providing me with configuration screens to decide the display layout when plugging an external monitor, having plugging an USB disk just work, having configuration screens to configure bluetooth headsets, having an easy time setting up a printer, having a secrets manager handle my SSH connections, etc.
None of this should be intrinsically coupled with window management logic (okay, maybe the external-monitor configuration one), yet for some reason I don’t know of any project that succeeded in taking the “desktop environment” of Gnome or KDE or XFCE and swapping the window manager for something nice. (There have been hacks on top of Kwin or gnome-shell, some of them like PaperWM are impressive, but they feel like piling complexity and corner cases on top of a mess rather than a proper separation of concerns.)
The alternative that I know of currently is to spend days reading the ArchLinux wiki to find out how to set up a systray on your tiling WM to get the NetworkManager applet (for some reason the NetworkManager community can’t be bothered to come up with a decent TUI, although it would clearly be perfectly appropriate for its configuration), re-learn about another system-interface layer to get USB keys to automount, figure out which bluetooth daemon to run manually, etc. (It may be that Nix or other declarative-minded systems make it easier than old-school distributions.) This is also relevant for the Wayland discussion because Wayland broke things for several of these subsystems, and forced people to throw away decades of such manual configuration to rebuild it in various ways.
Another approach, of course, would be to have someone build a pleasant, consistent “desktop environment” experience that is easy to reuse on top of independent WM projects. But I suspect that this is actually the more painful and less fun part of the problem – this plumbing gets ugly fast – so it may be that only projects that envision themselves with a large userbase of non-expert users can be motivated enough to pull it off. Maybe this would have more chances of succeeding if we had higher-level abstractions to talk to these subsystems (maybe syndicate and its system layer project which discusses exactly this, maybe Goblins, whatever), that various subsystem owners would be willing to adopt, and that would make it easier to have consistent tools to manipulate and configure them.
The latest versions of lxqt and xfce support running on pretty much any compositor that supports the layer-shell protocol (and in fact, neither lxqt nor xfce ship a compositor of their own). Cosmic also has some support for running with other compositors, although it does ship its own. There’s definitely room for other desktop environments to support this, too.
I think this is the main reason I use NixOS nowadays: you configure things the way you want, and they will be there even if you reinstall the system. In some ways I think NixOS is more of a meta-distro that you customize the way you want, and to make things easier there are lots of modules that make configuring things like audio or a systray easier.
You will still need to spend days reading documentation and code to get there, but once it is working this rarely breaks (of course it does break eventually, but generally it is only one thing instead of several of them, so it is relatively easy to get it working again).
What you describe is a declarative configuration of the hodgepodge of services that form a “desktop environment” today, which is easy to transfer to new systems and to tweak as things change. This is not bad (and I guess most tiling-WM-with-not-much-more users have a version of this); it is a way to manage the heterogeneity that exists today.
But I had something better in mind. Those services could support a common format/protocol to export their configuration capabilities, and it would be easy for user-facing systems to export unified configuration tools for them (in your favorite GUI toolkit, as a TUI, whatever). systemd standardized a lot of things about actually running small system services, not much about exposing their options/configurations to users.
This just sounds like a really bad idea. If the language is unapproachable, change the language or help people learn it. Requiring an LLM for generating configuration will just make the problem worse over time.
It’s not required, it’s there to aid generating a scaffold.
Let me rephrase: If the path of least resistance is generating configuration with an LLM, most people will follow this path, and this path doesn’t aid in learning in any way.
Also, it will cover the language complexity problems, making it potentially worse over time.
The learning path is never followed, and the complexity isn’t tackled. Hence: The LLM becomes a de-facto requirement.
I find that a strawman: if it doesn’t help, then you still need a better language. If it does, then problem solved :)
It helps with generating configuration without thinking about it or understanding it. The configuration becomes something obscure and assumed to work well that only gets updated by LLMs and no one else. There’s no incentive for the average person to understand what they’re doing.
If that’s okay, then sure, go ahead.
But this is already a problem, right? I can ask any LLM right now to generate a shell.nix for X project, and it will generate it. While I don’t like the idea of auto-generating code via LLM, I can understand how having something to scaffold code can be nice in some cases.
Heck, even before LLMs people would have things like snippets to generate code, and we also had things like rails generate to generate boilerplate code for you. This only goes one step further.
Yes, and we don’t want to make it worse or pretend that it’s acceptable.
I have the opinion that boilerplate generators (LLM or not) are a symptom of a problem, not a proper solution. Ignoring that, at least a regular generator:
LLMs are not good learning tools because they cannot say “no”. You need to be experienced enough to ask reasonable questions in order to get reasonable answers. Portraying an LLM as an alternative to learning for newcomers is counter-productive.
This is an optional feature though; you can use it or not. If your argument is that “this makes people lazy”, well, they can already be lazy by opening ChatGPT or any other LLM and doing the same.
While the post seems to suggest this is for newcomers, that is not necessarily true. I could see myself using it, considering I have in the past copied and pasted my Nix configuration from some random project to start a new one.
People are free to do so, but embracing it in the project itself is different.
If the project encourages it, then it’s not a shortcut, it’s the default way to do things.
Obviously anyone can take advantage of this, but newcomers are by far the most impacted. It doesn’t flatten the learning curve, it side-steps it.
I think you are reading too much on this, I am not seeing the project encouraging it, just being an alternative.
They added it to their CLI. They published a blog post about it. They set up a dedicated marketing website. They made sure it’s literally the first CTA you see on their home page.
Do we just live in completely separate universes?
I just went to their homepage and I see no mention of this feature. But even if it were there, as long as it is beside the manual way I wouldn’t say it is encouraging it; it is an alternative.
Encouragement would be if they removed all mentions of manual methods or buried them deep in the documentation. This is not what is happening here: if I go to their documentation they still have lots of guides on how everything works. Here, just go to: https://devenv.sh/basics/.
I ask the same for you. Maybe you’re seeing a different version of the homepage, or maybe in your universe a blog post is the same as home page.
It’s indeed in a very prominent position on the homepage
Now I see it, but it is not as prominent as you both are making it out to be. And also, the manual methods are still described on the same homepage.
Go to https://devenv.sh/.
Here are the first three paragraphs:
There is a perfect phrase for this: it is basically a “cargo cult”.
Thanks to LLMs I now use a huge array of DSL and configuration based technologies that I used not to use, because I didn’t have the time and mental capacity to learn 100s of different custom syntaxes.
Just a few examples: jq, bash, AppleScript, GitHub Actions YAML, Dockerfile are all things that I used to mostly avoid (unless I really needed them) because I knew it would take me 30+ minutes to spin back up on the syntax… and now I use them all the time because I don’t have to do that any more.
Add Nix to that list.
I would not feel confident trusting some config that an LLM spits out. I would check whether it does what it’s supposed to do, and lose more time than I gain.
If I cannot scale the number of different technologies, I use fewer or simplify. Example: Bash is used extensively in CI; GitHub Actions just calls bash scripts.
It only takes me a few seconds to confirm that what an LLM has written for me works: I try it out, and if it does the thing then great! If it spits out an error I loop that through the LLM a couple of times, if that doesn’t get me to a working solution I ditch the LLM and figure it out by myself.
The productivity boost I get from working like this is enormous.
I’m wondering: Doesn’t that make your work kinda un-reproducible?
I spend a lot of time figuring out why something in a codebase is like it is or does what it does. (And the answers are often quite surprising.)
“Because an LLM said so, at this point in time” is almost never what I’m looking for. It’s just as bad as “The person who implemented (and never got around to documenting) this moved to France and became a Trappist monk”.
I’d have to completely reconstruct the code in both cases.
You have to be really disciplined with this stuff.
Throwaway prototype? Don’t worry about it. Do the Andrej Karpathy vibe coding thing.
Code that you’re going to be maintaining for a long time? Don’t commit anything unless you not only understand it but could explain how it works to someone else.
In my experience it’s way more frustrating and erratic. Good that it works for you.
Apart from that, I think there is value in facing repetitive and easy tasks. Eventually you get tired of it, build a better solution, and learn along the way.
For non repetitive and novel tasks, I just want to learn it myself. Productivity is a secondary concern.
Thank you for building it. I know Nix, yet I still just want to quickly build an env and move on to building. I think it’s a cool exploration.
The fact of the matter is that the argument for this feature works regardless of the underlying system.
Your unspoken premise appears to me to be that we should all become masters of all our tools. There was a time I agreed with that premise, but now I think we are so thoroughly surrounded by tools we have to be selective about which ones to master. For the most part with devenv, you set it up once and get on with your life, so there isn’t the same incentive to master the tool or the underlying technology as there is with your primary programming language. I’m using Nix flakes and direnv on several projects at my work; my coworkers who use Nix are mostly way less literate in it than I am and it isn’t a huge obstacle to their getting things done with the benefit of it. Very few people do a substantial amount of Nix programming.
No, it’s not.
You don’t need to master every tool you use, just a basic understanding, a sense of boundaries and what it can or can’t do.
It doesn’t matter if your objective is “mastering” or “basic understanding”, both things require some learning, and LLMs do not provide that. That’s the main premise in my argument.
I don’t use a tool if I don’t know anything about it.
I could not agree more with that. LLMs have accelerated me to that point for so many new technologies. They help me gain exactly that understanding, while saving me from having to memorize the syntax.
If you’re using LLMs to avoid learning what tools can and cannot do then you’re not taking full advantage of the benefits they can bring.
My experience with LLMs is having to re-check everything in case it’s a hallucination (it often is) and ending up checking the docs anyway.
The syntax is easy to remember for me from tool to tool. Most projects tend to have examples on their website and that helps to remember the details. I stick to that.
At a technical level, while I understand the appeal of sticking to DEFLATE compression, the more appealing long term approach is probably to switch to zstd–it offers much better compression without slowdowns. It’s a bigger shift, but it’s a much clearer win if you can make it happen.
I admit to being a bit disappointed by the “no one will notice” line of thinking. It’s probably true for the vast majority of users, but this would rule out a lot of useful performance improvements. The overall bandwidth used by CI servers and package managers is really tremendous.
Node already ships Brotli, and Brotli works quite well on JS, it basically has been designed for it.
Took me a minute to realize that by “on JS” you meant on the contents of .js/.mjs files. At first I thought you meant implemented in JS. Very confusing :D
Yes, especially since the change can’t recompress older versions anyway because of the checksum issue. Having a modern compression algorithm could result in smaller packages AND faster/equivalent performance (compression/decompression).
I agree. Gzip is just about as old as it gets. Surely npm can push for progress (I’m a gzip hater, I guess). That said, I do wonder if npm could/would come up with a custom dictionary that would be optimized for, well, anything at all, be it the long tail of small packages or a few really big cornerstones.
[1] https://en.wikipedia.org/wiki/Zstd
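For what the custom-dictionary idea could look like at the DEFLATE level (gzip is DEFLATE underneath), here is a rough Go sketch using the standard library’s preset-dictionary support. The “dictionary” here is made-up package.json boilerplate purely for illustration; real savings would depend entirely on how a registry-wide dictionary was trained, and clients would need the same dictionary to decompress (flate.NewReaderDict):

```go
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"log"
)

// compressedSize returns how many bytes data compresses to with DEFLATE,
// optionally primed with a preset dictionary.
func compressedSize(data, dict []byte) int {
	var buf bytes.Buffer
	var w *flate.Writer
	var err error
	if dict == nil {
		w, err = flate.NewWriter(&buf, flate.BestCompression)
	} else {
		w, err = flate.NewWriterDict(&buf, flate.BestCompression, dict)
	}
	if err != nil {
		log.Fatal(err)
	}
	if _, err := w.Write(data); err != nil {
		log.Fatal(err)
	}
	w.Close()
	return buf.Len()
}

func main() {
	// Hypothetical shared dictionary: boilerplate common to many manifests.
	dict := []byte(`{"name":"","version":"","description":"","main":"index.js","scripts":{"test":"echo \"Error: no test specified\" && exit 1"},"license":"MIT"}`)
	// A small, made-up package manifest.
	manifest := []byte(`{"name":"tiny-pkg","version":"1.0.0","description":"example","main":"index.js","scripts":{"test":"node test.js"},"license":"MIT"}`)

	fmt.Println("without dictionary:", compressedSize(manifest, nil), "bytes")
	fmt.Println("with dictionary:   ", compressedSize(manifest, dict), "bytes")
}
```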
HTTP is adding shared dictionary support, including a Brotli archive that is like zip + Brotli + custom dictionaries:
https://datatracker.ietf.org/doc/draft-vandevenne-shared-brotli-format/13/
I agree a better compression algorithm is always nice, but here backwards compatibility is really important given there are lots of tools and users.
It’s a whole other level of pain to add support for a format existing tools won’t understand; it’s not even certain the npm protocol was built with that in mind. And a non-backwards-compatible compression format might even make things worse in the grand scheme of things: you need two versions of the packages, so more storage space, and if you can’t add metadata to list the available formats you get clients trying more than one URL, increasing server load.
Wayland bad. X11 good. /s
Imagine if all that Wayland effort was put into fixing the supposed issues with X11…
They can’t be fixed, these are design issues. And this is said by the very developers and maintainers of X, who moved on to develop Wayland.
This is false, completely false, but repeated over and over and over and over and over and over and over again, then upvoted over and over and over again.
It can be fixed, and the Wayland developers said it can be fixed, they just didn’t want to.
See the official wayland faq: https://wayland.freedesktop.org/faq.html#heading_toc_j_5
“ It’s entirely possible to incorporate the buffer exchange and update models that Wayland is built on into X.”
From their own website, in their own words.
And that’s a good enough reason. As far as I know, none of us are paying them for their work. Since we benefit from what they (or their employers) are freely giving, we have no right to complain.
I’m not an X11 partisan, it seems plausible enough that Wayland is a good idea, but you’re just begging the question that we’re benefiting from what they’re freely giving.
We’re using what they’re freely giving, but it’s not guaranteed that we’re benefiting.
At the risk of engaging with someone whose words read as an emotional defence…
What do you suppose the benefits of sticking with X.Org were?
I know of no person who could reasonably debug it, even as a user, and the code itself had become incredibly complex over time, accumulating a vast array of workarounds.
The design (I’m told) was gloriously inefficient but hacked over time to be performant, often violating the principles it was built on in the first place.
It also forced single-threaded compositing due to the client-server architecture…
Fixing these issues in X11 would have been incredibly disruptive, and they chose to do the python3 thing of a clean break, which must have been refreshing.
The “Hate” seems to be categorisable in a few ways:
Do you have something else to add to this list, or do you think I’ve mischaracterised anything?
It just works for tons of people. X runs the software I want it to run, and it does it today. Wayland does not.
I think I could make it work with probably hundreds of hours of effort, but… why? I have better things to do than run on the code treadmill.
I’ve encountered one bug in X over the last decade. I made a repro script, git cloned the X server, built it, ran my reproduction in gdb, wrote a fix, and submitted it upstream in about 30 minutes of work. (Then subsystem maintainer Peter Hutterer rewrote it, since my fix didn’t get to the root cause of the issue: my fix was to fail the operation when the pointer was null, his fix ensured the pointer never was null in the first place. So the total time was more than the 30 minutes I spent, but even if my original patch had been merged unmodified, it would have solved the issue from my perspective.)
After hearing all the alleged horror stories, I thought it’d be harder than it was! But it wasn’t that bad at all. Maybe other parts of the code are worse, I don’t know. But I also haven’t had a need to know, since it works for me (and also for a lot of other people).
Also important to realize that huge parts of the X ecosystem are outside the X server itself. The X server is, by design, hands-off about a lot of decisions, allowing independent innovation. You can make a major, meaningful contribution to the core user experience without ever touching the core server code, by working on inter-client protocols, window managers, input methods, toolkits, compositors, desktop panels, etc., etc., etc.
Every programmer likes the idea of a clean break. Experienced programmers usually know that the grass might be greener now if you rip out Chesterton’s fence, but it won’t stay that way for long.
Keep running it then, nobody’s stopping you if it works; it just won’t be updated.
With all the posturing you’d think someone would have stepped up to maintain it; based on this comment you have the capability.
Like I said before, I had one issue with it recently, last year, and the bug report got a quick reply and it was fixed.
I don’t know what you expect from an open source maintainer, but replying promptly to issues, reviewing and merging merge requests, and fixing bugs is what I expect from them. And that’s what I got out of the upstream X team, so I can’t complain. Maybe I’m the lucky one, idk, but worst case, if it does ever actually get abandoned, yeah, compiling it was way easier than I thought it would be, so I guess I could probably do it myself. But as of this writing, there’s no need to.
Yes, I remember looking at the code of xinit (that is supposed to be a small shell script just to start the X server) and it was… bad, to say the least. I also had some experience implementing a protocol extension (Xrandr) in python-xlib, and while I was surprised it worked first try with me just following the X11 documentation at the time, it was really convoluted to implement, even considering that the code base of python-xlib already abstracted a lot for me.
I don’t know if, had I designed a replacement for X.org, it would look like Wayland; I think the least X.org needed was a full rewrite with lots of testing, but that wouldn’t happen considering even the most knowledgeable people in its codebase don’t touch parts of it out of fear. If anything, Wayland is good because there are people who are enthusiastic about hacking on its code base and constantly improving it, something that couldn’t be said about X.org (which, even before Wayland existed, had pretty slow development).
I think a lot of the hate happens because “it has bugs” magnifies every other complaint.
The xdg-portal system certainly addresses things like screensharing on a protocol level. But pretty much every feature that depends on it – screen sharing, recording etc. – is various kinds of broken at an application level.
E.g. OBS, which I have to use like twice a year at most, currently plays a very annoying trick on me where it will record something in a Wayland session, but exactly once. In slightly different ways – under labwc the screen video input only shows up the first time, then it’s gone; on KDE it will produce a completely blank video.
I’m sure there’s something wrong with my system but a) I’ve done my five years of Gentoo back in 2005, I am not spending any time hunting bugs and misconfiguration until 2035 at the very least and b) that just works fine on X11.
If one’s goal is to hate Wayland, b) is juuuust the kind of excuse one needs.
I understand the frustration, but I was raised on X11 and these bugs sound extremely milquetoast compared to the graphical artifacting, non-starting and being impossible to debug, mesa incompatibility, proprietary extensions not driving the screen and the fucking “nvidia-xconfig” program that worked 20% of the time.
X11 is not flawless either; distro maintainers got better at applying various hacks to make it function well enough, especially out of the box, but it remains one of the more brittle components of desktop Linux by a pretty wide margin.
Oh, no, I’m not disagreeing, that was meant as a sort of “I think that’s why Wayland gets even more hate than its design warrants”, not as a “this is why I hate Wayland”. I mean, I use labwc, that occasional OBS recording is pretty much the only reason why I keep an X11 WM installed.
I’m old enough for my baseline in terms of quirks to be XFree86; nvidia-xconfig felt like graduating from ed to Notepad to me at some point :-D
Mesa incompatibilities will bite people in Wayland too.
But then they’d be maintaining this new model and the rest of X.Org, which the maintainers could not do. It might not matter that it’s technically possible if it’s not feasible.
Wayland is the new model plus xwayland so that the useful parts of X11 keep working without them having to maintain all the old stuff.
Again, provably false - they are maintaining the rest of X.Org, first like you said, xwayland is a thing… that’s all the parts they said were obsolete and wanted to eliminate! But also from the same FAQ link:
A lot of the code is actually shared, and much of the rest needs very little work. So compared to what is actually happening today, not developing Wayland would have been less maintenance work, not more. All that stuff in the kernel, the DRI stuff, even the X server via xwayland, is maintained either way. The only difference is that now they’ve spent 15 years (poorly) reinventing every wheel to barely regain usability parity, because it turns out most of the stuff they asserted was useless actually was useful and in demand.
I like to say Graphics are, at best, 1/3 of GUI. OK, let’s assume you achieved “every frame is perfect”. You still have a long way to go before you have a usable UI.
That’s like saying that we are maintaining both English and German language. Who are “they”? And X is very much in life support mode only - take a look at the commit logs.
XWayland is much smaller than the whole of X, it’s just the API surface necessary to keep existing apps working. That’s like saying that a proxy is a web browser..
Code sharing:
The DRM subsystem and GPU code is “shared”, as in it is properly modularized in Linux and both make use of it. If anything, Wayland compositors make much better use of the actual Linux kernel APIs, and are not huge monoliths with optional proprietary binary blobs. It’s pretty trivial to write a wayland compositor from scratch with no external libraries at all - where is the shared X code?
All right here, notice how xwayland is just one of the many “hw” backends of the same xserver core:
https://gitlab.freedesktop.org/xorg/xserver
There’s also backends for other X servers (xnest), Macs (xquartz), Windows (xwin), and, of course, the big one for physical hardware (xfree86, which is much smaller than it used to be: over 100,000 lines of code were deleted from there around 2008. Git checkout a commit from 2007 and you find… 380k lines according to a crude find | wc over the .c files; git checkout master and you find 130k lines by the same measure in that hw/xfree86 folder. Yeah, that’s still a lot of code, but much less than it was, because yes, some of it was simply deleted, but also a lot of it was moved - so nowadays both the X server and Wayland compositors can use it).
But outside of the hw folder, notice how much of the X server core code is shared among all these implementations, including pretty much every user-facing API and associated implementation bookkeeping in the X server.
XWayland not only is a whole X server, it is a build of the same X.org code.
“Maintaining” can mean different things. The X server is “maintained” in the sense that it’s kept working for the benefit of XWayland, but practically nobody is adding new features any more. Not having to maintain the “bare metal” backend also removes a lot of the workload in practice.
This is a much simpler and less work than continuing to try to add features to keep the X protocol up to date with the expectations of a modern desktop.
In other words, yes, the X server is maintained, in the sense that it’s in maintenance mode, but there’s very little active development outside of XWayland-specific stuff.
If you look at the Git tags on that repo, you’ll see that XWayland is also on a completely different release cadence now, with a different version number and more frequent releases. So even though it’s the same repo, it’s a different branch.
XWayland is not a mandatory part of the Wayland protocol, though. Of course they chose the easiest/most compatible way to implement the functionality, which will be to build on the real thing, but it’s a bit dishonest on your part to say that an optional part, meant to provide backwards compatibility, could be considered “shared code”.
I have to admit that I don’t know much about X.Org’s internals, but I think a lot of that stuff extracted to libraries and into the kernel is in addition to, not replacing, the old stuff.
For example X.Org still has to implement its own modesetting in addition to KMS. It has to support two font systems in addition to whatever the clients use. It has to support the old keyboard configuration system and libxkbcommon. It has to support evdev and libinput. It has to implement a whole graphics API in addition to kernel DRM.
Wayland can drop all this old stuff and just use the new. Xwayland isn’t X.Org, I don’t think it has to implement any of this. It’s “just” a translation layer for the protocol.
Please be careful about assuming what somebody else will find easy / less work. If the maintainers said they can’t support it anymore, I’m inclined to believe them. Sometimes cutting your losses is the easier option.
I think this is unfair to people working on Wayland. I know you don’t like it, but It’s an impressive project that works really well for many people.
I don’t think anybody was “asserting [features] are useless”, they just needed the right person to get involved. I’m assuming you mean things like screen sharing, remote forwarding, and colour management. People do this work voluntarily, it might not happen immediately if at all, and that’s fine.
XWayland does have to implement all the font/drawing/XRender/etc. stuff, since X11 clients need that to work. That’s part of its job as a “translation” layer.
(I should know, Xorg/XWayland does some unholy things with OpenGL in its GLAMOR graphics backend that no other app does, and it has tripped up our Apple Mesa driver multiple times!)
But the most important part is that XWayland doesn’t have to deal with the hardware side (including modesetting, but also multi screen, input device management, and more), which is where a lot of the complexity and maintenance workload of Xorg lies.
The core Wayland protocol is indeed just buffer exchange and update based on a core Linux subsystem. It’s so lean that it is used in car entertainment systems.
And HTTP 1.0 is another protocol that can also be added to other programs. Such protocols are pretty trivial; obviously they can be incorporated into X. I can also attach an electric plug to my water plumbing, but it wouldn’t make much sense either. Having two ways to do largely overlapping stuff, that would just overstep each other’s boundaries, would be bad design.
Wayland has a ready-made answer to “every frame is perfect” – adding that to X would just make some parts of a frame perfect. Wayland also supports multiple displays that have different DPIs, which is simply not possible at all in X.
GPUs weren’t even a thing when X was designed - it had a good run, but let it rest now. I really don’t support reinventing the wheel unnecessarily, which seems all too common in IT, but there are legitimate points where we have to ask if we really are heading in the direction we want to go. It was the correct decision in the case of X vs Wayland, as seen by the display protocols of literally every other OS, whose internals are pretty similar to Wayland’s.
Then how do you explain it being implemented and functional and working in X (with Qt or KDE applications) since at least 2016? Its configuration is a bit janky https://github.com/qt/qtbase/blob/1da7558bfd7626bcc40a214a90ae5027f32f6c7f/src/gui/kernel/qhighdpiscaling.cpp#L488 but it does work.
EDIT: I posted a video comparison in this comment because I’m tired of arguing in text about something that is obvious when you see it in real life. X11 and Wayland are not the same, and only Wayland can seamlessly handle mixed DPI.
We already went over this in another thread here in the past. X11 does not implement different DPIs for different monitors today, and it doesn’t work out of the box (where you say “janky”, what you really mean is “needs manual hacks and configuration and cannot handle hotplug properly”).
Even if you did add the metadata to the protocol (which is possible, but hasn’t been done), it’s only capable of sudden DPI switching when you move windows from one screen to another.
X11 cannot do seamless DPI transitions across monitors, or drawing a window on two monitors at once with the correct DPI on both, the way macOS and KDE Wayland can, because its multi-monitor model, which is based on a layout using physical device pixels, is incompatible with that. There’s no way to retroactively fix that without breaking core assumptions of the X11 protocol, which would break backwards compatibility. At that point you might as well use Wayland.
Indeed and I am not sure why you keep repeating the same misconceptions despite being told multiple times that all these things do work in X11.
The links posted by @adam_d_ruppe are good, but I think the oldest post talking about it is oblomov’s blog post. He even had patches that implement mixed DPI for GTK on X11, but obviously the GTK folks would never merge something that would improve their X11 backend (mind you, they are still stuck on xlib).
It gets even funnier, because proper fractional mixed DPI scaling was possible in X11 long before Wayland got the fractional scaling protocol, so there was ironically a brief period not long ago where scaling was working better in XWayland than in native Wayland (that particular MR was a prime example of Wayland bikeshedding practice, being blocked for an obnoxious amount of time mostly because GTK didn’t even support fractional scaling itself, even quite some time after the protocol was added).
Yes it can, and it works out of the box with any proper toolkit (read: not GTK) and renders at the native DPI of each monitor, switching seamlessly in between. We don’t even have to lock ourselves to Qt; even ancient toolkits like wxWidgets support it.
This doesn’t work on either Wayland or X11, and not a single major toolkit supports this use case, for good reason: the complexity needed would be insane, and you can basically forget about any optimizations across the whole rendering stack.
Also, the whole use case is literally an edge case in its own right; I don’t think many people are masochistic enough to keep a window permanently straddling the edge of different-DPI monitors for a long time.
It doesn’t work on KDE Wayland the way you think it does, you get the same sudden switch as soon as you move more than half over to the other monitor. If you don’t believe me, try setting one monitor to a scale factor that corresponds to a different physical size. Obviously you will not notice any sudden DPI changes if the scale factors end up as the same physical size, but that works in X11 too.
At this point the only difference is that X11 does not have native per-window atoms for DPI, so every toolkit adds their own variables, but that hardly makes any difference in practice. And to get a bit diabolical here, since Wayland is so keen on shoehorning every use case they missed (i.e. everything going beyond a basic kiosk use case) through a sideband D-Bus API, wp-fractional-scale-v1 might as well have become a portal API that would have worked the same way on X11.
After a decade of “Wayland is the future” it is quite telling that all the Wayland arguments are still mostly based on misconceptions like this (or the equally common “the Wayland devs are the Xorg devs” - not realizing that all of the original Wayland devs have long jumped the burning ship), while basic features such as relative window positioning or, god forbid I mention it, network transparency are completely missing.
Sigh.
I’m tired of pointless arguing, so here’s a video. This is what happens on X11, with everything perfectly manually configured through environment variables (which took some experimentation because the behavior wrt the global scale is a mess and unintuitive):
https://photos.app.goo.gl/H9TRvexd2SQWxLg28
This is what happens on Wayland out of the box, just setting your favorite scale factors in System Settings, with no environment variable tweaks:
https://photos.app.goo.gl/XjS36F2MbHye1F276
In both tests the screen configuration, resolution, relative layout (roughly*), and scale factors are the same (1.5 left, 2.0 right).
These two videos are not the same.
I guess I should also point out the tearing on the left display and general jank in the X11 video, which are other classic symptoms of how badly X11 works for some systems. This is with the default modesetting driver, which is/was supposed to be the future of X11 backends and is the only one that is driver-independent and relies on the Linux KMS backend exclusively, but alas… it still doesn’t work well. Wayland doesn’t need hardware-specific compositor backends to work well.
Also, the reason why I used a KDialog window is that it only works as expected with fixed size windows. With resizeable windows, when you jump from low scale to high scale, the window expands to fit the (now larger) content, but when you jump back, it keeps the same size in pixels (too large relative to the content now), which is even more broken. That’s something that would need even more window manager integration to make work as intended on X11. This is all a consequence of the X11 design that uses physical pixels for all window management. Wayland has no issue since it deals in logical/scale-independent units only for this, which is why everything is seamless.
Also, note how the window decorations are stuck at 2.0 scale in X11, even more jank.
* X11 doesn’t support the concept of DPI-independent layout of displays with different DPI, so it’s impossible to literally achieve the same layout anyway. It just works completely differently.
Funnily enough on my system the KDE Wayland session behaves exactly like what you consider so broken in the X11 session.
There is a lot to unpack here, but let’s repeat the obvious, since you completely ignored my comment and replied with basically a video version of “I had a hard time figuring out the variables, let’s stick with the most broken one and just dump it as a mistake of X11”: what’s happening in the Wayland session is most certainly not what you think it is. Qt (or any toolkit, for that matter) is not capable of rendering at multiple DPIs at the same time in the same window. As I stated before, something like that would require ridiculous complexity in the entire render path. Imagine drawing an arc somewhere in a low-level library and suddenly having to change all your math because you crossed a monitor boundary. Propagating that information alone is a huge effort, and let’s not even start with more advanced details, like rotated windows.
The reason why you see no jump in the Wayland session is because you chose the scaling factor quite conveniently so that it is identical in physical size on both monitors (and that would work on X11 too). Instead of doubling down, it would have taken you 5 seconds to try out what I suggested, i.e. set one of the scale factors to one with a different physical size (maybe 1.0 left, 2.0 right) and you will observe that indeed also Wayland cannot magically render at different DPIs at the same time and yes you will observe those jumps.
Now obviously your scaling factors of 1.5 vs 2.0 should produce the same result on X11. I don’t know your exact configuration, so I can only reach for my magic crystal ball, but since you already said you had a hard time figuring out the variables, a configuration error is not far-fetched: maybe you are applying scaling somewhere twice, e.g. from leftover hacks with xrandr scaling or hardcoded font DPI, or kwin is interfering in a weird way (set PLASMA_USE_QT_SCALING to prevent that). But honestly, given that you start your reply with “Sigh” and proceed to ignore my entire comment, I don’t think you are really interested in finding out what was wrong to begin with. If you are though, feel free to elaborate.
In my case I have the reverse setup of yours, i.e. my larger monitor is 4k and my laptop is 1080p, so I apply the larger scale factor to my larger screen (I don’t even know my scaling factors; I got lucky in the EDID lotto and just use QT_USE_PHYSICAL_DPI). So yes, this means also on Wayland I get the “jump” once a window is halfway over to the next monitor. The size explosions are not permanent as you say, though, neither on Wayland nor on X11.
She obviously doesn’t think that that is what’s happening. In the comment you yourself linked, she explains how it works:
With this correct understanding, we can see that the rest of your comment is incorrect:
I just tried using the wrong DPI and there was no jump (I’m on Sway). On one screen, the window I moved was much bigger, and on the other, it was much smaller. But it never changed in size. The only thing that changed was the DPI it was rendering at, while seamlessly occupying the exact same space on each monitor as it did so. This works because Wayland uses logical coordinates instead of physical pixels to indicate where windows are located and how big they are. So when a window is told to render at a different scale, it remains in the same logical position, at the same logical size.
There is a noticeable change, but it’s just the rendering scale adjustment kicking in, causing the text on the monitor the window is being moved onto to become pixel-sharp, and the text on the old monitor to get a bit of a blur.
My Mac refuses to show a window spanning two monitors.
This changed a few years ago to allow per-display spaces (what Linux calls “workspaces”) - presumably they decided they didn’t want to deal with edge cases where a window is on an active space on one monitor and an inactive space on another. (Or what happens if you move a space containing half a window across monitors?)
You can get the old behavior back by turning off Settings > Desktop & Dock > Mission Control > Displays have separate spaces.
Indeed - the fact that it does actually work, albeit with caveats, proves that it is not, in fact, “simply not possible at all”.
We can discuss the pros and cons of various implementations for various use cases. There are legitimate shortcomings in Qt and KWin, some of which are easy to fix (e.g. the configuration UI, hotplugging different configurations), some of which are not (the window shape straddling monitor boundaries), and there are some advantages to it too (possibly better performance and visual fidelity), but a prerequisite to a productive technical discussion is to do away with the blatant falsehoods that universally start these threads.
“You can do it, but….” is something reasonable people can discuss.
“It is simply not possible at all” is provably flat-out false.
I appreciate that you’ve now tried it yourself. I hope you’ll never again repeat the false information that it is impossible.
The use of quote marks here implies that the commenter you’re replying to used this exact term in their comment, but the only hit for searching the string is your comment.
I’m flagging this comment as unkind, because my reading of this and other comments by you in this thread is that you are arguing in bad faith.
https://lobste.rs/s/oxtwre/hard_numbers_wayland_vs_x11_input_latency#c_argozj
Try not to accuse people of personal attacks - which is itself a personal attack, you’re calling me an unkind liar - without being damn sure you have your facts right.
That’s a different person. The person you replied to did not say that.
This is what the person you replied to actually said:
Looking at the two videos it’s pretty obvious that they are not doing the same thing at all. That dialog window is not being drawn with the correct DPI on each monitor, it’s either one or the other. “Mixed” is sufficiently elastic a word that I’m sure some semantic tolerance helps but I’m not exactly inclined to call that behaviour “mixed”, just like I also can’t point at the bottle of Coke in my fridge, the stack of limes in my kitchen and the bottle of rum on my shelf and claim that what I actually have is a really large Cuba Libre. (Edit:) I.e. because they’re not mixed, they’re obviously exclusive.
I don’t know if that’s all that X11 can do, or if it’s literally impossible to achieve what @lina is showing in the second video – at the risk of being an embarrassment to nerddom everywhere, I stopped caring a few years back and I’m just happy if the pixels are pixeling. But from what I see in the video, that’s not false information at all.
100% the same thing as Wayland is impossible in X. It can’t handle arranging mixed DPI monitors in a DPI-independent coordinate space, such that rendering is still pixel perfect on every monitor (for windows that don’t straddle monitors). X11 has no concept of window buffer scale that is independent of window dimensions.
The closest you can get is defining the entire desktop as the largest DPI and all monitors in that unit, then having the image scaled down for all the other monitors. This means you’re rendering more pixels though, so it’s less efficient and makes everything slightly blurry on the lower DPI monitors. It’s impossible to have pixel perfect output of any window on those monitors in this setup, and in practice, depending on your hardware, it might perform very poorly. It’s basically a hacky workaround.
This is actually what XWayland fakes when you use KDE. If you have mixed DPI monitors, it sets the X11 DPI to the largest value. Then, in the monitor configuration presented via fake XRandR to X11 clients, all monitors with a lower DPI have their pixel dimensions scaled up to what they would be at the max DPI. So X11 sees monitors with fake, larger resolutions, and that allows the relative layout to be correct and the positioning to work well. If I had launched KDialog under KDE Wayland with the backend forced to X11, it would have looked the same as Wayland in the video in terms of window behavior. It also wouldn’t have any tearing or glitches, since the Wayland compositor behind the scenes is doing atomic page flips for presentation properly, unlike Xorg. The only noticeable difference would have been that it’s slightly less sharp on the left monitor, since the window would be getting downscaled there.
That all works better than trying to do it in a native X11 session, because XWayland is just passing the window buffers to Wayland so only the X11 windows get potentially scaled down during compositing, not the entire screen.
Where it falls apart is hotplug and reconfiguration. There’s no way to seamlessly transition the X11 world to a higher DPI, since you have to reset all window positions, dimensions, monitor dimensions and layout, and client DPI, to new numbers. X11 can’t do that without glitches. In fact, in general, changing DPI under X11 requires restarting apps for most toolkits. So that’s where the hacky abstraction breaks, and where the proper Wayland design is required. X11 also doesn’t have any way for clients to signal DPI awareness and can’t handle mixed DPI clients either, so any apps that aren’t DPI aware end up tiny (in fact, at less than 1.0 scale on monitors without the highest DPI). This affects XWayland too and there’s no real way around it.
At best, in XWayland, you could identify which clients aren’t DPI aware somehow (like manual user config) and give them a different view of the X11 world with 1.0 monitor scales. That would mostly work as long as X11 windows from both “worlds” don’t try to cooperate/interact in some way. KDE today just gives you two global options, either what I described or just always using 1.0 scale for X11 (which makes everything very blurry on HiDPI monitors, but all apps properly scaled).
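To make the faked-layout arithmetic from a couple of paragraphs up concrete, here is a rough sketch of the idea; the monitor numbers and the helper are illustrative only, not taken from the actual KWin/XWayland code:

```go
package main

import "fmt"

// monitor describes a real output and its user-configured scale factor.
type monitor struct {
	name  string
	w, h  int     // real pixel dimensions
	scale float64 // configured scale factor
}

// fakeLayout reports each monitor to X11 clients at the size it would have
// at the highest scale factor, so the relative layout stays consistent; the
// compositor then downscales the resulting X11 buffers to the real size.
func fakeLayout(monitors []monitor) {
	maxScale := 0.0
	for _, m := range monitors {
		if m.scale > maxScale {
			maxScale = m.scale
		}
	}
	for _, m := range monitors {
		fw := int(float64(m.w) * maxScale / m.scale)
		fh := int(float64(m.h) * maxScale / m.scale)
		fmt.Printf("%s: real %dx%d (scale %.1f) -> reported to X11 as %dx%d\n",
			m.name, m.w, m.h, m.scale, fw, fh)
	}
}

func main() {
	fakeLayout([]monitor{
		{"left", 2560, 1440, 1.5},
		{"right", 3840, 2160, 2.0},
	})
}
```

With these made-up numbers the left monitor gets advertised at roughly 3413x1920, which also matches the caveat above: windows on the lower-DPI screen get rendered larger and scaled back down, so they end up slightly less sharp there.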
That’s what I thought was happening, too, but I wasn’t really sure if my knowledge was up-to-date here. Like I said, I’m just happy if my pixels are pixeling, and I didn’t want to go down a debate where I’d have to read my way through source code. This end of the computing stack just isn’t fun for me.
This isn’t a term we invented in this thread, it is very common, just search the web for “mixed dpi” and you’ll find it, or click the links elsewhere in this thread and see how it is used.
A blog posted in a cousin comment sums it up pretty well: “A mixed-DPI configuration is a setup where the same display server controls multiple monitors, each with a different DPI.”
(DPI btw formally stands for “dots per inch”, but in practice, it refers to a software scaling factor rather than the physical size because physical size doesn’t take into account the distance the user’s eyes are from the display. Why call it DPI then? Historical legacy!)
Or, if that’s too far, go back to the grandfather post that spawned this very thread:
“displays that have different DPIs”, again, the common definition spelled out.
What, exactly, happens when a window straddles two different monitors is implementation-dependent. On Microsoft Windows and most X systems, the window adopts the scaling factor for the monitor under its center point, and uses that across the whole window. If the monitors are right next to each other, this may cause the window to appear non-rectangular and larger on one monitor than the other. This is satisfactory for millions of people. (I’d be surprised if many people actually commonly straddle windows between monitors at all, since you still have the screen bezel at least right down the middle of it… I’d find that annoying. It is common for window managers to try to snap to monitor boundaries to avoid this, and some versions of Apple MacOS (including the Monterey 12.7.6 I have on my test computer) will not even allow you to place a window between monitors! It makes you choose one or the other.)
edit: just was reminded of this comment: https://lobste.rs/s/oxtwre/hard_numbers_wayland_vs_x11_input_latency#c_1f0zhn and yes that setting is available on my mac version, but it requires a log out and back in. woof. not worth it for a demo here, but interesting that Apple apparently also saw fit to change their default behavior to prohibit straddling windows between monitors! They apparently also didn’t see much value in this rare use case. /edit
On Apple operating systems and most (perhaps all?) Wayland implementations… and some X installs, using certain xrandr settings (such as described here https://blog.summercat.com/configuring-mixed-dpi-monitors-with-xrandr.html), they do it differently: the window adopts the highest scaling factor the window appears on (or is present in the configuration? tbh im not exactly sure), using a virtual coordinate space, then the system downscales that to the target area on screen. This preserves its rectangular appearance - assuming the monitors are physically arranged next to each other and the software config mirrors that physical arrangement… and the OS lets you place it there permanently (but you can still see it while dragging at least) - but has its own trade offs; it has a performance cost and can lose visual fidelity (e.g. blurriness), especially if the scale factors are not integer multiples of each other, but sometimes even if they are because the application is drawing to a virtual screen which is scaled by a generic algorithm with limited knowledge about each other.
In all these cases, there is just one scale factor per window. Doing it below that level is possible, but so horribly messy to implement, massive complexity for near zero benefit (again, how often do people actually straddle windows between monitors?), so nobody does it irl. The difference is the Mac/Wayland approach makes it easier to pretend this works… but it is still pretending. The illusion can be pretty convincing a lot of the time though, like I said in that whole other lobsters link with lina before, I can understand why people like this experience, even if it doesn’t matter to me.
The question isn’t if the abstraction leaks. It is when and where it leaks.
I tried to before posting that since it’s one of those things that I see people talking past each other about everywhere, and virtually all the results I get are… both implementation-specific and kind of useless, because the functional boundary is clearly traced somewhere and different communities seem to disagree on where.
A single display server controlling multiple monitors, each with a different DPI, is something that X11 has basically always supported; I’m not sure how that’s controversial. Even before Xinerama (or if your graphics card didn’t work with Xinerama, *sigh*) you could always just set up two X screens, one for each monitor. Same display server, two monitors, different DPIs – glorious. I was doing mixed DPI before everyone was debating it, and all thanks to shitty S3 drivers and not having money to buy proper monitors.
But whenever this is discussed somewhere, it seems that there’s a whole series of implicit “but also” attached to it, having to do with fractional scaling, automatic configuration, what counts as being DPI-aware and whatnot.
So it’s not just something we invented in this thread, it’s something everyone invents in their own thread. In Windows land, for example, where things like getting WM_DPICHANGED when the window moves between monitors are a thing, you can query DPI per window, and set DPI awareness mode per thread, I’m pretty sure you’ll find developers who will argue that the xrandr-based global DPI + scaling system we’ve all come to know and love isn’t mixed-DPI, either.
(Edit:) To be clear – I haven’t used that system in a while, but as I recall, the way it worked was that it set a global DPI, and you relied on the display server for scaling to match the viewports’ sizes. There was no way for an application to “know” what DPI/scaling factor combination it was working with on each monitor so it could adjust its rendering for whatever monitor it was on (for the implementation-defined definition of “on”, sure – midpoint, immediate transition, complete transition, whatever). Toolkits tried to shoehorn that in, but that, too, was weird in all sorts of ways and assumed a particular setup, at least back in 2016-ish or however long ago it was.
Well, I wouldn’t call that not mixed dpi, but I would call it suboptimal. So it seems you’re familiar with the way it works on Windows: move between monitors or change the settings in display properties, and the system broadcasts the WM_DPICHANGED message to top level windows that opted into the new protocol. Other windows are bitmap scaled to the new factor, as needed.
Applications use the current DPI for their monitor in their drawing commands - some of this is done automatically by the system APIs, others you multiply out yourself. You need to use some care not to double multiply - do it yourself, then the system api does it again - so it is important to apply it at the right places.
Your window is also automatically resized, as needed, as it crosses scaling boundaries, by the system.
Qt/KDE tries to apply similar rules… but they half-assed it. Instead of sending a broadcast message (a PropertyChange notification would be about the same in the X world), they settled for an environment variable. (The reason I know where that is in the source is that I couldn’t believe that’s the best they did… for debugging, sure, but shipping that to production? Had to verify, but yes, that’s what they shipped :( The XWAYLAND extension has proposed a property - see here https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/1197 - but they couldn’t agree on the details and dropped it, alas.) There’s also no standard protocol for opting out of auto scaling, though I think XWAYLAND proposed one too; I can’t find that link in my browser history so I might be remembering wrong.
The KDE window manager, KWin, tries to apply scale as it crosses monitor boundaries right now, just like Windows does, but it seems to only ever scale up, not back down. I don’t know why it does this, could be a simple bug. Note that this is KWin’s doing, not Qt’s, since the same application in a different window manager does not attempt to resize the window at all, it just resizes the contents of the window.
But, even in the half-assed impl, it works; the UI content is automatically resized for each monitor’s individual scale factor. The user informs it of each monitor’s scale factor, either by position or by port name (again, half-assed, it should have used some other identifier which would work better with hotplugging, but it does still work). If a monitor configuration changes, xrandr sends out a notification. The application queries the layout to determine which scaling factor applies to which bounding box in the coordinate space, then listens to ConfigureNotify messages from the window manager to learn where it is. A quick check of rectangle.contains(window_coordinate) tells it what scale it has, and that fires off the internal dpi-changed event, if necessary. At this point, the codepaths between X and Windows merge as the toolkit applies the new factor. The actual scaling is done client side, and the compositor should not double scale it… but whether this works or not is hit and miss, since there’s no standardization! (The one nice thing about xwayland is they’re finally dismissing the utter nonsense that X cannot do this and dealing with reality - if the standard comes from wayland, I don’t really care, I just want something defined!)
A better way would be if the window manager sent the scale factor as a ClientMessage (similar to other EWMH messages) as the window crosses the boundary, so the application need not look it up itself. That would also empower the user (through the window manager) to change the scale factor of individual windows on demand - a kind of generic zoom functionality - and to opt some individual windows out of automatic bitmap scaling, even if the application itself isn’t written to support it. I haven’t actually implemented this in my window manager or toolkit; the thought actually just came to mind a few weeks ago in the other thread with lina, but I’d like to, I think it would be useful and a nice little innovation.
As a practical matter, even if the window manager protocol is better, applications would probably still want to fall back to doing it themselves if there is no window manager support; probably query _NET_SUPPORTED, and if it’s absent, keep the DIY impl.
None of this is at all extraordinary. Once I implement mine, I might throw it across the freedesktop mailing list, maybe even to the xwayland people, to try to get some more buy-in. Working for me is great - and I’ll take it alone - but it would be even nicer if it worked for everybody.
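For anyone who wants that lookup spelled out, here is a rough sketch of the center-point scale selection described above; the types are hypothetical stand-ins rather than a real X11 binding, and a real client would populate the monitor list from XRandR and the window geometry from ConfigureNotify:

```go
package main

import "fmt"

// rect is an axis-aligned box in the X11 root coordinate space.
type rect struct{ x, y, w, h int }

func (r rect) contains(px, py int) bool {
	return px >= r.x && px < r.x+r.w && py >= r.y && py < r.y+r.h
}

// monitorScale pairs a monitor's bounding box (as reported by XRandR)
// with the scale factor the user configured for it.
type monitorScale struct {
	bounds rect
	scale  float64
}

// scaleForWindow picks the scale of the monitor under the window's center
// point, falling back to 1.0 if the center is somehow off every monitor.
func scaleForWindow(monitors []monitorScale, win rect) float64 {
	cx, cy := win.x+win.w/2, win.y+win.h/2
	for _, m := range monitors {
		if m.bounds.contains(cx, cy) {
			return m.scale
		}
	}
	return 1.0
}

func main() {
	monitors := []monitorScale{
		{rect{0, 0, 2560, 1440}, 1.5},    // left monitor
		{rect{2560, 0, 3840, 2160}, 2.0}, // right monitor
	}
	// Pretend a ConfigureNotify just told us the window moved here:
	win := rect{x: 2300, y: 200, w: 800, h: 600}
	// The center is at x=2700, which is on the right monitor, so this prints 2.
	fmt.Println("scale:", scaleForWindow(monitors, win))
}
```

A real client would also compare against the previous scale and only fire the toolkit's dpi-changed event when the value actually changed.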
If a framework has to go way out of its way to implement some hack to make it work despite X’s shortcomings, but all the other frameworks don’t support it at all, then X simply doesn’t support this feature.
Also, take a look at @lina ’s videos.
I want to preface this by saying: 1) I run neither X nor Wayland 99% of the time; the kernel’s framebuffer is usually enough for my personal needs; 2) it’s been months since I tried Wayland on any hardware.
That said, the one thing I seemed to notice in my “toe dip” into the Wayland world was pretty problematic to me. When I would start X to run some graphical config tool on a remote machine with no GPU and low-power CPU, it seemed to me that “ssh -X” put almost no load on the remote computer; however, attempting to run the same via waypipe put a lot of load on the remote machine, making it essentially unusable for the only things I ever needed a graphical interface for.
If I’ve incorrectly understood the root cause here, I’d love to have someone explain it better. While I don’t use either one very often, it’s clear that X11 is not getting the development resources wayland is, and I’d like to be able to continue using my workflow decades into the future…
The primary reason for Wayland was the security model, was it not? That I believe is truly unfixable. And if you’ve decided to prioritize that, then it makes sense to stop working on X.
No, the primary reason for Wayland is, in Kristian Høgsberg’s words, the goal of “every frame is perfect”.
And even if security was the priority… X also “fixed” it (shortly before Wayland was born: the X Access Control Extension 2.0 was released 10 March 2008, Wayland's initial release was 30 September 2008), but it never caught on (I'd argue because it doesn't solve a real-world problem, but the proximate cause is probably that XACE is all code hooks with no user-friendly part. That could be solved if anybody actually cared.)
Worth noting that Microsoft faced many of the same questions with Windows: they wanted to add a compositor, elevated process isolation, per-monitor fractional scaling, all the stuff people talk about, and they successfully did it with near zero compatibility breaks, despite Win32 and X sharing a lot of functionality. If it were fundamentally impossible for X, it would have been fundamentally impossible for Windows too.
Security can’t be solved retroactively. You can’t plug all the holes of a swiss cheese.
“Solutions” were nesting X servers into one another and such, at that point I might as well run a whole other VM.
And good for Windows. Maybe if we had the legacy X API available for use under Wayland, so that Wayland's security benefits could apply while not losing decades of programs already written, we could have that for Linux too… maybe we could call it WaylandX! [1]
[1] Arguably this name would make more sense
No they weren’t. X11 is a client-server protocol. The only things that a client (app) sees are messages sent from the server (X.org). The default policy was that apps were trusted or untrusted. If they were untrusted, they couldn’t connect to the server. If they were trusted, they could do anything.
The security problems came from the fact that ‘anything’ meant read any key press, read the mouse location, and inspect the contents of any other window. Some things needed these abilities. For example, a compositing window manager needed to be able to redirect window contents and composite them. A window manager that did focus-follows-mouse needed to be able to read all mouse clicks to determine which window was the current active one and tell the X server to send keyboard events there. A screenshot or screen sharing app needed to be able to see the rendered window or screen contents. Generally, these were exceptions.
The X Access Control Extensions provided a general mechanism (with pluggable policies) to allow you to restrict which messages any client could see. This closed the holes for things like key loggers, while allowing you to privilege things like on-screen keyboards and screenshot tools without needing them to be modified. In contrast, Wayland just punted on this entirely and made it the compositor’s problem to solve all of these things.
Microsoft has literal orders of magnitude more people to throw at backcompat work.
They also have literal orders of magnitude more users to create backcompat work in the first place, though.
People are still using xfs? Is there a reason for that?
It’s fast and it has a clear codebase with great authors. For a classic filesystem, XFS is still my choice over ext4.
Although fedora comes with btrfs which has worked great. Raid1 with compression.
I wish all the luck for Kent. I’ll be coming back to bcachefs in a few years.
XFS is quite popular in the server space if I’m not mistaken. At least at GitLab I believe it was the filesystem we ran for everything, though perhaps that has changed since I left.
Around 10 years ago, I chose XFS because it had features I needed that ext4 did not at the time. I don’t recall exactly what those were (64-bit inodes maybe?), but it also performed better with lots of small files and doesn’t require an fsck at pre-determined intervals. And it’s just been rock-solid. It’s like the Debian of filesystems.
It’s solid, stable and fast. It’s boring, but in a good way.
Boot-delaying disk checks are rarer than with ext4. It did not have the raid issues btrfs had. And it’s not as experimental as bcachefs.
I’ve started using it for NixOS because it seems to cope better with its high demand for inodes than ext4 does. It also seems to be faster than btrfs, particularly in VMs for some reason.
My anecdotal evidence as an XFS user since XFSv4 (probably the last 6 years? I’ve lost count, to be honest):
XFS used to be a filesystem only recommended for servers and other systems that had some kind of backup power to ensure a clean shutdown. I used it on a desktop for a few months, until a forced reboot mounted my system read-only and xfs_repair completely corrupted the file system. But even before that I had lost a few files thanks to forced shutdowns. So I went back to Ext4 and stayed there for a few years.
After trying btrfs and getting frustrated with performance (this was before NVMe was common, so I was using a SATA SSD), I decided to go back to XFS, and this time not only did it solve my performance issues, I haven’t had problems with corruption or missing files anymore. The file system is simply rock solid. So I still use it by default, unless I want some specific feature (like compression) that XFS doesn’t support.
It’s a popular filesystem in some low-latency storage situations such as with Seastar-based software like ScyllaDB here and Redpanda.
We use it at work for our Clickhouse disks. If we could start over I’d have probably gone with ext4 instead as that’s what Clickhouse is mainly tested on. There was some historic instability with XFS but it seems to have gotten better (partly with updates, partly with tuning on our end to minimise situations where the disk is under high load). Like most things XFS is a good choice if your software is explicitly tested against it.
It’s (was?) the default filesystem in at least RHEL/CentOS.
At the time I chose XFS several years ago, I wanted to be able to use things like reflinks without needing to use btrfs (which is pretty stable these days but I wasn’t very confident in it back then). I can certainly say that it’s been quite resilient, even with me overflowing my thin pool multiple times (I am very good at disk accounting /s) and throwing a bunch of hard shutoffs at it.
If you often have problems filling up your disk, you are going to have a very, VERY bad time on btrfs. Low disk space handling is NOT stable in btrfs. In fact it is almost non-existent.
After 15 years, btrfs can still get into situations where you’re getting screwed by its chunk handling and have to plug a USB drive in to give it enough free space to deallocate some stuff. Even though df / reports (<10 but >1) gigabytes of free storage. This blog post was 9 years old when I consulted it, and I still needed all the tips in it.
I find it unconscionable that Fedora made btrfs the default with this behavior still not fixed. I will never, ever be putting a new system on btrfs again.
100% this.
I have had openSUSE self-destruct 5 or 6 times in a few years because snapper filled the disks with snapshots and Btrfs self-destructed.
For me the killer misfeatures are these:
- df does not give accurate or valid numbers on Btrfs
- fsck and the existing Btrfs repair tool routinely destroy damaged volumes
Any one of those alone would be a deal-breaker. Two would rule it straight out. All 3 means it’s off the table.
Note: I have not yet even mentioned the multiple problems with multi-disk Btrfs volumes.
I have raised these issues internally and externally at SUSE; they were dismissed out of hand, without discussion.
Thank you! Count me in the camp of “btrfs is great except for when you really need it to be” — low disk space being one of those times (high write load being my personal burn moment)
I wanted bcachefs to work but this and related articles are keeping me away from it too.
I force my Fedora installs to ext4 (sometimes atop lvm) and move on with my life :shrug:
This is why I bite the out of tree bullet and just use ZFS. People tell me I’m crazy for running ZFS instead of Btrfs on single disk systems like my laptop, but like, no! I cannot consider Btrfs reliable in any scenario.
I’ve been using ZFSBootMenu on my Fedora single disk laptops for a while now and find it hard to imagine a different setup.
100% agree. I have found DKMS ZFS to be more stable than in-tree btrfs. Other than one nasty deadlock problem years ago it’s been rock solid. (Just some memory accounting weirdness…)
Yeah, I still get bitten by that one once or twice a year. I find btrfs useful for container pools, etc, but I still don’t use it for stuff I can’t easily regenerate.
Count me among the XFS users, albeit only on one machine at this point. I think I set up my current home server (running Fedora) around the same time Red Hat made XFS the default for RHEL, and I wanted to be E N T E R P R I S E. I’ll likely use Btrfs for my next build, as I have for all my laptops and random desktop machines in recent years. Transparent compression is very nice to have.
EDIT: I believe Fedora Server also defaults to XFS, or at least it did at some point.
Last time I mkfs’d (going back a few years now), XFS had dynamically sized xattr support, while ext4 set a fixed size at creation time. This was important for me at the time for preserving macOS metadata.
I do hope this unified treatment of code generation, generics and type inference under “comptime” becomes standard practice, as it looks very reasonable to the user. In Haskell we have typed TH which comes close, but also has some limitations in the types of terms that can be type-inferred (if you’re into experimental type systems, that is).
As a non-Zig user, my impression is that using comptime as generics has the exact same limitations as C++ templates: the generic code (or at least the usages of the generic type parameter) is not type-checked until it is actually used somewhere in the program, and this means that when you write generic libraries you don’t get static guarantees until you use them with example clients. This will make the experience much worse than proper generics, at scale. I am also worried about the quality of the editor-level type feedback in presence of heavy generics usage, for similar reasons.
(I’ve said this in the past and some Zig maintainers pointed out that Zig works hard to partially type-check code with comptime arguments and that it probably works fine in practice. My intuition rather stubbornly tells me that this will be very annoying in practice when used at scale.)
The problem is that when you say comptime T : type, you don’t give any static information about what the code below actually assumes about T. If it handles T as a completely generic/opaque type about which nothing is known, this is fine. But in practice most code like this will assume things about T, that it has certain fields, supports certain operations, etc., and it will work fine because it will be used by callers with types that match these assumptions. But these assumptions are not made explicit in the generic function, and thus they cannot be reasoned about statically.
What makes generics hard in most languages is the desire to type-check assumptions about them statically. For example, if a function is generic over a type-former (a parametrized type) such as List, maybe you want to use subtyping in the body of the function, and so the type-system designers have to come up with a small static language to express variance assumptions about generic type-former parameters – it is one of the complex and annoying parts of Java generics, for example. They could also give up and say “well, let’s just check on each usage that the subtyping assumption is in fact correct”; this would be much simpler design-wise, and the ergonomics would be much worse.
Maybe “worse is better” and having a simple type system with worse ergonomics is indeed a good idea that will become standard practice. (It certainly helps in lowering the entry barrier to designing type systems, and it possibly makes it easier for programmers to be confident about what is going on.) But I remain skeptical of such claims, especially when they are formulated without acknowledging the notable downsides of this approach.
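A rough Python analogy of the same distinction (just an analogy - none of this is Zig or Java, and the names are invented for illustration): with plain duck typing the assumption lives only in the function body, while a typing.Protocol-style bound states it up front, where a checker can blame the right side.

```python
from typing import Protocol

# Implicit assumption: nothing in the signature says that "thing" must have a
# .cartesian_product() method; you only find out when some caller trips over it.
def pairs_implicit(thing, other):
    return thing.cartesian_product(other)

# Explicit assumption: the required operation is part of a declared interface,
# so a type checker can point at the declaration/call boundary that is wrong.
class HasCartesianProduct(Protocol):
    def cartesian_product(self, other: "HasCartesianProduct") -> "HasCartesianProduct": ...

def pairs_explicit(thing: HasCartesianProduct, other: HasCartesianProduct) -> HasCartesianProduct:
    return thing.cartesian_product(other)
```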
As a Zig user, fully agree with all of the above! Some extra thoughts:
While I am 0.9 sure that for simple-to-medium cases, declaration-side type-checking leads to better ergonomics, I am maybe at 0.5 that there’s a complexity tipping point where call-site checking becomes easier to reason about for the user. In other words, I observe that in languages with expressive generics, some libraries will evolve to try to encode everything in the type signatures, leading to a programming style where most of the code written manipulates types instead of doing the actual work. I’ve certainly seen a number of head-scratching Rust signatures. Here’s a recent relatively tame example of this sort of dynamics playing out: https://github.com/rust-lang/rust/pull/107122#issuecomment-2385640802.
I am not sure that just doing what Zig does would magically reduce the overall complexity here, but it seems at least plausible that, at the point where you get into the Turing tarpit when specifying function signatures, it might be better to just use the base imperative language for types?
When comparing with C++, it’s worth noting that you get both instantiation-time type-checking and a Turing tarpit. A big part of perceived C++ complexity is due to the fact that the tools of expressiveness are overloading, ADL, and SFINAE. Zig keeps instantiation-time checking (or rather, dials it up to 11, as even non-generic functions are checked at call-site), but also simplifies everything else a lot.
Another dimension to think about here is crates.io-style packages. It seems that declaration-checking plays a major role in SemVer — semantic versioning starts with defining what is and what is not your API. But, at the same time, the resulting ecosystem also depends on the culture of making changes, not only on the technical means to enforce them. And Zig’s package manager/build system is shaping up to be the best-in-class general purpose small-scale dependency management solution. I am extremely curious what the ecosystem ends up looking like, after the language stabilizes.
Could you say a few words (or point us to some documentation) on what you think makes Zig’s package manager/build system the best?
There are no docs! As a disclaimer, Zig is a work-in-progress. If you want to just use the thing, it’s much too early for that, come back five years later!
That being said, why I am excited about a hypothetical Zig ecosystem:
First, Zig aims to be dependency zero. One problem that is traditionally hard in this space is how do you get the environment that can execute the build/packaging logic? There’s a lot of tools that, eg, depend on Python, which make building software at least as hard as provisioning Python. Another common gratuitous dependency is sh/bash and core utils. Yet another option is JVM (gradle, bazel).
In contrast, zig is a statically linked binary that already can execute arbitrary scripts (via zig run) and can download stuff from the internet (via zig fetch). That is big! If you can run stuff, and can download stuff to run from the internet, you can do anything with no headache. What’s more, it’s not confined to your build system, you can write normal software in Zig too (though, tbh, I am personally still pretty skeptical about the viability of an only-spatially-memory-safe language for general purpose stuff).
Second, I think Zig arrived at the most useful general notion of what a dependency is — a directory of files identified by a hash (a rough sketch of the idea follows after this list). From the docs:
There’s no special casing for “Zig” dependencies. You use the same mechanism to fetch anything (eg, in TigerBeetle we use this to fetch a prebuilt copy of llvm-objcopy). I expanded on this a bit in https://matklad.github.io/2024/12/30/what-is-dependency.html. inb4 someone mentions nix: nix can do some of this, but it is not a good dependency zero, because it itself depends on posix.
Third, the build system is adequate. It uses general purpose imperative code to generate a static build graph which is then incrementally executed. This feels like the least sour spot for general purpose build systems. While you get some gradle-vibes from the requirement to explicitly structure your build as two phases, the fact that it all is simple procedural code in a statically-typed language, rather than a DSL, makes the end result much more understandable. Similarly, while a static build graph can’t describe every imaginable build, some builds (at least at a medium scale) are better left to imagination.
Fourth, Zig is serious about avoiding dependencies. For example, cross compilation works. From windows, you can build software that dynamically links a specific version of glibc, because the Zig folks did the work of actually specifying the ABI of specific glibc version. This, combined with the fact that Zig also is a C/C++ compiler, makes it possible to produce good builds for existing native software.
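To make the “directory of files identified by a hash” notion from the second point concrete, here is a rough sketch in Python - this is not Zig’s actual hashing scheme, just the general idea:

```python
# Sketch only: identify a dependency as "whatever directory of files hashes to X",
# regardless of whether it arrived as a tarball, a git checkout, or a prebuilt binary.
import hashlib
from pathlib import Path

def directory_digest(root: str) -> str:
    h = hashlib.sha256()
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Hash both the relative path and the contents, so renames change the digest.
            h.update(str(path.relative_to(root)).encode())
            h.update(path.read_bytes())
    return h.hexdigest()

# Example usage (the path is a placeholder):
# print(directory_digest("vendor/some-dependency"))
```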
I like the theoretical idea of Zig being dependency zero, but in practice this ends up being horrible: if your toolchain is your bootstrap point, you’re chained at the waist to whatever version of the compiler you happen to have installed. Compare this to rustup, which allows a single installation but will auto-detect and install the toolchain version required for each project. It’s not just rustup either: there’s a reason that games (Veloren for example) separate the main body of the code from their installer/launcher: it allows the former to have a higher update cadence than the latter without enormous annoyance for the user.
One elegant solution is to not install anything! I don’t have zig in my PATH, I always use ./zig/zig to run stuff. For example, hacking on TigerBeetle is
Having a tiny .sh/.bat to download the thing is not elegant, but is not too bad. Certainly simpler than the rustup installer!
Actually, I should have added “Zig promotes local installs” to the list above: Zig’s pretty clear that make install is generally a bad idea, and that you should install stuff locally more often.
And then, as kristoff says and Go demonstrates, nothing prevents the toolchain from updating itself. Zig already JITs some lesser-used commands (the compiler ships its components in the form of source code), so it certainly can learn to upgrade itself.
build.zig.zon (the Zig equivalent of package.json) supports specifying a minimum supported version of the compiler toolchain. This field is currently not used by the toolchain, but there’s a proposal to have Zig download another copy of Zig when the installed version doesn’t satisfy the constraint declared in build.zig.zon.
This is not true; the toolchain can also be responsible for managing versions, especially since everything in Zig is also compiled statically, a.k.a. how Go does it: https://kokada.dev/blog/quick-bits-go-automatically-downloads-a-newer-toolchain-if-needed/.
But even without adding this support in the toolchain itself, you could have a zigup project responsible for auto-detecting and installing the toolchain version required by each project.
That’s part of the issue with what Wedson presented, and that Ts’o reacted to. Wedson had a very nice slide with a call that returned a very extensive type definition and claimed that was good.
fn get_or_create_inode(&self, ino: Ino) -> Result<Either<ARef<INode<T>>, inode::New<T>>>
Don’t get me wrong, Ts’o’s reaction was very bad. He should have behaved better. But on the technical merit, I think Wedson missed the mark.
I can’t see similar thing happening with Zig (at least not anytime soon) - not because you can’t do it, but because the ecosystem around the language seems to be allergic to cramming all things into the type system. Comptime presents itself as an easy escape valve to keep things simple.
What signature would you have preferred?
This feels similar-lish:
https://github.com/ziglang/zig/blob/6a21d18adfe9ae4ff7f4beacbd4faed4d04832b8/lib/std/mem.zig#L4079
But then, it needs to be compared with safe transmute, which is quite an infra…
I am still not sure what I think here. The signature is quite impenetrable either way! But then, the fact that I can just write the logic for “is this reasonable to transmute?” like this is neat:
https://github.com/tigerbeetle/tigerbeetle/blob/5b485508373f5eed99cb52a75ec692ec569a6990/src/stdx.zig#L312
As a Go user, a lot of problems other languages solve with complex type constraints I see get solved with func f(x any) { if !matchConstraint(x) { panic("type must foo and bar") … (I do it in my own code here.) In practice it usually isn’t a problem because you catch the panics with even minimal testing. It is unsatisfying though.
Using any has lots of downsides though; for one, you lose the help of the compiler and your tooling to e.g. auto-complete a method. It is fine for a self-contained method in a relatively small code base, but starts to get hairy as your code increases in complexity.
Wait until you learn about languages where there is nothing except any! It’s a crazy world out there.
I feel like this doesn’t entirely preclude better ergonomics though, at least if there were something like C++ concepts. Then you’d at least be able to observe the signatures to see what the expectations of the type are.
IME this actually doesn’t affect the day-to-day ergonomics that much; ADL fails are usually pretty obvious in client code unless you’re doing some REALLY cursed library stuff, and SFINAE errors are actually pretty decent these days. The big thing that wasn’t fixed until concepts was just goofs like “I thought I had a map and not a vector so I passed a value type as the allocator” and suddenly you have like a zillion errors and need to fish out the actual goof. Zig…well it kinda fixes that w/ generally shorter instantiation traces, due to enhancements like “actually having loops”, but it’s still not super great.
Note: while writing this reply I ended up encountering this blog post, Zig-style generics are not well-suited for most languages, which makes basically the same point.
I personally don’t find this to be a problem in C++ … Whenever I write generic code, I write a unit test, with an instantiation
IME, that completely solves the problem, and it’s not like I didn’t need to test that code …
This suggests that your programming style around generics is fairly simple, and therefore easy to test – there is not a lot of conditional logic in your generics that would require several distinct tests, etc. You would also do just fine in Zig if you were to write similar code. This is good news!
But several members of the C++ design community have spent a decade of their life working on C++ Concepts to solve these issues (the first proposal started in 2005-2006 I believe, it was planned in C++0x that became C++11, and then dropped because too complex, and then “Concepts Lite” appeared in 2016 but were rejected from C++17 and finally merged in C++20). I believe that this dedication comes from real user-stories about the perils of these aspects of C++ templates – which are largely documented online; there was a clear perceived need within the C++ community that comes from the fact that a lot of template code that many people are using was in fact much more complex than yours and suffered from these scaling issues.
Yeah definitely, I think it’s best to keep to simple generic code.
I don’t find the philosophy of “maxing out” compile time in C++ to be effective, and I don’t see the programmers I admire using it a lot, with maybe a few exceptions. (e.g. Carmack, Jeff Dean, Bellard, DJB, don’t really care about compile time programming as far as I can tell. They just get a lot of work done) There was also a recent (troll-ish) post by Zed Shaw saying that C++ is fun once you push aside the TMP stuff
All of Oils is written with Python as the metaprogramming language for C++, with textual code gen. Textual code gen takes some work, but it’s easy and simple to reason about.
IMO, it’s nicer than using the C preprocessor or using the C++ template system. (Although we also use the C++ template system for a few things – notably the compiler is the only thing that has access to certain info, like sizeof() and offsetof() )
The main thing that would make it better is if the C++ type system didn’t have all these HOLES due to compatibility with C! I mentioned that here:
https://www.oilshell.org/blog/2024/09/retrospective.html#appendix-more-viewpoints
https://lobste.rs/s/wtk2rk/types_comparison_rust_zig#c_5lbiuf
(although I also forgot to mention that the C++ type system is extremely expressive too, what I called “hidden static expressiveness”)
The other main downside is that you need a good build system to handle code gen, which is why I often write about Ninja!
So I might think of comptime as simply using the same language, rather than having the Python/C++ split. I can see why people might not like that solution, and there are downsides, but I think it works fine. The known alternatives have steep tradeoffs.
Are you familiar with Terra? https://terralang.org/
Sounds like you rolled something similar yourself, with Python as the meta language for C++.
Yeah I always found Terra very interesting, it’s linked here: https://github.com/oils-for-unix/oils/wiki/Metaprogramming
Unfortunately the implementation is “research-grade”: https://erikmcclure.com/blog/a-rant-on-terra/
Metaprogramming has a pretty common taxonomy where you decide which part of the compiler pipeline you hook into:
I don’t think any way is strictly better than the others – they all have tradeoffs.
But we are doing #1 and I think Lua/Terra is more like #4, or #3.
But spiritually you can write the same kinds of programs. It just means that we end up generating the source code of C++ functions rather than having some kind of API to C++.
Of course you often write little “runtime” shims to make this easier – that is sort of like your API to C++. The garbage collected data structures are the biggest runtime shim!
I do still think we need a “proper” language that supports this model.
It could be YSH – What if the shell and the C preprocessor were the same language? :-)
comptime extends this to all code; anything not reachable from an export, as a general rule. It’s what allows for the dependent-esque decision making, cross-compilation/specializing using normal control flow, and general reflection. Without it, other langs default to a secondary declarative system like Rust’s #[cfg()] or C’s #ifdef. It’s a bit restrictive though, so a higher level build script (like proc-macros in Rust) is used for similar comptime effect.
They technically can, given comptime reflection (i.e. @typeInfo) is available. It just ends up being quite verbose, so in practice most rely on duck typing instead.
Ergonomics are ok for now given you can
x: anytype. What sort of environments do you see it causing the most annoyance? Im thinking maybe for cases where people learn exclusively through an LSP.I’m not saying that all uses of
comptimeare bad, and maybe it’s nice when it replaces macros for conditional compilation. I was pointing out that it is probably not the magic answer to all problems about “generics and code inference” that should become “standard practice”.I would expect the usual downsides of late-type-checking of C++ templates to show up:
Tprovided does not offer operationfobar, is it a typo in the template code or a mistake of the caller? If you write generic code with, say, type-classes or traits (where the expected operations are explicitly listed in the class constraint / trait bound present in the generic code), you can tell when there is a typo in the generic code and not blame the caller.Maybe Zig has (technical or social) solutions to some or all of these problems, and/or maybe the people explaining how great
comptimeis are too naive about this. If there is some secret sauce that makes this all work well, then it should be carefully explained and documented along the explanation of how simplecomptimeis; this is important in the context of encouraging other languages to adopt this approach (as done in the post I was replying to), they need to also understand the pitfalls and how to avoid them.The first point is sometimes an issue for cross-compilation especially. Zig’s ability to do so from any machine (
-target whatever) makes it easier to test locally but in practice this error is often caught by CI.Error messages are surprisingly readable; “Type does not provide operation” is usually the caller’s fault (genuinely never seen it be the callee’s - whats an example of that?) and can be figured out through docs or variable naming. A good example of this is forgetting to wrap the format args parameter for
std.fmt.formatin a tuple.Comptime speed does indeed become noticeable in larger projects. But its primarily due to reflection and constexpr execution rather than type-checking (AFAIK that part alone is always fast even for multiple nested instantiations).
I dont think there’s secret sauce. You tend to either 1. not run into these 2. figure them out intuitively / with a little help due to lacking documentation or 3. cannot adjust / dont prefer it, having come from other langs. After the first or second time hitting them, it becomes a non-issue (like eager-PRO). It’s similar to how zig-folk recommend reading the stdlib to learn the language; wouldn’t really be a good idea other languages like C++, Rust, etc. but makes perfect sense (and works) in Zig.
I once tried using the stdlib JSON decoder to decode some structure that contained std.hash_map.HashMap. Let’s just say that the error wasn’t clear at all about why it was happening and how I could resolve it. It is especially painful when it happens deep within the obscure internals of something that is mostly out of your control.
Zig is nice, but yeah, their generic errors make me remember old C++ templating failures.
Curious what it looked like. The worst cases IME are when it doesn’t print the trace due to compiler bugs, or when it gives error: expected type 'T', found 'T', which is pretty unclear. Or the slightly easier (but still annoying) case of expected T(..., a, ...), found T(..., b, ...).
Imaginary scenario: for scientific programming, there is a naming convention for types that satisfy a certain set of operations, which comes from the important (imaginary) library ZigNum. Your project is using library Foo, which implements datatype-generic algorithms you care about, and also ZigNum directly – calling generic functions from both on the same types. The next version of ZigNum decides, for consistency reasons, to rename one of the operations (their previous name for the cartesian product was a poor choice). Then your code starts breaking, and there are two kinds of errors that are displayed in exactly the same way: mistakes in your own calls into ZigNum, and breakage inside Foo that is not your fault but surfaces in your code.
If ZigNum would export a [type-class / module interface / concept / trait] that describes the operation names it expects, then compiling Foo against the new version of ZigNum would have failed, blaming the code (in Foo) that needs to be updated. Instead the error occurs in your own user code. If the person encountering this failure happens to be familiar with ZigNum and Foo, they will figure things out. Otherwise they may find this all fairly confusing.
Is ZigNum here an interface or an implementation? Having difficultly following the confusion: “A uses B. I just updated B and compiler says A started breaking on B stuff. I probably need to update A too.” seems like a fair reaction.
The error would occur in Foo, but zig errors are stack traces so the trace would lead back down to your code. This scenario however still looks like its the caller’s fault for passing incompatible types to a library. Something similar can also happen when you compile a zig 0.11.0 codebase (with dependencies also expecting a 0.11.0 stdlib) using a zig 0.13.0 compiler.
Well articulated and leads to a fascinating bit of reasoning.
First thought is, “well, you need multiple phases then.” Execute the comptime code, settle on what it produces, and then typecheck the code that relies on it.
But a moment’s thought shows that: (a) you’d eventually need 3 levels, sometimes 4, etc. (b) … and this is really just plain old code generation!
So we of course face the same fundamental issue as always. Wording it in terms of Lisp macros, you need to be able to flip back and forth between using the macro, and seeing/understanding the full expansion of the macro.
What we need is an outstanding solution to that fundamental problem.
C++ is gradually adding things that you’re permitted to do in constexpr contexts. You can now allocate memory, though it must be destroyed at the end of the constant evaluation. Generalising this much further is hard because the language needs to be able to track pointers and replace them with relocations, which is not possible in the general case for C/C++ (pointers can be converted to integers and back and so on). That’s mostly solvable, but there are also problems with things like trees that use addresses as identity, because even the relationship between the addresses at compile time isn’t constant: if I create two globals and put them in a tree, after COMDAT merging and linking I don’t know which will be at a lower address.
Being able to merge the expression and type languages is very nice, but you either need to be able to create types at run time, or have a subset of the expression language that is only available at compile time. Smalltalk takes the former approach, C++ is quite heavily entrenched in the latter.
Alex/Carson – you’ve been implicitly pushing the idea that “complete” is a worthy goal, and this finally makes it explicit. Software driven by a cohesive philosophy always shines, and htmx is no exception. I’m very appreciative of what you are doing, along with the rest of the htmx contributors.
any other examples of such software available in public?
sqlite?
I don’t know if it was ever made formal policy, but I seem to remember that at one point a lead maintainer of RSpec opined that version 3 was basically the final form of the project. Upgrades have been pleasantly unexciting for about a decade now.
It’s not that they never add features. It’s that, at least since sqlite 3 (so for the past 20 years):
I think sqlite is very much an example of software driven by the cohesive philosophy that @jjude asked about. As @adriano said, it’s not necessarily feature complete, but features are added very carefully and deliberately. There aren’t many things I’m as confident using as I am in sqlite. It makes me happy that htmx (another thing I like a lot) aspires to that, but it’s got to keep going a long time to prove it’s in the same league. (I suspect it will.)
I’m not sure whether sqlite has a cohesive philosophy, but note that @jjude’s question, as I understand it, is about software with a cohesive philosophy; not necessarily software with feature completeness as its philosophy.
If I were to guess what the sqlite authors’ philosophy might be, it’s that the world needs a high quality SQL implementation that remains in the public domain.
TeX - Knuth said many many times that it was feature-complete.
And yet the default configuration flames anyone daring to have a non-ASCII character in their own name.
Stability also means that defaults in many cases can’t be changed, otherwise you could break existing users.
Thus highlighting the issues with “feature-complete for stability’s sake”.
Things change. It is the one constant. If the software is static in a sense, then rolling a new major version or forking with this kind of “fix” is both reasonable and necessary for the long term needs.
From the answers it seems the boring tech would be: go + htmx + sqlite.
Thanks.
Common Lisp, I think. If I were starting a fresh Web project I’d look to Common Lisp + htmx. Probably SBCL.
Emacs. You can often find a package written (and forgotten) easily over ten years ago and it’s highly likely it will just work.
Go. Python used to be like that too before it got huge.
I also have rosy memories of the last good Python in my mind (2.5), but realistically, it was always accreting features at pretty fast rate. It’s just a lot of people got stuck on 2.7 for a long time and didn’t observe it.
You have to go back further than that IMO. Pre-2.0 was when you could argue that Python was more or less adhering to its own Zen. To me the addition of list comprehensions marks the time when that ship has set sail.
I would say the “unix philosophy” is the most central one, guiding hundreds of terminal apps in the POSIX standard set and beyond.
Common Lisp, perhaps?
Hare, although it’s not finished yet. https://harelang.org/blog/2023-11-08-100-year-language/
Didn’t Paul Graham once say that about his lisp?
One thing that I found interesting is that many changes in each Ruby release would be considered a big no for any other language (for example the “Keyword splatting nil” change) since the possibility of breaking existing code is huge, but Ruby community seems to just embrace those changes.
I always think about how, in the Python 2 to 3 transition, the major change was adopting UTF-8 and everyone lost their minds thanks to the breakage, yet Ruby did a similar migration in version 2.0 and I don’t remember anyone complaining.
I am not sure if this is just because the community is smaller, or if the developers of Ruby are just better at deprecating features, or something else. But I still find it interesting.
Ruby’s version of the Python 2 to 3 experience (by my memory) came years earlier, going from 1.8 to 1.9. It certainly still wasn’t as big of an issue as Python’s long-lingering legacy version, but it was (again, my perception at the time) the Ruby version that had the most lag in adoption.
Yes, and it was very well managed. For example, some changes were deliberately chosen in a way that you had to take care, but you could relatively easily write Ruby 1.8/1.9 code that worked on both systems.
The other part is that Ruby 1.8 got a final release that implemented as much of the 1.9 stdlib as possible. Other breaking things, like the default file encoding and so on, were gradually introduced. A new Ruby version is always some work, but not too terrible. It was always very user centric.
It was still a chore, but the MRI team was pretty active at making it less of a chore and getting important community members on board to spread knowledge and calm the waves.
Honestly, I think Ruby is not getting enough cred for its change management. I wish Python had learned from it, the mess of 2 vs 3 could have been averted.
Yep, that’s my take too. IIRC 1.9 had a number of breaking API changes which were really low value. For instance, File.exists? -> File.exist?
File.exists? started emitting deprecation warnings in Ruby 2.1 (2013) and was finally removed in Ruby 3.2 (2022)
I guess IDRC!
I feel like Python was pretty deeply ingrained in a bunch of operating systems and scripts that was excruciating to update.
Ruby is mostly run as web apps
Interesting POV. As a long-time Rubyist, I’ve often felt that Ruby-core was too concerned with backwards compatibility. For instance, I would have preferred a more aggressive attempt to minimize the C extension API in order to make more performance improvements via JIT. I’m happy to see them move down the path of frozen strings by default.
Like others already said, the Ruby core team’s stance is almost exactly the opposite: it is extremely concerned with backward compatibility and not breaking existing code (to the extent that during discussion of many changes, some of the core team members run grep through the codebase of all existing gems to confirm or refute an assumption about the required scale of the change).
As an example, string literal freezing was discussed for many years and attempted before Ruby 3.0, but was considered too big a change (despite the major version change); only a pragma for opt-in was introduced, and now the deprecation is being introduced on the assumption that the existence of the pragma has prepared most codebases for the future change. This assumption was recently challenged, though, and the discussion is still ongoing.
The keyword splatting nil change might break only code that relies on the impossibility of nil splatting, which is quite a stretch (and the kind of breakage that is considered acceptable in order to make any progress).
This seems like really easy code to write and accidentally rely on.
If nil was expected but was just rolled up into the general error handling, this code feels very easy to write.
Well… it is relatively easy to write, yes, but in practice, this exact approach (blanket error catching as a normal flow instead of checking the argument) is relatively rare—and would rather be a part of an “unhappy” path, i.e., “something is broken here anyway” :)
But I see the point from which this change might be considered too brazen. It had never come up during the discussion of the feature. (And it was done in the most localized way: instead of defining nil.to_hash—which might’ve behaved unexpectedly in some other contexts—it is just support for **nil on its own.)
I have to doubt that. It’s extremely common in Python, for example, to catch ‘Exception’, and I know myself that when writing Ruby I’ve caught StandardError.
I don’t have strong opinions.
I don’t mean catching StandardError is rare, I mean the whole combination of circumstances that would lead to “nil was frequently splatted there and caught by rescue, and now it is not raising, and the resulting code is not producing an exception that would be caught by rescue anyway, but is broken in a different way”.
But we’ll see.
But this doesn’t really matter, because there are always huge proprietary codebases that are affected for every change and you can’t run grep on them for obvious purposes. And those are the people that generally complain the most about those breaking changes.
Well, it matters in a way that the set of code from all existing gems covers a large share of the possible approaches and views on how Ruby code might be written. Though, of course, it doesn’t exclude some “fringe” approaches that never see the light outside the corporate dungeons.
So, well… From inside the community, the core team’s stance feels like pretty cautious/conservative, but I believe it might not seem so comparing to other communities.
It doesn’t seem anything special really. Of course Python 2 to 3 was a much bigger change (since they decided “oh, we are going to do breaking changes anyway, let’s fix all those small things that were bothering us for a while”), but at the tail end of the migration most of the hold ups were random scripts written by a Ph. D. trying to run some experiments. I think if anything, it does seem to me that big corporations were one of the biggest pushers for Python 3 once it became clear that Python 2 was going to go EOL.
I’d say that the keyword splatting nil change is probably not as breaking as the frozen string literal or even the it change (though I do not know the implementation details of the latter, so it might not be as breaking as I think). And for frozen string literals, they’ve been trying to make it happen for years now. It was scheduled to be the default in 3 and was put off for 4 whole years because they didn’t want to break existing code.
Over the years I feel like Ruby shops have been dedicated to keeping the code tidy and up-to-date. Every Ruby shop I’ve been at has had linting fail the build. Rubocop (probably the main linter now) often comes out with rule adjustments, and often they have an autocorrect as well, making it very easy to update the code. These days I just write the code and rubocop formats and maybe adjusts a few lines; I don’t mind.
From what I remember, UTF-8 itself wasn’t the problem— most code was essentially compatible with it. The problem was that in Python 2 you marked unicode literals with u"a u prefix", and Python 3 made that a syntax error. This meant a lot of safe Python 2 code had to be made unsafe in Python 2 in order to run in Python 3. Python 3.3 re-added the u prefix just to make migrations possible.
On top of that, Python 3 had a lot of other breaking changes, like making print() a function and changing the type signatures of many of the list functions.
As someone who was maintaining a Python package and had to make it compatible with 2 and 3, it was a nightmare. For instance, the try/except syntax changed.
Python 2:
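(Illustrative snippets; do_something and handle are placeholders rather than anything from the original comment.)

```python
try:
    do_something()
except ValueError, e:   # Python 2's comma form
    handle(e)
```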
Python 3:
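```python
try:
    do_something()
except ValueError as e:  # Python 3's "as" form
    handle(e)
```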
Basically the same thing, but each is a syntax error in the other version, which was a nightmare to handle. You can argue that the version 3 form is more consistent with other constructs, but it’s hard to believe it would have been particularly hard to support both syntaxes for a while to ease the transition.
Ruby changes way more things, but tries its best to support old and new code for a while to allow a smooth transition. It’s still work to keep up, but it’s smoothed out over time, making it acceptable to most users.
It’s been a while, and I was just starting out with Python at the time, so take this with a grain of salt, but I think the problem was deeper than that. Python 2’s unicode handling worked differently to Python 3, so even when Python 3 added unicode literals, that didn’t solve the problem because the two string types would still behave differently enough that you’d run into compatibility issues. Certainly I remember reading lots of advice to just ignore the unicode literal prefix because it made things harder than before.
Googling a bit, I think this was because of encoding issues — in Python 2, you could just wrap things in unicode() and the right thing would probably happen, but in Python 3 you had to be more explicit about the encoding when using files and things. But it’s thankfully been a while since I needed to worry about any of this!
My recollection at Dropbox was that UTF-8 was the problem, and the solution was basically to use mypy everywhere so that the code could differentiate between utf8 vs non-utf8 strings.
In my experience the core issue was unicode strings and the removal of implicit encoding / decoding, as well as updating a bunch of APIs to try and clean things up (not always successfully). This was full of runtime edge cases as it’s essentially all dynamic behaviour.
Properly doing external IO was of some concern but IME pretty minor.
This is why I said the “major” change was UTF-8. I remember lots of changes were trivial (like making print a function; you could run 2to3 and it would mostly fix it except for a few corner cases).
To me, the big problem wasn’t so much converting code from 2 to 3, but making code run on both. So many of the “trivial” syntax changes were actually very challenging to make work on both versions with the same codebase.
It was a challenge early on, after ~3.3 it was mostly a question of having a few compatibility shims (some very cursed, e.g. if you used exec) and a bunch of lints to prevent incompatible constructs.
The string model change and APIs moving around both physically and semantically were the big ticket which kept lingering, and 2to3 (and later modernize/ futurize) did basically nothing to help there.
It wasn’t an easy transition. As others said, you’re referring to the 1.8-1.9 migration. It was a hard migration. It took around 6-7 years. An entirely new VM was developed. It took several releases until there was a safe 1.9 to migrate to, which was 1.9.3. Before that, there were memory leaks, random segfaults, and one learned to avoid APIs which caused them. Because of this, a big chunk of the community didn’t even try 1.9 for years. It was going so poorly that github maintained a fork called “ruby enterprise edition”, 1.8 with a few GC enhancements.
In the end, the migration was successful. That’s because, once it stabilised, 1.9 was significantly faster than 1.8, which offset the incompatibilities. That’s why the Python migration failed for so long: all work and no carrot. For years, Python 3 was the same order of performance or worse than Python 2. That only changed around 3.5 or 3.6.
Fwiw the ruby core team learned to never do that again, and ruby upgrades since 1.9 are fairly uneventful.
Minor correction: Ruby Enterprise Edition was maintained by Phusion (who did Passenger), not GitHub.
Ruby 2 was a serious pain for many large projects. Mainly with extensions behaving slightly differently with the encoding. I remember being stuck on custom builds of 1.9 for ages at work.
I have a really mixed feeling about built-in TOML support in Python. It came from the
tomliproject that, for reasons beyond my understanding, was split in two sub-projects:tomliitself for parsing andtomli-wfor printing. The now-official tomllib module only contains a parser. You still need to install tomli-w if you want for printing, so the module doesn’t follow the familiarload(s)anddump(s)API — it only provides theload(s)part.Well, I have a mixed feeling about TOML itself. It’s neither obvious nor minimal in its current version, it has really weird ideas — my pet peeve is the requirement not to have line breaks in inline tables (
foo = { bar = 1,\n baz = 2 }must cause a parse error). But from widely-known and widely-supported formats, I’ll take it over YAML any time.I wish UCL had seen more love. There are a few things I don’t like. In particular, it’s tied to the JSON model and so things like durations and sizes are just lowered to numbers, which is annoying loss of units (if you write 1 kb in a field that expects a duration, you will read it as 1024 seconds).
The thing UCL gets right is having a composition model built in. You can define two UCL files and how to combine them, so you can have things like defaults, per-system default config, and per-user config with well-defined merging rules. This makes it very easy to build immutable system images: user config is not part of the system image and can be in a completely separate path, but the defaults are all visible in the default config file.
The implementation is not the best. I’d love to see Rust and Go reimplementations, and C bindings for the Rust version, and a v2 UCL that has a richer data model (dates, durations, file sizes, and 64-bit integers as first-class citizens, in particular).
The biggest problem now seems to be the coordination problem of how to get off of YAML, which everyone agrees is bad but no one can agree what to replace it with. Maybe some large project could create SYAML (Sane subset of YAML) and get it widely adopted, but I’m not hopeful.
Apple open sourced something recently which has a good set of data types (a superset of the property list types) but doesn’t do the composition thing as well as UCL. My requirements for the format are:
A lot of things give me 60-80% of this. I’d love to get a group of people who care about config files together to define something that meets 100% of them plus the important ones that I’ve missed. None of the existing formats can win because they’re all only solving part of the problem and so do not work well in the situations where the other parts of the problem are more important.
Try these:
As well as I understand, all they can generate JSON and other simple formats
I mean, that’s more or less what StrictYAML is. I have no idea how widely adopted it is though.
UCL looks nice. https://kdl.dev/ is also a good choice.
I feel that that TOML, which does not support nested data structures well, gets more traction, though I agree that many applications do not need nested data structures.
The reason that the stdlib has no support for writing is because it is not obvious what to do with e.g. comments and preserve the style (should it preserve the current indentation of the file? should it reformat?). Here is the PEP talking about why not support writing.
And the reason for support is pretty obvious:
pyproject.toml. Before tomllib there was no way to parse pyproject.toml using the stdlib (while parts of the stdlib had to parse TOML already, e.g. pip). The fact that you can also use tomllib to parse other TOML files is a plus.
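For completeness, a small sketch of what that looks like today (assuming a pyproject.toml with a [project] table; tomllib is stdlib from Python 3.11, while writing still needs the third-party tomli_w):

```python
import tomllib        # stdlib since Python 3.11, read-only
import tomli_w        # third-party package, needed only for writing

with open("pyproject.toml", "rb") as f:   # tomllib.load wants a binary file object
    project = tomllib.load(f)

print(project["project"]["name"])

# Going the other way only works with tomli_w; comments and formatting are not preserved.
print(tomli_w.dumps({"tool": {"example": {"answer": 42}}}))
```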
json.dumps()was never a question, I don’t get how it’s suddenly more important for TOML. ;)Printing Python data structures that don’t contain objects or functions as TOML is exactly as straightforward as printing them as JSON. Style-preserving libraries is a whole different genre anyway.
*.dumps()has no concept of the “original file”, its input is a Python data structure.Because TOML is supposed to be edited by humans, while JSON doesn’t. Opening a TOML file that you wrote with Python, just to see your comments vanish would be bad.
The use cases are different so there are different considerations.
It would be bad if a developer of an application that is supposed to preserve comments and style when handling TOML files used a library that wasn’t designed to do that. It would also be bad if a library didn’t document that.
The
load(s)part oftomlidoesn’t preserve semantically-insignificant information so it’s a limited implementation as well. And that’s fine because for the purpose of mapping TOML data to Python values, comments and formatting are not important. But they are also unimportant for mapping Python types to TOML files because Python values have no concept of comments or formatting attached to them.All good points, but this doesn’t change that someone needs to talk about those concerns and someone needs to have the desire to implement a library even considering those concerns. And if you look at the PEP link that I posted, nobody wants to talk about it or implements it. So the current solution, that I found perfectly fine.
The author proposes that JSON parsers should “accept comments”, without specifying what that means. Should we care about interoperability between different JSON parsers?
For a different perspective, read Parsing JSON is a Minefield 💣. Quote:
My contribution is that non-standard extensions like “comments”, especially in the absence of a precise specification, make this problem worse. Different parsers that “accept comments” will probably not agree on comment syntax, and that this may cause problems or even open security holes.
For example, consider a Javascript “//” comment that continues to the end of the line. If the comment contains a CR character (aka ‘\r’ or U+000D), and the CR is not immediately followed by NL (aka ‘\n’ or U+000A), then does the CR terminate the comment, or does the comment continue to the next NL character?
It’s not just theoretical. I skimmed the code – the linked parsers seem to have exploitable differences in how they handle line endings. In addition to \r and \n, Json5 supports \u2028 and \u2029 to end lines; dkjson.lua does not. Therefore:
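A document along these lines (an illustrative stand-in, not the exact payload from the original comment, written as a Python literal so the invisible characters are explicit):

```python
# A // comment whose "line" is terminated only by U+2028: parsers that follow
# JSON5's ECMAScript-style line terminators stop the comment there and see the
# key, while parsers that only recognise \r / \n swallow the key into the comment.
doc = '{ // note\u2028 "key": 1 \n}'
```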
will be interpreted differently by the two parsers. Swift Foundation implements a 3rd behavior; it assumes ‘\n’ and ‘\r\n’ are the only valid line endings, so:
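Again an illustrative stand-in, this time with a bare carriage return:

```python
# JSON5, dkjson.lua and SQLite end the comment at the lone \r and parse the key;
# a parser that only accepts \n or \r\n keeps reading the comment until the \n.
doc = '{ // note\r "key": 1 \n}'
```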
will be a 1 entry object when parsed by JSON5, dkjson.lua, and Sqlite, but empty when parsed by Swift.
I couldn’t be bothered to dig through the Jackson code.
I think there are 3 different use cases for JSON here:
_comment: "my comment"), since you never know if the other implementations will have similar behavior.
This also depends on the context. For the third use case, in many cases if you have the privilege of setting a configuration file you already lost.
Using comments in your JSON doesn’t mean including comments in spec-compliant JSON!
The gift the JSON spec is giving you is that by not allowing comments it’s telling you what to do with the comments: throw them in the garbage in some step before they get to the real JSON parser. In this way, many kinds of comments can be supported (properly, without vulnerability) because once they’re gone they really can’t possibly mean anything (solving the security problem).
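A minimal sketch of that pre-processing step (the function name and the restriction to //-style line comments are my own choices, not something prescribed above): strip comments that sit outside of strings, then hand the result to an ordinary strict JSON parser.

```python
import json

def strip_line_comments(text: str) -> str:
    """Remove //-style comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text[i:i + 2] == "//":
            # Drop everything up to (but not including) the next newline.
            nl = text.find("\n", i)
            i = len(text) if nl == -1 else nl
        else:
            out.append(ch)
            i += 1
    return "".join(out)

doc = """
{
  // comments only exist for the pre-processing step
  "key": "value // not a comment, it is inside a string"
}
"""
print(json.loads(strip_line_comments(doc)))
```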
JSON is a very forgiving language syntactically so
%#//and/* */<!-- -->are all equally valid ways to add unambiguous comments to your JSON. That is to say, since these syntaxes don’t conflict with JSON’s own syntaxes, there’s no chance of a real confusion where a comment is interpreted as valid data or visa versa. The fact that JSON absolutely rejects comments and other unexpected syntax is, paradoxically, what means that it works with every flavor of comments so long as you can define what it means to strip them out.I think you missed the point, which was that anyone implementing such a lax parser, whether that’s via a “stripping function” or a parser which skips them natively, will do it in a slightly different way and this can cause conflicts which may be exploited. This would simply add more problems to the existing security issues JSON already has, like where parsers disagree on what to do with non-unique keys.
Can you provide a concrete example? In my mind the vulnerability is completely addressed by a stripping preparser (in a way that it is very much not addressed by a lax parser)
orib’s example of existing JSON5 parsers parsing comments differently could just as easily happen with multiple comment-stripping JSON pre-parsers. The pre-parsers could disagree on whether to strip text after a
// and after one of \u2028 or \r but before \n.

I see, yeah, that could happen. It would certainly be less of a risk if the pre-parser was highly reusable, even across different languages.
To be specific, the reason the problem goes away is that after the stripping pre-parser is done, the 100% formal, very standard JSON parser takes over, bringing with it the exact same set of security guarantees that we have currently.
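As an illustration of that division of labor, a stripping pre-parser can be a small, dumb function that only knows how to find comments outside of strings, while everything else stays with the stock parser. This is a sketch, not a hardened implementation:

    import json

    def strip_comments(text: str) -> str:
        """Remove // and /* */ comments that appear outside of JSON strings."""
        out = []
        i = 0
        in_string = False
        while i < len(text):
            ch = text[i]
            if in_string:
                out.append(ch)
                if ch == "\\" and i + 1 < len(text):
                    out.append(text[i + 1])  # keep the escaped character, e.g. \"
                    i += 1
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
                out.append(ch)
            elif text.startswith("//", i):
                nl = text.find("\n", i)
                i = len(text) if nl == -1 else nl - 1  # the newline itself is kept
            elif text.startswith("/*", i):
                end = text.find("*/", i)
                i = len(text) if end == -1 else end + 1  # unterminated: drop the rest
            else:
                out.append(ch)
            i += 1
        return "".join(out)

    config = """
    {
        // where to listen
        "port": 8080,  /* overridden in prod */
        "endpoint": "https://example.com/api"
    }
    """
    print(json.loads(strip_comments(config)))  # {'port': 8080, 'endpoint': 'https://example.com/api'}

Once the comments are gone, the bytes handed to json.loads are plain, spec-compliant JSON, so the standard parser’s guarantees apply unchanged.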
I’d rather have the implementation be a little more aggressive, like re-wrapping text after removing syntax markers (soft-wrap source, hard-wrap output, like [sometimes] markdown), and providing a way to line up the second line of the initial arguments breakdown that looks good on both ends – but these are minor nits and all the more reason to support the creation of small, low-dep, opinionated tools. More like this please! And it’s great to see someone taking the time to highlight a tool that brought them joy.
It actually rewraps the text; I am doing line breaks at 80 columns for legibility, but it doesn’t need to be so (actually I am not completely sure about this: I read somewhere that the format expects a break every 80 columns, but I remember going past that a few times without issue).
I have had a similar experience: Python type hints improve code quality, even when it sometimes feels like you’re writing code only to make mypy happy.
Maybe it is because I was writing code that was too smart for its own good and making the type checker unhappy, and I ended up refactoring in some way that is less smart but passes the type checks, and in most cases I end up thinking “ok, this is probably for the better”.
It may be because I was making some assumptions (e.g. a regex match that can return either the match or None, where in my particular case it can’t ever be None, but the type checker can’t know this), so I end up adding an assert in the code that expresses “if this ever happens, something went terribly wrong”.
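A small sketch of that assert pattern (the function and the regex here are made up for illustration):

    import re

    def version_major(tag: str) -> int:
        # re.match() is typed as returning "re.Match[str] | None"; in this code the
        # tags are machine-generated and always match, but the type checker can't know that.
        match = re.match(r"v(\d+)\.\d+\.\d+", tag)
        assert match is not None, f"unexpected tag format: {tag!r}"
        return int(match.group(1))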
It may be some explicit conversion from Path to str that at least makes me think “huh, this could go wrong if the user has a non-UTF-8 encoding set in their filesystem”; that may well be something I am ignoring now, but if it ever causes issues I know where to look for fixes.

mypy definitely is far from perfect, but for me it is a net improvement for anything that is non-trivial.
For those starting in NixOS (for Nix inside macOS or other distros this doesn’t make much sense since you can just use plain VSCode), there is also
vscode-fhs, which sets up VSCode inside a buildFHSUserEnv that “emulates” a traditional distro filesystem and lets you use technically any extension from the Marketplace without patching.

Keep in mind that this is impure, so it goes somewhat against the principles of NixOS (and it means you can’t, e.g., rely on external CIs to build your configuration for caching), but it works, and to get someone up to speed it may well be a good use of your time until you understand how to do packaging, etc.
This is precisely where the conventional type safety wisdom conflicts with the apparent Go philosophy. The “whole purpose of having a type system” is not for having the compiler check things for you. That’s one purpose. But it certainly isn’t the only one and isn’t necessarily the most important one either. There is a balance between what’s worth checking or not vs other concerns.
The “simplest” approach to solving the problem discussed in this article is to add some type casts and move on. Avoid attempting to solve a puzzle you’ve created for yourself.
I’m curious what you think the point of a type system is. Because when I use a dynamically typed language, what I miss is the automatic checking. When I’ve seen people just try to use type annotations as documentation in, say, Python and then later switch to actually checking them with mypy, it often turns out they got the annotation wrong, and mypy is effectively catching a documentation bug. So I really don’t see the value without the checking part.
Purposes of the type system, in order of importance:
I would emphasize two of those points as being more significant than they sound from that list.
First, function type signatures aren’t merely a useful thing to read; they’re computer-verified documentation. For example:
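A minimal sketch of what such a signature buys you (the function and types here are hypothetical):

    from datetime import datetime

    def parse_log_line(line: str) -> tuple[datetime, str]:
        """Split a raw log line into its timestamp and message."""
        timestamp, _, message = line.partition(" ")
        return datetime.fromisoformat(timestamp), message

The signature alone tells a reader what goes in and what comes out, without opening the body.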
And you can’t forget to update these docs because the compiler checks them for you.
Second, the sorts of errors that type systems eliminate aren’t just null pointers and trivial errors like typos. The sorts of bugs that a type system can catch can be arbitrarily complex. It only seems like it’s preventing local bugs because the type system makes them local. Without type signatures, they aren’t local.
What I am saying is that there are diminishing returns here: most of the value is in detecting simple bugs fast, not in preventing complex bugs.
I really need to get around to writing a proper post here, but, generally, the type system as a bug-prevention mechanism only works for code that you haven’t run even once. So, error handling and concurrency. For normal code, the added safety is marginal, and is dwarfed by dev tooling improvements.
Based on the most common use of generics in the most commonly used language with generics (Java) I would say this is empirically incorrect. The top two motivations are clearly: tooling assistance, and preventing putting values of the wrong type inside generic collections.
But Java got generics only in version 5! Hence it follows that the things Java-style generics are useful for are not the top priorities for a type system!
The top priorities of an (uninvented) type system are the properties desired by the inventors of the type system.
The top priorities of users with respect to type systems are the revealed preferences of the users of type systems.
Which was 18(!) years ago.
Also, your other list of priorities seems rather subjective. I feel that checking many different properties at compile time is by far the biggest benefit of a type system, far above performance under most circumstances.
Counterargument: Typescript, Mypy, Sorbet, even Gleam all provide type systems that don’t and cannot make the code faster. I think there’s also an important case between “detect nulls” and “units of measure” which is features like ADTs that allow for the whole “make invalid states impossible” thing to happen.
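A sketch of that kind of feature in Python terms, with hypothetical types: a tagged union plus match makes a “loaded without data” state simply inexpressible.

    from dataclasses import dataclass

    @dataclass
    class Loading:
        pass

    @dataclass
    class Loaded:
        data: bytes

    @dataclass
    class Failed:
        error: str

    # A fetch is exactly one of these; there is no way to be "loaded" without
    # data, or "failed" without an error message.
    FetchState = Loading | Loaded | Failed

    def render(state: FetchState) -> str:
        match state:
            case Loading():
                return "spinner"
            case Loaded(data):
                return f"{len(data)} bytes"
            case Failed(error):
                return error
        raise AssertionError("unreachable")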
More fundamentally, I think this is a roughly accurate list of how much different features of a type system are used, but I think it’s not necessarily right to call that importance. For example, Typescript is definitely designed to support LSP features in Javascript even when the type system isn’t fully there, and it often gets sold on the typo handling stuff — these are key features to Typescript’s success. But Mypy can do this stuff as well, yet my impression is Mypy hasn’t found half as much success in Python codebases as Typescript has in Javascript codebases.
I suspect this is because Typescript does a much better job of modelling the implicit types that already existed in Javascript codebases than Mypy does for Python. (This is partly because Typescript is more powerful, but I suspect also because Javascript is less dynamic, and so easier to model.) This, then, is the more important feature of a type system: it can model the problems that developers are using it for. If it can’t do that, then it is not fit for purpose and will be adjusted or replaced. But if it is sufficient for that purpose, then people will use it in roughly the order you describe in your list.
That said, I do think “the problems that developers are using it for” is such a broad statement that it will look differently for different languages. For example, you probably don’t want to model complex concepts in C, so C’s type system can remain relatively simple. Whereas modelling functions of complex states in a way that prevents errors feels like a definitional statement for functional programming, so MLs and friends will typically have much more complex type systems.
How this applies to Go, though, I’m not sure. Go’s designers definitely want their types and modelled concepts to be as simple as possible, but the way they keep on adding more complex features to the language suggests that they’ve not found quite the right local maximum yet.
Yes, they go after the second priority — dev tooling.
You don’t need types here! Erlang has sum types, and it doesn’t have a null problem, but it also doesn’t have types.
But the extent to which they are successful at it depends — in my experience — more on the later priorities than the earlier ones. That is to say: dev tooling by itself is a high priority. But to get that dev tooling, you need to be able to correctly model the concepts that your users want to model. Be that complex dynamic types as in Python or Typescript, data types as in many functional languages, or lifetimes as @withoutboats points out in a sibling comment. If you can’t model that (and I don’t think e.g. mypy does model that very well), then the type system isn’t very useful.
This is why I think your list matches what users of a type system want to use, but doesn’t necessarily match the priorities from a language designer perspective.
I’m a bit sceptical here, but I admit I have almost no practical experience with Erlang & friends. I’ve used sum types plenty in Javascript, though, and in my experience they work okay up to a point, but they’re so much more usable with a type system to validate things.
Erlang’s type annotations support union types but not sum types. It’s a dynamic language so you can pass
nil to a function that expects a tuple, which is just like the null problem – though I suspect that Erlang’s usual coding style makes it less of a problem than in other languages. I don’t know if Dialyzer is strict enough that you can use it to ensure invalid states are unrepresentable.

No, Erlang is qualitatively different from Java and Python with respect to null.
If something is nullable in Erlang, then the non-null case is represented as {ok, Value} rather than just Value. That is, if you plug a nullable result of a function into a non-nullable argument, the thing blows up even if non-null is returned!
In contrast, Java/Python only blow up when null/None actually happen.
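A rough Python rendering of that convention (this is not Erlang, just an illustration of the shape being described):

    USERS = {1: {"name": "alice"}}

    def find_user(user_id: int):
        # Erlang-style tagging: success is ("ok", value); absence is "error".
        user = USERS.get(user_id)
        return ("ok", user) if user is not None else "error"

    def greet(user) -> str:
        # Expects the *unwrapped* user, not the tagged tuple.
        return "hello " + user["name"]

    # greet(find_user(1)) blows up immediately, even on the success path,
    # because ("ok", {...}) is not a user. The caller has to unwrap explicitly:
    match find_user(1):
        case ("ok", user):
            print(greet(user))
        case "error":
            print("no such user")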
I would also consider type checking of “ownership patterns”, whether through monads, substructural types, lifetime analysis, or whatever else, to be a very important property of type systems, just one that most current type systems don’t give you much help with. Bugs in this area (especially in the case of concurrency) are demonstrably common and difficult to diagnose and repair.
On the other hand, I also think there are diminishing returns to trying to encode increasingly complex correctness contracts into the type system. Quickly it seems to me the difficulty of understanding how the contract has been encoded and how to use it properly outweighs the difficulty of avoiding the error yourself. The cost can be worth it in safety critical systems, I assume.
I periodically attempt to take some of my Python libraries – which do use type hints as documentation – and get mypy to run on them.
I have never had mypy catch an actual type error when doing this (meaning, a situation where an attempt was made to use a value of a type incompatible with the expected one described in the annotation). I have, however, gone down far more rabbit holes than I’d care to in an attempt to figure out how to express things in a way mypy will understand.
My most recent attempt at this involved a situation where mypy’s type-narrowing facilities left a lot to be desired. I am going to describe this here so you can see what I mean, and so you can get a sense of the frustration I felt. And to be clear: I am not a newbie to either Python (which I’ve been doing professionally for decades) or static typing (I first learned Java and C back in the mid-2000s).
So. The real code here isn’t particularly relevant. What you need to know is that it’s a list of values each of which (because they’re being read from environment variables which might or might not be set) is initially
str | None. The code then did an if not all(the_list): check and would bail out with an exception in that branch. Which, crucially, means that all code past that point can safely assume all the values have been narrowed to str (since if any of them were None, the all() check would have failed).

Later code would start checking to see if the values were URL-like, because ultimately that’s what they’re supposed to be. So imagine some code like this for a simplified example:
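Something along these lines (a sketch matching that description; the variable and environment names are made up):

    import os

    REQUIRED = ("API_URL", "CALLBACK_URL", "WEBHOOK_URL")
    items: list[str | None] = [os.environ.get(name) for name in REQUIRED]

    if not all(items):
        raise RuntimeError("missing required environment variables")

    # Every element is a non-empty str past this point, but mypy doesn't know it.
    for item in items:
        if not item.startswith(("http://", "https://")):
            raise RuntimeError(f"not a URL: {item!r}")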
But mypy looks at this perfectly idiomatic and perfectly safe Python code and says
error: Item "None" of "str | None" has no attribute "startswith" [union-attr]. Because although we humans can clearly see that the type of items must have been narrowed, mypy can’t.

OK, mypy’s documentation suggests an is not None check will narrow an optional type. But no, that gets the same error from mypy. So does an isinstance()-based check, even though mypy says
isinstance() checks can be used for narrowing.

The actual problem, of course, is that mypy doesn’t understand that
all() would return False if any of the values actually were None, and so it cannot infer from the return value of all() that the type has in fact been narrowed from str | None to just str. We have to help it. If you’re actually reading mypy’s type-narrowing docs, the next thing it will suggest is writing a guard function with TypeGuard. OK:
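A sketch of such a guard, using an isinstance() check (which is also where the runtime cost complained about below comes from):

    from typing import TypeGuard

    def all_present(values: list[str | None]) -> TypeGuard[list[str]]:
        # In the branch where this returns True, the argument is narrowed to list[str].
        return all(isinstance(value, str) for value in values)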
And mypy actually accepts this! But there’s a problem: remember, I wanted to do an if not all(items): to bail out with an error, and then have a clean path beyond that where the type has been narrowed from str | None to str? Well, it turns out TypeGuard can only narrow the “true” branch of a conditional. To narrow both branches, you need to use TypeIs instead. OK, so here’s the TypeIs version: the same guard, with TypeGuard[list[str]] swapped for TypeIs[list[str]]. So naturally mypy accepts that, right?
Haha, just kidding, it does not. You see,
TypeGuard doesn’t care about generic type variance, but TypeIs does! And it turns out list is defined by the bolted-on Python “static type” system to be invariant. So now we have to go redefine everything to use a different generic type. Probably the best choice here is Sequence, which is covariant.
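Reconstructed as a sketch, the shape it ends up with looks like this (TypeIs lives in typing on Python 3.13+, or in typing_extensions before that):

    import os
    from collections.abc import Sequence
    from typing import TypeIs  # or: from typing_extensions import TypeIs

    def all_present(values: Sequence[str | None]) -> TypeIs[Sequence[str]]:
        return all(isinstance(value, str) for value in values)

    items: Sequence[str | None] = [
        os.environ.get(name) for name in ("API_URL", "CALLBACK_URL", "WEBHOOK_URL")
    ]

    if not all_present(items):
        raise RuntimeError("missing required environment variables")

    # Narrowed in both branches: past the raise, items is Sequence[str].
    for item in items:
        if not item.startswith(("http://", "https://")):
            raise RuntimeError(f"not a URL: {item!r}")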
This, finally, will do the correct thing. It satisfies the type narrowing problem in both branches of a conditional, which is what’s ultimately wanted, and does so in a way that makes the narrowing obvious to mypy. And it only took, what, half a dozen tries and a bunch of frustrating errors? Again, I’m not new to Python or to static typing, and even though I actually understood the reason for most of the errors after a quick skim, this was still an incredible amount of pointless and frustrating busy-work just to satisfy mypy of something that mypy should have been able to figure out from the initial idiomatic implementation. And, worse, this has introduced expensive runtime isinstance() checks as a cost of making the “static” type checker happy!

All of which is just the most recent of many examples of why I continue to add type hints to my code as documentation, because I do find them useful for that purpose, but also continue not to run mypy as part of my linting suite, and why I do not include a py.typed file in my own packages.
My experience is with Typescript vs Javascript, but I’ve converted two or three codebases over to Typescript at this point, and each time the act of defining types for functions where the types weren’t entirely clear has helped me find bugs. It’s also allowed me to remove overly-defensive code and make the code easier to read overall.
I think part of this is that Typescript has a more powerful type system that better models real-world Javascript code (whereas Mypy’s always feels like Java with some light sugar on top). But I also suspect that Javascript is a lot easier to model than Python, as there are fewer opportunities for wild, dynamic code, and even when there are, most Javascript developers tend to avoid that style unless it’s really useful.
For your example specifically,
Array#every, which is roughly the equivalent of all(...), does include the requisite type information to handle this case correctly.

JavaScript is kind of an interesting example to bring up, because people love to talk about making Python “strongly typed” when it already is. You could make a sort of grid of languages, strong vs. weak typing on one axis and static vs. dynamic checking on the other, to show the difference.
I can see how the number of implicit type conversion behaviors in JavaScript, which you mostly have to just know and remember not to trip over, would lead to a desire to work in something a bit stricter, and how doing so could yield benefits in code quality.
And TypeScript is also kind of a different example because it’s not required to remain syntactically compatible with JavaScript. “Typed Python”, on the other hand, does have to maintain syntactic compatibility with plain Python (which is why several things from the static-type-checking Python world have had to be imported into Python itself).
But I also stand by the fact that mypy has never uncovered an actual bug in Python code I’ve run it on. It’s only ever uncovered weird limitations of mypy requiring workarounds like the ones described in my comment above.
I absolutely agree with your last part. My experience with Python has been very similar, and I’ve stopped using Mypy much these days, even when working with Python, because there are too many cases that it found but weren’t actual bugs, and too many actual bugs that it didn’t catch for one reason or another.
But I think that’s largely because Mypy isn’t very good at typechecking idiomatic Python code, and not because the concept as a whole is flawed.
I do wonder, though, if Mypy would have worked better from the start if annotations had always been lazy — or even if they’d only been available at runtime as strings. This would have given the type checkers more chances to experiment with syntax without needing changes to the runtime.
I think it is important to remember that Mypy is one of the implementations of PEP-484, and while it is probably the most famous and popular, it is not the only one. It is also important to note that PEP-484 does not define how the types should be enforced (it defines a few things to allow interoperability, but leaves the majority of behaviors to the implementors). Heck, they’re essentially comment strings for the interpreter (especially after PEP-563).
For example, Pyright explicitly tries to infer more things than Mypy, while the fact that Mypy doesn’t is a design choice.
This is true, but the fact that until PEP563 the type annotations were interpreted and therefore needed to be valid, and the way that after 563 they’re still semi-interpretable (IIRC the
__future__ annotations import is now discouraged because it has other weird side effects and they’re looking in a different direction, but I’ve not hugely been following the discussion) — all that means that annotations are still very much tied to Python-defined semantics. You couldn’t easily, for example, define a custom mapped type syntax using dict expressions (similar to Typescript’s mapped types), because a lot of stuff would break in weird and wonderful ways.

Like I say, I’ve given up on this stuff for now, and I’m hoping that it might get better at some point, but last time I used it Pyright was more strict, but only in the sense that I needed to contort my code more aggressively to make it work. (IIRC, the last problem I ran into with Pyright was that it inferred a value as having type
Unknown, and then started raising angry type errors, even though I never interacted with the value, and therefore it being Unknown was of no consequence.)

I am using Pyright as my LSP while I used
mypy --strict in this project and they never disagree. But to be clear, this is a small project with ~2000 lines of code, and it also doesn’t interact with external APIs, so I think this avoids the weird corner cases of Mypy.

My experience is quite the opposite:
mypy did catch lots of typing errors that would otherwise just cause issues at runtime (or maybe not, but the code was definitely not doing what I wanted to express). One example from yesterday: I was using subprocess.run (actually a wrapper around it, more details below) and wanted to pass an env parameter (that is a Mapping), but I got that one line subtly wrong. I didn’t spot it, and
mypy gladly caught the issue. And even better, this was before I had tested the code, and once I ran it for the first time (after fixing all mypy errors), the code Just Worked (TM).

The other day I was refactoring the code of the same project and I was not running
mypy in the tests yet. I decided to set up mypy in the tests just to be sure and, boom, I had forgotten to change the caller in the tests. Since they’re mocked, they still passed, but the tests didn’t make any sense anymore. While it was annoying to type the test methods themselves (which I don’t think makes much sense), after that experience I decided that I definitely need mypy running in my tests too.

I concur that
mypy is weird sometimes, and far from perfect. For example, I got some really strange issues when trying to write a wrapper for another function (again, subprocess.run) and typing its keyword arguments.
envto the method parameter instead worked and even simplified the code, but yes, far from ideal.But even with the somewhat strange issue, I still thing that
mypy was a huge plus in this project. The fact that I can refactor code and be confident that, if the tests and mypy pass, it is very likely the code is still correct is worth the work I sometimes have to do to “please” mypy.

I’ve definitely found errors when converting Python programs to have type hints, but not as many afterwards; perhaps it’s just that having the type annotations makes it harder for me (and other people working on it) to write incorrect code? Either way, I fully agree that mypy/pyright work in ways that are just too annoying, and I disable them in IDE and linting unless I’m required to have them.
One thing that I found valuable in
mypy is during refactoring. I just refactored a function parameter from bool to str | None, and while this didn’t break any tests (because in the function I was doing something like if v: something, so it doesn’t really matter if I am passing False or None as v), mypy correctly identified the issue.

You could argue that it doesn’t matter since this didn’t break the logic, but I think it does. Tests with incorrect parameters, even if they work, are confusing because you lose the context of what the tests are actually trying to test.
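A tiny sketch of that situation (hypothetical function):

    def render(title: str, subtitle: str | None = None) -> str:
        # "subtitle" used to be a bool flag; a truthiness check still "works"
        # if an old caller passes False, which is why no test broke.
        if subtitle:
            return f"{title}: {subtitle}"
        return title

    render("Report", False)  # mypy flags this call: bool is not compatible with str | None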
Some of the things that I found interesting:
- Tool dependencies tracked in go.mod (https://go.dev/issue/48429): no more tools.go just so you can pin tools versions
- runtime/debug.BuildInfo.Main.Version: https://go.dev/issue/50603
- JSON output for builds (go test/build -json): https://go.dev/issue/62067
- crypto (pure Go instead of the previous solution using BoringSSL): https://go.dev/issue/69536

I never understood the desire for building external tools using the go.mod of the repo you use them in. I’d rather just pin and build the tools I use based on the released versions, with the same dependencies that the author builds, tests, and releases with.
While this is a good point, it doesn’t cover the fact that Go was one of the only languages I know that didn’t have a good way to pin tool versions, which ends up creating the unfortunate issue of how you declare those tools at the specific version you need and keep them up to date.
This almost always involved a sub-optimal solution, either
tools.go (with all its issues) or having some other way to declare it (it could be in the project’s README.md or a Makefile), but you ended up needing something.

I usually just write go run tool@version in scripts, but I see the issue. This tools solution would be fine to me UX-wise, but I’d always want the tools built with the go.mod of their own main module!
You’re telling me there’s yet another reason to stick to POSIX shell? I am shocked I tell you, shocked.
Well not really that shocked.
True, but it’s not enough that your source code is POSIX shell. You also have to run it with a shell that’s not vulnerable! For example, you can write a perfectly valid POSIX shell program that is vulnerable when you run it under bash/ksh, but not when you run it under dash, busybox ash, or OSH!
The reason that some OSes decide that when I say “/bin/sh” what I really meant was some other shell that’s not sh is why I have trust issues. I’ve been told that it’s fine though, and to stop being pedantic.
Yeah it’s best if
/bin/sh is a judiciously modernized Almquist shell.

I think this is a good reason not to use shell, or at least to avoid it as much as possible in any context where the user can control the inputs, because it is so easy to make mistakes in shell, and as another commenter said, you can’t even trust that your POSIX-compliant script will run in a shell that handles this (and other issues) correctly.
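One way to dodge that whole class of problem when user-controlled input is involved is to not hand the value to a shell at all; in Python, for example, something like this (a sketch):

    import subprocess

    def count_lines(path: str) -> int:
        # An argument list and no shell=True: `path` reaches wc verbatim, with no
        # word splitting, globbing, or command substitution on the way.
        result = subprocess.run(
            ["wc", "-l", path],
            capture_output=True, text=True, check=True,
        )
        return int(result.stdout.split()[0])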