I think this is completely untrue. RISC-V is real: it exists, it works, there are hardware products being built and embedded into mass-produced devices, etc.
It’s just that in the space most of us are mostly interested in - modern high-performance CPUs - the instruction set is maybe 1% of the story. Modern CPUs are among the most elaborate artifacts of human engineering, the result of decades of research and improvement, with a huge industrial and economic base built around them. That’s not something that can be significantly altered rapidly.
Just look at how long it took Arm to get into high-performance desktop CPUs. And there was a big and important business behind it, with strategy and everything.
They’re not asking for high-performance desktop CPUs here though. Judging by the following statement on IRC:
<q66> there isn't one that is as fast as rpi5
<q66> it'd be enough if it was
it sounds like anything that even approaches a current day midrange phone SoC would be enough. RPi5 is 4x Cortex-A76, which is about a midrange phone SoC in 2021.
Last I checked, the most commonly recommended RISC-V boards were slower than a Raspberry Pi 3, and the latest and greatest and most expensive boards were somewhere in the middle between RPi 3 and 4. So yeah, pretty slow.
Beyond microcontrollers, I really haven’t seen anything remotely usable. I’d love to be wrong though.
I tried to find a Pi Zero replacement, but the few boards available all had terrible performance. The one I bought in the end turned out to have an unusable draft implementation of the vector instructions, and it’s significantly slower than any other Linux board I’ve ever used, from IO to just CPU performance. Not to mention the poor software support (I truly despise device trees at this point).
Just for a data point, the RP2350 chip in the Raspberry Pi Pico 2 includes a pair of RV32IMACZb* cores as alternates for the main Cortex-M33 cores: you can choose which architecture to boot into. The Pico 2 costs $5 in quantities of 1.
I use Niri and I wholeheartedly agree that Niri is doing a much better job technically than the clunky, quirky C code in wlroots and Sway. It’s also a nicer user experience overall. Niri is only getting better by the day, leaving the stagnating Sway C code in the dust.
Still, I somewhat miss the UX of “tabbed/stacked layouts with nested containers, the least ergonomic Band-Aid™ for the space issue I’ve ever seen” and I don’t agree that containers are a band-aid, even though I agree that they bring their own cognitive load.
On Sway, I often had eleven workspaces open.
added shortcuts to workspaces 11-20
On Sway I was never actually using more than five workspaces. A laptop screen + a big 4K screen + tabbed containers were actually a solution to the proliferation of workspaces: I kept a small number of workspaces, each with an arbitrary number of temporary windows in a tabbed container for a task, which I could manipulate as one unit (closing, moving to another workspace, etc.). I had to adjust to avoid multiplying windows in a Niri workspace, because I quickly lose track of invisible windows beyond the edges of the screen (and Waybar has not been cooperative about displaying icons only from the current workspace), while on Sway I could see window decorations and titles for every top-level window/container in the workspace at the same time.
Another little annoyance is that I like how Sway workspaces live in a single global namespace. I’m used to Sway workspaces 1 and 3 living on the laptop screen, and 2 (and rarely 4) on my external monitor by default. When I unplug the external monitor, workspace 2 (and maybe 4) just migrates to the laptop screen temporarily without changing their numbers, while on Niri workspaces 1 and 2 from the external monitor become something like 4 and 5 on the laptop screen.
Another feature I’m waiting for is being able to bind gestures to actions (swiping a workspace to another monitor would be nice).
Still, Niri is just awesome overall, I use it and I highly recommend it.
I’m gonna give Niri a try. Maybe it’ll even fix the bad perf and crashes I’ve had under Sway.
I agree on tabs. I used to have over 20 Sway workspaces, all with tabs. Recently Firefox has started having trouble when I have over 100 windows open, so I’ve had to cut back, and only have 9 workspaces, but I still have lots of tabs and I feel like they make it very quick and easy to find what I’m looking for. But I navigate by mouse. Whenever I happen to need to use the keyboard to navigate my sea of tabs inside tabs, it sucks. Maybe @eBPF always uses keyboard navigation and that’s why they dislike using tabs?
Maybe I’ll implement tabs for Niri if I miss them :laughing:
Edit: wait, Niri already has them, as I just discovered while trying to get started. Though they’re the opposite of what I was envisioning: Niri tabs stack windows inside a single spot in the horizontal infinite scroll, while what I was imagining was each tab being a container for a separate horizontal workspace.
Since when did X11 virtual desktops extend infinitely? As far as I remember every virtual desktop implementation in X11 WMs were just separate workspaces you could switch between.
Traditional tiling window managers have a side effect of forcing you to be as efficient as possible with your window layout. There is an additional cognitive load incentivizing you to optimize for the wrong thing:
I disagree with this; my tiling WM doesn’t force me to do anything, it’s just a matter of how you use it, or what problems you want to solve. My setup takes away exactly that cognitive load - one of the main problems with my current macOS setup is that I need to think about window placement all the time.
That said, I’m planning to give niri an honest trial soon - not because I’m fed up with any specific tiling WM, but because I have a new machine that needs a setup anyway.
I disagree with this; my tiling WM doesn’t force me to do anything, it’s just a matter of how you use it, or what problems you want to solve.
As someone who 100% agrees with the author here (to the point where I finally gave up on tiling after about five years of using it), let me explain it differently: if you’re tiling, every new window you open forces you to make a decision about it. Whether that window is ephemeral or not, whether it belongs in this space or not, it showing up will make every other window in the space reflow, i.e. move around or change sizes. This is a core feature of tiling! So you end up doing things like the current top reply to your comment where you keep one full window per screen, or at most two, and if you accidentally open a new window even for a microsecond that’s going to thrash your layout. The moment I realized tiling was actively hurting me was when, upon reflection, I found out that I almost had muscle memory for opening a new window and making it full screen so my existing layout didn’t explode.
I think this problem is specific to dynamic tiling. It’s one of the reasons why I never managed to get along with i3 (or Sway). I got along with wmii to some degree back in the day. I don’t recall the details, I vaguely remember its stacking feature being a little less awkward than i3’s but I don’t remember exactly why; I do remember that it was equally annoying to switch among windows though (I routinely had to open like a dozen datasheets, and then I’d have to switch between them by patiently bringing each one into view).
Manual tiling has always been the way to go for me back when I used a tiling WM. Ratpoison (which I used), and @jcs’ own sdorfehs, trivially fix this problem: opening a new window never causes reflow, it always opens exactly where you want it (actually, it always opens exactly where your eyes already are, because you’ve switched to that container), and you get instant full screen vs. tiling by just moving a window to its own workspace/back to its original workspace.
Or at least that’s how it used to work at some point, I’m not sure if this is still the case – I haven’t used a tiling WM in years, I have a large monitor and tiling WMs are super annoying (if I tile two windows side-by-side they’re too wide to read comfortably, and making one narrower makes the other one even wider; if I tile 3 or more, depending on what’s in them, at least one of them is now too narrow to be useful).
if I tile two windows side-by-side they’re too wide to read comfortably, and making one narrower makes the other one even wider; if I tile 3 or more, depending on what’s in them, at least one of them is now too narrow to be useful
This is exactly what Niri solves. Every window opens at its preferred width. Niri doesn’t unnecessarily insist on filling up your whole screen. With every new window, your screen fills up until the furthest to the left scrolls out of view.
Indeed, although (with the usual “to each his own” caveat) I honestly just… prefer the stacking solution here. I played with Niri a few weeks (or months?) ago and I can’t say I was a fan, because it just introduces the opposite problem – if I open three windows, one of them is either too narrow to be useful, or off-screen. If it’s off-screen, I have to scroll left/right to get to it, and the associated animation (which is kind of necessary for spatial cues) gets tiring pretty fast.
I liked ratpoison for a bunch of other reasons that go well with manual tiling though. E.g. window switching is extremely fast and works well with “local” muscle memory.
I’ve been using Sway for a few years now, and I still miss Notion. I set each Sway region to tabbed, which gives a similar effect, but you have to do it manually after booting and keep at least one window in each region, which is slightly annoying. And sometimes I press the wrong key and the layout gets messed up in some confusing way.
Looking at the Notion site again, I see it links to volare, a fork of Sway aiming to make it more Notion like. Just trying it out now… looks promising!
I set each Sway region to tabbed, which gives a similar effect, but you have to do it manually after booting and keep at least one window in each region, which is slightly annoying.
You may be interested in the following config option: workspace_layout tabbed. It makes every new container tabbed at creation.
This isn’t feasible for me since I have a lot of things open at once, and I run out of workspace keybinds. I also do like being able to see two/three things at once.
As someone who has been using tiling WMs for a long time, I also recommend a keybinding that lets you textcomplete a window.
Something like rofi -show window -auto-select can really do wonders for navigating around. While it’s nice to have a keybinding to jump to any workspace, you can get really far just jumping to the windows themselves
I had swayr set up to help me find windows I had lost, and I put together a little script to do the same on Niri. It’s useful, but it ended up being a last-resort thing unfortunately.
one of the main problems with my current macOS setup is that I need to think about window placement all the time.
There are some tiling window managers for macOS as well; yabai is my personal favorite currently!
I agree, tiling managers have saved me from needing to care about window placement at all. I can simply focus on coding, which usually takes a max of 3 windows for me, all of which fit fairly well on my super-wide monitors.
I bet there is a screen size difference for people here. Some may have less screen to work with. When using my 15” MBA I feel like the tiling manager isn’t as helpful as with a giant screen.
I’m not so sure…I’ve been doing some embedded Rust (which sort of sits at the same place on bare hardware as the kernel) and async has been really helpful for keeping the code organized. I think about it as essentially making state machines automatically for me. There is actually a tradeoff here because it’s almost too ergonomic — it’s easy to forget what it’s really doing and run into issues because something was a “local” variable (i.e., a field of the invisible state machine state) when you really needed to give it external ownership. But that didn’t take long to internalize.
My belief is that Rust’s async story shines particularly bright[0] when writing software for embedded systems, which Niko includes in his “foundational” category.
[0]: The whole complexity mess comes a lot from the fact that you want to provide zero-cost state machines for futures, without boxing nor a GC.
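To make the “state machines automatically” point concrete, here’s a hand-simplified sketch of roughly what the compiler does for an async fn whose local lives across an .await. All the names here (Twice, yield_once, the exact states) are made up for illustration and the real generated type is more involved, but the idea is the same: the “local” doubled stops living on the stack and becomes a field of an enum that implements Future.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Hypothetical async fn (all names invented for this sketch):
//
//     async fn twice(x: u32) -> u32 {
//         let doubled = x * 2;   // a "local"...
//         yield_once().await;    // ...that must survive this suspension point
//         doubled
//     }
//
// The compiler lowers it to a state machine roughly like the enum below:
// `doubled` stops being a stack local and becomes a field of the state
// that is saved across the await.
enum Twice {
    Start { x: u32 },
    Suspended { doubled: u32 }, // the "invisible state machine state"
    Done,
}

impl Future for Twice {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut(); // fine: Twice is Unpin
        match std::mem::replace(this, Twice::Done) {
            Twice::Start { x } => {
                // First poll: compute the "local", then suspend once.
                *this = Twice::Suspended { doubled: x * 2 };
                cx.waker().wake_by_ref(); // ask to be polled again
                Poll::Pending
            }
            Twice::Suspended { doubled } => Poll::Ready(doubled),
            Twice::Done => panic!("polled after completion"),
        }
    }
}

// A no-op waker, just enough to drive the future by hand.
unsafe fn vt_clone(_: *const ()) -> RawWaker {
    RawWaker::new(std::ptr::null(), &NOOP_VTABLE)
}
unsafe fn vt_noop(_: *const ()) {}
static NOOP_VTABLE: RawWakerVTable = RawWakerVTable::new(vt_clone, vt_noop, vt_noop, vt_noop);

fn main() {
    let waker = unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &NOOP_VTABLE)) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = Twice::Start { x: 21 };
    assert_eq!(Pin::new(&mut fut).poll(&mut cx), Poll::Pending); // suspended at the "await"
    assert_eq!(Pin::new(&mut fut).poll(&mut cx), Poll::Ready(42)); // resumed, local intact
}
```

Which is also why a “local” can surprise you on embedded: its storage and lifetime follow the future value, not the enclosing stack frame.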
It would be much simpler if Signal adopted something like RFC 9420 (Messaging Layer Security), but MLS doesn’t provide the metadata resistance that Signal prioritized.
I wonder what metadata resistance Signal offers that Wire, through its use of MLS, doesn’t?
The metadata resistance of Signal is largely mythical anyway, since they necessarily have the metadata via other channels and just pinky promise not to look at it or store it.
You can derive it from necessity if you like. The Signal server sees the message come in over a network connection from an app. The server must be able to deliver it to a target user. This is the metadata. That the message data on the wire doesn’t contain this metadata doesn’t prevent the server from knowing it; it must know it in order to function at all. Signal has never claimed otherwise; they only claim that the server forgets it right away. But of course that must be taken on trust.
At best, that associates two IP addresses… notwithstanding CGNAT, VPNs, MASQUE, and friends.
But it doesn’t associate them with accounts / contacts. That’s a stronger guarantee than Matrix or XMPP. It may also be a stronger guarantee than Wire?
But it doesn’t associate them with accounts / contacts.
That isn’t true. Signal messages need to be routed by account identifier, an IP address is not sufficient. And unless you have the “sealed sender” feature turned on, messages identify their senders.
There’s no mechanism for the Signal server to know the IP addresses of iOS clients because an iOS device only maintains one persistent connection to Apple for notifications. There’s no way a Signal client can keep track of the IP addresses of its contacts, because it isn’t a mesh network, it’s a star. Even for non-iOS devices, an IP address isn’t sufficient to identify a client because (for example) there are multiple clients in our house and our house has only one IP address.
So it is. As far as I can tell the official documentation for the feature is still this blog post https://signal.org/blog/sealed-sender/ which makes it sound like the feature is incomplete, but the last few paragraphs say they were (in 2018) rolling it out to everyone so I guess the preview was actually the main event.
I just checked in settings. There are only "show when it's used" and "allow for even unknown senders" preferences for me, which makes me conclude that it’s already enabled by default and cannot be disabled.
Sealed sender is also not a good protection if Signal was to actually start keeping logs. There are two sources of metadata leakage with sealed sender:
You need to acquire a sender certificate before you can use sealed sender. If you do this from the same IP as you later use when sending a message, your IP and your identity can be linked.
When you send a message, the receiver sends a delivery notice back to you. This is a simple correlation: a sealed message to Person A on IP address X from IP address Y is immediately followed by a sealed message from IP address X to Person B on IP address Y.
Yes, and if you do have Sealed Sender turned on, the only metadata left on the server that’s needed for message delivery is a 96-bit “delivery token” derived from a “profile key” that conveniently rotates whenever you block an account.
My reading of the description of sealed sender is that the delivery token is used to check that the sender is allowed to send to the recipient – it’s an anti-abuse mechanism. It is used when the server is deciding whether to accept a message; it isn’t used to decide where to deliver the message.
That is not my reading of the server code for either single or multi-recipient messages. And Signal iOS at least seems to use sealed sender by default, though it falls back to unsealed send if there’s an auth failure, which seems bad. (so the server can force the client to identify itself? … but I also can’t find anywhere that throws RequestMakerUDAuthError.udAuthFailure, so maybe it’s dead code…)
But I admit it’s a very casual reading of the code!
To say what the sibling comment says in a different way: the connection the message is delivered to the server over must be authenticated. If it weren’t, the server would not accept the message (for spam reasons, etc.), so the server knows the account of the sender. And it needs to know the account of the receiver for delivery to be possible.
That article specifically admits this is true. Signal chooses not to write it down (assuming the published code is what they run), which means it cannot be recovered after the fact (if you trust the server not to have recorded it). Of course, any other operator could also choose not to write this down, and one could choose to trust that operator. It’s not specific to Signal, really.
I believe we agree that the server must know the recipient of a message. I believe we disagree about whether the server needs to know the sender of a message.
Erm, so what do you mean by authenticated?
That article notes the sender’s metadata is (e2e) encrypted. The server accepts and routes messages whose envelope includes a delivery token. And, similarly, that delivery token is shared via e2e encrypted sessions to all a recipient’s contacts.
It’s unclear to me how unknown senders / randos are handled, however. I haven’t read that deep into the code.
Sure, that’s fair.
But I was hoping your claim was more substantial than just this, since, as the child comment below says, almost all Signal competitors suffer from this.
Not just almost all. It is fundamentally impossible for a communications system to operate if whoever does the routing doesn’t know sender and receiver identity at some point (and send/receive time, which is also metadata)
If you do onion routing you could make it so only one part knows sender and one part knows receiver, which is how the remailer network worked but that’s the only instance I’m aware of doing that. Everyone else has the metadata and it’s just various shades of promising not to write it down.
Aren’t there protocols for deniable drop offs on servers and similar? Those wouldn’t scale well, but AFAIK they work. So they are possible (just not practical).
There is SecureDrop, but as far as the technology is concerned it’s a web app accessed via Tor. The rest of the anonymity guarantees come from server-side opsec performed by the recipient org https://docs.securedrop.org/en/stable/what_is_securedrop.html
SimpleX is a chat system that does onion routing. Only two hops, and I am not vouching for anything about the app or its servers; just noting this feature.
No one claimed otherwise. The context is the claim expressed above that you get worse metadata resistance than Signal, which seems irrelevant given that Signal doesn’t really have it either.
Sorry. I hear this line of argument on Hacker News and Reddit a lot, only for the person to turn around and recommend XMPP or Matrix instead. I wanted to cut it off at the pass.
If I was ever looking for an ‘out’ I think this was a fair trigger. A link to … a todo on a git forge with nothing of substance that hasn’t already been said a decade+ ago? Did I misspell phoronix.com? …
I’m not sure if it belongs here, but it’s not a TODO, it’s a WIP merge request currently with almost 2000 lines of new code that you can compile and try.
pmeunier has an account here and would be most qualified to add some context, but I can add a bit here.
Pijul is a patch-based VCS, as opposed to snapshot-based, like Git, Mercurial, etc. The leading example of a patch-based VCS otherwise up to this point has been Darcs, written in Haskell. I haven’t used it much. Pijul’s main motivation over Darcs was algorithmic improvements that resolve the worst-case exponential time Darcs can run into; see the Why Pijul? and Theory pages for a bit more context. There’s supposed to be some theoretical soundness improvements as well over Darcs, but I don’t know as much about that.
Nest has been the main service for natively hosting Pijul projects with a web UI, made by the same team/developer. From what I remember, like git, a Pijul repository can be used remotely over SSH; Nest is more like having gitea/gitlab in addition to that.
I remember it being closed source with the intention of making it open source later; I think the rationale was around Nest still being alpha and not having the resources at the time to field it as a full open source project in addition to Pijul itself, though nest.pijul.com was available to host other open source projects with Pijul.
The news here would be that Nest’s been recently open sourced. As someone who’s been interested in Pijul but hasn’t had much opportunity to use it yet, this sounds like significant news that should make adoption more practical in general. Congrats to pmeunier and co., and thank you for your interesting and generous work in the VCS space!
There’s supposed to be some theoretical soundness improvements as well over Darcs, but I don’t know as much about that.
As I recall, the problem is that patches in Darcs don’t necessarily commute. For all the nice math to work out, you want independent patches to commute, that is to say, applying patch A followed by patch B, should give you the same result as applying patch B first, followed by patch A. But patches aren’t guaranteed to do that in Darcs, and the only way to ensure this is to simply test pairs of patches by applying them and seeing if both orders give the same result.
In Pijul, if you have two patches that don’t depend on each other, they always commute. Either they don’t conflict, in which case the non-conflicting outcome is the same regardless of the order, or they do conflict, in which case you get the exact same conflict state regardless of the order.
I found these public posts about the history of this Nest implementation and the plans to open source it:
pmeunier’s 2023-05-23 blog post “A new direction for Pijul’s hosting service” announced a rewrite of Nest that would be more maintainable and would also be open source.
pmeunier’s 2023-11-10 forum post said “There are currently two very different versions of the Nest, of which the most recent one was supposed to become open source and self-hostable, but it never worked out. I’m working on merging their features.”
The reason it was closed-source wasn’t really by design, it was just that the service had accumulated a lot of tech debt after transitioning through way too many versions of the Rust async ecosystem (the Nest started in 2015). So, this is a marker of Pijul being mature enough that I was able to spend some time rewriting things using the (now) stabilised Rust libs such as Diesel and Axum. Also, Svelte is fun on the front-end but didn’t exist back then, I love how you can have the best of both worlds (static and dynamic).
I wouldn’t say “descended from Darcs” because that may give the wrong connotations. Pijul isn’t a fork of Darcs. Pijul has a rigorous mathematical foundation, unlike Darcs. They are conceptually related though, so I think it is clearer to say Pijul is inspired by Darcs.
Pijul is the first distributed version control system to be based on a sound mathematical theory of changes. It is inspired by Darcs, but aims at solving the soundness and performance issues of Darcs.
I started paying attention to Pijul many years ago. When it comes to systems that manage essential information, I tend to bias in favor of systems with formal guarantees.
It cannot expose an API to write all memory, because it does not have access to all memory
Unless it uses DMA (which that network device likely has). So you need an IOMMU as well. And of course the device must not be in cahoots with the driver, which is not necessarily the case on embedded systems like this where the vendor taped out the chip and wrote the driver.
I don’t entirely agree, an IOMMU is a separate piece of hardware that needs to be programmed appropriately to allow DMA to go to separate regions and which also does address translation (many of the earliest ones were designed only for the translation, not security: they let you ship cheap 32-bit devices in 64-bit systems with more than 4 GiB of RAM). The programmer model involves multiple systems tracking permissions. In contrast, a CHERIoT DMA unit lets a compartment DMA to any memory that it can access and enforces the same rules as the core. If that’s an IOMMU then CHERI is an MMU.
I was just in the process of pulling up the SeL4 FAQ about DMA to ask if the same restrictions applied to CHERIoT. They mention x86 VT-d and SystemMMU. But I guess if CHERI already has full control of the hardware (by being a hardware security implementation) they can fix that separately.
Also I think this is two things at the same time, it should be either “and so on” or “and friends”:
Would scheme-rs be a good choice for adding general purpose runtime scripting to Rust applications (in sync or async Rust)?
Would scheme-rs be suitable for running untrusted user provided scripts? This entails limiting what a script has access to, and limiting runtime and memory usage.
Would you say scheme-rs’s code is mature enough that 3rd parties could jump in and start contributing? Or is the project still in enough flux that it would be difficult?
Yes, specifically for async right now. Although I do want to provide a sync interface at some point.
To some extent. Limiting what the script has access to, absolutely. You get to decide what functions an environment can access. However there are no mechanisms for limiting memory usage or runtime at the moment.
Yes. Most of the architecture is fixed at this point. Some things will change - e.g. the value enum will eventually become opaque so that we can properly optimize it - but since I wrote this post a couple of people have begun to contribute to the project with no issues.
Time and time again wlroots proves how solid it is as a project. Really outstanding work!
It’s just a shame that Wayland didn’t dare to define such things on the protocol level in the first place. I mean, given the rock-solid colour space support in macOS, any sane engineer designing a new display manager/compositor in the 2010s would have put colour management in as a design centerpiece. Libraries like Little CMS prove that you don’t even need to do much in terms of colour transformations by hand; simply define your surfaces in a sufficiently large working colour space and do the transformations ad hoc.
From what I remember back then, the only thing the Wayland engineers seemed to care about was going down to the lowest common denominator and ‘no flickering’ (which they saw in X in some cases).
For instance, it is not possible to portably place an application window ‘at the top’ of the screen, because one is not allowed to assume that this is even possible - even though 99.99% of all displays support it. It would have made more sense to have ‘feature flags’ for displays, or stricter assumptions about the coordinate space.
In the end, a wayland compositor requires close to 50,000 LOC of boilerplate, which wlroots gracefully provides, and this boilerplate is fragile as you depend on proprietary interfaces and extensions. You can write a basic X display manager in 500 LOC only based on the stable X libraries. With all of X’s flaws, this is still a strong point today.
In the end, a wayland compositor requires close to 50,000 LOC of boilerplate, which wlroots gracefully provides, and this boilerplate is fragile as you depend on proprietary interfaces and extensions. You can write a basic X display manager in 500 LOC only based on the stable X libraries. With all of X’s flaws, this is still a strong point today.
This instinctually bothers me too, but I don’t think it’s actually correct. The reason that your X display manager can be 500 LOC is because of the roughly 370k LOC in Xorg. The dominance of wlroots feels funny to me based on my general dislike for monocultures, but if you think of wlroots as just “the guts of Xorg, but in ‘window manager userland’”, it actually is not that much worse than Xorg and maybe even better.
I don’t really get your criticism. Wayland is used on a lot of devices, including car displays and kiosk-like installations. Does an application window even make sense if you only have a single application displayed at all times? Should Wayland not scale down to such setups?
Especially since it has an extension system that actually works well, so such functionality can be trivially added (either as a standard if it’s considered widely useful, or as a custom extension if it only makes sense for a single server implementation).
A Wayland compositor’s 50,000 LOC is the whole thing. It’s not boilerplate; it’s literally a whole display server communicating in a shared “language” with clients, sitting on top of core Linux kernel APIs. That’s it. Your 500 LOC comparison under X is just a window manager plugin; it may operate as a separate binary, but it is essentially the same as a tiling window manager plugin for Gnome.
It’s just a shame that Wayland didn’t dare to define such things on the protocol level in the first place.
Then it would have taken 2× as long to get it out of the door and gain any adoption at all.
Routine reminder that the entire F/OSS ecosystem worth of manpower and funding is basically a rounding error compared to what Apple can pour into macOS in order to gain “rock-solid colour space support” from day zero.
For instance, it is not possible to portably place an application window ‘at the top’ of the screen, because one is not allowed to assume that this is even possible - even though 99.99% of all displays support it. It would have made more sense to have ‘feature flags’ for displays, or stricter assumptions about the coordinate space.
I find it slightly odd that an “epic treatise on error models” would fail to mention Common Lisp and Smalltalk, whose error models provide a facility that all others lack: resuming from an error.
Hi, author here, the title also does say “for systems programming languages” :)
For continuations to work in a systems programming language, you can probably only allow one-shot delimited continuations. It’s unclear to me as to how one-shot continuations can be integrated into a systems language where you want to ensure careful control over lifetimes. Perhaps you (or someone else here) knows of some research integrating ownership/borrowing with continuations/algebraic effects that I’m unfamiliar with?
The closest exception to this that I know of is Haskell, which has support for both linear types and a primitive for continuations. However, I haven’t seen anyone integrate the two, and I’ve definitely seen some soundness-related issues in various effect systems libraries in Haskell (which doesn’t inspire confidence), but it’s also possible I missed some developments there as I haven’t written much Haskell in a while.
I’m sorry for the slightly snarky tone of my original reply, but even if you were to discount the Lisp machines, or all the stuff Xerox and others did with Smalltalk (including today’s Croquet), as somehow not being systems, I would have expected an epic treatise to at least mention that error resumption exists – especially since academia is now rediscovering this topic as effect handlers (typically without any mention of the prior art).
For continuations to work in a systems programming language, you can probably only allow one-shot delimited continuations.
This misconception is so common (and dear to my heart) that I have to use bold:
Resumable exceptions do not require first-class continuations, whether delimited or undelimited, whether one-shot or multi-shot. None at all. Nada. Zilch.
Suppose write() discovers that the disk is full (e.g. from an underlying primitive). This causes it to call signal_disk_is_full(). Note that the call to signal_disk_is_full() happens inside the stack of write() (obviously).
Now signal_disk_is_full() looks for a handler and calls it: disk_is_full_handler(). Again, the call to the handler happens inside the stack of signal_disk_is_full() (and write()). The handler can return normally to write() once it has cleaned up space.
write() is never popped off the stack. It always stays on the stack. IOW, there is never a need to capture a continuation, and never a need to reinstate one. The disk_is_full_handler() runs inside the stack of the original call to write().
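To put that in systems-language terms, here’s a minimal Rust sketch of the same control flow, reusing the names from the example above (write, signal_disk_is_full, a disk-full handler). Everything here is made up for illustration - a real condition system keeps a stack of handlers and offers richer restarts - but note that the handler is reached by an ordinary call and returns normally: nothing is unwound and no continuation is captured.

```rust
use std::cell::{Cell, RefCell};

// What a handler tells the paused write() to do.
enum Resolution {
    Retry,  // the handler made space: resume the paused write()
    GiveUp, // nobody could help: fail "conventionally"
}

thread_local! {
    // Stand-in for the filesystem: how many bytes are free.
    static FREE_BYTES: Cell<usize> = Cell::new(0);
    // The "calling environment": a dynamically scoped handler. A real
    // condition system keeps a stack of these; one slot is enough here.
    static DISK_FULL_HANDLER: RefCell<Option<Box<dyn Fn() -> Resolution>>> =
        RefCell::new(None);
}

// Called from *inside* write()'s activation; nothing has been unwound.
fn signal_disk_is_full() -> Resolution {
    DISK_FULL_HANDLER.with(|h| match &*h.borrow() {
        Some(handler) => handler(), // runs on top of write()'s stack
        None => Resolution::GiveUp,
    })
}

fn write(data: &[u8]) -> Result<(), &'static str> {
    loop {
        if FREE_BYTES.with(|f| f.get()) >= data.len() {
            return Ok(()); // the actual write would go here
        }
        // write() stays on the stack while the handler runs.
        match signal_disk_is_full() {
            Resolution::Retry => continue, // resume as if nothing happened
            Resolution::GiveUp => return Err("disk is full"),
        }
    }
}

fn main() {
    // The environment installs a handler that "empties /tmp".
    let handler: Box<dyn Fn() -> Resolution> = Box::new(|| {
        FREE_BYTES.with(|f| f.set(1 << 20)); // pretend we freed a megabyte
        Resolution::Retry
    });
    DISK_FULL_HANDLER.with(|h| *h.borrow_mut() = Some(handler));
    assert_eq!(write(b"hello"), Ok(()));
}
```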
effect systems
A side note: most effect systems do use and even require first-class continuations, but IMO that’s completely overkill and only needed for rarely used effects like nondeterminism. For simple effects, like resumable exceptions, no continuations are needed whatsoever.
but even if you were to discount the Lisp machines, or all the stuff Xerox and others did with Smalltalk (including today’s Croquet), as somehow not being systems
I provided the working definition of “systems programming language” that I used in the blog post. It’s a narrow one for sure, but I have to put a limit somewhere. My point is not to exclude the work done by smart people, but I needed a stopping point somewhere after 100~120 hours of research and writing.
Resumable exceptions do not require first-class continuations, whether delimited or undelimited, whether one-shot or multi-shot. None at all. Nada. Zilch.
Thank you for writing down a detailed explanation with a concrete example. I will update the post with some of the details you shared tomorrow.
You will notice that my comment does not use the phrase “first-class” anywhere; that was deliberate, but perhaps I should’ve been more explicit about it. 😅
As I see it, the notion of a continuation is that of a control operator, which allows one to “continue” a computation from a particular point. So in that sense, it’s a bit difficult for me to understand where exactly you disagree, perhaps you’re working with a different definition of “continuation”? Or perhaps the difference of opinion is because of the focus on first-class continuations specifically?
At the time of the Common Lisp design, Scheme did not have an error system, and so its contribution to the dialog on condition systems was not that of contributing an operator or behavior. However, it still did have something to contribute: the useful term continuation […] This metaphor was of tremendous value to me socially in my efforts to gain acceptance of the condition system, because it allowed a convenient, terse explanation of what “restarts” were about in Common Lisp. [..] And so I have often found myself thankful for the availability of a concept so that I could talk about the establishment of named restart points as “taking a continuation, labeling it with a tag, and storing it away on a shelf somewhere for possible later use.”
So it might be the case that the mismatch here is largely due to language usage, or perhaps my understanding of continuations is lacking.
I’m also a little bit confused as to why your current comment (and the linked blog post) focus on unwinding/stack representation. For implementing continuations, there are multiple possible implementation strategies, sure, and depending on the exact restrictions involved, one can potentially use more efficient strategies. If a continuation is second-class in the sense that it must either be immediately invoked (or discarded), it makes sense that the existing call stack can be reused.
Regardless of the specifics of whether we can call Common Lisp style conditions and resumption a form of continuations or not, I believe the concern about non-local control flow interacting with type systems and notions of ownership/regions/lifetimes still applies.
As I see it, the notion of a continuation is that of a control operator, which allows one to “continue” a computation from a particular point. … Or perhaps the difference of opinion is because of the focus on first-class continuations specifically?
Typically, there are two notions of continuations:
Continuations as an explanatory or semantic concept. E.g. consider the expression f(x + y). To evaluate this, we first need to compute x + y. At this point our continuation is f(_), where _ is the place into which we will plug the result of x + y. This is the notion of a continuation as “what happens next” or “the rest of the program”.
Continuations as an actually reified value/object in a programming language, i.e. first-class continuations. You can get such a first-class continuation e.g. from Scheme’s call/cc or from delimited control operators. This typically involves copying or otherwise remembering some part of the stack on the part of the language implementation.
Resumable exceptions have no need for first-class continuations (2). Continuations as an explanatory concept (1) of course still apply, but only because they apply to every expression in a program.
I believe the concern about non-local control flow interacting with type systems and notions of ownership/regions/lifetimes still applies.
The example I used has no non-local control flow at all. write() calls signal_disk_is_full() and that calls the disk_is_full_handler(), and that finally returns normally to write(). This is my point: resumption does not require any non-local control flow.
As well as what @manuel wrote, it’s worth noting that basically every language has second-class continuations: a return statement skips to the current function’s continuation.
Your comment talked about one-shot delimited continuations, which are a kind of first-class continuation in that (per Strachey’s definition of first vs second class) they can be assigned to variables and passed around like other values.
it’s worth noting that basically every language has second-class continuations: a return statement skips to the current function’s continuation.
In most languages, a return statement cannot be passed as an argument to a function call. So is it still reasonable to call it “support for a second-class continuation”?
Your comment talked about one-shot delimited continuations, which are a kind of first-class continuation in that (per Strachey’s definition of first vs second class) they can be assigned to variables and passed around like other values.
I understand your and @manuel’s points that the common usage may very well be that “one-shot delimited continuation” implies “first-class” (TIL, thank you).
We can make this same point about functions where generally functions are assumed to be first class. However, it’s not unheard of to have second-class functions (e.g. Osvald et al.’s Gentrification gone too far? and Brachthäuser et al.’s Effects, Capabilities, and Boxes describe such systems). I was speaking in this more general sense.
As I see it, the “one-shot delimited” aspect is disconnected from the “second class” aspect.
In most languages, a return statement cannot be passed as an argument to a function call. So is it still reasonable to call it “support for a second-class continuation”?
That you can’t pass it as an argument is exactly why it’s called second-class. Only a first-class continuation is reified into a value in the language, and therefore usable as an argument.
As I see it, the “one-shot delimited” aspect is disconnected from the “second class” aspect.
One-shot strongly implies a first-class continuation. Second-class continuations are always one-shot, since, again, you can’t refer to them as values, so how would you invoke one multiple times?
One-shot strongly implies a first-class continuation. Second-class continuations are always one-shot, since, again, you can’t refer to them as values, so how would you invoke one multiple times?
Here is the wording from Strachey’s paper, as linked by @fanf
they always have to appear in person and can never be represented by a variable or expression (except in the case of a formal parameter) [emphasis added]
Isn’t this “except in the case of a formal parameter” exactly what is used by Osvald et al. and Brachthäuser et al. in their papers? Here is the bit from Osvald et al.’s paper:
Our solution is a type system extension that lets us define file as a second-class value, and that ensures that such second-class values will not escape their defining scope. We introduce an annotation @local to mark second-class values, and change the signature of withFile as follows:
def withFile[U](n: String)(@local fn: (@local File) => U): U
[..] Note that the callback function fn itself is also required to be second-class, so that it can close over other second-class values. This enables, for example, nesting calls to withFile
In the body of withFile, fn is guaranteed to have several restrictions (it cannot escape, it cannot be assigned to a mutable variable, etc.). But the type system (as in the paper) cannot prevent the implementation of withFile from invoking fn multiple times. That would require an additional restriction – that fn can only be invoked 0-1 times in the body of withFile.
In ALGOL a real number may appear in an expression or be assigned to a variable, and either may appear as an actual parameter in a procedure call. A procedure, on the other hand, may only appear in another procedure call either as the operator (the most common case) or as one of the actual parameters. There are no other expressions involving procedures or whose results are procedures. Thus in a sense procedures in ALGOL are second class citizens—they always have to appear in person and can never be represented by a variable or expression (except in the case of a formal parameter), while we can write (in ALGOL still)
(if x > 1 then a else b) + 6
when a and b are reals, we cannot correctly write
(if x > 1 then sin else cos)(x)
nor can we write a type procedure (ALGOL’s nearest approach to a function) with a result which is itself a procedure.
Regardless of the specifics of whether we can call Common Lisp style conditions and resumption a form of continuations or not, I believe the concern about non-local control flow interacting with type systems and notions of ownership/regions/lifetimes still applies.
That’s a concern, sure, but most “systems” languages have non-local control flow, right? C++ has exceptions, and Rust panics can be caught and handled. It would be very easy to implement a Common Lisp-like condition system with nothing more than thread local storage, function pointers (or closures) and catch/throw.
(And I’m pretty sure you can model exceptions / anything else that unwinds the stack as essentially being a special form of “return”, and handle types, ownership, and lifetimes just the same as you do with the ? operator in Rust)
My point is not about ease of implementation, it’s about usability when considering type safety and memory safety. It’s not sufficient to integrate a type system with other features – the resulting thing needs to be usable…
I’ve added a section at the end, Appendix A8 describing the concrete concerns.
Early Rust did have conditions and resumptions (as Steve pointed out elsewhere in the thread), but they were removed because of usability issues.
If you dig into the code a bit, you discover that SEH on Windows has full support for Lisp-style restartable and resumable exceptions at the lower level; they just aren’t exposed in the C/C++ layer. The same component is used in the NT kernel and so there’s an existence proof that you can support both of these models in systems languages, I just don’t know of anyone who does.
The SEH model is designed to work in systems contexts. Unlike the Itanium model (used everywhere except Windows) it doesn’t require heap allocation. The throwing frame allocates the exception and metadata and then invokes the unwinder. The unwinder then walks the stack and invokes ‘funclets’ for each frame being unwound. A funclet is a function that runs on the top of the stack but with access to another frame’s stack pointer and so can handle all cleanup for that frame without actually doing the unwind. As with the Itanium model, this is a two-stage process, with the first determining what needs to happen on the unwind and the second running cleanup and catch logic.
This model is very flexible because (as with the Lisp and Smalltalk exception models) the stack isn’t destroyed until after the first phase. This means that you can build any kind of policy on top quite easily.
Oh yes, that reminds me, Microsoft’s Annex K broken C library extensions have a runtime constraint handler that is vaguely like a half-arsed Lisp condition.
A two-phase exception-handling model is not strictly necessary to implement C++ language semantics, but it does provide some benefits. For example, the first phase allows an exception-handling mechanism to dismiss an exception before stack unwinding begins, which allows resumptive exception handling (correcting the exceptional condition and resuming execution at the point where it was raised). While C++ does not support resumptive exception handling, other languages do, and the two-phase model allows C++ to coexist with those languages on the stack.
Are you referring to some closed-source code here, or is the implementation source-available/open-source somewhere? I briefly looked at the microsoft/STL repo, and the exception handling machinery seems to be linked to vcruntime, which is closed-source AFAICT.
The SEH model is designed to work in systems contexts [..]
Thanks for the context, I haven’t seen a simple explanation of how SEH works elsewhere, so this is good to know. I have one follow-up question:
it doesn’t require heap allocation. The throwing frame allocates the exception and metadata
So the exception and metadata is statically sized (and hence space for it is already reserved on the throwing frame’s stack frame)? Or can it be dynamically sized (and hence there is a risk of triggering stack overflow when throwing)?
The same component is used in the NT kernel and so there’s an existence proof that you can support both of these models in systems languages, I just don’t know of anyone who does.
As Steve pointed out elsewhere in the thread, Rust pre-1.0 did support conditions and resumptions, but they removed it.
Are you referring to some closed-source code here, or is the implementation source-available/open-source somewhere?
I thought I read it in a public repo, but possibly it was a MS internal one.
So the exception and metadata is statically sized (and hence space for it is already reserved on the throwing frame’s stack frame)? Or can it be dynamically sized (and hence there is a risk of triggering stack overflow when throwing)?
The throwing context allocates the exception on the stack. The funclet can then use it in place. If it needs to persist beyond the catch scope, the funclet can copy it elsewhere.
This can lead to stack overflow (which is fun, because stack overflow is, itself, handled as an SEH exception).
I’ve only dabbled slightly with both - how is resuming from an error different from catching it? Is it that execution restarts right after the line that threw the error?
A program wants to write() something to a file, but – oops – the disk is full.
In ordinary languages, this means write() will simply fail, signal an error (via error code or exception or …), and unwind its stack.
In languages with resumable or restartable errors, something entirely different happens: write() doesn’t fail, it simply pauses and notifies its calling environment (i.e. outer, enclosing layers of the stack) that it has encountered a DiskIsFull situation.
In the environment, there may be programmed handlers that know how to deal with such a DiskIsFull situation. For example, a handler may try to empty the /tmp directory if this happens.
Or there may be no such handler, in which case an interactive debugger is invoked and presented to the human user. The user may know how to make space such as deleting some no longer needed files.
Once a handler or the user has addressed the DiskIsFull situation, it can tell write() to try writing again. Remember, write() hasn’t failed, it is still paused on the stack.
Well, now that space is available, write() succeeds, and the rest of the program continues as if nothing had happened.
Only if there is no handler that knows how to deal with DiskIsFull situations, or if the user is not available to handle the situation interactively, would write() fail conclusively.
Is it that execution restarts right after the line that threw the error?
Yes. Common Lisp and Smalltalk use condition systems, where the handler gets executed before unwinding.
So unwinding is just one possible option (one possible restart); other common ones are to start a debugger, to just resume, to resume with a value (useful to provide e.g. default values, or replacements for invalid values), etc. The signalling site can provide any number of restarts for the condition it signals.
It’s pretty cool in that it’s a lot more flexible, although because it’s adjacent to dynamic scoping it can make the program’s control flow much harder to grasp if you start using complex restarts or abusing conditions.
Exactly. For example “call with current continuation” or call-cc allows you to optionally continue progress immediately after the throw. It’s a generalization of the callback/continuation style used in async-await systems.
Even if you want to do stack unwinding, you don’t need continuations. Catch and throw are adequate operations to implement restarts that unwind the stack to some point first.
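For a sketch of that last point, here’s the same idea in Rust, with catch_unwind/panic_any standing in for Lisp’s catch and throw (the names and the AbortWrite marker are made up): a restart that unwinds is just a throw to a dynamically established catch point, and no continuation object is involved.

```rust
use std::panic::{catch_unwind, panic_any, resume_unwind, set_hook};

// Marker value "thrown" to a dynamically established catch point.
struct AbortWrite;

fn write_attempt() -> Result<(), ()> {
    // ... discovers the disk is full; the chosen restart is "abandon this
    // attempt", so "throw" back to the catch point by unwinding:
    panic_any(AbortWrite)
}

fn main() {
    // Silence the default "thread panicked" message for this demo.
    set_hook(Box::new(|_| {}));

    // The "catch": establishes the point the throw unwinds back to.
    match catch_unwind(|| write_attempt()) {
        Ok(_) => println!("write finished"),
        Err(payload) if payload.is::<AbortWrite>() => {
            println!("unwound to the restart point; free some space and retry here");
        }
        Err(payload) => resume_unwind(payload), // not our throw: re-raise it
    }
}
```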
Bunny is not an AWS reseller. It started as just a CDN and has gradually been expanding to other services. This is a marketing article though, probably doesn’t belong here.
On the Changelog podcast they presented some benchmarks of CDNs and Bunny beat the competition (Cloudflare) by a wide margin. So if they are reselling hardware, that would still be pretty impressive. I didn’t know them before, but their engineering seems sound. The name is a little odd though.
There are other cases to complain about too on this level, e.g. really efficient RCU that needs fences to work on some arches (e.g. liburcu-qsbr).
However, that being said, the current state of affairs with all related things is kind of a hot mess all over. It turns out the C/C++11 memory models are basically useless and broken anyways, and a lot of very smart people are still finding new problems with the whole way we think about all related things. In the meantime, things like ThreadSanitizer don’t handle fences well, and it’s reasonable to say that if you want to reduce/eliminate UB, maybe you shouldn’t allow constructs like fences that are going to cause a situation where you can’t even tell if UB is happening anymore, or on which architectures.
For the few use-cases where you really want to do this stuff, and you’re really sure it will work correctly on every targeted architecture (are you??), you’re probably going to drop into arch-specific (and likely, asm for at least some arches) code to implement the lower level of things like seqlocks or RCU constructs, where you’re free to make specific assumptions about the memory model guarantees of particular CPUs, and then others can just consume it as a library.
you’re probably going to drop into arch-specific (and likely, asm for at least some arches) code to implement the lower level of things like seqlocks or RCU constructs
The whole point of having it in the language is so you don’t have to implement fences for N cpu arches, and so the compiler doesn’t go behind your back and try to rearrange loads/stores.
Yes, ideally. But the C11 model isn’t trustworthy as a general abstraction in the first place. In very specific cases, on known hardware architectures, it is apparently possible to craft trustworthy, efficient mechanisms (e.g. seqlock, RCU, etc), as observed in e.g. the Linux kernel and some libraries like liburcu, which do not rely on the C11 memory model. But arguably, it is not possible to do so reliably and efficiently in an architecture-neutral way by staying out at the “C11 model” layer of abstraction. There is perhaps a safe subset of the C11 model where you avoid certain things (like fences which sequence relaxed atomics, and certain mis-uses of acquire/release), but you’re not gonna reach peak hardware efficiency in that subset.
The safest subset of the C11 model is just to stick to seq_cst ops on specific memory locations. The further you stray deeper, the more sanity questions arise. I “trust”, for example, liburcu, because it is very aware of low-level arch details, doesn’t rely on just the abstract C11 model, is authored and maintained by a real pro at this stuff, and has been battle-tested for a long time.
Given this state of affairs, IMHO in the non-kernel C world you expect the compiler or a very solid library like the above to implement advanced efficient constructs like seqlocks or RCU. It’s (IMHO) probably not sane to try to roll your own on top of the abstract C11 atomics model (in C or Zig, either way!) in a maximally-efficient way and just assume it will all be fine.
To be clear, the linux kernel and liburcu don’t use the C11 memory model. They have their own atomics and barriers implemented in assembler that predate C11. liburcu is largely a port of the linux kernel primitives to userland.
However, that being said, the current state of affairs with all related things is kind of a hot mess all over. It turns out the C/C++11 memory models are basically useless and broken anyways
I can’t speak to exactly what the parent comment is saying, but I do know memory_order_consume was finally removed in C++26, having never been implemented correctly despite several attempts to make it work since C++11 introduced it. It’s been a lot, and IIRC it, as well as a hardware issue, also affected the transactional memory technical specification.
There’s also been more than a few cases in the mid-late 10s of experts giving talks on the memory model at CppCon only for someone in the crowd to notice a bug that just derails the whole talk as everyone realizes the subject matter is no longer correct.
If you want the long version that re-treads some ground you probably already understand, there’s an amazingly deep 3-part series from a few years ago by Russ Cox that’s worth reading: https://research.swtch.com/mm .
If you want the TL;DR link path out of there to some relevant and important insights, you can jump down partway through part 2 around https://research.swtch.com/plmm#acqrel (and a little further down as well in https://research.swtch.com/plmm#relaxed ) to see Russ’s thoughts on this topic with some backup research. I’ll quote a lengthy key passage here:
See the paper for the details, but at a high level, the C++11 spec had some formal rules trying to disallow out-of-thin-air values, combined with some vague words to discourage other kinds of problematic values. Those formal rules were the problem, so C++14 dropped them and left only the vague words. Quoting the rationale for removing them, the C++11 formulation turned out to be “both insufficient, in that it leaves it largely impossible to reason about programs with memory_order_relaxed, and seriously harmful, in that it arguably disallows all reasonable implementations of memory_order_relaxed on architectures like ARM and POWER.”
To recap, Java tried to exclude all acausal executions formally and failed. Then, with the benefit of Java’s hindsight, C++11 tried to exclude only some acausal executions formally and also failed. C++14 then said nothing formal at all. This is not going in the right direction.
Disturbingly, 40+ years after the first relaxed-memory hardware was introduced (the IBM 370/158MP), the field still does not have a credible proposal for the concurrency semantics of any general-purpose high-level language that includes high-performance shared-memory concurrency primitives.
Even defining the semantics of weakly-ordered hardware (ignoring the complications of software and compiler optimization) is not going terribly well. A paper by Sizhuo Zhang and others in 2018 titled “Constructing a Weak Memory Model” recounted more recent events:
Sarkar et al. published an operational model for POWER in 2011, and Mador-Haim et al. published an axiomatic model that was proven to match the operational model in 2012. However, in 2014, Alglave et al. showed that the original operational model, as well as the corresponding axiomatic model, ruled out a newly observed behavior on POWER machines. For another instance, in 2016, Flur et al. gave an operational model for ARM, with no corresponding axiomatic model. One year later, ARM released a revision in their ISA manual explicitly forbidding behaviors allowed by Flur’s model, and this resulted in another proposed ARM memory model. Clearly, formalizing weak memory models empirically is error-prone and challenging.
The researchers who have been working to define and formalize all of this over the past decade are incredibly smart, talented, and persistent, and I don’t mean to detract from their efforts and accomplishments by pointing out inadequacies in the results. I conclude from those simply that this problem of specifying the exact behavior of threaded programs, even without races, is incredibly subtle and difficult. Today, it seems still beyond the grasp of even the best and brightest researchers. Even if it weren’t, a programming language definition works best when it is understandable by everyday developers, without the requirement of spending a decade studying the semantics of concurrent programs.
Do you mean that using @atomicStore/@atomicLoad on the lock’s sequence number with the same AtomicOrder for @fence would not be equivalent? If not, can you say more about why?
I mean stuff like Linux’s seqcount_t. Used for write-mostly workloads like statistics counting. To implement them you at least need a read barrier and a write barrier, since acquire/release operations put the barrier on the wrong side of the load/store.
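For anyone who hasn't run into the pattern, here is a rough sketch of the seqlock shape being described (in Rust rather than Zig or C, single writer, all names invented), with fences placed exactly where plain acquire/release operations can't go:

```rust
use std::sync::atomic::{fence, AtomicU32, AtomicU64, Ordering};

// Toy single-writer seqlock protecting one u64. The data is a relaxed atomic
// only so the racing reads are not UB; the real ordering comes from the fences.
struct SeqLock {
    seq: AtomicU32,
    data: AtomicU64,
}

impl SeqLock {
    fn write(&self, value: u64) {
        let s = self.seq.load(Ordering::Relaxed);
        // Odd sequence number = write in progress.
        self.seq.store(s.wrapping_add(1), Ordering::Relaxed);
        // Write barrier: the odd seq must become visible before the data store.
        // A release *store* of seq would order the wrong way (it only orders
        // earlier writes before it, not the data write that follows).
        fence(Ordering::Release);
        self.data.store(value, Ordering::Relaxed);
        // Back to even; the release store orders the data write before it.
        self.seq.store(s.wrapping_add(2), Ordering::Release);
    }

    fn read(&self) -> u64 {
        loop {
            let s1 = self.seq.load(Ordering::Acquire);
            if s1 & 1 != 0 {
                continue; // writer in progress, retry
            }
            let value = self.data.load(Ordering::Relaxed);
            // Read barrier: the data load must be ordered before the re-check
            // of seq; an acquire *load* below would only order later reads.
            fence(Ordering::Acquire);
            if self.seq.load(Ordering::Relaxed) == s1 {
                return value;
            }
        }
    }
}
```

Drop the two fences and there is no way to express this with only acquire/release on the individual loads and stores, which is the complaint above.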
I’m surprised it works on Windows, because the kernel docs suggest it shouldn’t. DRM’d media is sent to the GPU as an encrypted stream, with the key securely exchanged between the GPU and the server. It’s decrypted as a special kind of texture and you can’t (at least in theory) copy that back, you can just composite it into frames that are sent to the display (the connection between the display and GPU is also end-to-end encrypted, though I believe this is completely broken).
My understanding of Widevine was that it required this trusted path to play HD content and would downgrade to SD if it didn’t exist.
No one is going to create bootleg copies of DRM-protected video one screenshotted still frame at a time — and even if they tried, they’d be capturing only the images, not the sound
If you have a path that goes from GPU texture back to the CPU, then you can feed this straight back into something that recompresses the video and save it. And I don’t know why you’d think this wouldn’t give you sound: secure path for the sound usually goes the same way, but most things also support sound via other paths because headphones typically don’t support the secure path. It’s trivial to write an Audio Unit for macOS that presents as an output device and writes audio to a file (several exist, I think there’s even an Apple-provided sample that does). That just leaves you having to synchronise the audio and video streams.
I’m pretty sure that what Gruber is describing is basically just “hardware acceleration is not being enabled on many Windows systems”, but because he has his own little narrative in his head he goes on about how somehow the Windows graphics stack must be less integrated. Windows is the primary platform for so much of this stuff!
I would discount this entire article’s technical contents and instead find some other source for finding out why this is the case.
Well, it depends on the type of acceleration we're speaking of. But I've tried forcing hardware acceleration for video decode and, honestly, you'd be surprised how often it failed, and I did this on rather new hardware. It was actually shockingly unreliable. I'm fairly certain it's significantly worse if you extend your view to older hardware and other vendors.
I'm also fairly sure, judging by people's complaints, that throwing variable refresh rate, higher bit depths, and hardware-accelerated scheduling into the mix has resulted in neither flagship reliability nor performance.
It can be the primary platform but this doesn’t mean it’s good or always does what it should or promises it’ll do.
I think it means: enabling the feature to screenshot DRM protected media would not by itself enable piracy, since people would not use screenshots to pirate media frame at a time.
What you are saying reads like “one technical implementation of allowing screenshots would enable piracy.” I trust that you’re probably right, but that doesn’t contradict the point that people would not use that UI affordance itself for piracy.
No one would use screenshots for piracy because all the DRM is already cracked. Every 4k Netflix, Disney, etc, show is already on piracy websites, and they’re not even re-encoded from the video output or anything, it’s straight up the original h264 or h265 video stream. Same with BluRays.
Yup, if you go through GitHub there are several reverse-engineered implementations of Widevine, which just allow you to decrypt the video stream itself with no need to re-encode. That then moves the hard part to getting the key - fairly easy for the lower-security ones, since you can just root an Android device (and possibly even get it from Google's official emulator? At least it supports playing Widevine video!). The higher-security ones are hardcoded into secure enclaves on the GPU/CPU/video decoder, but clearly people have found ways to extract them - those no-name TV streaming boxes don't exactly have a good track record of security, so if I were to guess, that's where they're getting the keys.
Still, no point blocking screenshots - pirates are already able to decrypt the video file itself which is way better than reencoding.
Those no-name TV streaming boxes usually use the vendor’s recommended way to do it, which is mostly secure, but it’s not super-unusual for provisioning data to be never deleted off the filesystem, even on big brand devices.
The bigger issue with the DRM ecosystem is that all it takes is for one secure enclave implementation to be cracked, and they have a near infinite series of keys to use. Do it on a popular device, and Google can’t revoke the entire series either.
Personally, I’m willing to bet the currently used L1 keys have come off Tegra based devices, since they have a compromised boot chain through the RCM exploit, as made famous by the Nintendo Switch.
Using random strings (e.g., generated by UUID.randomUUID().toString()) as primary keys is generally a bad idea for performance reasons:
Slower Comparisons: String comparisons are inherently slower than numeric comparisons. Databases can compare integers much more efficiently at the hardware level.
First of all, you should not store UUIDs as strings. As fs111 mentioned, UUIDs are 16 bytes, so they fit into a uint128_t. Postgres supports UUIDs natively, but even if your database does not support 128-bit integers, if you encode it in CHAR(16), databases will be quite fast at comparing them.
Also, snowflake IDs are basically superseded by UUID7. You get a bigger space than snowflake IDs, and all the goodies of snowflake IDs and UUID4.
UUIDv7 requires generating a reasonably good-quality random number every time you generate an ID, while Snowflake only needs to increment a counter, so it doesn't seem like a straight upgrade. One could imagine a 128-bit snowflake-like format: say, a 48-bit timestamp, a 64-bit worker identifier (which can be randomly generated), and a 16-bit counter.
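For concreteness, a minimal sketch of such a generator (in Rust; the field widths are just the ones floated above, nothing standard):

```rust
use std::sync::atomic::{AtomicU16, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

// Hypothetical 128-bit layout: 48-bit ms timestamp | 64-bit worker id | 16-bit counter.
static COUNTER: AtomicU16 = AtomicU16::new(0);

fn snowflake128(worker_id: u64) -> u128 {
    // Keep the low 48 bits of the millisecond timestamp.
    let millis = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before 1970")
        .as_millis() as u64
        & 0xFFFF_FFFF_FFFF;
    // Per-process counter; wraps at 65536, which only matters if one worker
    // mints more than 2^16 IDs in a single millisecond.
    let count = COUNTER.fetch_add(1, Ordering::Relaxed);
    ((millis as u128) << 80) | ((worker_id as u128) << 16) | (count as u128)
}
```

The timestamp in the top bits keeps the IDs roughly time-sortable, like UUIDv7, while the only randomness needed is the one-time worker identifier; whether 64 random bits are enough to avoid worker collisions is the trade-off discussed in the replies below.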
Fair. But I would argue that for most applications, the probability of generating two identical random sequences of 48 bits on the same timestamp is quite low.
Also, nothing guarantees that two sequence numbers won't conflict on the same timestamp; what prevents conflicts is the worker identifier, which you want to randomly generate in your proposal. How is that different from generating the entire remaining 74 bits (like UUIDv7)?
But you're right, UUIDv7 does not guarantee full uniqueness; I would just disregard the probability as "too low", like for UUIDv4.
This is the first release of Fish after the re-write into Rust. While it’s mostly very similar to the previous version of Fish, there are a few new features and a couple of breaking changes.
Instead of using async to implement sans-io state machines, I've just started exploring the idea of simply having async IO traits that can be implemented by both non-blocking stuff (like Tokio's IO types) and blocking stuff (like the standard library IO types). In the latter case, .await would simply block until the IO was done, and then it would return Poll::Ready(result).
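A minimal sketch of that shape (hypothetical trait, not any particular crate's API): the blocking implementation just does the IO inside poll and is always immediately ready.

```rust
use std::io::Read;
use std::pin::Pin;
use std::task::{Context, Poll};

// Hypothetical async-shaped read trait that both non-blocking and blocking
// types could implement.
trait AsyncReadish {
    fn poll_read(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
        buf: &mut [u8],
    ) -> Poll<std::io::Result<usize>>;
}

// Blocking adapter around any std::io::Read: "await" simply blocks until the
// read is done and then reports Ready, so an executor never sees Pending.
struct BlockingIo<R: Read>(R);

impl<R: Read + Unpin> AsyncReadish for BlockingIo<R> {
    fn poll_read(
        self: Pin<&mut Self>,
        _cx: &mut Context<'_>,
        buf: &mut [u8],
    ) -> Poll<std::io::Result<usize>> {
        // std::io::Read::read blocks here; by the time it returns we always
        // have a final result to hand back.
        Poll::Ready(self.get_mut().0.read(buf))
    }
}
```

An executor driving this never actually observes Pending from the blocking implementation, which is exactly the ".await just blocks" behaviour described above.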
RISC-V is like quantum…always just a few years away
I think this is completely untrue. RISC-V is real, exists, works, there are hardware products being built, embedded into mass produced devices, etc.
It’s just in the space that most of us are mostly interested in - modern high performance CPUs - the instruction set is maybe 1% of the story. Modern CPUs are one of the most elaborated artifacts of human engineering and result of decades of research and improvements, part of a huge industrial and economical base built around it. That’s not something that can be significantly altered rapidly.
Just look how long it took Arm to get into high performance desktop CPUs. And there was big and important business behind it, with strategy and everything.
They’re not asking for high-performance desktop CPUs here though. Judging by the following statement on IRC:
it sounds like anything that even approaches a current day midrange phone SoC would be enough. RPi5 is 4x Cortex-A76, which is about a midrange phone SoC in 2021.
Last I checked, the most commonly recommended RISC-V boards were slower than a Raspberry Pi 3, and the latest and greatest and most expensive boards were somewhere in the middle between RPi 3 and 4. So yeah, pretty slow.
Ah, roughly comparable to a Sun E450 then
Beyond microcontrollers, I really haven’t seen anything remotely usable. I’d love to be wrong though.
I tried to find a Pi Zero replacement, but the few boards available all had terrible performance. The one I bought in the end turned out has an unusable draft implementation of vector instructions and it’s significantly slower than any other Linux board I’ve ever used, from IO to just CPU performance. Not to mention the poor software support (I truly despise device trees at this point).
Just for a data point, the RP2350 chip in the Raspberry Pi Pico 2 includes a pair of RV32IMACZb* cores as alternates for the main Cortex-M33 cores: you can choose which architecture to boot into. The Pico 2 costs $5 in quantities of 1.
I use Niri and I wholeheartedly agree that Niri is doing much better job technically than the clunky quirky C code in wlroots and Sway. It’s also a nicer user experience overall. Niri is only getting better by the day, leaving stagnating Sway C code in the dust.
Still, I somewhat miss the UX of “tabbed/stacked layouts with nested containers, the least ergonomic Band-Aid™ for the space issue I’ve ever seen” and I don’t agree that containers are a band-aid, even though I agree that they bring their own cognitive load.
On Sway I was never actually using more than five workspaces. A laptop screen + a big 4k screen + tabbed containers were actually a solution to the proliferation of workspaces: I kept a small number of workspaces with an arbitrary number of temporary windows in a tabbed container for a task that I could manipulate as one unit (closing, moving to another workspace, etc). I had to adjust to avoid multiplying windows in a Niri workspace, because I quickly lose track of invisible windows beyond the edges of screen (and Waybar has not been cooperative to display icons only from the current workspace), while on Sway I could see window decorations and titles for every top-level window/container in the workspace at the same time.
Another little annoyance is that I like that Sway workspaces have their own global namespace. I’m used to Sway workspaces 1 and 3 living on the laptop screen, 2 (and rarely 4) on my external monitor by default. When I unplug the external monitor, the workspace 2 (and maybe 4) just migrates to the laptop screen temporarily without changing the numbers, while on Niri workspaces 1 and 2 from the external monitor become something like 4 and 5 on the laptop screen.
Another feature that I’m waiting for is to be able to bind gestures to actions (swiping a workspace to another monitor can be nice).
Still, Niri is just awesome overall, I use it and I highly recommend it.
I’m gonna give Niri a try. Maybe it’ll even fix the bad perf and crashes I’ve had under Sway.
I agree on tabs. I used to have over 20 Sway workspaces, all with tabs. Recently Firefox has started having trouble when I have over 100 windows open, so I’ve had to cut back, and only have 9 workspaces, but I still have lots of tabs and I feel like they make it very quick and easy to find what I’m looking for. But I navigate by mouse. Whenever I happen to need to use the keyboard to navigate my sea of tabs inside tabs, it sucks. Maybe @eBPF always uses keyboard navigation and that’s why they dislike using tabs?
Maybe I’ll implement tabs for Niri if I miss them :laughing:
Edit: wait, Niri already has tabs, I just discovered while trying to get started. Though they're the opposite of what I was envisioning. Niri tabs stack windows inside a single spot in the horizontal infinite scroll, while what I was imagining was each tab being a container for a separate horizontal workspace.
Yes, it has windows stacked in columns, like in Xmonad.
Woah, actually there’s something new in https://github.com/YaLTeR/niri/issues/933, just a few days after the 25-01 release.
That new thing is what I was referring to :)
The stacking seems only marginally useful to me.
This does seem interesting – however, this isn’t anything new as a concept.
It’s been around on X11 for years as virtual desktop support. What makes this “interesting” is how applications are arranged.
Now, under Wayland, this approach is easy as there are fewer restrictions around how applications are sized. I hope this changes…
Since when did X11 virtual desktops extend infinitely? As far as I remember, every virtual desktop implementation in X11 WMs was just separate workspaces you could switch between.
The old twm window manager has a big canvas virtual desktop instead of little boxes. Fvwm can be configured either way https://www.fvwm.org/Man/fvwm3/#_the_virtual_desktop
Wow, it would be a huge endorsement of uutils if a big distro like Ubuntu switched to it.
I need to get around to filing bug reports for the minor problems I’ve had.
I disagree with this; my tiling WM doesn't force anything on me, it's just a matter of how you use it, or what problems you want to solve. My setup takes away exactly that cognitive load - one of the main problems with my current macOS setup is that I need to think about window placement all the time.
That said, I’m planning to give niri an honest trial soon, not because I am fed up with any specific tiling WM, but I have a new machine that needs a setup anyway.
As someone who 100% agrees with the author here (to the point where I finally gave up on tiling after about five years of using it), let me explain it differently: if you're tiling, every new window you open forces you to make a decision about it. Whether that window is ephemeral or not, whether it belongs in this space or not, its showing up will make every other window in the space reflow, i.e. move around or change sizes. This is a core feature of tiling! So you end up doing things like the current top reply to your comment where you keep one full window per screen, or at most two, and if you accidentally open a new window even for a microsecond that's going to thrash your layout. The moment I realized tiling was actively hurting me was when, upon reflection, I found out that I almost had muscle memory for opening a new window and making it full screen so my existing layout didn't explode.
I think this problem is specific to dynamic tiling. It’s one of the reasons why I never managed to get along with i3 (or Sway). I got along with wmii to some degree back in the day. I don’t recall the details, I vaguely remember its stacking feature being a little less awkward than i3’s but I don’t remember exactly why; I do remember that it was equally annoying to switch among windows though (I routinely had to open like a dozen datasheets, and then I’d have to switch between them by patiently bringing each one into view).
Manual tiling has always been the way to go for me back when I used a tiling WM. Ratpoison (which I used), and @jcs' own sdorfehs, trivially fix this problem: opening a new window never causes reflow, it always opens exactly where you want it (actually, it always opens exactly where your eyes already are, because you've switched to that container), and you get instant full screen vs. tiling by just moving a window to its own workspace/back to its original workspace.
Or at least that’s how it used to work at some point, I’m not sure if this is still the case – I haven’t used a tiling WM in years, I have a large monitor and tiling WMs are super annoying (if I tile two windows side-by-side they’re too wide to read comfortably, and making one narrower makes the other one even wider; if I tile 3 or more, depending on what’s in them, at least one of them is now too narrow to be useful).
This is exactly what Niri solves. Every window opens at its preferred width. Niri doesn’t unnecessarily insist on filling up your whole screen. With every new window, your screen fills up until the furthest to the left scrolls out of view.
Indeed, although (with the usual "to each his own" caveat) I honestly just… prefer the stacking solution here. I played with Niri a few weeks (or months?) ago and I can't say I was a fan, because it just introduces the opposite problem – if I open three windows, one of them is either too narrow to be useful, or off-screen. If it's off-screen, I have to scroll left/right to get to it, and the associated animation (which is kind of necessary for spatial cues) gets tiring pretty fast.
I liked ratpoison for a bunch of other reasons that go well with manual tiling though. E.g. window switching is extremely fast and works well with “local” muscle memory.
When I have to use X, I use notion (no not that one). It just Works and I can have a consistent layout.
I’ve been using Sway for a few years now, and I still miss Notion. I set each Sway region to tabbed, which gives a similar effect, but you have to do it manually after booting and keep at least one window in each region, which is slightly annoying. And sometimes I press the wrong key and the layout gets messed up in some confusing way.
Looking at the Notion site again, I see it links to volare, a fork of Sway aiming to make it more Notion like. Just trying it out now… looks promising!
You may be interested in the following config option:
workspace_layout tabbed. It makes every new container tabbed at creation.
Thanks, that looks useful!
I have Sway configured to always make containers tabbed, so there’s never reflow when I open a window.
I’m gonna give Niri a try though, seems neat.
I came here to reply to very same quote: I practically always keep 1 window per screen, full screen. No cognitive load involved here lol
This isn’t feasible for me since i have a lot of things open at once, and I run out of workspace keybinds. I also do like being able to see two/three things at once.
As someone who has been using tiling WMs for a long time, I also recommend a keybinding that lets you textcomplete a window.
Something like
rofi -show window -auto-select can really do wonders for navigating around. While it's nice to have a keybinding to jump to any workspace, you can get really far just jumping to the windows themselves.
I had swayr set up to help me find windows I had lost, and I put together a little script to do the same on Niri. It's useful, but it ended up being a last-resort thing unfortunately.
well, isn’t that literally the tiling wm forcing you to avoid reflows?
If it’s a terminal running tmux then that’s cheating ;-)
There are some tiling managers for macOS as well; yabai is my personal favorite currently!
I agree tiling managers have saved me from needing to care about window placement at all. I can simply focus on coding, which usually takes a max of 3 windows for me, all of which can fit fairly well on my super-wide monitors.
I bet there is a screen size difference for people here. Some may have less screen to work with. When using my 15” MBA I feel like the tiling manager isn’t as helpful as with a giant screen.
Interestingly while “coding” it’s less of a problem for me, unless I’m screensharing.
It’s copying stuff from slack or emails or JIRA tabs, that’s a lot more window switching.
It will be interesting to hear whether he thinks the focus on async supports foundational software. Maybe for a data plane, but probably not in any kernel code.
I’m not so sure…I’ve been doing some embedded Rust (which sort of sits at the same place on bare hardware as the kernel) and async has been really helpful for keeping the code organized. I think about it as essentially making state machines automatically for me. There is actually a tradeoff here because it’s almost too ergonomic — it’s easy to forget what it’s really doing and run into issues because something was a “local” variable (i.e., a field of the invisible state machine state) when you really needed to give it external ownership. But that didn’t take long to internalize.
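A tiny illustration of that mental model (plain std, no embedded HAL, nothing specific to any framework): the compiler turns an async fn into a state machine, and the "local" below is really a field of that invisible state.

```rust
use std::future::ready;

// The compiler rewrites this into an enum-like state machine; `total` is not a
// stack local at runtime but a field carried across the .await points, which is
// exactly the ownership surprise described above.
async fn sum_two() -> u32 {
    let mut total = 0;
    total += ready(1).await; // suspension point 1 (trivially ready here)
    total += ready(2).await; // suspension point 2
    total
}
```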
True that. I’ve been doing some STM32 hacks myself and the last one I did, I tried embassy. Surprisingly nice to have async on embedded.
My belief is that Rust’s async story shines particularly bright[0] when writing software for embedded systems, which Niko includes in his “foundational” category.
[0]: The whole complexity mess comes a lot from the fact that you want to provide zero-cost state machines for futures, without boxing nor a GC.
Cancellation is extra important in foundational code.
If I was making a brand new kernel today, it would probably be mostly async.
This reminded me about a throwaway paragraph in the Signal crypto review (previously):
I wonder what metadata resistance Signal offers that Wire, through its use of MLS, doesn’t?
The metadata resistance of Signal is largely mythical anyway, since they necessarily have the metadata via other channels and just pinky promise not to look at it or store it.
A source for this claim would be appreciated.
You can derive it from necessity if you like. The Signal server sees the message come in over a network connection from an app. The server must be able to deliver it to a target user. This is the metadata. That the message data on the wire doesn't contain this metadata doesn't prevent the server from knowing it; it must know it in order to function at all. Signal has never claimed otherwise; they only claim that the server forgets right away. But of course that must be taken on trust.
At best, that associates two IP addresses… notwithstanding CGNAT, VPNs, MASQUE, and friends.
But it doesn’t associate them with accounts / contacts. That’s a stronger guarantee than Matrix or XMPP. It may also be a stronger guarantee than Wire?
That isn’t true. Signal messages need to be routed by account identifier, an IP address is not sufficient. And unless you have the “sealed sender” feature turned on, messages identify their senders.
There’s no mechanism for the Signal server to know the IP addresses of iOS clients because an iOS device only maintains one persistent connection to Apple for notifications. There’s no way a Signal client can keep track of the IP addresses of its contacts, because it isn’t a mesh network, it’s a star. Even for non-iOS devices, an IP address isn’t sufficient to identify a client because (for example) there are multiple clients in our house and our house has only one IP address.
Sealed sender is enabled by default, no?
So it is. As far as I can tell the official documentation for the feature is still this blog post https://signal.org/blog/sealed-sender/ which makes it sound like the feature is incomplete, but the last few paragraphs say they were (in 2018) rolling it out to everyone so I guess the preview was actually the main event.
I just checked in settings. There’s only “show when it’s used” and “allow for even unknown senders” preferences for me, which makes me conclude that it’s already enabled by default and can not be disabled.
Sealed sender is also not a good protection if Signal was to actually start keeping logs. There are two sources of metadata leakage with sealed sender:
You need to acquire a sender certificate before you can use sealed sender. If you do this from the same IP as you later use when sending a message, your IP and your identity can be linked.
When you send a message, the receiver sends a delivery notice back to you. This is a simple correlation, a sealed message to Person A on IP address X from IP address Y is immediately followed by a sealed message from IP address X to Person B on IP address Y.
Yes, and if you do have Sealed Sender turned on, the only metadata left on the server that’s needed for message delivery is a 96-bit “delivery token” derived from a “profile key” that conveniently rotates whenever you block an account.
My reading of the description of sealed sender is that the delivery token is used to check that the sender is allowed to send to the recipient – it's an anti-abuse mechanism. It is used when the server is deciding whether to accept a message; it isn't used to decide where to deliver the message.
I was going off the above-linked blog post that dives into the Signal internals.
That is not my reading of the server code for either single or multi-recipient messages. And Signal iOS at least seems to use sealed sender by default, though it falls back to unsealed send if there's an auth failure, which seems bad (so the server can force the client to identify itself? … but I also can't find anywhere that throws RequestMakerUDAuthError.udAuthFailure, so maybe it's dead code…).
But I admit it's a very casual reading of the code!
edit: found it!
To say what the sibling comment says in a different way: the connection the message is delivered to the server over must be authenticated. If it weren't, the server would not accept the message, for spam reasons etc., so the server knows the account of the sender. And it needs to know the account of the receiver for delivery to be possible.
I strongly suspect you’ve misunderstood how Signal works. What do you think about https://soatok.blog/signal-crypto-review-2025-part-8/, specifically the addendum section?
That article specifically admits this is true. Signal chooses not to write it down (assuming the published code is what they run), which means it cannot be recovered after the fact (if you trust the server not to have recorded it). Of course, any other operator could also choose not to write this down, and one could choose to trust that operator. It's not specific to Signal, really.
I believe we agree that the server must know the recipient of a message. I believe we disagree about whether the server needs to know the sender of a message.
Erm, so what do you mean by authenticated?
That article notes the sender’s metadata is (e2e) encrypted. The server accepts and routes messages whose envelope includes a delivery token. And, similarly, that delivery token is shared via e2e encrypted sessions to all a recipient’s contacts.
It’s unclear to me how unknown senders / randos are handled, however. I haven’t read that deep into the code.
Sure, that’s fair.
But I was hoping your claim was more substantial than just this, since, as the child comment below says, almost all Signal competitors suffer from this.
Not just almost all. It is fundamentally impossible for a communications system to operate if whoever does the routing doesn’t know sender and receiver identity at some point (and send/receive time, which is also metadata)
If you do onion routing you could make it so only one party knows the sender and one party knows the receiver, which is how the remailer network worked, but that's the only instance I'm aware of that does this. Everyone else has the metadata and it's just various shades of promising not to write it down.
Aren’t there protocols for deniable drop offs on servers and similar? Those wouldn’t scale well, but AFAIK they work. So they are possible (just not practical).
There is SecureDrop, but as far as the technology is concerned it’s a web app accessed via Tor. The rest of the anonymity guarantees come from server-side opsec performed by the recipient org https://docs.securedrop.org/en/stable/what_is_securedrop.html
SimpleX is a chat system that does onion routing. Only two hops, and I am not vouching for anything about the app or its servers; just noting this feature.
They were also recently audited by Trail of Bits, so SimpleX is probably not clownshoes.
This level of metadata leakage (IP addresses) is also true of nearly every so-called Signal competitor too.
No one claimed otherwise. The context is the claim expressed above that you get worse metadata resistance than Signal, which seems irrelevant given that Signal doesn’t really have it either.
Sorry. I hear this line of argument on Hacker News and Reddit a lot, only for the person to turn around and recommend XMPP or Matrix instead. I wanted to cut it off at the pass.
Look at zkgroup for a deep dive into that question.
If I was ever looking for an ‘out’ I think this was a fair trigger. A link to … a todo on a git forge with nothing of substance that hasn’t already been said a decade+ ago? Did I misspell phoronix.com? …
I’m not sure if it belongs here, but it’s not a TODO, it’s a WIP merge request currently with almost 2000 lines of new code that you can compile and try.
What am I looking at?
pmeunier has an account here and would be most qualified to add some context, but I can add a bit here.
Pijul is a patch-based VCS, as opposed to snapshot-based, like Git, Mercurial, etc. The leading example of a patch-based VCS otherwise up to this point has been Darcs, written in Haskell. I haven’t used it much. Pijul’s main motivation over Darcs was algorithmic improvements that resolve the worst-case exponential time Darcs can run into; see the Why Pijul? and Theory pages for a bit more context. There’s supposed to be some theoretical soundness improvements as well over Darcs, but I don’t know as much about that.
Nest has been the main service for natively hosting Pijul projects with a web UI, made by the same team/developer. From what I remember, like git, a Pijul repository can be used remotely over SSH; Nest is more like having gitea/gitlab in addition to that.
I remember it being closed source with the intention of making it open source later; I think the rationale was around Nest still being alpha and not having the resources at the time to field it as a full open source project in addition to Pijul itself, though nest.pijul.com was available to host other open source projects with Pijul.
The news here would be that Nest’s been recently open sourced. As someone who’s been interested in Pijul but hasn’t had much opportunity to use it yet, this sounds like significant news that should make adoption more practical in general. Congrats to pmeunier and co., and thank you for your interesting and generous work in the VCS space!
As I recall, the problem is that patches in Darcs don’t necessarily commute. For all the nice math to work out, you want independent patches to commute, that is to say, applying patch A followed by patch B, should give you the same result as applying patch B first, followed by patch A. But patches aren’t guaranteed to do that in Darcs, and the only way to ensure this is to simply test pairs of patches by applying them and seeing if both orders give the same result.
In Pijul, if you have two patches that don’t depend on each other, they always commute. Either they don’t conflict, in which case the non-conflicting outcome is the same regardless of the order, or they do conflict, in which case you get the exact same conflict state regardless of the order.
I found these public posts about the history of this Nest implementation and the plans to open source it:
Great answer, nothing to add from me!
The reason it was closed-source wasn’t really by design, it was just that the service had accumulated a lot of tech debt after transitioning through way too many versions of the Rust async ecosystem (the Nest started in 2015). So, this is a marker of Pijul being mature enough that I was able to spend some time rewriting things using the (now) stabilised Rust libs such as Diesel and Axum. Also, Svelte is fun on the front-end but didn’t exist back then, I love how you can have the best of both worlds (static and dynamic).
Pijul is a version control system (alternative to git) descended from darcs, which is built around patches as opposed to snapshots (e.g. commits).
Nest is like gitea for pijul repositories, and this is the source code for nest hosted on the public instance of nest run by the pijul org.
I wouldn’t say “descended from Darcs” because that may give the wrong connotations. Pijul isn’t a fork of Darcs. Pijul has a rigorous mathematical foundation, unlike Darcs. They are conceptually related though, so I think it is clearer to say Pijul is inspired by Darcs.
From Why Pijul?
I started paying attention to Pijul many years ago. When it comes to systems that manage essential information, I tend to be biased in favor of systems with formal guarantees.
Unless it uses DMA (which that network device likely does). So you need an IOMMU as well. And of course the device must not be in cahoots with the driver, which is not necessarily the case on embedded systems like this where the vendor taped out the chip and wrote the driver.
Nope. Our DMA controller is capability aware, DMA does not bypass the memory safety of the system.
That counts as an iommu…
I don’t entirely agree, an IOMMU is a separate piece of hardware that needs to be programmed appropriately to allow DMA to go to separate regions and which also does address translation (many of the earliest ones were designed only for the translation, not security: they let you ship cheap 32-bit devices in 64-bit systems with more than 4 GiB of RAM). The programmer model involves multiple systems tracking permissions. In contrast, a CHERIoT DMA unit lets a compartment DMA to any memory that it can access and enforces the same rules as the core. If that’s an IOMMU then CHERI is an MMU.
but it doesn’t do any of the things that a memory management unit does
An (IO)MMU does two things: address translation, and restricting which memory a device is allowed to access.
If you make DMA controllers “capability aware” then they perform the second function of an (IO)MMU.
And of course the whole thing is moot if you are using a driver written by the chip vendor.
I was just in the process of pulling up the SeL4 FAQ about DMA to ask if the same restrictions applied to CHERIoT. They mention x86 VT-d and SystemMMU. But I guess if CHERI already has full control of the hardware (by being a hardware security implementation) they can fix that separately.
Also I think this is two things at the same time, it should be either “and so on” or “and friends”:
I have a few questions:
Would scheme-rs be a good choice for adding general purpose runtime scripting to Rust applications (in sync or async Rust)?
Would scheme-rs be suitable for running untrusted user provided scripts? This entails limiting what a script has access to, and limiting runtime and memory usage.
Would you say scheme-rs’s code is mature enough that 3rd parties could jump in and start contributing? Or is the project still in enough flux that it would be difficult?
Yes, specifically for async right now. Although I do want to provide a sync interface at some point.
To some extent. Limiting what the script has access to, absolutely. You get to decide what functions an environment can access. However there are no mechanisms for limiting memory usage or runtime at the moment.
Yes. Most of the architecture is fixed at this point. Some things will change (e.g. the value enum will eventually become opaque so that we can properly optimize it), but since I wrote this post a couple of people have begun to contribute to the project with no issues.
Great, I will try looking into adding a resource limiting feature :)
Very cool! This should certainly be possible, although obviously a bit of an effort.
Time and time again wlroots proves how solid it is as a project. Really outstanding work!
It's just a shame that Wayland didn't dare to define such things on the protocol level in the first place. I mean, given the rock-solid colour space support in macOS, any sane engineer designing a new display manager/compositor in the 2010s would have put colour management in as a design centerpiece. Libraries like Little CMS prove that you don't even need to do much in terms of colour transformations by hand; simply define your surfaces in a sufficiently large working colour space and do the transformations ad-hoc.
From what I remember back then, the only thing the Wayland engineers seemed to care about was going down to the lowest common denominator and ‘no flickering’ (which they saw in X in some cases).
For instance, it is not possible to portably place an application window 'at the top', since one may not dare to assume this capability even though 99.99% of all displays support it. It would have made more sense to have 'feature flags' for displays or stricter assumptions about the coordinate space.
In the end, a Wayland compositor requires close to 50,000 LOC of boilerplate, which wlroots gracefully provides, and this boilerplate is fragile as you depend on proprietary interfaces and extensions. You can write a basic X display manager in 500 LOC based only on the stable X libraries. With all of X's flaws, this is still a strong point today.
This instinctually bothers me too, but I don’t think it’s actually correct. The reason that your X display manager can be 500 LOC is because of the roughly 370 LOC in Xorg. The dominance of wlroots feels funny to me based on my general dislike for monocultures, but if you think of wlroots as just “the guts of Xorg, but in ‘window manager userland’”, it actually is not that much worse than Xorg and maybe even better.
I think you mean 370k LOC.
Yes indeed, my bad.
I don’t really get your criticism. Wayland is used on a lot of devices, including car displays and KIOSK-like installations. Does an application window even make sense if you only have a single application displayed at all times? Should Wayland not scale down to such setups?
Especially that it has an actually finely-working extension system so that such a functionality can be trivially added (either as a standard if it’s considered widely useful, or as a custom extension if it only makes sense for a single server implementation).
A Wayland compositor's 50,000 LOC is the whole thing. It's not boilerplate, it's literally a whole display server communicating in a shared "language" with clients, sitting on top of core Linux kernel APIs. That's it. Your 500 LOC comparison under X is just a window manager plugin; the fact that it operates as a separate binary doesn't change that it is essentially the same as a tiling window manager plugin for Gnome.
Then it would have taken 2× as long to get it out of the door and gain any adoption at all.
Routine reminder that the entire F/OSS ecosystem worth of manpower and funding is basically a rounding error compared to what Apple can pour into macOS in order to gain “rock-solid colour space support” from day zero.
What do you mean by this? I can’t understand it.
I find it slightly odd that an "epic treatise on error models" would fail to mention Common Lisp and Smalltalk, whose error models provide a facility that all others lack: resuming from an error.
Hi, author here, the title also does say “for systems programming languages” :)
For continuations to work in a systems programming language, you can probably only allow one-shot delimited continuations. It’s unclear to me as to how one-shot continuations can be integrated into a systems language where you want to ensure careful control over lifetimes. Perhaps you (or someone else here) knows of some research integrating ownership/borrowing with continuations/algebraic effects that I’m unfamiliar with?
The closest exception to this that I know of is Haskell, which has support for both linear types and a primitive for continuations. However, I haven’t seen anyone integrate the two, and I’ve definitely seen some soundness-related issues in various effect systems libraries in Haskell (which doesn’t inspire confidence), but it’s also possible I missed some developments there as I haven’t written much Haskell in a while.
I’m sorry for the slightly snarky tone of my original reply, but even if you were to discount the Lisp machines, or all the stuff Xerox and others did with Smalltalk (including today’s Croquet), as somehow not being systems, I would have expected an epic treatise to at least mention that error resumption exists – especially since academia is now rediscovering this topic as effect handlers (typically without any mention of the prior art).
This misconception is so common (and dear to my heart) that I have to use bold:
Resumable exceptions do not require first-class continuations, whether delimited or undelimited, whether one-shot or multi-shot. None at all. Nada. Zilch.
To take the example I posted earlier about writing to a full disk: https://lobste.rs/s/az2qlz/epic_treatise_on_error_models_for_systems#c_ss3n1k
Suppose write() discovers that the disk is full (e.g. from an underlying primitive). This causes it to call signal_disk_is_full(). Note that the call to signal_disk_is_full() happens inside the stack of write() (obviously).
Now signal_disk_is_full() looks for a handler and calls it: disk_is_full_handler(). Again, the call to the handler happens inside the stack of signal_disk_is_full() (and write()). The handler can return normally to write() once it has cleaned up space. write() is never popped off the stack. It always stays on the stack. IOW, there is never a need to capture a continuation, and never a need to reinstate one. The disk_is_full_handler() runs inside the stack of the original call to write().
A side note: most effect systems do use and even require first-class continuations, but IMO that's completely overkill and only needed for rarely used effects like nondeterminism. For simple effects, like resumable exceptions, no continuations are needed whatsoever.
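To make the shape of this concrete, here is a deliberately tiny Rust sketch of that control flow (a thread-local handler and a plain function pointer; every name is invented for the example): the handler runs while write_block() is still on the stack, and write_block() simply retries once space has been freed.

```rust
use std::cell::Cell;

thread_local! {
    // Toy "disk": free bytes remaining.
    static DISK_FREE: Cell<usize> = Cell::new(0);
    // Innermost handler for the DiskIsFull condition, if any.
    static DISK_FULL_HANDLER: Cell<Option<fn()>> = Cell::new(None);
}

// Signalling does NOT unwind: the handler is called on top of the
// signaller's stack and simply returns when it is done.
fn signal_disk_is_full() {
    match DISK_FULL_HANDLER.with(|h| h.get()) {
        Some(handler) => handler(),
        None => panic!("unhandled DiskIsFull condition"),
    }
}

fn write_block(len: usize) {
    // Pause and ask the environment for help until the write can proceed;
    // this frame is never popped while that happens.
    while DISK_FREE.with(|d| d.get()) < len {
        signal_disk_is_full();
    }
    DISK_FREE.with(|d| d.set(d.get() - len));
    println!("wrote {len} bytes");
}

// A handler that "empties /tmp": it frees space while write_block is still
// paused further down the stack, then returns normally into it.
fn cleanup_tmp() {
    DISK_FREE.with(|d| d.set(d.get() + 4096));
    println!("handler: emptied /tmp");
}

fn main() {
    DISK_FULL_HANDLER.with(|h| h.set(Some(cleanup_tmp)));
    write_block(1024); // signals DiskIsFull once, then resumes and succeeds
}
```

No continuation is captured or reinstated anywhere; the only dynamic part is looking up the innermost handler.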
I provided the working definition of "systems programming language" that I used in the blog post. It's a narrow one for sure, but I have to put a limit somewhere. My point is not to exclude the work done by smart people, but I need a stopping point somewhere after 100~120 hours of research and writing.
Thank you for writing down a detailed explanation with a concrete example. I will update the post with some of the details you shared tomorrow.
You will notice that my comment does not use the phrase “first-class” anywhere; that was deliberate, but perhaps I should’ve been more explicit about it. 😅
As I see it, the notion of a continuation is that of a control operator, which allows one to “continue” a computation from a particular point. So in that sense, it’s a bit difficult for me to understand where exactly you disagree, perhaps you’re working with a different definition of “continuation”? Or perhaps the difference of opinion is because of the focus on first-class continuations specifically?
If I look at Chapter 3 in Advances in Exception Handling Techniques, titled 'Condition Handling in the Lisp Language Family' by Kent M. Pitman, it states:
So it might be the case that the mismatch here is largely due to language usage, or perhaps my understanding of continuations is lacking.
I’m also a little bit confused as to why your current comment (and the linked blog post) focus on unwinding/stack representation. For implementing continuations, there are multiple possible implementation strategies, sure, and depending on the exact restrictions involved, one can potentially use more efficient strategies. If a continuation is second-class in the sense that it must either be immediately invoked (or discarded), it makes sense that the existing call stack can be reused.
Regardless of the specifics of whether we can call Common Lisp style conditions and resumption a form of continuations or not, I believe the concern about non-local control flow interacting with type systems and notions of ownership/regions/lifetimes still applies.
Typically, there are two notions of continuations:
1. Continuations as an explanatory or semantic concept. E.g. consider the expression f(x + y). To evaluate this, we first need to compute x + y. At this point our continuation is f(_), where _ is the place into which we will plug the result of x + y. This is the notion of a continuation as "what happens next" or "the rest of the program".
2. Continuations as an actually reified value/object in a programming language, i.e. first-class continuations. You can get such a first-class continuation e.g. from Scheme's call/cc or from delimited control operators. This typically involves copying or otherwise remembering some part of the stack on the part of the language implementation.
Resumable exceptions have no need for first-class continuations (2). Continuations as an explanatory concept (1) of course still apply, but only because they apply to every expression in a program.
The example I used has no non-local control flow at all.
write() calls signal_disk_is_full() and that calls the disk_is_full_handler(), and that finally returns normally to write(). This is my point: resumption does not require any non-local control flow.
As well as what @manuel wrote, it's worth noting that basically every language has second-class continuations: a return statement skips to the current function's continuation.
Your comment talked about one-shot delimited continuations, which are a kind of first-class continuation in that (per Strachey’s definition of first vs second class) they can be assigned to variables and passed around like other values.
In most languages, a return statement cannot be passed as an argument to a function call. So is it still reasonable to call it "support for a second-class continuation"?
I understand your and @manuel’s points that the common usage may very well be that “one-shot delimited continuation” implies “first-class” (TIL, thank you).
We can make this same point about functions where generally functions are assumed to be first class. However, it’s not unheard of to have second-class functions (e.g. Osvald et al.’s Gentrification gone too far? and Brachthäuser et al.’s Effects, Capabilities, and Boxes describe such systems). I was speaking in this more general sense.
As I see it, the “one-shot delimited” aspect is disconnected from the “second class” aspect.
That you can’t pass it as an argument is exactly why it’s called second-class. Only a first-class continuation is reified into a value in the language, and therefore usable as an argument.
One-shot strongly implies a first-class continuation. Second-class continuations are always one-shot, since, again, you can’t refer to them as values, so how would you invoke one multiple times?
Here is the wording from Strachey’s paper, as linked by @fanf
Isn’t this “except in the case of a formal parameter” exactly what is used by Osvald et al. and Brachthäuser et al. in their papers? Here is the bit from Osvald et al.’s paper:
In the body of withFile, fn is guaranteed to have several restrictions (it cannot be escaped, it cannot be assigned to a mutable variable, etc.). But the type system (as in the paper) cannot prevent the implementation of withFile from invoking fn multiple times. That would require an additional restriction – that fn can only be invoked 0-1 times in the body of withFile.
@manuel wrote most of what I was going to (thanks, @manuel!) but I think it's worth quoting the relevant passage from Strachey's fundamental concepts in programming languages
That’s a concern, sure, but most “systems” languages have non-local control flow, right? C++ has exceptions, and Rust panics can be caught and handled. It would be very easy to implement a Common Lisp-like condition system with nothing more than thread local storage, function pointers (or closures) and catch/throw.
(And I'm pretty sure you can model exceptions / anything else that unwinds the stack as essentially being a special form of "return", and handle types, ownership, and lifetimes just the same as you do with the ? operator in Rust)
My point is not about ease of implementation, it's about usability when considering type safety and memory safety. It's not sufficient to integrate a type system with other features – the resulting thing needs to be usable…
I’ve added a section at the end, Appendix A8 describing the concrete concerns.
Early Rust did have conditions and resumptions (as Steve pointed out elsewhere in the thread), but they were removed because of usability issues.
If you dig into the code a bit, you discover that SEH on Windows has full support for Lisp-style restartable and resumable exceptions in the lower level; they just aren't exposed in the C/C++ layer. The same component is used in the NT kernel, and so there's an existence proof that you can support both of these models in systems languages; I just don't know of anyone who does.
The SEH model is designed to work in systems contexts. Unlike the Itanium model (used everywhere except Windows) it doesn’t require heap allocation. The throwing frame allocates the exception and metadata and then invokes the unwinder. The unwinder then walks the stack and invokes ‘funclets’ for each frame being unwound. A funclet is a function that runs on the top of the stack but with access to another frame’s stack pointer and so can handle all cleanup for that frame without actually doing the unwind. As with the Itanium model, this is a two-stage process, with the first determining what needs to happen on the unwind and the second running cleanup and catch logic.
This model is very flexible because (as with the Lisp and Smalltalk exception models) the stack isn’t destroyed until after the first phase. This means that you can build any kind of policy on top quite easily.
Oh yes, that reminds me, Microsoft’s Annex K broken C library extensions have a runtime constraint handler that is vaguely like a half-arsed Lisp condition.
Yes. However, even the Itanium model supports it: https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html
Are you referring to some closed-source code here, or is the implementation source-available/open-source somewhere? I briefly looked at the microsoft/STL repo, and the exception handling machinery seems to be linked to vcruntime, which is closed-source AFAICT.
Thanks for the context; I haven't seen a simple explanation of how SEH works elsewhere, so this is good to know. I have one follow-up question:
So the exception and metadata are statically sized (and hence space for them is already reserved in the throwing function's stack frame)? Or can they be dynamically sized (and hence there is a risk of triggering stack overflow when throwing)?
As Steve pointed out elsewhere in the thread, Rust pre-1.0 did support conditions and resumptions, but they were removed.
To be clear, I don’t doubt whether you can support it, the question in my mind is whether can you support it in a way that is usable.
I thought I read it in a public repo, but possibly it was a MS internal one.
The throwing context allocates the exception on the stack. The funclet can then use it in place. If it needs to persist beyond the catch scope, the funclet can copy it elsewhere.
This can lead to stack overflow (which is fun because stack overflow is, itself, handled as an SEH exception).
You don’t need continuations for resumable errors. https://lobste.rs/s/az2qlz/epic_treatise_on_error_models_for_systems#c_9efawr
Incidentally, Rust had conditions long ago. They were removed because users preferred Result.
Is there any documentation or code examples of how they worked?
https://github.com/rust-lang/rust/issues/9795 Here’s the bug about removing them. There was some documentation in those early releases, I don’t have the time to dig right now.
I’ve only dabbled slightly with both - how is resuming from an error different from catching it? Is it that execution restarts right after the line that threw the error?
Consider the following:
A program wants to
write() something to a file, but – oops – the disk is full. In ordinary languages, this means
write() will simply fail, signal an error (via error code or exception or …), and unwind its stack. In languages with resumable or restartable errors, something entirely different happens:
write() doesn't fail, it simply pauses and notifies its calling environment (i.e. outer, enclosing layers of the stack) that it has encountered a DiskIsFull situation.
DiskIsFullsituation. For example, a handler may try to empty the/tmpdirectory if this happens.Or there may be no such handler, in which case an interactive debugger is invoked and presented to the human user. The user may know how to make space such as deleting some no longer needed files.
Once a handler or the user has addressed the
DiskIsFullsituation, it can tellwrite()to try writing again. Remember,write()hasn’t failed, it is still paused on the stack.Well, now that space is available,
write()succeeds, and the rest of the program continues as if nothing had happened.Only if there is no handler that knows how to deal with
DiskIsFullsituations, or if the user is not available to handle the situation interactively, wouldwrite()fail conclusively.Yes. Common Lisp and Smalltalk use condition systems, where the handler gets executed before unwinding.
So unwinding is just one possible option (one possible restart), other common ones are to start a debugger, to just resume, to resume with a value (useful to provide e.g. default values, or replacement for invalid values), etc… the signalling site can provide any number of restart for the condition they signal.
It’s pretty cool in that it’s a lot more flexible, although because it’s adjacent to dynamic scoping it can make the program’s control flow much harder to grasp if you start using complex restarts or abusing conditions.
Exactly. For example “call with current continuation” or call-cc allows you to optionally continue progress immediately after the throw. It’s a generalization of the callback/continuation style used in async-await systems.
(There’s also hurl, which I think was intended as an esolang but stumbled upon something deep (yet already known): https://ntietz.com/blog/introducing-hurl/)
You don’t need continuations to implement resumable errors. The trick is simply to not unwind the stack when an error happens. I wrote an article about how it works a while ago: http://axisofeval.blogspot.com/2011/04/whats-condition-system-and-why-do-you.html
Even if you want to do stack unwinding, you don’t need continuations. Catch and throw are adequate operations to implement restarts that unwind the stack to some point first.
Ah thanks - that’s very informative.
Looks like spam. Another AI generated article from AWS resellers.
Bunny is not an AWS reseller. It started as just a CDN and has gradually been expanding to other services. This is a marketing article though, probably doesn’t belong here.
On the Changelog podcast they presented some benchmarks of CDNs and Bunny beat the competition (Cloudflare) by a wide margin. So if they are reselling hardware that would still be pretty impressive. I didn‘t know them before, but their engineering seams sound. The name is a little odd though.
Fences removed with no alternatives, neat. Stuff like seqlocks are now less performant with no recourse (except to do them in C…).
At first I was rolling my eyes at this comment, but I have to say, I’m actually really appreciating the ensuing discussion, so thank you (sincerely).
There are other cases to complain about too on this level, e.g. really efficient RCU that needs fences to work on some arches (e.g. liburcu-qsbr).
However, that being said, the current state of affairs with all related things is kind of a hot mess all over. It turns out the C/C++11 memory models are basically useless and broken anyways, and a lot of very smart people are still finding new problems with the whole way we think about all related things. In the meantime, things like ThreadSanitizer don’t handle fences well, and it’s reasonable to say that if you want to reduce/eliminate UB, maybe you shouldn’t allow constructs like fences that are going to cause a situation where you can’t even tell if UB is happening anymore, or on which architectures.
For the few use-cases where you really want to do this stuff, and you’re really sure it will work correctly on every targeted architecture (are you??), you’re probably going to drop into arch-specific (and likely, asm for at least some arches) code to implement the lower level of things like seqlocks or RCU constructs, where you’re free to make specific assumptions about the memory model guarantees of particular CPUs, and then others can just consume it as a library.
The whole point of having it in the language is so you don’t have to implement fences for N cpu arches, and so the compiler doesn’t go behind your back and try to rearrange loads/stores.
Yes, ideally. But the C11 model isn’t trustworthy as a general abstraction in the first place. In very specific cases, on known hardware architectures, it is apparently possible to craft trustworthy, efficient mechanisms (e.g. seqlock, RCU, etc), as observed in e.g. the Linux kernel and some libraries like liburcu, which do not rely on the C11 memory model. But arguably, it is not possible to do so reliably and efficiently in an architecture-neutral way by staying out at the “C11 model” layer of abstraction. There is perhaps a safe subset of the C11 model where you avoid certain things (like fences which sequence relaxed atomics, and certain mis-uses of acquire/release), but you’re not gonna reach peak hardware efficiency in that subset.
The safest subset of the C11 model is just to stick to seq_cst ops on specific memory locations. The further you stray from that, the more sanity questions arise. I “trust”, for example, liburcu, because it is very aware of low-level arch details, doesn’t rely on just the abstract C11 model, is authored and maintained by a real pro at this stuff, and has been battle-tested for a long time.

Given this state of affairs, IMHO in the non-kernel C world you expect the compiler or a very solid library like the above to implement advanced efficient constructs like seqlocks or RCU. It’s (IMHO) probably not sane to try to roll your own on top of the abstract C11 atomics model (in C or Zig, either way!) in a maximally-efficient way and just assume it will all be fine.
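For illustration, that “boring subset” looks something like this (a sketch in Rust, whose atomics follow the same C11-style model): every shared location is touched only through seq_cst operations on that location, with no standalone fences. It costs more than a tuned relaxed-plus-fence scheme, but it is much harder to get subtly wrong.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

// The conservative subset: every access to the shared locations is a SeqCst
// operation on that location; no standalone fences, no relaxed trickery.
static DATA: AtomicU64 = AtomicU64::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn publish(v: u64) {
    DATA.store(v, Ordering::SeqCst);
    READY.store(true, Ordering::SeqCst);
}

fn consume() -> Option<u64> {
    if READY.load(Ordering::SeqCst) {
        Some(DATA.load(Ordering::SeqCst))
    } else {
        None
    }
}
```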
To be clear, the linux kernel and liburcu don’t use the C11 memory model. They have their own atomics and barriers implemented in assembler that predate C11. liburcu is largely a port of the linux kernel primitives to userland.
How so?
I can’t speak to exactly what the parent comment is saying, but I do know memory_order_consume was finally removed in C++26, having never been implemented correctly despite several attempts since C++11 introduced it. It’s been a lot, and IIRC it, along with a hardware issue, also affected the transactional memory technical specification.

There have also been more than a few cases in the mid-to-late 2010s of experts giving talks on the memory model at CppCon, only for someone in the crowd to notice a bug that derails the whole talk as everyone realizes the subject matter is no longer correct.
If you want the long version that re-treads some ground you probably already understand, there’s an amazingly deep 3-part series from a few years ago by Russ Cox that’s worth reading: https://research.swtch.com/mm .
If you want the TL;DR link path out of there to some relevant and important insights, you can jump down partway through part 2 around https://research.swtch.com/plmm#acqrel (and a little further down as well in https://research.swtch.com/plmm#relaxed ) to see Russ’s thoughts on this topic with some backup research. I’ll quote a lengthy key passage here:
Do you mean that using @atomicStore/@atomicLoad on the lock’s sequence number with the same AtomicOrder as for @fence would not be equivalent? If not, can you say more about why?

I mean stuff like Linux’s seqcount_t, used for write-mostly workloads like statistics counting. To implement them you at least need a read barrier and a write barrier, since acquire/release operations put the barrier on the wrong side of the load/store.
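A minimal single-writer seqlock sketch (in Rust, whose fence and ordering semantics mirror C11) shows where those barriers have to go: the reader needs a barrier after its relaxed data loads and before re-reading the sequence number, which is what the acquire fence provides and what a plain acquire load of the data cannot express cheaply.

```rust
use std::sync::atomic::{fence, AtomicU32, AtomicU64, Ordering};

// Minimal single-writer seqlock sketch (writer exclusivity assumed).
struct SeqLock {
    seq: AtomicU32,
    data: AtomicU64, // stand-in for a larger payload
}

impl SeqLock {
    fn write(&self, value: u64) {
        let s = self.seq.load(Ordering::Relaxed);
        self.seq.store(s + 1, Ordering::Relaxed); // odd: write in progress
        fence(Ordering::Release);                 // keep data stores after the seq bump
        self.data.store(value, Ordering::Relaxed);
        self.seq.store(s + 2, Ordering::Release); // even again: data published
    }

    fn read(&self) -> u64 {
        loop {
            let s1 = self.seq.load(Ordering::Acquire);
            if s1 & 1 != 0 {
                continue; // writer active, retry
            }
            let value = self.data.load(Ordering::Relaxed);
            fence(Ordering::Acquire); // order the data load before the re-check
            let s2 = self.seq.load(Ordering::Relaxed);
            if s1 == s2 {
                return value; // no writer interleaved, value is consistent
            }
        }
    }
}
```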
I’m surprised it works on Windows, because the kernel docs suggest it shouldn’t. DRM’d media is sent to the GPU as an encrypted stream, with the key securely exchanged between the GPU and the server. It’s decrypted as a special kind of texture and you can’t (at least in theory) copy that back; you can just composite it into frames that are sent to the display (the connection between the display and GPU is also end-to-end encrypted, though I believe this is completely broken).
My understanding of Widevine was that it required this trusted path to play HD content and would downgrade to SD if it didn’t exist.
If you have a path that goes from GPU texture back to the CPU, then you can feed this straight back into something that recompresses the video and save it. And I don’t know why you’d think this wouldn’t give you sound: secure path for the sound usually goes the same way, but most things also support sound via other paths because headphones typically don’t support the secure path. It’s trivial to write an Audio Unit for macOS that presents as an output device and writes audio to a file (several exist, I think there’s even an Apple-provided sample that does). That just leaves you having to synchronise the audio and video streams.
I’m pretty sure that what Gruber is describing is basically just “hardware acceleration is not being enabled on many Windows systems”, but because he has his own little narrative in his head he goes on about how somehow the Windows graphics stack must be less integrated. Windows is the primary platform for so much of this stuff!
I would discount this entire article’s technical contents and instead find some other source for finding out why this is the case.
Well, it depends on the type of acceleration we’re speaking of. But I’ve tried forcing hardware-accelerated video decode, and honestly you’d be surprised how often it failed, and I did this on rather new hardware. It was actually shockingly unreliable. I’m fairly certain it’s significantly worse if you extend your view to older hardware and other vendors.
I’m also fairly sure, judging by people’s complaints, that throwing variable refresh rate, higher bit depths and hardware-accelerated scheduling into the mix has resulted in neither flagship reliability nor flagship performance.
It can be the primary platform but this doesn’t mean it’s good or always does what it should or promises it’ll do.
Wait wait wait is this,,, checks URL, oh, lmao. Yeah Gruber is useless there’s literally no point in ever reading a single word he says.
I think it means: enabling the feature to screenshot DRM protected media would not by itself enable piracy, since people would not use screenshots to pirate media frame at a time.
What you are saying reads like “one technical implementation of allowing screenshots would enable piracy.” I trust that you’re probably right, but that doesn’t contradict the point that people would not use that UI affordance itself for piracy.
No one would use screenshots for piracy because all the DRM is already cracked. Every 4k Netflix, Disney, etc, show is already on piracy websites, and they’re not even re-encoded from the video output or anything, it’s straight up the original h264 or h265 video stream. Same with BluRays.
Yup, if you go through GitHub there are several reverse-engineered implementations of Widevine, which allow you to decrypt the video stream itself with no need to re-encode. That then moves the hard part to getting the key. The lower-security keys are fairly easy to get, since you can just root an Android device (and possibly even get one from Google’s official emulator? At least it supports playing Widevine video!). The higher-security ones are hardcoded into secure enclaves on the GPU/CPU/video decoder, but clearly people have found ways to extract them - those no-name TV streaming boxes don’t exactly have a good track record of security, so if I were to guess, that’s where they’re getting the keys.
Still, no point blocking screenshots - pirates are already able to decrypt the video file itself which is way better than reencoding.
Those no-name TV streaming boxes usually use the vendor’s recommended way to do it, which is mostly secure, but it’s not super-unusual for provisioning data to be never deleted off the filesystem, even on big brand devices.
The bigger issue with the DRM ecosystem is that all it takes is for one secure enclave implementation to be cracked, and they have a near infinite series of keys to use. Do it on a popular device, and Google can’t revoke the entire series either.
Personally, I’m willing to bet the currently used L1 keys have come off Tegra based devices, since they have a compromised boot chain through the RCM exploit, as made famous by the Nintendo Switch.
Pre-DoD-Speed-Ripper, early DVD rips jacked into a less-than-protected PowerDVD player and literally did screenshot every frame.
First of all, you should not store UUIDs as strings. As fs111 mentioned, UUIDs are 16 bytes, so they fit into a uint128_t. Postgres supports UUIDs natively, and even if your database doesn’t support 128-bit integers, if you store them as 16 raw bytes (a 16-byte binary column such as BINARY(16) or BYTEA), databases will be quite fast at comparing them.

Also, snowflake IDs are basically superseded by UUIDv7. You get a bigger space than snowflake IDs, and all the goodies of snowflake IDs and UUIDv4.
UUIDv7 requires generating a reasonably good quality random number every time you generate an ID, while Snowflake only needs to increment a counter, so it doesn’t seem like a straight upgrade. One could imagine a 128-bit snowflake-like format that’s like, 48-bit timestamp, 64-bit worker identifier (can be randomly generated) and a 16-bit counter.
Fair. But I would argue that for most applications, the probability of generating two random sequences of 48 bits on the same timestamp is quite low.

Also, nothing guarantees that two sequence numbers won’t conflict on the same timestamp; what prevents conflicts is the worker identifier, which you want to randomly generate in your proposal. How is that different from generating the entire remaining 74 bits, like UUIDv7 does?

But you’re right that UUIDv7 doesn’t guarantee full uniqueness; I would just disregard the probability as “too low”, like for UUIDv4.
The RFC explains how to generate UUIDv7 using a counter, so you only occasionally need to get fresh randomness.
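A sketch of that counter scheme (RFC 9562’s fixed-length dedicated counter method), using the 12-bit rand_a field as the counter: the counter is re-seeded from the RNG whenever the millisecond timestamp advances, and merely incremented within the same millisecond. The RNG is caller-supplied and counter rollover handling is omitted.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Sketch only: rand_a (12 bits) acts as a counter, rand_b stays random.
struct UuidV7Gen {
    last_ms: u64,
    counter: u16, // only the low 12 bits are used; rollover handling omitted
}

impl UuidV7Gen {
    fn next(&mut self, random: impl Fn() -> u64) -> u128 {
        let ms = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("clock before epoch")
            .as_millis() as u64;

        if ms == self.last_ms {
            self.counter += 1; // same tick: just bump the counter
        } else {
            self.last_ms = ms;
            self.counter = (random() & 0x07FF) as u16; // re-seed, leave headroom
        }

        let ts = u128::from(ms & 0xFFFF_FFFF_FFFF);                 // 48-bit timestamp
        let rand_a = u128::from(self.counter & 0x0FFF);             // 12-bit counter
        let rand_b = u128::from(random() & 0x3FFF_FFFF_FFFF_FFFF);  // 62 random bits

        (ts << 80)          // unix_ts_ms
            | (0x7u128 << 76) // version 7
            | (rand_a << 64)  // counter in rand_a
            | (0x2u128 << 62) // variant 0b10
            | rand_b
    }
}
```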
recently (3 comments) previously (46 comments) previously (47 comments)
Oh, somehow I missed that first one when I scanned /newest. @pushcx could you do a thread merge?
This is the first release of Fish after the rewrite in Rust. While it’s mostly very similar to the previous version of Fish, there are a few new features and a couple of breaking changes.
Instead of using async to implement sans-io state machines, I’ve just started exploring the idea of simply having async IO traits that can be implemented by both non-blocking stuff (like Tokio’s IO types) and blocking stuff (like the standard library IO types). In the latter case, .await would simply block until the IO was done, and then it would return Poll::Ready(result).
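Roughly this shape (a sketch; the trait name is made up, this isn’t Tokio’s AsyncRead, and async fn in traits needs Rust 1.75+):

```rust
use std::io::{self, Read};

// Hypothetical async read trait, implementable by both non-blocking and
// blocking IO types.
trait ReadAsync {
    async fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;
}

// Blocking adapter over any std::io::Read: the async fn performs the
// blocking read inline, so the returned future is already complete
// (Poll::Ready) the first time .await polls it.
struct Blocking<R>(R);

impl<R: Read> ReadAsync for Blocking<R> {
    async fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.0.read(buf)
    }
}
```

A non-blocking implementation would implement the same trait over e.g. Tokio’s types and actually suspend instead of blocking.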