For fun, I tried running the linked testcase on Haiku (there’s only a small number of things in it that are actually Linux-specific in the critical path, easily replaced; the /proc-reading logic can just be ignored), in a VM also with 4 cores and 4 GB RAM. Running it with the original parameters just caused it to crash with NULL dereferences a lot (Haiku does not have overcommit or an “OOM killer”, and malloc will indeed return NULL if the system is out of memory.) I reduced the number of threads to 250+250 instead of 1000+1000 and the sizes to 1GB, 1GB instead of 4GB, 4GB, and then it ran much longer.
The system got pretty sluggish with 500 threads all vying for activity on 4 CPU cores, but I could still move the mouse and interact with the GUI despite the lag. It got to “Allocated all memory” in many of the file threads, but didn’t seem to actually get to printing “Iteration 1” in most of the memory threads; it got bogged down in handling the page faults before that (it appears the memory threads all allocate one chunk each before printing “Iteration 1”). However, the system profiler and ps -aux continued to work fine, albeit laggily with all cores overloaded.
After about 25 minutes, I Ctrl+C’d the test, and it exited immediately and the system returned to normal at once. I then tried with 32+32 threads; this also seemed to get bogged down in page fault handling, but this time the system remained lively and responsive even with constant 100% CPU usage on all cores. (There was still some redraw lag, but if you weren’t paying close attention you might miss it; and in some apps, like Terminal, I couldn’t tell at all.)
Finally I reran the test with only 4+4 threads. The system remained mostly responsive during this, though occasionally the whole GUI seemed to stall for a second or two at a time before recovering. This variant didn’t seem bogged down in page faults; a quick run of the profiler showed most time being spent in userland. And once again everything returned to normal instantly when I Ctrl+C’d the test.
So, while Haiku might be able to do better on the page fault and context switch handling, we don’t do so badly here as it is (though admittedly we have more locks and fewer atomics in the kernel than Linux does, which helps in situations like this.)
I want an insanely great system to exist again, and I’m open to suggestions.
It looks to me like no one attempts to compete with Apple at their user experience game—consistent behavior, minimum surprise, things just working. Enjoying popularity and a lack of opposition in the space they’ve carved out, they no longer have to make their systems insanely great for many of us users to continue thinking they’re the best. Eventually that has meant they’re merely the least bad. A lot of the time I’m happy enough, but sometimes I feel stuck with all this. I wonder what battles they fight internally to keep the dream of the menu bar alive. But dammit, the same key shortcuts copy and paste in every app.
The last time I used Gnome, I mouse-scrolled through a settings screen, snagged on a horizontal slider control, adjusted the setting with no clue what the original value was, and found there’s no setting I could use to avoid that. The last time I used Windows, I was again in the system settings app but found it didn’t have a setting I remembered. I learned Control Panel still exists too, and half the settings still live there. My Mac, on the other hand, is insanely OK.
If you’re open to suggestions, have you tried Haiku before? It, too, has the same key shortcuts copy/pasting in every app (even when you use “Windows” shortcuts mode, i.e. Ctrl not Alt as the main accelerator key). We’re not quite as full-featured or polished yet as macOS was/is, but we’d like to think the potential is there in ways it’s not for the “Linux desktop” :)
Thanks; I have eyed Haiku with interest from time to time! It does strike me as in line with some of my values. Maybe I’ll give it a more earnest try.
Update, impressions after a few hours of uptime: This is nice. I could get comfortable here. It’s really cohesive and all the orthogonal combinable parts feel well chosen. The spatial Finder is alive and well. I was especially impressed when I figured out how a plain text file was able to have rich text styling, while remaining perfectly functional with cat. Someone really must be steering this ship because it emphatically doesn’t suck.
Thanks for your kind words! Some of the “orthogonal combinable” parts (“styled text in extended attributes” and spatial Tracker included) are ideas we originally inherited from BeOS (but of course we’ve continued that philosophy). And we actually don’t have a single “project leader” role; the project direction is determined by the development team (with the very occasional formal vote to make final decisions; but more often there is sufficient consensus that this simply is not needed.) But we definitely have a real focus on cohesiveness and “doing things well”, which sometimes leads to strife and development taking longer than it does with other projects (I wrote about this a few years back in another comment on Lobsters – the article that comment was posted under, about Haiku’s package manager, is also excellent and worth a read), but in the end it leads to a much more cohesive and holistic system design and implementation, which makes it all worth it, I think.
But dammit, the same key shortcuts copy and paste in every app.
So much this. I am continually disappointed that both Gnome and KDE managed to fuck this up. Given that they both started as conceptual clones of Windows (more-or-less) I guess it isn’t surprising, but still…
A minimal Linux setup can be great. You have a few components that are really well maintained, so nothing breaks. It also moves slowly. My setup is essentially the same as back in 2009: a WM (XMonad), a web browser (Firefox), and an editor (Emacs). When I need some other program for a one-off task, I launch an ephemeral Nix shell. If you prefer Wayland, Hyprland can give you many niceties offered by desktop environments, like gestures, but it is still really minimal.
Nix is really cool for keeping a system clean like that. I bounced off NixOS because persisting the user settings I cared about started to remind me of moving a cloud service definition to Terraform. I do love the GC’ed momentary tool workspaces.
If Emacs was my happy place, I think your setup would be really pleasing. But I am a GUI person, to the degree that tiling window managers make my nose wrinkle. Windows are meant to breathe and crowd, I think. That’s related to the main reason I want apps to work similarly, because I’ll be gathering several of them around a task.
I want to believe there is a world in which a critical mass of FOSS apps are behaviorally cohesive and free of pesky snags, but I have my doubts that I’d find it in Linux world, simply because the culture biases toward variety and whim, and away from central guidance and restraint. Maybe I just don’t know where to look. BSDs look nice this way in terms of their base systems, but maybe not all the way to their GUIs.
Thanks for your work on Lobsters and for writing all this up, Peter. Are there any social media posts which we can “boost for visibility” (whether ones linking to this post, or others)? And, if making a similar post on other forums that would be similarly affected (e.g. the Haiku project forums), do you mind if we cross-link, or perhaps even borrow some of your verbiage?
This is probably the best one; I’ve tried to summarize a woolly topic clearly. You’re welcome to use it as an example or quote as is useful. My goal for the post was that people with more jurisdictional relevance get involved in helping find a better outcome for this situation, so I’d really appreciate you helping make connections or progress with the relevant authorities.
A downside of using struct str everywhere is that the compiler can’t check printf strings. … But this only works on functions that have the same signature as printf so it doesn’t work on my implementation. Overall, I think this is an acceptable tradeoff because format strings are easier to reason about than arbitrary code and all possible issues are localized in calls to print functions.
At $PREVIOUS_JOB, a team member left for employment elsewhere, and I was tasked with taking over his code. He had needlessly wrapped syslog() (with a function called Syslog() with the same parameters, and our code would only ever run on Unix). When I removed the wrapper and fixed all the call sites (easy enough), boy, did I find errors. The compiler knew to check the format string for syslog(), but not the wrapper. [1] I’m a proponent of having the compiler find bugs so you don’t have to.
[1] It didn’t help things that all the format strings were #defined elsewhere in the code, so all I initially saw was:
While in your particular case it obviously made sense to just drop the wrapper and use the real method, if one ever does want the compiler to check printf-format strings that are being passed to a function not named printf, you can use __attribute__ ((format (__printf__, (a), (b)))). For example:
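    /* (a sketch; the exact prototype is assumed, not quoted from the original) */
    void debug_printf(const char *format, ...)
        __attribute__ ((format (__printf__, 1, 2)));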
declares a function debug_printf that takes a printf format string in argument #1 and the parameters to it starting in argument #2. So, both in the case you describe here as well as the one in the article, this attribute could be used to eliminate the “tradeoff” and keep the compiler type checking.
There’s also another, related attribute called format_arg which tells the compiler that any string passed to the specified argument will cause a “similar” format string to be returned. The GCC documentation notes this can be used for translations: e.g. if one declared a translate function this way, then
printf(translate("There are %d objects."), count);
will result in the compiler checking the string inside the call to translate against the printf invocation.
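For reference, such a translate function might be declared along these lines (a sketch; the exact signature here is assumed, not quoted from the GCC documentation):

    const char *translate(const char *msgid)
        __attribute__ ((format_arg (1)));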
True, and I knew about that, but not the other developer. Also, while we did use GCC (on Linux) and clang (on Mac OS X), we also had to compile for Solaris using their native C compiler. I don’t think it supports __attribute__().
Edited to add: yes, I know, conditional compilation and #defines can work around this. I just wanted to mention that not all C compilers support __attribute__().
Yes, I did [create that] but it seems like it’s too destructive and causing [the] system [to become] unusable.
I don’t mean to kick up sand, but, that sounds like normal Haiku software. Haiku is only second to ReactOS as the most unstable OS I’ve ever used (sadly I’ve never gotten to try Copland OS.) Isn’t the web browser still causing kernel panics when you try to use it?
Isn’t the web browser still causing kernel panics when you try to use it?
I only know of one intermittently-reproducible kernel panic from the web browser, and it’s mostly an issue on recent nightly builds only, not beta5 (it’s due to a recently-added assert failing). What crashes are you encountering? We have Haiku VMs that accumulate pretty long uptimes for package builds with no crashes…
I didn’t know either of them were particularly unstable, although I’m less surprised about ReactOS. I think they started that in the slightly-wrong direction of duplicating the NT architecture with new code rather than just API-compatible services atop a proven core, and will forever be stuck in a bad place because of it.
We have a similar “madness” for the QtWebEngine port (which isn’t a full Chromium port, and it’s not exactly a recent one either…), and a lot of our patches were borrowed from FreeBSD, in fact.
Right, now that I’ve recovered from the man flu somewhat, I can come back and say something more useful than “hey man, cool interview!” Sorry to spam the thread, but I can’t edit my old one anymore.
I think the most interesting and useful takeaway from this interview is this:
The most common one in my experience is the mistake of thinking of a ‘framebuffer’ that you batch write ‘pixels’ into, binned by some discrete synchronization signal (‘vsynch’, ‘vblank’, …). This is common because that is often at least part of what higher level graphics API used to offer the developer.
The printer was, and is, a more accurate model. Just as there was good reason for why the printer server part of Xorg fell out of favour, there was good reason for why it was there in the first place.
The framebuffer abstraction is super straightforward but if you peek under the hood of graphics systems for a bit you’ll find that, like, half the history of the development of modern graphics hardware interfaces basically consists of trying to figure out how to build (what amounts to) a good framebuffer API -> vector operations pipeline transpiler. From a distance, it looks like it’s not a solved problem at all.
In most modern systems there’s an additional layer between the application and the drivers – the display server. These tend to be the most constrained in terms of API design choices, because most 3rd-party developers will likely hate anything that doesn’t eventually boil down to “here’s a bucket of pixels, draw on it”. So most of them have no choice but to implement some framebuffer interface over hardware that isn’t quite a framebuffer. What would you say are the most common pitfalls in designing such a system? Or, to put it another way, if you guys had a senior-year student to advise on his final project, what are the quirks you just know their first design won’t deal with properly?
Other than Arcan :-P what is some good prior art on securely managing shared graphical resources (like, remember GEM-Flink?)
I was otherwise engaged until this morning so didn’t have time to notice :-)
For 1 - the biggest pitfall, locally and systemically, that damn near everyone runs into is synchronisation. That’s why I suggested the resize case as a mental model for unpacking graphics, as the system needs to converge towards a steady state while invalidated requests might still be in-flight. There is a lot to unpack in that, and whatever you choose, you get punished for something. Even if you are a single-source/sink /dev/fb0 kind of deal, modeset is another that’ll get you.
Then comes colour representation and only thinking ‘encoding implies equivalent colour space’. This gets spicy when you only have partial control of legacy and clients. There will be those sending linear RGB for a system that expects sRGB and vice versa, and that goes for the hardware chain as well. On top of this is blending. On top of that is calibration and correction.
Then comes device representation that ties all of this together, e.g. what the output sink actually can handle versus what you can provide, and the practical reality that any kind of identity+capability description we’ve ever invented for hardware even as static as monitors gets increasingly creative interpretations as manufacturing starts.
For 2 - I well remember flink and might have good reason to not trust the current system as well. The heavy investment into hardware compartments and network transparency is personally motivated by that distrust. Are you thinking of the whole stack (i.e. specific window contents, sharing model between multiple-source-multiple-sinks, deception like a full screen client that looks just like the login-screen?) or the GPU boundary specifically (i.e. GEM/dma-buf/…)?
I was otherwise engaged until this morning so didn’t have time to notice :-)
It’s okay, Paracetamol makes me drowsy as hell so I was “otherwise engaged”, too, as in I slept through most of Sunday :-D.
That list of common pitfalls is pretty disappointing to read in a Linux context. I actually recognize some of those from my Linux BSP sweatshop days. Eek!
Are you thinking of the whole stack (i.e. specific window contents, sharing model between multiple-source-multiple-sinks, deception like a full screen client that looks just like the login-screen?) or the GPU boundary specifically (i.e. GEM/dma-buf/…)?
The GPU boundary specifically, the former covers way too much ground to make it a useful question IMHO.
That list of common pitfalls is pretty disappointing to read in a Linux context. I actually recognize some of those from my Linux BSP sweatshop days. Eek!
If you contrast them with the state of some other project, not to invite the keyword, how well did their past experiences with maintaining a popular display server avoid these basic pitfalls?
The GPU boundary specifically, the former covers way too much ground to make it a useful question IMHO.
So Arcan has a systemic opinion in that ‘very large ground’ space, as somehow all the layers need to be tied together for the boundary to have more bite than the equivalent ‘fork + unveil/pledge’-like short compartments for things. Do note that I have basically ignored all of CUDA etc. for ‘other GPU uses’.
With the work it takes, the open source spectrum only really leads to that one coarse-grained viable interface that everyone copies near verbatim post the render-node/dma-buf change; they are just in different stages of being synced to it (can’t say for Haiku though, a while since I last poked around in there, maybe @waddlesplash).
There are interpretations for how you can leverage them (AFAIR Genode handles it slightly differently, for a good experimental outlier in general), but in the end there’s only so much you can ‘do’ – opaque interchangeable sets of tokens representing work items that piggyback on some other authentication channel, or a negotiated/authenticated initial stream setup that gets renegotiated when boundary conditions (resize) change. Even Android isn’t much different in this regard outside of petty nuances.
In the proprietary space, although you’ll find little documentation on the implementation (or at least I didn’t, back when I hadn’t yet given up on the platform entirely), IOSurfaces in the OS X sense have a more refined model in the opaque sense, in contrast to EGLStreams.
(can’t say for Haiku though, a while since I last poked around in there, maybe @waddlesplash).
Haiku doesn’t have GPU acceleration yet (with the exception of one experimental Vulkan-only driver for Radeon Southern Islands that one contributor wrote), so we haven’t settled on a design for that API; doubtless we will just use Mesa so something under the hood will still provide dma-bufs or an equivalent, eventually.
But one thing which is notable here is that Haiku still does server-side drawing, for all applications that use the native graphics toolkit, anyway. Stuff like Qt and GTK of course just grabs a shared-memory bitmap and draws it repeatedly, but native applications send drawcalls to the server, specifying where they’re to be drawn into (a window, a bitmap, etc.) So, there’s a lot of leeway here, should we eventually get GPU acceleration and decide to experiment with GPU rendering, for the server to “batch” things, share contexts efficiently, etc. which applications on Linux can’t do anymore in the Wayland era (and didn’t do for a long time before that, usually, because X11’s rendering facilities didn’t really keep pace with what people expected 2D graphics drawing APIs to be.)
Well we can coquettishly suggest HVIF or SVG as a new entry to wl_shm::format and they’d tick the server side graphics box about as well as many other ones.
Neither of those are really designed for on-the-fly generation, though. HVIF in particular has a lot of features which make it very compact at the expense of writing and decoding time. The graphics protocol that Haiku uses for real-time drawing isn’t related to it. (Though it also has an “off the wire” form: BPicture files.)
Are USB audio devices in the kernel in Haiku? I have fond memories of BeOS 5 popping up a dialog to tell me that the audio stack had crashed and been restarted, with less than a second’s interruption to music playback. On the same machine, the sound card driver would routinely panic the Linux kernel and blue-screen Windows (Creative Labs absolutely could not be trusted in ring 0).
The USB audio driver is in-kernel for the moment, but that’s just because it interfaces with the media server the same way all the other audio drivers do. It could very much be ported to userland as a separate media output module instead, but that’d require a bit of work I didn’t see much reason to do at the moment.
The userland portions of Haiku’s audio stack can be restarted on-the-fly, but there are some bugs remaining that mean applications currently outputting audio also have to be restarted and won’t automatically reconnect…
It’s also just a shell for the TCP implementation and nothing else; it doesn’t run the IPv4/v6 portions of the stack, or the routing logic, etc.
Being able to run pcap files against the network stack sounds really useful for fuzzing.
Ah, that’s not what the changes described in the report were about; that was just about adding a way to dump packets from the userland test harness into a pcap file. But in fact replaying pcap files can already be done with “tcpreplay”, I think.
Unrelated to this project, I had started writing a source-level compatibility library for BeOS decades ago as well on top of Linux.
IIRC, the common threads implementations at the time (this was in the LinuxThreads era, with true POSIX threads just starting to become available on Linux), couldn’t quite do what I needed them to do to emulate BeOS threads. At the time there wasn’t a reliable way to get a thread to start in a suspended state (I don’t think? It’s been a long time), and even today I think it’s not actually possible with pure POSIX threads. You could probably do it with clone(2) pretty easily now.
There were other, simpler, problems too, like how BeOS’s error constants were negative numbers.
Again, this was in like 1998, so take this with a grain of salt. It was super fun and educational and I will be watching this project with great interest.
(Also, if it were up to me — and it’s not — I’d have moved the query parser for filesystem queries to user space and have a system call that took a parsed tree instead. From what I understand from reading Practical Filesystem Design by Giampaolo, which describes the implementation of the Be filesystem, the query parsing code was all in kernel space and made up a significant amount of the BFS code base. There’s probably a good reason for doing it that way that I just don’t see.)
From what I understand from reading Practical Filesystem Design by Giampaolo, which describes the implementation of the Be filesystem, the query parsing code was all in kernel space and made up a significant amount of the BFS code base. There’s probably a good reason for doing it that way that I just don’t see
All of the things that you’re querying require direct access to inodes. If you move the query parser into userspace, you still have query execution in the kernel. That means that you need to take the query from userspace, parse it, and then serialise it in a form that the kernel can execute. It’s not clear to me that this would reduce the amount of parsing in the kernel considerably.
If you moved the query execution into userspace then you’d have a lot more kernel <-> userspace traffic, because you’d need to pass intermediate results up to userspace. If you didn’t do query planning correctly, you could very easily end up passing lists of most inodes in the system up to userspace accidentally.
Spotlight (designed by the same person as BFS) avoids this by keeping the database of metadata entirely in userspace. This has the downside that it can easily get out of sync.
the query parsing code … made up a significant amount of the BFS code base
At least on Haiku, it doesn’t: the query portions of BFS are 1039 SLoC out of 13,144 in BFS altogether, plus another few hundred in common code. (Numbers counted by cloc.) I guess it’s more if you count the indexing system, but that’s actually part of the on-disk structures.
And by the way, I replied to the post on the NetBSD mailing list a few weeks back.
At the time there wasn’t a reliable way to get a thread to start in a suspended state
The normal way of doing this is to start the thread where the first thing that it does is wait on a mutex, which you hold in the caller. Did you need to do something to the thread that relied on it being in a known state (e.g. poke at its saved register file)? You can do that via ptrace, but it’s a very painful (trace yourself, request tracing of thread-creation events, create the thread, catch the event, poke the thread).
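A minimal pthreads sketch of that pattern (all names here are illustrative):

    #include <pthread.h>

    static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;

    static void *thread_main(void *arg)
    {
        /* parks immediately, until the creator releases the gate */
        pthread_mutex_lock(&gate);
        pthread_mutex_unlock(&gate);
        /* ... the thread's real work starts here ... */
        return NULL;
    }

    static void spawn_suspended(void)
    {
        pthread_t t;
        pthread_mutex_lock(&gate);                    /* hold the gate first  */
        pthread_create(&t, NULL, thread_main, NULL);
        /* ... do whatever setup needs the thread to stay parked ... */
        pthread_mutex_unlock(&gate);                  /* "resume" the thread  */
        pthread_join(t, NULL);
    }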
There’s probably a good reason for doing it that way that I just don’t see.
Just a random guess, but I’d expect that it was for performance and shoving it in kernel space involved fewer context switches and/or page table walks than doing it in userland. IIRC Windows used to do the same with certain graphics code and such.
For what it’s worth: FreeBSD made getentropy a system call rather than a read of the device to better support Capsicum. Without this, you needed to open the random device before entering capability mode.
Yes, that’s what’s happening here: Haiku has a “generic syscall” mechanism which functions something like “ioctl without a file descriptor”. The kernel and drivers can register themselves to be invoked through this system. You can see the commit that implemented this.
Ah, I thought you were just asking if “getentropy” worked without using file-descriptor, to which the answer is “yes.” We didn’t implement it this way to support Capsicum or anything, but rather so that getentropy still works properly when a process is close to its limit of FDs, and also for efficiency so it doesn’t need an open/close on every invocation.
Ah. The perf wasn’t an issue on FreeBSD because a process rarely calls it (it seeds a PRNG in libc that periodically gets more entropy, if you’re calling it on a hot path then you’re doing something wrong, especially on modern x86 where you really want to feed RDRAND into your userspace PRNG and just use getentropy once to protect against supply chain attacks on the hardware RNG [FreeBSD doesn’t do this yet]). I don’t think the old code did an open and close every invocation, just an open on first use.
Running out of file descriptors might be an issue, but those limits have been high enough that non-malicious code doesn’t hit them and if you’re one below them then you have so many open that you’re probably going to see failures from other things soon. Does Haiku have low limits here? It was a problem on OS X for the first couple of releases because the default values hadn’t changed since OPENSTEP 4, I had a script that set the sysctl values to a couple of orders of magnitude higher than the default to prevent things dying, but after 10.4 Apple made the defaults sensible.
It was essential for Capsicum because you lose access to the global namespace once you enter capability mode and so can’t open the device node. You’d need libc to open the device before you called cap_enter and that might have surprising effects (in particular, code often does closefrom just after for some defence in depth against things being accidentally opened concurrently). I believe Linux did it for similar reasons (seccomp-bpf policies wanting to disallow open and not being able to special-case a particular path because BPF can’t see into string arguments).
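A rough illustration of the difference (a sketch assuming FreeBSD’s Capsicum and getentropy; error handling omitted):

    #include <sys/capsicum.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        unsigned char seed[32];

        cap_enter();                              /* enter capability mode */
        int fd = open("/dev/urandom", O_RDONLY);  /* fails: the global
                                                     namespace is no longer
                                                     reachable (ECAPMODE)  */
        (void)fd;
        getentropy(seed, sizeof(seed));           /* still works: it's a
                                                     plain system call     */
        return 0;
    }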
I imagine the perf wouldn’t have been an issue indeed, but we wanted to do things correctly nonetheless.
I don’t think our limits are especially low. We’ve raised them now and again when big applications come along, but it’s been a while since we’ve needed to do that, I think.
I really like the Haiku icon format, though SVG is a pretty evil comparison; it’s notoriously verbose. I think the Haiku format is a bit denser than PostScript, but closer to a factor of two (maybe less, depending on what you’re doing) rather than more than ten.
I am less convinced by the cost of loading all of the files though, for two reasons. First, I would expect the icon files to mostly live in the buffer cache if you keep reading the files. If you’re rendering directly from them then you can mmap them and that will keep them live in the buffer cache for anyone else. With modern SSDs (and even disks), you’re likely to write a load of them at once and so they’ll end up contiguous on disk (since you control the FS, you could even guarantee that) and so command queuing means that a load of reads for each of them will be coalesced and you’ll get a single read. With both SSDs and spinning rust, sequential reads are much cheaper than random reads, so doing two reads of sequential sectors is basically the same cost as doing a single read of a sector, but doing 100 reads of sectors scattered over the disk is 100 times slower than doing 100 reads of individual or pairs of sectors. You get much more of a speedup from packing multiple icons into a single contiguous block than from making the icons small.
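For concreteness, the kind of read-only mapping meant here (the path and helper name are just placeholders):

    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <stddef.h>

    /* map an icon file read-only; the pages sit in the shared page cache,
       so any other process mapping the same file reuses them for free */
    static const void *map_icon(const char *path, size_t *len)
    {
        int fd = open(path, O_RDONLY);
        struct stat st;
        fstat(fd, &st);
        *len = (size_t)st.st_size;
        return mmap(NULL, *len, PROT_READ, MAP_PRIVATE, fd, 0);
    }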
Perhaps more importantly, modern disks have 4 KiB sectors and so a 500 byte icon is going to end up wasting 3/4 of the space. From what I remember of BFS, it regards the file contents as just another form of metadata and so will embed it in the inode if possible, so if BFS has been extended for 4 KiB sectors then you will be closer to 50% overhead because the file and the inode will fit in a single sector. BFS and NTFS both have a big-object or small-object split but NTFS packs small objects into a dedicated region (which results in write amplification).
I’m really curious how this composes with the newer Haiku ‘apps are immutable filesystem layers’ model (which is amazing and everyone should copy). Presumably those filesystems can be packed, so you don’t have wasted overhead and can do a single read for all icons in a particular app but you have a lot of random reads for different bundles (though, again, caching is your friend).
A particular application’s icons are usually stored in “resources” appended to the ELF binary, not extended attributes; it’s only the application icon itself that will be in an attribute.
You can pick BFS sector size upon initialization. The default is 2048, you can choose 4096. I think you are correct about metadata being stored with the inode, but I’m not very familiar with BFS internals, to be honest.
HPKGs are block-compressed, so yes, there’s no wasted space. There are details on the format in our internals documentation (however, it’s a bit outdated; there have since been new minor versions, including new attributes and a Zstd compression method.) But indeed reading 10 files from 10 different packages, even if they all appear to be in the same directory, will require 10 different reads, decompresses, etc.
I fully expected one of the drawbacks of the vector format (compared to bitmap) to be that the 16x16 icons are less legible (at least the example in this post, the cassette player, is). There is something about hinting for small sizes. Must be why I love 9px Verdana :)
Would be interesting to try to come up with some kind of hinting for HVIF ’cause that example image shows a few fairly clear places where it matters most: darker outline, darker shadow, maybe higher contrast in general…
I’ve long had on my list of “hopes and dreams when bored” projects to create a Rust library for HVIF. So long that there are probably ones that exist now, and I’m not going to look at crates.io because I’m more likely to be depressed by whatever I see than remain blissfully ignorant of the fact that either no one beat me to it or someone did.
Well, someone beat you to it, but that person is a long-time Haiku community member & occasional contributor. So I don’t think you should feel too bad about it. (It’s also worth noting that HVIF is technically still an evolving format, e.g. there were some more features added as part of a GSoC project this past summer, and I don’t think that library supports those.)
Well, there’s more than “none”, and more than a lot of OSes had in the 90s/00s. We have SMAP/SMEP, NX bit, ASLR (including kernel ASLR), filesystem permissions checks (though these haven’t been battle-tested, since the GUI runs as root and very few users bother to useradd & su to a non-root user), syscall permissions checks (though these haven’t been fully audited or battle-tested either), and more.
TIL! Last I heard an official statement there was no prioritization of security features. Clearly I’m out of the loop! Any plans on supporting disk encryption?
Well, the statement is correct: they’re not a priority, but we have some anyway.
There is/was third-party support (though developed by a long-time project member) for disk encryption, but only off the boot disk. There are no immediate plans to incorporate that and support boot disk encryption at the moment, I don’t think.
Are there any plans to automatically create a non-root user on install and use it as the default? This is how I believe macOS does it, prompting for a root password when needed.
FWIW testing Haiku is eternally on my “when I have time” list, and for a lot of cases I’d love to ditch unixes for a simple single-user OS, but that user doesn’t have to be root if the separation is for security purposes ;)
There aren’t really any particular plans to do any security enhancements. Sometimes developers work on them, but they’re indeed not presently on any priority list. So, I guess the answer is “no”.
Fantastic to see kqueue on Haiku! I’m surprised that they made .NET work, the .NET use of kqueue upstream makes a few assumptions about kqueue behaviour that are specific to the XNU implementation.
trungnt2910 made socketpair() work for non-STREAM sockets, meaning that UNIX domain sockets can now be created this way
I’m not sure what this means, AF_UNIX sockets can be SOCK_STREAM. Does this mean that it can now create SOCK_SEQPACKET sockets? If so, that’s a big quality of life improvement.
Does Haiku have clang support yet? Last time I looked it was GCC only and the prospect of having to use GCC put me off.
the .NET use of kqueue upstream makes a few assumptions about kqueue behaviour that are specific to the XNU implementation.
At present it’s only used for .NET socket notifications, which are pretty simple. Are you sure it really makes assumptions about XNU behavior? (There was kqueue use in a different component that made use of the “data” field, this component however didn’t even have an epoll version and so that was easy enough to disable for the present.)
Does this mean that it can now create SOCK_SEQPACKET sockets? If so, that’s a big quality of life improvement.
No, it means it can also be used for AF_UNIX SOCK_DGRAM sockets, which were also implemented last month (and also noted in the activity report.)
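i.e. roughly this now works (a bare sketch; the helper name is just illustrative):

    #include <sys/socket.h>

    static int make_dgram_pair(int fds[2])
    {
        /* a connected AF_UNIX datagram pair, one descriptor per end */
        return socketpair(AF_UNIX, SOCK_DGRAM, 0, fds);
    }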
Haiku does define SOCK_SEQPACKET in our headers, but I don’t think anything implements it at the moment. If you have a use-case (and ideally, test-case), I suppose you can open a ticket.
Does Haiku have clang support yet? Last time I looked it was GCC only and the prospect of having to use GCC put me off.
We’ve had clang in the software depots for a while, and there’s some packages that are always compiled with it, so yes. However, there are still some issues compiling Haiku itself with clang, and I think there’s at least one issue about Haiku’s own headers tripping up on clang’s -Werror (our non-varargs ioctl that wants a default parameter; there’s WIP changes to fix that, but as it’s only an issue for C, not C++, and only with clang and -Werror, it hasn’t gotten much attention.)
At present it’s only used for .NET socket notifications, which are pretty simple. Are you sure it really makes assumptions about XNU behavior?
I don’t remember the details, but I think it was in the filesystem watcher interface. The macOS implementation had some subtle differences to the FreeBSD one that needed working around.
Haiku does define SOCK_SEQPACKET in our headers, but I don’t think anything implements it at the moment. If you have a use-case (and ideally, test-case), I suppose you can open a ticket.
SOCK_SEQPACKET is basically my default for AF_UNIX things now. Every SOCK_STREAM thing that I’ve ever implemented with it has tried to implement a message-oriented system on top of the stream, relying on the kernel to do this removes a load of complexity in userspace. On FreeBSD, devd re-exports the kernel messages over a seqpacket socket, which is great for consumers: every recvmsg receives exactly one devd event.
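Something along these lines on the consumer side (a sketch; the socket path is the one I recall devd using, and error handling is omitted):

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        int fd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
        struct sockaddr_un sun = { .sun_family = AF_UNIX };
        strcpy(sun.sun_path, "/var/run/devd.seqpacket.pipe");
        connect(fd, (struct sockaddr *)&sun, sizeof(sun));

        char event[8192];
        for (;;) {
            /* each recv() returns exactly one complete devd event;
               no framing or reassembly needed in userspace */
            ssize_t n = recv(fd, event, sizeof(event), 0);
            if (n <= 0)
                break;
            printf("%.*s\n", (int)n, event);
        }
        return 0;
    }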
We’ve had clang in the software depots for a while, and there’s some packages that are always compiled with it, so yes.
Great! Last time I looked there was a port of a very old version, but it looks as if it’s getting closer (16 is almost landed but 17 is in RC state upstream).
but I think it was in the filesystem watcher interface.
Ah, well, I didn’t implement that. I considered it, but Haiku’s filesystem notification system is pretty different from what kqueue provides, so for the moment I didn’t. It looked like libuv separated filesystem events from the rest of the event system, at least, though of course for kqueue it’s in the same interface.
Every SOCK_STREAM thing that I’ve ever implemented with it has tried to implement a message-oriented system on top of the stream, relying on the kernel to do this removes a load of complexity in userspace.
What’s the difference from SOCK_DGRAM? Guaranteed in-order message delivery, I guess?
Pretty sure most AF_UNIX+SOCK_DGRAM implementations won’t drop packets except maybe under certain shutdown conditions, not sure. Either way, it likely wouldn’t be too hard to implement then, but it’s probably not any sort of priority for us.
What’s the difference from SOCK_DGRAM? Guaranteed in-order message delivery, I guess?
Yes, exactly.
Pretty sure most AF_UNIX+SOCK_DGRAM implementations won’t drop packets except maybe under certain shutdown conditions, not sure.
There are often places in kernel buffers where they can be dropped or reordered. If your datagram implementation already provides the relevant guarantees then you can probably implement it trivially.
A glance through the manual pages indicates that the _umtx_op APIs are much more complicated than what Haiku has (and maybe even more complicated than Linux futexes, but I’m not an expert in either.)
We do implement some FreeBSD APIs in userspace, but the underlying implementations often differ greatly from what the BSDs do, and furthermore we only implement such things when the POSIX APIs do not suffice in some area.
These reports would be a lot more readable if they included links or inline explanations of all of the proper nouns used to describe bits of the system. They very often read to me like ‘the FromdleService just gained support for FrobnicarionPlus’ and I have no idea whether I should be interested (and I say this as someone who actually ran BeOS R5 and used BFS as a case study when teaching operating systems). It’s probably comprehensible to most of the Haiku developer community, but much less so to anyone outside who might want to become a contributor.
It’s probably comprehensible to most of the Haiku developer community, but much less so to anyone outside who might want to become a contributor.
Indeed these reports started years ago precisely to keep the community informed and on the same page, and to provide something less technical and more accessible than the commit logs. That the posts are now getting spread more widely than that likely means we should indeed try to make them a bit more accessible. But for the community, such explanations would be redundant, so it’s a tricky balance to strike.
of all of the proper nouns used to describe bits of the system.
Do you just mean internal Haiku-specific components, like (e.g.) app_server, userlandfs? Because (say) FreeBSD callouts are obviously not Haiku-specific, nor is strace (our implementation is, but the concept of course is not), etc.
But for the community, such explanations would be redundant, so it’s a tricky balance to strike.
For what it’s worth, when I helped edit the FreeBSD status reports, we had a policy of adding parenthetical definitions to these kinds of things and never had complaints from the community (quite the reverse).
Do you just mean internal Haiku-specific components, like (e.g.) app_server, userlandfs? Because (say) FreeBSD callouts are obviously not Haiku-specific, nor is strace (our implementation is, but the concept of course is not), etc.
I know what the callout subsystem in FreeBSD does, but I would still have added a short description in an intro paragraph to a FreeBSD status report mentioning it. The same would be helpful in a Haiku one. Note that in the FreeBSD status report, it would have linked the callout(9) man page, so the inline definition would be less important. You can use links like this or definitions that pop up on click to avoid disrupting the flow for people who know the subject well.
Please keep writing these reports though, they’re great for someone like me to see progress in Haiku.
“people think programming in Excel is programming”
I mean, it is Turing-complete with =LAMBDA. I find it a bit distressing when programmers, especially influential ones, try to denigrate an environment or language they don’t like as “not real programming”. This reminded me of an article on contempt culture.
there is no way to have a flexible innovative system and serve the Posix elephant.
IBM i, which actually predates POSIX by some amount, is somewhat popular in my circles as an example of “what could have been” regarding CLIs, alternative programming paradigms, etc. It has a functional POSIX layer via AIX emulation (named PASE).
DOS and OS/2 had EMX which provided most of POSIX atop them. Mac OS 8/9 had GUSI for pthreads atop the horror show known as Multiprocessing Services. I’m pretty sure the Amiga had a POSIX layer. Stratus VOS. INTEGRITY. There are plenty of non-traditional, non-Unix platforms that are – at least mostly – POSIX conformant.
What I’m saying is there is absolutely no technological reason you couldn’t slap a POSIX layer atop virtually anything, even if it wasn’t originally designed for it. Hell, I would even suggest you could go all-out and design this “flexible innovative system” and have someone else put a POSIX layer atop it. You inherit half the world’s software ecosystem for “free” with good enough emulation, and your native apps will run better and show the world why they should develop for that platform instead of against POSIX, right?
But then, even Windows is giving up and making WSL2 a first-class citizen. This isn’t because of some weird conspiracy to make all platforms POSIX. It is because the POSIX paradigm has evolved, admittedly slowly in some cases, to provide a “good enough” layer on which you can build different platforms.
And abandoning POSIX could also lead to a bunch of corporations making locked-in systems that are not interoperable. Let’s not forget the origins of X/Open and why this whole thing exists…
APIs for managing threads and access to shared memory should be re-thought with defaults created for many-core systems
Apple released libdispatch in 2009 with Snow Leopard under an Apache 2.0 license. It supports Mac OS, the BSDs, Linux, Solaris, and since 2017, Windows (using NT native blocks, even). I actually wrote an iOS app using GCD to process large XML API responses and found it did exactly what it was supposed to: on devices with more cores, more requests could be processed at once, making the system more responsive. At the same time, at least the UI thread didn’t lock up when your single-core 3GS was still churning through.
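The plain-C flavour of that pattern looks roughly like this (a sketch; the function and payload names are just illustrative):

    #include <dispatch/dispatch.h>
    #include <stdio.h>

    static void process_response(void *ctx)
    {
        /* stand-in for parsing one large XML response */
        printf("processing %s\n", (const char *)ctx);
    }

    int main(void)
    {
        dispatch_queue_t q =
            dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
        dispatch_group_t g = dispatch_group_create();

        /* the runtime decides how many of these actually run in parallel,
           scaling up with the number of cores available */
        dispatch_group_async_f(g, q, (void *)"response-1", process_response);
        dispatch_group_async_f(g, q, (void *)"response-2", process_response);

        dispatch_group_wait(g, DISPATCH_TIME_FOREVER);
        dispatch_release(g);
        return 0;
    }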
And yet nobody uses libdispatch. Sometimes I hear “ew, Apple”, which may have been a bigger influence back then. Now, there’s really no excuse. I think it’s just inertia. And nobody wants to introduce more dependencies when you’re guaranteed POSIX and it works “good enough”.
create systems that express in software the richness of modern hardware
I think it should be the exact opposite. Software shouldn’t care about the hardware it is running on. It could be running on a Raspberry Pi Zero, or a z16. The reason POSIX has endured for this long is because it gives everyone a base platform to build more rich frameworks atop. Libraries like libdispatch are a good example of what can be built to take advantage of different scales of hardware without abandoning the thing that ensures we have an open standard that all systems are “guaranteed” to (mostly) follow.
I might use this comment as the basis for an article on my own, and go into more detail about what I think POSIX gets right and wrong, and directions it could/should head.
I might use this comment as the basis for an article on my own, and go into more detail about what I think POSIX gets right and wrong, and directions it could/should head.
Relatedly, there is a misconception that has been around for years that Haiku, which I am one of the developers of, is “not a UNIX” or “only has POSIX compatibility non-‘natively’”. When this is corrected, some people are more than a little dismayed; they thought of Haiku as being “different” and “exotic” and are sad to discover that, under the hood, it’s less so than they imagined! (Or, often, it still is quite different and exotic; it’s just that “POSIX” means a whole lot less than most people may come to assume from Linux and the BSDs.)
The review of Haiku’s latest release in The Register originally included this misconception, and I wound up in an extended argument (note especially the reply down-thread which talks about feelings) with the author of the article about it (and also in an exchange with the publication itself on Twitter.)
Relatedly, there is a misconception that has been around for years that Haiku, which I am one of the developers of, is “not a UNIX”
Isn’t that true? It’s not a descendant of BSD or SysV, nor has it ever been certified as a UNIX. If someone called Haiku a UNIX then they’d have to say the same about Linux, which would be clearly off. Even Windows NT4 was POSIX-compliant and I’ve never met anyone who considers Windows to be a UNIX variant.
The review of Haiku’s latest release in The Register originally included this misconception, and I wound up in an extended argument (note especially the reply down-thread which talks about feelings) with the author of the article about it
Hah, I had a similar (though briefer) exchange with the same author at https://news.ycombinator.com/item?id=34772982. I think that particular person just doesn’t have much interest in getting terminology correct before rushing their articles out the door.
This may come as an unpleasant revelation, but sometimes, just saying to someone “that isn’t right” is not going to change their mind. You didn’t even bother to reply to my comment on HN, so how you can call that an “exchange” puzzles me. You posted a negative critical comment, I replied, and you didn’t.
Ah well. Your choice.
No, I do not “just rush stuff out”, and in fact, I care a very great deal about terminology. I’ve been a professional writer for 28 years, have written for some 15 magazines and sites in a paid capacity, and have been a professional editor as well. It is not possible to keep working in such a business for so long if you are slapdash or slipshod about it.
As for the technical stuff here:
I disagree with @waddlesplash on this, and I disagree with you as well.
I stand by my position on BeOS and Haiku: no, they are not Unixes, nor even especially Unix-like in their design. However, Haiku has a high degree of Unix compatibility – as does Windows, and it’s not a Unix either. OpenVMS and IBM z/OS also have high degrees of Unix compatibility, and both have historically passed POSIX testing, meaning that they could, if they wished, brand as being “a UNIX”.
Which is where my disagreement with your comment here comes in.
Linux has passed the testing and as such it is a UNIX. Like it or not, it has won Open Group branding, and although none of the 2-3 vendors who’ve had it in the past still pay for the trademark, it did pass the test and thus it counts.
No direct derivative of AT&T UNIX is still in active development any more.
No BSD has ever sought the branding, but I am sure they easily could pass the test if they so wished. It would however be a waste of money.
I would characterise Haiku the same as I would OpenVMS, z/OS and Windows NT: (via its native POSIX personality) a non-Unix-like OS, which does not resemble traditional Unix in design, in implementation, in its filesystem design or layout, or in its native APIs. However, all of them are highly UNIX compatible – about as UNIX compatible as it’s possible to be without actually being one. OpenVMS even used to have its own native X11 server, although I don’t think it’s maintained any more. Haiku, like RISC OS, has its own compatibility library allowing X11 apps to run and display in the native GUI without running a full X server.
Linux is a UNIX-like design, implemented in the same language, conforming to the same spec, implementing the same APIs. Unlike Haiku, z/OS or OpenVMS, it has no other alternative native APIs or non-UNIX-like filesystems or anything else.
Linux is a UNIX. By the current strict technical definition: it passed the Open Group tests which subsumed and replaced POSIX decades ago. And by a description: it’s a UNIX-like design built with Unix tools in the Unix preferred language, and nothing else.
Haiku isn’t. It hasn’t passed testing, and it isn’t Unix-like in design, or implementation, or native APIs, or native functionality.
The one that is arguable, to me, is none of the above.
It’s macOS.
macOS has a non-Unix-like kernel, derived from Mach, but with a big in-kernel UNIX server derived from BSD code. It has its own native non-Unix-like APIs, but they mostly sit on top of a UNIX-derived and highly UNIX-like layer. It has its own native GUI, which is non-UNIX-like, and its own native configuration database and much else, which are non-UNIX-like and implemented in non-UNIX-like languages.
It doesn’t even have a case-sensitive filesystem, one of the lowest common denominators of Unix-like OSes.
But, aside from its kernel, it’s highly UNIX-like until you get up to the filesystem layout and the GUI layer – all the UNIX directories are there, just mostly empty, or populated with stubs pointing the curious explorer to Netinfo and so on.
For X11 apps, it does in fact run a whole X server based on X.org.
But macOS has passed testing and Apple does pay for the trademark so, by the strict technical definition, it 100% is a UNIX™.
If someone called Haiku a UNIX then they’d have to say the same about Linux, which would be clearly off.
Well, there are people who say it about Linux. After all, POSIX is part of the “single UNIX specification”, so it is somewhat reasonable. But if people want to be consistent and not use the term for either Linux or Haiku, that’s fine by me. It’s using the term for only one and not both that I object to as inconsistent.
libdispatch is kind of an ironic example. The APIs lend their implementations to heap allocations at every corner and to thread explosion. Most of them could be addressed with intrusive memory and enforced asynchronous behavior at the API boundary.
It’s like POSIX in a sense where it’s “good enough” for taking some advantage of various hardware configurations but doesn’t quite meet expectations on scalability or feature set for some applications. POSIX APIs like pthread and select/poll, under this lens, also take advantage of hardware and are “good enough”.
If that’s all that is required by the application then it’s fine, but lower/core components like schedulers, databases, runtimes, and those which provide the abstractions that people use over POSIX APIs generally want to do the best they can. Only offering POSIX at the OS level limits this, and I believe it is why things like io_uring on Linux, ulock on Darwin, and even epoll/kqueue on both exist.
Now these core components either try (pretty hard) to design APIs that work well across all of these extensions (including, and limiting-ly so, POSIX) or they just specialize to a specific platform. It’s too late to change now, but there are more scalable API decisions for memory, IO, and synchronization that POSIX could have adopted, which could be built on top of older POSIX APIs; surprisingly, Windows ntdll is a good place to look for inspiration here.
What I’m saying is there is absolutely no technological reason you couldn’t slap a POSIX layer atop virtually anything, even if it wasn’t originally designed for it. Hell, I would even suggest you could go all-out and design this “flexible innovative system” and have someone else put a POSIX layer atop it.
Well there’s at least one, and the article starts into this a little bit: That POSIX layer you’re talking about takes up space and CPU, so if you’re designing a small system (or even a “big” one optimised for cost or power efficiency) you might like to have that on the negotiating table.
I heard a story about a chap who sold Forth chips, and every time he tried to break out they would ask for a POSIX demo. They eventually made one, and of course it was slow and made everything warm, so it didn’t help. Now if you know Forth, this makes sense, but if you don’t know Forth – and heck, clearly management didn’t either – you might not understand why you can’t have your cake and eat it too, so “slapping a POSIX layer atop” might even make sense. But Forth systems are really different, really ideal if you can break your problem down into a bunch of little state machines, and it’s hard to sell that to someone whose problem is buying software.
Years later, I worked for a company who sold databases, and a frequent complaint voiced by the market, at trade shows and in the press, was that they didn’t have an SQL layer. So they made one, but it really just handled the ODBC and some basic syntactic differences; maybe it was barely SQL92 if you squinted. So the complaint continued to be heard in the market, and the company made another SQL layer. When I joined they were starting the fourth or fifth version, and I’m like, this is just like the Forth systems!
But then, even Windows is giving up and making WSL2 a first-class citizen. This isn’t because of some weird conspiracy to make all platforms POSIX. It is because the POSIX paradigm has evolved
This might be more to do with the value of Linux as opposed to POSIX. For many developers (maybe even most), Linux is hands-down the best development environment you can have, even if your target is Windows or Mac or tiny Forth chips, and I don’t think it’s because of POSIX, or really any one thing, but I do think if something else had been better, Microsoft probably would have used that instead (or in addition to it: look at how they’re treating the web platform with Edge!)
That being said, I think POSIX was an important part of why Linux is successful: Once upon a time Linux was a pretty goofy system, and at that time a lot of patches were simply justified as compliance with POSIX, which rapidly expanded the suite of software Linux had access to. Having access to a pretty-good spec and standard meant people who ported programs to early Linux fixed those problems in the right place (the kernel and/or libc) instead of adding another #ifdef __linux__.
That POSIX layer you’re talking about takes up space and CPU, so if you’re designing a small system (or even a “big” one optimised for cost or power efficiency) you might like to have that on the negotiating table.
I can appreciate that. I focused on that because the article spent so much time waxing poetic about how it’s “hard” to find a computer with less than “tens of CPUs”. At that scale, it would be equally “hard” to justify not having a POSIX layer.
A chip designed to run Forth would be quite an interesting system! I don’t know if I’ve ever heard about one. I know of LispMs, and some of the specialised hardware to accelerate FORTRAN once upon a time.
they didn’t have an SQL layer, so they made one
You can make an SQL layer atop pretty much any database, even non-relational ones, if you squint hard enough. I suppose it’s the same thing with POSIX layers. Not always the best idea, but the standards are generous enough in their allowances that it can be done.
POSIX was an important part of why Linux is successful
Yes. In the early days, it gained a lot of software with little porting effort. Now, it makes it easy to port workloads off other Unix platforms (like Solaris). In the future, it might just be the way that Next-New-OS bridges to bring Linux workloads to it.
This is an excellent write-up! It was especially neat to see you came across (and apparently read, or at least skimmed!) the internals documentation, and utilized knowledge from that in the write-up. Kudos!
Excellent article, thanks for sharing!
Thanks; I have eyed Haiku with interest from time to time! It does strike me as in line with some of my values. Maybe I’ll give it a more earnest try.
Update, impressions after a few hours of uptime: This is nice. I could get comfortable here. It’s really cohesive and all the orthogonal combinable parts feel well chosen. The spatial Finder is alive and well. I was especially impressed when I figured out how a plain text file was able to have rich text styling, while remaining perfectly functional with cat. Someone really must be steering this ship because it emphatically doesn’t suck.
Thanks for your kind words! Some of the “orthogonal combinable” parts (“styled text in extended attributes” and spatial Tracker included) are ideas we originally inherited from BeOS (but of course we’ve continued that philosophy). And we actually don’t have a single “project leader” role; the project direction is determined by the development team (with the very occasional formal vote to make final decisions; but more often there is sufficient consensus that this simply is not needed.) But we definitely have a real focus on cohesiveness and “doing things well”, which sometimes leads to strife and development taking longer than it does with other projects (I wrote about this a few years back in another comment on Lobsters – the article that comment was posted under, about Haiku’s package manager, is also excellent and worth a read), but in the end it leads to a much more cohesive and holistic system design and implementation, which makes it all worth it, I think.
I’ll read up, thanks!
So much this. I am continually disappointed that both Gnome and KDE managed to fuck this up. Given that they both started as conceptual clones of Windows (more-or-less) I guess it isn’t surprising, but still
On the other hand, having an entire Super key available for essentially whatever you want to do with it is… super.
A minimal Linux setup can be great. You have few components that are really well maintained, so nothing breaks. It also moves slowly. My setup is essentially the same as back in 2009: A WM (XMonad), a web browser (Firefox), and an editor (Emacs). When I need some other program for a one-off task, I launch an ephemeral Nix shell. If you prefer Wayland, Hyprland can give you many niceties offered by desktop environments, like gestures, but it is still really minimal.
Nix is really cool for keeping a system clean like that. I bounced off NixOS because persisting the user settings I cared about started to remind me of moving a cloud service definition to Terraform. I do love the GC’ed momentary tool workspaces.
If Emacs was my happy place, I think your setup would be really pleasing. But I am a GUI person, to the degree that tiling window managers make my nose wrinkle. Windows are meant to breathe and crowd, I think. That’s related to the main reason I want apps to work similarly, because I’ll be gathering several of them around a task.
I want to believe there is a world in which a critical mass of FOSS apps are behaviorally cohesive and free of pesky snags, but I have my doubts that I’d find it in Linux world, simply because the culture biases toward variety and whim, and away from central guidance and restraint. Maybe I just don’t know where to look. BSDs look nice this way in terms of their base systems, but maybe not all the way to their GUIs.
Thanks for your work on Lobsters and for writing all this up, Peter. Are there any social media posts which we can “boost for visibility” (whether ones linking to this post, or others)? And, if making a similar post on other forums that would be similarly affected (e.g. the Haiku project forums), do you mind if we cross-link, or perhaps even borrow some of your verbiage?
This is probably the best one, I’ve tried to summarize a woolly topic clearly. You’re welcome to use it as an example or quote as is useful. My goal for the post was that people with more jurisdictional relevance get involved in helping find a better outcome for this situation, so I’d really appreciate you helping make connections or progress with the relevant authorities.
At $PREVIOUS_JOB, a team member left for employment elsewhere, and I was tasked with taking over his code. He had needlessly wrapped syslog() with a function called Syslog() that took the same parameters (and our code would only ever run on Unix). When I removed the wrapper and fixed all the call sites (easy enough), boy, did I find errors. The compiler knew to check the format string for syslog(), but not that of the wrapper. [1] I’m a proponent of having the compiler find bugs so you don’t have to.
[1] It didn’t help things that all the format strings were #defined elsewhere in the code, so all I initially saw at each call site was a macro name.
While in your particular case it obviously made sense to just drop the wrapper and use the real method, if one ever does want the compiler to check printf-format strings that are being passed to a function not named printf, you can use __attribute__ ((format (__printf__, (a), (b)))). For example, you can declare a function debug_printf that takes a printf format string in argument #1 and the parameters to it starting in argument #2. So, both in the case you describe here as well as the one in the article, this attribute could be used to eliminate the “tradeoff” and keep the compiler type checking.
There’s also another, related attribute called format_arg, which tells the compiler that any string passed to the specified argument will cause a “similar” format string to be returned. The GCC documentation notes this can be used for translations: e.g. if one declared a translate function this way, then using its result as a printf format string will result in the compiler checking the string inside the call to translate against the printf invocation.
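To make that concrete, a minimal sketch (debug_printf and translate are just the illustrative names from the description above, not functions from any real library):

    /* Build with: cc -Wall format_check.c */
    #include <stdarg.h>
    #include <stdio.h>

    /* format: argument 1 is a printf-style format string, values start at argument 2. */
    void debug_printf(const char *fmt, ...) __attribute__((format(__printf__, 1, 2)));

    void debug_printf(const char *fmt, ...)
    {
        va_list ap;
        va_start(ap, fmt);
        vfprintf(stderr, fmt, ap);
        va_end(ap);
    }

    /* format_arg: the returned string is a format string derived from argument 1. */
    const char *translate(const char *msg) __attribute__((format_arg(1)));
    const char *translate(const char *msg) { return msg; }   /* identity "translation" */

    int main(void)
    {
        debug_printf("%s took %d ms\n", "startup", 42);   /* checked like printf */
        printf(translate("%d items\n"), 3);               /* "%d items" is checked against the printf call */
        return 0;
    }

With GCC or clang and -Wall, a mismatched call such as debug_printf("%s", 42) then gets the same -Wformat warning a bad printf call would.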
True, and I knew about that, but not the other developer. Also, while we did use GCC (on Linux) and clang (on Mac OS X), we also had to compile for Solaris using their native C compiler. I don’t think it supports __attribute__(). Edited to add: yes, I know, conditional compilation and #defines can work around this. I just wanted to mention that not all C compilers support __attribute__().
I don’t mean to kick up sand, but that sounds like normal Haiku software. Haiku is only second to ReactOS as the most unstable OS I’ve ever used (sadly I’ve never gotten to try Copland OS). Isn’t the web browser still causing kernel panics when you try to use it?
I only know of one intermittently-reproducible kernel panic from the web browser, and it’s mostly an issue on recent nightly builds only, not beta5 (it’s due to a recently-added assert failing). What crashes are you encountering? We have Haiku VMs that accumulate pretty long uptimes for package builds with no crashes…
I didn’t know either of them were particularly unstable, although I’m less surprised about ReactOS. I think they started that in the slightly-wrong direction of duplicating the NT architecture with new code rather than just API-compatible services atop a proven core, and will forever be stuck in a bad place because of it.
Wow this is super impressive! This isn’t even in Open or NetBSD.
How nice of them. Chromium team should take a lesson. Just look at this madness in FreeBSD!
The Chromium team refuses to take patches for supporting operating systems with few users. Except Fuchsia, of course.
We have a similar “madness” for the QtWebEngine port (which isn’t a full Chromium port, and it’s not exactly a recent one either…), and a lot of our patches were borrowed from FreeBSD, in fact.
Right, now that I’ve recovered from the man flu somewhat, I can come back and say something more useful than “hey man, cool interview!” Sorry to spam the thread, but I can’t edit my old one anymore.
I think the most interesting and useful takeaway from this interview is this:
The framebuffer abstraction is super straightforward, but if you peek under the hood of graphics systems for a bit you’ll find that, like, half the history of the development of modern graphics hardware interfaces basically consists of trying to figure out how to build (what amounts to) a good framebuffer API -> vector operations pipeline transpiler. From a distance, it looks like it’s not a solved problem at all.
If @david_chisnall and @crazyloglad don’t mind expanding on the relay thing:
I was otherwise engaged until this morning so didn’t have time to notice :-)
For 1 - the biggest pitfall, locally and systemically, that damn near everyone runs into is synchronisation. That’s why I suggested the resize case as a mental model for unpacking graphics, as the system needs to converge towards a steady state while invalidating requests might still be in-flight. There is a lot to unpack in that, and whatever you choose, you get punished for something. Even if you are a single source/sink /dev/fb0 kind of a deal, modeset is another one that’ll get you.
Then comes colour representation and only thinking ‘encoding implies equivalent colour space’. This gets spicy when you only have partial control of legacy and clients. There will be those sending linear RGB for a system that expects sRGB and vice versa, and that goes for the hardware chain as well. On top of this is blending. On top of that is calibration and correction.
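To make the encoding-versus-colour-space trap concrete, a small sketch (values assumed normalised to [0, 1]); handing linear values to something that expects this curve already applied, or vice versa, is exactly the mismatch described above:

    #include <math.h>
    #include <stdio.h>

    /* The sRGB transfer function: linear light -> sRGB-encoded value. */
    static double linear_to_srgb(double c)
    {
        return (c <= 0.0031308) ? 12.92 * c
                                : 1.055 * pow(c, 1.0 / 2.4) - 0.055;
    }

    int main(void)
    {
        /* 50% linear light encodes to roughly 0.735 in sRGB, so skipping
         * (or double-applying) the curve is very visible on screen. */
        printf("%.3f\n", linear_to_srgb(0.5));
        return 0;
    }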
Then comes device representation that ties all of this together, e.g. what the output sink actually can handle versus what you can provide, and the practical reality that any kind of identity+capability description we’ve ever invented for hardware even as static as monitors gets increasingly creative interpretations as manufacturing starts.
For 2 - I well remember flink and might have good reason to not trust the current system as well. The heavy investment into hardware compartments and network transparency is personally motivated by that distrust. Are you thinking of the whole stack (i.e. specific window contents, sharing model between multiple-source-multiple-sinks, deception like a full screen client that looks just like the login-screen?) or the GPU boundary specifically (i.e. GEM/dma-buf/…)?
It’s okay, Paracetamol makes me drowsy as hell so I was “otherwise engaged”, too, as in I slept through most of Sunday :-D.
That list of common pitfalls is pretty disappointing to read in a Linux context. I actually recognize some of those from my Linux BSP sweatshop days. Eek!
The GPU boundary specifically, the former covers way too much ground to make it a useful question IMHO.
If you contrast them with the state of some other project, not to invite the keyword, how well did their past experiences with maintaining a popular display server avoid these basic pitfalls?
So Arcan has a systemic opinion in that ‘very large ground’ space as somehow all the layers need to be tied together for the boundary to have more bite than the equivalent ‘fork + unveil/pledge’ like short compartments for things. Do note that I have basically /ignored all of CUDA etc. for ‘other GPU uses’.
With the work it takes, the open source spectrum only really leads to that one coarse grained viable interface that everyone copies near verbatim post the render-node/dma-buf change, they are just in different stages of being synched to it (can’t say for Haiku though, a while since I last poked around in there, maybe @waddlesplash).
There are interpretations for how you can leverage them (afair Genode handles it slightly different for a good experimental outlier in general), but in the end there’s only so much you can ‘do’ – opaque interchangeable sets of tokens representing work items that piggyback on some other authentication channel or a negotiated/authenticated initial stream setup that gets renegotiated when boundary conditions (resize) change. Even Android isn’t much different in this regard outside of petty nuances.
In the proprietary space, although you’ll find little documentation on the implementation (or at least I didn’t, back when I hadn’t yet given up on the platform entirely), IOSurfaces in the OS X sense have a more refined model in the opaque sense, in contrast to EGLStreams.
Haiku doesn’t have GPU acceleration yet (with the exception of one experimental Vulkan-only driver for Radeon Southern Islands that one contributor wrote), so we haven’t settled on a design for that API; doubtless we will just use Mesa so something under the hood will still provide dma-bufs or an equivalent, eventually.
But one thing which is notable here is that Haiku still does server-side drawing, for all applications that use the native graphics toolkit, anyway. Stuff like Qt and GTK of course just grabs a shared-memory bitmap and draws it repeatedly, but native applications send drawcalls to the server, specifying where they’re to be drawn into (a window, a bitmap, etc.) So, there’s a lot of leeway here, should we eventually get GPU acceleration and decide to experiment with GPU rendering, for the server to “batch” things, share contexts efficiently, etc. which applications on Linux can’t do anymore in the Wayland era (and didn’t do for a long time before that, usually, because X11’s rendering facilities didn’t really keep pace with what people expected 2D graphics drawing APIs to be.)
Well we can coquettishly suggest HVIF or SVG as a new entry to wl_shm::format and they’d tick the server side graphics box about as well as many other ones.
Neither of those are really designed for on-the-fly generation, though. HVIF in particular has a lot of features which make it very compact at the expense of writing and decoding time. The graphics protocol that Haiku uses for real-time drawing isn’t related to it. (Though it also has an “off the wire” form: BPicture files.)
Are USB audio devices in the kernel in Haiku? I have fond memories of BeOS 5 popping up a dialog to tell me that the audio stack had crashed and been restarted, with less than a second’s interruption to music playback. On the same machine, the sound card driver would routinely panic the Linux kernel and blue-screen Windows (Creative Labs absolutely could not be trusted in ring 0).
The USB audio driver is in-kernel for the moment, but that’s just because it interfaces with the media server the same way all the other audio drivers do. It could very much be ported to userland as a separate media output module instead, but that’d require a bit of work I didn’t see much reason to do at the moment.
The userland portions of Haiku’s audio stack can be restarted on-the-fly, but there’s some bugs remaining that mean applications currently outputting audio also have to be restarted and won’t automatically reconnect…
Does Haiku use the FreeBSD network stack, and, if so, does this share code with libuinet?
Being able to run pcap files against the network stack sounds really useful for fuzzing.
No, we do not; so no code is shared here.
It’s also just a shell for the TCP implementation and nothing else; it doesn’t run the IPv4/v6 portions of the stack, or the routing logic, etc.
Ah, that’s not what the changes described in the report were about; that was just about adding a way to dump packets from the userland test harness into a pcap file. But in fact replaying pcap files can already be done with “tcpreplay”, I think.
Unrelated to this project, I had started writing a source-level compatibility library for BeOS decades ago as well on top of Linux.
IIRC, the common threads implementations at the time (this was in the LinuxThreads era, with true POSIX threads just starting to become available on Linux) couldn’t quite do what I needed them to do to emulate BeOS threads. At the time there wasn’t a reliable way to get a thread to start in a suspended state (I don’t think? It’s been a long time), and even today I think it’s not actually possible with pure POSIX threads. You could probably do it with clone(2) pretty easily now.
There were other, simpler problems too, like how BeOS’s error constants were negative numbers.
Again, this was in like 1998, so take this with a grain of salt. It was super fun and educational and I will be watching this project with great interest.
(Also, if it were up to me — and it’s not — I’d have moved the query parser for filesystem queries to user space and have a system call that took a parsed tree instead. From what I understand from reading Practical Filesystem Design by Giampaolo, which describes the implementation of the Be filesystem, the query parsing code was all in kernel space and made up a significant amount of the BFS code base. There’s probably a good reason for doing it that way that I just don’t see.)
All of the things that you’re querying require direct access to inodes. If you move the query parser into userspace, you still have query execution in the kernel. That means that you need to take the query from userspace, parse it, and then serialise it in a form that the kernel can execute. It’s not clear to me that this would reduce the amount of parsing in the kernel considerably.
If you moved the query execution into userspace then you’d have a lot more kernel <-> userspace traffic, because you’d need to pass intermediate results up to userspace. If you didn’t do query planning correctly, you could very easily end up passing lists of most inodes in the system up to userspace accidentally.
Spotlight (designed by the same person as BFS) avoids this by keeping the database of metadata entirely in userspace. This has the downside that it can easily get out of sync.
At least on Haiku, it doesn’t: the query portions of BFS are 1039 SLoC out of 13,144 in BFS altogether, plus another few hundred in common code. (Numbers counted by cloc.) I guess it’s more if you count the indexing system, but that’s actually part of the on-disk structures.
And by the way, I replied to the post on the NetBSD mailing list a few weeks back.
The normal way of doing this is to start the thread where the first thing that it does is wait on a mutex, which you hold in the caller. Did you need to do something to the thread that relied on it being in a known state (e.g. poke at its saved register file)? You can do that via ptrace, but it’s a very painful (trace yourself, request tracing of thread-creation events, create the thread, catch the event, poke the thread).
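Roughly this, as a minimal sketch of that pattern:

    #include <pthread.h>
    #include <stdio.h>

    /* The creator holds this mutex, so the new thread parks immediately,
     * giving a crude "start suspended" with plain POSIX threads. */
    static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        pthread_mutex_lock(&gate);     /* blocks until the creator "resumes" us */
        pthread_mutex_unlock(&gate);
        printf("worker running\n");
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_mutex_lock(&gate);                 /* close the gate first */
        pthread_create(&t, NULL, worker, NULL);    /* thread starts, then parks */
        /* ...do whatever needs to happen before the thread runs... */
        pthread_mutex_unlock(&gate);               /* "resume" */
        pthread_join(t, NULL);
        return 0;
    }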
Yeah I did something like that, IIRC, but it was painful.
BeOS also does some stuff with signal handling on threads that isn’t directly mappable to pthreads.
Just a random guess, but I’d expect that it was for performance and shoving it in kernel space involved fewer context switches and/or page table walks than doing it in userland. IIRC Windows used to do the same with certain graphics code and such.
For what it’s worth: FreeBSD made getentropy a system call rather than a read of the device to better support Capsicum. Without this, you needed to open the random device before entering capability mode.
I’d love to see Capsicum support in Haiku.
Yes, that’s what’s happening here: Haiku has a “generic syscall” mechanism which functions something like “ioctl without a file descriptor”. The kernel and drivers can register themselves to be invoked through this system. You can see the commit that implemented this.
Exciting. Is there a roadmap for capsicum support?
No, we don’t have any particular plans for that.
Sorry, I misunderstood the ‘that’s what’s happening here’. I’m now not sure what that meant in your previous post.
Ah, I thought you were just asking if “getentropy” worked without using a file descriptor, to which the answer is “yes.” We didn’t implement it this way to support Capsicum or anything, but rather so that getentropy still works properly when a process is close to its limit of FDs, and also for efficiency, so it doesn’t need an open/close on every invocation.
Ah. The perf wasn’t an issue on FreeBSD because a process rarely calls it (it seeds a PRNG in libc that periodically gets more entropy; if you’re calling it on a hot path then you’re doing something wrong, especially on modern x86 where you really want to feed RDRAND into your userspace PRNG and just use getentropy once to protect against supply chain attacks on the hardware RNG [FreeBSD doesn’t do this yet]). I don’t think the old code did an open and close on every invocation, just an open on first use.
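A sketch of that call pattern (the generator here is purely illustrative, not production crypto):

    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>   /* getentropy() lives here on FreeBSD and glibc >= 2.25 */

    static uint64_t state;

    static uint64_t prng_next(void)   /* xorshift64*, just to have something to seed */
    {
        uint64_t x = state;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        state = x;
        return x * 0x2545F4914F6CDD1DULL;
    }

    int main(void)
    {
        /* One call at startup; no device node, no file descriptor, ever again. */
        if (getentropy(&state, sizeof(state)) != 0) {
            perror("getentropy");
            return 1;
        }
        for (int i = 0; i < 4; i++)
            printf("%016llx\n", (unsigned long long)prng_next());
        return 0;
    }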
Running out of file descriptors might be an issue, but those limits have been high enough that non-malicious code doesn’t hit them and if you’re one below them then you have so many open that you’re probably going to see failures from other things soon. Does Haiku have low limits here? It was a problem on OS X for the first couple of releases because the default values hadn’t changed since OPENSTEP 4, I had a script that set the sysctl values to a couple of orders of magnitude higher than the default to prevent things dying, but after 10.4 Apple made the defaults sensible.
It was essential for Capsicum because you lose access to the global namespace once you enter capability mode and so can’t open the device node. You’d need libc to open the device before you called cap_enter and that might have surprising effects (in particular, code often calls closefrom just after for some defence in depth against things being accidentally opened concurrently). I believe Linux did it for similar reasons (seccomp-bpf policies wanting to disallow open and not being able to special-case a particular path because BPF can’t see into string arguments).
I imagine the perf wouldn’t have been an issue indeed, but we wanted to do things correctly nonetheless.
I don’t think our limits are especially low. We’ve raised them now and again when big applications come along, but it’s been a while since we’ve needed to do that, I think.
I really like the Haiku icon format though SVG is a pretty evil comparison, it’s notoriously verbose. I think the Haiku format is a bit denser than PostScript, but closer to a factor of two (maybe less, depending on what you’re doing) rather than more than ten.
I am less convinced by the cost of loading all of the files though, for two reasons. First, I would expect the icon files to mostly live in the buffer cache if you keep reading the files. If you’re rendering directly from them then you can mmap them and that will keep them live in the buffer cache for anyone else. With modern SSDs (and even disks), you’re likely to write a load of them at once and so they’ll end up contiguous on disk (since you control the FS, you could even guarantee that) and so command queuing means that a load of reads for each of them will be coalesced and you’ll get a single read. With both SSDs and spinning rust, sequential reads are much cheaper than random reads, so doing two reads of sequential sectors is basically the same cost as doing a single read of a sector, but doing 100 reads of sectors scattered over the disk is 100 times slower than doing 100 reads of individual or pairs of sectors. You get much more of a speedup from packing multiple icons into a single contiguous block than from making the icons small.
Perhaps more importantly, modern disks have 4 KiB sectors and so a 500 byte icon is going to end up wasting 3/4 of the space. From what I remember of BFS, it regards the file contents as just another form of metadata and so will embed it in the inode if possible, so if BFS has been extended for 4 KiB sectors then you will be closer to 50% overhead because the file and the inode will fit in a single sector. BFS and NTFS both have a big-object or small-object split but NTFS packs small objects into a dedicated region (which results in write amplification).
I’m really curious how this composes with the newer Haiku ‘apps are immutable filesystem layers’ model (which is amazing and everyone should copy). Presumably those filesystems can be packed, so you don’t have wasted overhead and can do a single read for all icons in a particular app but you have a lot of random reads for different bundles (though, again, caching is your friend).
A particular application’s icons are usually stored in “resources” appended to the ELF binary, not extended attributes; it’s only the application icon itself that will be in an attribute.
You can pick BFS sector size upon initialization. The default is 2048, you can choose 4096. I think you are correct about metadata being stored with the inode, but I’m not very familiar with BFS internals, to be honest.
HPKGs are block-compressed, so yes, there’s no wasted space. There are details on the format in our internals documentation (however, it’s a bit outdated; there have since been new minor versions adding new attributes and a Zstd compression method.) But indeed reading 10 files from 10 different packages, even if they all appear to be in the same directory, will require 10 different reads, decompresses, etc.
I fully expected one of the drawbacks of the vector format (compared to bitmap) to be that the 16x16 icons are less legible (at least the example in this post, the cassette player, is). There is something about hinting for small sizes. Must be why I love 9px Verdana :)
Would be interesting to try to come up with some kind of hinting for HVIF ’cause that example image shows a few fairly clear places where it matters most: darker outline, darker shadow, maybe higher contrast in general…
The article didn’t mention that HVIF also includes level of detail to optimise for rendering at smaller sizes.
oooooooh, neat!
There’s some information about the LoD feature with examples in the Icon-O-Matic documentation.
I’ve long had on my list of “hopes and dreams when bored” projects to create a Rust library for HVIF. So long that there are probably ones that exist now and I’m not going to look at crates.io because I’m more likely to be depressed by whatever I see than remaining blissfully ignorant of that fact that either no one beat me to it or someone beat me to it.
Well, someone beat you to it, but that person is a long-time Haiku community member & occasional contributor. So I don’t think you should feel too bad about it. (It’s also worth noting that HVIF is technically still an evolving format, e.g. there were some more features added as part of a GSoC project this past summer, and I don’t think that library supports those.)
Yeah, looks like about 2 years ago. I’m glad that somebody did it.
All I want is a Haiku with some security guarantees. Any security guarantees other than “there are none”.
Well, there’s more than “none”, and more than a lot of OSes had in the 90s/00s. We have SMAP/SMEP, NX bit, ASLR (including kernel ASLR), filesystem permissions checks (though these haven’t been battle-tested, since the GUI runs as root and very few users bother to useradd & su to a non-root user), syscall permissions checks (though these haven’t been fully audited or battle-tested either), and more.
TIL! Last I heard an official statement there was no prioritization of security features. Clearly I’m out of the loop! Any plans on supporting disk encryption?
Well, the statement is correct: they’re not a priority, but we have some anyway.
There is/was third-party support (though developed by a long-time project member) for disk encryption, but only off the boot disk. There are no immediate plans to incorporate that and support boot disk encryption at the moment, I don’t think.
Are there any plans to automatically create a non-root user on install and use it as the default? This is how I believe macOS does it, prompting for a root password when needed.
FWIW testing Haiku is eternally on my “when I have time” list, and for a lot of cases I’d love to ditch unixes for a simple single-user OS, but that user doesn’t have to be root if the separation is for security purposes ;)
There aren’t really any particular plans to do any security enhancements. Sometimes developers work on them, but they’re indeed not presently on any priority list. So, I guess the answer is “no”.
Fantastic to see kqueue on Haiku! I’m surprised that they made .NET work, the .NET use of kqueue upstream makes a few assumptions about kqueue behaviour that are specific to the XNU implementation.
I’m not sure what this means; AF_UNIX sockets can be SOCK_STREAM. Does this mean that it can now create SOCK_SEQPACKET sockets? If so, that’s a big quality of life improvement.
Does Haiku have clang support yet? Last time I looked it was GCC only and the prospect of having to use GCC put me off.
At present it’s only used for .NET socket notifications, which are pretty simple. Are you sure it really makes assumptions about XNU behavior? (There was kqueue use in a different component that made use of the “data” field, this component however didn’t even have an epoll version and so that was easy enough to disable for the present.)
No, it means it can also be used for AF_UNIX SOCK_DGRAM sockets, which were also implemented last month (and also noted in the activity report.)
Haiku does define SOCK_SEQPACKET in our headers, but I don’t think anything implements it at the moment. If you have a use-case (and ideally, a test-case), I suppose you can open a ticket.
We’ve had clang in the software depots for a while, and there are some packages that are always compiled with it, so yes. However, there are still some issues compiling Haiku itself with clang, and I think there’s at least one issue about Haiku’s own headers tripping up on clang’s -Werror (our non-varargs ioctl that wants a default parameter; there are WIP changes to fix that, but as it’s only an issue for C, not C++, and only with clang and -Werror, it hasn’t gotten much attention.)
I don’t remember the details, but I think it was in the filesystem watcher interface. The macOS implementation had some subtle differences to the FreeBSD one that needed working around.
SOCK_SEQPACKET is basically my default for AF_UNIX things now. Every SOCK_STREAM thing that I’ve ever implemented with it has tried to implement a message-oriented system on top of the stream, relying on the kernel to do this removes a load of complexity in userspace. On FreeBSD, devd re-exports the kernel messages over a seqpacket socket, which is great for consumers: every recvmsg receives exactly one devd event.
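A minimal sketch of why that’s convenient, on a system whose AF_UNIX supports SOCK_SEQPACKET (e.g. FreeBSD or Linux):

    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) != 0) {
            perror("socketpair");
            return 1;
        }

        /* Two sends... */
        send(sv[0], "first", 5, 0);
        send(sv[0], "second message", 14, 0);

        /* ...arrive as two distinct records: no length-prefixing or
         * re-framing needed in userspace, unlike with SOCK_STREAM. */
        char buf[64];
        ssize_t n = recv(sv[1], buf, sizeof(buf), 0);
        printf("%zd bytes: %.*s\n", n, (int)n, buf);   /* 5 bytes: first */
        n = recv(sv[1], buf, sizeof(buf), 0);
        printf("%zd bytes: %.*s\n", n, (int)n, buf);   /* 14 bytes: second message */

        close(sv[0]);
        close(sv[1]);
        return 0;
    }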
Great! Last time I looked there was a port of a very old version, but it looks as if it’s getting closer (16 is almost landed but 17 is in RC state upstream).
Ah, well, I didn’t implement that. I considered it, but Haiku’s filesystem notification system is pretty different from what kqueue provides, so for the moment I didn’t. It looked like libuv separated filesystem events from the rest of the event system, at least, though of course for kqueue it’s in the same interface.
What’s the difference from SOCK_DGRAM? Guaranteed in-order message delivery, I guess?
Pretty sure most AF_UNIX+SOCK_DGRAM implementations won’t drop packets except maybe under certain shutdown conditions, not sure. Either way, it likely wouldn’t be too hard to implement then, but it’s probably not any sort of priority for us.
Yes, exactly.
There are often places in kernel buffers where they can be dropped or reordered. If your datagram implementation already provides the relevant guarantees then you can probably implement it trivially.
Given that Haiku already adopts some FreeBSD APIs, I’m curious whether _umtx_op was considered for the userspace mutex support.
I don’t think it was.
A glance through the manual pages indicates that the _umtx_op APIs are much more complicated than what Haiku has (and maybe even more complicated than Linux futex, but I’m not an expert in either.)
We do implement some FreeBSD APIs in userspace, but the underlying implementations often differ greatly from what the BSDs do, and furthermore we only implement such things when the POSIX APIs do not suffice in some area.
If any of the authors are reading this:
These reports would be a lot more readable if they included links or inline explanations of all of the proper nouns used to describe bits of the system. They very often read to me like ‘the FromdleService just gained support for FrobnicarionPlus’ and I have no idea whether I should be interested (and I say this as someone who actually ran BeOS R5 and used BFS as a case study when teaching operating systems). It’s probably comprehensible to most of the Haiku developer community, but much less so to anyone outside who might want to become a contributor.
(I’m the author of the post.)
Indeed these reports started years ago precisely to keep the community informed and on the same page, and to provide something less technical and more accessible than the commit logs. That the posts are now getting spread more widely than that likely means we should indeed try to make them a bit more accessible. But for the community, such explanations would be redundant, so it’s a tricky balance to strike.
Do you just mean internal Haiku-specific components, like (e.g.) app_server, userlandfs? Because (say) FreeBSD callouts are obviously not Haiku-specific, nor is strace (our implementation is, but the concept of course is not), etc.
For what it’s worth, when I helped edit the FreeBSD status reports, we had a policy of adding parenthetical definitions to these kinds of things and never had complaints from the community (quite the reverse). I know what the callout subsystem in FreeBSD does, but I would still have added a short description in an intro paragraph to a FreeBSD status report mentioning it. The same would be helpful in a Haiku one. Note that in the FreeBSD status report, it would have linked the callout(9) man page, so the inline definition would be less important. You can use links like this or definitions that pop up on click to avoid disrupting the flow for people who know the subject well.
Please keep writing these reports though, they’re great for someone like me to see progress in Haiku.
Yeah, I think the gold standard for this is Dolphin’s reports, which are accessible even to non-technical users.
I mean, it is Turing-complete with =LAMBDA. I find it a bit distressing when programmers, especially influential ones, try to denigrate an environment or language they don’t like as “not real programming”. This reminded me of an article on contempt culture.
IBM i, which actually predates POSIX by some amount, is somewhat popular in my circles as an example of “what could have been” regarding CLIs, alternative programming paradigms, etc. It has a functional POSIX layer via AIX emulation (named PASE).
DOS and OS/2 had EMX which provided most of POSIX atop them. Mac OS 8/9 had GUSI for pthreads atop the horror show known as Multiprocessing Services. I’m pretty sure the Amiga had a POSIX layer. Stratus VOS. INTEGRITY. There are plenty of non-traditional, non-Unix platforms that are – at least mostly – POSIX conformant.
What I’m saying is there is absolutely no technological reason you couldn’t slap a POSIX layer atop virtually anything, even if it wasn’t originally designed for it. Hell, I would even suggest you could go all-out and design this “flexible innovative system” and have someone else put a POSIX layer atop it. You inherit half the world’s software ecosystem for “free” with good enough emulation, and your native apps will run better and show the world why they should develop for that platform instead of against POSIX, right?
But then, even Windows is giving up and making WSL2 a first-class citizen. This isn’t because of some weird conspiracy to make all platforms POSIX. It is because the POSIX paradigm has evolved, admittedly slowly in some cases, to provide a “good enough” layer on which you can build different platforms.
And abandoning POSIX could also lead to a bunch of corporations making locked-in systems that are not interoperable. Let’s not forget the origins of X/Open and why this whole thing exists…
Apple released libdispatch in 2009 with Snow Leopard under an Apache 2.0 license. It supports Mac OS, the BSDs, Linux, Solaris, and since 2017, Windows (using NT native blocks, even). I actually wrote an iOS app using GCD to process large XML API responses and found it did exactly what it was supposed to: on devices with more cores, more requests could be processed at once, making the system more responsive. At the same time, at least the UI thread didn’t lock up when your single-core 3GS was still churning through.
And yet nobody uses libdispatch. Sometimes I hear “ew, Apple”, which may have been a bigger influence in 2011. Now, there’s really no excuse. I think it’s just inertia. And nobody wants to introduce more dependencies when you’re guaranteed POSIX and it works “good enough”.
I think it should be the exact opposite. Software shouldn’t care about the hardware it is running on. It could be running on a Raspberry Pi Zero, or a z16. The reason POSIX has endured for this long is because it gives everyone a base platform to build more rich frameworks atop. Libraries like libdispatch are a good example of what can be built to take advantage of different scales of hardware without abandoning the thing that ensures we have an open standard that all systems are “guaranteed” to (mostly) follow.
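For instance, a minimal libdispatch sketch using its plain C API (this assumes clang’s blocks extension and linking with libdispatch, e.g. cc -fblocks example.c -ldispatch on non-Apple platforms):

    #include <dispatch/dispatch.h>
    #include <stdio.h>

    int main(void)
    {
        /* The library decides how many worker threads back this queue,
         * scaling with the hardware it finds itself running on. */
        dispatch_queue_t q = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
        dispatch_group_t group = dispatch_group_create();

        for (int i = 0; i < 8; i++) {
            dispatch_group_async(group, q, ^{
                printf("task %d\n", i);   /* tasks run concurrently across cores */
            });
        }

        dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
        dispatch_release(group);
        return 0;
    }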
I might use this comment as the basis for an article on my own, and go into more detail about what I think POSIX gets right and wrong, and directions it could/should head.
I’d love to read that!
I agree with pretty much all of this.
Relatedly, there is a misconception that has been around for years that Haiku, which I am one of the developers of, is “not a UNIX” or “only has POSIX compatibility non-‘natively’”. When this is corrected, some people are more than a little dismayed; they thought of Haiku as being “different” and “exotic” and are sad to discover that, under the hood, it’s less so than they imagined! (Or, often, it still is quite different and exotic; it’s just that “POSIX” means a whole lot less than most people may come to assume from Linux and the BSDs.)
The review of Haiku’s latest release in The Register originally included this misconception, and I wound up in an extended argument (note especially the reply down-thread which talks about feelings) with the author of the article about it (and also in an exchange with the publication itself on Twitter.)
Isn’t that true? It’s not a descendant of BSD or SysV, nor has it ever been certified as a UNIX. If someone called Haiku a UNIX then they’d have to say the same about Linux, which would be clearly off. Even Windows NT4 was POSIX-compliant and I’ve never met anyone who considers Windows to be a UNIX variant.
Hah, I had a similar (though briefer) exchange with the same author at https://news.ycombinator.com/item?id=34772982. I think that particular person just doesn’t have much interest in getting terminology correct before rushing their articles out the door.
As I said on HN:
Gee, thanks.
This may come as an unpleasant revelation, but sometimes, just saying to someone “that isn’t right” is not going to change their mind. You didn’t even bother to reply to my comment on HN, so how you can call that an “exchange” puzzles me. You posted a negative critical comment, I replied, and you didn’t.
Ah well. Your choice.
No, I do not “just rush stuff out”, and in fact, I care a very great deal about terminology. I’ve been a professional writer for 28 years, have written for some 15 magazines and sites in a paid capacity, and have been a professional editor as well. It is not possible to keep working in such a business for so long if you are slapdash or slipshod about it.
As for the technical stuff here:
I disagree with @waddlesplash on this, and I disagree with you as well.
I stand by my position on BeOS and Haiku: no, they are not Unixes, nor even especially Unix-like in their design. However, Haiku has a high degree of Unix compatibility – as does Windows, and it’s not a Unix either. OpenVMS and IBM z/OS also have high degrees of Unix compatibility, and both have historically passed POSIX testing, meaning that they could, if they wished, brand as being “a UNIX”.
Which is where my disagreement with your comment here comes in.
Linux has passed the testing and as such it is a UNIX. Like it or not, it has won Open Group branding, and although none of the 2-3 vendors who’ve had it in the past still pay for the trademark, it did pass the test and thus it counts.
No direct derivative of AT&T UNIX is still in active development any more.
No BSD has ever sought the branding, but I am sure they easily could pass the test if they so wished. It would however be a waste of money.
I would characterise Haiku the same as I would OpenVMS, z/OS and Windows NT: (via its native POSIX personality) a non-Unix-like OS, which does not resemble traditional Unix in design, in implementation, in its filesystem design or layout, or in its native APIs. However, all of them are highly UNIX compatible – about as UNIX compatible as it’s possible to be without actually being one. OpenVMS even used to have its own native X11 server, although I don’t think it’s maintained any more. Haiku, like RISC OS, has its own compatibility library allowing X11 apps to run and display in the native GUI without running a full X server.
Linux is a UNIX-like design, implemented in the same language, conforming to the same spec, implementing the same APIs. Unlike Haiku, z/OS or OpenVMS, it has no other alternative native APIs or non-UNIX-like filesystems or anything else.
Linux is a UNIX. By the current strict technical definition: it passed the Open Group tests which subsumed and replaced POSIX decades ago. And by a description: it’s a UNIX-like design built with Unix tools in the Unix preferred language, and nothing else.
Haiku isn’t. It hasn’t passed testing, it isn’t Unix like in design, or implementation, or native APIs, or native functionality.
The one that is arguable, to me, is none of the above.
It’s macOS.
macOS has a non-Unix-like kernel, derived from Mach, but with a big in-kernel UNIX server derived from BSD code. It has its own native non-Unix-like APIs, but they mostly sit on top of a UNIX-derived and highly UNIX-like layer. It has its own native GUI, which is non-UNIX-like, and its own native configuration database and much else, which are non-UNIX-like and implemented in non-UNIX-like languages.
It doesn’t even have a case-sensitive filesystem, one of the lowest common denominators of Unix-like OSes.
But, aside from its kernel, it’s highly UNIX-like until you get up to the filesystem layout and the GUI layer – all the UNIX directories are there, just mostly empty, or populated with stubs pointing the curious explorer to Netinfo and so on.
For X11 apps, it does in fact run a whole X server based on X.org.
But macOS has passed testing and Apple does pay for the trademark so, by the strict technical definition, it 100% is a UNIX™.
Well, there are people who say it about Linux. After all, POSIX is part of the “single UNIX specification”, so it is somewhat reasonable. But if people want to be consistent and not use the term for either Linux or Haiku, that’s fine by me. It’s using the term for only one and not both that I object to as inconsistent.
libdispatch is kind of an ironic example. The APIs lend their implementations to heap allocations at every corner and to thread explosion. Most of this could be addressed with intrusive memory and enforced asynchronous behavior at the API boundary.
It’s like POSIX in a sense where it’s “good enough” for taking some advantage of various hardware configurations but doesn’t quite meet expectations on scalability or feature set for some applications. POSIX APIs like pthread and select/poll, under this lens, also take advantage of hardware and are “good enough”.
If that’s all that is required by the application then it’s fine, but lower/core components like schedulers, databases, runtimes, and those which provide the abstractions that people use over POSIX APIs generally want to do the best they can. Only offering POSIX at the OS level limits this, and I believe it is why things like io_uring on Linux, ulock on Darwin, and even epoll/kqueue on both exist.
Now these core components either try (pretty hard) to design APIs that work well across all of these extensions (including, and limitingly so, POSIX) or they just specialize to a specific platform. It’s too late to change now, but there are more scalable API decisions for memory, IO, and synchronization that POSIX could have adopted and that could be built on top of older POSIX APIs, surprisingly looking to Windows ntdll here for inspiration.
Well there’s at least one, and the article starts into this a little bit: That POSIX layer you’re talking about takes up space and CPU, so if you’re designing a small system (or even a “big” one optimised for cost or power efficiency) you might like to have that on the negotiating table.
I heard a story about a chap who sold forth chips and every time he tried to break out they would ask for a POSIX demo. They eventually made one, and of course it was slow and made everything warm, so it didn’t help. Now if you know forth, this makes sense, but if you don’t know forth – and heck, clearly management didn’t either – you might not understand why you can’t have your cake and eat it too, so “slapping a POSIX layer atop” might even make sense. But forth systems are really different, really ideal if you can break your problem down into a bunch of little state machines, but it’s hard to sell that to someone whose problem is buying software.
Years later, I worked for a company who sold databases, and a frequent complaint voiced by the market, at trade shows and in the press, was that they didn’t have an SQL layer, so they made one, but it really just handled the ODBC and some basic syntactic differences, like maybe it was barely SQL92 if you squinted, so the complaint continued to be heard in the market and the company made another SQL layer. When I joined they were starting the fourth or fifth version, and I’m like, this is just like the forth systems!
This might be more to do with the value of Linux as opposed to POSIX. For many developers (maybe even most), Linux is hands-down the best development environment you can have, even if your target is Windows or Mac or tiny forth chips, and I don’t think it’s because of POSIX, or really any one thing, but I do think if something else had been better, Microsoft probably would have used that instead (or in addition to: look at how they’re treating the web platform with Edge!)
That being said, I think POSIX was an important part of why Linux is successful: Once upon a time Linux was a pretty goofy system, and at that time a lot of patches were simply justified as compliance with POSIX, which rapidly expanded the suite of software Linux had access to. Having access to a pretty-good spec and standard meant people who ported programs to early-Linux fixed those problems in the right place (the kernel and/or libc) instead of adding another #ifdef __linux__.
I can appreciate that. I focused on that because the article spent so much time waxing poetic about how it’s “hard” to find a computer with less than “tens of CPUs”. At that scale, it would be equally “hard” to justify not having a POSIX layer.
A chip designed to run Forth would be quite an interesting system! I don’t know if I’ve ever heard about one. I know of LispMs, and some of the specialised hardware to accelerate FORTRAN once upon a time.
You can make an SQL layer atop pretty much any database, even non-relational ones, if you squint hard enough. I suppose it’s the same thing with POSIX layers. Not always the best idea, but the standards are generous enough in their allowances that it can be done.
Yes. In the early days, it gained it a lot of software with little amount of porting. Now, it makes it easy to port workloads off other Unix platforms (like Solaris). In the future, it might just be the way that Next-New-OS bridges to bring Linux workloads to it.
These guys make forth chips, 144 “cpus” to a die, which is great for some applications, but POSIX is much too big to fit on even one of those chips.
Quite possibly we are seeing that right now with the “containerisation” fetish.
This is an excellent write-up! It was especially neat to see you came across (and apparently read, or at least skimmed!) the internals documentation, and utilized knowledge from that in the write-up. Kudos!