Fascinating read. Audio was the thing that made me switch from Linux to FreeBSD around 2003. A bit before then, audio was provided by OSS, which was upstream in the kernel and maintained by a company that sold drivers that plugged into the framework. This didn’t make me super happy because those drivers were really expensive. My sound card cost about £20 and the driver cost £15. My machine had an on-board thing as well, so I ended up using that when I was running Linux.
A bit later, a new version of OSS came out, OSS 4, which was not released as open source. The Linux developers had a tantrum and decided to deprecate OSS and replace it with something completely new: ALSA. If your apps were rewritten to use ALSA they got new features, but if they used OSS (as everything did back then) they didn’t. There was only one feature that really mattered from a user perspective: audio mixing. I wanted two applications to both be able to open the sound device and go ‘beep’. I think ALSA on Linux exposed hardware channels for mixing if your card supported it (my on-board one didn’t), while OSS didn’t support it at all. I might be misremembering and ALSA supported software mixing, OSS only hardware mixing. Either way, only one OSS application could use the sound device at a time and very few things had been updated to use ALSA.
GNOME and KDE both worked around this by providing userspace sound mixing. These weren’t great for latency (sound was written to a pipe, then at some point later the userspace sound daemon was scheduled and then did the mixing and wrote the output) but they were fine for going ‘bing’. There was just one problem: I wanted to use Evolution (GNOME) for mail and Psi (KDE) for chat. Only one of the KDE and GNOME sound daemons could play sound at a time and they were incompatible. Oh, and XMMS didn’t support ALSA, so if I played music then neither of them could do audio notifications.
Meanwhile, the FreeBSD team just forked the last BSD-licensed OSS release and added support for OSS 4 and in-kernel low-latency sound mixing. On FreeBSD 4.x, device nodes were static so you had to configure the number of channels that it exposed, but then you got /dev/dsp.0, /dev/dsp.1, and so on. I could configure XMMS and each of the GNOME and KDE sound daemons to use one of these, leaving the default /dev/dsp (a symlink to /dev/dsp.0, as I recall) for whatever ran in the foreground and wanted audio (typically BZFlag). When FreeBSD 5.0 rolled out, this manual configuration went away and you just opened /dev/dsp and got a new vchan. Nothing needed porting to use ALSA, GNOME’s sound daemon, KDE’s sound daemon, PulseAudio, or anything else: the OSS APIs just worked.
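On a modern FreeBSD system the vchan behaviour described above is still visible and tunable through sysctl. A sketch, assuming the first sound device is pcm0 (device numbers vary per machine):

```
# how many virtual channels does pcm0 currently expose?
sysctl dev.pcm.0.play.vchans

# raise the limit so more applications can open /dev/dsp at once
sysctl dev.pcm.0.play.vchans=8
```

Putting the second line in /etc/sysctl.conf makes it persistent across reboots.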
It was several years before audio became reliable on Linux again and it was really only after everything was, once again, rewritten for PulseAudio. Now it’s being rewritten for PipeWire. PipeWire does have some advantages, but there’s no reason that it can’t be used as a back end for the virtual_oss thing mentioned in this article, so software written with OSS could automatically support it, rather than requiring the constant churn of the Linux ecosystem. Software written against OSS 3 20 years ago will still work unmodified on FreeBSD and will have worked every year since it was written.
Luckily there’s no need for such a rewrite because PipeWire has a PulseAudio API.
There was technically no need for a rewrite from ALSA to PulseAudio, either, because PulseAudio had an ALSA compat module.
But most applications got a PulseAudio plug-in anyway because the best that could be said about the compat module is that it made your computer continue to go beep – otherwise, it made everything worse.
I am slightly more hopeful for PipeWire, partly because (hopefully) some lessons have been drawn from PA’s disastrous roll-out, partly for reasons that I don’t quite know how to formulate without sounding like an ad-hominem attack (tl;dr some of the folks behind PipeWire really do know a thing or two about multimedia and let’s leave it at that). But bridging sound stacks is rarely a simple affair, and depending on how the two stacks are designed, some problems are simply not tractable.
One could also say that a lot of groundwork was done by PulseAudio, revealing bugs etc., so the landscape that PipeWire enters in 2021 is not the same one that PulseAudio entered in 2008. For starters there’s no aRts, ESD, etc. anymore; these are long dead and gone, and the only things that matter these days are the PulseAudio API and the JACK API.
I may be misremembering the timeline but as far as I remember it, aRts, ESD & friends were long dead, gone and buried by 2008, as alsa had been supporting proper (eh…) software mixing for several years by then. aRts itself stopped being developed around 2004 or so. It was definitely no longer present in KDE 4, which was launched in 2008, and while it still shipped with KDE 3, it didn’t really see much use outside KDE applications anyway. I don’t recall how things were in Gnome land, I think ESD was dropped around 2009, but pretty much everything had been ported to canberra long before then.
I, for one, don’t recall seeing either of them or using either of them after 2003, 2004 or so, but I did have some generic Intel on-board sound card, which was probably one of the first ones to get proper software mixing support on alsa, so perhaps my experience wasn’t representative.
I don’t know how many bugs PulseAudio revealed, but the words “PulseAudio” and “bugs” are enough to make me stop considering going back to Linux for at least six months :-D. The way bug reports, and contributors in general, technical and non-technical alike, were treated is one of the reasons why PulseAudio’s reception was not very warm, to say the least, and IMHO it’s one of the projects that kickstarted a very hostile and irresponsible attitude that prevails in many Linux-related open-source projects to this day.
I might be misremembering and ALSA supported software mixing, OSS only hardware mixing.

That’s more like it on Linux. ALSA did software mixing, enabled by default, in a 2005 release. So it was a pain before then (you could enable it at least as early as 2004, but it didn’t start being easy until 1.0.9 in 2005)… but long before godawful PulseAudio was even minimally usable.
BSD did the right thing though, no doubt about that. Linux never learns its lesson. Now Wayland lololol.
Things got pretty hilarious when you inevitably mixed an OSS app (or maybe an ALSA app, by that time? It’s been a while for me, too…) and one that used, say, aRts (KDE’s sound daemon).

What would happen is that the non-aRts app would grab the sound device and cling to it very, very tightly. The sound daemon couldn’t play anything for a while, but it kept queuing sounds. Like, say, Gaim alerts (anyone remember Gaim? I think it was still gAIM at that point, this was long before it was renamed to Pidgin).

Then you’d close the non-aRts app, and the sound daemon would get access to the sound card again, and BAM! it would dump like five minutes of gAIM alerts and application error sounds onto it, and your computer would go bing, bing, bing, bang, bing until the queue was finally empty.
I’d forgotten about that. I remember this happening when people logged out of computers: they’d quit BZFlag (yes, that’s basically what people used computers for in 2002) and log out; aRts would then get access to the sound device and write as many of the notification beeps as it could to the DSP device before it responded to the signal to quit.
ICQ-inspired systems back then really liked notification beeps. Psi would make a noise both when you sent and when you received a message (we referred to IM as bing-bong because it would go ‘bing’ when you sent a message and ‘bong’ when you received one). If nothing was draining the queue, it could really fill up!
Then you’d close the non-aRts app, and the sound daemon would get access to the sound card again, and BAM! it would dump like five minutes of gAIM alerts and application error sounds onto it, and your computer would go bing, bing, bing, bang, bing until the queue was finally empty.

This is exactly what happens with PulseAudio for me today, provided the applications trying to play the sounds come from different users.
Back in 2006ish though, alsa apps would mix sound, but OSS ones would queue, waiting to grab the device. I actually liked this a lot because I’d use an OSS play command-line program and just type up the names of the files I wanted to play. It was an ad-hoc playlist in the shell!

This is just an example of what the BSDs get right in general. For example, there is no world in which FreeBSD would remove ifconfig and replace it with an all-new command just because the existing code doesn’t have support for a couple of cool features - it gets patched or rewritten instead.

I’m not sure I’d say “get right” in a global sense, but definitely it’s a matter of differing priorities. Having a stable user experience really isn’t a goal for most Linux distros, so if avoiding user-facing churn is a priority, the BSDs are a good place to be.
I don’t know; the older I get the more heavily I value minimizing churn and creating a system that can be intuitively “modeled” by the brain just from exposure, i.e. no surprises. If there are architectural reasons why something doesn’t work (e.g. the git command line), I can get behind fixing it. But stuff that just works?

I guess we can’t blame Lennart for breaking audio on Linux if it was already broken….
You must be new around here - we never let reality get in the way of blaming Lennart :-/
Same as with systemd, there were dozens of us for whom everything worked before. I mean, I mostly liked PulseAudio because it brought a few cool features, but I don’t remember sound simply ceasing to work before it. Sure, it was complicated to set up, but if you didn’t change anything, it simply worked.
I don’t see this as blaming. Just stating the fact that if it works for some people, it’s not broken.
Well, can’t blame him personally, but the distros who pushed that PulseAudio trash? Absolutely yes they can be blamed. ALSA was fixed long before PA was, and like the parent post says, they could have just fixed OSS too and been done with that before ALSA!
But nah better to force everyone to constantly churn toward the next shiny thing.
Huh? I just set up ALSA recently and you very much had to specifically configure dmix, if that’s what you’re referring to. Here’s the official docs on software mixing. It doesn’t do anything as sophisticated as PulseAudio does by default. Not to mention that on a given restart ALSA devices frequently change their device IDs. I have a little script on a Void Linux box that I used to run as a media PC which creates the asoundrc file based on output from lspci. I don’t have any such issue with PulseAudio at all.

dmix has been enabled by default since 2005 in alsa upstream. If it wasn’t on your system, perhaps your distro changed things or something. The only alsa config I’ve ever had to do is change the default device from the hdmi to analog speakers.
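For reference, the hand-rolled dmix configuration being argued about looked roughly like this ~/.asoundrc (a sketch — card and device numbers vary per machine):

```
# route the default PCM through a dmix slave so several apps can mix
pcm.!default {
    type plug
    slave.pcm "dmixed"
}

pcm.dmixed {
    type dmix
    ipc_key 1025          # any unique integer, shared by mixing clients
    slave {
        pcm "hw:0,0"      # first card, first device - adjust as needed
        rate 48000
    }
}
```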
And yeah, it isn’t sophisticated. But I don’t care, it actually works, which is more than I can say about PulseAudio, which even to this day, has random lag and updates break the multi-user setup (which very much did not just work). I didn’t want PA but Firefox kinda forced my hand and I hate it. I should have just ditched Firefox.
Everyone tells me that PipeWire is better though, but I wish I could just go back to the default alsa setup again.
Shrug, I guess in my experience PulseAudio has “just worked” for me since 2006 or so. I admit that the initial rollout was chaotic, but ever since it’s been fine. I’ve never had random lag and my multi-user setup has never had any problems. It’s been roughly 15 years, so almost half my life, since PulseAudio has given me issues, so at this point I largely consider it stable, boring software. I still find ALSA frustrating to configure to this day, and I’ve used ALSA for even longer. Going forward I don’t think I’ll ever try to use raw ALSA ever again.
I’m pretty sure calvin is tongue in cheek referencing that Lennart created PulseAudio as well as systemd.
I cannot upvote this comment more. The migration to ALSA was a mess, and the introduction of Gstreamer*, Pulse*, or *sound_daemon fractured the system more. Things in BSD land stayed much simpler.
I was also ‘forced’ out of Linux ecosystem because of mess in sound subsystem.
After spending some years in FreeBSD land I got hardware that was not supported by FreeBSD at that moment, so I tried Ubuntu … what a tragedy it was. When I was using FreeBSD my system ran for months and was rebooted only to install security updates or to upgrade. Everything just worked. Including sound. In Ubuntu land I needed to do a HARD RESET every 2-3 days because sound would go dead and I could not find a way to reload/restart whatever caused that ‘glitch’.
Details here:
https://vermaden.wordpress.com/2018/09/07/my-freebsd-story/
From time to time I try to run my DAW (Bitwig Studio) on Linux. A nice thing about using DAWs on Mac OS X is that they just find the audio and MIDI sources and you don’t have to do a lot of setup. There’s a MIDI router application you can use if you want to do something complex.
Using the DAW from Linux, if it connects via ALSA or PulseAudio, mostly just works, although it won’t find my audio interface from PulseAudio. But the recommended configuration is with JACK, and despite reading the manual a couple times and trying various recommended distributions, I just can’t seem to wrap my head around it.
I should try running Bitwig on FreeBSD via the Linux compatibility layer. It’s just a Java application after all.
Try updating to PipeWire if your distribution supports it already. Then you get system-wide JACK compatibility with no extra configuration/effort, and it doesn’t matter much which interface the app uses. Then you can route anything the way you like (audio and MIDI) with even fewer restrictions than macOS.
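Concretely, PipeWire’s JACK support is a wrapper command plus a port-patching tool. A sketch — the application and port names here are made-up placeholders:

```
# run any JACK client against PipeWire instead of jackd
pw-jack bitwig-studio

# list available output ports, then wire one to a playback port
pw-link --output
pw-link "Bitwig:out_L" "alsa_output:playback_FL"   # placeholder port names
```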
I’ll give that a try, thanks!
Fortunately, on the Linux side a lot of this gets much simpler these days with PipeWire, which is kind of what PulseAudio should’ve been but actually works and delivers even more. The interface mess goes away (a Pulse and a JACK client can connect at the same time and they’re both visible in jackctl routing), realtime mixing <5ms is easily achievable, runtime audio routing and effect processing cabinets “just work”, the mentioned “desktop/studio” buffer sizes are per app and just an env variable away, and many other great things. It really turned things around for the Linux audio world.
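The per-app buffer-size trick mentioned above is literally one environment variable. A sketch, with an arbitrary quantum/rate pair and any JACK client in place of the placeholder app name:

```
# ask PipeWire for a 256-frame quantum at 48 kHz for this app only
PIPEWIRE_LATENCY=256/48000 pw-jack some-jack-app
```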
Pipewire is still a work in progress but I’m using it on my laptop and finding that it works really well with both low-latency stuff through its jack interface and boring desktop stuff through its pulse interface. I’m excited for it to go mainstream when it’s ready.