There’s a lot going on in this article, but I am most fascinated by this:
Part of this is because using a serial console is a monumental pain in the ass, so your typical computer operator will do a lot to avoid dealing with it.
The serial console is literally all I ever want for a remotely managed UNIX system. I have absolutely no idea why anybody would choose anything else! Why would you want to watch a video of a text console (i.e., the framebuffer forwarded over VNC or whatever proprietary equivalent) at a particular resolution when you could have a text stream directly into whatever terminal emulator you already use, with the ability to log the text and to copy and paste text? I’ve even had to automate reboot loops of systems during OS debugging where I would have a program interact with the serial console, which would have been tedious or impossible with a video stream.
And of course, a serial console is much more accessible to a blind person than a framebuffer sent over VNC or the like. I don’t say the framebuffer is completely inaccessible, because one can use it, to some degree, with OCR. But of course, reconstructing text from pixels, never mind knowing where the cursor is, is far inferior to just having the text in the first place.
Incidentally, at least two of the 90s Linux accessibility tools developed by blind people, Emacspeak and Speakup, were bootstrapped by accessing a Linux machine from an MS-DOS PC with an existing screen reader, a terminal program, and a serial connection. But I digress.
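The reboot-loop automation mentioned above is easy to sketch as an expect-style loop. This is a minimal illustration, not any particular tool: the prompts and boot banner are hypothetical, and in practice the file-like stream would be a pyserial `serial.Serial` object opened on the console device rather than the canned transcript used here.

```python
# Expect-style automation of a serial console (hypothetical prompts;
# swap the BytesIO demo stream for serial.Serial in real use).
import io
import re

def wait_for(stream, pattern, max_bytes=65536):
    """Read from a byte stream until `pattern` matches the accumulated
    bytes; return everything read so far as text."""
    buf = b""
    while len(buf) < max_bytes:
        chunk = stream.read(256)
        if not chunk:
            break
        buf += chunk
        if re.search(pattern, buf):
            return buf.decode(errors="replace")
    raise TimeoutError(f"pattern {pattern!r} not seen")

def reboot_once(console):
    """One iteration of a reboot loop: wait for login, log in,
    reboot, wait for the (hypothetical) boot banner to reappear."""
    wait_for(console, rb"login:")
    console.write(b"root\n")
    wait_for(console, rb"# ")
    console.write(b"reboot\n")
    wait_for(console, rb"Booting")

# Demo against a canned transcript instead of real hardware:
transcript = io.BytesIO(b"...boot messages...\nhost login: ")
print(wait_for(transcript, rb"login:").endswith("login: "))
```

None of this is possible with a framebuffer-over-VNC console, which is the point: text in, text out, trivially scriptable and loggable.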
A combination of the PC console “just working” and most actual work happening over SSH or similar leads to a tragedy where the serial console is not correctly configured in many OSes, or is somewhat obtuse to set up across the firmware, bootloader, and OS over time. That said, it is table stakes for a service provider to figure it out; it should be a simple chore, but the work can multiply if you have enough generations of technology in play.
It is perceptibly slower than SSH, so I prefer to use it only in recovery situations.
Yeah, this is basically the only downside, although it doesn’t come into play too often (mostly with curses programs and when paging a lot of text). And I agree it’s basically ideal for recovery because of its simplicity.
The slowness is typically due to being constrained by the baud rate of the serial link between the host and the BMC, which even at a relatively high baud (115200, say) is, yes, still pretty slow compared to modern network speeds. However, it doesn’t have to be that slow. Aspeed BMC SoCs (as commonly employed on many non-{Dell,HPE} servers) have a nifty little hardware feature called the VUART (virtual UART), which presents a standard 16550-style UART to both the host and the BMC but elides the actual “analog” serial line between them, basically just using back-to-back FIFO buffers instead, so it can run at much higher speeds. Both sides still think they’ve configured some standard baud rate, but the actual transmission speed is completely independent of it, and the two sides don’t even need to agree with each other on what it is. It’s still not quite as fast as an SSH connection, but it’s close enough that the times when it’s noticeable are vastly fewer and farther between.
Granted, while the Aspeed hardware has that feature, as far as I’m aware most commercial BMC firmware doesn’t actually use it, sadly, so in practice it’s perhaps kind of moot.
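For a sense of scale, here is the back-of-the-envelope arithmetic behind “still pretty slow”: with 8N1 framing (1 start bit, 8 data bits, 1 stop bit), each byte costs 10 bit times on the wire, so even a single full-screen text redraw is noticeable at 115200 baud. This ignores escape-sequence overhead, so real redraws are somewhat worse.

```python
# Effective throughput of a serial console at common baud rates,
# assuming 8N1 framing (10 bit times per byte).
def bytes_per_second(baud, bits_per_byte=10):
    return baud // bits_per_byte

screen = 80 * 24  # characters in one full text screen, no escape sequences
for baud in (9600, 115200, 921600):
    bps = bytes_per_second(baud)
    print(f"{baud:>7} baud = {bps:>6} B/s, full-screen redraw = {screen / bps:.3f} s")
```

At 115200 baud a redraw takes roughly a sixth of a second, which is exactly the lag you feel in curses programs and pagers; at 921600 it drops to about 20 ms, which is why the higher rates feel so much snappier.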
Most newer things can manage 921,600 bps. You can do better with SSH, but not much.
However, it doesn’t have to be that slow – Aspeed BMC SoCs (as commonly employed on many non-{Dell,HPE} servers) have a sort of nifty little hardware feature called the VUART (virtual UART) which presents a standard 16550-style UART to both the host and the BMC,
And this is where the whole thing gets silly. I have a few things with UARTs in them connected to this machine. Each one has something like a 16550 (with varying levels of extension) that then connects to a USB interface that encapsulates the serial stream in USB packets. The way most of these work is to connect the pins of an RS-232 connection to something like an FTDI chip that then handles the USB traffic to the kernel at the other end.
There’s absolutely no reason to actually build an RS-232 interface between the UART and the USB device. The UART just manages a FIFO. If the USB device and UART are on the same IC, you can just read directly from the FIFO and, for bonus points, set and clear the ready signals depending on whether the USB side has managed to consume bytes from the FIFO. I’m not sure if it’s a limitation of FTDI chips or just laziness, but most people seem not to connect the hardware flow-control wires between the UART and the FTDI, so there’s no backpressure if you write too fast or read too slowly.
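The flow control being described can be modeled in a few lines. This is a toy sketch with made-up watermark levels, not any particular chip’s behavior: a bounded FIFO whose ready signal (think RTS) deasserts when the consumer falls behind, so the producer stalls instead of dropping bytes.

```python
# Toy model of UART hardware flow control: a bounded FIFO that applies
# backpressure via a ready signal instead of silently losing data.
from collections import deque

class UartFifo:
    def __init__(self, depth=16, high_water=12, low_water=4):
        self.high, self.low = high_water, low_water
        self.buf = deque(maxlen=depth)
        self.ready = True          # producer may write while this is set

    def write(self, byte):
        if not self.ready:
            raise BlockingIOError("RTS deasserted: consumer is behind")
        self.buf.append(byte)
        if len(self.buf) >= self.high:
            self.ready = False     # high-water mark: apply backpressure

    def read(self):
        byte = self.buf.popleft()
        if len(self.buf) <= self.low:
            self.ready = True      # drained enough: resume the producer
        return byte

fifo = UartFifo()
for b in range(12):
    fifo.write(b)
print(fifo.ready)    # high-water mark reached, producer must stop
while len(fifo.buf) > 4:
    fifo.read()
print(fifo.ready)    # drained below the low-water mark, producer resumes
```

Without the ready signal wired through, the producer in this model would just keep writing and overrun the FIFO, which is exactly the silent data loss you get when the flow-control lines are left unconnected.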
I’m glad to hear that at least one company is doing the sensible thing.
Most newer things can manage 921,600 bps. You can do better with SSH, but not much.
About 10-15 years ago there was an ssh patchset maintained by the HPC community called something like hibw-ssh (high bandwidth ssh). It aimed to get gigabit speeds across intercontinental links, specifically Internet2, the bombastically named US academic network, successor to NSFnet.
There is a really important antipattern in network protocol design, the window-in-window problem. This occurs when you have a multiplexing layer on top of a bytestream layer. TCP is usually the lower layer of this problem. TCP is built on a congestion window to control backpressure. When you are multiplexing multiple streams on top of TCP the mechanisms for fairly sharing the TCP bytestream interact badly with TCP’s mechanisms for sharing the underlying network.
The window-in-window problem explains why SSL VPNs suck, why HTTP/2 is being replaced by HTTP/3, and why hibw-ssh existed.
Much of what the patchset did was adjust ssh’s and sftp’s windowing and multiplexing behaviour so that it didn’t unnecessarily restrict the bandwidth-delay product that ssh could fill. Most people wouldn’t notice the problem, but HPC users wanted to transfer terabytes across multi-gigabit links spanning tens of milliseconds of latency. In that situation, megabyte windows are too small and can slow you down by decimal orders of magnitude.
I believe hibw-ssh is obsolete because OpenSSH has become more clever with its window scaling.
But anyway, you should expect a lot more than a megabit per second from ssh.
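To put rough numbers on the window-in-window point: per round trip, a stream can move at most one window’s worth of bytes, so its throughput is capped at window / RTT, and the smallest window anywhere in the stack is the one that binds. The window sizes below are illustrative, not taken from any particular ssh version.

```python
# Why a small multiplexing window starves a fat pipe: throughput over a
# windowed stream cannot exceed window_bytes / RTT, regardless of link speed.
def max_throughput_mbps(window_bytes, rtt_s):
    return window_bytes * 8 / rtt_s / 1e6

rtt = 0.05                       # 50 ms, a plausible intercontinental RTT
ssh_window = 2 * 1024 * 1024     # 2 MiB per-channel window (illustrative)
tcp_window = 64 * 1024 * 1024    # 64 MiB, a well-tuned TCP window

print("link: 10000 Mb/s")
print(f"tcp : {max_throughput_mbps(tcp_window, rtt):.0f} Mb/s")
print(f"ssh : {max_throughput_mbps(ssh_window, rtt):.0f} Mb/s")
```

On a 10 Gb/s path, the well-tuned TCP connection can fill the link, but the 2 MiB channel window inside it caps the stream around 340 Mb/s: over an order of magnitude lost to the inner window, which is exactly the gap the HPC patchset existed to close.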
IPMI is very common in enterprise servers but very rare elsewhere, much to the consternation of people like me who don’t have the space or noise tolerance for a 1U pizzabox in their homes.
While that is basically true, non-server hardware having a BMC isn’t entirely unheard of – ASRock and Asus (perhaps others as well) make BMC-equipped workstation motherboards, for example.
And there’s a nice range of small Mini-ITX towers with mainboards including IPMI as well, e.g. the 5028D-TN4T. They’re not silent if you give them something to work on, but the loudest thing about mine is usually the hum of spinning rust and rattling drive heads.
We’ve been operating the serial console at 3Mbaud in the Oxide machines, at least under some conditions, and it is indeed a lot snappier than 115k2!
I was pleasantly surprised that the ASRock motherboard from my latest build included IPMI.
Yes, my homelab server uses an ASRock BMC for this; it works great for my needs.