Commenting specifically on unikernels and not the general lightweight-VM vs. serverless-and-container debate: my thinking on unikernels is that they should offer better latency, since you don’t have to switch between user and kernel space. But from a security perspective, unless some of the changes discussed in https://lobste.rs/s/jsjkfn/making_c_less_dangerous happen, you are probably better off with processes and their own memory spaces.
I haven’t recently checked exactly how far you can cut the Linux kernel down in memory footprint and lines of code actually in use, but it used to be possible to run Linux in 4 MiB of RAM. Yocto/OpenEmbedded can be quite complicated to use; if you’re starting from scratch, I’d first consider whether https://buildroot.org/ meets your needs.
Unless your hypervisor is providing more services than they traditionally do, in theory you should be able to strip down a traditional kernel almost as far as a unikernel except for the syscall and process management portions.
I’ve worked on embedded devices where not even the network stack was run on-chip, it was just bytes over a serial port. I think that’s the level of offloading where a unikernel would start to look significantly different than a traditional paravirtualized kernel.
Like Oberon, Solo (about 16K for the core), NuttX (highly tunable), and embOS (which publishes some numbers). Of course, being in embedded, you probably already know about the stuff on the right.
I bet you embedded folk cringe at the waste and unpredictability in mainstream OSes and software even more than the rest of us, since you work with better and know it could be way better. Well, that depends on what you build on, as with hard RTOSes and such.
“I’ve worked on embedded devices where not even the network stack was run on-chip, it was just bytes over a serial port.”
I actually posted in Jack Ganssle’s Embedded Muse here that real-time products should ditch traditional interrupts wherever possible in favor of I/O coprocessors or asynchronous circuitry. I gave examples there; the concept was most proven in mainframes, which handled massive CPU utilization and I/O throughput simultaneously. What do you think about that? Oh yeah, Jack emailed me that at least one product was doing just that, with a big ARM core for the main workload and a tiny, cheap one for I/O.
And I just noticed right under my reply was one from “John Carter.” We have a @johncarter here that does embedded. You a muse reader, too, John? Or a different guy?
“We have a @johncarter here that does embedded. You a muse reader, too, John? Or a different guy?”
Guilty as charged, that’s me.
“I bet you embedded folk cringe at the waste and unpredictability in mainstream OS’s and software even more than most of us knowing it could be way better because you work with better.”
It’s always a trade-off. I have tiny, low-cost, low-powered embedded Linux systems on my desk…
With OpenEmbedded I can pull in tens of thousands of packages to address any need, far faster and better than I could write them myself in a decade…
And provided we keep our brains switched on, it will be more robust, security-wise, than any traditional micro-RTOS I could use… since the real-world security testing of the Linux kernel exceeds just about anything else on the planet.
I also have systems on my desk where I’m constantly being pushed to reduce bill-of-materials (BoM) cost, manufacturing cost, power consumption, shelf life, physical size, weight… i.e. the business side of the company will never stop pushing on those, for sound business reasons.
There, we’re currently using a traditional embedded RTOS, and we’re even considering swapping in what is effectively a unikernel that my colleague and I wrote. (We run our off-target test suite on it, since I can emulate, cycle-perfect, the scheduling behaviour of the target RTOS whilst playing nice with valgrind and gdb and gcov… on the desktop.)
Interrupts, yup, they are a pain. I always feel there is something wrong with any design that has a guideline like, “You mustn’t spend too long in a ‘blah’ routine.” “OK, so how long is too long?” “Dunno, shorter is better.”
That to me just stinks.
Coprocessors? Hmm. We’re sort of lucky… we have an FPGA to play with. So it’s our coprocessor. And a DSP.
Ideally a coprocessor design should be autonomous. I.e. the big CPU can go to sleep while nothing is happening; the I/O coprocessor ticks along handling noise and bounces and… coughs an event up into the FIFO and wakes big brother if and when something needs thinking about.
And somebody somewhere has an event-handling resource budget that guarantees that FIFO never overflows, and somebody designed, up front, a strategy for throttling / providing back-pressure on input rates. (Hint for any newbies reading this: growing that FIFO sounds like a solution, right? Give me a couple of sound reasons why it isn’t…)
In a world full of micromanagers, designers tend to design systems and protocols that micromanage the coprocessors and destroy the value of having coprocessors.
I.e. your coprocessor idea is good and sound, but dammit it’s hard to get a herd of cats to design and implement it properly! (Sometimes the micromanagement is even part of the international standard you might be implementing!)
Although I upvoted it, I forgot to say thanks for the reply. Might be able to market something like that if designed in reverse: the coprocessor is the main processor with the library set up to run reactive routines on the “general-purpose” processor. Kind of like how that one SoC on budget boards is really a graphics chip with a general-purpose ARM on the side. The marketing and technical materials would present it the way it’s intended to be used. Everything else about it is whatever is standard.
The recent discussion of FSMs had me looking for an old comment of yours to jokingly counter @mempko’s “FSMs are your friend” with “Long as you’re not John Carter.” ;) Then I noticed the comment about showing complexity.
Interestingly enough, I was advocating on HN recently for using Abstract State Machines for complexity measurement. I don’t know if it’s been done. My idea was to look at the ranges of values, the transitions, and their combinatorial explosion: the higher those numbers, the higher the complexity. That ASMs are a fundamental formalism you can model both hardware and software in makes the measure generally applicable. That they’re a minimalist thing, like Turing Machines operating on structures, means there’s very little or no incidental complexity clouding the measurements. What do you think of that, as a guy who’d like to see more state machines in your industry?
You can use separation kernels to reduce attack surface, like commercial products were doing as far back as 2005. Those also ran Linux in user mode: trimmed and/or memory-safe Linux running on a tiny kernel with secure IPC, like Cap’n Proto.
Why would processes be more secure than unikernels? The attack surface of the kernel is much, much larger than that of a hypervisor.
That doesn’t sound like Docker.
I said as much to a person claiming to secure it once.