This guy’s articles never disappoint. Even when they’re above my head.
Nice. I’ve been meaning to write up how this works on a number of OSes, since I’ve been implementing a thread library for Myrddin on several of them (and, of course, since I’ve got no libc, I’ve also got no pthreads.)
Linux and FreeBSD are the nicer systems to do this on. Futexes (and their close cousin, umtx) are pretty elegant, and while starting threads needs a dash of assembly due to the lack of a valid stack, clone() is a pretty sane interface for doing it.
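For anyone who hasn't seen why futexes get called elegant, here is a rough sketch of a mutex built directly on the Linux futex syscall, in C rather than Myrddin. It follows the well-known three-state design (0 = unlocked, 1 = locked, 2 = locked with possible waiters) and is not the Myrddin implementation; the names `futex_mutex`, `futex_mutex_lock`, and `futex_mutex_unlock` are made up for illustration. It is Linux-only and assumes C11 atomics are available.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical type for this sketch.
 * state: 0 = unlocked, 1 = locked, 2 = locked, waiters may be asleep. */
typedef struct {
    _Atomic uint32_t state;
} futex_mutex;

/* Thin wrapper over the raw syscall; glibc does not export a futex() function. */
static long futex(_Atomic uint32_t *addr, int op, uint32_t val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void futex_mutex_lock(futex_mutex *m)
{
    uint32_t c = 0;

    /* Fast path: uncontended 0 -> 1, no syscall at all. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;

    /* Slow path: mark the lock contended, then sleep in the kernel
     * for as long as the state word still reads 2. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        futex(&m->state, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void futex_mutex_unlock(futex_mutex *m)
{
    /* Only enter the kernel if someone may actually be waiting. */
    if (atomic_exchange(&m->state, 0) == 2)
        futex(&m->state, FUTEX_WAKE_PRIVATE, 1);
}
```

The appeal is that the uncontended lock and unlock paths are each a single atomic operation in userspace; the kernel only gets involved when a thread has to sleep or be woken.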
I think the only system where threads were easier to implement was Plan 9, and it gets there by taking a performance hit on context switches – each thread gets its own private stack, mapped at the same location in the address space.