What is despair? I have known it—hear my song. Despair is when you’re debugging a kernel driver and you look at a memory dump and you see that a pointer has a value of 7. THERE IS NO HARDWARE ARCHITECTURE THAT IS ALIGNED ON 7. Furthermore, 7 IS TOO SMALL AND ONLY EVIL CODE WOULD TRY TO ACCESS SMALL NUMBER MEMORY. Misaligned, small-number memory accesses have stolen decades from my life.
I’ll try to remember to write some up. Working on high-performance embedded hardware has been a lot of fun, but it has also involved a lot of deep debugging sessions.
Edit: as an example, we used Ghidra to disassemble a vendor-provided closed-source library, found a function that we suspected was missing a lock, and then used https://frida.re/ to patch their code dynamically and prove definitively that the missing lock was the problem. We had been getting occasional deadlocks and had used GDB to get a rough idea of what was happening, but the vendor refused to believe the issue was on their end.
Interesting, so this select() wrapper can be aborted by a peer thread without cleaning up the kernel component? Is this a common situation in the Windows API?
It looks as if APC is very similar to a UNIX signal, except that you get to specify the code that runs in the signal handler. The problem (I think) is that select is not a system call; it’s a wrapper around two calls: one that asynchronously writes the result to the specified buffer and another that blocks until the first completes. The UNIX equivalent would be:
1. Issue some syscall that writes asynchronously, such as aio_read, passing it an on-stack buffer.
2. Use a blocking call to wait for completion, such as aio_waitcomplete.
3. Receive a signal from another thread.
4. Throw an exception (or setcontext) out of the signal handler.
5. Repeat from the first step.
The signal delivery in this example would interrupt aio_waitcomplete, but the aio_read call remains in flight because it’s asynchronous. Later, the aio_read will complete and overwrite the on-stack buffer.
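To make the sequence concrete, here is a rough model of it in Python. It does not use real AIO: `FakeAsyncRead`, `Interrupted`, and `demo_clobber` are hypothetical names, a background thread stands in for the kernel completing the read, and a `bytearray` stands in for the on-stack buffer. It is only a sketch of the hazard, not of how select or aio_read are implemented.

```python
import threading


class Interrupted(Exception):
    """Models the exception thrown out of the signal handler / APC."""


class FakeAsyncRead:
    """Hypothetical stand-in for an in-flight asynchronous read."""

    def __init__(self, buffer, payload):
        self.buffer = buffer
        self.payload = payload
        self.release = threading.Event()  # set when the "device" has data
        self.done = threading.Event()     # set once the buffer was written
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # The "kernel" writes into the caller's buffer whenever the data
        # arrives, whether or not anyone is still waiting for it.
        self.release.wait()
        self.buffer[:len(self.payload)] = self.payload
        self.done.set()


def demo_clobber():
    frame = bytearray(4)                 # models the on-stack buffer
    op = FakeAsyncRead(frame, b"LATE")   # step 1: start the async read
    try:
        # Steps 2-4: the blocking wait is abandoned because an exception
        # is thrown "through" it. Nothing cancels the read.
        raise Interrupted()
    except Interrupted:
        pass
    frame[:] = b"NEW!"                   # the memory is reused for something else
    op.release.set()                     # ...and then the read finally completes
    op.done.wait()
    return bytes(frame)
```

Calling `demo_clobber()` returns `b"LATE"`: the value written after the unwind is silently overwritten by the completion of the abandoned read.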
This is more common on Windows because more system calls are asynchronous, though io_uring on Linux may change this.
The obvious response to this bug is to point out that you should never pass an on-stack buffer to an asynchronous call, but here that’s an implementation detail. You think that you are calling a synchronous select.
I would almost be inclined to regard this as a bug in select: it could probably have SEH handlers that cancel the asynchronous call if you unwind. That said, it’s an understandable bug because throwing exceptions out of signal handlers or their equivalents is such a spectacularly bad idea that it’s easy to imagine it never happens. It’s impossible to be completely defensive against it because it is basically a way of throwing exceptions at any point at all in the code, on any instruction. Just don’t do that. Throwing an exception out of the APC callback is something that should make people doing code review look incredibly carefully at the code and then reject it unconditionally.
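The "cancel the asynchronous call if you unwind" idea could look roughly like this. Again this is a thread-based model in Python rather than real SEH or AIO, and `CancellableRead`, `guarded_wait`, and `demo_guarded` are made-up names for illustration. The only point is that the wrapper does not let an exception leave its frame until the in-flight write can no longer happen.

```python
import threading


class Interrupted(Exception):
    """Models the exception thrown out of the signal handler / APC."""


class CancellableRead:
    """Hypothetical async read that can be cancelled before it writes."""

    def __init__(self, buffer, payload):
        self.buffer = buffer
        self.payload = payload
        self._lock = threading.Lock()
        self._cancelled = False
        self.release = threading.Event()   # set when the "device" has data
        self.finished = threading.Event()  # set once the operation is over
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        self.release.wait()
        with self._lock:
            if not self._cancelled:
                self.buffer[:len(self.payload)] = self.payload
        self.finished.set()

    def cancel(self):
        # Mark the operation cancelled, then wait until the worker has
        # finished, so the buffer cannot be written after we return.
        with self._lock:
            self._cancelled = True
        self.release.set()
        self.finished.wait()


def guarded_wait(buffer, wait):
    op = CancellableRead(buffer, b"LATE")
    try:
        wait(op)          # the blocking wait; may be unwound by an exception
    except BaseException:
        op.cancel()       # the unwind handler: cancel before the frame dies
        raise


def demo_guarded():
    frame = bytearray(4)

    def interrupted_wait(op):
        raise Interrupted()

    try:
        guarded_wait(frame, interrupted_wait)
    except Interrupted:
        pass
    frame[:] = b"NEW!"    # safe to reuse: the read was cancelled
    return bytes(frame)
```

Here `demo_guarded()` returns `b"NEW!"`. As noted above, this only narrows the window: an exception injected at an arbitrary instruction could still land inside the handler itself.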
James Mickens’ “Night Watch” sprang to mind:
I love reading debugging stories like this; I wish more people would post them.
It amuses me that the fix was the self-pipe trick: the best way to deal with signals and select is more robustly portable than I thought!
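For anyone who hasn't seen the self-pipe trick, here is a minimal sketch of the UNIX flavour in Python. It is not the fix from the article; `wait_for_signal` is a hypothetical helper, SIGUSR1 is an arbitrary choice, and it assumes a POSIX system with the code running on the main thread. Python's `signal.set_wakeup_fd` makes the low-level handler write the signal number into the pipe, so the signal becomes just another readable file descriptor for select.

```python
import os
import select
import signal


def wait_for_signal(timeout=5.0):
    """Return the number of the signal that woke select, or None on timeout."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)   # set_wakeup_fd needs a non-blocking fd

    # A do-nothing Python handler, so SIGUSR1 does not kill the process.
    old_handler = signal.signal(signal.SIGUSR1, lambda signum, frame: None)
    # On signal delivery the runtime writes the signal number to write_fd.
    old_wakeup = signal.set_wakeup_fd(write_fd)
    try:
        # For the demo, signal ourselves *before* calling select. The byte
        # sits in the pipe, so the wakeup is not lost.
        os.kill(os.getpid(), signal.SIGUSR1)

        readable, _, _ = select.select([read_fd], [], [], timeout)
        if not readable:
            return None
        data = os.read(read_fd, 64)
        return data[0]                 # first byte is the signal number
    finally:
        signal.set_wakeup_fd(old_wakeup)
        if old_handler is not None:
            signal.signal(signal.SIGUSR1, old_handler)
        os.close(read_fd)
        os.close(write_fd)
```

The handler never throws or unwinds anything; it only makes a descriptor readable, and all the real work happens in normal code after select returns.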