It’s great that this has successfully found info leaks. That said, it feels like a somewhat simplistic solution. Set taint bytes, see if those bytes show up anywhere. Maybe the code complexity of the kernel prevents things like symbolic execution from being feasible.
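For anyone who hasn't seen the taint idea spelled out, here is a minimal userland sketch of it, not the project's actual code. The marker value, the helper names (`taint_fill`, `count_tainted`) and the 8-byte reply layout are all hypothetical, and a byte buffer stands in for a padded struct so the result doesn't depend on how a compiler treats padding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TAINT_BYTE 0xA5u   /* hypothetical marker value */
#define REPLY_SIZE 8       /* assumed layout: tag at 0, padding at 1..3, value at 4..7 */

/* Stand-in for an allocator hook: poison fresh memory with the marker. */
static void taint_fill(unsigned char *p, size_t n) {
    memset(p, TAINT_BYTE, n);
}

/* Stand-in for a copyout hook: count marker bytes about to leave. */
static size_t count_tainted(const unsigned char *p, size_t n) {
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (p[i] == TAINT_BYTE)
            hits++;
    return hits;
}

/* Writes only the two fields, as field-by-field initialization would. */
static void fill_fields(unsigned char *buf) {
    uint8_t tag = 1;
    uint32_t value = 42;
    memcpy(buf + 0, &tag, sizeof tag);
    memcpy(buf + 4, &value, sizeof value);
}

/* Buggy path: the padding bytes 1..3 are never written. */
static size_t leaky_path(void) {
    unsigned char buf[REPLY_SIZE];
    taint_fill(buf, sizeof buf);
    fill_fields(buf);
    return count_tainted(buf, sizeof buf);
}

/* Fixed path: zero the whole buffer before filling in the fields. */
static size_t fixed_path(void) {
    unsigned char buf[REPLY_SIZE];
    taint_fill(buf, sizeof buf);
    memset(buf, 0, sizeof buf);
    fill_fields(buf);
    return count_tainted(buf, sizeof buf);
}
```

In this sketch `leaky_path()` reports 3 marker bytes and `fixed_path()` reports 0. The obvious weakness is that legitimate data can happen to equal the marker, so a single fixed value will give false positives.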
This strategy works at runtime and doesn’t require source. But we have the source. Could shadow-memory-based strategies be used? What about compile-time data dependency analysis - with marked variables, and a known list of functions which return data to userland, a walk of the control flow graph might be able to uncover issues (maybe that ‘known list’ assumption is too strong?).
It’s possible that these other strategies are infeasible for reasons I’m unaware of - thoughts?
It’s not that hard to start at copyout and scan backwards. Anything not provably zeroed is probably a leak. (I assert without evidence.)
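To make the "scan backwards from copyout" idea concrete, here is a toy sketch, assuming straight-line code reduced to a list of buffer writes ending in a copyout. The `struct op` representation and every name in it are made up for illustration. Real kernel code has branches, loops and aliasing, which is exactly where "provably" gets hard.

```c
#include <stdbool.h>
#include <stddef.h>

#define BUF_MAX 64  /* toy limit: only the first 64 bytes are tracked */

enum op_kind { OP_WRITE, OP_COPYOUT };

struct op {
    enum op_kind kind;
    size_t off;  /* first byte touched */
    size_t len;  /* number of bytes touched */
};

/* Walk backwards from the copyout at index `at` and return how many of
 * the copied bytes have no earlier write covering them. */
static size_t unproven_bytes(const struct op *ops, size_t at) {
    bool needed[BUF_MAX] = { false };
    size_t count = 0;

    if (ops[at].kind != OP_COPYOUT)
        return 0;  /* nothing leaves at this point */

    /* Every byte the copyout exposes starts out unproven. */
    for (size_t i = ops[at].off; i < ops[at].off + ops[at].len && i < BUF_MAX; i++)
        needed[i] = true;

    /* Each earlier write discharges the bytes it covers. */
    for (size_t k = at; k-- > 0; ) {
        if (ops[k].kind != OP_WRITE)
            continue;
        for (size_t i = ops[k].off; i < ops[k].off + ops[k].len && i < BUF_MAX; i++)
            needed[i] = false;
    }

    for (size_t i = 0; i < BUF_MAX; i++)
        if (needed[i])
            count++;
    return count;
}

/* Two fields written, padding bytes 1..3 never touched. */
static size_t demo_leaky(void) {
    const struct op ops[] = {
        { OP_WRITE,   0, 1 },
        { OP_WRITE,   4, 4 },
        { OP_COPYOUT, 0, 8 },
    };
    return unproven_bytes(ops, 2);
}

/* Same, but preceded by a whole-buffer write (think memset). */
static size_t demo_zeroed(void) {
    const struct op ops[] = {
        { OP_WRITE,   0, 8 },
        { OP_WRITE,   0, 1 },
        { OP_WRITE,   4, 4 },
        { OP_COPYOUT, 0, 8 },
    };
    return unproven_bytes(ops, 3);
}
```

Here `demo_leaky()` returns 3 and `demo_zeroed()` returns 0. The toy treats any write as good enough; a real checker would also need to know that what was written isn't itself uninitialized.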
Take this with a huge pinch of salt (I didn’t work on it, and I don’t know the intent or the decisions made), but I’m guessing a runtime approach was chosen because it fits more easily into a bigger picture of functionality. Extending your toolchain is not an easy task, and that work mounts up once you look at multi-platform support. You’re likely to get an answer from the people who worked on the project if you ask on the relevant list :)