I’ve found that when shrinking takes too long to return, I can often improve the shrinking logic itself in less time than it would take to manually minimize the unshrunk generated input, though not always. Better shrinkers are still usually a great investment, rather than something to just disable, because they make it much easier to understand the story behind the bug. For example, most of my tests that take randomized byte buffers don’t need every individual byte within every buffer shrunk: just shrinking the length of the buffer is usually sufficient, and it reduces the dimensionality of the search significantly.
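As a minimal sketch of that length-only idea (not any particular library's API): binary-search the shortest failing prefix of the buffer while leaving the bytes themselves alone. This assumes the failure is roughly monotone in prefix length, which is often approximately true in practice.

```python
import os

def shrink_length_only(buf, fails):
    """Find the shortest prefix of `buf` that still fails, without
    touching individual byte values. Assumes failure is monotone in
    prefix length (a hedge: real bugs only approximate this)."""
    lo, hi = 0, len(buf)          # invariant: buf[:hi] fails
    while lo < hi:
        mid = (lo + hi) // 2
        if fails(buf[:mid]):
            hi = mid              # a shorter prefix still fails; keep going
        else:
            lo = mid + 1
    return buf[:hi]

# Hypothetical failing property: any buffer longer than 100 bytes "fails".
fails = lambda b: len(b) > 100
shrunk = shrink_length_only(os.urandom(4096), fails)
print(len(shrunk))  # 101, the minimal failing length
```

A real shrinker would fall back to per-byte shrinking only after length shrinking stalls, but for many properties this first pass already produces a readable counterexample.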
Another tip: make infrequent logic much more frequent under test. If you have a tree that usually splits nodes at 16 children, split under test at a threshold that is itself generated, so rare branches get hit far more often.
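A toy illustration of generating the threshold itself (the `Node` class is a hypothetical stand-in for a real tree node):

```python
import random

class Node:
    """Toy container that 'splits' when it grows past a threshold."""
    def __init__(self, split_at):
        self.split_at = split_at   # 16 in production; generated under test
        self.keys = []
        self.splits = 0

    def insert(self, k):
        self.keys.append(k)
        if len(self.keys) > self.split_at:
            # The rare branch: almost never hit at split_at=16 with small
            # inputs, but exercised constantly with a tiny generated threshold.
            self.keys = self.keys[len(self.keys) // 2:]
            self.splits += 1

rng = random.Random(0)
for _ in range(100):
    # Generate the threshold, biased small, as part of the test input.
    node = Node(split_at=rng.randint(1, 4))
    for k in range(50):
        node.insert(k)
    assert node.splits > 0   # the split path runs on every single case
```

The generated threshold becomes part of the failing input, so when a split-related bug is found, the shrinker can also minimize the threshold along with everything else.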
Generating sequences of operations that get executed against a simple model and a real stateful system is probably the technique that finds the most bugs for me. Run the generated operations against both the real thing and the model, and blow up if they ever return different results at any stage. A hashmap is a great model for a more complex database that has keys added and removed while sometimes restarting (a no-op for the model).
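Here is a minimal sketch of that technique. The `Store` class is a hypothetical stand-in for the real system, given a volatile cache so that restarts actually exercise something; a plain dict is the model:

```python
import random

class Store:
    """Stand-in for a real stateful system with persistence and restarts."""
    def __init__(self):
        self.disk = {}        # "persisted" state, survives restart
        self.cache = {}       # volatile state, lost on restart

    def put(self, k, v):
        self.disk[k] = v
        self.cache[k] = v

    def delete(self, k):
        self.disk.pop(k, None)
        self.cache.pop(k, None)

    def get(self, k):
        return self.cache.get(k, self.disk.get(k))

    def restart(self):
        self.cache = {}       # simulate a crash/restart

rng = random.Random(42)
real, model = Store(), {}     # the hashmap is the whole model
for step in range(1000):
    op = rng.choice(["put", "delete", "get", "restart"])
    k = rng.randint(0, 9)     # small key space forces collisions
    if op == "put":
        v = rng.randint(0, 99)
        real.put(k, v); model[k] = v
    elif op == "delete":
        real.delete(k); model.pop(k, None)
    elif op == "get":
        # Blow up the moment the real system and model disagree.
        assert real.get(k) == model.get(k), (step, op, k)
    else:
        real.restart()        # deliberately a no-op for the model
```

Keeping the key space tiny is important: it makes operations collide on the same keys constantly, which is where stateful bugs tend to hide.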
Sometimes combining hit-tracing (coverage-guided) fuzzers with property testing yields great results, but it can significantly slow things down, so you have to watch the coverage you get per unit of time. I’ve been meaning to try out fuzzcheck, which seems like it could be a lot faster than my previous strategy: using libfuzzer to generate byte buffers, mapping those buffers to random number generator seeds, and using the seeds to generate a sequence of operations for a model test.
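The glue layer in that strategy is small. A hypothetical sketch (in Python rather than a real libfuzzer target, and `ops_from_bytes` is an invented name): treat the fuzzer-provided bytes as a seed, then derive the operation sequence deterministically so every failure replays exactly.

```python
import random

def ops_from_bytes(data: bytes, n_ops: int = 5):
    """Derive a deterministic operation sequence from a fuzzer-provided
    byte buffer by using the buffer as an RNG seed."""
    seed = int.from_bytes(data, "big") if data else 0
    rng = random.Random(seed)
    ops = []
    for _ in range(n_ops):
        op = rng.choice(["put", "delete", "get"])
        ops.append((op, rng.randint(0, 9)))
    return ops

# The same buffer always yields the same sequence, so the fuzzer's
# saved crashing input is a complete, replayable reproduction.
assert ops_from_bytes(b"\x01\x02") == ops_from_bytes(b"\x01\x02")
```

The downside, and presumably what fuzzcheck improves on, is that the fuzzer's mutations act on an opaque seed, so a one-bit flip produces an unrelated operation sequence and the coverage feedback loop is weak.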
After finding so many subtle bugs with low-effort property testing, I now consider anything without property tests to be essentially untested.
The suggestion to disable shrinking makes me feel like much better workflows are possible, because I think the pain is really a UI limitation of how our tests run.
I think my ideal would be a UI/harness that:
- Finds a failure
- Shrinks it for a short period
- Shows me the test failure
- If there’s more shrinking to do (based on the heuristic), continues shrinking in the background, with the option to replace the failure
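That loop can be sketched in a few lines. Everything here is hypothetical (the function names, the step budget, and the "failure" being a bare integer that shrinks by halving); it only shows the shape of the workflow: a short foreground shrink, an early report, then background shrinking that offers replacements.

```python
import threading, queue

def harness(find_failure, shrink_step, show, budget_steps=100):
    """Sketch of the workflow above: shrink briefly, report the failure,
    then keep shrinking in the background and offer replacements."""
    failure = find_failure()
    for _ in range(budget_steps):           # short foreground shrink
        smaller = shrink_step(failure)
        if smaller is None:                 # fully shrunk within budget
            show(failure)
            return failure
        failure = smaller
    show(failure)                           # report what we have so far
    updates = queue.Queue()
    def keep_shrinking(f):
        while (s := shrink_step(f)) is not None:
            f = s
            updates.put(f)                  # offer a replacement failure
    threading.Thread(target=keep_shrinking, args=(failure,), daemon=True).start()
    return updates

# Toy usage: the "failure" is an int; shrinking halves it toward 1.
result = harness(lambda: 7, lambda n: n // 2 if n > 1 else None, print)
```

In this toy run the budget is large enough that shrinking finishes in the foreground, so `result` is the fully shrunk failure; with a real slow shrinker the queue of background replacements is what the UI would poll.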