You might also find the explanation of Bloom filters in “Cache Efficient Bloom Filters for Shared Memory Machines” helpful, particularly the description of how to make a Bloom filter grow dynamically as it starts to get too full.
My C property-based testing library, theft, uses a dynamic blocked Bloom filter to check whether it has already run a property test with a particular combination of arguments. This eliminates a lot of redundant work, and also helps track how often duplicates are being generated.
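To make the idea concrete, here is a minimal sketch of a fixed-size blocked Bloom filter used for that kind of "have I seen these arguments before?" check. It is not theft's actual code, and it leaves out the dynamic growth part. The names (`blocked_bloom`, `bloom_check_and_mark`), the sizes, and the choice of FNV-1a plus a splitmix64-style mixer are all assumptions made for illustration; it also assumes the arguments have already been serialized to a byte buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sizes, not taken from theft. */
#define BLOCK_WORDS 8     /* 8 x 64 bits = 512 bits, one 64-byte cache line */
#define BLOCK_COUNT 1024  /* number of blocks; must be a power of two */
#define PROBES 4          /* bits set per key, all inside a single block */

struct blocked_bloom {
    uint64_t blocks[BLOCK_COUNT][BLOCK_WORDS];  /* must start zeroed */
};

/* 64-bit FNV-1a over the serialized arguments. */
static uint64_t fnv1a64(const uint8_t *data, size_t len) {
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* splitmix64-style finalizer, so every bit of the hash is well mixed. */
static uint64_t mix64(uint64_t x) {
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/* Returns true if `key` has possibly been seen before (false positives are
 * possible), false if it is definitely new. The key is marked either way. */
static bool bloom_check_and_mark(struct blocked_bloom *bf,
                                 const uint8_t *key, size_t len) {
    assert(bf != NULL);
    uint64_t h = mix64(fnv1a64(key, len));

    /* Low 10 bits pick the block, so every probe touches one cache line. */
    uint64_t *block = bf->blocks[h & (BLOCK_COUNT - 1)];
    uint64_t bits = h >> 10;  /* 54 bits left; 4 probes x 9 bits are used */

    bool seen = true;
    for (int i = 0; i < PROBES; i++) {
        unsigned bit = (unsigned)(bits & 511);  /* bit index within the block */
        bits >>= 9;
        uint64_t mask = 1ULL << (bit & 63);
        uint64_t *word = &block[bit >> 6];
        if ((*word & mask) == 0) { seen = false; }
        *word |= mask;
    }
    return seen;
}
```

A test runner built on this would serialize each generated argument tuple, call `bloom_check_and_mark`, and skip the trial (while bumping a duplicate counter) whenever it returns true. The trade-off is that a false positive occasionally skips a trial that was never actually run.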
I know silentbicycle knows this, but I’m working on a set of Bloom filters written in Rust. I already have an implementation of a blocked Bloom filter based on the paper linked above.