
How do you know how small your datasets need to be when you divide them?

The point is you *don't* need to know. You just keep dividing, and at some point the problem fits in L1 cache. Even though your algorithm keeps dividing further down to the base case, it doesn't really matter: the remaining work just needs to be correct, and it'll be pretty fast because it's all in L1.

I see. So worst case, you sub-divide the problem and it *still* doesn't fit in the L1 cache. But that's better than nothing.

But in that case nothing would have done better, so you're still optimal. At least that's the idea.
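To make the idea concrete, here is a minimal sketch in Python of a cache-oblivious matrix transpose, the classic textbook example of this divide-and-conquer pattern (the example and all names in it are mine, not from the discussion above). It recursively halves the larger dimension with no cache-size parameter anywhere; once a sub-block happens to fit in L1, the remaining recursion runs entirely out of cache.

```python
def transpose(src, dst, r0, r1, c0, c1):
    """Cache-obliviously transpose src[r0:r1][c0:c1] into dst.

    Hypothetical illustration: recursively split the larger dimension
    in half. No cache size appears anywhere; once a sub-block fits in
    L1, all further recursive work on it is cache-resident.
    """
    rows, cols = r1 - r0, c1 - c0
    if rows == 0 or cols == 0:
        return
    if rows == 1 and cols == 1:
        # Base case: a single element.
        dst[c0][r0] = src[r0][c0]
        return
    if rows >= cols:
        # Split the row range in half.
        mid = (r0 + r1) // 2
        transpose(src, dst, r0, mid, c0, c1)
        transpose(src, dst, mid, r1, c0, c1)
    else:
        # Split the column range in half.
        mid = (c0 + c1) // 2
        transpose(src, dst, r0, r1, c0, mid)
        transpose(src, dst, r0, r1, mid, c1)


src = [[1, 2, 3],
       [4, 5, 6]]
dst = [[0] * 2 for _ in range(3)]
transpose(src, dst, 0, 2, 0, 3)
# dst is now [[1, 4], [2, 5], [3, 6]]
```

In Python the cache effects are drowned out by interpreter overhead, of course; the point of the sketch is only the structure: no tuning parameter, just recursion down to a trivial base case.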

This also works at other levels of the storage hierarchy, e.g. when you're operating on data on disk or over the network. See also http://user.it.uu.se/~arnea/ps/expTr.pdf and http://www.researchgate.net/publication/27247707_Exponential_Structures_for_Efficient_Cache-Oblivious_Algorithms/file/3deec522ec2be4a771.pdf