A very similar article was just added within the last week. Is it just B-tree Building Time? I found this one particularly naive; you can find more and better info on Wikipedia or in online articles like Modern B-Tree Techniques, and also in the book Database Internals [O’Reilly], both of which are solid gold.
I’ve said this before, but: people get hung up on the idea of a b-tree needing a fixed branching factor. That’s not necessary, and it totally gets in the way of supporting variable length keys and values. Just keep shoving entries into a node until it overflows, then split it.
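To make that concrete, here is a rough Python sketch of a leaf node that is limited by bytes rather than by entry count. The names (LeafNode, PAGE_SIZE), the tiny 64-byte budget, and the 4 bytes of per-entry overhead are all assumptions for illustration, not taken from any particular implementation.

```python
from bisect import bisect_left

PAGE_SIZE = 64  # hypothetical byte budget per node; real pages are usually 4 KiB or more
ENTRY_OVERHEAD = 4  # assumed cost of a 2-byte length prefix for the key and for the value


class LeafNode:
    """A leaf holding sorted (key, value) byte strings, limited by bytes, not by count."""

    def __init__(self):
        self.keys = []
        self.values = []

    def size(self):
        # Total bytes this node would occupy under the assumed encoding.
        return sum(ENTRY_OVERHEAD + len(k) + len(v)
                   for k, v in zip(self.keys, self.values))

    def insert(self, key, value):
        """Insert or overwrite. Returns (separator_key, right_node) on split, else None."""
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = value
        else:
            self.keys.insert(i, key)
            self.values.insert(i, value)
        # Keep shoving entries in until the node overflows, then split it.
        if self.size() > PAGE_SIZE and len(self.keys) > 1:
            return self.split()
        return None

    def split(self):
        # Choose the cut by accumulated bytes so the halves are roughly equal in size.
        half = self.size() // 2
        acc = 0
        cut = len(self.keys) - 1
        for i, (k, v) in enumerate(zip(self.keys, self.values)):
            acc += ENTRY_OVERHEAD + len(k) + len(v)
            if acc >= half:
                cut = i + 1
                break
        # Make sure both sides keep at least one entry.
        cut = min(max(cut, 1), len(self.keys) - 1)
        right = LeafNode()
        right.keys, self.keys = self.keys[cut:], self.keys[:cut]
        right.values, self.values = self.values[cut:], self.values[:cut]
        return right.keys[0], right
```

Note that nothing here fixes the number of entries per node: a node full of short entries holds many, a node with long values holds few. A single entry bigger than the page would still need separate overflow handling, which this sketch leaves out.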
I am on the lookout for a good introduction to writing disk-backed btrees since I have never done that before. This may be the closest thing to what I was looking for, or exactly it.
Most books that cover it don’t actually give you working, minimal code you could run today. Or maybe I’ve missed them. I do own Database Internals and I have been meaning to read it, but I feared it would still be higher level than actual working code samples.
Wikipedia for example does not give working code samples.
This tutorial for writing a sqlite clone is pretty good and has actual C code: https://cstack.github.io/db_tutorial/
True! But it kind of leaves off right in the middle and has a lot of distracting detail about databases that I’d like to not have to look at (in an ideal tutorial). So I discount that.
Yeah, I couldn’t find good code either, so I just kind of dove in and started coding a few months ago. It’s been challenging but lots of fun. Unfortunately this is a work project, so I can’t just open source it.
Another good resource I found is the documentation of the SQLite file format. It’s not actual code, but it tells you all about the data structures down to the bit level. This is a page on the SQLite.org site somewhere.
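As a small taste of how directly that format description maps to code, here is a hedged Python sketch that reads the page size out of the 100-byte database header. The function name read_page_size is made up for this example; the field layout (16-byte magic string, then a 2-byte big-endian page size where the value 1 means 65536) follows the file format documentation.

```python
import struct

SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of a SQLite 3 database file


def read_page_size(header: bytes) -> int:
    """Return the page size stored in the 100-byte database header."""
    if len(header) < 100:
        raise ValueError("header must be at least 100 bytes")
    if header[:16] != SQLITE_MAGIC:
        raise ValueError("not a SQLite 3 database")
    # Bytes 16-17: page size as a big-endian 16-bit integer; 1 stands for 65536.
    (raw,) = struct.unpack(">H", header[16:18])
    return 65536 if raw == 1 else raw
```

You would call it on the first 100 bytes of a database file, e.g. read_page_size(f.read(100)) with f opened in binary mode. Everything else in the B-tree pages is laid out in terms of that page size.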