1. 15
    1. 7

      Okay, this post does an alright job of introducing B-trees and B+trees. But it completely goes off the rails describing how they’re used.

      In actual Database implementation, the database uses both B-tree and B+tree together to store data. B-tree used for indexing and B+tree used to store the actual records. B+tree provides sequential search capabilities in addition to the binary search, which gives the database more control to search non-index values in a database.

      This isn’t true at all. There’s no reason whatsoever to specifically store indexes as B-trees, and actual records as B+trees. The implication that ordered sequential search isn’t useful for indexed columns makes no sense. And B-trees can do ordered sequential search, you can walk a B-tree the same way you can walk any binary tree. Leaf chaining in B+trees is just an implementation detail, and typically not noticeable in any way.

      I have no idea why the author would make that claim, especially since the post opens with “architectural details are described referenced to SQLite 2.x database architecture,” but neither SQLite 2 nor 3 uses B+trees.

      SQLite uses B-trees exclusively, ordered by the rowid, or primary key for WITHOUT ROWID tables. The author is at least correct that SQLite’s secondary indexes are B-trees containing (value, rowid / primary key) pairs.

      MySQL’s InnoDB works the same way as SQLite, but uses B+trees exclusively rather than B-trees.

      PostgreSQL’s implementation of MVCC relies on write-once immutable pages, so it can’t use B+trees. Every row update goes on a new page, making leaf links unmaintainable. PostgreSQL also differs from SQLite in index structure: its indexes don’t reference primary key values, they store the actual page numbers containing the row data. That’s why updates to tables with lots of indexes are so expensive for PostgreSQL, each row update must also update every index pointing to that row’s location.

      I don’t actually know of any database that uses B+trees for primary storage, but B-trees for secondary indexes. There’s just no reason to do that.

      1. 1

        Correct me if I’m wrong. In theory, the insertion and deletion for a sorted array should be O(log(n)), but because you need to shift the elements to account for the insertion/deletion that is what makes it O(n) right?

      🇬🇧 The UK geoblock is lifted, hopefully permanently.