By the time we get to the end, do we have an efficient data structure for any editor not just hex editors?
Yes. This is 2004. If it were written today, I would say, “why not just use xi-rope?”
Excellent article; it explains the underlying concepts well. Note that using a B-tree to store a long string is like using many little gap buffers, because each tree node has some slack space.
I wrote about the same topic (a list structure with fast insertion and reading) with a different focus. My description is shorter and shallower, but code is included. https://www.nayuki.io/page/avl-tree-list
It’s true that the modern solution is to use ropes in a balanced tree.
The default implimentation of “Set” in F# is an immutable AVL Tree. Is there a requirement for uniqueness of elements for AVL trees? Also are AVL’s unordered? Or is this just a coincidence that F# decided to use AVL Trees for unique unordered collections.
In an imperative programming language like Java (C++, Python, etc.), it is customary to use a self-balancing binary tree data structure to implement ordered sets and ordered maps/dictionaries. If I recall correctly, the AVL balanced tree was invented first, but the later red-black tree became the more popular choice in practice. I’m not sure why though, because the RB code is longer, there are more cases to handle, and it’s not clear to me that it’s faster than AVL (though I haven’t implemented or benchmarked it). Anyways, these mutable binary trees can be adapted to be immutable - just throw away a node and replace it with a new one if you need to insert/update/delete a tree.
Onto your questions. Any tree (plain binary, AVL, RB, B-tree) doesn’t require unique elements or ordering of the values. But you need certain rules about where to store and retrieve an element. For ordered sets, the ordering forces the position within the tree. For the hex editor article and my “AVL tree list”, the subtree weight forces the position within the tree.
Sorry, because I’m unfamiliar with F#, I can’t answer your specific question about why they decided to use AVL trees for unique unordered collections. It seems that “unique” and “unordered” are features that are mutually incompatible for a tree structure. I expect you’ll have better luck on a larger community forum like Stack Overflow.
That’s a great article and site. I’ll probably dig into those write-ups at some point.