1. 10
  1. 12

    I have some grave doubts about the performance claims, especially where the graph shows SQLite taking 2700ms to read 1000 documents by ID; that’s ridiculously slow. Were they reopening the db for each read?

    Update: Their SQLite benchmark is so bad I can’t believe it’s unintentional. As I half-expected, they create the table without any primary key, and neglect to create an index on the document ID. They do have a create_index method, but they only index other columns, not the idd column that stores the doc-ID.
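
    To show why the missing key matters, here is a minimal sketch using Python’s standard `sqlite3` module. The table and column names (`docs_noindex`, `docs_pk`, `doc_id`, `body`) are made up for illustration and are not taken from the benchmark’s code; the sketch only asks SQLite how it would plan a point read in each case.

    ```python
    import sqlite3

    conn = sqlite3.connect(":memory:")
    # Hypothetical schemas: one table with no key at all, one keyed on the doc ID.
    conn.execute("CREATE TABLE docs_noindex (doc_id TEXT, body TEXT)")
    conn.execute("CREATE TABLE docs_pk (doc_id TEXT PRIMARY KEY, body TEXT)")

    def query_plan(table):
        """Return SQLite's plan description for a point read by doc_id."""
        rows = conn.execute(
            f"EXPLAIN QUERY PLAN SELECT body FROM {table} WHERE doc_id = ?",
            ("doc-1",),
        ).fetchall()
        # The last column of each plan row is the human-readable detail string.
        return " ".join(row[-1] for row in rows).upper()

    plan_noindex = query_plan("docs_noindex")  # a full table SCAN
    plan_pk = query_plan("docs_pk")            # a SEARCH via the automatic PK index
    ```

    Without a key or index, every lookup by ID walks the whole table, so 1000 reads over 1000 rows cost on the order of a million row visits instead of 1000 index lookups.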

    I haven’t really dug into their own database code, but when the entire disk storage engine is 250 lines long, without importing any structured-storage library, it can’t be doing anything fancy, even a b-tree…

    AFAICT the storage looks like a flat file, with a separate file containing the serialized metadata for each record, which is read into memory on startup. (I think IBM called this ISAM back in the ’60s?) It’s probably nice and fast for 1000 records, but not what you’d call scalable, even for client-side data sets.

    In short, this looks like a toy … nice for prototyping or unit tests, but not something you’d use as a real database.

    (Disclaimer/hat: I’m the tech lead of Couchbase Lite, another mobile NoSQL database.)

    Update 2: The disk engine makes no effort to provide real durability or crash resistance. It overwrites records in place; the metadata is written some time later to a separate file; there’s no exception handling; and it doesn’t do any flushing beyond whatever Python’s flush method does. In short, it can lose or corrupt data if the process crashes, let alone if there’s a power failure or kernel panic. Providing real fail-safe durability of file contents is actually quite difficult, and real databases do their best to provide it.
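
    For contrast, here is a rough sketch of one common way to avoid overwriting in place: write to a temporary file, sync it, then rename it over the original. The helper name `atomic_write` is hypothetical, the directory sync is POSIX-only, and this is nowhere near what a real database does (no write-ahead log, no checksums); how strong a guarantee `fsync` gives also varies by platform and hardware.

    ```python
    import os
    import tempfile

    def atomic_write(path, data: bytes):
        """Replace `path` with `data` so a crash leaves the old or the new contents."""
        directory = os.path.dirname(os.path.abspath(path))
        # The temp file must live on the same filesystem for the rename to be atomic.
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(data)
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to push it to the storage device
            os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
        except BaseException:
            # Don't leave a stray temp file behind if anything failed.
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)
            raise
        # Sync the containing directory so the rename itself is persisted (POSIX only).
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
    ```

    Note that `flush()` alone only hands the bytes to the OS, which is the gap described above.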

    At this point I’m pretty upset that this guy is overstating the capabilities of the library so badly. There’s definitely a place for a mock of a real database (I’ve written one in the past and we still use it for testing), but you absolutely don’t advertise it as being a real database or encourage people to trust real data to it.

    1. 7

      Nice idea, but I don’t get why it’s written in Python. At least half of SQLite’s usefulness is that it can be easily used from every language.

      1. 2

        I’d love to see some attempt to compare this to SQLite + the JSON1 extension, and I’m surprised that the performance comparison section rates SQLite’s point reads so low. SQLite’s native document-oriented support is actually entirely reasonable, although, unlike Postgres and others, I don’t believe it supports indexes on JSON documents or substructures.

        1. 1

          SQLite’s JSON support is quite decent. You can index JSON values since SQLite supports indexes on expressions. It’s even possible to query inside JSON collections (stuff like “where the ‘states’ array contains ‘CA’”) using a virtual table to represent the collection.
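
          A small sketch of both features via Python’s `sqlite3` module, assuming the SQLite build includes the JSON functions. The `docs` table, its columns, and the sample documents are invented for illustration.

          ```python
          import sqlite3

          conn = sqlite3.connect(":memory:")
          conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
          conn.executemany(
              "INSERT INTO docs (body) VALUES (?)",
              [
                  ('{"name": "alice", "states": ["CA", "NY"]}',),
                  ('{"name": "bob", "states": ["TX"]}',),
              ],
          )

          # Index on an expression: the value at $.name inside each JSON document.
          conn.execute("CREATE INDEX docs_name ON docs (json_extract(body, '$.name'))")

          by_name = conn.execute(
              "SELECT id FROM docs WHERE json_extract(body, '$.name') = ?", ("bob",)
          ).fetchall()

          # json_each is a table-valued function: each array element becomes a row.
          in_ca = conn.execute(
              "SELECT docs.id FROM docs, json_each(docs.body, '$.states') "
              "WHERE json_each.value = ?",
              ("CA",),
          ).fetchall()
          ```

          For the index to be used, the query has to repeat the same expression that the index was created on.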