1. 69
  1. 26

    Very similar story from a few weeks ago: SQLite is not a toy database – I won’t repeat my full comment from there.

    SQLite is very fast. [..] The only time you need to consider a client-server setup is: [..] If you’re working with very big datasets, like in the terabytes size. A client-server approach is better suited for large datasets because the database will split files up into smaller files whereas SQLite only works with a single file.

    SQLite is pretty fast compared to fopen(), sure, but PostgreSQL (and presumably also MariaDB) will beat it in performance in most cases once you get beyond the select * from tbl where [..], sometimes by a considerable margin. This is not only an issue with “terabytes” of data. See e.g. these benchmarks.

    Is it fast enough for quite a few cases? Sure. But I wouldn’t want to run Lobsters on it, to name an example, and it’s not like Lobsters is a huge site.

    Well, first of all, all database administration tasks becomes much easier. You don’t need any database account administration, the database is just a single file.

    Except if you want to change anything about your database schema. And PostgreSQL also comes with a great deal of useful administrative tools that SQLite lacks AFAIK, like the pg_stats tables, tracking of slow queries, etc.

    And sure, I like SQLite. I think it’s fantastic. But we need to be a tad realistic about what it is and isn’t. I also like my aeropress but I can’t boil an egg with it.

    1. 9

      SQLite is pretty fast compared to fopen(), sure, but […] MariaDB will beat it in performance

      I would actually be interested in knowing whether SQLite handles that query that broke Lobste.rs’ “Replies” feature better than MySql/MariaDb.

      But I wouldn’t want to run Lobsters on it, to name an example, and it’s not like Lobsters is a huge site.

      I think Lobste.rs would run fine. It would probably be more an issue with the limited amount of SQL SQLite supports.

      1. 7

        The replies query broke because the hosted MySQL Lobste.rs relies on doesn’t do predicate push down. SQLite does do predicate push down, so it wouldn’t have the same problem.

        However SQLite doesn’t have as many execution strategies as MySQL, so it may be missing a key strategy for that query.

        1. 5

          SQLite’s query planner is honestly a bit smarter than MySQL’s in certain ways. For example, MySQL, as recently as 2017, did temporary on-disk tables for subselects. SQLite instead usually managed to convert them to joins. Maybe that’s been fixed in the last four years, but I wouldn’t assume that MySQL would be faster/that SQLite would be slower.

        2. 1

          Lobsters uses some fairly complex queries; usually those kind of things tend to do less well on SQLite, although I didn’t run any benchmarks or anything. I found that SQL support in SQLite is actually pretty good and don’t necessarily expect that to be a major issue.

          From what I understand is that the biggest problem with the Lobsters hosting is that it’s running MySQL rather than MariaDB. While MySQL is still being actively developed, from what I can see it’s not developed very actively and MariaDB is leaps ahead of MySQL. At this point we should probably stop grouping them together as “MySQL/MariaDB”.

          1. 1

            Aside from the operations perspective of migrating data, converting things that are not 1:1 between mysql and mariadb, etc. are there any features in lobste.rs that prevent the use of MariaDB?

            1. 1

              It used to run on MariaDB until there was a handover of the servers. AFAIK it runs well on both (but not PostgreSQL, and probably also not SQLite).

              1. 1

                I guess their current hoster only provides MySql (for unknown reasons).

                I asked about offering hosting, but never got a reply.

          2. 12

            I also like my aeropress but I can’t boil an egg with it.

            I bet you could poach an egg with it, with some inventiveness and a slightly severe risk of getting scalded. ;)

            1. 3

              When I posted that comment I was thinking to myself “I bet some smartarse is going to comment on that” 🙃

              1. 2

                Joking aside, I think a better analogy would be comparing the Aeropress to an espresso machine: the Aeropress is going to get you really good coffee that you’re going to use every day, costs very little, is easy to maintain, and you can bring with you everywhere, but it’s never going to give you an espresso. But then again, it’s not really trying to.

                (The analogy falls apart a bit, as one of the original claims was that it could produce espresso. I think they stopped claiming that though.)

              2. 1

                LOL

                …and audible laughter was emitted. Thanks for that.

                1. 1

                  On the other hand if you had to set up and supply your password to obtain admin rights every time you just wanted to make coffee….

                  …because some nutjob might want to use it for boiling eggs and the company wanted to stop that….

                  …the device that just let’s you get on with making coffee (or boiling eggs) is a hellavuh lot faster for many jobs!

                2. 5

                  Except if you want to change anything about your database schema.

                  SQLite has supported ALTER TABLE ADD COLUMN for years, and recently added support for dropping columns. So I’d amend your statement to “…make complex changes to your db schema.”

                  SQLite has stats tables, mostly for the query optimizer’s own use; I haven’t looked into them so I don’t know how useful they are for human inspection.

                  1. 2

                    SQLite has supported ALTER TABLE ADD COLUMN for years, and recently added support for dropping columns. So I’d amend your statement to “…make complex changes to your db schema.”

                    Yeah, the drop column is a nice addition, but it’s still a pain even for some fairly simple/common changes like renaming a column, changing a check constraint, etc. I wouldn’t really call these complex changes. It’s less of a pain than it was before, but still rather painful.

                    SQLite has stats tables, mostly for the query optimizer’s own use; I haven’t looked into them so I don’t know how useful they are for human inspection.

                    As far as I could find a while ago there’s nothing like PostgreSQL’s internal statistics. For example keeping track of things like number of seq scans vs. index scans. You can use explain query plan of course, but query plans can differ based on which parameters are used, table size, etc. and the query planner may surprise you. It’s good to keep a bit of an eye on these kind of things for non-trivial cases. Things like logging slow queries is similarly useful, and AFAIK not really something you can do in SQLite (although you can write a wrapper in your application).

                    None of these are insurmountable problems or show-stoppers, but as I mentioned in my other comment from a few weeks ago, overall I find the PostgreSQL experience much smoother, at the small expense of having to run a server.

                    1. 6

                      it’s still a pain even for some fairly simple/common changes like renaming a column

                      https://sqlite.org/lang_altertable.html :

                      ALTER TABLE RENAME COLUMN The RENAME COLUMN TO syntax changes the column-name of table table-name into new-column-name. The column name is changed both within the table definition itself and also within all indexes, triggers, and views that reference the column.

                3. 17

                  I used to use SQLite all the time for geospatial data using the SpatiaLite extensions, it made dealing with data in many different formats much easier (and scriptable) and just simplified a lot of the work we had to do to manage weird datasets on nationalmap.gov.au(/renewables - sadly recently made defunct due to lack of government funding).

                  We’d pretty regularly get CSVs with columns for LAT and LON, and need to do some actual work with the data, or turn it into another format like GeoJSON. Or we’d get a bunch of GeoJSON data that we wanted to manipulate.

                  Probably the most useful thing I did (but have probably lost access to sadly) was sticking the Australian Government’s Geocoded National Address File (GNAF) into a SpatiaLite database which we could use to geocode addresses, display know addresses etc. It handed the rather large quantity of data well (though IIRC generating spatial indexes for the several million locations took a while. Having the indexes meant we could do things like find all addresses which mapped to identical or nearby locations.

                  All this is pretty basic, and could be done in plenty of tools, but it ended up being the one that was most flexible, allowing us to do more than we could in QGIS and without having to buy licenses for ArcGIS. Importantly it was also as powerful, at least as far as we needed, as PostGIS which we used for serving much of the data. Anything we knew would be static was simply put in SpatiaLite databases and handed to GeoServer.

                  1. 1

                    This is a great illustration on why geospatial data must move past shapefiles and other clunku format to sqlite+spatialite and Geopackage (being sqlite+spatialite configured). Having al you data in a geospatial-aware database let you use easily Geojson as the data format to transfer in-between but also allow to just drop Geoserver/Mapserver in front of it (the databases) and provide WCS/WFS api for free or almost. QGIS can also be used easily as a viewer for a local or remote database with PostGIS or Spatialite or an editor also.

                    There is some much to move to in the geospatial/gis field to simplify the data format and get rid of past format as shapefiles that is a pain to work with.

                  2. 13

                    I saw a big speedup moving a very low-traffic NextCloud install from SQLite to PostgreSQL. I suspect this is because PHP’s one-script-invocation-per-HTTP-request model amplifies any costs of opening and closing a database. With Postgres, this is just opening a UNIX domain socket. With SQLite, this is opening a file, reading indexes, and so on, followed by having to write back everything on close. Putting the database in a separate process works around limitations of the PHP model.

                    I’d be really curious what the performance of a SQLite daemon process that exposed the SQLite API but kept the in-memory caches live across open / close operations would be in such a scenario.

                    1. 6

                      According to PHP’s docs, the pg extension does some transparent connection pooling by default too, which may have helped.

                      1. 4

                        I suspect this the problem quite a few PHP systems have. They use the database as a session store, log facility, or even an object cache, so almost every request causes at least one database write. SQLite only supports a single writer at a time, so the process might have to wait a bit for a database lock.

                        1. 3

                          Why don’t you use fastCGI? It adds less complexity than managing another server - in my opinion.

                          1. 13

                            12-15 years ago I ran a web forum written in PHP using a SQLite database. It had 40.000 registered users, and 100-300 users browsing it during daytime, sometimes peaking 500+. It never had a single hiccup. I backed it up by simply copying the database file. I think I had a script to put the forum in maintenance mode before proceeding to the backup (which was ready in 1-3 seconds) but I can’t be sure. I tested the backups manually from time to time and they worked.

                            Frankly, I didn’t need anything else. It was faster than virtually all websites I browse today.

                            1. 1

                              Why would FastCGI help? The PHP programming model is the problem, not the way it communicates with the server. Even in a FastCGI mode, your script still opens the file, still needs to parse the indexes, and still closes and flushes it at the end, you just don’t create a new process each time.

                            2. 1

                              I saw a big speedup moving a very low-traffic NextCloud install from SQLite to PostgreSQL.

                              That’s good to know. I’ve (reluctantly) installed Nextcloud to see if it could address my requirements for photo syncing, and it’s sluggish performance OOTB was a surprise. That will be the next thing I try. Thanks!

                            3. 6

                              In most cases?

                              • k:v stores for fast lookups (requirement: 50ms p90 latency with 30.000 req/s)
                              • storing 50TB flat data for analytical workloads
                              • storing relational user data for web applications

                              I think SQLite is sometimes a good option for #3.

                              1. 20

                                FWIW, SQLite is by far the most-deployed database in the world. It’s just that most of its deployments aren’t in servers — it’s in almost every cellphone in the world, and probably in most cars and TVs built in the last decade. I’ve heard it’s found in microwave ovens too.

                                I know people here tend to be focused on servers, but databases aren’t just for that.

                                1. 12

                                  k:v stores for fast lookups (requirement: 50ms p90 latency with 30.000 req/s)

                                  Trivial for SQLite. Arguably the best case for SQLite performance. See Expensify using SQLite as a KVS reaching 4 million QPS.

                                  1. 6

                                    Yeah, when I saw the title, I’ve thought “oh, another one?”. And yes, the app makes some (unsubstantiated) claims about “thousands of concurrent writes”, and even if true, I wouldn’t say “most cases” works here.

                                    I mean, yes, SQLite is valuable and has its definite uses. And yes, a lot of applications out there could probably work with SQLite instead of whatever else they may have. But “most cases”? I’d rather say “few cases”, go into a niche, and then rule the world in that niche.

                                    1. 7

                                      Your notions of most vs. niche are kind of backwards… https://www.sqlite.org/mostdeployed.html

                                      it seems likely that there are over one trillion (1e12) SQLite databases in active use.

                                      1. 3

                                        That would suggest there are trillions of webapps there. Is say majority of those trillions are embedded in a mobile app - practically one use case. And a great I’ve at that - you’d have to be a little insane to ship a postgresql to acompany your mobile app.

                                        The article seems to claim that you should be developing web applications and replacing your regular database servers with it. I mean, for single-to-few-users blogs, sure. Even some small apps. But for most cases where I’m using PostgreSQL or a MongoDB cluster, I would probably not be doing myself favors by replacing them with sqlite.

                                        So take my comment in that context, of how I understood the article - using sqlite instead of postgresql. In that context, I would not go for “most cases”, that’s what I’m vasing my comment on.

                                        1. 5

                                          There are not trillions of webapps, but there are, combined, over a trillion airplanes, smartphones, industrial machines, automobile computers, infrastructure controllers, and other real world places you want to have a database.

                                          The world of computer technology is so much more than the web, or even the internet.

                                    2. 4

                                      storing 50TB flat data for analytical workloads

                                      Sqlite recently increased the max database size to 280TB, because people had reached the half-way point of it’s previous limit, so 70TB Sqlite databases exist.

                                    3. 4

                                      I also always enable “foreign keys”: PRAGMA foreign_keys = ON;

                                      1. 4

                                        I sometimea consider sqlite for one-off prod apps, but postgres is already installed on al my servers (and a single apt install away if not) so I end up using what’s easy.

                                        1. 3

                                          The only time you need to consider a client-server setup is:

                                          Where you have multiple physical machines accessing the same database server over a network. In this setup you have a shared database between multiple clients.

                                          So…. an enormous percentage of mature production applications?

                                          1. 8

                                            Which is a tiny percentage of all database deployments.

                                            1. 1

                                              I think saying “this tool, which is unfit for most production usages, is the only tool you need in most cases” is quite disingenuous at best.

                                              1. 2

                                                The average person is using 30-40 SQLite databases somewhere between waking up and leaving their house.

                                                It’s good to know when not to use SQLite, but denying that SQLite runs this planet is rejecting reality.

                                          2. 4

                                            No thank you, I value my type system.

                                            1. 24

                                              This is a weird comment. I checked the link, and my understanding is that due to the way SQLite parses the types, FLOATING POINT would be treated as INT as it ends with INT. Who writes “floating point” as a type? That’s not even in the standard.

                                              I read that tweet as classical “let’s shit about stuff others wrote so I look smart”. SQLite has its flaws, just like every software written, but it is a really good piece of software.

                                              1. 6

                                                I mean, I’m not sure what the tweet is saying, but you can insert a string “foo” into a column declared “int” no problem in SQLite.

                                                1. 3

                                                  I read that tweet as classical “let’s shit about stuff others wrote so I look smart”. SQLite has its flaws, just like every software written, but it is a really good piece of software.

                                                  I wrote this tweet, it’s just a joke, riffing on the popular How SQLite is Tested article which gets shared often. SQLite does have an absurd implementation of a “type affinity” system which takes the spelling of a type into consideration. I think we can all have a good laugh about that and also appreciate how unbelievably high-quality the software itself is.

                                                  I think your comment was the classic “this other guy is doing the classic asshole thing” asshole thing ;)

                                                  1. 2

                                                    Well, sure, but people tend to take certain jokes seriously if it lines up with their confirmation biases, inadvertently turning the jokes into actual criticism.

                                                  2. 1

                                                    Wouldn’t it be more reasonable to die than to try to guess what the programmer meant?

                                                  3. 3

                                                    Ran into issues with this just this week. Never again

                                                    1. 1

                                                      Perhaps you have multiple apps using the same db?

                                                    2. 2

                                                      Usually I’m kinda thankful if databases stay away from type systems, because I’d rather have a minimal one than a wrong one, but what really hurts is that even those types are gone as soon as you use a database function.

                                                    3. 2

                                                      (Is something wrong on my end or why is the prose monospaced and the code using sans-serif?)

                                                      SQLite is almost always more user friendy. I sometimes decide to not use tools if they depend on a Client-Server DBMS, just because configuring it is always more complicated than it has to be (for me).

                                                      1. 1

                                                        At my end, the prose is monospaced sans-serif while the code is monospaced with serif (Courier family).

                                                        I use Chrome on Windows with no appearance-related plugins.

                                                        1. 1

                                                          Yeah, the body font family is monospace, while the pre font family is “Courier New, Courier;” – i.e., missing any generic font in case Courier (New) isn’t installed on the user’s system. So pre blocks are displayed in your browser’s default font.

                                                          Makes for a funny combination…

                                                          1. 1

                                                            Yet another reason not to use monospace for prose.