1. 27

  2. 5

    All databases suck in their own ways, but historically I’ve found the hype around postgres to be pretty unreasonable. They’ve closed many of the severe operational gaps in the last few years, but you still see a lot of pain when pushed hard. Everyone eventually gets burned by the architecture’s performance limitations, multitenancy deficiencies, vacuum issues etc… It’s sort of like the redis of the relational world: lots of features, great for developing features againts, but gives SRE’s headaches when pushed into production at orgs with diverse tenant needs / high connection churn / high throughput / high concurrency / etc… Over time it will overcome these, but I don’t trust it with demanding usage for the time being.

    1. 2

      Could you expand more, or maybe share a link, about the Redis critique? I’m curious to learn more about that.

      1. 1

        I’m a noob if it comes to databases. Could you tell what alternatives to Postgres are there? I understand that answer is mainly: it depends. But maybe you could write a couple of “if this then that” alternatives?

        1. 2

          I imagine that icefall was comparing it to MySQL. The best thing MySQL has going for it is that many operations teams know how to run it at scale and it scales well. Developers don’t need to spend a ton of time to make their SQL performant, they can just USE INDEX. Postgres is more featureful and “correct”, but it isn’t easy to run at the very high end of performance.

          That being said, not many use cases require that high end performance and can get along quite well with Postgres.

          1. 2

            But you need to avoid JOINs when using MySQL, especially when involving more than one table.

        2. 1

          AFAICT, most software does not run an especially demanding workload.

          When the value your software generates is high compared to the (computer-and-therefore-operational) load it generates, correctness and features are in higher demand.

        3. 4

          For me leaving postgres was like moving from say python to php. Postgres was just so expressive and featureful and reliable. Mysql is a weakly typed minefield that has lots of surprising features that corrupt your data.

          I understand why lots of people used mysql. It basically gave you a pretty fast multithreaded server with query caching which allows you to basically treat it like a pretty fast nosql database. In fact I would argue that mysql is mostly used as a nosql database with a sql frontend.

          Mysql didn’t get good with the whole transaction and data integrity thing until much later, and due to historical baggage still requires alot of work to avoid shooting yourself in the foot.

          Postgres has always been an amazing single node correct ACID SQL database, it lagged a bit in performance on simple queries and having a good replication story. I was at a company that built a custom logical replication engine for postgres, and it was a pretty big pain.

          1. 1

            [MySQL] requires alot of work to avoid shooting yourself in the foot.

            Can you expand on this point? I’ve used MySQL for many years and not encountered data integrity issues. Genuinely curious, I like Posgres as well.

            1. 12

              Not a problem at all, I suspected someone might ask me to defend my opinion. I’ll only mention specific issues that I have encountered in production mysql usage.

              Mysql has non-transactional engines available that silently pretend to support transactions. https://dev.mysql.com/doc/refman/5.6/en/myisam-storage-engine.html

              Mysql’s utf8 is not actually utf8, but a subset of utf8 which has caused me some headaches in production. https://dev.mysql.com/doc/refman/5.7/en/charset-unicode-utf8.html Mysql has a “real” utf8 charset but it’s not called utf8.

              Mysql table repair: It boggles my mind that a database can have a table “break” until someone comes around to ’“repair” it, but here we are: https://dev.mysql.com/doc/refman/5.7/en/repair-table.html

              Enum handling is a joke, but it’s not funny when at 3am you realize you realize that an ENUM(‘a’,‘b’,‘c’) can have 5 possible values. https://dev.mysql.com/doc/refman/5.7/en/enum.html

              In fact I’d like to leave you with a few choice quotes from the mysql docs on ENUM:

              If you insert an invalid value into an ENUM (that is, a string not present in the list of permitted values), the empty string is inserted instead as a special error value. This string can be distinguished from a “normal” empty string by the fact that this string has the numeric value 0

              If the numeric value is quoted, it is still interpreted as an index if there is no matching string in the list of enumeration values. For these reasons, it is not advisable to define an ENUM column with enumeration values that look like numbers

              All of these issues are avoidable, in fact it’s quite similar to coding PHP and avoiding all the “bad” functions (mysql_escape_string vs mysql_real_escape_string), which is why I say it requires work to avoid shooting yourself in the foot.

              1. 8

                I’m not the person complaining, but allow me to share my MySQL story.

                I was leading a small dev team in a tiny company. We had no QA team, and our test suite was nonexhaustive, so for upgrades, our process was as follows: upgrade the dev environment, and let it sit for a few weeks; if no problems appeared, upgrade live. It was not a perfect process by any means, but it was better than nothing.

                The person in charge of the DB was cautious enough that even point releases went through this process.

                So ge queued up an upgrade in dev. A security release, not a major or minor release. A few weeks passed, and we pushed it to live. And then the database crashed. The entire database on the server side, complete with data loss and corruption.

                We restored from backup. A few hours later, it crashed again. After a couple more iterations of that, we downgraded.

                What happened? We traced it to a single page on the site, where a specific date formatting function was being called in a specific (valid, documented) way. The semantics of that date function had changed, without findable mention in the release notes, such that the old (valid, documented) way of calling it caused the whole system to collapse and trash data.

                Now, this was years ago, and I’m sure things are better now than they were then (heck, they’d have to be), but I’ve never experienced anything like that from Postgres, either prior to that or since. I’ve had custom extensions break in odd ways, with major releases. I remember the transaction wraparound problem in 9.3, also introduced (and fixed) in major releases. I even remember some of the stability issues from the 6.x era. But nothing like introducing a server-crashing, data-corrupting bug in a security fix release.

                And I’m not likely to trust MySQL again.

                1. 1

                  I’ve seen a MySQL GUI crash, leaving a zombie connection that somehow never got killed running on the server and doing horrible things to the CPU despite the fact that the client for that connection had long since vanished.

                  Check constraints don’t exist.

                  Nested subqueries can have “surprising” performance characteristics. (Especially bad since it also lacks a “WITH” clause.)

                  Tables can get into “broken” states for no apparent reason. MyISAM (a hilarious engine all around) had a “repair table” functionality(!), but InnoDB doesn’t, and yet can still get into a situation where it won’t let you do anything to the table.

                  The default collation is latin1, case insensitive. utf8 isn’t utf8. utf8mb4 is utf8.

                  I could go on.

              2. 3

                This is great thanks! Is there anything you’d say specifically if someone asked you why to chose Postgres over MySQL? Let’s say they’re using AWS and not Citus. I had a hard time making an argument for Postgres at my current job where basically all the SQL databases are MySQL. In my case it’s for analytics and upserts are really useful which at the time Postgres didn’t have. HyperLogLog is also a really useful feature but AFAIK you can’t add the extension on AWS RDS.

                1. 2

                  For me the biggest thing is around flexible datatypes (arrays, hstore, range types, and JSONB) along with the corresponding indexes that can use used with them like GIN and GiST. And then truly extensions, even if not using HyperLogLog you might need geospatial support via PostGIS, maybe full text search, maybe foreign data wrappers, and there’s a much longer list.

                2. 1

                  For me, the only reason to choose PGSQL over MySQL is the sane default – reliability rather than performance - less care, less effort - lower maintenance cost.

                  Because I’m not an DB expert, and have not enough resource to hire another DB expert to configure db server for me.

                  If I get a performance issue, which means a great success and much money, then I would hire a db expert and let them to do whatever they want.