1. 34
    1. 10

      We use RabbitMQ for similar things at my work, but I have never run into this problem. Without knowing more of the technical details I couldn’t say for sure why that is, but my gut instinct is that it has to do with the queue topology. The way we use it, messages are always sent to topic exchanges and never directly to a queue; publishing straight to a queue always seemed to me sort of antithetical to how AMQP is designed to be used.

      Workers come in two flavors. Either we need a persistent queue, in which case we establish a simple persistent queue, bind it to the exchange, and then have workers reading from it; RabbitMQ automatically round-robins messages to the workers. I have never really seen the bad behavior described in the article in this arrangement, but it seems like I should have if the article’s analysis is correct. The other scenario we have is where the queue doesn’t need to be persistent; there the worker creates an unnamed, transient, auto-delete queue and binds that to the exchange.

      I do have one scenario in which we have data in Postgres and a RabbitMQ queue related to the same work. We did this because I thought it was unsafe to use Postgres as a queue with multiple workers, on account of MVCC and race conditions. So what I do there is something akin to: create a transient queue in RabbitMQ, load it up with data from the database, and then have workers reading off that queue; as they finish, they mark things done in Postgres as well. It would be interesting to explore a solution using row locking as described in the article.

      I really like using RabbitMQ, because it’s very responsive, and the whole AMQP topology concept is very interesting to me and affords a lot of possibilities. I don’t see myself abandoning it, although it would be good to eliminate the double-entry system I described above if it isn’t really necessary. Overall, I see AMQP as enabling a kind of cross-language, event-driven system integration platform. If you just need a simple queue, it is probably overkill.

      1. 10

        unsafe to use Postgres as a queue with multiple workers, on account of MVCC and race conditions.

        The rest of this post all makes sense to me, but PostgreSQL has much, much better tools to deal with “ensure only one worker processes this, and only with the latest version of the data” than AMQP does.

        1. 1

          Please elaborate.

          1. 6

            SELECT … FOR UPDATE SKIP LOCKED

            1. 1

              Right, and this also means that jobs are not available to be worked on until you commit the transaction that inserts them, so workers cannot pick up a job before the associated records are present.
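
              A minimal sketch of what that looks like (the orders table and column names here are made up; the task table matches the snippet elsewhere in this thread):

              -- Create the business data and enqueue the job in ONE transaction.
              -- Workers claiming tasks with SELECT ... FOR UPDATE SKIP LOCKED cannot
              -- see the new task row until COMMIT, so they never pick up a job whose
              -- associated order row does not exist yet.
              BEGIN;
              INSERT INTO orders (customer_id, total) VALUES (42, 99.00);  -- hypothetical table
              INSERT INTO task (task_type, params)
              VALUES ('send_receipt', '{"customer_id": 42}');
              COMMIT;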

            2. 1

              I use the following code in a few different projects:

              DELETE FROM task
              WHERE task_id IN
              ( SELECT task_id
                FROM task
                ORDER BY RANDOM()
                FOR UPDATE
                SKIP LOCKED
                LIMIT 1
              )
              RETURNING task_id, task_type, params::JSONB AS params
              

              Note that ORDER BY RANDOM() scans and sorts the whole table, so it is not ideal for larger tables; use TABLESAMPLE instead.
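
              For comparison, a TABLESAMPLE variant might look roughly like this (untested sketch; SYSTEM sampling picks random pages rather than random rows, and on a small or heavily locked table it can come back empty, so callers should retry or fall back to a plain scan):

              DELETE FROM task
              WHERE task_id IN
              ( SELECT task_id
                FROM task TABLESAMPLE SYSTEM (1)  -- sample roughly 1% of the table's pages
                FOR UPDATE
                SKIP LOCKED
                LIMIT 1
              )
              RETURNING task_id, task_type, params::JSONB AS params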

    2. 4

      Here I thought Postgres already had queue functionality built in. I may be thinking of its pub/sub stuff, though.

      Anyone know what the gotchas for this sort of approach may be? I can imagine lots of different jobs fighting over access to the same table, but that already happens in databases anyway…

      1. 4

        If you are sharing a cluster, then saturating connections is definitely a concern.

      2. 4

        Yeah, you are thinking of notification channels. They can’t be used for queues because messages posted to these channels are not stored (like Redis pub/sub). So if a listener misses a message due to, e.g., downtime, the message is just lost.

        1. 7

          It can be used to wake the workers, though, and let them try SELECT … FOR UPDATE SKIP LOCKED and perhaps acquire a job to process, without them needing to poll constantly.
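
          A rough sketch of that pattern (the trigger, function, and channel names are just examples, and the task table is borrowed from the snippet elsewhere in the thread):

          -- Fire a notification whenever a new task row is inserted.
          CREATE OR REPLACE FUNCTION notify_new_task() RETURNS trigger AS $$
          BEGIN
            PERFORM pg_notify('new_task', NEW.task_id::text);
            RETURN NEW;
          END;
          $$ LANGUAGE plpgsql;

          CREATE TRIGGER task_inserted
          AFTER INSERT ON task
          FOR EACH ROW EXECUTE FUNCTION notify_new_task();  -- EXECUTE PROCEDURE on Postgres < 11

          -- Each worker connection runs LISTEN and sleeps until a notification
          -- arrives, then attempts a SELECT ... FOR UPDATE SKIP LOCKED claim.
          LISTEN new_task;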

      3. 4

        I haven’t actually tried this, but I posted it a year ago:

        What is SKIP LOCKED for in PostgreSQL? Most work queue implementations are wrong

        And a comment there suggests that this may be over-optimizing for FIFO semantics?

        Also related: https://news.ycombinator.com/item?id=29599132

      4. 3

        Because Postgres uses MVCC, I think you need to worry about different transactions seeing completely different queue states. I think you could certainly create a race condition by having two separate transactions both trying to accept the same item at the same time. I believe that, for these reasons, building a queue is considered a “no-no” in Postgres-land; I remember reading a great article about it some years back.

        However, in searching for that article (which I couldn’t find), I found a lot of articles describing how people had successfully built a queuing system in Postgres. So it may be either that the failure scenarios are really far-fetched or that there are techniques (like the row locking described in the article) which make it reliable.

        1. 12

          The transaction isolation level, which is independent of MVCC or any other underlying implementation of it, keeps workers from reading the wrong state value out of the row. The problem with using Postgres as a queue used to be its row locking: a plain SELECT … FOR UPDATE makes concurrent workers queue up behind one another. In the past few years they’ve extended the locking model to support the queue use case (FOR UPDATE SKIP LOCKED, added in 9.5), allowing concurrent workers to claim and update rows without blocking each other.

          1. 5

            The crates.io job queue does it by using SELECT … FOR UPDATE SKIP LOCKED. The job runners lock the first open row in the table, work it, then either delete the row on success or update its state to failed.
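
            Roughly this shape, as a sketch (not the actual crates.io code; assumes a jobs table with id, job_type, data, and status columns, and :claimed_id stands in for the id returned by the claim):

            BEGIN;
            -- Claim the oldest queued job; SKIP LOCKED steps over rows other workers hold.
            SELECT id, job_type, data
            FROM jobs
            WHERE status = 'queued'
            ORDER BY id
            LIMIT 1
            FOR UPDATE SKIP LOCKED;

            -- ... run the job in application code while the transaction keeps the row locked ...

            DELETE FROM jobs WHERE id = :claimed_id;                      -- on success
            -- UPDATE jobs SET status = 'failed' WHERE id = :claimed_id;  -- on failure
            COMMIT;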

            1. 1

              This is what I’ve done, although never deleting the rows, just setting the state (it’s easier to run the deletes as a cleanup pass during times of lower load).
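
              The deferred cleanup is then just a periodic batch delete, something like this sketch (assumes a state column on the queue table):

              -- Run during low-load hours: drop completed rows in modest batches
              -- so each pass produces a bounded amount of vacuum work.
              DELETE FROM task
              WHERE task_id IN (
                SELECT task_id FROM task WHERE state = 'done' LIMIT 10000
              );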

              1. 2

                I would have thought that doing an UPDATE would have the same impact as a DELETE: both create dead rows that must be vacuumed. So you double the number of rows that need vacuuming.

                There are two edge cases I can think of where it can be worth it. One is if the table has big columns with TOASTed data. The other is if the table has a ton of indexes. In both cases the UPDATE might avoid dealing with those “right now”, so if your DB is stressed during peak but has low-load periods, it can make sense to batch the deletes.

                1. 2

                  Postgres has HOT (heap-only tuple) updates: if no indexed column changes and the new row version fits on the same page, the update skips index maintenance and the old version can be pruned without a vacuum pass.

                  I wonder if that can be used to avoid queue bloat. https://www.postgresql.org/docs/current/storage-hot.html

                  1. 1

                    Yes, we set our fillfactor to something like 10%, or whatever the value is that leaves more free space on each page for HOT updates.
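
                    For reference, that’s a per-table storage parameter, something like this (10 is just an example value; it means pages are only filled to 10% on insert, leaving the rest free for HOT updates):

                    ALTER TABLE task SET (fillfactor = 10);
                    -- Only affects newly written pages; existing data isn't repacked
                    -- until the table is rewritten (e.g. VACUUM FULL or CLUSTER).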

                  2. 1

                    I had forgotten about HOT updates! Looks like they don’t apply if you have an index on the affected column, however. That might be workable for a queue table if it’s small enough that not indexing the column is acceptable, but that then raises the question of whether deferring the deletes is necessary at all.

                2. 1

                  https://www.postgresql.org/docs/current/storage-hot.html

                  I don’t think this optimization applies to deletes.

        2. 2

          As long as your worker accepting a job from the queue involves a write to Postgres, you’re fine. If you have a bunch of workers accepting work at the same time you can get a lot of transaction retries, so it can be useful to serialize workers’ access to the queue table. And of course, if your job queue is high volume, you can shard it and serialize access to the different shards. But these are all performance optimizations and not needed for correctness.
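
          One way to serialize the claim step without schema changes is a transaction-scoped advisory lock, roughly like this sketch (42 is an arbitrary application-chosen lock key):

          BEGIN;
          -- Only one worker at a time gets past this call; the lock is released at
          -- COMMIT/ROLLBACK. Use pg_try_advisory_xact_lock(42) to skip instead of wait.
          SELECT pg_advisory_xact_lock(42);
          -- ... claim a row from the queue table and do the state-changing write here ...
          COMMIT;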

    3. 3

      Exchanges, queues, topics, streams, etc. RabbitMQ always struck me as rather laden with terminology for something that everybody says is so simple.

      I know it’s very solid, and for some of our high-volume event catchers it ran like a beast, but for most applications, if I could get away without having to run it, I would try to. And if something like this were necessary, why not just go straight to Kafka?

    4. 2

      I remember that when I looked at doing something similar 10 years or so ago with MSSQL, you needed to be careful about message volume due to hot pages in the table storage. If I recall correctly, you could end up with a lot of used space in tempdb, or perhaps page fragmentation.

      I wonder if this is a concern for postgres?

      1. 2

        Almost certainly. At a certain scale you’re likely to need something that can shard over multiple machines (Kafka or its more modern successors).

    5. 2

      As a rookie cloud native developer, I made the mistake of using Cassandra as a job queue just because we were using it to store persistent data and it was all configured with correct access controls.

      It worked great until we started to scale. Just removing the queue entries after processing each task resulted in so many tombstones that it caused a major degradation. We ended up migrating to Kinesis almost overnight, and that has been stable so far.