1. 22

It’s nothing new, but I found out about this today and found it very useful. If you ever need to use it from a JVM language, there’s an advanced JDBC driver that allows you to listen for events natively without having to poll the database (https://impossibl.github.io/pgjdbc-ng/).
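
For reference, here is a minimal sketch of roughly how that looks with pgjdbc-ng’s listener callback (connection details and the channel name are made up, and the method names are from the project docs as I recall them):

  import com.impossibl.postgres.api.jdbc.PGConnection;
  import com.impossibl.postgres.api.jdbc.PGNotificationListener;

  import java.sql.DriverManager;
  import java.sql.Statement;

  public class ListenExample {
      public static void main(String[] args) throws Exception {
          // pgjdbc-ng uses the jdbc:pgsql: URL scheme; host/db/credentials here are assumptions
          PGConnection conn = DriverManager
              .getConnection("jdbc:pgsql://localhost:5432/mydb", "app", "secret")
              .unwrap(PGConnection.class);

          // Keep a strong reference to the listener; the driver invokes it per NOTIFY, no polling
          PGNotificationListener listener = new PGNotificationListener() {
              @Override
              public void notification(int processId, String channel, String payload) {
                  System.out.println(channel + ": " + payload);
              }
          };
          conn.addNotificationListener(listener);

          try (Statement stmt = conn.createStatement()) {
              stmt.execute("LISTEN my_channel"); // hypothetical channel name
          }

          Thread.sleep(Long.MAX_VALUE);          // keep the demo process alive
      }
  }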

  1.  

  2. 7

    We use this to sync newly ingested data across all backend instances so they can push it out over their open websocket connections; that way every browser instance receives the same newly ingested data no matter which backend ingested it. It’s easier to re-use PostgreSQL than to introduce some new system just for this purpose.
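
    The sending side is just one extra statement once the ingest commits; a rough JDBC sketch (channel name and payload convention are invented for the example):

      import java.sql.Connection;
      import java.sql.PreparedStatement;
      import java.sql.SQLException;

      class IngestNotifier {
          // After committing the ingested rows, tell every listening backend about them.
          static void notifyIngest(Connection conn, String batchId) throws SQLException {
              try (PreparedStatement ps = conn.prepareStatement("SELECT pg_notify(?, ?)")) {
                  ps.setString(1, "ingested_data"); // hypothetical channel
                  ps.setString(2, batchId);         // keep payloads small; listeners re-query for details
                  ps.execute();
              }
          }
      }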

    1. 5

      Super handy! I was looking into it recently as I’m prototyping a job processing system on top of PG (with select ... for update skip locked, and triggers on top of that).
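
      The claim query for that pattern tends to look something like this (table and column names are made up); SKIP LOCKED lets concurrent workers pass over rows another transaction has already locked instead of blocking on them:

        import java.sql.Connection;
        import java.sql.ResultSet;
        import java.sql.Statement;

        class JobClaimer {
            // Mark one pending job as running and return its id, or null if nothing is claimable.
            static Long claimOne(Connection conn) throws Exception {
                String sql =
                    "UPDATE jobs SET status = 'running' " +
                    "WHERE id = (SELECT id FROM jobs WHERE status = 'pending' " +
                    "            ORDER BY id FOR UPDATE SKIP LOCKED LIMIT 1) " +
                    "RETURNING id";
                try (Statement st = conn.createStatement();
                     ResultSet rs = st.executeQuery(sql)) {
                    return rs.next() ? rs.getLong("id") : null;
                }
            }
        }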

      1. 4

        for update skip locked works great. Be sure to tell no one on the Internetz you’re using it though, or you’ll get a torrent of “use a real queue system~!” nonsense thrown at you.

        1. 2

          Ha, I know - my long-term plan is to replace RabbitMQ with the new system so we can offer per-tenant queues, something that’s nearly impossible to do in Rabbit at scale (or in most queueing systems, for that matter).

          1. 1

            I was under the impression you could do this with Kafka, but I know very little about it. Regardless, I would also use PostgreSQL or MySQL.

            1. 4

              Not really - Kafka topics are a resource of their own, so a per-customer topic is not a great strategy in the long term. Segment has a great write-up about building something like what I have in mind, but at a much, much larger scale: https://segment.com/blog/introducing-centrifuge/

              1. 1

                Great post! That’s almost exactly how I’d design a job queue. For example, the Director in-memory cache fits their use case perfectly, but doesn’t make sense for everyone. I love their idea of rotating JobDB instances; it’s like SSTable compaction, but for MySQL.

        2. 1

          That’s what I use it for; it maps very nicely to Go’s channels, and I love not bringing in some other system to run. Then again, my stuff isn’t public and I’m the only user at the moment, so we’ll see if I end up regretting rolling my own at some point.

          I based mine off Ruby’s queue_classic.

        3. 3

          This is very nice in theory, and having a simple queuing mechanism without any additional infrastructure is appealing. BUT:

          • it plays poorly with connection pooling;
          • if you’re using it on the JVM, the driver you linked to has several issues and is not fully compatible with the official PostgreSQL JDBC driver;
          • that driver also seems to have undefined or hard-to-understand behavior around network partitions and dropped connections;
          • notifications can be “blocked” by long-running queries on the same connection;
          • while it is useful to take advantage of transactions when sending messages, transactions are per-connection, and that plays poorly with the aforementioned problem of queries sharing the same connection.

          All in all, this feature was useful but a bit painful and we ultimately ended up dropping it.

          1. 4

            One important limitation: you can’t use it with a connection pooling proxy (pgbouncer). So as soon as you need to manage hundreds of connections to the DB, it’s not usable.

            1. 3

              I’m a bit of a noob here - doesn’t that depend on the pool mode (session, transaction, statement)?

              1. 5

                Sorry, yes, you are right! Session pooling allows all PostgreSQL features.

                However, session pooling is only useful for limited use cases, such as maintaining a long-lived connection for short-lived clients. In most cases people use transaction pooling to handle more connections than Postgres would otherwise allow.
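
                For anyone following along, the mode is a single setting in pgbouncer.ini; the values below are illustrative only:

                  [databases]
                  mydb = host=127.0.0.1 port=5432 dbname=mydb

                  [pgbouncer]
                  listen_port = 6432
                  ; 'session' keeps LISTEN/NOTIFY and other per-session features working; 'transaction' does not
                  pool_mode = transaction
                  ; connections PgBouncer will accept from clients
                  max_client_conn = 1000
                  ; actual PostgreSQL connections per database/user pair
                  default_pool_size = 20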

                1. 1

                  Got it! We’re currently using connection poolers at the application level (Clojure + the hikari-cp connection pool) and it’s been working great so far - it gives us session-mode connections without overloading our PG instances. I wonder if at some point we will need something like pgBouncer.
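
                  For reference, the equivalent setup in plain Java terms (hikari-cp is a thin wrapper around HikariCP; every value below is made up):

                    import com.zaxxer.hikari.HikariConfig;
                    import com.zaxxer.hikari.HikariDataSource;

                    class AppPool {
                        static HikariDataSource create() {
                            HikariConfig cfg = new HikariConfig();
                            cfg.setJdbcUrl("jdbc:postgresql://localhost:5432/mydb"); // assumed URL
                            cfg.setUsername("app");
                            cfg.setPassword("secret");
                            cfg.setMaximumPoolSize(10); // hard cap on connections this instance opens to PG
                            cfg.setMinimumIdle(2);      // keep a couple of connections warm
                            return new HikariDataSource(cfg);
                        }
                    }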

                  1. 3

                    Depending on your architecture, you can use a single dedicated connection per worker process to LISTEN and dispatch notifications to other threads within the process. Then you can use transaction pooling for all your other threads, reducing the total number of connections needed.

                    1. 1

                      Yes, that’s more or less what I had in mind - the notification listener doesn’t have to process the jobs itself. Are you aware of any existing open source solutions that implement this kind of strategy? All I could find are Python and Ruby implementations where threads are not really a thing.

                      1. 3

                        I don’t know of any, but it’s pretty straightforward. Especially since your dedicated listen thread only needs to LISTEN, you can still NOTIFY normally through a transactional connection pool.

                        Have each worker thread create a promise, .put to a shared Java BlockingQueue, then deref the promise (blocking). Have the dedicated LISTEN thread .take from the queue, send LISTEN to PostgreSQL, and then (deliver worker-promise notify-payload). Taking from the promise queue before sending LISTEN to PostgreSQL ensures you don’t LISTEN before you can use the payload.

                        You could even use a dedicated connection pool in the LISTEN thread with maximumPoolSize=1, reusing all of Hikari’s reconnect logic.
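
                        In Java terms, a rough analogue of that promise/queue pattern might look like this (waitForNotification is a placeholder for however you actually receive the NOTIFY on the dedicated connection):

                          import java.util.concurrent.BlockingQueue;
                          import java.util.concurrent.CompletableFuture;
                          import java.util.concurrent.LinkedBlockingQueue;

                          class ListenDispatcher {
                              // Worker threads park a "promise" here; the dedicated LISTEN thread completes it.
                              private final BlockingQueue<CompletableFuture<String>> waiters = new LinkedBlockingQueue<>();

                              // Worker thread: like (.put queue p) followed by (deref p).
                              String awaitNotification() throws InterruptedException {
                                  CompletableFuture<String> promise = new CompletableFuture<>();
                                  waiters.put(promise);
                                  return promise.join();
                              }

                              // Dedicated LISTEN thread: take a waiter first, then LISTEN, then deliver.
                              void run() throws Exception {
                                  while (true) {
                                      CompletableFuture<String> promise = waiters.take();
                                      String payload = waitForNotification(); // placeholder: LISTEN + block on the one connection
                                      promise.complete(payload);              // like (deliver worker-promise notify-payload)
                                  }
                              }

                              private String waitForNotification() throws Exception {
                                  throw new UnsupportedOperationException("wire in pgjdbc-ng's listener or a polling loop here");
                              }
                          }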

                        1. 2

                          That’s super helpful - thank you! I guess I could also play around with core.async instead of BlockingQueue, although that’s more an implementation detail.

                          1. 2

                            You probably could; promises and async code go hand in hand.

                            Unpopular opinion: unless you’re writing something approximately similar to a load balancer, async is a waste of time. It’s 2020, mutexes are fast, context switching is cheap, and asynchronous code comes with plenty of poorly understood problems.

                            1. 2

                              Not sure it’s an unpopular opinion - I’ve been avoiding core.async as much as possible, as on the server there aren’t many reasons to use it and it doesn’t solve any problems for me (or my team - we have ~12 Clojure applications in production right now).

                              I’m also not convinced it’s useful in ClojureScript, but admittedly I haven’t built anything significant enough to have a strong opinion.

                        2. 2

                          > the notification listener doesn’t have to process the jobs itself. Are you aware of any existing open source solutions that implement this kind of strategy?

                          Take a look at rxJava (or rxCPP [1] or rx.NET – same semantics, different languages). E.g. review this article:

                          https://vlkan.com/blog/post/2016/07/20/rxjava-backpressure/

                          I am assuming that:

                          • a) you are within a single process;
                          • b) you want one producer (e.g. your listener), although multiple producers will work;
                          • c) you want multiple non-blocking consumers within the same process as the producer;
                          • d) you want some control over backpressure (e.g. slow the producer down if some consumers cannot keep up), i.e. a queue with backpressure monitoring and multiple clients.

                          Of course, if your listeners are out of process (and/or running on different machines), then rxJava will not help.

                          I will also say that rxJava has been really helpful just in writing mobile app clients (Android), because there are a number of things when writing these clients that require reacting to ‘events’ (e.g. responses coming from backends, purchase/subscription acks from the Play Store, even user clicks, etc.). So this is not just for server-side code.

                          In my view, the rx set of libraries, their semantics, and their use cases are exceptionally useful to learn, and that knowledge will not become ‘irrelevant’ in a few years; instead, you will be able to utilize it (and probably the same APIs) for the next decade or more, even in programming languages that do not have native threads.

                          [1] https://github.com/ReactiveX/RxCpp
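
                          A tiny sketch of the shape this takes in rxJava 3 (the producer is faked here; in practice onNext would be fed from the NOTIFY listener):

                            import io.reactivex.rxjava3.core.BackpressureStrategy;
                            import io.reactivex.rxjava3.core.Flowable;
                            import io.reactivex.rxjava3.schedulers.Schedulers;

                            public class NotifyStream {
                                public static void main(String[] args) throws InterruptedException {
                                    // Fake producer standing in for the LISTEN thread; each onNext is "a NOTIFY payload".
                                    Flowable<String> notifications = Flowable.create(emitter -> {
                                        for (int i = 0; i < 100 && !emitter.isCancelled(); i++) {
                                            emitter.onNext("payload-" + i);
                                        }
                                        emitter.onComplete();
                                    }, BackpressureStrategy.BUFFER); // buffer if the consumer can't keep up

                                    notifications
                                        .observeOn(Schedulers.computation()) // consumers run off the producer's thread
                                        .subscribe(
                                            payload -> System.out.println("processing " + payload),
                                            Throwable::printStackTrace);

                                    Thread.sleep(2000); // keep the demo JVM alive while the async consumer drains
                                }
                            }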

                2. 2

                  Server hardware is getting better and better, with more and more cores. The standard 100 connection limit of PostgreSQL is making less and less sense as time goes on.

                  Fun fact: I just installed PostgreSQL on a server with a 3900X and it compiled in less than a minute with make -j24. (Yes, it is a completely different workload. But it’s still really cool!)

                  edit: missing “on”

                3. 2

                  Envelope passes PostgreSQL NOTIFYs through websocket connections: https://github.com/workflowproducts/envelope

                  I used this recently to detect data updates in a webpage and automatically fetch the new records.

                  (Shameless self-plug, I developed the Envelope server component for $work)

                  1. 2

                    (Another shameless self-plug)

                    I also wrote a notification provider that works with iOS and Android using PostgreSQL NOTIFYs as the basis.

                    https://github.com/wickednotify/wickednotify-server for those interested. There are full tutorials for iOS and Android usage.