1. 5

I am trying to find a whitepaper that I read many years ago (I think around 2016) that was looking at the performance of specialised graph databases vs using a relational database (which I am reasonably sure was Postgres).

I remember the conclusion was that Postgres was as fast as or faster than the graph database they tested against, apart from a few edge cases. I also think at least one of the authors was a Google employee.

My searching skills are letting me down, so I was hoping someone else might remember this whitepaper.

Alternatively, I am interested in similarly themed papers!

  1. 2

    I remember skimming it but can’t seem to find it now either, sorry. FWIW though, YMMV, and just my opinion, etc, but: most of the value I get from using Neo4j a lot is not its raw speed, though that’s very good for my purposes, but natively using Cypher/CQL. I find it so much more natural than SQL & tables to model not only long chains of relationships but in fact pretty much any relational data, and it’s plenty fast enough. IMHO, etc.

    1. 2

      Interesting! Cypher is on my list of things to have a look at, as I’ve only seen it in a book about neo4j, and not had any practical experience with it.

      Having said that, most services I interact with have postgres (or similar) as their primary datastore, so when we need to do something, I’d rather try and not introduce a new datastore unless it’s actually necessary - hence looking for this paper again!

      1. 2

        I guess you guys are aware of https://github.com/apache/incubator-age ?

        1. 2

          I was not aware, thank you so much!

          1. 2

            V interesting, thanks. I’d seen AgensGraph before and if this supports all that AG does but as a Postgres extension rather than a fork that would be awesome. I don’t think it can support all the apoc.* procedures that make Neo4j so comprehensive though so I wouldn’t be able to port existing projects to it - but for new projects or ones which require multi-model or mostly straight-up graph querying then it could be really powerful given how widespread and operationally strong PG is. Nice one.

          2. 1

            Fair enough!

        2. 2

          I don’t think this is the exact paper you’re looking for, but a paper in a similar theme is “Scalability! But at what COST?”, a perennial favorite with my team. It is a fairly snarky exploration of how the pursuit of “scalability” for its own sake has meant many distributed processing systems are in fact slower than a reasonably fast naive implementation running on a single thread.

          https://www.usenix.org/system/files/conference/hotos15/hotos15-paper-mcsherry.pdf

          1. 2

            Yes, this is a good paper!

          2. 2

            Although it doesn’t compare to Postgres, maybe this paper comparing Neo4j with one commercial RDBMS could be interesting? http://ceur-ws.org/Vol-1810/GraphQ_paper_01.pdf

            It also contains some pointers to other benchmark papers at the end.

            1. 1

              This is the paper I was looking for! Thank you so much!

              As you say, not postgres; I probably muddled that with something else about postgres (possibly the comparison to mongodb presentation, thinking about it).

              1. 1

                You’re welcome! I just noticed that “RDBMS A” and “RDBMS B” do not stand for two different commercial RDBMSs, but rather for two different ways to organize edges in the same RDBMS.