I really would like to see the queries and explains for Postgres and Virtuoso, or at least know what algorithm was attempted. For Postgres to be that much slower, there has to be some algorithmic issue.
You can see the search path query here. I haven’t analyzed it yet to determine how efficient it is.
I saw that, and wasn’t 100% sure that’s the one they used. For example, does that query work on both Postgres and Virtuoso?
If that is the query, it’s terribly inefficient. They copied the Postgres docs example to enumerate an entire subtree, and modified it to return only the shortest path. So to compute the shortest path between A and Z, it starts at A, lists all of A’s connections, and scans them for Z. It then non-uniquely lists of A’s second degree connections. So if A knows D via B and C, it will list A B D and A C D, and then double-search for D in the next stage. This is particularly bad in social networks, where 2nd degree connections are likely to also be 1st degree connections.
But this isn’t a tree, it’s a graph. Instead, they should only keep the shortest way to get to any point, and work from there. So if they have A C, and they discover A B C, we should throw that out and keep A C only.