Interesting. Back when we upgraded to PG12, I tested it and btree indexes were much smaller for me than hash indexes in many cases. I wrote about it here. And PG13 brings more updates to btree, so I would expect the difference to be even higher now.
We have a few postgres tables with primary keys storing UUIDs as varchars (unfortunately). While trying to improve the performance of a query joining ~10-50k of these rows, adding a hash index on this column improved the runtime of the join portion of the query somewhere around 30-50%.
Glad to see a positive article on the topic now that they’re “production ready”. At least for what I’ve come across in the postgres space, hash indexes are much less known than btree, gin, & gist… maybe for good reason if they have more of a niche use case, but still handy to have in the toolbelt.
Hey kppullin, that’s interesting. When I was researching this article, one of the things I wanted to check is if a hash index can speed up a hash join. I guess because they both mention hash, and probably share some internal mechanisms, I assumed they can benefit each other. However, I was unable to “engineer” such an example.
I’d also assume that a hash join could make good use of a hash index, at the very least so it doesn’t have to build the up any temporary hash tables. It’s been a bit too long and I’ve forgotten the specifics and didn’t save the plans. This certainly feels like a ‘your mileage may vary’ scenario… in our case it was a net benefit, and I haven’t found anywhere else yet in our system to try utilizing hash indexes, but I was really glad to run across it in this case!
Thanks again for taking the time to write the article (I struggle to even comment on things lol).
Interesting. Back when we upgraded to PG12, I tested it and btree indexes were much smaller for me than hash indexes in many cases. I wrote about it here. And PG13 brings more updates to btree, so I would expect the difference to be even higher now.
We have a few postgres tables with primary keys storing UUIDs as varchars (unfortunately). While trying to improve the performance of a query joining ~10-50k of these rows, adding a hash index on this column improved the runtime of the join portion of the query somewhere around 30-50%.
Glad to see a positive article on the topic now that they’re “production ready”. At least for what I’ve come across in the postgres space, hash indexes are much less known than btree, gin, & gist… maybe for good reason if they have more of a niche use case, but still handy to have in the toolbelt.
Hey kppullin, that’s interesting. When I was researching this article, one of the things I wanted to check is if a hash index can speed up a hash join. I guess because they both mention hash, and probably share some internal mechanisms, I assumed they can benefit each other. However, I was unable to “engineer” such an example.
I’m glad you liked the article ☺️
I’d also assume that a hash join could make good use of a hash index, at the very least so it doesn’t have to build the up any temporary hash tables. It’s been a bit too long and I’ve forgotten the specifics and didn’t save the plans. This certainly feels like a ‘your mileage may vary’ scenario… in our case it was a net benefit, and I haven’t found anywhere else yet in our system to try utilizing hash indexes, but I was really glad to run across it in this case!
Thanks again for taking the time to write the article (I struggle to even comment on things lol).