
      In fiddling around with DuckDB, things gracefully spilled to disk when a query needed one big hash or sort, but if it needed two or more, you could still hit OOMs; it seemed each operation assumed it could use up to the whole memory limit. (The docs say, “If multiple blocking operators appear in the same query, DuckDB may still throw an out-of-memory exception due to the complex interplay of these operators.”) As a workaround you could materialize step one into a temporary table and run step two as a separate query.
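
      A sketch of that workaround, with made-up table and column names — the point is just that each query then contains only one blocking operator:

      ```sql
      -- Step 1: materialize the first blocking operation (a big
      -- aggregation) into a temporary table as its own query.
      CREATE TEMPORARY TABLE daily_totals AS
      SELECT user_id,
             date_trunc('day', ts) AS day,
             sum(amount) AS total
      FROM events
      GROUP BY user_id, day;

      -- Step 2: run the second blocking operation (a big sort) as a
      -- separate query, so it gets the memory budget to itself.
      SELECT *
      FROM daily_totals
      ORDER BY total DESC;
      ```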

      One positive vs. other things I tried is that DuckDB’s optimizer could more often find a plan that minimized memory needs, e.g. if you joined a lot of rows to a few, DuckDB would build the hash table on the small side.
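
      A hypothetical illustration (the `fact`/`dim` names are made up): with a huge `fact` table and a small `dim` table, the planner should pick `dim` as the hash-join build side, keeping the in-memory hash table small, and `EXPLAIN` lets you check which side it chose:

      ```sql
      -- EXPLAIN prints the chosen physical plan; the build side of the
      -- hash join should be the small dim table, not the big fact table.
      EXPLAIN
      SELECT f.id, d.label
      FROM fact f
      JOIN dim d ON f.dim_id = d.id;
      ```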

      There was some recent discussion about DuckDB stability that’s probably relevant to folks considering it at the kind of scale where external-memory stuff’s important.
