So the tl;dr is that an insert-heavy workload benefited from batch writes?
These tools are awesome, and diving deep into the source code to find the specific things that make batching more efficient is very educational (apparently some table-related metadata is cached).
For applications that can tolerate bounded latency on their writes (even if it’s only 1s), batch writes are usually a huge win.
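To make the "bounded latency" idea concrete, here is a minimal Python sketch of a size-or-time batching buffer. It is not taken from the article: `BatchWriter`, `flush_fn`, `max_rows`, and `max_delay_s` are hypothetical names, and `flush_fn` stands in for whatever performs one multi-row write.

```python
import time


class BatchWriter:
    """Buffer rows and hand them to `flush_fn` in batches.

    Hypothetical helper: `flush_fn` is assumed to perform a single
    multi-row write (for example one multi-row INSERT).
    """

    def __init__(self, flush_fn, max_rows=500, max_delay_s=1.0,
                 clock=time.monotonic):
        self._flush_fn = flush_fn
        self._max_rows = max_rows
        self._max_delay_s = max_delay_s
        self._clock = clock          # injectable so the logic can be tested
        self._rows = []
        self._oldest = None          # arrival time of the oldest buffered row

    def add(self, row):
        if not self._rows:
            self._oldest = self._clock()
        self._rows.append(row)
        self.poll()

    def poll(self):
        """Flush if the batch is full or its oldest row has waited too long."""
        if not self._rows:
            return
        full = len(self._rows) >= self._max_rows
        stale = self._clock() - self._oldest >= self._max_delay_s
        if full or stale:
            self.flush()

    def flush(self):
        """Write out whatever is buffered, if anything."""
        if self._rows:
            batch, self._rows = self._rows, []
            self._oldest = None
            self._flush_fn(batch)
```

Note that this sketch only checks the deadline when `add` or `poll` is called, so something (a timer or background task) would have to call `poll` periodically for the 1s bound to hold when writes stop arriving.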
Yes, but that wasn’t clear at the beginning. We had (incorrectly) assumed CPU was mostly spent on evaluating the partial index predicates. Based on that assumption, we thought batching wouldn’t have much of an effect. It wasn’t until we actually examined what the CPU was being used for that we realized our assumption about CPU usage was completely wrong and that batching would actually have a dramatic impact.