I’m not sure what’s unique to Elixir or Erlang/BEAM here, but it’s an excellent tutorial on how to measure and visualise what’s happening, and how to fix it.
Whenever I’m writing code that’s going to deal with large or potentially large data, I make a decision as to whether I’m happy to write less code by just loading it all up into RAM or more code to handle streaming it.
If I’m writing a tool that I’ll run manually from the command line, and I’m confident I can let it eat whatever RAM it needs while it runs, I might take the shortcut. In nearly every other scenario I stream the data.
I usually use, or write, wrappers so I don’t have to do the stream handling over and over again. I often end up writing these myself, because suitable ones either don’t exist or lack the flexibility I need (especially in error handling).
A lack of easy streaming options to reach for encourages falling back to reading whole files and responses to RAM.
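To make the idea concrete, here is a rough sketch in Python of the kind of wrapper I mean. It isn’t taken from any library; `stream_map`, `read_chunks` and the `on_error` callback are names I made up for illustration. The point is only that the caller gets to decide what happens to a bad item without the whole input ever sitting in RAM.

```python
import io
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def stream_map(
    items: Iterable[T],
    transform: Callable[[T], U],
    on_error: Optional[Callable[[T, Exception], None]] = None,
) -> Iterator[U]:
    """Lazily apply transform to each item (hypothetical helper).

    If on_error is None, the first failure propagates to the caller.
    Otherwise the failing item and exception are handed to on_error
    and the stream carries on with the next item.
    """
    for item in items:
        try:
            result = transform(item)
        except Exception as exc:
            if on_error is None:
                raise
            on_error(item, exc)  # caller decides: log, collect, or re-raise
            continue
        yield result


def read_chunks(fileobj, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield chunks of at most chunk_size bytes from a binary file object."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # Sum the parseable lines one at a time, collecting the ones that fail.
    bad_lines = []
    lines = io.StringIO("1\n2\nnot a number\n4\n")
    total = sum(stream_map(lines, int, on_error=lambda item, exc: bad_lines.append(item)))
    print(total, bad_lines)
```

Because both helpers are generators, they compose: something like `stream_map(read_chunks(f), some_function)` processes a file chunk by chunk, and only one chunk is held in memory at a time.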
Does anyone have examples of ‘well designed’ helpers for streaming that can be used for inspiration?
BTW here’s an attractive-looking example of a streaming interface for ZIP file handling: https://github.com/ananthakumaran/zstream