1. 20

  2. 15

    Strange that this entire piece does not mention SSDs. The price of SSDs has also dropped a ton in the last few years. And as a result, disks are not nearly as slow as they used to be.

    The combination of big-RAM and big-SSD machines (e.g. AWS i3 instances) can give you, today, a cloud machine for $5/hour with 488 GiB of RAM and 15 TiB of SSD, where the persistent SSD delivers 16 GiB/second of sequential read throughput and the RAM is probably 10x faster than that, or better (source).

    So, it’s not so much that “RAM is the new disk” – it’s more that I/O, on the whole, is getting way faster. Bigger data sizes can be handled by single nodes much more easily. Some of the fundamental assumptions that drove the design of database technologies (optimized for “reducing disk seeks” and “minimizing memory usage”) are being bent and broken. “Spinning rust” is truly a thing of the past.

    That’s also why projects like Apache Arrow and Apache Parquet are so important. Open source can help us shift our data structures from row-oriented (which is, in some ways, “more natural”) to column-oriented (which is way faster for scan-heavy analytical queries when RAM is cheap).
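
    To make the row vs. column distinction concrete, here is a minimal sketch using only the Python standard library. It is not how Arrow or Parquet are implemented; the `rows` data, the field names, and both helper functions are made up for illustration. The point is just that an aggregate over one field only has to scan one contiguous array in the columnar layout.

    ```python
    from array import array

    # Row-oriented: each record keeps all of its fields together.
    # (Hypothetical sample data, for illustration only.)
    rows = [
        {"id": 1, "price": 9.5, "qty": 3},
        {"id": 2, "price": 4.0, "qty": 10},
        {"id": 3, "price": 12.25, "qty": 1},
    ]

    # Column-oriented: each field is stored as its own typed, contiguous array.
    columns = {
        "id": array("q", (r["id"] for r in rows)),
        "price": array("d", (r["price"] for r in rows)),
        "qty": array("q", (r["qty"] for r in rows)),
    }


    def total_price_rows(rows):
        # Visits every record object just to read one field from each.
        return sum(r["price"] for r in rows)


    def total_price_columns(columns):
        # Scans a single array; the "id" and "qty" columns are never touched.
        return sum(columns["price"])


    row_total = total_price_rows(rows)
    col_total = total_price_columns(columns)
    print(row_total, col_total)
    ```

    Both layouts give the same answer; the difference is how much unrelated data the scan has to walk past, which is roughly the effect that columnar formats exploit at much larger scale.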

    Also, Michael Stonebraker (of Ingres/Postgres fame) gave a talk about the shift from row-based to column-based architectures, which you can watch here.

    1. 2

      On a single drive, SMR sequential read is faster than sequential read on SSD. The fact that AWS is unable to give you this high read throughput is an expertise issue on their part.

      I think the big shift is not HDD -> SSD, which will largely always be a cost/workflow thing (HDD is the new tape?), but that SSDs might get fast enough to replace RAM in many applications, which has ginormous implications for power usage.

    2. 2

      Anybody else here old enough to remember copying your entire dev environment into a RAM disk because everything was floppy-based and hard drives were outside the bounds of what a normal human could afford? :)

      I had a custom C dev config for my Amiga 2000 like this. Copy everything into the RAM disk and work from there, copying back to floppy when you had something worth saving.

      1. 2

        I remember!

      2. -1

        Wow, this is a very insightful piece!