1. 8
  2. 9

    Definitely fascinating but why do they say

    Machine learning algorithms read the data back by decoding images and patterns that are created as polarized light shines through the glass.

    ?

    Is this a best-effort storage medium where the data that’s read back is just the best guess as to what’s there? Or does it store a bit-for-bit copy? Was the optimal mechanism discovered by ML but now simply used directly? Or did the writers use the words “machine learning algorithms” as buzzwords?

    1. 5

      ALL storage is just the best guess as to what’s there. You need error correcting codes since no underlying storage is perfect, but enough simultaneous errors can defeat any error correcting code.

      As for machine learning algorithms, modern error correcting codes use belief propagation, and belief propagation is usually described as a machine learning algorithm. It is not a buzzword.
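To make the "best guess" point concrete, here is a minimal sketch using a textbook Hamming(7,4) code. It is an illustration only, not the code used by Project Silica or by any particular drive, and the helper names (`encode`, `decode`, `flip`) are made up for this example. The decoder repairs any single flipped bit, but two simultaneous flips make it confidently return the wrong data.

```python
# Hamming(7,4): 4 data bits -> 7-bit codeword.
# Positions are 1-indexed; parity bits sit at positions 1, 2 and 4.

def encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def decode(codeword):
    """Return the decoder's best guess at the original 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-indexed position of the suspected bad bit (0 = none).
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

def flip(codeword, *indices):
    """Return a copy of the codeword with the given 0-indexed bits flipped."""
    c = list(codeword)
    for i in indices:
        c[i] ^= 1
    return c

if __name__ == "__main__":
    data = [1, 0, 1, 1]
    stored = encode(data)
    print("one error :", decode(flip(stored, 5)) == data)
    print("two errors:", decode(flip(stored, 0, 1)) == data)
```

Real storage systems use much stronger codes than this, but the trade-off is the same: the decoder always returns its most plausible reading, and past a certain number of errors that reading is wrong.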

      1. 4

        ALL storage is just the best guess as to what’s there.

        Of course, I guess I meant…more analog-y than digital.

        As for machine learning algorithms, modern error correcting codes use belief propagation, and belief propagation is usually described as a machine learning algorithm. It is not a buzzword.

        Interesting. I’m not up on the nomenclature. I guess I’d feel weird if someone said “this SATA drive has state-of-the-art machine learning algorithms to read your data.” Not saying you’re wrong, just that I’m apparently out of touch.

        1. 3

          Not saying you’re wrong, just that I’m apparently out of touch.

          Machine learning is the new buzzword, so it gets shoehorned in everywhere.

          Just remember, linear regression is machine learning.

          1. 3

            I happened to study error correcting codes well before the current deep learning craze. It was still machine learning then, so it is not shoehorning.

            1. 3

              “I trained a neural network to solve this problem.”

              “You mean you trained Joe the intern to do it?”

              “….yes”

            2. 2

              As far as I can tell, Project Silica storage is bit-for-bit digital storage.

          2. 3

            This is nothing more than a guess, but the algorithms required to determine a 3D structure from scattered light are extremely compute intensive. I’m guessing they use machine learning to approximate the algorithm with a much cheaper learned function.

          3. 3

            What’s currently the most accessible long term storage solution for files in general? I’ve been searching for over 2 years on/off and all I can think of is laminated paper.

            I know that for small data (private keys, etc.), engraved steel plates are the go-to.

            1. 2

              Accessible is an important consideration for long term archives. Paper has a distinct advantage in that you generally don’t need special technology to read it. The technology described in the article is at the other end of the spectrum: very expensive, would only be used by data centers that adopt it… how long will those machines last? Would it be possible to reconstruct their design from just the storage media?

              Also, the part where the glass storage media is described as impervious to boiling, scouring, etc. made me roll my eyes a bit. I bet it’s not so impervious to hammering. And how durable are the reader machines?

              1. 2

                Well, boiling water is just boasting, but durability is relevant, since the current alternative is negative film archived under air conditioning to keep temperature and humidity constant. If you can remove the air conditioning, that is a big win.

                1. 1

                  I’m not sure I agree; certainly digital data is much more durable than fragile analog media, but is that a fair comparison? Cutting maintenance costs and risks is good, but I don’t think any real archives are ever entirely maintenance-free. You can trade media maintenance for reader maintenance any number of ways. Is there any real advantage to this glass over a tape robot or RAID array or whatever the current state of the art is for long term data storage? Hard to say at this point; by the time it’s actually for sale I suppose one could do the calculations.

              2. 1

                AFAIK microfiche. M-DISC also advertises itself as durable storage.

              3. 2

                This is extremely cool. I’d love to know more about the complexity of the system required to write these.

                1. 1

                  Very interesting stuff indeed, but it really felt like the author was either paid by the word or had a minimum article length, because I’m sure I read the same thing three times, just in different words.