1. 11
  1. 6

    This kind of problem really only exists for distributed systems, and no number of lawyers can solve the problem in distributed systems. In non-distributed systems there are several ways to handle transformations to immutable data. The idea that an accountant can’t reverse a transaction because his log is immutable is frankly born out of ignorance of how people managed immutable systems before computers.

    Append-only immutable systems and processes existed long before computers, and will continue to exist long after. An accountant does not erase. Instead, they add an entry in their books to signify a correction. Documents such as contracts and doctor’s charts are also often immutable. The mental frame that the world is exclusively mutable (it’s not) means that you view bread as hot wet yeasted wheat. Most of us just call it bread.

    For most people, the transformation breaks the consistent identity. Some people have a philosophy of strong continuity through transformation, but in practice it is likely to ruin your intuition. You’ll expect things to remain true that are no longer true. For example, ash probably isn’t as good a building material as firewood. The idea of a completely unbroken identity for all things will cloud your sense of what is possible with any given object. In essence, sometimes it just makes sense to return a new object rather than trying to twist an existing object to fit.

    1. 4

      I don’t think that fully addresses the article. Even if you append a correction, the information is still internally there and can be retrieved by, e.g., a database leak. I used to work for an edtech company that had to comply with COPPA, and I remember the COPPA rules being extremely strict. Deleted information couldn’t be “in the database and not served”, or “in the database with a correction”, or “not in the current database but in a six-month-old S3 snapshot.” It had to be gone.

      (IANAL of course, and this was a while back.)

      1. 1

        With event sourcing and CQRS patterns you can use the audit log of events to generate a projection, for example as a SQL database. So it’s not in the database; you can even use rolling snapshots if you would like to periodically delete the audit log (aka the event store) entirely. However, there are domains where deleting the audit log may be a crime. I personally would not recommend deleting your events unless legally required, but you can do it.
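
        A minimal Python sketch of the projection-plus-rolling-snapshot idea. The event kinds, the `project` helper, and the in-memory dict standing in for the SQL projection are all hypothetical names chosen for illustration, not any particular framework’s API:

        ```python
        from dataclasses import dataclass

        @dataclass(frozen=True)
        class Event:
            kind: str        # "set" or "delete" (hypothetical event kinds)
            key: str
            value: str = ""

        def project(events, snapshot=None):
            """Fold events onto an optional snapshot to get the current state."""
            state = dict(snapshot or {})
            for e in events:
                if e.kind == "set":
                    state[e.key] = e.value
                elif e.kind == "delete":
                    state.pop(e.key, None)
            return state

        log = [
            Event("set", "alice", "alice@example.com"),
            Event("set", "bob", "bob@example.com"),
            Event("delete", "alice"),
        ]

        # Rolling snapshot: keep only the folded state, then drop the log itself.
        snapshot = project(log)
        log.clear()

        # Later events are applied on top of the snapshot.
        log.append(Event("set", "carol", "carol@example.com"))
        current = project(log, snapshot)
        ```

        Once the log is cleared, the deleted value is in neither the snapshot nor the event store. Whether that satisfies a given legal requirement (backups included) is a separate question.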

        1. 1

          That sounds like a technical solution for a legal problem. The technical solution wants the records there. The legal problem wants them gone. Legal professionals will determine if the solution meets the legal requirements. Probably safer if we just do what they wanted in that case. Especially something whose risk they understand.

          1. 2

            Agreed, what I was trying to demonstrate is that many approaches are possible :).

      2. 1

        “An accountant does not erase.”

        Of course he does. He cannot erase once the books are closed and totals given to someone external - owners, IRS and so on. But up to that point, he absolutely erases any mistakes he makes.

        It’s a major UX failure not to allow mutation of an unpublished artifact. It usually ends up by the user preparing the artifact in an external tool (usually a spreadsheet) and only submitting it afterwards.

        1. 4

          If it’s okay I’m going to address the second point first, then I’m going to come back to the first, because I think we don’t actually disagree about the user experience. When we write immutable systems it does not mean we should not expose an edit screen to the user. What it means is that when the user clicks edit and makes their changes, the system records the event of editing and generates the current state from that event. For all the user knows it could be mutable or immutable, and they don’t have to care. For the business owner, often the important thing is that changes are audited. This allows us to find bugs faster, identify security issues and more. The present state of the data is derived from the audit log, so there cannot be a disagreement. In this way software can give us the gains of the accountant’s process while automating the correction entry so as to create a nice smooth user experience.

          As for the first point, I live in the US and I have been told several times that here they never ever ever erase. They take great precaution to not have pencil and eraser marks in their books as it could look criminal. I have been told some do not even keep a pencil in their office. Other countries may have other norms around this, but I do currently work at a financial institution and have been told that you never erase an entry. If you wish to erase a change, you must instead create a new correction entry that fixes the incorrect entry.

          1. 1

            I don’t disagree that auditing changes is beneficial to many parties, but my experience is that accountants (and civil servants as well) are rather uneasy about revealing drafts mainly because they fear that someone will compare the drafts with the final version and assume malice.

            Perhaps it’s specific to Czechia, because we do have a problem following proper procedures when a shortcut will save us time or energy.

            1. 2

              While not everyone follows proper procedure all the time in the US, there is a decent amount of fear, I think, of being financially ruined as a result. In essence, if you are found to not be following proper procedure, and that leads to financial damages to an institution, even as a result of the perception that you are malicious, you can be sued and lose whatever a jury of your peers deems reasonable. Now I’m not a lawyer and in real life things are probably not so absolute or harsh, however this is often how laypeople perceive things to be.

              In a vacuum of legal structure, suing often arises as a control structure.

      3. 2

        I agree that mutability is often needed (and not only because of evil lawyers). But the reference to Golang seems a bit confusing to me. Immutability at the programming language level is a completely different kind of immutability than immutability of a whole system (e.g. blockchain).

        Fashion trends and hypes (like „let’s make everything object-oriented/functional/immutable/web/nosql/asynchronous/etc.!“) are usually harmful. There is no one size that fits all and no holy grail in software engineering.

        At the programming language or code level, both mutable and immutable structures make sense and have their uses.

        Beyond that, things in the real world are almost always mutable and removable because lawyers can show up on your doorstep with a court order to make them so, and the court generally doesn’t care about what problems your choice of technology has created for you in complying.

        One possible solution to this problem is that you do not own or control the technology and it runs independently of you. It is not your property, like a grown-up child that you raised and educated but who now lives independently: you cannot stop it or order it to do anything.

        Whether creating such a system is ethical is another question. Sometimes it makes sense…

        1. 1

          I apologize; I wrote something that assumed too much context with my link text. It’s referring not to Go the language but to Go’s current design and implementation of a module source code proxy service, which is designed for immutable and potentially permanently cached module source code (even if the upstream vanishes; in fact explicitly partly to cope with the upstream vanishing). The current official proxy is operated by Google with no apparent point of contact to get them to remove something from the proxy’s cache.

          (I’m the author of the linked-to article.)

        2. 2

          Let us first imagine a hypothetical scenario, before going back to system design consequences. Suppose, hypothetically, that I take some E. coli, genetically engineer it to have a gene that encodes both some basic, gentle antibiotic resistance and a complete copy of Bee Movie, and release my engineered strain into the wild.

          Imagine the cease-and-desist letter that I might receive; what might it demand? I cannot wipe out a single E. coli strain. I have no continuing action which I might stop carrying out. Regardless of what I do, the gene will propagate, and as it naturally mutates, all of the consequences of the mutation are beyond my control. Certainly, I might face more dire legal consequences, but censorship of the information is effectively impossible.

          While I want to therefore recommend that we design systems which have this self-replicating, uncensorable sort of property, I am also wary of the way in which this feels like we are trying to change the world permanently, or speed up a natural process.

          1. 2

            While the Bee Movie is a fantastic one, I’m pretty sure the correct movie would be Osmosis Jones. I think optimistically this hopes to litigate the few large businesses rather than every single host of content, which is probably intractable. If the public wills it, information once released is probably not able to be hidden again. Data mules exist in 2020, as do data dead drops.

          2. 2

            He really needs to watch some Rich Hickey talks on Datomic.

            1. 1

              Could you elaborate?

              1. 5

                Hickey’s main point is everything is immutable.

                What you had for breakfast this morning will never change.

                You may have something else for breakfast tomorrow, but what you had today will be the same for all eternity.

                You may argue that today’s breakfast as recorded in “The Great Database Of HWAYNE’s breakfasts” may have been erroneously entered as snails…. and should be corrected…

                This is true, but so is the fact that at the time of data entry, “The Great Database of HWAYNE’s breakfasts” believed you ate snails for breakfast on this day.

                And that fact is true and immutable for all time.

                If the only query you make on The Great Database Of HWAYNE’s breakfasts is “what is the latest recorded thing that HWAYNE had for breakfast”, then you may make some optimizations.

                But they are merely optimizations (possibly premature), they are not alterations of the Fabric of Time and Space.

                It sounds wondrously philosophical, but he is shipping a practical and successful DB (Datomic) based on this principle.
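
                A minimal Python sketch of that idea, as a plain append-only list of facts. This is not Datomic’s API; the `Fact` type, `record`, `breakfast`, and the `as_of` parameter are hypothetical names, and the date is made up:

                ```python
                from dataclasses import dataclass

                @dataclass(frozen=True)
                class Fact:
                    recorded_at: int   # when the database learned this
                    day: str           # the day the breakfast was eaten
                    meal: str

                facts = []  # append-only: entries are added, never edited or removed

                def record(recorded_at, day, meal):
                    facts.append(Fact(recorded_at, day, meal))

                def breakfast(day, as_of=None):
                    """What the database believed about `day` at time `as_of` (default: latest)."""
                    known = [f for f in facts
                             if f.day == day and (as_of is None or f.recorded_at <= as_of)]
                    return max(known, key=lambda f: f.recorded_at).meal if known else None

                record(1, "2020-01-06", "snails")    # erroneous entry
                record(2, "2020-01-06", "oatmeal")   # correction, appended later
                ```

                Asking `breakfast("2020-01-06")` gives the corrected answer, while `breakfast("2020-01-06", as_of=1)` still reports what the database believed at the time. The “latest only” query is the optimization; the full list is the record.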

                1. 3

                  I don’t think that addresses the OP’s thesis. They’re saying that courts can order you to scrub the data. In that case appending a correction is not enough: any employee with database access can see the redacted info. You have to make it impossible for anyone to be able to retrieve the data in any way.

                  1. 1

                    Guess what happens when you delete a file….? I think you might be a tad underwhelmed.

            2. 1

              The problem isn’t as inherent to distributed systems as @voronoipotato makes out. Most of them simply lack the right mechanisms to gather consensus to add the kind of corrective entry that’s needed.

              1. 1

                I agree that it’s not inherent to distributed systems. For distributed systems, however, it is a non-zero challenge, as opposed to the trivial non-distributed case. In a distributed system you can merely request that something be deleted; currently you can’t ensure that any given client will delete it. Maybe if you used something like fountain codes it could be easier to get a “full delete”, since no one person is responsible for a full copy and people can’t just scrape through the audit log.

              2. 1

                In my experience pure immutability is impractical. Garbage Collection is necessary to avoid consuming unlimited resources. What has proven to be a good compromise is mutable pointers to immutable artifacts. This is something I was excited to see in the IPFS/IPNS solution. It is then a design decision as to whether old pointers remain valid and keep linking to the old artifact.
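
                A rough Python sketch of the mutable-pointer, immutable-artifact compromise, using a content-addressed dict. The names `blobs`, `pointers`, `publish`, and `collect_garbage` are hypothetical and this is not the IPFS/IPNS API:

                ```python
                import hashlib

                blobs = {}     # content hash -> bytes; an entry is never modified once stored
                pointers = {}  # mutable name -> content hash

                def put(data: bytes) -> str:
                    digest = hashlib.sha256(data).hexdigest()
                    blobs[digest] = data
                    return digest

                def publish(name: str, data: bytes) -> str:
                    """Point `name` at a new immutable artifact and return its hash."""
                    pointers[name] = put(data)
                    return pointers[name]

                def collect_garbage():
                    """Drop every artifact that no pointer references any more."""
                    live = set(pointers.values())
                    for digest in list(blobs):
                        if digest not in live:
                            del blobs[digest]

                old = publish("site", b"version 1")
                new = publish("site", b"version 2")
                ```

                Here the design decision is whether to run `collect_garbage()`: skip it and a link to `old` keeps resolving, run it and only the artifact behind the current pointer survives.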

                1. 0

                  Cue blockchain issues. We’ve seen an interesting version of this play out with ETH and ETH Classic.