1. 29
  1. 9

    Semantically, who is to say the old string wasn’t deallocated and a new string allocated in the same memory location?

    1. 9

      But my string of Theseus!

      1. 1

        At the risk of being off topic, I burst out laughing at this, and then failed to explain to my family why it was so funny. Good show!

    2. 8

      From the article: “Does this really count as mutability though? Not really.”

      1. 5

        This is not mutation. Mutation would be s[3] = 'd'.

        1. 4

          It is a mutation. If I have a list l, and I call l.append("a"), of course I’ve mutated the list. This is directly analogous, because the resulting string is not a newly-allocated string, but has the same identity as it did before. Semantically the difference can’t usually matter, since Python knows there’s only one reference. But operationally, the string has been modified.
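
          A minimal sketch of the list half of that analogy, plus the case where a second reference exists. The names `items`, `text`, and `alias` are illustrative assumptions. Only the list result is guaranteed by the language; whether a string keeps its `id()` after `+=` is a CPython implementation detail and is deliberately not checked here.

```python
items = ["x"]
list_id = id(items)
items.append("a")                 # in-place: the same list object afterwards
list_kept_identity = id(items) == list_id

text = "x" * 100
alias = text                      # a second reference to the same string object
text += "a"                       # alias still holds the old value, so text is rebound
alias_unchanged = alias == "x" * 100
print(list_kept_identity, alias_unchanged, text is alias)
```

          With a second reference alive, the old string has to survive unchanged, which is why any buffer reuse can only ever happen when nobody else can observe it.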

          1. 5

            There’s too much fighting around an assumption of “Python forbids this ever happening” for what really ought to be read as “Python may internally do this from time to time as it sees fit, but you the programmer cannot explicitly rely on Python doing it or exposing a predictable way to do it”.

            1. 3

              A mutable object is one whose contents can be changed after creation.

              Here we have a new string that is being created and assigned to the variable. The fact that the memory location is being reused sometimes is quite different from the string object being mutable.

              Only if we were able to create a string object then change parts of it without reassigning the variable would the string be mutable.
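
              A minimal sketch of that kind of in-place change, assuming `s` is a list (the values and the names `alias` and `t` are illustrative), followed by the same attempt on a string:

```python
s = [1, 2, 3]
alias = s            # a second name for the same list object
s[0] = 99            # item assignment changes the existing object
changed_in_place = alias is s and alias[0] == 99

t = "abc"
try:
    t[0] = "z"       # str does not support item assignment
    raised = False
except TypeError:
    raised = True
print(changed_in_place, raised)
```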


              Here s is not being reassigned, it is being mutated. You can’t do this with Python strings.

              1. 3

                The problem has nothing to do with immutability; it is about assumptions on the meaning of identity. Two distinct objects may have the same identity in Python if their lifetimes do not overlap. This is no different from the following in C:

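                #include <stdint.h>

                const char *str1 = someString();
                uintptr_t id1 = (uintptr_t)str1;   // identity of str1 as an integer
                const char *str2 = someOtherString();
                uintptr_t id2 = (uintptr_t)str2;   // may equal id1 if str1's storage was freed and reused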

                It is perfectly acceptable for id1 and id2 to have the same values, because the lifetime of str1 ended and so its identity (memory location, in C) may be recycled. This is a necessary property for any language that has a notion of explicit identity and a finite set of values (e.g. a 32-bit or 64-bit integer) used to represent them. Without it, you would exhaust your identity space.

                Languages such as Java and C# avoid directly exposing object identity for precisely this reason. Instead, they provide a hash-code value, which is guaranteed not to change while the object is alive but is not guaranteed to be unique (though it is expected to have a low probability of collisions), and an identity comparison (which requires two live objects as arguments and so doesn’t have to worry about stale identities).

                I’m not really a fan of languages allowing identity to be captured. Even the Java-style (Smalltalk-style?) hash code adds some overhead with copying GC, because it requires you to capture the original hash code when you copy an object (if it has been accessed).
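
                The same point can be sketched in Python. `Token` is a hypothetical placeholder class used only to create distinct objects. The first check is guaranteed by the language; the second is not, so its result is printed rather than assumed.

```python
class Token:
    """Hypothetical placeholder; any user-defined class would do."""

a = Token()
b = Token()
live_ids_differ = id(a) != id(b)   # guaranteed while both objects are alive

old_id = id(a)
del a                              # last reference dropped; CPython typically frees the object here
c = Token()
recycled = id(c) == old_id         # may be True or False; the language makes no promise
print(live_ids_differ, recycled)
```

                A stored `id()` value is therefore only meaningful for as long as the object it came from is still alive.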

          2. 2

            Swift does the exact same thing with its collections: they are strictly immutable with copy-on-write semantics, i.e. they pass/copy by value, so the semantics are:

            var x = [1,2,3]
            var y = x
            assert(x.count == y.count)
            x.append(4) // x becomes a new distinct array
            assert(x.count == 4)
            assert(y.count == 3)

            But if we do:

            var x = [1,2,3]
            x.append(4) // x is mutated in place

            1. 1

              If I remember correctly, this is actually a CPython hack: when you use s += "some characters" and s has a reference count of one, CPython knows it can reuse the same buffer.
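
              A rough way to observe this, as a sketch rather than a statement of what CPython guarantees. The helper `grow` is a hypothetical name; the count it returns depends on the interpreter, its version, and the allocator, so no particular value is expected.

```python
def grow(n):
    """Append to a string n times and count how often id() stayed the same."""
    s = ""
    same_id = 0
    for _ in range(n):
        before = id(s)     # an int; it does not keep the string alive
        s += "x"           # s is the only reference to the string here
        if id(s) == before:
            same_id += 1
    return s, same_id

result, reused = grow(1000)
print(len(result), reused)  # reused is implementation-dependent
```

              Whatever the count turns out to be, the resulting string is the same, which is why code should never rely on the buffer being reused.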