1. 22
  1.  

  2. 4

    I think every programmer who wants to be good at any language should learn the basics of memory and pointers. Learning C is a good way to do this.

    Understanding what happens inside the language is always important. Just thinking of it as a “black box of magic” is not a good idea.

    1. 2

      Disagree. Understanding pointers will not always help.

      Why is it not a good idea for me to assume that evaluation semantics are upheld?

      1. 5

        Semantics are one thing, performance is another. If you don’t have some sense of what is going on “under the hood”, you can’t effectively diagnose performance problems. For instance, say we have a Python function that is supposed to read chunks of data from a socket and concatenate them onto one big string. The naive way to do it would be something like

        long_string = ''
        while True:
            chunk = sock.read(100)
            if not chunk:
                break
            long_string += chunk
        

        But since Python strings are immutable, each pass through the loop will allocate a new string, copy the old string into it, append the new chunk, and then drop the reference to the old string, which adds a lot of overhead. A more efficient way would be

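        import io

        string_buffer = io.StringIO()
        while True:
            chunk = sock.read(100)
            if not chunk:
                break
            string_buffer.write(chunk)
        # Pull the accumulated text out of the buffer once, at the end.
        long_string = string_buffer.getvalue()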
        

        This, on the other hand, concatenates the string onto the end “in-place” as it were. So it avoids a lot of unnecessary copying and deleting of references. Having a sense of how memory is handled by the runtime is important to understanding why the second way is more efficient.
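To make the immutable-versus-mutable distinction above concrete, here is a minimal sketch. It is not from the thread; the names `alias` and `same_buf` are only illustrative. It shows that `+=` on a string rebinds the name to a new value, while `write()` on a `StringIO` changes the one buffer object that every reference sees.

```python
import io

# Strings are immutable: += produces a new value and rebinds the name.
s = "abc"
alias = s            # second reference to the same string
s += "def"           # rebinds s; whatever alias refers to is unchanged

print(alias)         # the old value, still reachable through alias
print(s)             # the new, longer value

# A StringIO buffer is mutable: write() appends to the same object.
buf = io.StringIO()
same_buf = buf       # second reference to the same buffer
buf.write("abc")
buf.write("def")
print(same_buf.getvalue())   # both names see everything written so far
```

This only demonstrates the semantics. How much copying a given runtime actually does behind those semantics is a separate question.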

        1. 3

          Interesting example, because the Python community is already very aware of that performance quirk, and already has standard workarounds. In this case, something like:

          string_list = []
          while True:
              chunk = sock.read(100)
              if not chunk:
                  break
              string_list.append(chunk)
          long_string = ''.join(string_list)
          

          But, as it turns out, the entire premise here might be flawed, because Python now performs string concats like that in O(n) time in most cases. Specifically, if the base string has only one reference to it, then performing “+=” will attempt an in-place extend (in CPython, at least).

          Which means that all three of our code snippets should have the same algorithmic performance.
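If you would rather measure than argue, a rough harness along these lines can time the three snippets against each other. This is a sketch under stated assumptions: `FakeSocket` is a hypothetical stand-in for `sock` (anything with a `read(n)` method that returns `''` at end of input), and the data size and repeat count are arbitrary. Results will vary by interpreter and version, so no numbers are claimed here.

```python
import io
import timeit


class FakeSocket:
    """Hypothetical stand-in for `sock`: read(n) returns up to n chars, '' at EOF."""

    def __init__(self, data):
        self._stream = io.StringIO(data)

    def read(self, n):
        return self._stream.read(n)


def concat_plus(sock):
    # Repeated += on a str, as in the first snippet.
    long_string = ''
    while True:
        chunk = sock.read(100)
        if not chunk:
            break
        long_string += chunk
    return long_string


def concat_stringio(sock):
    # Accumulate in a StringIO buffer, as in the second snippet.
    string_buffer = io.StringIO()
    while True:
        chunk = sock.read(100)
        if not chunk:
            break
        string_buffer.write(chunk)
    return string_buffer.getvalue()


def concat_join(sock):
    # Collect chunks in a list and join once, as in the third snippet.
    string_list = []
    while True:
        chunk = sock.read(100)
        if not chunk:
            break
        string_list.append(chunk)
    return ''.join(string_list)


DATA = "x" * 200_000  # arbitrary test payload


def run_all(repeat=3):
    """Return the best-of-`repeat` wall time in seconds for each approach."""
    results = {}
    for fn in (concat_plus, concat_stringio, concat_join):
        timings = timeit.repeat(lambda: fn(FakeSocket(DATA)), number=1, repeat=repeat)
        results[fn.__name__] = min(timings)
    return results


if __name__ == "__main__":
    for name, seconds in run_all().items():
        print(f"{name}: {seconds:.4f} s")
```

To look at scaling rather than a single data point, rerun with a few different sizes of `DATA` and see how each timing grows.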

          Which is all a roundabout way of saying: no, understanding pointers will not help you write Python. It will help you write a Python interpreter or compiler, but it might lead you well astray trying to write performant code in the actual language. Every layer of abstraction requires its own set of domain knowledge, and should be treated as such.

          1. 1

            That’s a weird conclusion to draw. The performance of += isn’t opaque to the programmer. As you said, the optimization doesn’t always work. So the programmer very much needs to be aware of that quirk.
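One way to probe the "doesn't always work" case is to keep a second reference to the string alive while appending. This is a sketch, assuming the single-reference condition described earlier in the thread is what CPython actually checks. The function names and sizes are made up for illustration, and other interpreters or versions may behave differently.

```python
import timeit

CHUNK = "x" * 100


def single_reference(n):
    # Only the local name s refers to the string being extended.
    s = ''
    for _ in range(n):
        s += CHUNK
    return s


def extra_reference(n):
    # alias holds a second reference to the current string, so it is
    # assumed the runtime must copy instead of extending in place.
    s = ''
    for _ in range(n):
        alias = s
        s += CHUNK
    return s


if __name__ == "__main__":
    n = 5_000  # arbitrary; raise it to make any difference more visible
    for fn in (single_reference, extra_reference):
        best = min(timeit.repeat(lambda: fn(n), number=1, repeat=3))
        print(f"{fn.__name__}: {best:.4f} s")
```

Both functions return the same string. Only the cost of building it is expected to differ, and only on runtimes that implement the in-place shortcut.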

            1. 3

              I agree that the programmer needs to be aware of that quirk, but I disagree that being aware of the quirk is at all related to understanding pointers, or to understanding immutable strings in other language environments. Preconceived notions of immutable strings can, in fact, point you to suboptimal performance (since the list join is still generally the fastest of those three approaches, and the StringIO the slowest). Instead, build your understanding of the language based on the appropriate level of abstraction.

            2. 1

              I wasn’t talking specifically about understanding pointers; I was just talking about generally “Understanding what happens inside the language.” The fact is that you have to know what the runtime is doing in order to write efficient code for it.

          2. 2

            Understanding the big picture of things happening under the framework/language/vm/OS at basic levels is helpful. Maybe not necessary, but definitely helpful.

            And yeah, no piece of knowledge is always helpful, but if you want to become a better programmer in any language, gaining an understanding of what happens under the hood is important.

        2. 4

          I’ve actually been looking for a document like this for some time. I probably could have benefited from it when I was first learning C 25 years ago (my coworker observing me write this says: “Sos como un abuelito hacker!” [“You’re like a hacker grandpa!”]), but it also does what I think is a good job of explaining the differences between the C virtual machine and the pointer-safe virtual machines today’s popular languages use.

          sgtatham is also sufficiently experienced with C that he manages to slip a lot of hints about best practices into what is, nominally, a purely descriptive document.

          1. 1

            Or maybe the difference between using a virtual machine and running native code, since compiled C doesn’t actually run on a VM the way today’s popular languages do.

            1. 1

              I didn’t mean C was compiled to bytecode; I meant that the C standard defines an abstraction of the underlying hardware which is mostly the same across many different physical machines.