1. 33
  1.  

  2. 20

    I’ve used C for a while (and I now work on a C/C++ compiler) and I see it this way: you should be very reluctant to start a project in C, but there’s no question you should learn it. It mainly boils down to point 3 in the article: “C helps you think like a computer.”

    But really, it helps you think like most computers you’re going to use. This is why most operating systems are written in C: they are reasonably “close to the metal” on current architectures. It’s not so much that this affords you the opportunity for speed (it does, since the OS or even the CPU is your runtime library), but because the API you’re programming against isn’t far removed from the machine itself. Need to place values in a specific memory location? That’s easy in C. Need to mix in some assembly? Also pretty easy. Need to explicitly manage memory? Also not hard (doing it well is another matter). Sure, it’s possible in other languages, but it’s almost natural in C. (And yes, not all of what I’ve mentioned is strict C, but it’s supported in nearly all compilers.)
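
    A minimal sketch of those three points, assuming a made-up memory-mapped register address and GCC-style inline assembly (an extension, as noted, not strict C):

    ```c
    #include <stdint.h>
    #include <stdlib.h>

    /* Hypothetical memory-mapped status register on some embedded target;
       the address is made up for illustration. */
    #define STATUS_REG ((volatile uint32_t *)0x40021000u)

    int main(void) {
        /* Place a value at a specific memory location. */
        *STATUS_REG = 0x1u;

        /* Mix in some assembly (GCC/Clang extension, not ISO C). */
        __asm__ volatile ("nop");

        /* Explicitly manage memory. */
        uint32_t *buf = malloc(64 * sizeof *buf);
        if (buf == NULL)
            return 1;
        buf[0] = *STATUS_REG;
        free(buf);
        return 0;
    }
    ```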

    All this doesn’t mean I like it, but that’s the reality. I’d rather see more variety in computer architectures, such that something safer than C were the default. I’m always looking for the kind of machine that rejects the C model so thoroughly that C would actually be awful to use on it. Unfortunately, those things tend not to have hardware.

    1. 10

      I found that learning C was not very helpful in this regard (though I have no doubt this is partly because I was badly taught in university). What finally made it click was learning Forth. C’s attempt at a type system makes it easy to imagine that things other than bytes have reified form at runtime, whereas Forth gives you no such illusion; all that exists is numbers.

      When I came back to C afterwards, everything made so much more sense. Things that used to be insane quirks became obvious implications of the “thin layer of semantic sugar over a great big array of bytes” model.
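
      A small illustration of that model, assuming nothing beyond standard C: any object, struct or otherwise, can be inspected as the bytes it really is.

      ```c
      #include <stdio.h>
      #include <string.h>

      struct point { int x; int y; };

      int main(void) {
          struct point p = { 1, 2 };

          /* Under the sugar, p is just sizeof p bytes; copying them out and
             looking at them as unsigned char is well-defined C. */
          unsigned char bytes[sizeof p];
          memcpy(bytes, &p, sizeof p);

          for (size_t i = 0; i < sizeof p; i++)
              printf("%02x ", bytes[i]);
          printf("\n");
          return 0;
      }
      ```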

      1. 6

        I had this same problem, but for me the thing that made everything click together was using assembler. Pointers (and other C types, to some extent) are really a wonderful abstraction, even if it’s “a bit” thin. And that power of abstraction hides all the machine details if one doesn’t yet know how to look past it.

      2. 5

        The reasons you mention are really why I still use it at all. It comes with me to almost any device I feel like programming, and I do think it sometimes makes sense.

        For example, programming the GBA is quite easy in C, and it doesn’t really matter if someone breaks my game by entering a really long name or whatever (in fact, I love things like that.)
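
        (For illustration, that failure mode looks roughly like this sketch, with a made-up save struct; the overlong name silently tramples the neighboring field, which is technically undefined behavior, so the exact result depends on the target.)

        ```c
        #include <stdio.h>
        #include <string.h>

        /* Hypothetical save data: the name field is fixed at 8 bytes. */
        struct save {
            char name[8];
            int  high_score;
        };

        int main(void) {
            struct save s = { "", 9999 };

            /* A "really long name" overflows name[] and overwrites bytes of
               high_score (undefined behavior, but this is the classic effect). */
            strcpy(s.name, "LONGNAME12");

            printf("high score is now %d\n", s.high_score);
            return 0;
        }
        ```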

        I hope Rust will one day be my trusty companion but it’s not quite there yet.

        1. 3

          Well, Rust is getting there. I sometimes think it would be fun to program such systems, but when I think about using C for that I always cringe, so Rust might be a viable option in the future.

      3. 22

        This article is great except for No. 3: learning how hardware works. C will teach you how PDP-11 hardware works, with some extensions, but not modern hardware. They have different models. The article then mentions that computer architecture and assembly are things they teach students. Those, plus online articles with examples on specific topics, will teach the hardware. So they’re already doing the right thing, even if maybe saying the wrong thing in No. 3.

        Maybe one other modification. There are quite a lot of tools, especially reimplementations or clones, written in non-C languages. The trend started getting big with Java and .NET, with things like Rust and Go now making more waves. There’s also a tendency to write things in themselves. I bring it up because even the Python example isn’t true if you use a Python implementation written in Python, one of the recent interpreter tutorials written in Go, or something like that. You can benefit from understanding the implementation language and/or debugger of whatever you’re using in some situations. That’s not always C, though.

        1. 14

          Agreed. I’ll add that even C’s status as a lingua franca is largely due to the omnipresence of unix, unix-derived, and posix-influenced operating systems. That is, understanding C is still necessary to, for example, link non-ruby extensions to ruby code. That wouldn’t be the case if VMS had ended up dominant, or lisp machines.

          In that way, C is important to study for historical context. Personally, I’d try to find a series of exercises to demonstrate how different current computer architecture is from what C assumes, and use that as a jumping-off point to discuss how relevant C’s semantic model is today, and what tradeoffs were made. That could spin out either into designing a language that maps to today’s hardware more completely and correctly, or into discussions of modern optimizing compilers and how far abstracted a language can become and still compile to efficient code.

          A final note: no language “helps you think like a computer”. Our rich history shows that we teach computers how to think, and there’s remarkable flexibility there. Even at the low levels of memory, we’ve seen binary, ternary, binary-coded-decimal, and I’m sure other approaches, all within the first couple decades of computers’ existence. Phrasing it as the original author did implies a limited understanding of what computers can do.

          1. 8

            “C will teach you how PDP-11 hardware works with some extensions, but not modern hardware. They have different models.”

            I keep hearing this meme, but PDP-11 hardware is similar enough to modern hardware in every way that C exposes, except, arguably, for NUMA and inter-processor effects.

            1. 10

              You just countered it yourself with that exception, given the prevalence of multicores and multiprocessors. Then there’s cache hierarchies, SIMD, maybe alignment differences (my memory is fuzzy), the effects of security features, and so on.

              They’d be better off just reading up on modern computer hardware and ways of using it properly.
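
              As a concrete example of the gap: features like the SIMD units mentioned above only show up in C through vendor intrinsics or the optimizer, not through anything in the language itself. A sketch, assuming x86 SSE and a compiler that provides immintrin.h:

              ```c
              /* x86 SSE intrinsics: a compiler/vendor extension, not ISO C. */
              #include <immintrin.h>

              void add4(float *dst, const float *a, const float *b) {
                  /* One addps instruction adds four floats at once; plain C has no
                     way to ask for this directly, you rely on the optimizer or on
                     extensions like these. */
                  __m128 va = _mm_loadu_ps(a);
                  __m128 vb = _mm_loadu_ps(b);
                  _mm_storeu_ps(dst, _mm_add_ps(va, vb));
              }
              ```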

              1. 6

                Given that none of these are represented directly in assembly, would you also say that the assembly model is a poor fit for modeling modern hardware?

                I mean, it’s a good argument to make, but the attempts to make assembly model the hardware more closely seem to be vaporware so far.

                1. 6

                  Hmm. They’re represented more directly than with C, given there’s no translation to be done to the ISA. Some, like SIMD, atomics, etc., will be actual instructions on specific architectures. So I’d say learning hardware and ASM is still better than learning C if you want to know what the resulting ASM is doing on that hardware. I’m leaning toward yes.

                  There is some discrepancy between assembly and hardware on highly complex architectures, though. RISCs and microcontrollers will have less.

              2. 1

                Not helped by the C/Unix paradigm switching us from “feature-rich interconnected systems” like in the 1960s to “fast, dumb, and cheap” CPUs of today.

              3. 2

                I really don’t see how C is supposed to teach me how PDP-11 hardware works. C is my primary programming language and I have nearly no knowledge about PDP-11, so I don’t see what you mean. The way I see it is that the C standard is just a contract between language implementors and language users; it has no assumptions about the hardware. The C abstract machine is sufficiently abstract to implement it as a software-level interpreter.

                1. 1

                  As in this video of its history, the C language was designed specifically for the hardware it ran on, due to that hardware’s extremely limited resources. It was based heavily on BCPL, which invented “the programmer is in control” and was essentially whatever features of ALGOL could be compiled on another limited machine, the EDSAC. Even being byte-oriented versus word-oriented was due to the PDP-7 being byte-oriented versus the EDSAC, which was word-oriented. After a lot of software was written in it, two things happened:

                  (a) Specific hardware implementations tried to be compatible with it in their stack or memory models, so that programs written for C’s abstract machine would go fast. Although possibly good for PDP-11-style hardware, this compatibility meant many missed opportunities for both safety/security and optimization as hardware improved. These things, though, are what you might learn about hardware by studying C.

                  (b) Hardware vendors competing with each other on performance, concurrency, energy usage, and security both extended their architectures and made them more heterogeneous than before. The C model didn’t just diverge from these: new languages were invented (especially in HPC) so programmers could easily use the new capabilities through something with a mental model closer to what the hardware does. The default, though, was hand-coded assembly called from C or Fortran apps. Yes, HPC often used Fortran, since its model gave better performance than C’s on numerical applications, even on hardware designed for C’s abstract machine. Even though it was easy on the hardware, the C model introduced too much uncertainty about programmers’ intent for compilers to optimize those routines.

                  For this reason, it’s better to just study hardware to learn hardware, plus the various languages that were either designed for maximum use of that hardware or that the hardware itself was designed for. C is an option for the latter.

                  “it has no assumptions about the hardware”

                  It assumes the hardware will give people direct control over pointers and memory, in ways that can break programs. Recent work tries to fix the damage that came from keeping the PDP-11 model all this time. There were also languages that handled these safely by default, using overflow or bounds checks, unless told otherwise. SPARK eliminated pointers for most of its code, with the compiler substituting them in where it’s safe to do so. It’s also harder in general to make C programs enforce POLA with hardware or OS mechanisms, versus a language that generates that for you or has true macros to hide the boilerplate.
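
                  A tiny illustration of that default: standard C compiles an out-of-bounds write without complaint (it’s undefined behavior at runtime), where a bounds-checked language would trap or raise an error instead.

                  ```c
                  #include <stdio.h>

                  int main(void) {
                      int scores[4] = { 0, 0, 0, 0 };

                      /* The <= walks one element past the array. A typical compiler
                         emits no check; the behavior is simply undefined. */
                      for (int i = 0; i <= 4; i++)
                          scores[i] = i;

                      printf("%d\n", scores[3]);
                      return 0;
                  }
                  ```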

                  “The C abstract machine is sufficiently abstract to implement it as a software-level interpreter.”

                  You can implement any piece of hardware as a software-level interpreter. It’s just slower. Simulation is also a standard part of hardware development. I don’t think whether it can be interpreted matters. Question is: how much does it match what people are doing with hardware vs just studying hardware, assembly for that hardware, or other languages designed for that hardware?

                  1. 3

                    I admit that the history of C, and of its implementations, does give some insight into computers and how they’ve evolved into what we have now. I also agree that hardware, operating systems, and the language have all been evolving at the same time and have had an impact on each other. That’s not what I’m disagreeing with.

                    I don’t see a hint of proof that knowledge about the C programming language (as defined by its current standard) gives you any knowledge about any kind of hardware. In other words, I don’t believe you can learn anything practical about hardware just from learning C.

                    To extend what I’ve already said, the C abstract machine is sufficiently abstract to implement it as a software interpreter, and that matters because it proves that C draws clear boundaries between expected behavior and implementation details, which include how a certain piece of hardware might behave. It does impose constraints on all compliant implementations, but that tells you nothing about what “runs under the hood” when you run things on your computer; an implementation might be a typical bare-bones PC, or a simulated piece of hardware, or a human brain. So the fact that one can simulate hardware is not relevant to the fact that you still can’t draw practical conclusions about its behavior just from knowing C. The C abstract machine is neither hardware nor software.

                    “Question is: how much does it match what people are doing with hardware vs just studying hardware, assembly for that hardware, or other languages designed for that hardware?”

                    What people do with hardware is directly related to knowledge about that particular piece of hardware, the language implementation they’re using, and so on. That doesn’t prove that C helps you understand that or any other piece of hardware. For example, people do study assembly generated by their gcc running on Linux to think about what their Intel CPU will do, but that kind of knowledge doesn’t come from knowing C - it comes from observing and analyzing behavior of that particular implementation directly and behavior of that particular piece of hardware indirectly (since modern compilers have to have knowledge about it, to some extent). The most you can do is try and determine whether the generated code is in accordance with the chosen standard.
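
                    Concretely, that workflow looks something like the sketch below; the insight comes from reading what one compiler emits for one target, not from the C standard itself.

                    ```c
                    /* sum.c: compile with "gcc -O2 -S sum.c" and read sum.s to see what
                       this particular gcc emits for this particular CPU. Any unrolling or
                       vectorization you find there comes from the implementation and the
                       hardware, not from anything the C standard promises. */
                    long sum(const long *xs, long n) {
                        long total = 0;
                        for (long i = 0; i < n; i++)
                            total += xs[i];
                        return total;
                    }
                    ```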

                    1. 1

                      In that case, it seems we mostly agree about its connection to learning hardware. Thanks for elaborating.

              4. 9

                I forsake C as someone who works with it all the time. The author of this post makes the point that it’s worth knowing. I think that’s totally true if you interact with low-level systems.

                I definitely don’t buy the point about distributed systems usually requiring C for performance reasons. As a distributed systems engineer, I see most of my peers working in memory-safe languages, with a few things along data paths written in C. Every once in a while people may peer into the kernel to reason about an issue in the networking stack, but I’d imagine that most people who bill themselves as distributed systems engineers today are terrible at C, and it probably doesn’t hurt them very much.

                When I forsake C, I don’t advocate ignorance of it. I advocate learning all you can about memory corruption, and being honest as a biased human who is building things for other biased humans and with other biased humans. C is a great language for learning about bugs and exploitation techniques. There is too much macho bullshit from prominent C engineers, and it catches on with incompetent engineers who make the world a more dangerous place.

                1. 4

                  At my undergrad CS program (NYU, 2002-2006) they taught Java for the intro programming courses, but then expected you to know C for the next-level CS courses (especially computer architecture and operating systems). Originally, they taught C in the intro courses, but found that too many beginning programmers dropped out – and, to be honest, I don’t blame them. C isn’t the gentlest introduction to programming. But this created a terrible situation where professors just expected you to know C at the next level, while they were teaching other concepts from computing.

                  But, as others have stated, knowing C is an invaluable (and durable) skill – especially for understanding low-level code like operating systems, compilers, and so on. I do think a good programming education involves “peeling back the layers of the onion”, from highest level to lowest level. So, start programming with something like Python or JavaScript. Then, learn how e.g. the Python interpreter is implemented in C. And then learn how C relates to operating systems and hardware and assembler. And, finally, understand computer architecture. As Norvig says, it takes 10 years :-)

                  The way I learned C:

                  • K&R;
                  • followed by some self-instruction on GTK+ and GObject to edit/recompile open source programs I used on the Linux desktop;
                  • read the source code of the Python interpreter;
                  • finally, I ended up writing C code for an advanced operating systems course (still archived/accessible here), which solidified it all for me.

                  Then I didn’t really write C programs for a decade (writing Python, mostly, instead) until I had to crack C back open to write a production nginx module just last year, which was really fun. I still remembered how to do it!

                  1. 3

                    One of the things I loved about my WSU CS undergrad program 20 years ago is that in addition to teaching C for the intro class, it was run out of the EE department so basic electronics courses were also required. Digital logic and simple circuit simulations went a long way towards understanding things like “this is how RAM works, this is why CPUs have so much gate count, this is why you can’t simply make up pointer addresses”

                    1. 2

                      “they taught Java for intro programming courses, but then expected you to know C for the next level CS courses (especially computer architecture and operating systems).”

                      It’s exactly like this at my university today. I don’t think there’s any good replacement for C for this purpose. You can’t teach Unix system calls with Java where everything is abstracted into classes. Although most “C replacement” languages allow easier OS interfacing, they similarly abstract away the system calls for standard tasks. I also don’t think it’s unreasonable to expect students to learn about C as course preparation in their spare time. It’s a pretty simple language with few new concepts to learn about if you already know Java. Writing good C in a complex project obviously requires a lot more learning, but that’s not required for the programming exercises you usually see in OS and computer architecture courses.
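
                      For instance, the sort of exercise such courses tend to use is hard to express in Java but nearly trivial in C (a sketch using plain POSIX calls):

                      ```c
                      #include <fcntl.h>
                      #include <unistd.h>

                      int main(void) {
                          /* Direct use of the open(2), write(2), and close(2) system calls;
                             nothing sits between the program and the kernel interface. */
                          int fd = open("hello.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
                          if (fd < 0)
                              return 1;
                          write(fd, "hello\n", 6);
                          close(fd);
                          return 0;
                      }
                      ```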

                      1. 1

                        I think starting from the bottom and going up the layers is better. Rather than being frustrated as things get harder, you will be grateful for and know the limitations of the abstractions as they are added.

                      2. 3
                        1. 4

                          The core of our product is written in C. We interact heavily with internal kernel structures, require very fine-grained control over allocation lifetimes, and need to do a lot of low-level bit groveling in a soft real-time environment. There isn’t a better language out there for that sort of thing.

                          (Maybe C++, but there’s a lot to learn in C++, which I haven’t used in years…but I use C every day.)
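
                          The kind of bit groveling meant above looks roughly like this sketch, with a made-up field layout standing in for a real kernel structure:

                          ```c
                          #include <stdint.h>

                          /* Hypothetical packed status word: flag bits plus a 12-bit owner id. */
                          #define FLAG_DIRTY   (1u << 0)
                          #define FLAG_LOCKED  (1u << 1)

                          static inline unsigned owner_id(uint32_t word) {
                              return (word >> 8) & 0xFFFu;   /* bits 8..19 */
                          }

                          static inline int is_locked(uint32_t word) {
                              return (word & FLAG_LOCKED) != 0;
                          }
                          ```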

                          1. 4

                            Amen.

                            I’m doing a project in C right now. I want it to run on nearly any computer, without requiring a runtime environment. The aesthetic goal is “Just Plain C” (oh, valgrind, how I love thee) so I even wrote my own linear regression routine from scratch. (Numerical precision isn’t terribly important for this application.) It has taken me longer than it would in a higher-level language. I also don’t think it’s likely to be fast, until highly optimized, since I’m using naive data structures when I’d have better tools in a higher-level language (or even C++, with STL). But it’s been a great learning experience.
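
                            For the curious, a from-scratch least-squares fit of y = m*x + b in plain C amounts to roughly this (a simplified sketch, not the exact routine):

                            ```c
                            #include <stddef.h>

                            /* Ordinary least squares for y = m*x + b; assumes n >= 2 and that
                               the x values are not all identical. */
                            static void linreg(const double *x, const double *y, size_t n,
                                               double *m, double *b) {
                                double sx = 0, sy = 0, sxx = 0, sxy = 0;
                                for (size_t i = 0; i < n; i++) {
                                    sx  += x[i];
                                    sy  += y[i];
                                    sxx += x[i] * x[i];
                                    sxy += x[i] * y[i];
                                }
                                double denom = n * sxx - sx * sx;
                                *m = (n * sxy - sx * sy) / denom;
                                *b = (sy - *m * sx) / n;
                            }
                            ```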

                            C is also useful if you want to go into security, which may be one of the few fields left (corporate “data science” is more often off-the-shelf ML than true innovation) where you can reach 35 and not need to go into management.

                            1. 4

                              “I want it to run on nearly any computer, without requiring a runtime environment.”

                              The present article misses this point entirely. This, along with the closely aligned ability to wrap a C library so it can be called from any other language with an FFI, is another strong selling point of C.
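
                              A hedged sketch of what that looks like from the C side; the function and library names are made up, but the pattern (build a shared object, load it from any FFI such as Python’s ctypes or Ruby’s fiddle) is the point:

                              ```c
                              /* mylib.c: build with something like
                                 "cc -shared -fPIC -o libmylib.so mylib.c". The one exported symbol
                                 below can then be called from almost any language's FFI, with no
                                 extra runtime to drag along. */
                              double c_mean(const double *xs, long n) {
                                  double total = 0.0;
                                  for (long i = 0; i < n; i++)
                                      total += xs[i];
                                  return n > 0 ? total / n : 0.0;
                              }
                              ```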

                            2. 2

                              “There isn’t even a cute C animal in C’s non-logo on a C decal not stuck to your laptop.”

                              It would obviously have to be a sea lion. Get it? C Lion, ah ha ha ha! :)
                              And C++ would be two sea lions.