1. 47
  1.  

  2. 32

    I think there are two conflicting interests being confused in both the essay and the discussion here. First, whether or not people should, right now, learn C as a systems language, and second, whether or not we want to live in a world where C is the premier systems programming language.

    As /u/nanxiao points out: “from kernel to command line tools, [UNIXes and their tools] are almost [all] implemented in C.” It’s true; for high-performance portable code, C is king, and if you want to understand or hack on UNIX kernels and userlands, you need to learn C.

    But why is C the king in this space? Because CPUs are optimized for C just as much as C is optimized for those CPUs.

    As David Chisnall says in his excellent essay C Is Not a Low-level Language, “Your computer is not a fast PDP-11.” /u/verisimilitude says in this thread that “The C programming language is directly responsible for countless damning flaws in modern software”. I don’t entirely agree; rather, I think the co-evolution of modern processor architectures and the C language is responsible for those flaws. As Chisnall says:

    The root cause of the Spectre and Meltdown vulnerabilities was that processor architects were trying to build not just fast processors, but fast processors that expose the same abstract machine as a PDP-11. This is essential because it allows C programmers to continue in the belief that their language is close to the underlying hardware.

    As more and more software was written in C, CPUs leaned more and more into the paradigm of “pretend to be a PDP-11”, and C compilers evolved ever better optimizations for fast PDP-11 clones. The last thirty years have been a vicious cycle of software architecture and CPU architecture reinforced by short-cycle marketing (the “megahertz wars”, standardized synthetic benchmarks, and hype). /u/nanxiao calls out Go and Rust as C alternatives, but they aren’t really. They do expose some additional facilities for a nonserial model, but so does C11.

    The discussion of Lisp below is more apt, but still not quite there. In my opinion, we need something more like Erlang, but designed to run much closer to the bare metal - a BEAM-as-OS model, if you will - in addition to better vector processing support in modern languages (and in C, for that matter).

    So, yes, you should learn C if you want to hack on UNIX. It might also be a good idea for us as an industry to begin to phase C out. If we start now, perhaps in another 30 years we can shake the legacy of the PDP-11.

    1. 4

      Do modern CPUs have features which are the result of trying to be a PDP-11? What features of C make it particularly suited to a PDP-11?

      1. 7

        Two quick examples. The first is the stack, where data flows in toward the stack pointer with no CPU checks. This led to many crashes and hacks. On Burroughs MCP (1961), operations like that were bounds-checked by the CPU. On MULTICS (1969), the data flowed away from the stack pointer into newly allocated memory. The PDP-11 didn’t use a reverse or otherwise safe stack. Implementing one would cost performance on a slow machine. So, the former MULTICS guys that invented C used a PDP-11-compatible stack that maximized performance. Anything that keeps compatibility with C while maximizing performance does the same thing.

        Another example is null-terminated strings. MULTICS used prefix strings. They’re safer. The PDP-11 used null-terminated strings. It was extremely limited in memory. The C language used them to save memory on PDP-11 and based on Ritchie’s belief that null-terminated strings were superior. So, if accelerating C strings, CPU vendors probably use null-terminated strings instead of bounds-checked prefix strings.
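
        To make the contrast concrete, here is a minimal C sketch of the two layouts. The `pstring` type and the `pstring_copy` and `cstring_length` helpers are hypothetical names for illustration, not the actual MULTICS representation. The only point is that a length prefix lets a copy check bounds before writing, while a null-terminated string has to be scanned to find out how long it is.

        ```c
        #include <assert.h>
        #include <stddef.h>
        #include <string.h>

        /* Hypothetical length-prefixed string: the length travels with the data. */
        struct pstring {
            size_t len;        /* number of bytes in use */
            char   data[16];   /* fixed capacity, chosen only for this example */
        };

        /* Copy n bytes into dst; refuse (return -1) if they would not fit. */
        int pstring_copy(struct pstring *dst, const char *src, size_t n)
        {
            assert(dst != NULL && src != NULL);
            if (n > sizeof dst->data)
                return -1;             /* bounds check is O(1): no scan needed */
            memcpy(dst->data, src, n);
            dst->len = n;
            return 0;
        }

        /* Null-terminated equivalent: the length is rediscovered by scanning. */
        size_t cstring_length(const char *s)
        {
            return strlen(s);          /* walks memory until it finds '\0' */
        }
        ```

        The trade-off is the one described above: the prefix costs a few bytes per string, which mattered on a machine as small as the PDP-11.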

        1. 6

          Have you read C Is Not A Low-Level Language? A lot of my opinion on this topic is based on that essay.

        2. 6

          But why is C the king in this space? Because CPUs are optimized for C just as much as C is optimized for those CPUs.

          Another reason it’s king in this space is that it’s pretty simple to keep a 98% accurate mental model of what the code is going to do in your head while working on it. Sure that last 2% is a bear, but the language itself only takes a few days to understand and model.

          By contrast, a lot of languages that are “better” are extremely difficult to get used to or even model successfully. APL and friends are very concise but have alien notation. SQL is quite readable but the mental model for how a medium-sized query actually runs can be almost impossible without hours of meditation (arguably a feature and not a bug, but still). Ruby has expressive syntax and is remarkably comfortable, but this comes with a large space one must get used to. Learning how to predict what Rust compilers will consider to be valid code is no small thing.

          I think that that’s something people miss–C itself is not a great language, but the marginal cost for picking it up enough to be dangerous (and yes, you in the back, I chose that phrasing knowing full well about the security issues that plague most C codebases) is quite low and the rewards for doing so quite high.

          1. 5

            Your choice of languages here is odd, as none of these are generally suggested as C replacements for the domains in which C really shines, like embedded software, where Ada is the runner-up and Rust is trying to push its way in; OS development, where Rust is beginning to make inroads (yes, only on small projects, but that is due to its immaturity rather than its nature); and high-performance network servers, where Java perhaps even beats out C and Go is a huge contender despite its relative lack of a developed ecosystem.

            1. 3

              APL is a strawman. If we start comparing languages specifically meant for systems programming, Ada has a much clearer syntax.

              Also, I can’t see how learning to predict what a compiler will consider a valid input is of any relevance. The compiler will tell you if the input is valid or not. What’s important is to learn to predict what will work correctly when executed.

            2. 3

              But why is C the king in this space? Because CPUs are optimized for C just as much as C is optimized for those CPUs.

              That is true. However, the real reason was probably that UNIX was implemented in C. UNIX was also designed for the PDP-11 hardware. C took off riding its waves. Lots of people built on it and copied it. Compiler vendors put money into optimizing those apps. The CPU vendors did, too. Eventually the GNU compiler showed up, which probably added to it.

              So, it looks like at least three things starting with UNIX getting people the most out of their cheap minicomputers.

              1. 6

                I absolutely agree - the chicken (or egg ;) ) of this chicken and egg cycle was UNIX, no doubt about it. But the cycle kicked in very shortly thereafter; the emergence of GNU was a consequence of the success of UNIX which was to some degree a consequence of people building hardware for it which was a consequence of it working really well on existing hardware, portably.

              2. -1

                Thank you for that NoraCodes. That was probably the first comment I’ve ever read on a tech board that I 100% agree with.

              3. 11

                Learn it? Yes. Use it? No.

                Last but not least, because C is so “low-level”, you can leverage it to write highly performant code to squeeze out CPU when performance is critical in some scenarios.

                This is true for every systems programming language in existence and is frequently easier to do in other languages.

                1. 1

                  This. The article makes a good argument that you should be able to read C so you can look at the implementation of Unix tools. There is no good argument for writing C in the article.

                  1. 1

                    The problem is that people tend to have limited capacity for remembering things, so they use what they learn. (Or, rather, swiftly un-learn what they never use.) Therefore, an argument for learning X is often the same as an argument for using X.

                  2. 1

                    What are some examples of high-performance code in other systems programming languages?

                    I notice a distinct lack of, say, large-scale number crunching outside of Fortran and C.

                    1. 1

                      Ada and Rust come to mind. Ada’s used in time- and performance-critical applications in aerospace. Rust’s metaprogramming even lets it use multicore, GPUs, etc. better. D supports unsafe if GC is slow. I think Nim does, too, with it compiling to C. People use those for performance-sensitive apps. Those would be the main contenders.

                      One I have no data on is Amiga-E which folks wrote many programs for. On Lisp/Scheme side, PreScheme was basically for making “a C or assembly-level program in Lisp syntax” that compiled to C. It didn’t need any of the performance-damaging features of Lisp like GC’s or tail recursion. Probably comparable to C programs in speed.

                      So, there’s a few.

                      1. 1

                        What are some examples of high-performance code in other systems programming languages?

                        Pretty much anything written in anything. C isn’t magically fast and it’s easy to match or beat it in C++, Rust, D, Nim, …

                        I notice a distinct lack of, say, large-scale number crunching outside of Fortran and C.

                        Fortran, sure. But C? I have a feeling that C++ is much more used for that. CERN basically runs on the stuff. Fortran has the pointer aliasing advantage, but again, any language with templates/generics will generate code that’s just as fast.
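
                        On the aliasing point specifically: C99 added the `restrict` qualifier, which lets the programmer make roughly the same no-overlap promise that Fortran makes for its dummy arguments. A minimal sketch follows; the name `axpy` is borrowed from BLAS convention, but this is an illustration and not the BLAS routine.

                        ```c
                        #include <assert.h>
                        #include <stddef.h>

                        /* y[i] += a * x[i]. `restrict` promises the compiler that x and y do
                         * not overlap, so it may keep values in registers and vectorize the
                         * loop. Calling this with overlapping arrays is undefined behavior. */
                        void axpy(size_t n, double a, const double *restrict x, double *restrict y)
                        {
                            assert(x != NULL && y != NULL);
                            for (size_t i = 0; i < n; i++)
                                y[i] += a * x[i];
                        }
                        ```

                        The promise is unchecked: nothing stops a caller from passing overlapping arrays, so the speed comes at the cost of one more way to get undefined behavior.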

                    2. 21

                      I disagree. The C programming language is directly responsible for countless damning flaws in modern software and can be credited for the existence of the majority of the modern computer security industry.

                      You can write system software in many languages, including Lisp. For a less outlandish example, Ada is specifically designed for producing reliable systems worked on by teams of people in a way that reduces errors at program run time.

                      I find it amusing to mention UNIX fundamentals as a reason to learn C, considering UNIX is the only reason C has persisted for so long anyway. Real operating systems focused largely on interoperation between languages, not funnelling everything through a single one; Lisp machines focused on compilation to a single language, but that language was well-designed and well-equipped, unlike C.

                      Last but not least, because C is so “low-level”, you can leverage it to write highly performant code to squeeze out CPU when performance is critical in some scenarios.

                      It’s actually the opposite. The C language is too high-level to correspond to any single machine, yet too low-level for compilers to optimize for the specific machine without gargantuan mechanisms. It’s easier to optimize logical count and bit manipulations in Common Lisp than in C, because Common Lisp actually provides mechanisms for these things; meanwhile, C has no equivalent to logical count and its bit shifting is a poor replacement for field manipulations. Ada permits specifying the bit structures of data types at a very high level, while continuing to use abstract parts, whereas large C projects float in a bog of text-replacement macros.

                      Those are my thoughts on why C isn’t worth learning, although this is nothing against the author.
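
                      For readers unfamiliar with the operations mentioned above, here is a minimal sketch of what a C programmer writes by hand in standard C99 in place of Common Lisp’s `logcount` and `ldb`. The function names `popcount32` and `extract_field` are hypothetical. Compilers such as GCC and Clang do offer `__builtin_popcount` as an extension, but that is not part of the C99 language.

                      ```c
                      #include <assert.h>
                      #include <stdint.h>

                      /* Count set bits by clearing the lowest set bit until none remain
                       * (Kernighan's method); roughly what logcount gives for
                       * non-negative integers. */
                      unsigned popcount32(uint32_t x)
                      {
                          unsigned count = 0;
                          while (x != 0) {
                              x &= x - 1;   /* clears the lowest set bit */
                              count++;
                          }
                          return count;
                      }

                      /* Extract `width` bits starting at bit `pos`, similar in spirit to
                       * (ldb (byte width pos) x) in Common Lisp. */
                      uint32_t extract_field(uint32_t x, unsigned pos, unsigned width)
                      {
                          assert(width >= 1 && width <= 32 && pos < 32 && pos + width <= 32);
                          uint32_t mask = (width == 32) ? UINT32_MAX
                                                        : ((UINT32_C(1) << width) - 1);
                          return (x >> pos) & mask;
                      }
                      ```

                      For example, `extract_field(0xABCD, 4, 8)` yields `0xBC`, and `popcount32(0xF0)` yields 4.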

                      1. 5

                        Unix-like operating systems aside, are people like D. Richard Hipp or Howard Chu doing it wrong and simply wasting their time, then?

                        1. 12

                          Your question implies an all-or-nothing type of answer. They could be making a bad choice in language while otherwise doing great design, coding, and testing. There’s a lot of talented people who are attracted to C. There’s also sometimes justification such as available time/talent/tooling or just making stuff intended for adoption by C programmers.

                          What few studies have been done always showed C programmers were less productive and their code screwed up more. The language handicaps them. More expressive languages with more safety that are easier for compilers to understand are a better solution.

                          1. 3

                            The question was rhetorical. I.e. for the aforementioned Howard Chu, C was the obvious and only choice to write LMDB in.

                            1. 4

                              Sometimes C simply is the only viable language to write a system in. Databases and programs on microcontrollers with less than 16 KB of ram are such examples, because in those cases every bit of memory counts.

                              Although I would definitely not use C blindly, it is still worth learning. But I do think that it is a bad idea to learn it as your first language.

                              1. 7

                                Forth would probably be an even better choice for a microcontroller with less than 16KB of RAM, to be honest …

                                1. 3

                                  I would argue Ada is just as well suited to microcontrollers with less than 16 kB of RAM – perhaps even more than C is.

                                  1. 3

                                    Only if you can do bitwise operations directly on specific cpu registers. With Ada, these operations are not always available, while C nearly always has them.

                                    They are vital if you want to make a logic output pin high or low.

                                    1. 3

                                      That is no inherent fault of the language, however - as with much of this discussion, the conflation of language with ecosystem obscures meaning. As I mention above, C is only king of the scrap heap because we’re locked in a vicious cycle of building CPUs that execute C better, building compilers that compile better for those CPUs, etc. Similarly we have a vicious cycle of C compatibility in software. Everything is compatible with C because C is compatible with everything.

                                      1. 4

                                        I’d love to agree with you, but then we would both be wrong. Furthermore, this is a very short sighted opinion.

                                        First: There is an astonishing number of CPUs that are mostly designed to be cheap, fast or power efficient, and those are certainly not designed to execute specifically C programs better. If they are designed towards a specific programming related goal, then they are optimized towards executing the most-used instructions in their specific instruction-set. That is, if they are optimized for anything other than cost at all.

                                        Second: It doesn’t matter how you design a CPU, somewhere you’ll have to deal with bits and bits are wires which you pull high or low. You’d also have to pull the data off the CPU at some point in time. The simplest, cheapest and most efficient method of doing so, is by directly tying a wire that goes off-chip into some part of a register in the CPU.

                                        Third: I think that the emergence of C is a consequence of how our technology is built, how it functions and what is most efficient to implement in a silicon die and not the other way around. The reason that C is compatible with everything is probably because it is easy to use and implement for everything. I think this is because there is a deep connection between how electronic circuits work, the way that C is specified and the operations you can perform in C.

                                        I agree with you that the “CPUs are built for C and C is created for CPUs” causation goes both ways, but it is definitely way stronger in the direction of “C is created for CPUs” than the other way around.

                                        Keep in mind that this article is specifically about C as a systems language, therefore we don’t care about C as a language for applications (in fact, once you are out of the systems domain, you’d probably be better off using something else). However, it will be impossible for certain applications to ignore the functioning of their underlying systems down to their (electro-)mechanical levels (e.g. database systems).

                                        1. 2

                                          I’d love to agree with you, but then we would both be wrong.

                                          That’s entirely unnecessary and serves only as an insult. Please don’t.

                                          First: There is an astonishing number of CPUs that are mostly designed to be cheap, fast or power efficient, and those are certainly not designed to execute specifically C programs better.

                                          These are effectively orthogonal concerns. In the embedded space, consider the (relative) failure of the Parallax Propeller compared to the AVR family of microcontrollers. Comparable options exist in the two product lines in terms of power usage, cost, and transistor count, but AVR is a fundamentally serial architecture while Propeller requires the use of multiple threads to take advantage of its transistor count efficiently. A language optimized for this does not have widespread adoption in the embedded space, where aside from Ada and a bit of Rust, C is the absolute king. This is almost certainly a major contributing factor in the relative success of AVR over Propeller (in addition to the wider part range and backing from a major semiconductor manufacturer).

                                          Second: It doesn’t matter how you design a CPU, somewhere you’ll have to deal with bits and bits are wires which you pull high or low. You’d also have to pull the data off the CPU at some point in time.

                                          And you don’t need C to do either of those things - in fact, you can’t do them in “C”. You need someone to write some machine code at some point to enable you to do those things, and if you can package that machine code up into a C library or compiler intrinsic you can package it up into a Rust or Ada library just as well.

                                          Third: I think that the emergence of C is a consequence of how our technology is built, how it functions and what is most efficient to implement in a silicon die and not the other way around.

                                          This is potentially possible, but I suggest you take a look at C Is Not A Low Level Language which discusses the vicious cycle of C and CPU better than I can here.

                                          One reason I don’t think this is true is because there are examples of using existing silicon technology to build non-serial-like computers; GPUs are a huge one, as are FPGAs and other heterogeneous computing technologies. Those fundamentally cannot be programmed like a serial computer, and that makes them less accessible to even many very skilled systems programmers.

                                          it will be impossible for certain applications to ignore the functioning of their underlying systems down to their (electro-)mechanical levels (e.g. database systems).

                                          I hope I didn’t imply that there will ever be a point at which “bare metal engineering” isn’t needed. I’m not saying that low level programming is not essential; I’m saying that you can do low level programming without C in principle, and often even in practice.

                                          1. 2

                                            That’s entirely unnecessary and serves only as an insult. Please don’t.

                                            It wasn’t an insult. It was me stating that I’d love to live in a better world in which I could agree with your viewpoint, but also stating that your viewpoint does not comply with the reality at hand.

                                            Not everything is, or is meant as, an insult, and you’d be wise to assume nothing is an insult until it undeniably is. Nothing I’ve written so far is an insult, and in fact, I’d rather walk away from a discussion before insults are being made. I won’t waste my time on discussions that serve the purpose of reaffirming one’s, or my own, beliefs.

                                            This is almost certainly a major contributing factor in the relative success of AVR over Propeller (in addition to the wider part range and backing from a major semiconductor manufacturer).

                                            I disagree. I think that AVR’s success is mostly due to the fact that in the embedded space, interrupts are more important than multi-threading is. Most embedded jobs simply don’t need multiple threads. It’s not the C language, but economics that is to blame.

                                            And you don’t need C to do either of those things - in fact, you can’t do them in “C”. You need someone to write some machine code at some point to enable you to do those things, and if you can package that machine code up into a C library or compiler intrinsic you can package it up into a Rust or Ada library just as well.

                                            Ah but here’s the problem. You’d need to write some extra machine code to set bits in a certain register. That extra machine code would require extra cycles to be executed.

                                            I’d also like to point out that when you are using C, you don’t need the extra machine code at all! In the embedded- or system-space, you can simply look up the address of a register in the datasheet or description of the instruction set, put that number into your program, treat it as a pointer and then read from or write to the address your pointer is referring to.

                                            So you just don’t need extra machine code in C. You just “input the number and write to that address” in C. This is why it’s king. A lot of other languages simply can’t do that.
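
                                            As a rough illustration of the “input the number and write to that address” idiom, here is a minimal C sketch. The helper names `reg_set_bit` and `reg_clear_bit` are hypothetical, and on real hardware the address would come from the part’s datasheet. Strictly speaking, the result of an integer-to-pointer cast is implementation-defined in the C standard, so this relies on the compiler and the target agreeing about it, which embedded toolchains generally do.

                                            ```c
                                            #include <assert.h>
                                            #include <stdint.h>

                                            /* Set one bit in an 8-bit register located at `addr`.
                                             * `volatile` keeps the compiler from caching or dropping the access. */
                                            void reg_set_bit(uintptr_t addr, unsigned bit)
                                            {
                                                assert(bit < 8);
                                                volatile uint8_t *reg = (volatile uint8_t *)addr;
                                                *reg |= (uint8_t)(1u << bit);
                                            }

                                            /* Clear one bit in the same kind of register. */
                                            void reg_clear_bit(uintptr_t addr, unsigned bit)
                                            {
                                                assert(bit < 8);
                                                volatile uint8_t *reg = (volatile uint8_t *)addr;
                                                *reg &= (uint8_t)~(1u << bit);
                                            }
                                            ```

                                            On a hosted system the same functions can be exercised by passing the address of an ordinary `uint8_t` variable in place of a real register address.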

                                            This is potentially possible, but I suggest you take a look at C Is Not A Low Level Language which discusses the vicious cycle of C and CPU better than I can here.

                                            I’ve read it, but that does not mean that I agree with that viewpoint. I still think that C is a low level language. Mostly because of the “input the address of a register as a pointer and treat it regularly”-approach C has taken. As for the vicious cycle, I’ve stated my thoughts on that in my previous post quite clearly with:

                                            I agree with you that the “CPUs are built for C and C is created for CPUs” causation goes both ways, but it is definitely way stronger in the direction of “C is created for CPUs” than the other way around.

                                            One reason I don’t think this is true is because there are examples of using existing silicon technology to build non-serial-like computers; GPUs are a huge one, as are FPGAs and other heterogeneous computing technologies. Those fundamentally cannot be programmed like a serial computer, and that makes them less accessible to even many very skilled systems programmers.

                                            First of all: GPUs are multiple serial computers in parallel. It doesn’t matter how you look at it, their data-processing is mostly serial and they suffer from all the nastiness that regular serial computers do when you have to deal with concurrency.

                                            Second: FPGAs simply aren’t computers. They are circuits. Programmable circuits that you can use to do computations, but they are circuits nonetheless. Expecting that you can efficiently define circuits with C is like expecting that you can twist a screw in with a hammer: “You might accomplish your goals, but you will have a crude result or a very hard time”.

                                            Third: I’ve been making the argument that C is mainly a consequence of how CPUs work and (mostly, see my above statement that is also in my previous post) not the other way around.

                                            I hope I didn’t imply that there will ever be a point at which “bare metal engineering” isn’t needed. I’m not saying that low level programming is not essential; I’m saying that you can do low level programming without C in principle, and often even in practice.

                                            You did give me the impression that you were implying that bare-metal engineering isn’t needed and you confirmed that impression by stating that “you’d just need to write some machine code” to get hardware level access in other languages. The whole point of C was that you simply have your hardware access (if you know the address, that is) by just inputting the address of the register that communicates with the hardware, without needing extra machine code.

                                            C provides you with a level of abstraction for writing machine code, without needing to know the machine code and without needing extra machine code to accomplish your goals.

                                            That’s why I think that it is a low level language and why I also think that it is still worth learning as a systems language.

                                            PS: I am by no means a fan of C, but I am a fan of using the right tool for each problem, as it makes your life, and the problem, much easier. In the (embedded) systems world, I think that C is often simply the right tool to use.

                                            1. 2

                                              So you just don’t need extra machine code in C. You just “input the number and write to that address” in C. This is why it’s king. A lot of other languages simply can’t do that.

                                              Ada does it better. Ada has attributes for Address and Size, and also permits giving specific meaning to its enumeration types. I’ve never used all of this with Ada, but I believe it would look like this:

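                                              declare
                                                 type Codes is (This, That, Thus);
                                                 for Codes use (This => 1, That => 2, Thus => 17);
                                                 for Codes'Size use 8;
                                                 --  The Address aspect expects a System.Address, not an integer
                                                 --  literal; the enclosing unit needs "with System.Storage_Elements;".
                                                 Register : Codes := This
                                                    with Volatile,
                                                         Address => System.Storage_Elements.To_Address (16#0ABC#);
                                              begin
                                                 ...
                                              end;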
                                              

                                              So, this is a high-level, type-safe, and simple way to do what you just described, but you usually won’t need to do this and so suffer none of the drawbacks.

                                              C is worse than useless, because it deceives people such as yourself into believing it has any value whatsoever or is otherwise at all necessary.

                                              1. 1

                                                Nice! I didn’t know about this. I’ll definitely look into Ada more when I get the chance.

                                                C is worse than useless, because it deceives people such as yourself into believing it has any value whatsoever or is otherwise at all necessary.

                                                And yet I still disagree with you here. There are reasons why C is king and why Ada isn’t.

                                        2. 2

                                          Which is why I’ve encouraged authors of new languages to use its data types and calling conventions with seamless interoperability. It took decades for C to get where it is. Replacing it, if at all, will be an incremental process that takes decades.

                                          Personally, I prefer just developing both new apps in safer languages and compiler-assisted security for legacy C like Softbound + CETS. More cost-effective. Now, C-to-LangX converters might make rewrites more doable. Galois is developing a C-to-Rust tool. I’m sure deep learning could kick ass on this, too.

                                          1. 1

                                            I think it’s not realistic to assume that C will be replaced anytime soon, not even in decades. C will still be around, long after Rust has died.

                                            I also think it’s a pipe dream to assume that other programs can transform C-programs into some safer language while still preserving readability and the exact same behaviour. What you describe has been studied and is known in scientific literature as “automatic program analysis” and is closely related to the halting problem, which is undecidable. This technology can certainly make many advances, but ultimately it is doomed to fail on a lot of cases. We’ve known this since the 1960s. When it fails, you will simply need knowledge about how C works.

                                            Furthermore: Deep learning is akin to “black magic” and people simply hate any form of “magic”. At some point you want guarantees. Most traditional compilers give you those because their lemmas and other tricks are rooted in algebras that have been extensively studied before they are put into practice.

                                            1. 1

                                              “I think it’s not realistic to assume that C will be replaced anytime soon, not even in decades. C will still be around, long after Rust has died.”

                                              I agree. There will probably always be more C than Rust, or at least it will stay that way for a long time.

                                              “it’s a pipe dream to assume that other programs can transform C-programs into some safer language while still preserving readability and the exact same behaviour.”

                                              There’s already several projects that do it by adding safety checks or security features to every part of the C program with risk that their analyses can’t prove safe. So, your claim is already false. The research is currently focused on further reducing the performance penalty (main goal) and covering more code bases. Probably needs a commercial sponsor with employees that stay at it to keep up with the second goal. They already mostly work with many components verifiable if anyone wanted to invest the resources into achieving that.

                                              “Deep learning is akin to “black magic” and people simply hate any form of “magic”. At some point you want guarantees.”

                                              Sure they do: they’re called optimizing compilers and OS’s. They trust all the magic so long as it behaves the way they expect at runtime. For a transpiler, they could validate it by eye, with test generators, and/or with fuzzing against C implementation comparing outputs/behavior. My idea was doing it by eye on a small pile of examples for each feature. Once it seems to work, add that to automated test suite. Put in a bunch of codebases with really different structure or use of the C language, too.
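
                                              The “fuzzing against C implementation comparing outputs” idea can be sketched as a small differential test. Everything here is a stand-in: `reference_abs_diff` plays the original C routine, `candidate_abs_diff` plays the translated version, and both names, like the generator constants, are assumptions made for illustration. Agreement on random inputs is evidence, not proof, that the two behave the same.

                                              ```c
                                              #include <assert.h>
                                              #include <stdint.h>

                                              /* Stand-in for the original C routine. */
                                              uint32_t reference_abs_diff(uint32_t a, uint32_t b)
                                              {
                                                  return (a > b) ? a - b : b - a;
                                              }

                                              /* Stand-in for the translated routine under test. */
                                              uint32_t candidate_abs_diff(uint32_t a, uint32_t b)
                                              {
                                                  uint32_t d = a - b;            /* wraps modulo 2^32 when a < b */
                                                  return (a >= b) ? d : (uint32_t)(0u - d);
                                              }

                                              /* Small linear congruential generator so runs are reproducible. */
                                              static uint32_t next_random(uint32_t *state)
                                              {
                                                  *state = *state * 1664525u + 1013904223u;
                                                  return *state;
                                              }

                                              /* Feed both versions the same inputs; return how many outputs differ. */
                                              unsigned differential_test(uint32_t seed, unsigned iterations)
                                              {
                                                  assert(iterations > 0);
                                                  unsigned mismatches = 0;
                                                  uint32_t state = seed;
                                                  for (unsigned i = 0; i < iterations; i++) {
                                                      uint32_t a = next_random(&state);
                                                      uint32_t b = next_random(&state);
                                                      if (reference_abs_diff(a, b) != candidate_abs_diff(a, b))
                                                          mismatches++;
                                                  }
                                                  return mismatches;
                                              }
                                              ```

                                              A nonzero return value from `differential_test` would flag a behavioral difference worth inspecting by eye before the case is added to an automated suite.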

                                              1. 1

                                                Sure they do: they’re called optimizing compilers and OS’s. They trust all the magic so long as it behaves the way they expect at runtime. For a transpiler, they could validate it by eye, with test generators, and/or with fuzzing against the C implementation, comparing outputs/behavior. My idea was doing it by eye on a small pile of examples for each feature. Once it seems to work, add that to the automated test suite. Put in a bunch of codebases with really different structure or use of the C language, too.

                                                • In a lot of areas, comparing it by eye and testing with fuzzers is simply not going to fly (sometimes in the most literal sense of the word fly).
                                                • An automated test suite with tons of tests can also slow development down. I’m all for tests, but I am against mindlessly adding a test for each and every failure you’ve encountered.
                                                • What operating systems do is explainable with relative ease. What most deep-learning systems do is not. If you want guarantees for 100% of all cases, deep learning is immediately out of the picture.

                                                The research is currently focused on further reducing the performance penalty (main goal) and covering more code bases.

                                                Herein lies the problem. These tools cover some, but not all codebases. We have known for almost a century that a tool that covers all possible codebases is impossible to construct. See the “Common pitfalls” section on the halting problem on Wikipedia for a quick introduction. You will see that my argument is not false and will still hold, and that means that it is still useful to learn C (which is the main topic under discussion here).

                                                1. 2

                                                  The by eye, feature by feature testing, and fuzzing in my comment were for the transpiler’s development, not normal programs. You were concerned about its correctness. That would be my approach.

                                                  I’m not buying the halting problem argument since it usually doesn’t apply. It’s one of the most overstretched things in CompSci. The static analyses and compiler techniques work on narrow analyses for specific traits vs the broader goal the halting problem describes. They’ve been getting lots of results on all kinds of programs. If the analysis fails or is infeasible, the tools just add a runtime check for that issue.
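
                                                  To make the runtime check idea concrete, here is a minimal sketch, assuming a tool that instruments array accesses it can’t prove in bounds. The names (`bounds_trap`, `get_checked`, and so on) are hypothetical and not taken from any particular tool.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical trap handler; a real tool would supply its own. */
static void bounds_trap(size_t index, size_t len) {
    fprintf(stderr, "bounds violation: index %zu, length %zu\n", index, len);
    abort();
}

/* Original code: nothing here proves that i < len. */
int get_unchecked(const int *buf, size_t len, size_t i) {
    (void)len;
    return buf[i];
}

/* What an instrumenting tool might emit when it cannot prove i < len:
   the same access, guarded by an inserted check. */
int get_checked(const int *buf, size_t len, size_t i) {
    if (i >= len) {
        bounds_trap(i, len);
    }
    return buf[i];
}

/* Here the loop condition already establishes i < len, so an analysis
   can skip the check and the loop pays no extra cost. */
long sum_all(const int *buf, size_t len) {
    long total = 0;
    for (size_t i = 0; i < len; i++) {
        total += buf[i];
    }
    return total;
}
```

                                                  The performance penalty comes from the checks that can’t be removed, which is why reducing it means proving more accesses safe ahead of time.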

                                                  1. 1

                                                    The by eye, feature by feature testing, and fuzzing in my comment were for the transpiler’s development, not normal programs. You were concerned about its correctness. That would be my approach.

                                                    Formal verification is the route the real language and compiler-development teams take (See clang and ghc for example). Fuzzing is something they use, but usually as an afterthought.

                                                    I’m not buying the halting problem argument since it usually doesn’t apply. It’s one of the most overstretched things in CompSci. The static analyses and compiler techniques work on narrow analyses for specific traits vs the broader goal the halting problem describes. They’ve been getting lots of results on all kinds of programs. If the analysis fails or is infeasible, the tools just add a runtime check for that issue.

                                                    Fair enough, but I’d still like to point out that throwing out the “halting problem” argument because it usually doesn’t apply, and stating “we have to make sure something works on all kinds of codebases”, are two polar opposite methods of reasoning.

                                                    If you are reasoning like this: “Okay, we know this is impossible because of the results Turing provided us about the halting problem, but let’s see how close we can get to perfection”, or “let’s see if we can build something useful for 80% of cases”, then I approve of the approach and then I’ll agree. In this case you probably would also agree with me that there is still value in learning C as a systems language.

                                                    But if your reasoning is along the following lines: “Look, this works on nearly all codebases in practice and therefore we don’t have to learn C as a systems language”, then you are simply dead wrong. It’s in the last 5 or 10% that algorithms, ideas and projects fail, not in the easy first 80%.

                                                    You really require Feynman’s “kind of utter honesty with yourself” when discussing these kinds of topics, because it is very easy to fool yourself into believing in some favourable picture of an ideal where technology or skill x is not needed any more.

                                                    1. 1

                                                      “Formal verification is the route the real language and compiler-development teams take (See clang and ghc for example).”

                                                      They don’t use mathematical verification for most compilers. Only two that I know of in the past few years. I’m not sure what V&V methods most compiler teams use. I’d guess they use human review and testing. Maybe you meant formal as in organized reviews. What you said about fuzzing is easily proven true given all the errors the fuzzers find in… about everything.

                                                      “ stating “we have to make sure something works on all kinds of codebases””

                                                      You keep saying all. Then, you argue with your own claim like I said it. I said “Put in a bunch of codebases with really different structure or use of the C language, too.” As in, keep testing it on different kinds of code bases to improve its general applicability. We don’t have to eliminate all C or C developers out there. That I thought so was implied by my advocating compiler techniques for making the remaining C safer.

                                                      “or “let’s see if we can build something useful for 80% of cases”, then I approve of the approach and then I’ll agree. In this case you probably would also agree with me that there is still value in learning C as a systems language.”

                                                      Basically that. Except, like HLL’s vs C FFI’s, I’m wanting the number of people that need that specific low-level knowledge to go down to whatever minimum is feasible. People that don’t need to deal with internals of C code won’t need to learn C as a systems language: just how to interface with it. People that do need to rewrite, debug, and/or extend C code will benefit from learning C.

                                                      If a competing language succeeds enough, it might come down to a tiny number of specialists, or just folks cross-trained in it, who can handle those issues. Much like assembly, proof engineering, and analog are today.

                                                      “You really require Feynman’s “kind of utter honesty with yourself” “

                                                      I had to learn that lesson about quite a few things. My claims have quite a few qualifiers with each piece proven in a field project. I’m doing that on purpose since I’ve been burned on overpromising in languages, verification, and security before. I don’t remember if I read the essay, though. It’s great so far. Thanks for it.

                                                      1. 2

                                                        I’m wanting the number of people that need that specific low-level knowledge to go down to whatever minimum is feasible.

                                                        I fully agree with that goal, but I question whether or not you are still at a “systems-level” when you can ignore the specific low-level knowledge.

                                                        People that don’t need to deal with internals of C code won’t need to learn C as a systems language: just how to interface with it.

                                                        I guess, our whole argument boils down to my “belief” that if you are only interfacing with C, you probably have left the systems-domain behind already, because you are past the level where you need to be aware of what your bits, registers, processors, caches, buffers, threads and hard drives are doing.

                                                        If you are on that level, then I totally agree with you that you should use something other than C, unless your utility is run millions of times per day all around the world.

                                                        I’m doing that on purpose since I’ve been burned on overpromising in languages, verification, and security before.

                                                        What I want you to take away from this discussion is something similar: You should not “over-blame” C for all kinds of security vulnerabilities (amongst other issues). I agree that the language has certain aspects that make it more inviting to all kinds of issues. In fact I even dare to go as far as to state that C is a language that not just “invites” issues, but that it almost “evokes” those issues.

                                                        However I also think that the business processes that cause the issues and vulnerabilities which are often attributed to C, are an even bigger (security) problem than C in and of itself is.

                                                        I don’t remember if I read the essay, though. It’s great so far. Thanks for it.

                                                        You’re welcome! I’m glad you like it.

                                                        I’ve also posted the essay as a story. I was surprised that it wasn’t already on here.

                                                        1. 2

                                                          re low level knowledge

                                                          I think I should be more specific here. We are talking about systems languages. The person would need to understand a lot of concepts you mentioned. They just don’t use C to do it. If anything, not using C might work even better since it targets an abstract machine that’s sort of like current hardware and in other ways (esp parallel stuff) nothing like it. Ada w/ ParaSail or Rust with its parallelizers might represent the code in a way that keeps systems properties without C’s properties. So, they still learn about this stuff if they want things to perform well.

                                                          From there, they might need to integrate with some C. That might be a syscall. That might be easy to treat like a black box. Alternatively, they might have to call a C library to reap its benefits. If it’s well-documented w/ good interface, they can use it without knowing C. If it’s not, they or a C specialist will have to get involved to fix that. So, in this situation, they might still be doing systems programming considering low-level details. They just will need minimal knowledge of C’s way of doing it. That knowledge will go up or down based on what C-based dependencies they include. I hope that makes more sense.

                                                          re business processes

                                                          The environment and developers are an important contributor to the vulnerabilities. That C has enough ways to trip you up that even the best developers do makes me put more blame on its complexity in terms of dodging landmines. I still put plenty of blame on the environment, given quality-focused shops have a much lower defect rate. Especially if they use extensive tooling for bug elimination. You could say a major reason I’m against most shops using C is because I know those processes and environments will turn it into many liabilities that will be externalized. Anything that lowers the number of vulnerabilities, or of severe ones, in bad environments can improve things. At the least, memory-safe languages turn it from attackers owning your box to just crashing it or leaking things. Some improvement given updates are easier if I still control the box. ;)

                            2. 12

                              Your reasons to not learn C are… mostly irrelevant. You may not think unix is a “real” operating system, but it’s the most widely used OS family outside of seriously low-powered embedded stuff. Nobody programs for lisp machines. You could write ada, sure, but I don’t imagine there’s a vibrant open source library ecosystem for it.

                              In your preferred world, where everyone uses lisp machines or what you consider “real” operating systems, you are right, but we’re not in that world. If people are going to continue using unixes, learning C will continue to be worthwhile even if you prefer writing most new code in something better like rust or ada or lisp.

                              1. 3

                                You could write ada, sure, but I don’t imagine there’s a vibrant open source library ecosystem for it.

                                I’m assuming you mean that this vibrancy exists for C libraries? Which makes the statement odd, because calling a C library from Ada is trivial.

                                (And no, you don’t lose all the advantages of Ada by calling libraries written in C. Any argument to that effect is so off the mark I would have trouble responding to it.)

                                1. 2

                                  I wouldn’t say you lose all the advantages of Ada by calling C libraries. However, if you’re calling C libraries, you should have at least a rudimentary understanding of C, right? If anything, using C libraries from Ada is a big argument in favor of learning C, isn’t it? Not because you necessarily should use C in a project, but just because you need to be able to read the documentation, know how to translate a C function signature into a call to that function from Ada, know what pitfalls you can fall into if you’re calling the C code incorrectly, know how to debug the segfaults you’ll inevitably encounter, know how to resolve linker issues, etc. Maybe you’ll even have to write a tiny wrapper in C around another library if that library is particularly happy to abuse macros or if you need to do something stupid like use setjmp/longjmp to deal with errors (like libjpeg requires you to do).
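
                                  For anyone who hasn’t run into the setjmp/longjmp thing, here is a minimal sketch of that kind of wrapper. The library below is a hypothetical stand-in, not libjpeg’s actual API: `lib_ctx`, `lib_decode`, and the `wrap_*` names are all invented. The shape is the usual one, though: the library calls an error hook that must not return, so the wrapper jumps out of it and turns the failure into an ordinary return code that another language can check.

```c
#include <setjmp.h>
#include <stddef.h>

/* Stand-in for a hypothetical library whose error hook must not return. */
struct lib_ctx {
    void (*error_exit)(struct lib_ctx *ctx);
};

static int lib_decode(struct lib_ctx *ctx, const unsigned char *data, size_t len) {
    if (data == NULL || len == 0) {
        ctx->error_exit(ctx); /* the library assumes this never returns */
    }
    return (int)data[0];
}

/* Wrapper state: the library context plus a jump target. */
struct wrap_ctx {
    struct lib_ctx base; /* must be first so the pointer cast below is valid */
    jmp_buf on_error;
};

/* Error hook installed by the wrapper: jump back into wrap_decode. */
static void wrap_error_exit(struct lib_ctx *ctx) {
    struct wrap_ctx *w = (struct wrap_ctx *)ctx;
    longjmp(w->on_error, 1);
}

/* The function you would actually bind from Ada, Python, etc.
   Returns 0 on success and stores the result in *out; returns -1 on failure. */
int wrap_decode(const unsigned char *data, size_t len, int *out) {
    struct wrap_ctx w;
    w.base.error_exit = wrap_error_exit;
    if (setjmp(w.on_error) != 0) {
        return -1; /* arrived here via longjmp from the error hook */
    }
    *out = lib_decode(&w.base, data, len);
    return 0;
}
```

                                  The caller on the other side of the FFI only ever sees `wrap_decode` and its return code, but somebody still had to know enough C to write it.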

                                  1. 1

                                    Sure, but “C still deserves learning once you know Ada” is rarely how it goes.

                                    1. 3

                                      I don’t know what previous experience you have with being coerced to learn C but the article we’re commenting on here just said that modern C is very different from old C, that knowing C is really useful to be able to study the vast amount of open source C code in your operating system (assuming that’s something unixy), and that there may be times when you want to write something in C for the performance. I’d say “C still deserves learning once you know Ada” is perfectly consistent with those points.

                                      I’ll admit you do have a point regarding the performance thing; if you know Ada, you may not need C to write performant code. However, I’d bet it’s vastly easier to use a C library from, say, Python, than it is to use an Ada library from Python, so even if you know Ada, writing performance-critical code in C still makes sense in certain fairly common circumstances.

                                      1. 3

                                        The biggest difficulty with using Ada libraries from Python is that you have two languages with expressive type systems talking through a third one without, since the OS API is that of C. You have to either come up with a way of encoding and decoding complex data, or reduce the API of the library to the level of expressiveness afforded by C.

                                        To study old code, knowing modern C is not enough, so the point about modern C becomes moot there.

                                        Also, this is a shameless plug, but I have a real life demonstration of what calling C and assembly from Ada actually looks like. Calling C: https://github.com/dmbaturin/hvinfo/blob/master/src/hypervisor_check.adb#L108-L121 Calling x86 assembly: https://github.com/dmbaturin/hvinfo/blob/master/src/hypervisor_check.adb#L22-L36

                                    2. 1

                                      If anything, using C libraries from Ada is a big argument in favor of learning C, isn’t it?

                                      Perhaps, but you’re advocating that people should learn more modern languages first and do the majority of their programming in those languages, which is not what /u/nanxiao suggested at all.

                                      1. 2

                                        Isn’t it though? The article is saying you can squeeze out some extra performance from C, that C really isn’t as bad as it used to be, and that being able to read the source of your operating system and tools is beneficial; “do most of your programming in your preferred language, but know how to read and write C when that turns out to be necessary” seems perfectly consistent with that, doesn’t it?

                                        1. 2

                                          I suppose so; however, the post seems to be mostly engaging with users of the Rust and Go languages, which are cast by the author as direct competitors to C (their success causing it to be “ignored”).

                                          In any case I definitely think this is the right approach.

                                      2. 1

                                        “However, if you’re calling C libraries, you should have at least a rudimentary understanding of C, right?”

                                        Nope. The whole point of putting it behind a function call, aka abstraction, is not understanding the internals. Instead, you should just know about the safety issues (i.e. responsible use), what data goes in the function call, and what is returned. You should just have to understand the interface itself.

                                        The other stuff you mentioned is basically working with the C code inside. That requires knowing about C or at least assembly.

                                        1. 3

                                          I don’t think that’s the only interpretation. If you have to build the abstraction at the ffi level yourself, then you generally want to be familiar with C. In many cases, the docs aren’t good enough to cover all of the cases where UB is expressed through the API, so you wind up needing to read the source.

                                          1. 1

                                            True in practice.

                                  2. 3

                                    Lisp machines focused on compilation to a single language, but that language was well-designed and well-equipped, unlike C.

                                    We are back in the mainframe age where you can use whichever language you please on the server side. Before we write an alternate history where lisp machines won, how many major players have succeeded with lisp and stuck with it? Biggest I remember was Reddit, and I’m pretty sure they eventually re-wrote everything in Python.

                                    Would also add that as much as I enjoy lisp, I’d never even consider it for something that had the potential to get big. If I have to start dropping newbies into my code, I want types and compile-time warnings.

                                    1. 4

                                      how many major players have succeeded with lisp and stuck with it?

                                      ITA famously uses Lisp.

                                      There are attempts to catalog production uses of Lisp, but they’re almost certainly out of date, incorrect, and don’t highlight companies anyone has really heard of.

                                      There’s no intention in arguing, just providing the list that I know of in direct answer to your question.

                                      1. 4

                                        If I have to start dropping newbies into my code, I want types and compile-time warnings.

                                        Common Lisp has types and compile-time warnings. It has the ability to treat warnings as errors, too. It’s a pretty great language, all things considered.

                                        1. 1

                                          I remember there was some macro to annotate a parameter for type checking, but its behavior was left unspecified by the standard. Has that blank been filled in since then? Or is there another facility that I’ve overlooked entirely?

                                        2. 3

                                          Look at Franz and LispWorks customer pages for an answer to that. Quite a range of uses. Most seem to like its productivity and flexibility. It’s definitely a tiny slice of language users, though.

                                      2. 1

                                        C has the great virtue that most computers have a C compiler installed or can get one from the vendor, and by typing:

                                        % clang -o thing thing.c # or gcc if you must.

                                        You get a binary that’ll run on that platform. If you wrote your code sanely, you can recompile it on every major platform with no changes. If you find a place where a platform’s different, you can add #if to fix that.
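
                                        As a small illustration of the #if approach (a sketch only; `PATH_SEPARATOR` and `base_name` are names made up for this example, while `_WIN32` is the macro Windows compilers predefine):

```c
#include <stddef.h>

/* Pick the platform's path separator at compile time. */
#if defined(_WIN32)
#  define PATH_SEPARATOR '\\'
#else
#  define PATH_SEPARATOR '/'
#endif

/* Returns a pointer to the file-name part of path (the text after the
   last separator), or path itself if it contains no separator. */
const char *base_name(const char *path) {
    const char *last = path;
    for (const char *p = path; *p != '\0'; p++) {
        if (*p == PATH_SEPARATOR) {
            last = p + 1;
        }
    }
    return last;
}
```

                                        The rest of the program calls `base_name` and never has to know which branch was compiled in.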

                                        Contrast with every other systems language:

                                        • Do you have a compiler installed?
                                        • Is it portable?
                                        • Can you even fix non-portable areas?
                                        • Can you make a binary an end-user can run?
                                        • Can you call other libraries?

                                        In C, the answer is always yes. In any other, it’s “probably not, or only with great effort”.

                                        Someone was recommending Ada. It’s not in standard gcc. You have to install something called “GNAT”:

                                        % sudo port install gnat-gcc
                                        #  Ada is self hosted (http://en.wikipedia.org/wiki/Self-hosting)  #
                                        #  You need to install an existing Ada compiler and then choose    #
                                        #  an appropiate variant. For more info use:                       #
                                        #  port variants gnat-gcc                                          #
                                        % sudo port variants gnat-gcc
                                        gnat-gcc has the variants:
                                           ada: Uses the MacPorts Ada (https://www.macports.org/) compiler to bootstrap!
                                           gnatgpl: Uses GNAT/GPL compiler (http://libre.adacore.com) to bootstrap!
                                           gnuada: Uses the GnuAda (http://gnuada.sourceforge.net/) compiler to
                                                   bootstrap!
                                           macada: Uses MacAda compiler (http://www.macada.org) to bootstrap!
                                           odcctools: Use the odcctools instead of the system provided ones - does not
                                                      work for x64 currently!
                                        

                                        I dunno what to do here. MacAda hasn’t been updated since OS X 10.4, which was 2005. I don’t see any other ada in MacPorts. At this point I gave up and uninstalled everything. It’s not a working language.

                                        C isn’t safe, but it’s fast. It’s pretty hard to use safely on large projects, but for small tools with static data buffers it’s fine. You take some tradeoffs for any language.

                                        I like Scheme, and there’s a ton of other good high-level languages. But if you need to write at the lowest level, and make it work everywhere, C’s the only rational choice.

                                        1. 1

                                          Check out ZL if you like C and Scheme. It’s an academic prototype that was meant for exploration, not production use. You might still find it interesting given it does a C++-like language in Scheme that compiles to C.

                                          PreScheme was another C alternative that compiled to C. It could probably be mocked up in Racket if someone hasn’t already.

                                          1. 1

                                            I already use Chicken Scheme, which compiles to C and lets you have inline C. It is a production language with a ton of libraries.

                                            And it can only exist because a C compiler is so basic to every modern platform.

                                            1. 1

                                              Cool. Yeah, C compilers are entangled with every platform now.