1. 25

  2. 26
    • This reads like an ad for your company.

    • I really have to disagree on the Dragon Book. It is the “canonical text” in the sense that it is the only book that people who don’t have a background in PL/compilers have heard of (not saying that this is you, necessarily). I think Modern Compiler Implementation in ML is a much better book. For deeper dives, see Types & Programming Languages, the GC Handbook, Parsing Techniques: A Practical Guide, and Optimizing Compilers for Modern Architectures.

    • Eve isn’t on the frontier of PL. If you want to follow PL research, read {PLDI, POPL, ICFP, OOPSLA}.

    1. 9

      Thanks for the feedback!

      • A 2000 word ad? Sorry you didn’t get any value out of it, perhaps you’re not the target audience.
      • We have used the Appel book too but have had more success with the Dragon book. I think a lot of people have, which is why it has remained such a popular suggestion for undergrads. As I mentioned in the article, though, it has some thoughtful detractors, and you appear to be one of them.
      • The frontier of PL is not the frontier to which I was referring.

      Thanks again :)

      1. 7

        > A 2000 word ad? Sorry you didn’t get any value out of it, perhaps you’re not the target audience.

        I’m definitely not the target audience, but I care a lot about getting people interested in PL and compilers. Sorry, I think I was a bit flippant.

        > We have used the Appel book too but have had more success with the Dragon book.

        I imagine that the dynamics are different in a classroom vs self-study. The latest edition of the Dragon Book is 1000 pages and spends a significant amount of time on things that aren’t appropriate for a newcomer to compilers. There are over 200 pages dedicated to parsing, for example. In a classroom setting you can pick and choose for your syllabus, but someone doing self-study won’t have the background necessary to know what is important and what isn’t. This is why I prefer recommending less overwhelming texts and supplementing them with advanced, subject-specific texts. In my experience the advanced texts are better references later in your career, too.

        The other problem I have with the Dragon Book is that it spends almost no time on type checking, which is a major part of many compiler front-ends. Appel’s book spends time on it, and the supplemental TAPL/ATTAPL provide a much more thorough introduction. Do you cover this in your course?

        1. 4

          Yes I totally agree. So do the authors; they strongly recommend aiming to cover the content over two quarters or even semesters, so it’s certainly a book that benefits from guided study. I should revise the article to make this clear, thank you.

          There is some content on type checking in the book, and yes we do visit it in the course. I can see why the coverage in the book would be unsatisfying for somebody whose particular interest is in types, but our students seem to find it sufficient. That may be because the languages they use day to day tend not to have rich type systems.

        2. 1

          The creator of Elixir used the Dragon book to help him improve his understanding of PL and compilers. Then he went on to create a fantastic programming language.

          1. 5

            I don’t see what this comment has to do with its parent. cmm did not say the Dragon Book was useless. José was already an educated and experienced programmer long before he decided to create Elixir, so any faults in the Dragon Book did not affect him.

        3. 4

          I’ll add that a meta-analysis of recommendations on HN and Amazon reviews had me going with Modern Compiler Implementation in C by Appel, followed by Practical Compiler Construction by Holm for optimizations and the like. I’m still up for feedback on the best books for getting people started on foundations, especially for mainstream compilers.

          1. 6

            I too would recommend the Modern Compiler Implementation in _ series by Appel, but would strongly recommend the ML edition over the C edition. I’ve had both, and the C version feels like a lossy translation from the original ML – ML has many relevant abstractions for language implementation that C lacks, so the ML edition can spend more time covering the material. (And for what it’s worth, I’m more familiar with OCaml than SML, but had no problem following it.)

            I also frequently recommend Compiler Construction by Niklaus Wirth, before the Appel books. While Appel’s book has more depth, Wirth’s is focused on hitting the ground running: learning a couple of simple techniques and getting something working end-to-end. In about a hundred pages, it builds up a lexer, recursive-descent parser, bytecode compiler, and virtual machine for Oberon, a descendant of Pascal. Some people might find seeing how everything fits together more helpful than spending what feels like hundreds of pages on parsing without a usable system in sight. There’s a free PDF online.
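
            To make that end-to-end shape concrete, here is a minimal sketch in TypeScript of the same four stages for arithmetic expressions only. It is not code from Wirth’s book and does not handle Oberon; the instruction set and every name in it (lex, compile, run, Instr) are invented for illustration.

            ```typescript
            // Hypothetical toy pipeline: lexer -> recursive-descent parser that emits
            // bytecode -> stack VM. Arithmetic only; all names are invented.
            type Token =
              | { kind: "num"; value: number }
              | { kind: "op"; text: string }; // one of + - * / ( )

            type Instr =
              | { op: "push"; value: number }
              | { op: "add" } | { op: "sub" } | { op: "mul" } | { op: "div" };

            // Lexer: turns source text into a flat list of tokens.
            function lex(src: string): Token[] {
              const tokens: Token[] = [];
              let i = 0;
              while (i < src.length) {
                const c = src.charAt(i);
                if (c === " ") { i++; continue; }
                if (c >= "0" && c <= "9") {
                  let j = i;
                  while (j < src.length && src.charAt(j) >= "0" && src.charAt(j) <= "9") j++;
                  tokens.push({ kind: "num", value: Number(src.slice(i, j)) });
                  i = j;
                } else if ("+-*/()".indexOf(c) >= 0) {
                  tokens.push({ kind: "op", text: c });
                  i++;
                } else {
                  throw new Error(`unexpected character '${c}'`);
                }
              }
              return tokens;
            }

            // Recursive-descent parser that emits bytecode as it parses (single pass).
            // Grammar: expr   = term   { ("+" | "-") term }
            //          term   = factor { ("*" | "/") factor }
            //          factor = number | "(" expr ")"
            function compile(src: string): Instr[] {
              const tokens = lex(src);
              const code: Instr[] = [];
              let pos = 0;

              // Returns the operator text at the current position, if any.
              const peekOp = (): string | undefined => {
                const t = tokens[pos];
                return t !== undefined && t.kind === "op" ? t.text : undefined;
              };

              function factor(): void {
                const t = tokens[pos++];
                if (t === undefined) throw new Error("unexpected end of input");
                if (t.kind === "num") {
                  code.push({ op: "push", value: t.value });
                } else if (t.text === "(") {
                  expr();
                  if (peekOp() !== ")") throw new Error("expected ')'");
                  pos++;
                } else {
                  throw new Error(`unexpected '${t.text}'`);
                }
              }

              function term(): void {
                factor();
                for (let o = peekOp(); o === "*" || o === "/"; o = peekOp()) {
                  pos++;
                  factor();
                  code.push(o === "*" ? { op: "mul" } : { op: "div" });
                }
              }

              function expr(): void {
                term();
                for (let o = peekOp(); o === "+" || o === "-"; o = peekOp()) {
                  pos++;
                  term();
                  code.push(o === "+" ? { op: "add" } : { op: "sub" });
                }
              }

              expr();
              if (pos !== tokens.length) throw new Error("trailing input");
              return code;
            }

            // Stack-based virtual machine: executes the emitted bytecode.
            function run(code: Instr[]): number {
              const stack: number[] = [];
              const pop = (): number => {
                const v = stack.pop();
                if (v === undefined) throw new Error("stack underflow");
                return v;
              };
              for (const ins of code) {
                if (ins.op === "push") { stack.push(ins.value); continue; }
                const b = pop(); // right operand was pushed last
                const a = pop();
                switch (ins.op) {
                  case "add": stack.push(a + b); break;
                  case "sub": stack.push(a - b); break;
                  case "mul": stack.push(a * b); break;
                  case "div": stack.push(a / b); break;
                }
              }
              return pop();
            }
            ```

            Calling run(compile("1 + 2 * 3")) compiles to push 1, push 2, push 3, mul, add and evaluates to 7. Emitting code directly from the parser, with no AST in between, is a simplification chosen here to keep the sketch short.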

            1. 2

              I used to recommend the Wirth book, as I started with it when doing my BASIC variants. The reason I stopped is that he seems to do the opposite of what just about any mainstream compiler would do, from his requirements to the implementation. You get a start on a foundation you have to throw away in favor of another foundation. This led me to look for the simplest book to start with that explains passes and techniques that step someone toward understanding existing compilers.

          2. 2

            But in the post he recommends the Terence Parr book. I have this book and I think it’s a pretty good intro too.

            (I dislike some things about Java, but the Dragon book also uses Java in new editions, and honestly Java has advantages over C and C++ for prototyping languages.)

            1. 4

              I’m not familiar with the Parr book but I skimmed it just now. This ultimately appears to be a tutorial on ANTLR and the visitor pattern in Java. You make an interpreter but ultimately this is a trivial application of the visitor pattern. This is nearly 400 pages long, but where’s the beef? There’s no content about language semantics, SSA form, flow analyses, garbage collection, etc. I’m not sure what relation this book has to MCI or the Dragon Book, and couldn’t see recommending it to someone who wanted to learn about languages.

              > (I dislike some things about Java, but the Dragon book also uses Java in new editions, and honestly Java has advantages over C and C++ for prototyping languages.)

              I can’t imagine teaching compiler fundamentals in a language without sum types. You may not end up using a language like ML/Haskell/Rust in the real world, but at least while learning you won’t be in Visitor Pattern Hell™.

              1. 4

                The first language book should just be writing a lexer, parser, AST, and tree walking interpreter… the Parr book is totally fine for that. And it has a lot of other stuff you didn’t mention, like static typing, which he builds up nicely in a few steps.

                SSA, flow analysis, GC, are all their own topics beyond that. People need to write a few thousand lines of basic code before moving onto those topics.

                I might choose SICP over that book to learn about languages because it gets to the essence a little faster (and that’s what I did 20 years ago in school). Although honestly skipping lexing and parsing is bad IMO.

                I like OCaml, but for a lot of programmers, especially working ones, learning ML at the same time as learning programming languages is probably more extra work. Although I agree you should do it eventually.

                Yeah, the visitor pattern seems overblown, but Parr mentions the “external visitor pattern” in one of his books (I’m pretty sure it is that one). All it does is duplicate type information with an enum and use a switch statement, pretty much like you would in ML. I noticed that the TypeScript compiler has a heterogeneous AST and uses tons of duplicate switch statements. So it’s more ML style than Java style, even though TypeScript is pretty similar to Java or C#.

                I honestly have never used the visitor pattern successfully… I don’t get the problem with a switch statement over data, and other “experts” do that too. It’s supposedly more functional than OOP, but it works just fine in OO languages.
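
                For anyone who hasn’t seen the tag-plus-switch style, here is a minimal sketch in TypeScript. The node kinds and function names are invented for illustration; this is not taken from Parr’s book or from the TypeScript compiler’s source.

                ```typescript
                // Hypothetical toy AST: each node carries an enum tag that duplicates
                // its type, so functions can switch on the tag instead of using visitors.
                enum Kind { Num, Add, Neg }

                interface Num { kind: Kind.Num; value: number }
                interface Add { kind: Kind.Add; left: Expr; right: Expr }
                interface Neg { kind: Kind.Neg; operand: Expr }
                type Expr = Num | Add | Neg;

                // One operation = one function with one switch over the tag.
                function evaluate(e: Expr): number {
                  switch (e.kind) {
                    case Kind.Num: return e.value;
                    case Kind.Add: return evaluate(e.left) + evaluate(e.right);
                    case Kind.Neg: return -evaluate(e.operand);
                  }
                }

                // Adding a second operation means writing another switch,
                // without touching the node definitions.
                function show(e: Expr): string {
                  switch (e.kind) {
                    case Kind.Num: return String(e.value);
                    case Kind.Add: return `(${show(e.left)} + ${show(e.right)})`;
                    case Kind.Neg: return `-${show(e.operand)}`;
                  }
                }

                // 2 + (-5)
                const sample: Expr = {
                  kind: Kind.Add,
                  left: { kind: Kind.Num, value: 2 },
                  right: { kind: Kind.Neg, operand: { kind: Kind.Num, value: 5 } },
                };
                ```

                Here evaluate(sample) gives -3 and show(sample) gives "(2 + -5)". The trade-off is the usual one: new operations are cheap to add, while a new node kind means revisiting every switch.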

          3. 7

            I just read the Design and Evolution of C++ (which he also recommends) based on a recommendation from someone who said they gave up on C++ but loved the book. I tend to agree: I have mixed feelings about C++ at best, but I have to admit that Bjarne has a good reason for everything. He understands things you don’t. If you were to design C++ with the same goals, you would probably do worse. And his goals are good ones.

            1. 6

              Consider a Scheme resource like How to Design Programs for its combination of easy syntax modification and DSLs. Additionally, look at OMeta, TXL, or the original paper on Meta II for transformation-based languages, and Paul Morrison’s work on flow-based programming. Hit them with Datalog or Mercury on the declarative side, on top of whatever foundational logic they learn, so they see its practical power. A formal language like Coq with Chlipala’s Programming with Proofs might be icing on the cake, so they can see programs get engineered from precise specifications with provable correctness. That should come in handy in debates on whether programming is engineering or not. ;) A lightweight version of that would be SPARK Ada, which can automatically prove aspects of imperative programs thanks to its clean design. Or design by contract with that, Ada 2012, or the originator, Eiffel. If doing parallelism, show them an inherently parallel language like X10, Chapel, or ParaSail. Erlang should come up if we’re talking high concurrency and reliability in the same sentence.

              Just some suggestions of things to think about.

              1. 5

                I think Mercury is very cute but I think I’d recommend learning Prolog first, because Mercury really felt to me like a strict superset of Prolog’s concepts - and far more tutorials are available for introducing Prolog.

                I’ve also heard Chlipala’s PwP recommended as very good but not beginner-friendly; the usual advice is to read Software Foundations first.

                1. 3

                  That’s probably the best route. The main reason for adding it is that many people learn Prolog and then drop it as impractical. Giving them Prolog plus Mercury and Datalog, with examples, might broaden their perspective on its utility.

              2. 5

                I’m not sure if this is a great strategy, but it approximates what I actually did over my first four years of programming, mostly because I wasn’t beholden to anyone for the first two and could dabble in whatever I wanted. I feel like the following effect referenced in the article is one of the biggest advantages, especially at a big company:

                > As the popularity of languages ebb and flow, you will have a wider choice of jobs, companies and projects if you’re not limited by language choice.

                I’m at Google now and being free to contribute to any project across the whole company (20% time and all that) has been empowering.