You appear to be advocating a new:
[ ] functional [ ] imperative [ ] object-oriented [ ] procedural [ ] stack-based
[ ] "multi-paradigm" [ ] lazy [ ] eager [ ] statically-typed [ ] dynamically-typed
[ ] pure [ ] impure [ ] non-hygienic [ ] visual [ ] beginner-friendly
[ ] non-programmer-friendly [ ] completely incomprehensible
programming language. Your language will not work. Here is why it will not work.
You appear to believe that:
[ ] Syntax is what makes programming difficult
[ ] Garbage collection is free [ ] Computers have infinite memory
[ ] Nobody really needs:
[ ] concurrency [ ] a REPL [ ] debugger support [ ] IDE support [ ] I/O
[ ] to interact with code not written in your language
[ ] The entire world speaks 7-bit ASCII
[ ] Scaling up to large software projects will be easy
[ ] Convincing programmers to adopt a new language will be easy
[ ] Convincing programmers to adopt a language-specific IDE will be easy
[ ] Programmers love writing lots of boilerplate
[ ] Specifying behaviors as "undefined" means that programmers won't rely on them
[ ] "Spooky action at a distance" makes programming more fun
Unfortunately, your language (has/lacks):
[ ] comprehensible syntax [ ] semicolons [ ] significant whitespace [ ] macros
[ ] implicit type conversion [ ] explicit casting [ ] type inference
[ ] goto [ ] exceptions [ ] closures [ ] tail recursion [ ] coroutines
[ ] reflection [ ] subtyping [ ] multiple inheritance [ ] operator overloading
[ ] algebraic datatypes [ ] recursive types [ ] polymorphic types
[ ] covariant array typing [ ] monads [ ] dependent types
[ ] infix operators [ ] nested comments [ ] multi-line strings [ ] regexes
[ ] call-by-value [ ] call-by-name [ ] call-by-reference [ ] call-cc
The following philosophical objections apply:
[ ] Programmers should not need to understand category theory to write "Hello, World!"
[ ] Programmers should not develop RSI from writing "Hello, World!"
[ ] The most significant program written in your language is its own compiler
[ ] The most significant program written in your language isn't even its own compiler
[ ] No language spec
[ ] "The implementation is the spec"
[ ] The implementation is closed-source [ ] covered by patents [ ] not owned by you
[ ] Your type system is unsound [ ] Your language cannot be unambiguously parsed
[ ] a proof of same is attached
[ ] invoking this proof crashes the compiler
[ ] The name of your language makes it impossible to find on Google
[ ] Interpreted languages will never be as fast as C
[ ] Compiled languages will never be "extensible"
[ ] Writing a compiler that understands English is AI-complete
[ ] Your language relies on an optimization which has never been shown possible
[ ] There are less than 100 programmers on Earth smart enough to use your language
[ ] ____________________________ takes exponential time
[ ] ____________________________ is known to be undecidable
Your implementation has the following flaws:
[ ] CPUs do not work that way
[ ] RAM does not work that way
[ ] VMs do not work that way
[ ] Compilers do not work that way
[ ] Compilers cannot work that way
[ ] Shift-reduce conflicts in parsing seem to be resolved using rand()
[ ] You require the compiler to be present at runtime
[ ] You require the language runtime to be present at compile-time
[ ] Your compiler errors are completely inscrutable
[ ] Dangerous behavior is only a warning
[ ] The compiler crashes if you look at it funny
[ ] The VM crashes if you look at it funny
[ ] You don't seem to understand basic optimization techniques
[ ] You don't seem to understand basic systems programming
[ ] You don't seem to understand pointers
[ ] You don't seem to understand functions
Additionally, your marketing has the following problems:
[ ] Unsupported claims of increased productivity
[ ] Unsupported claims of greater "ease of use"
[ ] Obviously rigged benchmarks
[ ] Graphics, simulation, or crypto benchmarks where your code just calls handwritten assembly through your FFI
[ ] String-processing benchmarks where you just call PCRE
[ ] Matrix-math benchmarks where you just call BLAS
[ ] Noone really believes that your language is faster than:
[ ] assembly [ ] C [ ] FORTRAN [ ] Java [ ] Ruby [ ] Prolog
[ ] Rejection of orthodox programming-language theory without justification
[ ] Rejection of orthodox systems programming without justification
[ ] Rejection of orthodox algorithmic theory without justification
[ ] Rejection of basic computer science without justification
Taking the wider ecosystem into account, I would like to note that:
[ ] Your complex sample code would be one line in: _______________________
[ ] We already have an unsafe imperative language
[ ] We already have a safe imperative OO language
[ ] We already have a safe statically-typed eager functional language
[ ] You have reinvented Lisp but worse
[ ] You have reinvented Javascript but worse
[ ] You have reinvented Java but worse
[ ] You have reinvented C++ but worse
[ ] You have reinvented PHP but worse
[ ] You have reinvented PHP better, but that's still no justification
[ ] You have reinvented Brainfuck but non-ironically
In conclusion, this is what I think of you:
[ ] You have some interesting ideas, but this won't fly.
[ ] This is a bad language, and you should feel bad for inventing it.
[ ] Programming in this language is an adequate punishment for inventing it.
Terminology can be funny. Take a look at PostCSS, the well-known CSS “tool”:
PostCSS is a tool for transforming styles with JS plugins. These plugins can lint your CSS, support variables and mixins, transpile future CSS syntax, inline images, and more.
…
PostCSS takes a CSS file and provides an API to analyze and modify its rules (by transforming them into an Abstract Syntax Tree). This API can then be used by plugins to do a lot of useful things, e.g., to find errors automatically, or to insert vendor prefixes.
I mean, it’s a compiler, right? But they don’t call it that. They call it a “tool”, a post-processor (or, at least, they used to). (I would conjecture it’s to avoid giving the impression it will compile Sass “down” to CSS, but rather “across” from CSS to CSS, but honestly I don’t know.) Maybe it should be called CompilerCSS, or CompCSS.
Are post-processors just compilers, then? Pre-processors, too? De-compilers, even? Auto-formatters? Linters? Aren’t these all compilers, in a sense? I suppose so.
I would say that there’s two really key parts of a compiler: parsing structured text into some kind of AST, and mechanically transforming that AST into something else. Maybe three parts, the third being the intermediate step where you walk through the AST or some derivative of it to perform error checking. Auto-formatters and linters tend to do some of these steps but not all, afaik linters don’t usually transform their AST and autoformatters transform it back into the original structured text with identical meaning. Preprocessors…. mmmmmaybe, depends on the kind… a full symbolic macro system like Rust’s or TeX’s, definitely. A text macro system like m4 or the C preprocessor, I would argue do a lot less.
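To make the three parts concrete, here is a minimal sketch using Python's standard `ast` module (needs Python 3.9+ for `ast.unparse`). The function names `parse`, `check`, and `compile_expr`, the `FoldConstants` class, and the tiny "only `+`, `*`, names and constants" language are all made up for illustration. A linter would stop after `check`, an auto-formatter would go straight from `parse` to `ast.unparse`, and the "compiler" does all three steps.

```python
import ast

# Hypothetical mini-language: only +, *, names and constants are allowed.
ALLOWED = (ast.Expression, ast.BinOp, ast.Constant, ast.Name,
           ast.Load, ast.Add, ast.Mult)

def parse(source):
    """Step 1: structured text -> AST."""
    return ast.parse(source, mode="eval")

def check(tree):
    """Step 2: walk the AST and report problems (a linter could stop here)."""
    return [f"unsupported node: {type(node).__name__}"
            for node in ast.walk(tree)
            if not isinstance(node, ALLOWED)]

class FoldConstants(ast.NodeTransformer):
    """Step 3: mechanically transform the AST (here, constant folding)."""
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the children first
        left, right = node.left, node.right
        if isinstance(left, ast.Constant) and isinstance(right, ast.Constant):
            if isinstance(node.op, ast.Add):
                return ast.Constant(value=left.value + right.value)
            if isinstance(node.op, ast.Mult):
                return ast.Constant(value=left.value * right.value)
        return node

def compile_expr(source):
    tree = parse(source)
    problems = check(tree)
    if problems:
        raise ValueError("; ".join(problems))
    return ast.unparse(FoldConstants().visit(tree))

formatted = ast.unparse(parse("x+2*3"))   # formatter: parse, then print back
compiled = compile_expr("x+2*3")          # compiler: parse, check, transform
lint_report = check(parse("x - 1"))       # linter: parse, check, no output code
```

By this reading a text macro system never gets as far as step 1, which fits the claim that it does a lot less.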
I had this at a job once. A team was assigned to doing a greenfield rewrite of a core system which was complicated, slow, and dangerous to change. They designed a performant job system where each job would work a record and then queue one or more jobs as needed. This was announced to the company with a bit of fanfare as The Big Rewrite.
I read through their docs to understand how it worked and finally realized I was seeing a programming language, just sort of one level up. A job specification was a function signature and each execution of a queued job was a function call, with the job record as the shared, mutable state for data (at least it was a hash rather than a stack?) for each of them to progressively read and write. There was no functionality for locking data or semantics for controlling parallel execution. Having dipped my toe in multithreaded C++ in the 90s I realized this was a recipe for disaster. I tried to raise my concerns that the designed system wasn’t going to be expressive enough to avoid data races or permit tracing. On a private call the lead designer agreed with my metaphor, but he was confident the system would work. Looking at their LinkedIn profile, I realized they had designed similar systems at their three previous jobs. I worked far away on an unrelated project, so I wished them well and left them to it.
A year later the project was rescoped down, redesigned, and rebooted. Then again a year later. Then again a year later. I don’t know what happened after that. I eventually met people who knew the lead designer from previous jobs: those previous projects also had not succeeded. The designer just had a really strong design idea for a job system and felt certain the next attempt would work. I wondered if there was any way I could’ve escalated my concerns more effectively, but that company had a weird culture around criticism and a strong focus on OKRs that meant non-numeric concerns always lost out to quantitative plans.
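For anyone who hasn't been bitten by this, here is a rough sketch of the hazard, not the actual system (which isn't shown anywhere in this thread). It assumes each job is a read-modify-write on a shared record with no locking. The names `increment_job`, `run_interleaved`, and `run_serially` are invented, and generators stand in for jobs so the interleaving is deterministic rather than timing-dependent.

```python
from collections import deque

def increment_job(record, key):
    """Hypothetical job: read a field, get preempted, write it back."""
    seen = record.get(key, 0)   # read the shared, mutable job record
    yield                       # another job may run here; nothing is locked
    record[key] = seen + 1      # write back a value that may now be stale

def run_interleaved(jobs):
    """Round-robin scheduler: advance each queued job one step at a time."""
    queue = deque(jobs)
    while queue:
        job = queue.popleft()
        try:
            next(job)
            queue.append(job)   # not finished, so requeue it
        except StopIteration:
            pass                # job completed

def run_serially(jobs):
    """Run each job to completion before starting the next one."""
    for job in jobs:
        for _ in job:
            pass

shared = {}
run_interleaved([increment_job(shared, "count"), increment_job(shared, "count")])
interleaved_count = shared["count"]   # both jobs read 0, so one update is lost

alone = {}
run_serially([increment_job(alone, "count"), increment_job(alone, "count")])
serial_count = alone["count"]         # the second job sees the first job's write
```

Two increments give 1 when interleaved and 2 when run serially. Without locking or ordering semantics in the job language, there is no way to even state which of those outcomes was intended.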
Looks like you built a call-reified concurrent scheduler with multiple IPC options. ;)
Why in the world would anybody not want to build a compiler?
There are lots of good reasons for building a compiler. There is also a lot of knowledge floating around about how to build compilers and so you can learn from 60 years of experience if you set out to build a compiler.
The problem is if you don’t set out to build a compiler but end up building one anyway. This is a problem because you are almost certainly going to repeat some of the mistakes that other people have learned from.
For compiler, feel free to substitute ‘scheduler’, ‘distributed system’, or ‘RPC protocol’ into the above; the rest of the text works just as well.
This is essentially what happened to the project that inspired https://lobste.rs/s/iksbf4/alien_artefacts
I think it’s more aimed at those not realizing they are building a compiler / interpreter. They think they’re just allowing some flexibility in a configuration file, which slowly morphs into its own language.
Or, as I once did, try to use string mangling to convert someone’s made up Excel expressions (text, not formulas) into valid Ruby, as they seemed close enough that it would be simpler than writing a transpiler, only to end up with a muddled mess for which a parser + code generator would probably have been the same amount of work. The project died due to business reasons and I never got to find out though.
Before becoming a programmer I worked in finance, and I had an automation system in a spreadsheet where each row was a command and the columns held arguments. Commands were for creating slides based on figures in spreadsheets.
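A row-per-command sheet like that is essentially an instruction list plus a dispatch loop. Here is a small sketch of the shape, with the caveat that the command names (`new_slide`, `add_figure`), the sample rows, and the dict-based "deck" are all invented for illustration, since the original spreadsheet isn't shown.

```python
def new_slide(deck, title):
    """Start a new slide with the given title."""
    deck.append({"title": title, "items": []})

def add_figure(deck, label, value):
    """Append a labelled figure to the most recent slide."""
    deck[-1]["items"].append(f"{label}: {value}")

# Hypothetical instruction set: first column is the command name.
COMMANDS = {"new_slide": new_slide, "add_figure": add_figure}

def run_sheet(rows):
    """Interpret each row as (command, *arguments), top to bottom."""
    deck = []
    for row_no, (name, *args) in enumerate(rows, start=1):
        handler = COMMANDS.get(name)
        if handler is None:
            raise ValueError(f"row {row_no}: unknown command {name!r}")
        handler(deck, *args)
    return deck

rows = [
    ("new_slide", "Q1 results"),
    ("add_figure", "Revenue", 120),
    ("add_figure", "Costs", 80),
    ("new_slide", "Outlook"),
]
deck = run_sheet(rows)
```

Once the dispatch table and row numbers in error messages are there, it is an interpreter whether or not anyone meant to write one.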
It was like a byte code for slides!
i actually kinda like this idea for toy programming. turtle graphics or something, with a spreadsheet frontend
Spreadsheet programming in itself is such a passion of mine. That’s one of the things that got me hooked on FP. You know, Excel is a 2D programming environment for pure and total functional programming with incremental parallel computation, ample support for database connectivity, and built-in chart visualization and data tables.
Your comment sparked a bit of personal “aha”.
I’ve witnessed two projects that could be described as compilers, but whose value was not “being a compiler.”
One was incredibly easy to work on. The other had collapsed in on itself.
Only one was a compiler on purpose.
Avoiding cleverness or sinking time into “doing it right” are often stated goals. But cutting against the inevitable grain of a project to achieve it isn’t any better. “Wanting to build a compiler” when you’re building a compiler is a massive asset for building a good compiler.
From: https://www.mcmillen.dev/language_checklist.html
Written by: Colin McMillen, Jason Reed, and Elly Fong-Jones
Pretty sure this is how MUMPS happened. Probably RPG too. And Bash, for that matter.
OTOH, avoiding this gave us Awk, Lua, ELisp, and other lovely little tools.
CMake. All DevOps tools that use YAML. systemd. Apache httpd and nginx config files. Exim and sendmail.
oh gods CMake. and yeah I did recently get real angry at Ansible for making it difficult to nest array lookups in templates. Not sure that systemd and nginx are quite as guilty though.
If you do a little searchengineering for systemd unit file parsing bugs, you will find a lot of the kind of issues that arise from not treating it as a task for a compiler.
i can’t tell if the article is aimed at people who are writing a babel plugin, or people that ought to be
i do enjoy the idea that people are out there writing an ad hoc, informally-specified, bug-ridden, slow implementation of half of javascript, though
… which contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.