I love this. It’s pretty easy to make a Lisp in assembly language so I think I’d skip the first step. But then again, this exercise is not about skipping steps. :)
This is absolutely glorious. I should do this someday. …Not today though.
There are some projects that make me simultaneously think ‘no one should ever do this’ and ‘I want to do this’. This is one of them and its existence makes me unreasonably happy.
Why shouldn’t anyone ever do this? How do you think the first high-level languages got written? :-P
A lot were developed with the T-shaped model. You build a self-hosting toy compiler and then compile subsequent versions of it with the newer version. Quite often, you start by writing a toy interpreter and then use that to run the compiler. Once you have one higher-level language, you can use that for building the next one. For example, Squeak defines a subset of Smalltalk that is easy to statically translate to C (no reflection, classes defined early on) and has a simple text-rewriting engine that spits out something that a C compiler can use. The core of the Smalltalk VM is then written in this subset and the rest is written in the full language. You can do the same thing to implement a C-like language in assembly. C and Pascal both had a single-pass model, which meant that you could compile them one statement at a time and could even write a simple interpreter in assembly for each of the statements.
The portable Pascal compiler worked by generating a stack-based IR (P-Code). A P-Code interpreter could be written for most targets in a few hundred assembly instructions. Once you had that running, you could then run the compiler natively and use it to compile a translator from P-Code to native assembly, which could then run.
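To give a feel for how small such an interpreter can be, here is a rough sketch of a stack-machine loop in Python. The opcode names (`push`, `add`, `jz` and so on) are made up for illustration and are not the real P-Code instruction set; an assembly version would have the same fetch/dispatch shape, just with a jump table instead of an if-chain.

```python
# Toy stack-machine interpreter. The opcodes are hypothetical and only
# illustrate the idea of a stack-based IR; this is not real P-Code.
def run(program):
    stack = []   # operand stack
    output = []  # values emitted by "print"
    pc = 0       # index of the next instruction
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "push":
            stack.append(args[0])
        elif op == "dup":
            stack.append(stack[-1])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "sub":
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "jz":      # pop a value, jump if it is zero
            if stack.pop() == 0:
                pc = args[0]
        elif op == "jmp":     # unconditional jump
            pc = args[0]
        elif op == "print":
            output.append(stack.pop())
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return output


# (2 + 3) * 4
EXPR = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",),
        ("print",), ("halt",)]

# Count down from 3, printing each value.
COUNTDOWN = [("push", 3), ("dup",), ("print",), ("push", 1), ("sub",),
             ("dup",), ("jz", 8), ("jmp", 1), ("halt",)]

print(run(EXPR))       # prints [20]
print(run(COUNTDOWN))  # prints [3, 2, 1]
```

A compiler that targets this kind of IR only has to emit a flat list of instructions, which is part of why porting it means rewriting the interpreter loop rather than the compiler.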
Similarly, a FORTH interpreter is normally a few hundred instructions and so you can write one of these and then write your first compiler in FORTH, if you’re that way inclined.
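As a rough illustration of why a FORTH-style interpreter stays so small, here is a minimal sketch in Python. It is a toy under stated assumptions, not a real FORTH: it supports only a handful of primitives plus `: name ... ;` definitions, has no control flow and no separate compile mode, and the function name `forth` is just for this example.

```python
# Minimal FORTH-like interpreter: whitespace-separated tokens, a data
# stack, a few primitives, and user-defined words via ": name ... ;".
# This is a simplified toy, not a standard-conforming FORTH.
def forth(source):
    stack = []   # data stack
    out = []     # values emitted by "."
    words = {}   # user-defined words: name -> list of tokens

    def swap():
        stack[-1], stack[-2] = stack[-2], stack[-1]

    prims = {
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        "*": lambda: stack.append(stack.pop() * stack.pop()),
        "dup": lambda: stack.append(stack[-1]),
        "swap": swap,
        "drop": lambda: stack.pop(),
        ".": lambda: out.append(stack.pop()),
    }

    def execute(tokens):
        it = iter(tokens)
        for tok in it:
            if tok == ":":
                # Collect the definition body up to the closing ";".
                name = next(it)
                body = []
                for t in it:
                    if t == ";":
                        break
                    body.append(t)
                words[name] = body
            elif tok in words:
                execute(words[tok])   # run a user-defined word
            elif tok in prims:
                prims[tok]()          # run a built-in primitive
            else:
                stack.append(int(tok))  # anything else must be a number

    execute(source.split())
    return out


print(forth("2 3 + 4 * ."))                  # prints [20]
print(forth(": square dup * ; 7 square ."))  # prints [49]
```

Everything beyond the primitives is defined in terms of earlier words, so once the core loop exists the rest of the system can be written in the language itself.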
More generally, it’s quite common to write a bootstrapping compiler / interpreter that completely skips all type checking and assumes well-typed code. This can then compile or interpret the real compiler, which does all of the analysis that you’d expect. At that point you can start type checking the bootstrap compiler, so that any new features you add to the bootstrap compiler to support the real compiler are type checked by the real compiler. Or (more commonly) you make sure that you have a cross compiler and then throw away the bootstrap compiler and just ensure that someone has a binary for the previous version of the compiler to be able to build the next one.
All compilers should be cross compilers and the whole nonsense of bootstrapping like this becomes a non-issue.
You write the new backend target and copy over the binaries, done.
I believe that is what you meant by no one should ever do this. It does make a great side quest for a compiler course.
Perhaps it would be more to the point to say “no one should ever have to do this,” maybe?
As for wanting to do this anyway – I did something extremely similar myself, and it was a blast.
Is this the same Koichi Nakamura from Enix? I wonder if there are any FF references in the assembly…
Doesn’t seem to be the same Koichi.