Processors are just interpreters of machine code. Eventually there needs to be something that executes code, and that thing is an interpreter.
A common way of defining language semantics is with an operational semantics, which is literally just defining the language as an interpreter written in mathematical (formal) logic.
On that note, mathematics itself is an interpreted language. Who interprets statements written in math? Human minds. Compilers are just intermediate steps on the way to interpretation.
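The "operational semantics is an interpreter" point is easy to make concrete. Here is a toy sketch of my own (the expression language is made up, not from the article): each inference rule of a big-step semantics transcribes almost directly into one branch of an evaluator.

```python
# A big-step operational semantics for a tiny expression language,
# transcribed into Python. Each branch of eval_ mirrors one inference
# rule: "n ⇓ n", "e1 + e2 ⇓ v1 + v2", and so on.
# (Hypothetical toy language, purely for illustration.)

def eval_(expr, env):
    """Evaluate expr under environment env (a dict of variable bindings)."""
    kind = expr[0]
    if kind == "lit":          # n ⇓ n
        return expr[1]
    if kind == "var":          # x ⇓ env(x)
        return env[expr[1]]
    if kind == "add":          # e1 ⇓ v1    e2 ⇓ v2   =>   e1 + e2 ⇓ v1 + v2
        return eval_(expr[1], env) + eval_(expr[2], env)
    if kind == "let":          # let x = e1 in e2
        v = eval_(expr[1][1], env)
        return eval_(expr[2], {**env, expr[1][0]: v})
    raise ValueError(f"unknown form: {kind}")

# let x = 2 in x + 3
program = ("let", ("x", ("lit", 2)), ("add", ("var", "x"), ("lit", 3)))
print(eval_(program, {}))  # 5
```

Squint and the `if` branches are the semantic rules; the recursion is the derivation tree.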
Alternative viewpoint: “It’s not safe to say the language is the same regardless of which way it’s processed, because the language design and the processing strategy aren’t really independent. In principle a given language might be processed either way; but the two strategies provide different conceptual frameworks for thinking about the language, lending themselves to different language design choices, with different purposes — for which different mathematical properties are of interest and different mathematical techniques are effective. This is a situation where formal treatment has to be considered in the context of intent.”
http://fexpr.blogspot.com/2016/08/interpreted-programming-languages.html
Sure. Shutt’s point applies to common languages too. For example, neither C nor Go can be interpreted. There certainly exist REPLs and other interpreter-like tools for large fragments of C and Go, but their semantics are technically defined only with respect to a compilation phase. (There’s nothing wrong or deficient with either language; it’s just a bit of fun trivia.)
I don’t believe that’s true for C, at least. There’s nothing in the spec that requires compilation, from what I recall from the last time I read it or from when I worked on formal models of the language. There is a requirement that the source is preprocessed, and there is a notion of a translation unit, but the semantics are defined in terms of the behaviour when individual statements are executed, and there’s nothing that would preclude a statement-at-a-time interpreter over the preprocessed source.
The article we’re discussing provides proof to the contrary:
I’m not sure about Cint, but Cling isn’t really an interpreter. It’s the clang front end wired to the LLVM JIT. It does module-at-a-time compilation and adds a REPL that creates a new module for each statement.
A long time ago I used SoftIntegration’s Ch, which SI claimed was a C90 interpreter. I never looked under the hood to see how it worked, but it was slow enough that I believed their claim that it was an interpreter (not to take away from its usefulness; it was a very good prototyping tool). Maybe post-C90 standards have stricter requirements, but I’m fairly sure C could be, and at some point was, interpreted. I dimly recall that statement-by-statement execution precluded some common optimizations, but I think it worked?
SI is still around if anyone’s curious: https://www.softintegration.com/ . I’m not affiliated with them in any way and haven’t used Ch in 10+ years at this point, I’m just linking to it because I think it is/it was cool.
Indeed, a simple web search suggests that the vast majority of pages describing compiled and interpreted languages propose a crisp delineation between the two.
This is one of those things that gets the compiler hacker half of my brain and the anti-prescriptivist half of my brain into fistfights.
Notable mainstream language implementations using bytecode compilation include CPython and CRuby. CPython even caches the output of its compiler as .pyc files.
A fun thing to ask someone who says Python is an “interpreted language” is “what does .pyc stand for?”
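The compiler in question is directly observable from the standard library. A small demonstration (`compile` and `dis` are both built into CPython; the source string is my own example):

```python
import dis

# CPython compiles source to bytecode before its interpreter ever runs it.
# compile() invokes that same compiler; the cached .pyc files hold its output.
code = compile("x = 1\nprint(x + 2)", "<example>", "exec")

print(type(code).__name__)  # code  (a bytecode object, not source text)
dis.dis(code)               # prints instructions such as LOAD_CONST, STORE_NAME
```

The disassembly listing varies between CPython versions, but it is unmistakably compiler output.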
All in all a great article with solid arguments. I really hope the term “interpreted language” stops getting used outside academic/pedagogical contexts, because there hasn’t been a mainstream language without a compiler since 2011, when R added its bytecode engine.
I’ve heard that shell languages still do AST interpretation. I haven’t checked.
Graal makes this even blurrier because it does AST interpretation in a JIT-compiled language, and so ends up using the host JIT environment to inline the execution of AST-node interpreter chunks for individual traces.
This is one of those things that gets the compiler hacker half of my brain and the anti-prescriptivist half of my brain into fistfights.
Prescriptivism rarely wins in the end; I’d just give up now.
The first memory I have of the compiled/interpreted language debate was on IRC. Someone made a remark to the effect of “words have meanings; redefining them to be meaningless is annoying”. That’s the only contribution to this line of inquiry I’ve ever bothered to remember.
Apart from the intrinsic pointlessness of trying to make some words useless, I kind of doubt that the fuzziness of these specific words comes as a revelation to the sort of person who would read an article about optimizing a brainfuck interpreter.
I have much more positive feelings about the post if I forget about its prescriptivism and think of it just as an exploration of the implementation techniques and how they blur together. It’s arguably just a good article with “popular usage is wrong” as the clickbait thumbnail.
I don’t want to read the author’s mind, but I imagine that they know about the Futamura projections, which clearly show the difference between interpreters and compilers. They are two different sorts of artifacts.
I agree with your point about fuzziness and definitions. There is, as far as I know, no type-theoretic or category-theoretic definition of “compiler”, let alone “correct compiler”. We have an intuition – a compiler is a functor from one programming language to another – but we have informal expectations about correctness and performance which are not easily captured.
I think correctness follows pretty easily from knowing the semantics of the source and destination languages. It’s harder when we want compilers to reject buggy code, though, I guess because we’re accustomed to programming techniques where the difference between spec and implementation is not obvious.
On compilers vs interpreters, though, you illustrate an important point: there’s nothing saying you can’t make a compiler out of an interpreter or an interpreter out of a compiler. So it shouldn’t be important that the interpreter in the OP has a compiler inside it. Although it would make for a neat demonstration of the first projection, given a sufficiently aggressive partial evaluator. I wonder how hard it would be to make Rust do this?
On compilers vs interpreters, though, you illustrate an important point: there’s nothing saying you can’t make a compiler out of an interpreter or an interpreter out of a compiler.
As long as we’re here, let’s sharpen this! Specifically, we can make a compiler out of an interpreter; use the first Futamura projection. The author is quite familiar with this because they work with RPython, which projects interpreters to JIT compilers.
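A toy illustration of that projection (entirely my own sketch, nothing to do with RPython’s actual machinery): specialize an interpreter for a tiny made-up stack language to one fixed program, and the residual code is that program, compiled.

```python
# Toy demonstration of the first Futamura projection.
# interp() is an interpreter for a tiny stack language. specialize()
# runs the interpreter's dispatch loop at "compile time", with the
# program held fixed, and emits residual Python: a compiled program.

def interp(program, arg):
    stack = [arg]
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

def specialize(program):
    # The dispatch on `op` happens here, once, instead of at run time.
    lines = ["def compiled(arg):", "    stack = [arg]"]
    for op, *args in program:
        if op == "push":
            lines.append(f"    stack.append({args[0]!r})")
        elif op == "add":
            lines.append("    b, a = stack.pop(), stack.pop(); stack.append(a + b)")
        elif op == "mul":
            lines.append("    b, a = stack.pop(), stack.pop(); stack.append(a * b)")
    lines.append("    return stack.pop()")
    ns = {}
    exec("\n".join(lines), ns)
    return ns["compiled"]

prog = [("push", 3), ("add",), ("push", 2), ("mul",)]   # (arg + 3) * 2
compiled = specialize(prog)
assert interp(prog, 4) == compiled(4) == 14
```

The residual `compiled` function contains no dispatch loop at all; that is the sense in which specializing an interpreter to a program yields a compiler’s output.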
Let’s also sharpen the other branch: we can make a compiler into an interpreter by composition, because the composition of a compiler and an interpreter is an interpreter. If the interpreter is sufficiently trivial, then the compiler becomes the bulk of the implementation effort. Elsewhere in the comments, folks have noted that Java is implemented by composition of a Java-to-JVM compiler with a JVM interpreter, and also Python implementations compose a Python-to-bytecode compiler with a bytecode interpreter. In the limit, we think of hardware as trivially interpreting its bytecode, so a C-to-hardware compiler composed with hardware seems like a “compiled language” situation.
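The composition direction can be sketched the same way (again my own toy, with a made-up arithmetic language): a source-to-bytecode compiler chained with a bytecode interpreter is, observed from the outside, just an interpreter.

```python
import ast

# compile_expr translates a tiny infix arithmetic language to stack
# bytecode; run_bytecode interprets that bytecode. Their composition
# behaves, externally, exactly like a direct interpreter of the source.

def compile_expr(src):
    """Compile arithmetic source like '1 + 2 * 3' to stack bytecode."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):
            return [("push", node.value)]
        if isinstance(node, ast.BinOp):
            op = {ast.Add: "add", ast.Mult: "mul"}[type(node.op)]
            return walk(node.left) + walk(node.right) + [(op,)]
        raise ValueError("unsupported syntax")
    return walk(ast.parse(src, mode="eval"))

def run_bytecode(code):
    stack = []
    for op, *args in code:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop(); stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop(); stack.append(a * b)
    return stack.pop()

def interpret(src):
    # compiler composed with interpreter: to a user, just "an interpreter"
    return run_bytecode(compile_expr(src))

print(interpret("1 + 2 * 3"))  # 7
```

This is the CPython/JVM shape in miniature: a compiler in front, an interpreter behind, and the seam invisible from outside.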
Apart from the intrinsic pointlessness of trying to make some words useless
I’m sympathetic to this line of reasoning, but I think that the word is already useless now. When someone says “interpreter” I don’t know if they mean it’s an actual interpreter, or something that involves a compiler-but-not-to-machine-code-so-it-doesn’t-count. Or they could mean a REPL?
IMO the word is already nearly a lost cause in normal usage, outside academic/pedagogical contexts. I’m not going to go on a campaign to yell at people who are holding it wrong, but I guess the only thing left to do is move on and try to find other ways of expressing the concepts?
I’m sympathetic to this line of reasoning, but I think that the word is already useless now. When someone says “interpreter” I don’t know if they mean it’s an actual interpreter, or something that involves a compiler-but-not-to-machine-code-so-it-doesn’t-count. Or they could mean a REPL?
There is definitely room for confusion, but in practice I don’t encounter it much. As a first stab at a rule, though, I think people identify “interpreter” and “compiler” extensionally: it’s a compiler iff it behaves like a compiler to casual observation. Then R and Python and things are pretty clearly interpreters that happen to be implemented using (mostly-) internal compilers. I think this accounts for most of the usage I’ve seen.
The most common confusion for me is when I say I’m going to be adding features to my compiler this weekend and people assume I’m writing some gnarly low-level machine-code thing but actually it’s lisp->lua.
I guess in that case I shouldn’t complain because it makes me look smarter than I actually am.
You could try using a different word if “compiler” is confusing. Maybe “transpiler”?
I could, but ugh. I hope it doesn’t come to that.