1. 38
  1.  

  2. 21

    First, if we want to be rigorous about computer science terms, calling it “interpreted vs compiled languages” is a misnomer, because being interpreted or compiled is not a property of the language, but of the implementation.

    I can’t tell you how refreshing it is to hear people finally getting this.

    1. 7

      Isn’t there an implicit “commonly”, as in “Python is commonly interpreted”? Although I suppose explicitly saying “commonly” avoids the terminology misuse, and besides, drawing the distinction can mean one’s about to say something of dubious accuracy…

      1. 4

        Sure, you can make that argument, but the quotation in question is still flawed because it frames compiled vs interpreted as mutually exclusive.

        For example, Python is commonly compiled to bytecode, which is interpreted by the Python virtual machine. Java is commonly compiled to bytecode, which is interpreted by the Java virtual machine, and then often further compiled to machine code. Nowadays it’s rather unusual to find a non-toy language whose flagship implementation does not compile it.
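
        For a concrete illustration of the Python case: in CPython, the standard library’s dis module disassembles the bytecode the compiler has already produced for a function. A minimal sketch (nothing here is hypothetical, it’s all stock CPython):

        ```python
        import dis

        def add(a, b):
            return a + b

        # CPython compiled this function to bytecode when it was defined;
        # dis merely disassembles the resulting code object.
        dis.dis(add)

        # The compiled form is a first-class value hanging off the function:
        print(type(add.__code__))  # <class 'code'>
        ```

        That bytecode is what the CPython virtual machine then interprets, instruction by instruction.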

      2. 7

        There is no difference between “compiled code” and “interpreted code.” It’s interpreters all the way down (what do you think the CPU does with machine code?).

        1. 5

          And most ‘interpreters’ first compile to some bytecode that is then interpreted.

          I expressed this in the past as: it’s Turing machines all the way down.

          1. 3

            Maybe, kinda. But “compiled code” vs “compiled language” have very different implications as to process. When you say “compiled code” you’re talking about where the code came from, and when you say “compiled language” you’re talking about what’s going to happen to the code, even if at the bottom of the pile it’s guaranteed to hit an interpreter of some sort.

            1. 2

              Yep! Since the dawn of microcode in the CPU, even the “machine language” is interpreted. It’s just a particularly rigid and hard-to-update interpreter because it’s in silicon.

            2. 6

              I think that most people understand this, but just are loose with terminology. QBASIC had an interpreter and a compiler though, which was very nice, instant execution and fast binaries (plus overflow checks when you interpreted your programs).

              I type this while waiting ~10 minutes on a project that is compiling. The ways that developer experience degrades over the years are interesting.

              Update: it was actually more than 30 minutes, and this was for a non-release build which was not from scratch.

              1. 2

                I think that most people understand this

                I think most people have some understanding of this, but it’s not always coherent or in agreement with other people’s understanding.

                As far as I can tell the most common thing people mean when they say “compiled language” is “I use this language with a compiler that emits machine code.” But sometimes what they mean is “this language’s main compiler must be run ahead-of-time, and you can’t just point the executable directly at the code to run it.” And sometimes they literally just mean “a compiler exists for the language”.

                Since the term is so overloaded and borderline meaningless in its most precise definition, it’s better to avoid it altogether and pick more descriptive words.

              2. 4

                It’s not actually true though. Any language which includes any eval statement fundamentally has to be interpreted. It might do that interpreting by bundling a compiler in a compiled artifact (i.e. it could also be compiled), but it is interpreted (as well).
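
                CPython makes the “compiler bundled in the artifact” point easy to see: eval itself first compiles its argument to a bytecode object and then runs it on the VM. A minimal sketch using only builtins:

                ```python
                # eval in CPython is compile-then-execute: the source string is
                # compiled to a code object at runtime, then interpreted by the VM.
                src = "x * 2 + 1"
                code = compile(src, "<string>", "eval")  # the bundled compiler runs here
                result = eval(code, {"x": 20})           # the VM interprets the bytecode
                print(result)  # 41
                ```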

                On the flip side, any language that allows for extensive compile-time execution has to be, for some definitions of the word, compiled. It’s not possible to properly execute any language with a Turing-complete type and/or macro system, for instance, without spending an arbitrary amount of time “compiling” (type checking/expanding macros) first.

                So it’s a property of the implementation, but it can also be a property of the language. And it’s not necessarily exclusively compiled xor interpreted in either case.

                1. 2

                  Any language which includes any eval statement fundamentally has to be interpreted.

                  This is not true. In the general case, it must at least be JIT compiled, but eval can be implemented by:

                  • Parsing and evaluating the AST
                  • Parsing, compiling to bytecode, and interpreting the bytecode
                  • As above, but JIT compiling the bytecode and executing it
                  • Linking the compiler, generating a new object-code file, and loading it (Smalltalk/X did this, for example)

                  In some common cases (i.e. where the argument to eval does not depend on the program’s input), it can be implemented by partially symbolically executing the code around the eval and then AoT compiling the result. I don’t know of any mainstream compilers that do this, but I have come across some research compilers that do it with varying levels of success.

                  1. 1

                    I always considered JIT compiling to be a form of interpreting. Do you believe that your browser isn’t interpreting JavaScript anymore when it sends it to the JIT portions of its JS executor? Our definitions are probably different here and I’d be interested in hearing what yours is (I put mine below as a reply to technomancy). Do you only count AST walking?

                    1. 4

                      I always considered JIT compiling to be a form of interpreting

                      That’s at odds with how it’s used in most compilers. JIT is short for just-in-time compiler. A compiler runs once per unit of code (occasionally more than once if it’s incorporating profiling information); an interpreter is involved in every execution of a statement.

                      The distinction is clearer with the older translator vs interpreter terminology: a translator is someone who translates a text into a new language and creates a version in the target language that can be read by anyone who understands that language. An interpreter translates things as they are said and will translate the same thing every time it is said.

                      Do you believe that your browser isn’t interpreting javascript anymore when it sends it to the JIT portions of it’s JS executor

                      The browser that I’m typing this on is Safari and so the JavaScript engine is JavaScriptCore. It has a common bytecode across four implementations:

                      • An interpreter, written in a portable macro assembly language. This is intended for fast start and for the entire interpreter to fit in L1 i-cache.
                      • A simple JIT that mostly just inlines the instruction sequences from the interpreter, occasionally invoking them directly for more complex JavaScript operations. This is fast (no optimisation), gives a reasonable speedup, and has the same stack layout as the interpreter and so a JavaScript program can switch between the two on the fly.
                      • An optimising JIT, which assembles traces in a continuation-passing style and performs dataflow analyses on them. This uses type information recorded by the first two passes.
                      • An optimising JIT that takes the output from the previous optimising JIT, converts it to SSA form, and performs more aggressive optimisation on it.

                      So, my browser is interpreting every JavaScript statement at least once before it compiles it. If I remember the heuristic correctly, it only begins compiling after a function has executed 10 times or a loop has executed 100 times (one of the big benefits of the first-tier JIT having the same stack layout as the interpreter: you can JIT-compile a hot loop in a function without any complex on-stack replacement).

                      The first versions of v8, however, did not include an interpreter. They had a simple parser that collected the names of functions and the ranges of their bodies, and then parsed and JIT compiled every function on demand. Smalltalk/X works in a similar way: it parses Smalltalk, generates C, compiles the C, links it into a shared library, and loads the shared library. In both cases, the compiler visits each statement (or bytecode, or whatever) once, generates some executable code from it, and then it’s no longer involved in the execution process.

                      For JSC it’s slightly more complex because there’s an initial compiler that translates from JavaScript source to bytecode, but then the first-tier of the execution infrastructure is interpreting the bytecode, so there’s a compiler then an interpreter and then hot loops are compiled from the output of the first compiler into something else. The v8 implementation decided against a common bytecode and so goes back to the source for each tier and recompiles from there.

                      Do you only count AST walking?

                      No, a bytecode can also be either interpreted or compiled, but the definition is the same: if you translate each bytecode into another form and then hand it off to something else to execute, then you are compiling. If you do something to execute the bytecode each time it is reached, then you are interpreting. A bytecode interpreter executes a bytecode, then decides which bytecode to execute next, then loops. A bytecode compiler reads a sequence of bytecodes and then generates some other representation. The output from a compiler may be interpreted or compiled.
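
                      That distinction can be sketched in a few lines of Python with a toy stack-machine bytecode (the opcodes and function names below are invented purely for illustration): the interpreter does work every time an instruction is reached, while the “compiler” visits each instruction once and emits a different representation, here a Python expression:

                      ```python
                      # Toy bytecode: (opcode, operand) pairs for a little stack machine.
                      program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]

                      def interpret(code):
                          """Interpreter: fetch an instruction, execute it, decide what's next, loop."""
                          stack = []
                          for op, arg in code:
                              if op == "PUSH":
                                  stack.append(arg)
                              elif op == "ADD":
                                  b, a = stack.pop(), stack.pop()
                                  stack.append(a + b)
                              elif op == "MUL":
                                  b, a = stack.pop(), stack.pop()
                                  stack.append(a * b)
                          return stack.pop()

                      def compile_to_python(code):
                          """'Compiler': translate each instruction once into another form
                          (a Python expression string), then step out of the execution loop."""
                          stack = []
                          for op, arg in code:
                              if op == "PUSH":
                                  stack.append(str(arg))
                              elif op == "ADD":
                                  b, a = stack.pop(), stack.pop()
                                  stack.append(f"({a} + {b})")
                              elif op == "MUL":
                                  b, a = stack.pop(), stack.pop()
                                  stack.append(f"({a} * {b})")
                          return stack.pop()

                      print(interpret(program))          # 20
                      print(compile_to_python(program))  # ((2 + 3) * 4)
                      ```

                      The output of compile_to_python is then executed by something else entirely (Python itself, in this toy), which is exactly the hand-off described above.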

                      At the end of the pipeline, a microcoded CPU is usually an interpreter (note: most RISCy CPUs don’t implement most of their instructions in microcode, and some don’t use microcode to implement any of them). The original Smalltalk VM compiled to Smalltalk bytecode that was interpreted by the Smalltalk flavour of Alto microcode (the Alto CPU wasn’t designed for compilers to directly target its native instruction set; it was intended for languages to provide an interpreter as microcode and then to use that to execute a bytecode).

                      The Transmeta CPUs and nVidia Project Denver cores used a compiler in the microcode, which would take the traces in the public ISA and JIT compile them to the native VLIW instructions. As with v8, the original Transmeta CPUs provided only a compiler and suffered from high startup times the first time a sequence was executed. The nVidia Project Denver CPUs provided a first-tier interpreter (which decoded one Arm instruction and issued one VLIW instruction) and a second-tier JIT that took traces of Arm instructions and generated sequences of VLIW instructions.

                  2. 1

                    Any language which includes any eval statement fundamentally has to be interpreted.

                    Can you explain your definition of “interpreter” under which this is true? It sounds interesting but it doesn’t fit with the definitions I’m familiar with.

                    It’s not possible to properly execute any language with a Turing-complete type and/or macro system, for instance, without spending an arbitrary amount of time “compiling” (type checking/expanding macros) first.

                    Now here you’re absolutely playing fast and loose with definitions. I would object that type checking is certainly not a form of compiling, because the only output of a type checker is a success or error message. Just because the same program performs type checking as performs compilation doesn’t mean that type checking is itself a form of compilation; that’s just a coincidence of how the two processes happen to be commonly implemented.

                    Macroexpansion is another beast altogether; it’s much closer to compilation than type checking is, because the output is code. But expanding a macro is turning code from one language into output in the same language, and as far as I’m aware, “compiling” implies that the output language is different from the input language.

                    Edit: the one exception I can think of is Forth; I believe the definition of the language (or at least one definition of it) specifies that the implementation include a compiler as part of the language semantics itself. But I am confident that no one using the term “compiled language” is actually referring to this particular specification quirk.

                    1. 1

                      Can you explain your definition of “interpreter” under which this is true? It sounds interesting but it doesn’t fit with the definitions I’m familiar with.

                      Along the lines of: “A program that takes the source code for another high level (i.e. what humans write) program and immediately executes it”. I’m actually really surprised two people have objected to this, what is your definition of an interpreter?

                      Now here you’re absolutely playing fast and loose with definitions.

                      Agreed. But I think your objections are a bit too broad.

                      Typechecking typically does more than output a “true” or “false” despite the name. It determines the sizes of things, determines which functions to call with the numerous forms of overloading (does foo.bar() call String::bar(foo) or HashMap::bar(foo)?), and more. With monomorphizing languages (and if they support the C ABI you can really force this to be monomorphism at the language level) it can create exponential numbers of things to be compiled down to machine code. I can point you to concrete “useful” code of mine that intentionally produces quadratic amounts of assembly from type monomorphism, for instance (admittedly not exposed to the C ABI, but I think it would be a small change to do so).

                      Both macros and types in most languages (with the exception of lisp?) are really their own mini language inside the larger language, at least from some perspectives.

                      1. 1

                        “A program that takes the source code for another high level (i.e. what humans write) program and immediately executes it”

                        So if gcc had a flag you could pass to it which said “after compiling the program in question, execute it” then you would consider that an interpreter? Also this definition excludes bytecode interpreters, which are (as far as I can tell) by far the most common type of interpreter in common use.

                        My definition of an interpreter would be a program which walks through some representation of a program and executes actions based on its interpretation of each instruction. In the case of Clojure, for instance, the language has eval in it, but Clojure’s reference implementation is strictly a compiler because it only knows how to convert source into JVM bytecode. Clojure does not contain any knowledge of how to interpret any given JVM bytecode instruction; that is handled by Hotspot instead. (Hotspot contains an interpreter and also a JIT compiler which emits machine code.)
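
                        Under that definition, a minimal AST-walking interpreter looks something like this Python sketch, using the stdlib ast module and handling only numeric constants, +, and * for brevity:

                        ```python
                        import ast

                        def walk(node):
                            """AST walker: execute an action per node as each one is visited."""
                            if isinstance(node, ast.Expression):
                                return walk(node.body)
                            if isinstance(node, ast.Constant):
                                return node.value
                            if isinstance(node, ast.BinOp):
                                left, right = walk(node.left), walk(node.right)
                                if isinstance(node.op, ast.Add):
                                    return left + right
                                if isinstance(node.op, ast.Mult):
                                    return left * right
                            raise ValueError(f"unsupported node: {ast.dump(node)}")

                        tree = ast.parse("(2 + 3) * 4", mode="eval")
                        print(walk(tree))  # 20
                        ```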

                        Both macros and types in most languages (with the exception of lisp?) are really their own mini language inside the larger language, at least from some perspectives.

                        Yeah, you can say that macros make a sub-language inside another language, and that sub-language is compiled. But you can’t have it both ways; once you call the macros a sub-language then the fact that they’re compiled no longer implies anything about the outer language being compiled!

                        1. 1

                          So if gcc had a flag you could pass to it which said “after compiling the program in question, execute it” then you would consider that an interpreter? Also this definition excludes bytecode interpreters, which are (as far as I can tell) by far the most common type of interpreter in common use.

                          Yes, and yes (though it includes the many interpreters that compile to bytecode as an intermediary format).

                          You don’t “interpret” bytecode, you execute it on a virtual machine (assuming you execute it directly and don’t further compile it). Or at least that’s how I’ve always used the word. You and david_chisnall seem to be in agreement here so I could be using the language wrong.

                          [Macros]

                          The language you write the macro in is the sub-language, the place where you use the macro is (frequently) part of the main language. Rust is a good example here with macros like println!("foo") all over the place, but where macro definitions look quite foreign to the rest of the language.

                          The outer language is compiled, because the statement (in the outer language) that invokes the macro has to be expanded. The inner language is probably in fact commonly interpreted (by all our definitions).

                          1. 2

                            You don’t “interpret” bytecode, you execute it on a virtual machine (assuming you execute it directly and don’t further compile it). Or at least that’s how I’ve always used the word. You and david_chisnall seem to be in agreement here so I could be using the language wrong.

                            You interpret bytecode. Or, sometimes, you compile it. For example, Java bytecode was designed to be fast to interpret, but is sometimes JIT compiled. .NET bytecode is either interpreted, JIT compiled, or AOT compiled.

                            1. 1

                              You don’t “interpret” bytecode, you execute it on a virtual machine

                              I’m too much of a descriptivist to just out and say “you’re wrong” but this is inconsistent with every other usage of this word I’ve encountered, and if you use the word this way you’re likely to run into a lot of confusion. =)

                    2. 3

                      Executable formal semantics of C, which yields an interpreter and a debugger for C, pretty much renders any argument against this point null. C is the “definitive” compiled language, given that it is a “portable assembler.”

                    3. 6

                      Obviously a compiler simply builds a compilation of multiple files into a single one (like Babel, or ld for example) while a transpiler also transforms the files from one language to another (like clang & gcc).

                      :-)

                      1. 6

                        In the (generally unlikely) event that I use the term “transpiler” what I mean is that I’m running a program that is able to, in a completely automated way, convert a program written in one language by a human to another language that is commonly used by humans to write programs.

                        So if I’m generating instructions for an intel CPU or for the JVM, I’m not transpiling. But if I’m generating javascript or C, I am. I don’t think the result needs to be or is likely to be editable by a human for it to count.

                        I don’t think calling it a “compiler” instead is incorrect in any way. I just think calling it a “transpiler” adds information about what’s going on.

                        If the resulting output were intended to be edited by a human, I’d call it a translator instead, and I wouldn’t expect the result of the automated transformation to be guaranteed equivalent to the original. i.e. I’d expect some tweaking to be required. For me, “transpiler” carries a strong implication that the result can be compiled into or interpreted as something runnable, without further human correction.

                        1. 5

                          Even if we were to claim that a compiler has to compile to “native code” we open another can of worms: what is native code? Does an x86 CPU execute “native code”, or does it interpret x86 assembly and run microcode instructions? Is your CPU a compiler? Suddenly the question becomes “what does a CPU execute”, and on some platforms the question of whether your software is a compiler or not becomes hardware-dependent.

                          1. 4

                            I think the de facto definition of ‘native code’ is code for which a lower-level abstraction is not exposed by the hardware. Yes, an x86 chip may interpret some instructions in microcode, but it doesn’t expose the microcode instruction set to any higher-level code.

                            The place this gets really fuzzy is in things like Jazelle, where the CPU natively executed a subset of Java bytecode instructions but relied on helper code for others. In that case, javac was compiling to interleaved native and non-native code.

                          2. 4

                            While I know these terms are rather fuzzily defined, my understanding has become that ‘transpile’ means you move laterally in the “speed hierarchy”, whereas ‘compile’ moves you to a faster representation.

                            The interpreter of that representation could be the CPU or some higher VM thing.

                            This is just how I’ve been reading these words, fully aware that it’s not precisely what they mean or authoritative in any sense.

                            1. 4

                              ‘Transpile’ is a portmanteau of ‘translate’ and ‘compile’. I don’t think the term is particularly meaningful. What’s wrong with just ‘translate’?

                              1. 5

                                Or rather, what’s wrong with just ‘compile’?

                              2. 4

                                This computer scientist clicked the link prepared to argue, but happily found a sane argument! Thank you!

                                1. 2

                                  I believe the very existence of this debate is a perfect reason to bring the term “translator” into mainstream usage. ;)

                                  It’s still used in the Ada circles, but hardly heard elsewhere.

                                  1. 2

                                    For characterizing what a compiler actually is, it might be helpful to remember that compilers themselves have to be written (and, potentially, compiled) in some language. Here is a blog post on that topic.

                                    1. 1

                                      I always thought ‘transpiler’ was a better word for compilers, as all a compiler does is translate code from one language into another. But they are interchangeable. It does get at the marketing question: are we obligated to give things descriptive names even if the definitions are fairly unambiguous?