1. 11

    Awk and the various members of the Unix shell family?

    1. 6

      Yup, I’ve looked at 5-10 shell implementations, and all of them are essentially AST interpreters + on-the-fly parsing/evaluating.

      Just like Ruby was an AST interpreter that recently became a bytecode interpreter, R made the same jump recently:

      A byte code compiler for R

      https://scholar.google.com/scholar?cluster=1975856130321003102&hl=en&as_sdt=0,5&sciodt=0,5

      1.  

        Do you happen to know how recently they’ve switched? (And an example of a large project written purely (or almost purely) in that language?) Thanks.

        1.  

          It looks like it happened between R releases 2.13 and 2.14, around 2011. R is from the early-to-mid ’90s, similar to Ruby, and I think they both switched to bytecode compilers at around the same time.

          https://csgillespie.github.io/efficientR/7-4-the-byte-compiler.html

          https://cran.r-project.org/src/base/R-2/

          R is for data analysis, so it’s hard to say what a “large project” is. It’s also a wrapper around Fortran numeric libraries in the same way Python is a wrapper around C (or NumPy is a wrapper around Fortran).

          There are thousands of libraries written in R:

          https://cran.r-project.org/web/packages/available_packages_by_date.html

          1.  

            Thanks! This actually looks like a pretty good example. That website even lists package dependencies so I can try to find a long chain of pre-2011 dependencies.

      2.  

        Unix shell, for sure. GNU awk and mawk compile to bytecode. Not sure about nawk, though the regexp implementation in nawk uses bytecode.

        1.  

          GNU awk and mawk compile to bytecode

          Really? That seems weird, since gawk is slower than both nawk and mawk. Or is it the features the GNU project added that make it slower overall?

          1. 6

            One big difference is that gawk operates on sequences of Unicode characters, while mawk and nawk operate on sequences of bytes. If your text is all ASCII, setting your locale to ‘C’ will cause gawk to also operate on bytes, which should close at least some of the performance gap. But gawk’s Unicode support can be nice if you have UTF-8. mawk and nawk will often still produce the right result on UTF-8 input, especially when non-ASCII characters are only passed through unmodified. But sometimes you get:

            $ echo $LANG
            en_GB.UTF-8
            $ echo "ÜNICÖDE" | gawk '{print tolower($0)}'
            ünicöde
            $ echo "ÜNICÖDE" | nawk '{print tolower($0)}'
            �nic�de
            $ echo "ÜNICÖDE" | mawk '{print tolower($0)}'
            �nic�de
            
        2.  

          Thanks for the example. I don’t know how they work internally, although to me they are in the write-more-primitives-in-a-faster-language category, the primitives being the external commands they call (maybe awk a bit less?).

          Are there examples of multi-layered library (or use) for awk and shell?

          Edit: typo

          1.  

            What does “multi-layered push” mean?

            As for awk, when I use it, I never shell out. If I need to shell out from awk, I just write the whole thing in Go or something. So I just use the primitives exposed by awk. Though, before Go, I shelled out of awk more often.

            1.  

              multi-layered push

              Oops, those are the wrong words. That’s what I get for alt-tabbing between the comment window and a shell.

              I mean something like calling a library which calls another library which calls another library, with all intermediate libraries written in awk (or shell). (Or the same thing with functions if there are layers one on top of each other. By this I mean functions in one layer treat functions in the previous layer as primitives.)

              As for awk, when I use it, I never shell out.

              That would be a good example then if you also have at least two layers of functions (although more would be nice).

        1.  

          I question the assertion that the interpreting slowdown is exponential; whether it takes a few dozen cycles per statement (as in a bytecode interpreter) or a few thousand (as in a pure AST interpreter), the same number of statements would need to be executed, and so the slowdown is roughly linear no matter how deep the call stack gets. The exception to that would be if the interpreter were itself getting interpreted for a subroutine call, and I can’t see any reason that an implementation would do such a thing.

          That said, one interesting place you might look for purely interpreted applications is in the Forth family; it’s fairly common to implement Forth using some form of threaded code, which is effectively an AST modulo the fact that neither the S nor the T really apply to Forth. You don’t see many programs written in Forth directly (though apparently Chuck Moore has written a VLSI layout package), but PostScript is fairly common and every implementation I’ve seen has been a pure interpreter.

          Another possibility, also outside the realm of traditional languages, is TeX, which seems to be completely uncompilable (even to bytecode); I’d definitely consider LaTeX to be a large application written entirely in TeX.

          1.  

            and so the slowdown is roughly linear no matter how deep the call stack gets.

            You’re the first one to say anything about this! I need to think about this again. Right now I think you’re right. Hm, so the slowdown isn’t from where I think it is in that case. I hope I didn’t just waste more than a month on a false premise.

            Also, then I’m surprised there aren’t more examples of direct AST interpretation if it’s “just” a constant factor.

            Edit: Ok, so I think the issue is that the slowdown scales exponentially with the height of the call stack, but that’s unaffected by whether bytecode or a direct AST is interpreted. I’m trying to push primitives (like loops) out of the host language and have them written in L (a bit like Forth, but it’s more interpreted than compiled). Now instead of calling a primitive D directly, I call A, which calls B, which calls C, which calls D. If each of those has ~10 instructions in L, then I get a 1000x slowdown. Is this reasoning right?

            The exception to that would be if the interpreter were itself getting interpreted for a subroutine call

            There are many nested layers but the interpreter itself isn’t interpreted (except possibly as a (very deliberate) test).

            Forth

            I guess I’m thinking of Forth as basically already bytecode, since it’s a stack machine like most bytecode VMs. The same goes for Postscript, although I don’t know the language.

            TeX and LaTeX

            That’s an interesting example. I don’t know how it works internally, just heard it described as a macro expansion language. I’m not sure I’d say that’s interpreted, but it’s definitely not compiled to either assembly or bytecode. So it would be an example of what I’m looking for, although I don’t think I can actually make use of this one.

            1.  

              I think the way to look at it is that, as you push the veil back and write more and more of your logic in L, the slowdown applies to more and more of your execution time. It’s still not exponential; if a mixed program takes time t to run, and you have an interpretation overhead of o, the slowest you can possibly get by moving everything to interpretation is t*o and the fastest it can get is t/o. Those bounds could be tightened somewhat by considering the time spent in interpreted code vs. compiled code separately; take i to be the ratio of time spent in interpreted code. Then the worst case runtime would be t*i+(1-i)*t*o and the best case runtime would be t*(1-i)+i*t/o.

              These bounds are difficult to think about intuitively though, because they look at what happens if you moved everything to compiled code or to interpreted code. So, instead of looking at the possible performance improvement or loss, let’s look at the actual performance relative to the best case of everything being compiled. If the proportion of interpreted code in your program is i, the slowdown relative to the best possible case will simply be i*o+(1-i).

              It’s worth noting, though, that for the first few levels of pushing logic into your language, i grows (roughly) exponentially larger. The base has nothing to do with the interpretation overhead though; it’s simply the branching factor of your routines.
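
              To make that concrete, here is the model as a toy Python sketch (purely illustrative; i and o are the quantities defined above):

              # slowdown relative to an all-compiled baseline:
              # i = proportion of time spent in interpreted code,
              # o = interpretation overhead factor
              def slowdown(i, o):
                  return i * o + (1 - i)

              # with a (hypothetical) 100x overhead, interpreting 10% of the
              # program costs ~11x overall, not 100x:
              for i in (0.0, 0.1, 0.5, 1.0):
                  print(f"i={i:.1f}: {slowdown(i, o=100):.1f}x")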

              1.  

                Thanks! Your comments here should be all the way at the top!

                If the proportion of interpreted code in your program is i, the slowdown relative to the best possible case will simply be i*o+(1-i)

                Yes, I agree, and the all-compiled program is a better baseline. This means that I should get a factor of o at worst, even if it’s all AST-interpreted.

                It’s worth noting, though, that for the first few levels of pushing logic into your language, i grows (roughly) exponentially larger.

                Hm, I’m wondering if this is what I’ve been seeing and what led me to the incorrect exponential conclusion.

                Still, that means both direct AST interpretation and layering will result in “only” a constant factor slowdown? And it’s also “only” a constant factor to go from bytecode to native assembly.

                In general, is there somewhere I can run my reasoning (like this one) by some people for flaws? Other than posting on lobste.rs, that is. I’ve told some others and no one saw the problem. At the same time as I’m re-doing the time bound estimates, I’m also estimating how much time I lost because of this :(.

                Ok, one more thing. If functions in level j only call functions in level j-1 (and level 0 is primitives of the host language), then the time for a level-j call is exponential in j, right? (And this happens regardless of whether the levels are (all) interpreted bytecode or direct AST.)

                And that means I should make calls at as low a level as possible (instead of j-1) to make things faster. Except this goes in the wrong direction in terms of the abstraction I’m trying to provide (and the potential polymorphism that would allow). For example, not using the object system would be faster than using it.
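
                To check that reasoning, here it is as a toy sketch (the branching factor of 10 is the hypothetical number from above):

                # one call at level j expands into b calls at level j-1,
                # so a level-j call costs b**j level-0 primitives
                def cost(j, b):
                    return 1 if j == 0 else b * cost(j - 1, b)

                # A -> B -> C -> D with ~10 instructions per level:
                print(cost(3, 10))  # 1000, the 1000x slowdown from before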

              2.  

                Also, then I’m surprised there aren’t more examples of direct AST interpretation if it’s “just” a constant factor.

                You need to take into consideration how few toy languages make it to production use, where that constant factor results in significant savings of time and money. (Fewer servers, less developer downtime while tests run, etc.)

                1.  

                  I guess I’m thinking of Forth as basically already bytecode, since it’s a stack machine like most bytecode VMs. The same goes for Postscript, although I don’t know the language.

                  Hmm. Forths are typically pretty low level, and often run directly on hardware (though they often use their own calling conventions instead of the C one). The dynamism comes from the fact that words are “compiled” into a list of addresses of the other words, which can basically then be “jumped” to in succession. So it’s actually slightly strange, and not really a VM with “a list of known operators that you switch on” in the traditional sense.
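
                  A loose Python analogy, if it helps (an approximation only; a real Forth lays words out in memory and jumps through machine addresses):

                  stack = []

                  def lit(n):
                      # make a "word" that pushes a constant
                      return lambda: stack.append(n)

                  def add():
                      b, a = stack.pop(), stack.pop()
                      stack.append(a + b)

                  def execute(word):
                      # a word is just a list of references to other words;
                      # execution jumps through the list, no opcode switch
                      for w in word:
                          w()

                  execute([lit(3), lit(4), add])  # like : FOO 3 4 + ;
                  print(stack.pop())              # 7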

                  1.  

                    Good point. Although it means Forths are even more compiled (and a worse example of what I’m looking for) than languages that only stop at bytecode.

              1.  

                Do languages that implement compiling to closures count? They aren’t strictly AST-style interpreters at that point, but they also aren’t bytecode interpreters, either. I mention it because I’ve successfully sped up some direct AST-walking templating languages in the past with this technique. So, I’d nominate “templating languages” as a potential consideration.
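
                For anyone unfamiliar, here is a minimal sketch of the technique (hypothetical AST shape; real templating languages have many more node types). The dispatch on node type happens once, up front; evaluation afterwards just calls closures:

                def compile_node(node):
                    op = node[0]
                    if op == "lit":
                        v = node[1]
                        return lambda env: v
                    if op == "var":
                        name = node[1]
                        return lambda env: env[name]
                    if op == "add":
                        lhs, rhs = compile_node(node[1]), compile_node(node[2])
                        return lambda env: lhs(env) + rhs(env)
                    raise ValueError(f"unknown node {op!r}")

                # compile once, evaluate many times without re-walking the AST
                f = compile_node(("add", ("var", "x"), ("lit", 1)))
                print(f({"x": 41}))  # 42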

                1.  

                  Very interesting! I wasn’t aware of this technique, so I’ll have to read that before I can really tell. What’s an example of a language that uses it (other than their Scheme)?

                  On the practical side, it will depend on how much extra is needed to make use of this method (compared to adding compilation to bytecode).

                  1.  

                    I’m not aware of any real programming languages using this technique (at least not in general use), but once upon a time I added this technique to toffee, and I experimentally added it to one of the Python mustache implementations (but never finished/upstreamed it), with promising initial results.

                    Conceptually it’s much simpler than compiling to bytecode, I think, so it’s the thing I always reach for when performance of a little AST evaluator matters. I’ve written several small scale interpreters this way for things like evaluation of basic “rules” and, more recently a prototype JSON document query / merger thingy.

                1. 15

                  PicoLisp is an interesting example of a purely interpreted language, though it doesn’t meet your request for a popular interpreter. It uses dynamic variable binding, which was historically a common feature of the directly-walking-sexpressions approach to interpreting Lisp.

                  Emacs’s Elisp used to be implemented similarly, but now has a byte-compiler. Earlier than that, almost all Lisp and Scheme systems used to include a directly-walking-sexpressions interpreter for use at the REPL because compilers had too high latency to use interactively. But compilers have gotten fast enough that many systems have switched over to just using a compiler everywhere, to avoid having to maintain a separate interpreter. One that still has both an interpreter and a compiler is Chez Scheme, which is now open source (and like all Scheme interpreters, it does lexical variable binding). The interpreter has lost most of its reason for existence though: historically, the Chez business model was that the interpreter, Petit Chez, was free and cross-platform, while platform-specific native-code compilers were sold as products.

                  1. 9

                    Emacs’s Elisp used to be implemented similarly, but now has a byte-compiler.

                    Though Emacs has a byte compiler, it’s not used exclusively. Nearly every Emacs setup runs a mix of interpreted and compiled code, so Emacs may contain the most widely-used non-bytecode interpreter in the world.

                    1.  

                      This is very interesting to learn. Are there large pure PicoLisp projects? Do you happen to know the last version of Elisp that was still interpreted? I’ve never looked, but I’m guessing Emacs is mostly written in Elisp?

                      Edit: Ok, so I just looked at some PicoLisp source for PilBox and it seems to me that the call graph is pretty shallow. That is, functions in these libraries call primitives but not each other very much. That and using its Java FFI for Android is really cool! But doesn’t quite help with examples of what I’m trying to find (tricks to get around the speed issue from nesting in direct AST interpretation).

                        1.  

                          On a related note, femtolisp was used in the Julia compiler. They rewrote Julia mostly in Julia, so I’m not sure how much femtolisp is in there now.

                          1.  

                            Thanks!

                            even though many primitives (e.g. filter and for-each) are written in the language instead of C.

                            This sounded promising but then it went on to

                            femtolisp uses a bytecode compiler and VM

                            which is unfortunate for the current question. Otherwise, Julia should have certainly been a large enough example of a femtolisp program.

                            1.  

                              I seem to recall that femtolisp was a straight up interpreter at one point in time, but I think I may be confusing that with the tiny implementation, which I’d guess is less used than femtolisp itself, and therefore purely academic in nature. :)

                            2.  

                              On a related note

                              I’m not sure how this is at all related, to be honest. :)

                              1.  

                                Yeah, I read it too quickly or forgot about the no-bytecode requirement by the time I replied. My bad. At least the author found the link interesting.

                        1. 4

                          Very nice!

                          It includes unique algorithms that I invented (afaik), like a contextual-lexer for LALR, and a dynamic lexer for Earley.

                          Do you have a high level description of what the algorithm is somewhere?

                          It can generate a stand-alone parser.

                          What do you mean by this? Aren’t most parsers parser generators? (In fact, mine is the only one I’m aware of that isn’t a parser generator.)

                          (* PEGs cannot handle non-deterministic grammars. Also, according to Wikipedia, it remains unanswered whether PEGs can really parse all deterministic CFGs)

                          I’m not sure about this statement. You could just as well say CFGs can’t handle PEGs because of the lack of a choice operator (or rather that CFGs have unpredictable outputs on non-deterministic grammars for inputs with more than one parse).

                          Notice punctuation doesn’t appear in the resulting tree. It’s automatically filtered away by Lark. Build a parse-tree automagically, no construction code required

                          Because of these statements, I looked at python_parser.py

                          >>> from python_parser import *
                          >>> python_parser2.parse('a >= b\n') == python_parser2.parse('a == b\n')
                          True
                          

                          One statement is >= and the other is == but the information is lost.

                          >>> python_parser2.parse('a >= b\n')
                          Tree(expr_stmt, [Tree(comparison, [Token(NAME, 'a'), Tree(comp_op, []), Token(NAME, 'b')])])
                          >>> python_parser2.parse('a == b\n')
                          Tree(expr_stmt, [Tree(comparison, [Token(NAME, 'a'), Tree(comp_op, []), Token(NAME, 'b')])])
                          

                          Lark looks pretty good otherwise.

                          1. 6

                            Hi, thanks for your comments!

                            Do you have a high level description of what the algorithm is somewhere?

                            I kept postponing it because I wanted to write a lengthy blog post, but you’re right, and I will add an explanation. Meanwhile, here it is in a couple of sentences:

                            • Earley is a chart parser, which means it goes through the input letter-by-letter, which is very slow. You can use a lexer, but then you lose a lot of the ambiguity in the text. My improvement is a “skipping” chart parser, which lets Earley match regexps, to the same effect as scanless parsing but much faster in the common case.

                            • The contextual lexer communicates with the LALR(1) parser and uses the parser lookahead prediction to narrow its choice of tokens. So at each point, the lexer only matches the terminals that are legal at that parser state, instead of all of the terminals (which is what yacc & ply do). It’s surprisingly effective at resolving common terminal collisions.
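
                            In toy Python terms, the contextual part looks something like this (a sketch of the idea only, not Lark’s actual code):

                            import re

                            TERMINALS = {
                                "NUMBER": re.compile(r"\d+"),
                                "NAME":   re.compile(r"\w+"),   # would normally collide with NUMBER
                                "PLUS":   re.compile(r"\+"),
                            }

                            def next_token(text, pos, allowed):
                                # only try the terminals legal in the current parser state
                                for name in allowed:
                                    m = TERMINALS[name].match(text, pos)
                                    if m:
                                        return name, m.group(), m.end()
                                raise SyntaxError(f"no allowed terminal at {pos}")

                            # if the (hypothetical) parser state after PLUS only allows
                            # NUMBER, "123" can no longer be mis-lexed as a NAME:
                            print(next_token("1+123", 2, allowed=["NUMBER"]))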

                            Aren’t most parsers parser generators

                            The most common Python parsers aren’t. Namely PLY, pyparsing and the rest. In the sense that you always need the entire library to build a parser. I agree it’s not innovative in the grand scheme of the parser world, but it’s pretty unique in the Python landscape.

                            CFGs have unpredictable outputs

                            Lark lets you control that with weights, and you can also get a parse tree that contains all of the ambiguities, for you to trim as you see fit.

                            CFGs can’t handle PEGs because of the lack of a choice operator

                            I’m not sure why you think that. The choice operator is an “or” operator that forces determinism. But when your parser can handle non-determinism, it’s not necessary. You can make your “choice” later, after the parse is complete.

                            The choice operator does have advantages in terms of efficiency. If enough users ask for it, I will add it as a feature (it’s entirely possible to add to the Earley parser).

                            python_parser.py

                            These are the comments at the header of the grammar:

                            // This grammar should parse all python 2.x code successfully,
                            // but the resulting parse-tree is still not well-organized.
                            

                            There are multiple ways to keep the operator, for example to define comp_op like this:

                            !comp_op: "<" | "==" | ...
                            

                            I did fix that in the Python3 grammar, but there are many small details like that. I just never got around to it, it’s just an example in the examples folder.

                            Thanks again for your input.

                            1. 3

                              “skipping” chart parser, which lets Earley match regexps

                              Thanks for the description. I do hope you write that blog post at some point.

                              So it can match either a regexp or a character (instead of just a character)?

                              On a side note, I always found it weird to use regexp in parsing because regexp matching is already a parsing task so somehow a subparser is used for part of the grammar. There’s no reason not to do this, of course.

                              contextual lexer communicates with the LALR(1) parser

                              Oh so you are doing both steps at once (or sort of alternating between them).

                              The most common Python parsers aren’t. Namely PLY, pyparsing and the rest. In the sense that you always need the entire library to build a parser. I agree it’s not innovative in the grand scheme of the parser world, but it’s pretty unique in the Python landscape.

                              Oh, I see. I never looked at those two in more detail. Maybe I was thinking of Parsimonious or Parsley? It definitely felt like they were essentially generating Python code although it may be encoded in some form or other.

                              Lark lets you control that with weights, and you can also get a parse tree that contains all of the ambiguities, for you to trim as you see fit.

                              Weights? Sounds pretty interesting!

                              You can make your “choice” later, after the parse is complete.

                              Yes, that’s true. I’ve never seen it stated that way.

                              The choice operator does have advantages in terms of efficiency. If enough users ask for it, I will add it as a feature (it’s entirely possible to add to the Earley parser).

                              How does that work with Earley parsing? I probably haven’t thought about this as much as you have, but it’s not obvious to me how to keep track of multiple alternatives.

                              I don’t think a lot of people will ask for this as a feature mind you. I’m more curious about how that can be done.

                              These are the comments at the header of the grammar: // This grammar should parse all python 2.x code successfully, // but the resulting parse-tree is still not well-organized.

                              Yeah, I saw that, but I thought it meant the tree could be shaped differently than what the ast module would produce while still being an AST that you could, say, pretty-print the parsed Python source code with.

                              !comp_op: “<” | “==” | …

                              Ok, that looks like a pretty simple fix, so it isn’t really an issue then. I didn’t find anything about the prefixes ! and ? in the documentation though. Maybe I just looked too fast.

                              1. 4

                                I do hope you write that blog post at some point.

                                I’m sure I will. And your expressed interest makes it even more likely!

                                Yes, it can match regexps. And characters are just very short regexps :)

                                I always found it weird to use regexp in parsing

                                It actually makes a lot of sense. A stronger parser is always slower than a weaker one, so in speed: string match > regexp > LALR > Earley. Splitting the parser into several layers lets us pick the low-hanging fruit faster, and use the heavy lifting only when necessary. In fact, I believe it’s possible to make the chart-skipping Earley run LALR(1) on the non-ambiguous subgrammars of what it’s trying to parse, so it will have 3 layers. It’s a little more tricky than regexps, but hopefully I’ll manage to write it some day.

                                So you are doing both steps at once

                                Actually, it’s the same classic architecture, that both Yacc and PLY use: Read token, advance parser state, read token, advance parser state, etc.

                                The only difference is that now, before the lexer reads a token, it asks the parser what state it’s in, and tests only the relevant subgroup of terminals (which are organized into groups beforehand, of course).

                                I don’t know how nobody thought of that before :)

                                Weights? Sounds pretty interesting!

                                It’s still a little primitive: you have to use integers, and the higher one gets chosen. But it’s used rarely enough that it’s okay for now. And I do sum up the descendants, so weights propagate up the grammar.

                                There is, however, more than one weight-selection algorithm. The other one was contributed by a guy who used it for NLP.

                                its not obvious to me how to keep track of multiple alternatives.

                                Actually, Earley already keeps track of multiple alternatives. The hard part is to make it consider only one! So I think the way I would do it is: when I see a choice operator, I would only let Earley see the first option, and I’ll mark the spot with a “checkpoint”. That can happen many times during the parse. Then, when the parse fails (either immediately or when it reaches the end), I would backtrack to the most recent checkpoint and yield the second option, if any, or backtrack some more. The chart parser will have to be modified to support incremental parsing, but I think there’s nothing preventing it from doing so.

                                It’s only efficient if used correctly, of course, because the worst case performance is atrocious. But I think PEGs have the same problem, no?

                                I didn’t find anything about the prefixes ! and ? in the documentation though.

                                The ? prefix is mentioned here: https://github.com/erezsh/lark/wiki/Tree-Construction

                                But I did forget about the ! prefix. I’ll add it, thanks!!

                          1. 1

                            Interesting project, but I have a question regarding:

                            Python is Lisp with syntactic sugar and Lisp is Forth with syntactic sugar.

                            Could you elaborate further on this point? I’m guessing it’s not to be taken literally, but although I have heard of the connections between Python and Lisp (map, filter, lambda, …) I haven’t heard about the second connection. Wasn’t FORTH invented in the 1970s and LISP in the 1950s? I’m guessing the order of appearance isn’t strictly necessary, but you don’t often hear of an older language being described as sugar over a newer one.

                            1. 3

                              I’m guessing it’s not to be taken literally, but although I have heard of the connections between Python and Lisp (map, filter, lambda, …) I haven’t heard about the second connection. Wasn’t FORTH invented in the 1970s and LISP in the 1950s?

                              I think the idea is that Lisp is a delimited, prefix language (e.g. (foo (bar baz) quux)) while Forth is an undelimited, postfix language (e.g. quux @ baz @ bar foo, where baz & quux are words which put an address on the stack, bar is a word which pops a value from the stack and pushes a value on the stack & foo is a word which pops two values from the stack and does something).

                              You can imagine that one could work up to a Lisp from something like OPAREN quux OPAREN baz bar CPAREN foo CPAREN in Forth, where OPAREN & CPAREN are special Forth words which do the obvious thing.

                              1. 3

                                It’s nothing so profound, just a remark about syntax. It’s almost as bargap explained. To go from Lisp to Forth:

                                (foo (bar baz) quux)
                                

                                Put the function at the end of the list instead of the beginning so (func arg1 arg2 arg3) becomes (arg1 arg2 arg3 func).

                                ((baz bar) quux foo) 
                                

                                And remove the parentheses

                                baz bar quux foo
                                

                                This should produce the same output when run in Forth. (Assuming constants are functions with no parameters that return the constant, and adjusting for autoquoting and implicit returns.)

                                The same goes for the Python to Lisp transformation. Replace all syntax with calls to the dunder equivalents, put the functions inside the parentheses and remove the comma separators (func(arg1, arg2, arg3) becomes (func arg1 arg2 arg3)), and turn indented blocks into quotes (and put them as the last argument of the block heading).

                                1. 2

                                  I still don’t see the link they’re trying to make with lisp and forth, especially considering their lisp example looks nothing like lisp…

                                  consider their example

                                  bind("fib" fun([i]
                                                 [if(<(i 3) [return(1)])
                                                  return(+(fib(-(i 1)) fib(-(i 2))))]))
                                  

                                  against common lisp

                                  (defun fib (i)
                                   (if (< i 3)
                                    1
                                    (+ (fib (- i 1)) (fib (- i 2)))))
                                  
                                  1. 2

                                    To do the reverse transformation:

                                    (defun fib (i)
                                      (if (< i 3)
                                       1
                                       (+ (fib (- i 1)) (fib (- i 2)))))
                                    

                                    Put function names in front of the parentheses.

                                    defun(fib (i)
                                     if(<(i 3)
                                        1
                                        +(fib(-(i 1)) fib(-(i 2)))))
                                    

                                    Add explicit returns.

                                    defun(fib (i)
                                     if(<(i 3)
                                        return(1)
                                        return(+(fib(-(i 1)) fib(-(i 2))))))
                                    

                                    Put square brackets around quotes to distinguish them from calls.

                                    defun(fib [i]
                                      if(<(i 3) return(1)
                                         return(+(fib(-(i 1)) fib(-(i 2))))))
                                    

                                    Change which arguments need to be quoted and which ones don’t.

                                    defun("fib" [i]
                                      [if(<(i 3) [return(1)]
                                          return(+(fib(-(i 1)) fib(-(i 2))))]))
                                    

                                    (Some of these transformations I’ve seen on discussions on possibilities for Lisp syntax.)

                                1. 2

                                  It’s been running for a couple of minutes now, but I guess it’s working.

                                  Very interesting language!

                                  Edit: Took ~15 minutes to get to a REPL?!

                                  1. 2

                                    Thanks for trying this out and reporting back! Which file in precompiled/ did you pass it?

                                    If you just want its (FlpcForth) REPL, then ./flpc with no arguments should work. Nothing is defined in that case, of course. You can try to paste in some source from precompiled/*.

                                    The sample sources in precompiled/ are all for parsing (some of) its own source and/or generating its own source. That’s why it takes so long (although even that shouldn’t…).

                                    There’s no FlpcPython REPL yet (it can parse FlpcPython but doesn’t know what to do with the resulting AST). So right now, to write FlpcPython, I suggest writing to a file, say foo.flpc, and running

                                    python compiler.py lib/stage{0,1{a,c,d},3{a,b}}.flpc foo.flpc > foo.f
                                    ./flpc foo.f
                                    

                                    This uses the external Python compiler which will eventually be removed.

                                    1. 1

                                      Thanks. I ran precompiled/flpc-all.f. Oh, I just noticed that you recommended flpc-gen.f above :/

                                  1. 1

                                    Is Nim a good language for creating (eventually) self-compiling languages? Is it a good target for code generation (by which I mean, do you have to fight it a lot if the language you’re implementing is different)?

                                    I switched over to C from x86_64 assembly but would like to know what the good alternatives are these days (currently, they are both hard to debug).

                                    The readme in src/ talks about a sprymin.nim file but I wasn’t able to find it anywhere.

                                    1. 3

                                      Honestly, I don’t know. I’d have to learn the language. I’ll ping @dom96 in case he can tell you something about the debugging story or what low-level stuff people do in it. It’s in a state of constant change, so I’d probably have missed stuff. The main page did say it at least outputs a stack trace when it crashes.

                                      I know it has several features for metaprogramming. That gives it lots of power. Compiling to C with a low-latency GC gives it broad applications. Given Oberon was used for OS’s, you could probably build most things with it so long as performance wasn’t too bad. It probably has an unsafe keyword or something for high-performance, manual management of memory for when GC can’t be tolerated. You still get the benefits of a HLL in that case. If resource-constrained, you might try embedding something like C in as a DSL. The Ivory language does the latter in Haskell. Techniques like Design-by-Contract help a lot in debugging. You should aim to knock out as many errors as possible ahead of time with a safe, high-level language that localizes them where possible. Then, contract assertions can help you find the rest, especially combined with property-based testing or fuzzing.

                                      Far as self-compiling, Nim is already implemented in Nim. It leverages C for performance. One can always build subsets that compile unoptimized to x86 if they want to compile something in itself. This is something many obsess about but that isn’t that important. That’s because the traits we judge compilers on are the same as for any program: compilation speed, runtime speed, flexibility, safety/security. The compilers that produce the fastest code are black boxes in unsafe languages you’d have to invest massive time into understanding. Those you can bootstrap and understand easiest will produce sub-optimal code you’ll likely not use. The last point is that one should always use the best tool for the job. Certain languages, like Ocaml, have features that make them inherently better at writing compilers balancing code size, safety, and performance. For an extreme example, Ometa can implement things like JavaScript in a few hundred lines of code. Its tradeoff is conciseness and flexibility you’re not getting in C, but way, way slower.

                                      If you need both, then the Wirth languages and some Schemes are probably best. The former has whole books telling you how to compile it, with source code. Use them for trust/understanding, or extract to C to leverage black boxes. For Scheme, you can use the features that are easiest to implement in a modern Scheme (eg Racket or Chicken), where you can use their optimizing compiler or extend a mini-homebrew Scheme you understand up to their level.

                                      Those are the basic principles. Compiling a language in itself just isn’t that important for 99.9+% of people’s use cases. It’s more a tradition of testing the language on something hard that the developers are sure will both stick around and get bigger. It’s not necessary, though, as with all the HLLs written in C++ or whatever.

                                      1. 2

                                        Thanks for the detailed reply! If you get ahold of them, I’d be much more interested to know how they intend to spin up the live environment, especially for something like the Squeak debugger. Even a simple curses or command-line version would be interesting if it can be made without a lot of code.

                                        Nim does look like it has a number of nice properties; at least it claims to aim for speed and allows manual GC. On the other hand, it does not seem to be very focused on its debugging tools. [Someone tried to remedy this](https://forum.nim-lang.org/t/3076), and it does have source and line number info, but presumably I can’t easily evaluate Nim expressions and make calls in gdb (or Nim’s debugger if it had one).

                                        Techniques like Design-by-Contract help a lot in debugging.

                                        Assertions are used a bit; maybe I should do more. But I’d really rather have a better debugger/REPL/live environment (preferably written in the new language itself). Part of the reason is that the spec changes a lot as I explore, and a lot is even scrapped. In this case, the language and (core) library also change, so updating assertions and tests is rather costly. The same reason applies to other techniques that need a lot of time before I can run the program and visualize the result to decide if the program should be scrapped or not (complete correctness is usually not needed to visualize).

                                        Is there any way to gain certainty more quickly?

                                        Nim is already implemented in Nim

                                        I briefly looked at Nim’s compiler in Nim but it says that it was almost directly translated from Pascal and it is also quite large.

                                        One can always build subsets that compile unoptimized to x86 if they want to compile something in itself.

                                        I didn’t understand what you mean here. Subset of Nim, the new language or the Nim compiler?

                                        Ometa can implement things like JavaScript in a few hundred lines of code.

                                        I think the original Ometa is also hard to debug.

                                        Its tradeoff is conciseness and flexibility you’re not getting in C but way, way slower.

                                        Is it really that much slower? I thought you could use it for optimization using Ometa AST transformations.

                                        Scheme

                                        In Scheme flavours, is it easy to implement specific behaviours with respect to memory layout and computation? Do you only get a language equivalent to what you specified or can you get the same amount of control as assembly and C (and I guess without fighting the language too much)?

                                    1. 1

                                      I know it’s not the same thing, but it reminds me of the syntax-directed editor of Python’s spiritual ancestor, ABC.

                                      1. 1

                                        I’m unaware of that editor. Is there a video of its usage somewhere?

                                        1. 1

                                          Not that I’m aware of, but there’s a paper describing it here: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.694.3304&rep=rep1&type=pdf

                                          (Like I said, it’s not really related, just thinking about editing Python AST’s and how Python is kinda-sorta descended from ABC.)

                                      1. 1

                                        My approach to this is to just look at the worst case time (all constants included). Keeping a low worst case time also “composes” into low (but perhaps more importantly, predictable) worst case times for larger functions. If something looks like it runs slow then you can also check if any part breaks your assumptions about the maximum amount of time it should take.

                                        In that case, you don’t have to try to second guess how your program (or C compiler) will be used (i.e., on what input distribution).

                                        By accepting the “loss” of this time difference between average and worst case, you can also get clear-cut winners. For n cases in a switch statement,

                                        • linear search takes 2n
                                        • jump tables takes log(n) + 4
                                        • binary search takes log(n)

                                        instructions in the worst case. This tells us what to do for large n. For small n, we can still look at the details of how many instructions the worst execution path will take for each approach (or even the best imaginable approach for worst case time).

                                        Huffman search doesn’t apply since we don’t assume we know the frequencies in advance.

                                        This is more of a complementary point to this post, both because we still have to figure out the best thing to do for small n (and hard-code it into the compiler) and because the point that the compiler will optimize the switch statement on its own (so the programmer doesn’t have to) still holds. Although I do find it adds an extra level of indirection for reasoning about the code’s running time.

                                        1. 2

                                          How is a jump table log(n)? Wouldn’t it be O(1)?

                                          1. 1

                                            Yes, thank you! It’s just a table lookup. And I think binary search might be 2 log(n) in general, to compare and then jump.

                                            Looks like it’s too late for me to edit though.
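
                                            If I could still edit, I’d add the corrected back-of-the-envelope counts, something like this (the constants are hypothetical and machine-dependent):

                                            import math

                                            def worst_case(n):
                                                return {
                                                    "linear search": 2 * n,   # compare + branch per case
                                                    "jump table": 4,          # bounds check + indexed jump, O(1)
                                                    "binary search": 2 * math.ceil(math.log2(n)),
                                                }

                                            print(worst_case(4))   # small n: linear is competitive
                                            print(worst_case(64))  # large n: table/binary win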

                                        1. [Comment removed by author]

                                          1. 2

                                            Thanks for the correction. Do you know if there’s a word for me to call it then? (Something better than “make a lisp” in many languages, that hopefully draws a similarity to the rest of the article.)

                                            I guess you have to do some other language in MAL followed by MAL in MAL for it to be bootstrapping?

                                            1. [Comment removed by author]

                                              1. 2

                                                self hosting would be implementing MAL-lisp in MAL-lisp (and they have done this I think!)

                                                Yeah, that’s what I meant by MAL in MAL. They do have it!

                                                So I think running MAL in MAL on top of MAL in X would match your definition with “compiler” replaced by “interpreter” (and skip the compile step since interpreters don’t have one, or replace it by “parse”). This is not MAL’s intended use, of course.

                                                “Hosted” sounds like it refers to a property of the final product and I want to refer to the process (the steps for writing MAL in X). I guess I can stick with “make” if there’s nothing better.

                                          1. 2

                                            Some of the arrows in the new CFG are reversed? Not sure why if depends on while.

                                            1. 1

                                              Thanks! I’ve removed the single_if to while arrow, but can’t figure out which one is reversed.

                                            1. 10

                                              Speaking as someone who wrote a Python 3 interpreter:

                                              Be careful! Python’s simplicity is a bit of a siren song. It feels simple from a user perspective because it stays out of your way, but that means nothing when it comes to simplicity of implementation. Worse, if you’re expecting to replicate the semantics of existing Python code you might be faced with actually implementing all the weird stuff it does in order to use your existing code.

                                              I had several moments on Hython where I seriously doubted if I could finish it to a satisfactory degree, and whether it was worth it. In retrospect, I have to say it was not worth it; I should have just worked on my own language instead.

                                              1. 5

                                                Thanks for the feedback. Yes I’ve long realized that Python is misleading – it hides a very large language with some tricky semantics underneath that clean syntax!

                                                  I didn’t emphasize this enough in the blog post, for reasons of space, but I have already run my unit tests under the 4 components. It parses and compiles them into a runnable format. So the remaining work is to port the Python interpreter loop to C++, and fill in all the metacircular stuff like dicts, lists, sets, etc.

                                                There is some hidden work like probably implementing the marshal format for bytecode. I have an idea for punting on the garbage collector if necessary.

                                                But I only have to make 12,000 lines of well-tested Python work. I think I can do that. The main question in my mind is if this OPy language will be useful at all to users, or if it will just be an implementation detail for me.

                                                It’s possible it will fail, but I’ve only spent 10 days on it and gotten quite far.

                                                1. 5

                                                  Hm I just looked at your code, and it’s under 3,000 lines total??? Depending on what kinds of programs it can run, I would see that as an advertisement for Python’s simplicity. It looks pretty clean in Haskell.

                                                  I’m aiming for about 8,000 lines of Python and 3,000 - 5,000 lines of C++ for OPy, and I only have to write the C++ part.

                                                  Remember the shell doesn’t need much of a standard library. It only makes like 10 system calls – fork/exec/wait/open/read/write/close/pipe/dup2/fcntl is almost all of it.

                                                  1. 2

                                                    This is a really impressive project! Did you implement your own dict and list or use an existing implementation from Haskell?

                                                      I’m also in the boat of writing a Python (variant) interpreter. I think it’s a pretty good idea to try it if you’re not aiming for 100% compatibility. I did hit a few implementation complexities here and there, but later I realized I didn’t need them (yet).

                                                    Can you list some of the weird stuff you encountered? For me it was setting up the environment on a function call (with keyword arguments, *, **).
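
                                                      For concreteness, here is a stripped-down sketch of the binding step that tripped me up (toy code; real Python has more rules, e.g. raising on duplicate arguments):

                                                      def bind_args(params, defaults, vararg, kwarg, args, kwargs):
                                                          env = {}
                                                          for name, value in zip(params, args):   # positionals
                                                              env[name] = value
                                                          env[vararg] = tuple(args[len(params):]) # leftover -> *args
                                                          extra = {}
                                                          for name, value in kwargs.items():      # keywords
                                                              if name in params and name not in env:
                                                                  env[name] = value
                                                              else:
                                                                  extra[name] = value
                                                          env[kwarg] = extra                      # leftover -> **kwargs
                                                          for name, value in defaults.items():    # fill defaults
                                                              env.setdefault(name, value)
                                                          return env

                                                      # def f(a, b=2, *rest, **opts) called as f(1, c=3):
                                                      print(bind_args(["a", "b"], {"b": 2}, "rest", "opts", (1,), {"c": 3}))
                                                      # {'a': 1, 'rest': (), 'opts': {'c': 3}, 'b': 2}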

                                                      For what it’s worth, here are my current line counts.

                                                    Component                    files          blank        comment           code
                                                    -------------------------------------------------------------------------------
                                                    Parser                           1             40            166            300
                                                    Interpreter & debugger           2             68             22            598
                                                    AST nodes semantic               2             33             15            181
                                                    Library                          2             83              6            361
                                                    

                                                      I think writing the interpreter in Python itself made the process easier because I could import missing pieces and pass them to my interpreter in a variable to visualize the final result, instead of trying to get something working in one pass. In the end, I’m hoping to only depend on ctypes so I can port the parser and interpreter parts to other languages.

                                                      (Right now it’s able to run the first chunk of its own parser, but not all variables use the list and dict types defined in my library.)

                                                    1. 2

                                                      IIRC, dict/list are hybrids: implemented mostly in Python for convenience, but most operations simply invoke interpreter primitives, which lets them interface with Haskell.

                                                      As for weird stuff, let me think on that. Right off the bat I’ll say exception handling is a bitch. :)

                                                      1. 1

                                                          I remember the first time around, I had to make a lot of changes to add exceptions, to distinguish between returned and raised values, but I don’t remember them being particularly difficult.

                                                        Exceptions are actually pretty important for this project because I wanted to be able to continue execution after an uncaught exception is raised.

                                                        In the rewrite (with the line counts above), I just put a blatant hack: a .raised parameter in error objects to indicate if they’re raised or returned. (If someone returns an error with a manually set .raised = True, I’ll consider the semantics to mean that they actually want to raise an error.)

                                                        Exceptions made me think of the break and continue statements which were really difficult for me (the first time) and ended up being implemented as exceptions!
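
                                                          The shape of it is roughly this (a simplified sketch, not my actual code):

                                                          class BreakSignal(Exception): pass
                                                          class ContinueSignal(Exception): pass

                                                          def exec_while(cond, body, env):
                                                              # body statements may raise the signals from deep inside
                                                              while cond(env):
                                                                  try:
                                                                      for stmt in body:
                                                                          stmt(env)
                                                                  except BreakSignal:
                                                                      return
                                                                  except ContinueSignal:
                                                                      continue

                                                          # toy usage: "while True: i += 1; if i >= 3: break"
                                                          env = {"i": 0}
                                                          def bump(env): env["i"] += 1
                                                          def maybe_break(env):
                                                              if env["i"] >= 3:
                                                                  raise BreakSignal()
                                                          exec_while(lambda env: True, [bump, maybe_break], env)
                                                          print(env["i"])  # 3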

                                                        Do share if you remember more of this stuff so we’ll be better prepared!

                                                        I just found this post of yours:

                                                        http://callcc.io/hython-the-simplest-possible-language/

                                                        If there are others (especially about the internals), I’d be interested in getting some links.

                                                        1. 2

                                                          Okay, I remember where a lot of the weirdness came from: I used continuations in the implementation, which are great, but you have to be very careful when using them. I vaguely recall I had to write lots of tests to verify that exception handling’s interaction with while loops handled all cases correctly. Mostly this involves careful modification of state.

                                                          Also, I added modules in late, and they always felt a bit shoved in. Conceptually they aren’t that weird, but the myriad of import statements means you need to write a very flexible implementation of import and then tailor it based on what is used.

                                                          Be careful of loops where the interpreter calls Python code which calls the interpreter again, too. I’m not sure I even handle that properly. It’s kind of nasty.

                                                          1. 1

                                                            Thanks for coming back and sharing!

                                                            That’s funny, I even have a Python Exception subclass called “Continuation” (for return, break, continue). But from the sound of it, you also implemented regular loop iterations using continuations.

                                                              I remember a Python dev saying on a blog that they wish from x import y just meant import x; y = x.y or something like that. So that’s what I ended up implementing for import. Otherwise, modules were just a different scope. Even classes were just scopes with extra values preset.

                                                            Be careful of loops where the interpreter calls Python code which calls the interpreter again, too. I’m not sure I even handle that properly. It’s kind of nasty.

                                                              Oh yes, this, definitely this. Actually, this is why I can’t explain how some of the things that shouldn’t work actually ran (correctly or apparently correctly) in the first version.

                                                  1. 4

                                                    Why oh why would someone choose to build another GUI toolkit on top of tk?

                                                    Everyone should just pig pile on Toga - fill in the documentation gaps and then all bask in the glow of awesome truly cross platform (mobile, desktop, wherever) UI for Python.

                                                    1. 2

                                                      And this is part of the reason I can’t find instructions on building a GUI toolkit.

                                                      Why oh why would someone choose

                                                       Do you have any particular gripes with the reasons given at the beginning of the article?

                                                      on top of tk

                                                      What do you mean?

                                                    1. 2

                                                      Very interesting series on using those data structures. Readers might also be interested in one that came as a counterpoint:

                                                      http://scienceblogs.com/goodmath/2009/02/18/gap-buffers-or-why-bother-with-1/

                                                      1. 2

                                                        Also be aware that this rope series is motivated by existing editors’ non-responsiveness with large files and long lines. The gap buffer article instead says

                                                        I’m not writing a string processing framework for managing gigabytes of data

                                                        the cost of moving the cursor is proportional to the distance that you move it

                                                         The gap buffer article also doesn't mention (handle?) Unicode or non-printable characters.

                                                        So, to me at least, it looks like ropes are a solution to all the gap buffer issues.
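
                                                         For readers who haven't seen a gap buffer: moving the cursor means physically copying characters from one side of the gap to the other, which is where that proportional cost comes from. A toy sketch (mine, not the article's code):

                                                         class GapBuffer:
                                                             # Both sides are stacks with the cursor-adjacent character last, so
                                                             # a one-character move is O(1) and an n-character move is O(n).
                                                             def __init__(self, text):
                                                                 self.before = list(text)   # characters left of the cursor
                                                                 self.after = []            # characters right of the cursor, reversed

                                                             def move_left(self, n=1):
                                                                 for _ in range(min(n, len(self.before))):
                                                                     self.after.append(self.before.pop())

                                                             def insert(self, ch):
                                                                 self.before.append(ch)     # editing at the cursor is cheap

                                                             def text(self):
                                                                 return ''.join(self.before) + ''.join(reversed(self.after))

                                                         b = GapBuffer('hello')
                                                         b.move_left(4)        # cost proportional to the distance moved
                                                         b.insert('E')
                                                         print(b.text())       # hEello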

                                                      1. 4

                                                        This is cool! I wrote a tiny backtracking PEG interpreter too in Python, but I never used it to parse Python.

                                                        But you’re interpreting Python’s CFG as a PEG. Have you thought about where that will break, if anywhere? Or tested it?

                                                        I have been meaning to look into Python’s bespoke parser generator pgen.c. I don’t quite understand the algorithm they are using. It seems similar to the top-down algorithms that ANTLR uses, but I haven’t grokked all the details.

                                                        1. 1

                                                          Thanks! Do you have a link to your PEG parser?

                                                          I did encounter some places where I reordered the alternatives being tested (a toy illustration of why that matters follows below). It parses all the scripts in this project (and a few others I've tried) with no glaringly obvious problems to my eyes. At the moment, I am not aware of a Python script that gets the wrong AST (except for the whitespace issue I wrote about, where some valid scripts are rejected). I don't know if my rearrangement of the grammar was actually needed or if it was just bugs in my parser at the time.
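
                                                          To illustrate why ordering can matter in a PEG (a toy example of mine, not code from the project): ordered choice commits to the first alternative that succeeds, so an alternative that matches a prefix of another has to come second.

                                                          def lit(s):
                                                              return lambda inp: len(s) if inp.startswith(s) else None

                                                          def choice(*alts):
                                                              def parse(inp):
                                                                  for alt in alts:            # ordered: first success wins
                                                                      n = alt(inp)
                                                                      if n is not None:
                                                                          return n
                                                                  return None
                                                              return parse

                                                          bad = choice(lit('in'), lit('int'))     # 'in' shadows 'int'
                                                          good = choice(lit('int'), lit('in'))

                                                          print(bad('int x'))    # 2 -- matched 'in', leaving 't x' behind
                                                          print(good('int x'))   # 3 -- matched 'int'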

                                                          I think the fact that Python is LL(1) helps, because if a string matches, it only matches in one way (if two alternatives match the next character/token of the input, you may have to backtrack if you start with the wrong one of the two). I'm not entirely sure of what I just said; maybe someone else can confirm or deny it. Also, since I don't tokenize first and pgen might, this might not be enough in my case even if what I said about LL(1) grammars is true.

                                                          I haven't looked inside pgen. It did bother me that compiling C was needed just to build the AST for a slightly different language using the ast module.

                                                          1. 1

                                                            Somewhat embarrassingly, it is only available on the defunct Google Code. I have it as an hg repo on my machine but haven't touched it in a long time:

                                                            https://code.google.com/archive/p/annex/source

                                                            I plan to revive this code, or at least the idea, as part of my http://www.oilshell.org/ project. If you have any questions let me know.

                                                            It’s not pure PEG actually – it’s a regex-based lexing system, with a PEG-like parsing system on top (using ordered choice). It’s a backtracking interpreter – no code generation.

                                                            1. 1

                                                              Thanks! The regex part makes it sound a bit like parsimonious.

                                                              Clicking through some links, I just found your recent article on Pratt parsing.

                                                              http://www.oilshell.org/blog/2016/11/01.html

                                                              Can I ask you some questions about that? I guess most of this should probably be in a PM or email.

                                                              Could you expand the shorthands Pratt uses? From your post, I have:

                                                              • lbp: left binding power
                                                              • rbp: right binding power

                                                              but what are nud, led, etc.?

                                                              And in what way does it differ from Floyd's algorithm? The first step of Floyd's algorithm computes a table and two arrays f and g of precedences. For Pratt, it seems that lbp and rbp first need to be built by hand, but then how do the two differ after that?

                                                              1. 2

                                                                Sure, nud and led are abbreviations for null denotation and left denotation, which IMO are not very meaningful terms. I explain it like this:

                                                                In Pratt parsing, tokens can be used in the null and/or left position, based on whether they take an expression on the left or not (led if they do, nud if they don't, in Pratt's terminology). Examples of left operators are infix +, postfix ++, or the pseudo-infix a[0] (with operator [). Examples of null operators are unary minus -, and grouping parentheses (.
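
                                                                For concreteness, here is a toy Pratt parser (my own sketch with one-digit numbers and single-character tokens, not code from Pratt's paper or from the post):

                                                                import re

                                                                class Parser:
                                                                    # lbp: how strongly a token binds to the expression on its left.
                                                                    LBP = {'+': 10, '-': 10, '*': 20, ')': 0}

                                                                    def __init__(self, s):
                                                                        self.tokens = re.findall(r'\d|[-+*()]', s)
                                                                        self.pos = 0

                                                                    def peek(self):
                                                                        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

                                                                    def next(self):
                                                                        tok = self.peek()
                                                                        self.pos += 1
                                                                        return tok

                                                                    def nud(self, tok):            # "null" position: no left operand
                                                                        if tok == '-':
                                                                            return ('neg', self.parse(100))   # unary minus binds tightly
                                                                        if tok == '(':
                                                                            e = self.parse(0)
                                                                            assert self.next() == ')'
                                                                            return e
                                                                        return ('num', int(tok))

                                                                    def led(self, tok, left):      # "left" position: has a left operand
                                                                        # passing lbp as the rbp gives left associativity
                                                                        return (tok, left, self.parse(self.LBP[tok]))

                                                                    def parse(self, rbp):
                                                                        left = self.nud(self.next())
                                                                        while rbp < self.LBP.get(self.peek(), 0):
                                                                            left = self.led(self.next(), left)
                                                                        return left

                                                                print(Parser('1+2*3').parse(0))
                                                                # ('+', ('num', 1), ('*', ('num', 2), ('num', 3)))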

                                                                I actually don’t know Floyd’s algorithm. I just remember it was mentioned in one of the papers, and I thought that it used a partial order on precedences rather than a total order. Do you know if that’s true and do you have a pointer to a good explanation of Floyd’s algorithm?

                                                                What are the arrays f and g in Floyd’s algorithm?

                                                                1. 1

                                                                  f and g are the left and right precedence functions (each maps a token to an integer, so they can be thought of as two arrays indexed by tokens).

                                                                  https://en.wikipedia.org/wiki/Operator-precedence_grammar#Precedence_Functions

                                                                  The Floyd article first gives an algorithm that uses a table of pairwise comparisons (which could potentially form only a partial order). Later in the article, they mention that to save memory, you can compute f and g first and then just compare f(current token) to g(next token).

                                                                  I was hoping there was a good explanation of Floyd too but haven't found one. I think that Wikipedia page is rather helpful, if incomplete in some places. I ended up reading Floyd's article. Near the end of the article, there's a huge (# tokens) × (# tokens) table for a sample ALGOL-like language, and at the bottom and right of the table are the values of f and g.

                                                                  Edit to add that I didn't read Pratt's article (even though a reformatted version is on GitHub).

                                                                  1. 1

                                                                    Hm yeah I have read that Wikipedia page and been a little confused.

                                                                    I sort of understand what they are doing with turning the matrix into two functions. To me that seems to be turning a matrix that happens to be a total order into a more compact representation of a total order.

                                                                    I don’t really understand where they use f and g in the algorithm.

                                                                    Also, the operator precedence algorithm on the page isn't given a name. It is referenced to the dragon book ("Aho, Sethi & Ullman 1988, p. 206"), but I don't think my later edition has it. I looked in the index for "precedence" and what I found was about LR parsers and yacc. Maybe they removed it?

                                                                    In any case, it’s definitely possible that Floyd’s algorithm and Pratt parsing are the same in the special case that you have f and g functions, rather than a full matrix? I really can’t tell though because I think the page is incomplete.

                                                                    Although my question would be: is anyone using Floyd’s algorithm? Have you ever seen it in code?

                                                                    I’ve seen Pratt’s algorithm in many places. Lua uses a simple variant of it (more like the “precedence climbing” structure).

                                                                    One thing I did mention in my article is that Guy Steele said the Fortress programming language disallowed certain combinations of operators without explicit parens. That is, they used a partial order rather than a total order for usability reasons. So I think that could be a good use case for Floyd’s? Fortress is open source but I haven’t looked at the code!

                                                                    If you find any further connections feel free to PM me or e-mail me at andy@oilshell.org!

                                                                    1. 1

                                                                      OK, I'll send you an email for the rest. I'll answer some parts that might be interesting to others here.

                                                                      Yeah, that page is more of a helper for when Floyd’s article wasn’t so clear. It does have every piece. It just doesn’t tell you how they all fit together.

                                                                      I don’t really understand where they use f and g in the algorithm.

                                                                      Since it's optional, Floyd's (and Wikipedia's) first description of the algorithm does not make use of f and g. To use them, everywhere the first description checks whether t1 <. t2, check whether f[t1] < g[t2] instead; and instead of t1 .> t2, check whether f[t1] > g[t2].
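
                                                                      To make that concrete, here's a toy version of the shift/reduce loop with f and g in place of the table (my sketch, not Floyd's code; the numbers are made up but consistent with * binding tighter than +, both left-associative; 'd' stands for any digit, '$' is the end marker, and since nonterminals never drive the decisions, it just prints the handles):

                                                                      f = {'d': 5, '*': 4, '+': 2, '$': 0}
                                                                      g = {'d': 6, '*': 3, '+': 1, '$': 0}

                                                                      def parse(expr):
                                                                          tokens = ['d' if c.isdigit() else c for c in expr] + ['$']
                                                                          stack = ['$']
                                                                          i = 0
                                                                          while not (stack == ['$'] and tokens[i] == '$'):
                                                                              if f[stack[-1]] <= g[tokens[i]]:
                                                                                  stack.append(tokens[i])      # top <. or =. next: shift
                                                                                  i += 1
                                                                              else:                            # top .> next: reduce a handle
                                                                                  t = stack.pop()
                                                                                  handle = [t]
                                                                                  # pop until the new top is <. the last popped terminal
                                                                                  while not (f[stack[-1]] < g[t]):
                                                                                      t = stack.pop()
                                                                                      handle.append(t)
                                                                                  print('reduce', handle[::-1])

                                                                      parse('1+2*3')   # reduces: [d], [d], [d], [*], [+]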

                                                                      In any case, it’s definitely possible that Floyd’s algorithm and Pratt parsing are the same in the special case that you have f and g functions, rather than a full matrix?

                                                                      But doesn't a total order mean you have f and g? And doesn't Pratt only work when there is a total order (since the precedences are entered by hand)? If yes, then the special case would actually be all cases. I'm hoping the answer is no and someone can tell me what makes it a no.

                                                          1. 1

                                                            Thank you! Indeed, what's next? I don't want to overhype anything, but the first option makes it easier to experiment with changes to the language.

                                                          1. 6

                                                            From a pristine db:

                                                            > db[1] = 1; db[2] = 2; db
                                                            {u'1': 1, u'2': 2}
                                                            > db.undo(); db
                                                            {u'1': 1}
                                                            > db[2] = 3; db; db.redo(); db
                                                            {u'1': 1, u'2': 3}
                                                            > db.undo(); db
                                                            {u'2': 3} #???
                                                            > db.undo(); db.redo(); db.redo(); db
                                                            {u'1': 1, u'2': 3}
                                                            > db.redo(); db
                                                            {u'1': 1, u'2': 2} #?!
                                                            

                                                            Also:

                                                            • Client undo/redos will interfere with each other (one client undoes, another redoes, etc.).
                                                            • Since the server switches to listen on a different port when you create a ‘lock’, you can have a single client request a lock and then do nothing, which prevents anyone else from querying.

                                                            I think this could be really interesting as a teaching tool: show how even a db that ‘looks’ okay can have subtle errors, and that dealing with concurrency is a really hard problem.

                                                            1. 1

                                                              Thanks for noticing this! It’s a bug in undoable that is now fixed.

                                                              For your other two points, they're also mentioned in the README [1][2], but more importantly I'm interested in discussing solutions.

                                                              For example, I've avoided using any form of timing in this project, preferring to leave it to libraries (which can independently update the heuristic definition of a lost connection). But a simple way to avoid locking up on a dead client is to have a timeout when waiting for requests from it.
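
                                                              Sketched with plain sockets (hypothetical names and timeout value, not the project's actual code), that could look like:

                                                              import socket

                                                              LOCK_TIMEOUT_SECONDS = 5.0    # made-up value

                                                              def serve_lock_holder(conn, handle_request):
                                                                  # Give the lock holder a deadline; if it goes quiet, release the lock.
                                                                  conn.settimeout(LOCK_TIMEOUT_SECONDS)
                                                                  try:
                                                                      while True:
                                                                          request = conn.recv(4096)
                                                                          if not request:
                                                                              break               # client closed the connection
                                                                          handle_request(request)
                                                                  except socket.timeout:
                                                                      pass                        # dead or silent client: stop waiting
                                                                  finally:
                                                                      conn.close()                # other clients can query again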

                                                              More generally, I don't think handling ill-behaved clients is such a clean-cut problem. What if the client has a bug that keeps requesting the same data in a loop?

                                                              [1] “The client program has to be written so the order in which other accesses are treated are unimportant.” But I can see that it may be ambiguous if undo/redo are considered accesses.

                                                              [2] “No deadlock (dead client) checks are implemented.”