Given that the focus of this article is writing a parser, a much more reasonable alternative (when the language is reasonable enough to allow it) is to simply extract the specification and write a generic parser for it. In the case of a JSON parser, we do not have to bother with things such as precedence, and a simple PEG parser can do the job easily. As I show below, writing a simple PEG parser takes very little code.

    First, here is the specification taken from the JSON webpage; the ordering is modified so that longer prefixes come first.

    import string

    json_golden = {
        "<START>": [ ["<json>"] ],
        "<json>": [ ["<element>"] ],
        "<value>": [
            ["<object>"], ["<array>"], ["<string>"], ["<number>"],
            ["true"], ["false"], ["null"],
        ],
        "<object>": [
            ["{", "<ws>", "<members>", "<ws>", "}"],
            ["{", "<ws>", "}"],
        ],
        "<members>": [
            ["<member>", ",", "<members>"],
            ["<member>"],
        ],
        "<member>": [ ["<ws>", "<string>", "<ws>", ":", "<element>"] ],
        "<array>": [
            ["[", "<ws>", "]"],
            ["[", "<elements>", "]"],
        ],
        "<elements>": [ ["<element>", ",", "<elements>"], ["<element>"], ],
        "<element>": [ ["<ws>", "<value>", "<ws>"], ],
        "<string>": [ ["\"", "<characters>", "\""] ],
        "<characters>": [ ["<character>", "<characters>"], [] ],
        "<character>": [[s] for s in string.printable if s not in {"\"", "\\"}] +
            [["\\", "<escape>"]],
        "<escape>": [[c] for c in '"\\/bfnrt'] + [["u", "<hex>", "<hex>", "<hex>", "<hex>"]],
        "<hex>": [
            ["<digit>"],
            ["a"], ["b"], ["c"], ["d"], ["e"], ["f"],
            ["A"], ["B"], ["C"], ["D"], ["E"], ["F"],
        ],
        "<number>": [ ["<integer>", "<fraction>", "<exponent>"] ],
        "<integer>": [
            ["-", "<onenine>", "<digits>"],
            ["-", "<digit>"],
            ["<onenine>", "<digits>"],
            ["<digit>"],
        ],
        "<digits>": [ ["<digit>", "<digits>"], ["<digit>"], ],
        "<digit>": [ ["0"], ["<onenine>"], ],
        "<onenine>": [ ["1"], ["2"], ["3"], ["4"], ["5"], ["6"], ["7"], ["8"], ["9"] ],
        "<fraction>": [ [".", "<digits>"], [] ],
        "<exponent>": [ ["E", "<sign>", "<digits>"], ["e", "<sign>", "<digits>"], [] ],
        "<sign>": [ ["+"], ["-"], [] ],
        "<ws>": [ [" ", "<ws>"], [] ]
    }

    Once we have this specification, writing a parser for it is trivial (see this post for details, or even better, this chapter on implementing more advanced parsers).

    import sys

    class peg_parse:
        def __init__(self, grammar):
            self.grammar = grammar

        def unify_key(self, key, text, at=0):
            if key not in self.grammar:
                # A terminal symbol: match it directly against the text.
                if text[at:].startswith(key): return (at + len(key), (key, []))
                else: return (at, None)
            for rule in self.grammar[key]:
                # Ordered choice: the first rule that matches wins.
                l, res = self.unify_rule(rule, text, at)
                if res is not None: return l, (key, res)
            return (at, None)

        def unify_rule(self, parts, text, tfrom):
            results = []
            for part in parts:
                tfrom, res = self.unify_key(part, text, tfrom)
                if res is None: return tfrom, None
                results.append(res)
            return tfrom, results

    def main(to_parse):
        result = peg_parse(json_golden).unify_key('<START>', to_parse)
        assert (len(to_parse) - result[0]) == 0

    if __name__ == '__main__': main(sys.argv[1])

    Using it:

    python3 parse.py '{ "data": { "fish": "cake", "array": [1,2,3], "children": [ { "something": "else" }, { "candy": "cane" }, { "sponge": "bob" } ] } } '

    Note that while I have not implemented much error handling, it would be simple to add to this skeleton: the information one needs (which parse failed, and at what point) is readily available.
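    For instance, one minimal way to surface that information is to record the deepest offset the parser reached before giving up. The sketch below is hypothetical (the class and names are mine, not part of the code above), and uses a tiny toy grammar rather than the JSON grammar so it stays self-contained:

```python
# Hypothetical sketch: a PEG parser that tracks the furthest position
# reached during matching, so a failed parse can be reported usefully.
class peg_parse_err:
    def __init__(self, grammar):
        self.grammar = grammar
        self.furthest = 0          # deepest offset reached so far

    def unify_key(self, key, text, at=0):
        if key not in self.grammar:
            if text[at:].startswith(key):
                self.furthest = max(self.furthest, at + len(key))
                return (at + len(key), (key, []))
            return (at, None)
        for rule in self.grammar[key]:
            l, res = self.unify_rule(rule, text, at)
            if res is not None: return l, (key, res)
        return (at, None)

    def unify_rule(self, parts, text, tfrom):
        results = []
        for part in parts:
            tfrom, res = self.unify_key(part, text, tfrom)
            if res is None: return tfrom, None
            results.append(res)
        return tfrom, results

# A toy grammar: one or more 'a's followed by 'b'.
toy = {"<S>": [["a", "<S>"], ["a", "b"]]}
p = peg_parse_err(toy)
at, tree = p.unify_key("<S>", "aaax")
if tree is None or at != 4:
    print("parse error near offset", p.furthest)   # parse error near offset 3
```

    On the bad input "aaax" the parser fails, but `furthest` pinpoints offset 3 as the place where no terminal could be matched.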

    Even if you have to do special processing for some of the nonterminals, it may be better to simply modify unify_key to handle those particular nonterminals than to write the complete parser from scratch.
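    As a sketch of what that might look like (hypothetical code, with the parser body repeated so the example is self-contained): here "<number>" is recognized procedurally with a regular expression, while every other nonterminal is still interpreted from the grammar:

```python
import re

# Hypothetical sketch: one nonterminal ("<number>") is special-cased in
# unify_key, the rest of the grammar is interpreted generically.
class peg_parse_special:
    def __init__(self, grammar): self.grammar = grammar

    def unify_key(self, key, text, at=0):
        if key == "<number>":
            # Special-cased: recognize digit runs with a regex instead of rules.
            m = re.match(r'[0-9]+', text[at:])
            if m: return (at + m.end(), (key, [(m.group(0), [])]))
            return (at, None)
        if key not in self.grammar:
            if text[at:].startswith(key): return (at + len(key), (key, []))
            return (at, None)
        for rule in self.grammar[key]:
            l, res = self.unify_rule(rule, text, at)
            if res is not None: return l, (key, res)
        return (at, None)

    def unify_rule(self, parts, text, tfrom):
        results = []
        for part in parts:
            tfrom, res = self.unify_key(part, text, tfrom)
            if res is None: return tfrom, None
            results.append(res)
        return tfrom, results

# A toy grammar that uses the special-cased nonterminal.
pairs = {"<S>": [["<number>", "+", "<number>"]]}
at, tree = peg_parse_special(pairs).unify_key("<S>", "12+345")
```

    After this call, `at` is 6 (the whole input consumed) and `tree` holds the parse of both numbers; only the one branch of unify_key had to change.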

    The utility of such an approach is that one only needs to write the parser once. You can reuse the parser with other specifications without too much effort. While a simple PEG parser may not be suitable for more complex languages such as JavaScript, one can simply use one of the variants of Earley parsers (which, again, are fairly simple to understand) with the grammar specification for JavaScript.

    A side benefit of taking the time to extract the specification and use a generic parser is that verifying that the parser produces correct results becomes much easier. One can also use grammar fuzzers to generate valid inputs for testing the parts of the code that lie behind the parser.


      This paper discusses an interesting experiment (that involved a lot of work). An interesting blog post on the article ;-)


        I was led to it from your other post, and I wanted to highlight that as a separate post :). I am interested in what the community might say about the utility of fuzzing in areas that may not directly involve security (such as compilers).


          I think compilers are fundamental to security, and I wish that were a more common position.


            They certainly are, in that one needs to trust the compiler. However, I wonder what the security impact of a bug in a compiler is that is never transmitted to the compiled artifact.


            There is always the “whoopee do” post that helped germinate the original work :-)

            I think that grammar-based fuzzing has practical, non-security uses, provided the rule probabilities are realistic.

            And I keep meaning to write something about most existing mutation research being a complete waste of time (of course citing your PhD thesis to back up my claims). People need to research how to generate ‘bigger’ mutations, or move onto something else.

        1. 3

          Now if only there was a matching dpkg for each one…..

          Sadly, fuzzing is mostly an academic paper mill rather than software mill.

          Out of the box on ubuntu you only get

          afl/bionic,now 2.52b-2 amd64 [installed] instrumentation-driven fuzzer for binary formats

          afl-cov/bionic,bionic 0.6.1-2 all code coverage for afl (American Fuzzy Lop)

          fusil/bionic,bionic 1.5-1 all Fuzzing program to test applications

          libfuzzer-9-dev/bionic-updates,bionic-security 1:9-2~ubuntu18.04.2 amd64 [installed] Library for coverage-guided fuzz testing

          wfuzz/bionic,bionic 2.2.9-1 all Web application bruteforcer

          zzuf/bionic,now 0.15-1 amd64 [installed] transparent application fuzzer

          1. 2

            Interesting that https://gitlab.com/akihe/radamsa is not packaged in apt. IIRC e.g. homebrew has it. Well, it is best to install from source in any case – usually you want the latest and best when fuzzing.

            1. 1


              Nice idea…..

              Very light on dependencies… trivial to build and install.

              Comes along with its own Scheme interpreter and a bunch of scheme programs by the look of it!

            2. 2

              We have enough fuzzers already, we want people to run them and find interesting stuff.

              Anything slightly useful eventually becomes an academic paper mill. It’s the nature of the system. Research is like VC investments, most die and a few take off spectacularly.

              Anyway back to fuzzing.

              Here is a discussion of one of the listed papers, which I thought was excellent work: http://shape-of-code.coding-guidelines.com/2020/01/27/how-useful-are-automatically-generated-compiler-tests/

              1. 2

                Yes, I read your blog post when it popped up on my feed.

                Very interesting indeed.

                It reminds me of a moment of “Too Much Truth in Advertising”….

                One of the big static analysis firms used to have an page of recommendations from happy customers.

                One customer, a Big household name, said something like, “We used X to find ten’s of thousands of real bugs in code that we have been shipping to our customers for more than a decade!”

                Which immediately told me most bugs are never found in testing, and if they are, they probably aren’t triggered, and if they are, it probably doesn’t matter…..

                Which also says, by far, most software is in the grade of serving up cat pictures; if it fails to serve up one to one person… who cares? I.e., most software isn’t doing stuff that really matters.

                Which also says, in the fields where it really really really does matter (Avionics / Self driving cars / ….) by far most practical experience of software engineering isn’t really relevant.

                Except as a warning that “Here be Dragons! Do you really want to trust this stuff?”

                And also, don’t use C. I’m not sure what The One True language is…. but I bet it is one that makes automated whole static analysis a lot easier than C does.

                All this said, to me, defects really do matter, even if you’re only serving cat pictures….


                Because testing and debugging a change built on top of a pile of flakiness is much much much harder than testing and debugging one built on a rock solid foundation.

                Because as our systems get bigger and bigger, built on more and more layers, the probability of one of the tens of thousands of very low probability bugs biting us tends to one.

                As usual, MonkeyUser puts it succinctly… https://www.monkeyuser.com/assets/images/2019/139-mvp.png

                Which brings me back to fuzzing, I’m using fuzzing and watching the field because of one simple habit.

                When I start working on an area of code…. I stress the hell out of it, and make it rock solid.

                Then I start with any enhancements……

                Then I stress the hell out of my work.

                1. 2

                  We have enough fuzzers already, we want people to run them and find interesting stuff.

                  I would say, not really. In the hierarchy of fuzzers, we are struggling to reach level 3 or beyond; that is, we can generate syntactically valid programs if we have the grammar, but anything beyond is really hard. We are still making progress, but we are nowhere near fuzzing programs that take multi-level inputs (most of the interesting stuff happens beyond the first level parsers).

                  Sadly, fuzzing is mostly an academic paper mill rather than software mill.

                  Unfortunately, I agree with this mostly. Quite a lot of fuzzing papers seem to be making rather limited improvements to the domain, and original ideas are few and far between. I believe that part of the reason is faulty measurement. Given a target such as a benchmark suite of programs, and faults, it is relatively easy to over optimize for them. On the other hand, finding bugs in numerous programs often may only mean that you went looking for them, and may not say anything more about the impact or originality of your approach.

                  1. 1

                    Quite a lot of fuzzing papers seem to be making rather limited improvements to the domain, and original ideas are few and far between.

                    Actually I’d argue most fuzzing papers are tweaks on afl (or, to a lesser extent, libfuzzer), since they are so easily available.

                    If the first task in reproduction is downloading and building /patching/fixing an arcane set of out of date dependencies…. no giants are going to be standing on your shoulders.

              1. 18

                Not this again.

                Different kinds of Arabic read numbers out loud differently: some read the most significant digit first, some the least significant first, and many are middle-endian. It’s pretty hard to draw this conclusion just based on that.

                BUT: Arabic got the numerals from India. Indian languages write down numbers the same way you do in English, most significant digit first in a left to right script. Indian languages typically say numbers as most-to-least significant, with the middle-endian nature of tens and ones digits being swapped (similar to how “thirteen” is “backwards”). So it doesn’t even matter how it’s read in Arabic, because for a long time they were writing numbers the same way as the Indians, and it would be very weird if they used the same numerals but in the opposite order when looked at visually. The numbers got copied twice, from an LTR script to an RTL script back to an LTR script, without changing the order. The RTL nature of Arabic is a red herring here.

                And this doesn’t have much to do with why our computers work a certain way, anyway (see other comments).

                1. 2

                  Different kinds of Arabic read numbers out loud differently, so some read most significant first, and some read least significant first, many are middle-endian. It’s pretty hard to make this conclusion just based on that.

                  What do you mean by different kinds of Arabic? Classical and Modern Arabic both read numbers out loud the same; are you referring to dialectal Arabic? Because most Arabs would not consider their dialect to be legitimate Arabic, since almost all published text is written in Modern Arabic, or still in some cases Classical Arabic.

                  For the record though this post is just wrong, because 521 in both Classical and Modern Arabic is read as five hundred, one and twenty, and the author claims that “formal” Arabic reads it differently. I’m not really aware of any dialects that read that as one, twenty, and five hundred, and at least to me (north African native speaker, but also Arabic language enthusiast) it sounds insanely wrong even for a dialect.

                  BUT: Arabic got the numerals from India.

                  Eastern Arabic Numerals are from India, Western Arabic numerals are believed to have originated in North Africa. European languages use the Western Arabic numerals, and not the eastern Arabic Numerals. The decimal system though, the more interesting part in my opinion, was indeed copied from Indian mathematicians.

                  1. 2

                    Eastern Arabic Numerals are from India, Western Arabic numerals are believed to have originated in North Africa

                    Any sources for that? Most of the sources I have seen seems to trace the origins of both to Brahmi numerals.

                    1. 1

                      Right, dialectal Arabic, but also I knew that MSA doesn’t follow the same rules as what the author talks about and I wasn’t sure what they meant by “formal”, so I was really talking about MSA (I don’t know enough about Classical).

                      Eastern Arabic Numerals are from India, Western Arabic numerals are believed to have originated in North Africa.

                      No, the exact numeral forms were invented in North Africa (with influence from the eastern arabic forms!), but the concept of using numbers like that came from Eastern Arabic numerals, and Western Arabic numerals are a bit of an early divergence from them. They’re still related, see the first paragraph in https://en.wikipedia.org/wiki/Arabic_numerals#Origin_of_the_Arabic_numeral_symbols , in particular mathematicians had already trained themselves on Eastern Arabic numerals but hadn’t necessarily agreed on the forms.

                  1. 2

                    Thank you for the excellent tool. I am also interested in the same space using Python. We found that Z3 is still fairly bad when it comes to string constraints. In particular, even simple things like lower(), upper(), etc. need to be defined by the implementors, and Z3 tends to give up pretty fast with relatively small expressions.

                    What was your experience with string functions?

                    1. 1

                      CrossHair uses the standard string solver in Z3 and doesn’t employ any special Z3 tactics, so I imagine it has the same weaknesses. (to be honest, I’ve only tried very simple test cases so far) I’d love it if you could send me some more details on your challenging use cases to see what happens. Additionally, I think that over the last year or so, the Z3 string solver has been in a state of flux; I suspect you’ll get a variety of behaviors depending on what build you’re using.

                      I’d love to collaborate in any way that makes sense: at a minimum, sharing challenges and solutions would be great. If you’re looking to develop something into a practical tool along these lines, I’d love to collaborate more directly too. The illusion that a tool like this needs to create is really difficult to get right, and too big of a task for one person.

                      1. 1

                        Thank you for the response. I was trying out simple string functions in crosshair (lower, upper, capitalize), and it seems string functions are not supported yet, which from your reply I suppose is not unexpected – they are not directly supported by smt-lib (or Z3) and need to be implemented separately. This is what we did.

                        If you’re looking to develop something into a practical tool along these lines, I’d love to collaborate more directly too. The illusion that a tool like this needs to create is really difficult to get right, and too big of a task for one person.

                        We are working to make the fuzzing book into a usable suite of tools in Python. The idea is to explain in detail what goes into each tool, and the trade-offs thereof in a comprehensive Jupyter notebook – essentially the idea is to get the reader started on a given advanced topic over the course of a weekend (where possible). This means keeping the basic implementation simple, understandable and well explained (which means advanced optimizations that may be needed to make these algorithms scalable and practical are given lower priority if they make the basic idea difficult to understand – this may be counter to your goal). The users can then build their tools on top of this suite of tools by replacing what they need with advanced versions.

                        Cross-fertilization of ideas and code would indeed be great, and I will certainly keep track of where crosshair goes.

                        1. 1

                          Yeah - your prompting about string methods here made me notice a pretty bad regression where those methods were failing. I’ve fixed it in HEAD and will cut a new release soon. That said, even when “fixed,” CrossHair now just falls back to concrete execution - a real fuzz tester would be much better in these cases.

                          Your solution for str.upper/str.lower looks great. I might try to come up with a unicode-aware variant and share it back with you. (though perhaps that’s quite a bit more complex now that I think about it!)

                          Agreed that I think some of our goals are different, as I would like to get to a more practical, usable product. Excited about chatting in the future though!

                    1. 3

                      In my case, I do not have an account on HN, but I do browse it often. When I find a particular topic I like, it is invariably posted here, and I come here for discussions. While the number of experts/well known people here seems to be much smaller than on HN, the community seems more welcoming.

                      1. 4

                        I understand that reproducibility is one of the core features of GNU-Guix. How good is it with respect to Maven?

                        (I investigated Nix sometime back for producing reproducible artifacts for Java (maven) for our research but found that the Maven builds did not result in the same checksums)

                        1. 2

                          I certainly wouldn’t expect Maven builds to result in the same checksums. Either between multiple Maven builds or when comparing to nix builds. Maven makes no attempt to guarantee reproducible builds at all.

                          1. 1

                            Why do you say that? Maven seems to have a page on producing reproducible builds now.

                            1. 1

                              Interesting, that must be new. Historically it wasn’t a thing.

                        1. 5

                          The regex as a value generator, as used in this book, is really quite good. I’ve been dying to see somebody write it up and stick it on CPAN.

                          1. 3

                            Which chapter? any link?

                            1. 6

                              I think parent comment mentions the Infinite Streams chapter of the book: https://hop.perl.plover.com/book/pdf/06InfiniteStreams.pdf (see page 18 in the link).

                              1. 4

                                Section 6.5 on page 272 for those just peeking in a browser. Good stuff! Thank you.

                                1. 2

                                  Thanks! much appreciated.

                            1. 1

                              Recently learned how to hack Python to use any syntax :) ! and not through the usual module loading magic, but on the main file itself. See this post.

                              1. 1

                                Since Prof. Regehr only mentions mutational fuzzers, here is our status on starting from scratch.

                                • Sequence of ASCII characters
                                  • Millers fuzzer
                                • Sequence of words, separators, and white space
                                  • Lexical fuzzing – aka dictionary based fuzzers
                                • Syntactically correct C program
                                  • We can kind of do this – see this for an approach without grammar and this for an approach where the grammar is mined from the program.
                                • Type-correct C program
                                  • Current research
                                • Statically conforming C program
                                  • Targeted by CSmith (and beyond)
                                • Dynamically conforming C program
                                • Model-conforming C program

                                Although McKeeman reports finding bugs at levels 1-5, for the Csmith work we did not bother with any of levels 0-5. It’s not that we didn’t think test cases at these levels would trigger some bugs, but rather that we were mainly interested in sophisticated optimizer bugs that are very hard to spot until level 6. My guess, however, is that a high-quality modern compiler (GCC, Clang, Intel CC, MSVC, etc.) is probably not going to have a lot of bugs that you can find using levels 0-3.

                                Unfortunately, it has been rather hard to convince our reviewers that ignoring parser bugs is reasonable :/.

                                1. 3

                                   I recently learned the hard way that 'string1' is not in ['string1' 'string2'].
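                                   If that snippet is Python, the likely culprit is implicit concatenation of adjacent string literals: a missing comma silently merges two intended elements into one. A minimal illustration:

```python
# Adjacent string literals are concatenated by the Python parser, so a
# missing comma silently turns two intended list elements into one.
items = ['string1' 'string2']
print(items)                   # ['string1string2']
print('string1' in items)      # False
```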

                                  1. 1

                                    aaargh, this has hit me more times than I care to remember.

                                  1. 6

                                    My experience is only with Jupyter notebooks, so I do not know if it is similar in other notebooks, but beyond what the authors noted, here are my pain points.

                                    • Debuggability – there is almost no way out other than print statements here. The pixie debugger is really hard to use (hard enough that I went back to print statements).
                                    • No way to decouple state from interface. Using a CLI, I can simply use tmux to start a session in the server and forget about it. For jupyter, if you start an experiment with a front end in one machine, you are stuck until the experiment finishes.
                                    • Encourages the entire code to be put in a single notebook. While hacky solutions to modularization exist, none are popular, and bug free.
                                    • Sharing the current session. While I can write a paper with a collaborator on overleaf collaborating real time, there is no way to share the current session of a jupyter with any one else.
                                    1. 1

                                      Debuggability – there is almost no way out other than print statements here.

                                      I’m not a notebook user, but I’ve noticed VS Code and PyCharm mention notebook support in their release notes. Do they add debugging support? I’d have thought that would be a place they could add value.

                                      1. 2

                                        VSCode was not very impressive when I tried it last (~6 months ago) and was very hard to set up for our environment. Further, we use a few plugins such as TOC heavily which are not supported in these IDEs.

                                    1. 3

                                      Interesting. Given the authors disassemble the binary first, perhaps an existing approach like Mull which mutates the LLVM IR which can be recovered from a binary could have been used for comparison.

                                      1. 3

                                        I have a very similar story to share. In the context of testing, abstraction does have a cost. In our case, the lesson was that a language that allowed only a controlled amount of abstraction was better than a general-purpose language.

                                        1. 2

                                          Very interesting. Most of the time, people talk about using non-Turing-complete languages to provide restrictions that facilitate machine analysis. Your observation about restricting people’s habits might be even more important. It’s similar to why we include strong type systems in some languages. Just applying that to the size and expressiveness of the language in this case.

                                        1. 3

                                          This is a really great project! Is ffwff on lobsters? If so, any chance you can explain your future goals? Where do you plan to take lilith?

                                          1. 3


                                            I don’t really have a specific goal in mind, but my plan is to make an OS that’s usable (at least for me!) with some experimental unix redesigns. I don’t plan on being the next plan9 but at least I can be plan8? :^)

                                            1. 1

                                              That is a great goal :) I will certainly watch lilith to see where you take it.

                                          1. 3

                                            Genuine question: what actual value are these conferences providing to attendees vs, say, the speaker just making a video and hosting it on the web?

                                            I’ve been to packaging machinery conferences, Big Data conferences, general tech conferences, and all of them seem to serve a purely commercial purpose: I get tchotchkes, my badge gets scanned a bajillion times, and then I get a looooot of spam. I don’t even mean targeted emails! It’s stuff that’s so mediocre that my various email accounts have already flagged it into my junk folder.

                                            Lately I’ve been saying “Naw, thanks. Save the train/airfare and let me spend a few days reading or watching training stuff at home.” Am I nuts?

                                            1. 9

                                              From personal experience, I would say the single biggest value is in networking. If you attend a talk, you have something to talk about with a person who is in forefront of their research, in a topic that the person is clearly interested in at the moment. If you have something interesting to say, perhaps that would turn out to be a collaboration at a later point.

                                              1. 1

                                                Are the odds of that better than a tweet or an email? I mean, I’ve attended talks at these conferences and given talks at my company, and one thing I’ve noticed: most people are kind of tired after speaking (or listening!) for 30-90 minutes. I’m mostly a space cadet after those conferences, and I find I don’t really retain well, even if it’s a 1:1 convo. Do you have that same problem? If so, what do you do to help that?

                                                1. 6

                                                  Typically, for academic conferences, one gets to talk for 20 minutes, followed by a few minutes of questions. Anything that requires more in-depth answers gets taken offline, which essentially means in-person chats immediately after the talk. Most people in academic conferences do not listen to talks continuously either. People are usually selective in attending specific talks in a conference, with the in-between time spent talking to people around you who work in similar or interesting fields. The dynamic is rather different from the few industry talks I attended.

                                                  My experience has also been that tweets and emails have very different effect from meeting in person. Specifically, answering emails can be postponed, and tweets can be ignored. But, a person in front of you usually holds your attention if he has something interesting to talk about.

                                                  1. 1

                                                    Interesting… I suppose I haven’t really attended purely academic conferences, so perhaps that’s where I’ve gotten a little jaded. A lot of the feel at the industry stuff I’ve been to seems to encourage “spend 100% of the time attending things, just in case, then pick up a suitcase full of swag for the people back home.”

                                                    1. 6

                                                      For academic conferences, few attend them exclusively to listen. Most people are there to give a talk or are in a group that did research work that will be presented in that conference (because travel costs are beyond your budget as a student, and your advisor is not going to give you money unless you are going to present your research). So, the emphasis is more on giving the presentation and making connections rather than simply listening to talks.

                                            1. 16

                                              Using > instead of \t is horrible imho - but the rest was pretty interesting.

                                              1. 2

                                                I think \t would be better replaced by ; than by ‘>’. It would let you join two consecutive lines together (vi J) or split a long line with multiple commands with little effort.

                                              1. 2

                                                This is pretty good. I did not know about .RECIPEPREFIX. Thanks for introducing me to that.

                                                I have my own set of best practices here; it is not a blog post yet, but more for my own recall.

                                                1. 3

                                                  At first sight, I thought this was something similar to swarm testing, which is a very effective testing technique.

                                                  1. 2

                                                    Plan9 noob here: I switched to Acme about a week ago (not 9front, just Acme), and my scroll wheel broke like 2 days later. Too much middle clicking! I think this is the 2nd mouse Acme has killed.

                                                    First, I tried AcmeSAC.app, which is ok (where are my files?), but cd /usr/local/plan9; ./INSTALL works great on OS X (easy to find files, fonts work..).

                                                    1. 1

                                                      Let me say I am in the same boat with AcmeSAC.app! I loved it, and it was my introduction to Inferno and Plan 9. Unfortunately, it is not maintained any more, and I have no idea how to update it to the latest Inferno sources. I couldn’t even get it to compile.

                                                      While using it on Mac, I found that the following worked if you had a trackpad: (copy = b1+b2 and paste = b2+b3 in ACME)

                                                      • Option + click is middle click (b2)
                                                      • Command + click is right click (b3)

                                                      For cut, select text with trackpad, then click option without releasing the click.

                                                      For paste, click or select text with track pad, then click command without releasing the click

                                                      For copy, select the text, then click and release option first, then click and release command, which copies the text.

                                                      1. 2

                                                        On newer versions of plan9port, a 3-finger tap will simulate button 2, and a 4-finger tap will do the 2-1 chord for a given text selection to send it to another command. Command-x/c/v all work for cut/copy/paste as well.