1. 2

    Nice article – I liked the recap of XML and XHTML, and I agree that blockchain is useful and interesting from a CS perspective, but a bad fit for most applications.

    He points out that there are cheaper ways to reach consensus for many applications, and you don’t necessarily want to publish every transaction to the entire world.

    James Mickens is saying some of the same things here:


    Despite the meme-y title, it’s somewhat constructive: can we design a blockchain that fits more applications? I think there is a continuum of how “adversarial” transactions are in various applications, and assuming the worst is a tradeoff that has consequences (i.e. it makes usability worse).

    1. 3

      Would zig compile time evaluation be powerful enough for something like string -> PEG parser as a library?

      1. 1

        The only potential roadblocks I foresee for this use case are:

        • zig compile time code execution is much slower than it should be. It ought to be able to roughly match CPython’s performance, but it’s currently much slower and doesn’t free memory. Ironically, we actually need to add a garbage collector to compile time code execution.
        • zig compiler doesn’t yet have sophisticated caching that would make it practical to have a really complicated compile time implementation. So you’d wait for your thing to run with every build.

        Both are planned to be fixed; it’s just a matter of time.

        1. 1

          That’s interesting, so you have a full Zig interpreter that runs at compile-time?

          But won’t collecting garbage make it slower? Are the compile-time programs allocating so much that they need to free memory?

          I’m curious if any other languages have run into this problem.

          1. 2

            so you have a full Zig interpreter that runs at compile-time?

            It’s a little more complicated than that. Zig AST compiles into Zig IR. Each instruction’s value is either compile-time known or not. Most instructions which have all compile-time known operands produce a compile-time known result. There are some exceptions - for example, external function calls always produce a runtime result.

            For if statements and switch statements whose condition/target value is compile-time known, the branch is chosen at compile time. This means that zig has an “implicit static if”. E.g. if you write if (false) foo(); then foo() is not even analyzed, let alone included in code generation.

            In addition, there is the comptime expression: https://ziglang.org/documentation/master/#Compile-Time-Expressions This causes all the branches and function calls - including loops - to be compile-time evaluated.

            But, importantly, you can mix compile-time and run-time code. Variables can be marked comptime which means that loads and stores are always done at compile time.

            For loops and while loops can be marked inline which unrolls the loops and makes the iteration variables known at compile-time. You can see this in action for the printf implementation: https://ziglang.org/documentation/master/#Case-Study-printf-in-Zig

            But won’t collecting garbage make it slower?

            I can’t answer this in a clear way yet, as I haven’t tried to solve it. The basic problem is the same as in e.g. Python: you could potentially have 2 compile-time values with references to each other, but not referenced from any root that is actually going into the executable, so they should not be in the binary.

            In Debug builds zig has a goal of compiling fast, and is willing to create a more bloated binary with worse runtime performance. In ReleaseFast builds, zig can take a few orders of magnitude longer to compile, but the performance should be optimal and bloat should be minimal. So it might be a thing where Zig does not garbage collect comptime values for Debug builds unless it starts to use too much memory, but it would certainly take the time to do this for ReleaseFast builds.

            Are the compile-time programs allocating so much that they need to free memory?

            I don’t personally have any use cases where that is true, but in general, I could create a program that allocates an arbitrarily large amount of memory at compile time in order to do a computation, where that value is not ultimately used in the binary, yet the allocated values have references to each other, and so they would fool a reference counter.

          2. 1

            Ironically we actually need to add a garbage collector in compile time code execution.

            Why? It seems like if you allocate and free as you would in normal Zig, this wouldn’t be a requirement.

            1. 1

              That’s really cool. Things like regexes or SQL statements could be pre-prepared at compile time with features like this.

          1. 1

            I use bash on NixOS with whatever completion it provides by default (program names in PATH and filenames in current directory). I also use most shells from within Emacs shell-mode, which I think does its own completion.

            I get incredibly frustrated by certain completion systems (I think Debian’s default is one of them), which will sometimes refuse to tab-complete filenames unless they contain certain patterns. For example, mplayer or vlc commands will only tab-complete filenames if they contain strings like .mp4 or .avi. I find such behaviour surprising and obnoxious, so I would highly recommend not doing that in other tab-completion systems :)
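            For context on how that kind of restriction gets wired up (and how to undo it), here’s a sketch using bash’s own builtins; the filenames and the vlc example are just illustrative:

            ```shell
            #!/usr/bin/env bash
            # Roughly how extension-restricted completion works in bash:
            # -f completes filenames; -X '!pattern' filters OUT names that
            # do NOT match the pattern.
            complete -f -X '!*.mp4' vlc

            # compgen applies the same filter non-interactively, so you can test it:
            cd "$(mktemp -d)"
            touch movie.mp4 notes.txt
            compgen -f -X '!*.mp4' -- ''   # prints movie.mp4, but not notes.txt

            # And how to remove the restriction again:
            complete -r vlc
            ```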

            1. 1

              Yeah, I get frustrated with them too. It’s not so much that they only complete .mp4 by default, but that if you want to change it, you are confronted with a nightmare of inscrutable and untestable code!

            1. 1

              You should have a look at how git autocompletion is generated. From what I understand, it either was recently changed, or soon will be, to generate the bash completion from the command-line argument parsing code rather than being written from scratch.

              1. 1

                That sounds interesting – do you have a pointer? I have looked at the git completion, but it doesn’t look autogenerated:



                1. 1

                  I think the googleable terms are ‘compgen’ and/or ‘gitcomp’.

                  1. 1

                    Hm neither of those turned up anything, and the combination didn’t either.

              1. 3

                This is the most pleasant thing I’ve used, by far: https://github.com/mbrubeck/compleat

                It’s not entirely flexible enough, but it is far more pleasant than anything else I’ve dealt with.

                1. 2

                  Thanks, this looks really interesting! But does it have a large body of existing completions? Are you using it, or is it something you’ve written a completion for?

                  It looks more principled, but yeah I think the DSL approach falls down when you need to handle special cases, and there are a lot of them in shell.

                  Example corner case: after -c in Python and shell, the remaining things are arguments, not flags. You don’t need to use -- to end the flags in that case.
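                  That -c behavior is easy to check; the argument strings here are arbitrary:

                  ```shell
                  # Everything after the -c script goes to sys.argv, flags included;
                  # no `--` separator is needed to stop flag parsing:
                  python3 -c 'import sys; print(sys.argv[1:])' --verbose foo
                  # prints: ['--verbose', 'foo']
                  ```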

                  I would like something more principled than bash or zsh for sure. But I also don’t want to rewrite completions for the world, or even start an entirely new body of completions. I’m open to more info on this though.

                  1. 2

                    So I only know of a few pieces of software that actually ship with completions written for it (e.g., polyglot), but I do use it actively and write my own completions. It’s my go-to tool whenever something lacks completions and it starts to bug me.

                    While it can be somewhat limiting, it does allow you to do some reasonably interesting things, since you can shell out. The only issue I’ve run into that I couldn’t work around was trying to write completions for the heroku command, which uses subcommands of the form heroku access:add, heroku config:get, etc. Something about the notion of a word boundary and the colons did not mix. I eventually abandoned the issue since Heroku started shipping decent shell completion with their CLI.

                    Some examples of the more elaborate ones I’ve written can be found here: https://gist.github.com/cmhamill/a7e39eb576f83292cfb09f2e3d0090ed

                    1. 2

                      Thanks for the examples! Yeah this does look nice.

                      If I were to do something like this, I’d probably want to let you mention an arbitrary function in the DSL, to solve the problem of needing to do “weird” stuff. I imagine it’s something like parser combinators.

                      Also, I don’t think the DSL should hard-code its notion of tokenization, which seems to be why you ran into the : problem.

                1. 5

                  I use bash on Debian, with whatever defaults it has.

                  A crazy idea: when you get down to it, completion is really about defining a grammar for command-line options - “a command can be the token ls, followed by zero or more of -a, -r, …” where some of the grammar productions are defined dynamically (“filename” being the most obvious example). I’d love a completion system where I can dump grammar files in a standard format (PEG, yacc, EBNF, whatever) into a directory, and executables to produce dynamic productions into another directory; I feel like it would be a lot easier to write completion grammars in a declarative grammar syntax than the imperative-grammar-manipulation system that bash and zsh seem to use.
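                  To make that concrete, such a grammar file might look something like this (a hypothetical format, not any existing tool’s syntax):

                  ```text
                  # ls.grammar -- declarative completion for ls (hypothetical format)
                  command = "ls" option* file* ;
                  option  = "-a" | "-l" | "-r" | "--color=" color ;
                  color   = "auto" | "always" | "never" ;
                  file    = <filename> ;   # dynamic production, supplied by an executable
                  ```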

                  1. 2

                    It looks like this tool is grammar-based, or at least it’s a DSL and not imperative: https://github.com/mbrubeck/compleat

                    I definitely think the imperative model is verbose. But I think you have the classic DSL problem in this domain too: need to be able to “escape” out of the DSL for special cases. @cmhamill, who mentioned compleat in this thread, said that it’s not entirely “flexible”, and I presume that’s what he means.

                    1. 1

                      That’s a pretty cool tool, thanks for pointing it out!

                      It looks like the implemented grammar is a lot simpler than, say, yacc or OMeta, though. While DSLs do often need escape hatches, I’m not sure that the limits of this DSL imply that all command-line parsing DSLs are too limited.

                  1. 4

                    I use zsh, which has the most comprehensive completion coverage, especially when you consider things like support for system commands on non-Linux systems. Note that zsh itself includes most of them; the separate zsh-completions project is only a small collection, and of lower quality.

                    Zsh’s is much the superior system but you’d have to emulate a whole lot more to support them. Completion matches can have descriptions which makes it vastly more useful. The process of matching what is on the command-line against the candidates is much more flexible and is not limited to dividing the command-line up by shell arguments - any arbitrary point can be the start and finish point for each completion candidate. And as that implies, what is to the right of the cursor can also be significant.

                    My advice would be to take the zsh compadd builtin approach, which is more flexible and extensible than compgen/complete, do your own implementation of _arguments (which covers 90% of most completions) and similarly your own _files etc. It’d then be straightforward for people to write completions targeting both oilshell and zsh.

                    1. 2

                      Hm interesting, yeah I am looking around the Completion/ dir in the zsh source now and it looks pretty rich and comprehensive.

                      I also just tried out zsh and I didn’t realize it had all the descriptions, which is useful too. Don’t they get out of date though? I guess most commands don’t change that much?

                      I recall skimming through parts of the zsh manual about a year ago, and from what I remember there are 2 different completion systems, and it seemed like there was a “froth” of bugs, or at least special cases.

                      I will take another look, maybe that impression is wrong.

                      I think the better strategy might be to get decent bash-like completion for OSH, and then convince someone to contribute ZSH emulation :)

                      I guess I am mainly interested in the shell system that has the best existing corpus of completion scripts, because I don’t want to boil the ocean and duplicate that logic in yet another system. zsh does seem like a good candidate for that. But I don’t understand yet how it works. Any pointers are appreciated.

                      I’ll look into _arguments… it might cover 90% of cases, but it’s not clear what it would take to run 90% of completions scripts unmodified.

                      1. 3

                        The zsh descriptions do get out of date. The strings are copied by the completion script author, so if the --help text changes, the script will need to be updated too.

                        Zsh’s completion system is vast and old, the best combination. That’s why the 2 engines still exist today, as there are a number of completion scripts that are in the old style. I believe that most of those underneath Completion/ use the newer system.

                        1. 2

                          Of the 2 systems, the old, compctl system was deprecated 20 years ago. Everything under Completion/ uses the new system. I wouldn’t say there’s a “froth” of bugs - it is just that there is a lot to it.

                          It isn’t the descriptions so much as the options themselves that can get out of date. The task of keeping them up-to-date is semi-automated based on sources such as --help output and they are mostly well maintained.

                      1. 12

                        On one hand, I’m sympathetic to the idea of bringing systems programming more in line with “systems thinking” and “systems theory” in other fields. On the other, I think under that definition of “systems programming” we don’t have any truly systems-oriented languages yet.

                        1. 7

                          That’s actually what Oil is supposed to be! (eventually) Maybe this connection isn’t obvious, but one way to think of it is:

                          • A Unix shell is a language for describing processes on a single machine.
                          • A distributed system is a set of Unix processes spread across multiple machines. [1]
                          • So you can imagine extending shell to talk about processes on multiple machines. There are existing languages that do this, although sometimes they are only “config files”. I believe you need the expressiveness of a full language, and an extension of the shell is the natural candidate for that language.

                          I briefly mentioned this 18 months ago, in the parts about Borg/Kubernetes: Project Goals and Related Projects

                          But I haven’t talked about it that much, because I want to keep things concrete, and none of this exists.

                          I also mentioned Kubernetes in this blog post: Why Create a New Unix Shell?

                          The build time and runtime descriptions of distributed systems are pretty disjoint now, but I think they could be moved closer together. Build time is a significant problem, not just a detail.

                          I mentioned a couple books on the philosophy of systems here: Philosophy of Systems

                          I like the OP’s framing of things. I agree that “systems programming” is overloaded and there should be another word for describing the architecture of distributed systems.

                          Although I guess I totally disagree with the conclusion about OCaml and Haskell. I’m pretty sure we are talking about the same thing, but maybe not exactly.

                          I guess he’s defining a “system” with the 5 qualities, which I agree with, but I am picking out “distributed systems” as an important subset of systems that have those 5 qualities.

                          My basic thesis is that Shell should be the language for describing the architecture of distributed systems. The architecture is essentially a set of processes and ports, and how they are wired together. And how they can be applied to a particular hardware/cluster configuration.

                          Right now those two things are heavily entangled. We’re basically still in the era where you have to modify your (distributed) program to run it on a different computer (cluster).

                          Concretely, I think a cleaner shell mostly needs Ruby-like blocks; with those it could express a lot of things, and do stuff like this:
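                          Something along these lines, for instance (purely hypothetical syntax and field names; none of this exists yet):

                          ```text
                          # Hypothetical: Ruby-like blocks describing a replicated service
                          job web-frontend {
                            replicas = 20
                            task {
                              binary = '/bin/webserver'
                              port   = 8080
                              ram    = '2 GiB'
                            }
                          }
                          ```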


                          … which looks pretty similar to Google’s (internal) Borg configuration language. I described that language to someone as roughly “JSON with map/filter/cond/inheritance” :) It evaluates to protocol buffers that get sent to the Borg master, and then flags get sent to the Borg “slaves” on each machine.

                          Kubernetes has almost exactly the same architecture as far as I can tell, but everybody seems to use Go templates to generate YAML. That is not a good description language! :-( Things seem to have gone backward in this respect when the tech made its way out of Google (where I used all this cluster / big data stuff for many years).

                          Also, I’m not the only one who thinks this. I link to Why Next Generation Shell? in my FAQ. He uses the term “systems engineer” too, which is also overloaded.

                          [1] And they really are Unix processes; it’s hard to think of any significant non-Unix distributed systems. I guess there are still many mainframes in similar roles, but I imagine the number of nodes is fairly small compared to Unix-based systems.

                          1. 4

                            Feature request for Oil: currying of commands…

                            let targz = tar args… | gzip

                            I think a proper ‘systems language’ would let composing OS processes be trivial while looking a bit like OCaml or Reason ML. If I designed something, I would consider having a clean distinction between functions and processes. Then let processes be first-class things like functions, which can be curried, passed as arguments, and generated on the fly, just like closures in functional languages.

                            The difference between ‘proc’ and ‘func’ would be that procs can interop with OS processes and can die or be cancelled.

                            and a static type system…

                            1. 3

                              “I think a proper ‘systems language’ would let composing OS processes be trivial while looking a bit like ocaml or reason ml.”

                              I’ve said this, too, but in the context of getting rid of pipes. I thought modularity and composition were good, but defaulting to processes and pipes wasn’t. At the least, we should have a choice. It seems we can specify what’s supposed to happen at a high level, with the implementation (e.g. pipes, function calls) being generated later. The developer enters the criteria for that. So, we start getting the benefits of high-level languages, plus we can get close to specific functionality like how UNIX works. It might avoid maintenance and security issues, too.

                              1. 3

                                Oil will have proc and func, with exactly those keywords:
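                                A rough sketch of the intended distinction (illustrative syntax only, not necessarily what Oil will actually ship):

                                ```text
                                # proc: takes argv, returns an exit code, can sit in a pipeline
                                proc targz {
                                  tar "$@" | gzip
                                }

                                # func: a Python/JavaScript-style function over values
                                func add(x, y) {
                                  return x + y
                                }
                                ```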


                                proc is identical to current shell functions, which take argv and return an exit code, and can also be transparently put inside a pipeline or run in a subshell/command sub. It’s a cross between a procedure and a process.

                                func is basically like Python or JavaScript functions.

                                As for currying, I’d have to see some examples. I don’t see why a normal function syntax doesn’t solve the problem.

                                By the way, a complex service at Google can have over 10K lines of the Borg config language, briefly mentioned here:


                                It is actually a functional language – it’s sort of like JSON with map/filter/cond/lambda/inheritance. And practically speaking, the functional style doesn’t really add anything. Most people seem to complain about masses of highly nested curly braces and awkward lambdas of map and filter. Plenty of people at Google have background in functional languages and they don’t even like it as a “config” syntax.

                                Some teams went as far as to develop an alternate Python-derived language for describing service configs instead. A coworker actually made a statically typed variant, which was abandoned.

                                I basically think there is a confusion between functional-in-the-small and functional-in-the-large. I don’t care about functional-in-the-small – map/filter/cond etc. can be written in an imperative style. However functional-in-the-large is very important in distributed systems. It lets you compose processes and reason about them.

                                And you can write pure functions in an imperative style! In fact that is how most of my programs including Oil are written. They use zero mutable globals, and pure dependency injection of I/O and state, which is essentially equivalent to functional programming.

                                More concretely, I want Oil to be familiar to both existing shell users, and users of common languages like Python, JavaScript, Go, etc. I’m trying not to invent any new syntax – it should all be borrowed from a popular language. I think Reason ML is great and they make some good critiques of the inconsistency of OCaml syntax, and they bring it closer to JavaScript. So Oil might look more like Reason ML than OCaml.

                                1. 1

                                  The main advantage of functional vs imperative for me is that the former is easier to get correct the first time or during refactorings. This article has a good explanation of why functional programs are easier to verify than imperative ones.

                                2. 2

                                  Whilst it’s a nice idea, for that example I would use a bash function:

                                  function targz {
                                    tar "$@" | gzip
                                  }
                                  Also, I’ve become less fond of command line arguments over time, and tend to prefer env vars instead for key/value things. That way we don’t have to care about their order, we don’t need to match up keys with values ourselves, they’re automatically propagated through wrapper scripts, etc. I tend to only use command lines for inherently sequential things, like a list of filenames to act on.
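                                  A small illustration of the env-var style (deploy, HOST and PORT are made-up names):

                                  ```shell
                                  #!/usr/bin/env bash
                                  # Key/value settings as env vars: order doesn't matter, there's no
                                  # key/value pairing to parse, and they propagate through wrapper
                                  # scripts for free.
                                  deploy() {
                                    echo "deploying to ${HOST:-localhost}:${PORT:-80}, files: $*"
                                  }

                                  # Prefix assignments scope the vars to this one invocation:
                                  PORT=8080 HOST=example.com deploy a.txt b.txt
                                  # prints: deploying to example.com:8080, files: a.txt b.txt
                                  ```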

                                  I’m not sure if ‘currying environments’ makes sense, although something like Racket’s parameterize for env vars in the shell would be nice. I’ve actually written some Racket machinery which lets me set env vars this way when invoking subprocesses :)

                                  1. 1

                                    I am interested in another example of use case for this.

                                    1. 3

                                      Try writing a Go program that does a lot of shelling out to commands like ssh, tar, gzip, gsutil or awscli, then try the same in bash. In bash you will have a crappy programming experience ‘in the large’; with Go you will have an overly verbose mess ‘in the small’.

                                      1. 2

                                        Yeah writing shell scripts in Go seems to be more and more common these days, because Go is a common “cloud service” language, and a lot of devs don’t know shell, or (understandably) want to avoid it.

                                        My friend sent me a shell script rewritten in Go. I get why. It works, but it’s clunky.

                                        Here’s another example:


                                        Here’s a post about rewriting shell in Python that I link from my FAQ. IMO it inadvertently proves the opposite point: Python is clunky for this use case!


                                  2. 1

                                    Personally, I’m keeping my fingers crossed hard for Luna language to (eventually) become the language of distributed computing… and of scaling the ladder of abstraction in both ways…

                                  3. 4

                                    Agreed. But don’t expect the languages to come before the thinking and theories!

                                    1. 2

                                        VDM with a code generator? ;) Also, ASMs have been used to model everything. One, Asmeta, is a programming language, too. So, it seems doable. I’m not going to say pragmatic, though.

                                    1. 11

                                      Oil is compiled with a bytecode compiler (and front end) written in Python. It’s the only program that uses this specific compiler. (I didn’t write it, but I cobbled it together and heavily refactored it, with plans to add many more features.)

                                      I’ve written at length about it [1] – in short the plan is to evolve Oil into a more efficient program without doing a big-bang rewrite in C or C++. I have to implement at least 2 programming languages for the Oil project – OSH and Oil, so using a high-level language helps.

                                      I think of Oil as being “metaprogrammed”, not programmed. There are custom compilers for DSLs like re2c and ASDL, and also some small Python code generators for things like enums, online documentation, etc.

                                      The latest release only has 9K significant lines of code in the core! [2] Compared to 110K significant lines of C code for bash (as measured by cloc, bash 4.4).

                                        The way I’ve been explaining things to people is with an analogy to TeX. TeX is actually written in an abstract subset of Pascal! The versions on your machine are not compiled with a Pascal compiler; they are translated to C and then compiled with a C compiler!


                                      [1] Building Oil with the OPy Bytecode Compiler

                                      [2] http://www.oilshell.org/release/0.6.pre3/metrics.wwz/line-counts/oil-osh-cloc.txt

                                      1. 3

                                        TeX is actually written in an abstract subset of Pascal! The versions on your machine are not compiled using a Pascal compiler. They are translated to C and then compiled with a C compiler!

                                        Oh, wow, thanks for writing this! I knew about WEB, but never looked at it, or where it came from. Definitely learned something today! Thanks!

                                      1. 7

                                        Hm the handle thing reminds me of set -o errexit and trap ERR in shell, but it generalizes to bigger programs. That is, it’s not just a global handler; it’s scoped.
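                                          For readers who haven’t used them, the shell features in question give you a single global handler rather than a scoped one; a minimal sketch:

                                          ```shell
                                          #!/usr/bin/env bash
                                          # errexit + an ERR trap: one handler for the whole script --
                                          # global, not lexically scoped. Run the demo in a child bash
                                          # so we can show what happens when a command fails:
                                          bash -c '
                                            set -o errexit
                                            trap "echo trapped-an-error >&2" ERR
                                            false                 # fires the ERR trap, then errexit aborts
                                            echo "never reached"
                                          ' || echo "child exited with status $?"
                                          ```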

                                        Interesting idea I hadn’t seen before. I might steal that for Oil! :)

                                        1. 8

                                          As someone who is a total stranger to Elm, its dev and its community, but was interested for a long time in learning this language, I wonder if this opinion reflects the feeling of the “great number” or not.

                                          1. 21

                                            I have to say that I personally can very much see where he’s coming from. GitHub contributions are dealt with in a very frustrating way (IMO they’d do better not allowing issues and PRs at all). There’s a bit of a religious vibe to the community; the inner circle knows what’s good for you.

                                              That said, they may very well be successful with their approach by a number of metrics. Does it hurt to lose a few technically minded independent thinkers if the language becomes more accessible to beginners?

                                            Where I see the largest dissonance is in how Elm is marketed: If the language is sold as competitive to established frameworks, you’re asking people to invest in this technology. Then turning around and saying your native modules are gone and you shouldn’t complain because no one said the language was ready feels a bit wrong.

                                            1. 7

                                              Yeah when I look at the home page, it does seem like it is over-marketed: http://elm-lang.org/

                                              At the very least, the FAQ should probably contain a disclaimer about breaking changes: http://faq.elm-community.org/

                                              Ctrl-F “compatibility” doesn’t find anything.

                                                It’s perhaps true that pre-1.0 software is free to break, but it seems like there is a huge misunderstanding in the community about compatibility. The version number doesn’t really mean much in my book – it’s more a matter of how many people actually rely on the software for production use, and how difficult their upgrade path is. (Python 3 flouted this, but it got by.)

                                              I think a lot of the conflict could be solved by making fewer promises and providing some straightforward, factual documentation with disclaimers.

                                                I watched the “What is Success?” talk a couple nights ago and it seemed like there is a lot of unnecessary conflict and pain in this project. It sounds like there is a lot to learn from Elm though – I have done some stuff with MUV and I like it a lot. (The types and purity probably help, but you can do this in any language.)

                                              1. 4

                                                I watched the “What is Success?” talk a couple nights ago and it seemed like there is a lot of unnecessary conflict and pain in this project

                                                  I watched the talk also, after another… Lobster(?)… posted it in another thread. My biggest takeaway was that Evan really doesn’t want to deal with an online community. People at IRL meetups, yes. Students in college, yes. People/companies online trying to use the language? No. His leading example of online criticism he doesn’t want to deal with was literally “Elm is wrong” (which he quoted without any context, which isn’t that helpful. But maybe that was all of it.)

                                                  That’s fine. He’s the inventor of the language, and the lead engineer. He probably does have better things to do. But as an outsider it seems to me that someone has to engage more productively with the wider community. Or, just come out and say you don’t care what they think, you’ll get what you’re given, and you can use it if you choose. But either way, communicate more clearly what’s going on, and what to expect.

                                            2. 13

                                              I’ve shipped multiple production applications in Elm and attempted to engage with the community and I can say that their characterization perfectly matches mine.

                                              The removal of native modules in particular means I won’t use Elm in the future. I was always OK with dealing with any breakage a native module might cause every release, and I’m even OK with not allowing them to be published for external consumption, but disallowing them completely is unreasonable. I’m sure a number of people feel the same way I do, but it feels impossible to provide meaningful feedback.

                                              1. 9

                                                I work for a company that began using Elm for all new projects about a year and a half ago. That stopped recently. There are several reasons that people stopped using Elm. Some simply don’t like the language. And others, like the author of this post, want to like the language but are put off by the culture. That includes me. This article closely resembles several conversations I’ve had at work in the past year.

                                              1. 3

                                                Trivia related to computer graphics + early Linux: Bruce Perens was the leader of Debian for a while, and also worked at Pixar for 12 years.

                                                I assume that Pixar was an early adopter of Linux, because otherwise they would have had to pay commercial OS licensing fees for the hundreds / thousands of machines they used to render movies.

                                                Although I read Ed Catmull’s recent book and I don’t think he mentioned Linux? That book did mention the NYIT graphics lab.



                                                1. 2

                                                  I vaguely recall some news around 2003 (I think it was) about Pixar switching from Sun to Intel hardware, and porting renderman.

                                                  1. 1

                                                    Sun? I know they bought a ton of SGI Octanes for Toy Story.

                                                    1. 3

                                                      found this: https://www.cnet.com/news/pixar-switches-from-sun-to-intel/
                                                      May have been what I was recalling.

                                                      Maybe they used SGI before that?

                                                      1. 1

                                                        Cool! I didn’t know that.

                                                        You are probably right – buying SGI in the 2000s likely wasn’t a smart move ;)

                                                      2. 2

                                                        This story said they used SGI for desktops and Suns for rendering.

                                                        Also for @trousers.

                                                        1. 2

                                                          This story said they used SGI for desktops and Suns for rendering.

                                                          Also for @trousers.

                                                          They used Suns for trousers? Sparc64 pants? A novel usecase for sure. ;)

                                                          I kid, I kid. Thanks for the link. :)

                                                          1. 3

                                                            They were rendering them in the movie. Had to get accurate lighting, ruffling, and so on. Geek producing it spent so much on the hardware they couldn’t afford all the actors. Show got cancelled.

                                                            Many investors now suspect the Trouser Tour documentary was a ruse devised so the producer could play with a bunch of SGI and Sun boxes. Stay tuned for updates.

                                                  1. 1

                                                    Hm, I used to convert web logs to JSON records – one per line – and then use grep as a pre-filter! It can cheaply discard 90% or 99% of the lines, and then you parse the JSON on what’s left to apply the exact filter.

                                                    grep is amazingly fast! This seems like the same idea taken a little further. I’ll have to look at how they do it in more detail.
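                                                    The two-pass idea looks roughly like this in Python (a sketch with made-up log records and field names; the walrus operator needs Python 3.8+):

                                                    ```python
                                                    import json
                                                    import subprocess

                                                    # Hypothetical log: one JSON record per line.
                                                    with open("access.json", "w") as f:
                                                        f.write('{"status": 200, "path": "/"}\n')
                                                        f.write('{"status": 500, "path": "/api"}\n')
                                                        f.write('{"status": 404, "path": "/x 500 y"}\n')

                                                    # Pass 1: grep as a cheap pre-filter. It keeps any line containing
                                                    # the bytes "500", including false positives like the 404 line.
                                                    proc = subprocess.run(
                                                        ["grep", "-F", "500", "access.json"],
                                                        capture_output=True, text=True,
                                                    )

                                                    # Pass 2: parse only the surviving lines and apply the exact filter.
                                                    errors = [
                                                        rec for line in proc.stdout.splitlines()
                                                        if (rec := json.loads(line))["status"] == 500
                                                    ]
                                                    print(errors)
                                                    ```

                                                    grep keeps the 404 line too, because the raw bytes “500” appear in its path; that’s exactly why the second, exact pass over the parsed JSON is still needed.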

                                                    1. 2

                                                      Section 7.2 of the paper actually uses grep/ripgrep as a basis of comparison. It seems the two have the same or better performance than Sparser, though Sparser still wins out by a small margin for the most selective queries.

                                                      1. 2

                                                        Yes. Always use grep first, even if one awk would do. That specialized, single-purpose tool really cuts the time down, especially when you want to work on fewer than 100 million lines out of a billion.

                                                      1. 2

                                                        I scanned over this, and OSH [1] should support basically everything here. If it doesn’t, feel free to file a bug!


                                                        Hm now that I look more at the repo, it looks like this would be a good test: https://github.com/dylanaraps/pure-bash-bible/blob/master/test.sh

                                                        [1] http://www.oilshell.org/blog/2018/07/23.html

                                                        1. 8

                                                          About analytics: You can do them on the server side by parsing your web logs! That used to be how everyone did it! Google Analytics popularized client side analytics using JavaScript around 2006 or so.

                                                          Unfortunately I feel like a lot of the open source web analytics packages have atrophied from disuse. But I wrote some Python and R scripts to parse access.log and it works pretty well for my purposes.
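                                                          For the curious, a minimal sketch of that kind of script (made-up log lines; real access.log parsing needs more care than one regex):

                                                          ```python
                                                          import re
                                                          from collections import Counter

                                                          # Two sample requests in the common/combined log format (made-up data).
                                                          SAMPLE = (
                                                              '1.2.3.4 - - [10/Oct/2018:13:55:36 +0000] "GET /blog/ HTTP/1.1" 200 512\n'
                                                              '5.6.7.8 - - [10/Oct/2018:13:55:37 +0000] "GET /blog/ HTTP/1.1" 200 512\n'
                                                              '5.6.7.8 - - [10/Oct/2018:13:55:38 +0000] "GET /missing HTTP/1.1" 404 0\n'
                                                          )
                                                          with open("access.log", "w") as f:
                                                              f.write(SAMPLE)

                                                          # Tally successful GETs per path: host, ident, user, [timestamp],
                                                          # "method path proto", status.
                                                          LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3})')

                                                          hits = Counter()
                                                          with open("access.log") as f:
                                                              for line in f:
                                                                  m = LINE.match(line)
                                                                  if m and m.group(2) == "GET" and m.group(4) == "200":
                                                                      hits[m.group(3)] += 1

                                                          for path, n in hits.most_common(10):
                                                              print(n, path)
                                                          ```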

                                                          http://www.oilshell.org/ is basically what this article recommends, although I’m using both client-side and server-side analytics. I can probably get rid of the client-side stuff.

                                                          related: http://bettermotherfuckingwebsite.com/ (I am a fan of narrow columns for readability)

                                                          1. 4

                                                            I agree. I used to use JAWStats, a PHP web app that parsed and displayed the AWStats-generated data files to provide visually appealing statistics, a lot like Google Analytics but entirely server side, with data originating from apache/nginx log files.

                                                            It’s a shame that it was last worked on in 2009. There was a fork called MAWStats but that hasn’t been updated in four years either :(

                                                            For a while I self-hosted my feed reader and web analytics via paid-for apps, Mint and Fever by Shaun Inman, but those were abandoned in 2006. It seems like all good software ends up dead sooner or later.

                                                            1. 3

                                                              Maybe the GDPR will give these projects a new lease on life.

                                                              They are much better for privacy-aware people.

                                                              1. 2

                                                                It’s been on my list of projects to attempt for a while, but my static site generator Tapestry takes up most of my spare time.

                                                            2. 4

                                                              You want GoAccess. Maintained, and looks modern. Example. I’m using it and it has replaced AWStats for me completely.

                                                              1. 2

                                                                I currently use GoAccess myself, the only thing that would make the HTML reports better is seeing a calendar with visit counters against days.

                                                            1. 8

                                                              I saw SAT solvers as academically interesting but didn’t think that they have many practical uses outside of other academic applications. … I have to say that modern SAT solvers are fast, neat and criminally underused by the industry.

                                                              Echoing a good comment on reddit: The author didn’t list any practical applications!!! How can you then say they are criminally underused?

                                                              The only one I know of is writing a versioned dependency solver for a package manager (in the style of Debian’s apt). However, very few people need to write such code.
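                                                              To make the dependency-solver use case concrete, here is a toy sketch: package constraints encoded as CNF clauses, with a brute-force search standing in for a real SAT engine (package names are hypothetical; real solvers use CDCL to scale to millions of clauses):

                                                              ```python
                                                              from itertools import product

                                                              # Each package version is a boolean variable: installed or not.
                                                              # Clauses use "name" for true and "-name" for false:
                                                              #   - we want app installed
                                                              #   - app requires libfoo-1 or libfoo-2
                                                              #   - libfoo-2 conflicts with libbar-1
                                                              #   - app requires libbar-1
                                                              clauses = [
                                                                  ["app"],
                                                                  ["-app", "libfoo-1", "libfoo-2"],
                                                                  ["-libfoo-2", "-libbar-1"],
                                                                  ["-app", "libbar-1"],
                                                              ]

                                                              names = sorted({lit.lstrip("-") for clause in clauses for lit in clause})

                                                              def satisfied(assignment, clause):
                                                                  # A clause holds if any of its literals agrees with the assignment.
                                                                  return any(assignment[lit.lstrip("-")] != lit.startswith("-")
                                                                             for lit in clause)

                                                              def solve(clauses):
                                                                  # Brute force over all assignments: fine for a toy example only.
                                                                  for values in product([True, False], repeat=len(names)):
                                                                      assignment = dict(zip(names, values))
                                                                      if all(satisfied(assignment, c) for c in clauses):
                                                                          return assignment
                                                                  return None

                                                              model = solve(clauses)
                                                              # The solver is forced to pick libfoo-1: libfoo-2 conflicts with
                                                              # libbar-1, which app requires.
                                                              print(sorted(k for k, v in model.items() if v))
                                                              ```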

                                                              What are some other practical applications? I think they are all quite specialized and don’t come up in day-to-day programming. Happy to be proven wrong.

                                                              EDIT: I googled and I found some interesting use cases, but they’re indeed specialized. Not something I’ve done or seen my coworkers do:



                                                              I can think of a certain scheduling algorithm that might have used a SAT solver, but without details I can’t be sure.

                                                              1. 5

                                                                They see some use in formal methods for model checking. The Alloy Analyzer converts Alloy specs to SAT problems, which is one of the reasons it’s such a fast solver.

                                                                There’s also this talk on analyzing floor plans.

                                                                1. 4

                                                                  A previous submission on SAT/SMT might help you answer that.

                                                                  1. 4

                                                                    I’ve used Z3 to verify that a certain optimized bitvector operation is equivalent to the obvious implementation of the intended calculation.

                                                                    Just typed up the two variants as functions in the SMT language with the bitvector primitives and asked Z3 for the satisfiability of f(x) != g(x) and rejoiced when it said “unsatisfiable.”

                                                                    1. 1

                                                                      Hm this is interesting. Does this code happen to be open source?

                                                                      1. 7

                                                                        I’ll just post it here. :)

                                                                        (define-sort Word () (_ BitVec 64))
                                                                        (define-fun zero () Word  (_ bv0 64))

                                                                        ;; Signed addition can wrap if the signs of x and y are the same.
                                                                        ;; If both are positive and x + y < x, then overflow happened.
                                                                        ;; If both are negative and x + y > x, then underflow happened.
                                                                        (define-fun add-overflow-basic
                                                                            ((x Word) (y Word)) Bool
                                                                            (or (and (bvslt x zero)
                                                                                     (bvslt y zero)
                                                                                     (bvsgt (bvadd x y) x))
                                                                                (and (bvsgt x zero)
                                                                                     (bvsgt y zero)
                                                                                     (bvslt (bvadd x y) x))))

                                                                        ;; Here is a clever way to calculate the same truth value,
                                                                        ;; from _Hacker's Delight_, section 2.13.
                                                                        (define-fun add-overflow-clever
                                                                            ((x Word) (y Word)) Bool
                                                                            (bvslt (bvand (bvxor (bvadd x y) x)
                                                                                          (bvxor (bvadd x y) y))
                                                                                   zero))

                                                                        (set-option :pp.bv-literals false)
                                                                        (declare-const x Word)
                                                                        (declare-const y Word)
                                                                        (assert (not (= (add-overflow-basic x y)
                                                                                        (add-overflow-clever x y))))
                                                                        (check-sat) ; expect: unsat
                                                                        1. 2

                                                                          Here’s an example of SMT solvers used for stuff like that. I added some more stuff in comments. You might also like some examples of Why3 code, which is translated for use with multiple solvers. Why3 is the main way people in the verification community use solvers, as far as I’m aware. WhyML is a nice intermediate language.

                                                                      2. 3

                                                                        Keiran King and @raph compiled a SAT solver to WebAssembly to use for the “auto-complete” feature of Phil, a tool for making crossword puzzles (source).

                                                                        1. 2

                                                                          I found this library (and its docs) interesting. They go over some practical examples.


                                                                          1. 1

                                                                            This looks like it’s about linear programming, not SAT solving. They’re related for sure but one reason it would be nice to see some specific examples is to understand where each one is applicable!

                                                                          2. 2

                                                                            I have personally used them when writing code to plan purchases from suppliers for industrial processes and to deal with shipping finished products out using the “best” carrier.

                                                                          1. 5

                                                                            Neat stuff!

                                                                            I wonder what @andyc has to say about Oh vs. Oil :)

                                                                            1. 10

                                                                              Thanks for the shout out :) I listened to the end of the talk (thanks for the pointer by @msingle), and basically everything he says is accurate. I found his 2010 master’s thesis several years ago, and it is a great read. Section 3 on Related Work is very good.


                                                                              Oh and Oil have very similar motivations – treating shell as a real programming language while preserving the interactive shell. Oh seems to be more Lisp-like while Oil simply retains the compositionality of the shell; it doesn’t add the compositionality of Lisp. (In other words, functions and cons cells compose, but processes and files also compose).

                                                                              I mention NGS here, which probably has even more similar motivations:


                                                                              The main difference between Oh and Oil is the same as the difference between NGS/Elvish/etc. and Oil: Oil is designed to be automatically converted from sh/bash.

                                                                              My thesis is that if this conversion works well enough, Oil could replace bash. If I just designed a language from scratch, I don’t think anyone would use it. Many people seem to agree with this. After all, fish has existed for 15 years, and is a nicer language than bash for sure, but I’ve seen it used zero times for scripts (and I’ve looked at dozens if not a hundred open source projects with shell scripts.)

                                                                              However as he correctly notes (and I point out in the FAQ), Oil doesn’t exist yet! Only OSH exists.

                                                                              The experience of implementing OSH and prototyping the OSH-to-Oil translation gave me a lot of confidence that this scheme can work. However, it’s taking a long time, longer than expected, like pretty much every software project ever.

                                                                              I’m not really sure how to accelerate it. Maybe it will get there and maybe it won’t :-/ No promises!

                                                                              I “front-loaded” the project so that if I only manage to create a high-quality implementation of OSH (a bash replacement), then I won’t feel bad. I’m pretty sure people will use that if it exists.

                                                                              I maintain a “help wanted” tag for anybody interested in contributing:


                                                                              This is mainly helping with OSH, as Oil is a concentrated design/implementation effort that is hard to parallelize. Feel free to join https://oilshell.zulipchat.com too!

                                                                              1. 3

                                                                                Michael MacInnis mentions talking to @andyc ~44:25 in the talk, but I’d also like to hear his opinion on Michael’s work.

                                                                                1. 1

                                                                                  Oh! (no pun intended :p) I’m still at the ~33:00 mark into the video and hadn’t gotten to that part. But yeah, I’d like to hear that.

                                                                              1. 8

                                                                                Wow great to see this! I remarked after one of the chapters in craftinginterpreters that I didn’t find a lot of literature on bytecode VMs.

                                                                                There are a lot of books on compilers, but not as much for interpreters (i.e. the bytecode compiler + dynamic runtime combo that many languages use).

                                                                                There are the Lua papers, a few blog posts about Python, some papers about the predecessor to OCaml, and craftinginterpreters, but not much else I could find. I found this book recently, but it’s not specifically about the compiler / VM:


                                                                                Anyway I am glad to see another addition to this small space! :)

                                                                                I’m hoping to find some time to really push into my compiler / VM project this winter: http://www.oilshell.org/blog/2018/03/04.html .

                                                                                Basically the idea is that Python has a “dumb” compiler and a very rich runtime. (This fact has been impressed upon me by hacking on its source code!) I want to make it have more of a smart compiler and a small/dumb runtime.
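                                                                                The “dumb compiler” half is easy to see with the standard dis module: even trivial arithmetic compiles to generic instructions, and all the type dispatch lives in the runtime:

                                                                                ```python
                                                                                import dis

                                                                                def add_ten(x):
                                                                                    return x + 10

                                                                                # CPython emits a generic add instruction (BINARY_ADD on older
                                                                                # versions, BINARY_OP on 3.11+); the int/float/str dispatch
                                                                                # happens in the runtime, not the compiler.
                                                                                dis.dis(add_ten)
                                                                                ```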

                                                                                1. 5

                                                                                  There’s also Nils M Holm’s books: http://t3x.org/

                                                                                  1. 2

                                                                                    That’s just pure gold :O

                                                                                    1. 1

                                                                                      Which parts discuss bytecode interpreters? I see a lot of different things there, including native code compilers, but no bytecode interpreters.

                                                                                      1. 3

                                                                                        One of the features of the T3X compiler is a portable bytecode interpreter. Here’s the source, I think you’ll like it: https://www.t3x.org/t3x/t.t.html

                                                                                    2. 4

                                                                                      There are a lot of books on compilers, but not as much for interpreters (i.e. the bytecode compiler + dynamic runtime combo that many languages use).

                                                                                      Not a book, but you still might find this paper interesting: Engineering Definitional Interpreters:

                                                                                      Abstract: A definitional interpreter should be clear and easy to write, but it may run 4–10 times slower than a well-crafted bytecode interpreter. In a case study focused on implementation choices, we explore ways of making definitional interpreters faster without expending much programming effort. We implement, in OCaml, interpreters based on three semantics for a simple subset of Lua. We compile the OCaml to x86 native code, and we systematically investigate hundreds of combinations of algorithms and data structures. In this experimental context, our fastest interpreters are based on natural semantics; good algorithms and data structures make them 2–3 times faster than interpreters. Our best interpreter, created using only modest effort, runs only times slower than a mature bytecode interpreter implemented in C.

                                                                                      1. 3

                                                                                        Wow thanks! This is exactly the kind of thing I’m looking for.

                                                                                        For example, even the first sentence is somewhat new to me. Has anyone else made a claim about how much slower a tree interpreter (I assume that’s what they mean by definitional) is than a bytecode interpreter? I’ve never seen that.

                                                                                        I know that both Ruby and R switched from tree interpreters to bytecode VMs in the last 5-8 years or so, but I don’t recall what kind of speedup they got. That’s something to research (and would make a good blog post).

                                                                                        Anyway I will be reading this and following citations :-) I did find a nice paper on the design of bytecode VM instructions. Right now choosing instructions seems to be in the category of “folklore”. For example, there are plenty of explanations of Python’s bytecode, but no explanations of WHY they are as such. I think it was basically ad hoc evolution.

                                                                                        1. 1

                                                                                          I did find a nice paper on the design of bytecode VM instructions.

                                                                                          Would mind sharing what that paper is? I’d love to read more about how to do this properly.

                                                                                          1. 3

                                                                                            I’m pretty sure this is the one I was thinking of:

                                                                                            ABSTRACT MACHINES FOR PROGRAMMING LANGUAGE IMPLEMENTATION


                                                                                            We present an extensive, annotated bibliography of the abstract machines designed for each of the main programming paradigms (imperative, object oriented, functional, logic and concurrent). We conclude that whilst a large number of efficient abstract machines have been designed for particular language implementations, relatively little work has been done to design abstract machines in a systematic fashion.

                                                                                            A good term to search for is “abstract machine” (rather than bytecode interpreter). The OCaml paper is called the “ZINC Abstract Machine” and it’s quite good.

                                                                                            There is a bunch of literature on stuff like the SECD machine, which is an abstract machine for Lisp, and which you can find real implementations for.

                                                                                            There seems to be less literature on stack / register bytecode interpreters. The Lua papers seem to be the best read in that area.
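                                                                                            For anyone who hasn’t seen one, the core of a stack-based bytecode interpreter fits on a page. This is a toy sketch, not any real VM’s instruction set:

                                                                                            ```python
                                                                                            # Opcodes for a toy stack machine; operands follow inline
                                                                                            # in the code stream.
                                                                                            PUSH, ADD, MUL, HALT = range(4)

                                                                                            def run(code):
                                                                                                stack, pc = [], 0
                                                                                                while True:
                                                                                                    op = code[pc]
                                                                                                    pc += 1
                                                                                                    if op == PUSH:   # push the next word as a constant
                                                                                                        stack.append(code[pc])
                                                                                                        pc += 1
                                                                                                    elif op == ADD:
                                                                                                        b, a = stack.pop(), stack.pop()
                                                                                                        stack.append(a + b)
                                                                                                    elif op == MUL:
                                                                                                        b, a = stack.pop(), stack.pop()
                                                                                                        stack.append(a * b)
                                                                                                    elif op == HALT:
                                                                                                        return stack

                                                                                            # (2 + 3) * 4
                                                                                            program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
                                                                                            print(run(program))  # [20]
                                                                                            ```

                                                                                            A register-based design (like Lua 5’s) trades more complex instruction encoding for fewer dispatches per expression.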

                                                                                            This guy is asking a similar question: https://stackoverflow.com/questions/1142848/bytecode-design

                                                                                            1. 1

                                                                                              This is great, thank you for sharing!

                                                                                    1. 37

                                                                                      I think practically all “Why You Should…” articles would be improved if they became “When You Should…” articles with corresponding change of perspective.

                                                                                      1. 23

                                                                                        An even better formulation would be “Here is the source code for an app where I didn’t use a framework. It has users, and here are my observations on building and deploying it”.

                                                                                        In other words, “skin in the game” (see Taleb). I basically ignore everyone’s “advice” and instead look at what they do, not what they say. I didn’t see this author relate his or her own experience.

                                                                                        The problem with “when you should” is that the author is not in the same situation as his audience. There are so many different programming situations you can be in, with different constraints, and path dependence. Just tell people what you did and they can decide whether it applies to them. I think I basically follow that with http://www.oilshell.org/ – I am telling people what I did and not attempting to give advice.

                                                                                        (BTW I am sympathetic to no framework – I use my own little XHR wrapper and raw JS, and my own minimal wrapper over WSGI and Python. But yes it takes forever to get things done!)

                                                                                        1. 2

                                                                                          Thanks for the Taleb reference. I didn’t know it existed, and so far it is a good read.

                                                                                          1. 1

                                                                                            His earlier books are also good. They explain the same ideas in many different ways, but I find the ideas need a while to sink in, so the repetition is useful.

                                                                                            He talks about people thinking/saying one thing, but then acting as if they believe the opposite. I find that to be painfully true, and it also applies to his books. You could agree with him in theory, but unless you change your behavior, you might not have gotten the point :-)

                                                                                            Less abstractly, the worst manager I ever had violated the “skin in the game” rule. He tried to dictate the technology used in a small project I was doing, based on conversations with his peers. That technology was unstable and inappropriate for the task.

                                                                                            He didn’t have to write the code, so he didn’t care. I was the one who had to write the code, so I’m the one with skin in the game, and I should have made the technology choices. I did what he asked and left the team, but I’m sure what he asked for is not what the person taking over wanted.

                                                                                            In software, I think you can explain a lot of things by “who has to maintain the code” (who has skin in the game). I think it explains why the best companies maintain long term software engineering staff, instead of farming it out. If you try to contract out your work, those people may do a shitty job because they might only be there for a short period. (Maybe think of the healthcare.gov debacle – none of the engineers really had skin in the game.)

                                                                                            It also explains why open source code can often be higher quality, and why it lasts 30+ years in many cases. If the original designer plans on maintaining his or her code for many years, then that code will probably be maintainable by others too.

                                                                                            It also explains why “software architect” is a bad idea and never worked. (That is, a person who designs software but doesn’t implement it.)

                                                                                            I’m sure these principles existed under different names before, and are somewhat common sense. But they do seem to be violated over and over, so I like to have a phrase to call people on their BS. :-)

                                                                                            1. 2

                                                                                              Yeah, the phrase works as a good lens and reminder. Interestingly, as most parents will attest, “do as I say, not as I do” is generally unsuccessful with kids. They are more likely to emulate than listen.

                                                                                        2. 2

                                                                                          I definitely agree with this change. It’d get more people thinking architecturally, something that’s sorely needed.

                                                                                        1. 4

                                                                                          I agree that make is too freaking hard. It’s a terrible tool, and you don’t have to use it. It took me years to realize this. I deleted the makefiles from my projects and no longer use them.

                                                                                          1. 4

                                                                                            Yup. I should also write a blog post on “invoking the compiler via a shell script”.

                                                                                            The main thing to know is that the .c source files and -l flags are order-dependent. With a makefile, most people compile and link in separate steps, so I think it doesn’t come up as much.
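
                                                                                            To make that concrete, here’s a sketch of a one-shot build where the ordering matters (file names invented for illustration):

                                                                                            ```shell
                                                                                            # Hypothetical one-shot build in a temp dir.
                                                                                            set -e
                                                                                            dir=$(mktemp -d)
                                                                                            cat > "$dir/main.c" <<'EOF'
                                                                                            #include <math.h>
                                                                                            #include <stdio.h>
                                                                                            int main(void) { printf("%.0f\n", sqrt(49.0)); return 0; }
                                                                                            EOF
                                                                                            # Sources come first, -l flags last: the linker scans left to right,
                                                                                            # so a library listed before the code that references it can be skipped.
                                                                                            cc -o "$dir/app" "$dir/main.c" -lm
                                                                                            "$dir/app"
                                                                                            ```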

                                                                                            1. 4

                                                                                              I don’t use Make as a build tool, but I find it quite handy for collecting small scripts and snippets as PHONY targets that don’t attempt any dependency tracking. Make is almost universally available, the simple constructs I use are portable between gmake and BSD make, and almost every higher-level tool out there understands a Makefile — so coworkers using various IDEs and the command line can all discover and run the “build”, “this test”, “download dependencies”, “run import process”, “lint”, etc. tasks. If I need a task that’s more than two lines, I put it in a shell script.

                                                                                              Although some languages now come with tooling that understands scripts, such as Cargo or NPM, I still find a Makefile useful for polyglot projects or when it’s necessary to modify the environment before calling down to that language specific tooling.
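
                                                                                              A minimal sketch of that style of Makefile (the target names and commands here are examples, not from any particular project):

                                                                                              ```make
                                                                                              # Pure task runner: no dependency tracking, every target is phony.
                                                                                              .PHONY: build test lint
                                                                                              build:
                                                                                              	./build.sh
                                                                                              test:
                                                                                              	./test/run.sh
                                                                                              lint:
                                                                                              	shellcheck *.sh
                                                                                              ```

                                                                                              The Makefile then doubles as documentation of the available tasks, and `make <TAB>` completes them in most shells.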

                                                                                              1. 4

                                                                                                Yes, I want to write about this too! You are using Make like a shell script :)

                                                                                                I use this pattern in shell:

                                                                                                # run.sh
                                                                                                build() {
                                                                                                  echo building...   # build commands go here
                                                                                                }
                                                                                                test() {
                                                                                                  echo testing...    # test commands go here
                                                                                                }
                                                                                                "$@"   # run the function named by the first argument

                                                                                                Then I invoke with

                                                                                                $ run.sh build
                                                                                                $ run.sh test

                                                                                                I admit that Make has a benefit in that the targets are auto-completed on most distros. But I wrote my own little auto-complete that does this. I like the simplicity of shell vs. make, and the syntax highlighting in the editor.

                                                                                                When I need dependency tracking, I simply invoke make from the shell script! Processes compose.
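
                                                                                                A runnable sketch of that composition, assuming GNU make (the file names are invented; .RECIPEPREFIX avoids the literal-tab requirement inside the here-doc):

                                                                                                ```shell
                                                                                                set -e
                                                                                                tmp=$(mktemp -d)
                                                                                                # A throwaway makefile for the one step that needs dependency tracking.
                                                                                                cat > "$tmp/Makefile" <<'EOF'
                                                                                                .RECIPEPREFIX = >
                                                                                                out.txt: in.txt
                                                                                                > cp in.txt out.txt
                                                                                                EOF
                                                                                                echo hello > "$tmp/in.txt"

                                                                                                build() {
                                                                                                  # the shell "task" just delegates to make
                                                                                                  make -C "$tmp" out.txt
                                                                                                }
                                                                                                build
                                                                                                cat "$tmp/out.txt"
                                                                                                ```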

                                                                                                You’ll see this in thousands of lines of shell scripts (that I wrote) in the repo:


                                                                                              2. 1

                                                                                                About two years ago I finally sat down and read the GNU Make manual. It’s very readable, and GNU Make is more capable than just about any other make out there. For one project, the core of the Makefile is:

                                                                                                %.a :
                                                                                                    $(AR) $(ARFLAGS) $@ $?
                                                                                                libIMG/libIMG.a     : $(patsubst %.c,%.o,$(wildcard libIMG/*.c))
                                                                                                libXPA/src/libxpa.a : $(patsubst %.c,%.o,$(wildcard libXPA/src/*.c))
                                                                                                libStyle/libStyle.a : $(patsubst %.c,%.o,$(wildcard libStyle/*.c))
                                                                                                libWWW/libWWW.a     : $(patsubst %.c,%.o,$(wildcard libWWW/*.c))
                                                                                                viola/viola         : $(patsubst %.c,%.o,$(wildcard viola/*.c))     \
                                                                                                            libIMG/libIMG.a         \
                                                                                                            libXPA/src/libxpa.a     \
                                                                                                            libStyle/libStyle.a     \
                                                                                                            libWWW/libWWW.a

                                                                                                The rest of it is defining the compiler and linker flags (CC, CFLAGS, LDFLAGS, LDLIBS) and some other targets (clean, depend (one command line to generate the dependencies), install, etc). And this builds a program that is 150,000 lines of code. I can even do a make -j to do a parallel build. I’m not entirely sure where all this make hate comes from.

                                                                                                1. 2

                                                                                                  I’ve read the GNU make manual (some parts multiple times) and written 3 significant makefiles from scratch. One of them is here:

                                                                                                  https://github.com/oilshell/oil/blob/master/Makefile (note that it includes .mk fragments)

                                                                                                  It basically works, but I’m sure there are some bugs in the incremental and parallel builds. I have to run make clean sometimes, and I’m not brave enough to do parallel builds. How would I track these bugs down? I have no idea. I tried, but I kept breaking other things, and I got no feedback about it.

                                                                                                  In other words, it’s extraordinarily difficult to know whether your incremental build is correct, and whether your parallel build is correct. Make essentially offers you no help there.

                                                                                                  There are a lot of other criticisms out there, but if you scroll down here you’ll see mine:


                                                                                                  (correctness, gcc -M, wrong defaults for .SECONDARY, etc.)

                                                                                                  There is also a debugging incantation I use that I had to figure out with some hard experience. Basically I disable the builtin rules database and enable verbose mode.
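
                                                                                                  In GNU make terms, the flags I mean are something like the following (the exact flag set is a sketch, not a quote from any particular build):

                                                                                                  ```shell
                                                                                                  # -r drops the builtin rule database, -R the builtin variables,
                                                                                                  # and -d enables verbose debug output about why targets are remade.
                                                                                                  set -e
                                                                                                  work=$(mktemp -d)
                                                                                                  printf 'all:\n\t@echo built\n' > "$work/Makefile"
                                                                                                  make -C "$work" -r -R -d all > "$work/log" 2>&1
                                                                                                  grep '^built$' "$work/log"
                                                                                                  ```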

                                                                                                  Another criticism is that the builtin rules database can make builds significantly slower.

                                                                                                  I’m not using Make for a simple problem, but most build problems are not simple! It is rare that you just want to build a few C files in a portable fashion; for that, it’s fine. But most systems these days are much more complex than that. Multiple languages and multiple OSes lead to an explosion of complexity, and the build system is the right place to handle those problems.

                                                                                                  1. 2

                                                                                                    I somehow seem to miss these “complex builds that break Make.” I have a project that uses C, C++ and Lua in a single executable and make handled it fine (and that includes compiling the Lua code into Lua bytecode, then transforming that into a C file which is then compiled into an object file for final inclusion in the executable).
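
                                                                                                    As a sketch, that Lua-to-executable chain is the kind of thing make’s pattern rules handle naturally (luac is the real Lua bytecode compiler; bin2c here stands in for whatever bytecode-to-C-array tool the project actually used):

                                                                                                    ```make
                                                                                                    # Lua source -> bytecode -> C array -> object file.
                                                                                                    %.luac : %.lua
                                                                                                    	luac -o $@ $<
                                                                                                    %.c : %.luac
                                                                                                    	bin2c $< > $@      # hypothetical embedding tool
                                                                                                    %.o : %.c
                                                                                                    	$(CC) $(CFLAGS) -c -o $@ $<
                                                                                                    ```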

                                                                                                    I don’t know. For as bad as make is made out to be, I’ve found the other supposed solutions to be worse.