1. 3

    I find it interesting that he starts with Google as an example.

    If I had to name a company whose product is relatively simple in technical terms, a search engine would certainly not be at the top of my list. It seems fairly obvious to me that creating a good search engine is a tough achievement. Sure, creating some kind of search engine is probably easy - but then you end up with another Bing, not another Google.

    1. 3

      I’m sure Bing has well over a thousand people working on it too. I think it’s a respectable effort – it must be the second best search engine AFAICT.

      FWIW I joined Google when it had around 3,000 employees, in early 2005. So this was 6 months after GMail came out, but before Google Maps came out, so Google was still a search engine.

      I remember my former coworkers asking why the company was so damn big. They thought it should be like 50 engineers and 50 non-technical people. It was just a website with a box, so how hard can it be?

      I don’t think their reaction was stupid, as I think pretty much everybody thought that at the time. Certainly it was a shock to see how much technology there was behind the scenes.

      1. 1

        Bing is actually quite good; it’s probably only about 3-4 years behind Google & you’ll recall that Google was still pretty damn good 4 years ago. DDG may be a better example of an 80% search engine ;-p

      1. 8

        Regarding Table of Contents generation, what do you think about hxtoc(1) which is part of the HTML/XML utilities by the w3c?

        Also, I’ve had a similar experience of joyfully discovering CommonMark recently, but instead of using the parser you mention, I’ve taken up lowdown as my client of choice. I guess this is something it has in common with most C implementations of markdown, but especially when compared to pandoc, it was fast. It took me a fraction of a second to generate a website, instead of a dozen or more. So I guess I wanted to see what other clients you’ve looked into, for example discount, as another popular implementation.

        1. 5

          Hm, I’ve actually never heard of hxtoc, lowdown, or discount!

          I haven’t been using Markdown regularly for very long. I started using it more when I started the Oil blog in late 2016. Before that, I wrote a few longer documents in plain HTML, and some in Markdown.

          I had heard of pandoc, but never used it. I guess cmark was a natural fit for me because I was using markdown.pl in a bunch of shell scripts. So cmark pretty much drops right in. I know a lot of people use framework-ish static site generators, which include markdown. But I really only need markdown, since all the other functionality on my site is written with custom scripts.

          So I didn’t really do much research! I just felt that markdown.pl was “old and smelly” and I didn’t want to be a hypocrite :-) A pile of shell scripts is pretty unhygienic and potentially buggy, but that is what I aim to fix with Oil :)

          That said, a lot of the tools you mention do look like they follow the Unix philosophy, which is nice. I would like to hear more about those types of tools, so feel free to post them to lobste.rs :) Maybe I don’t hear about them because I’m not a BSD user?

          1. 4

            I had heard of pandoc, but never used it.

            It’s a nice tool, and not only for working with Markdown, but tons of other formats too. But Markdown is kind of its focus… If you look at its manual, you’ll find that it can be very finely tuned to match one’s preferences (such as enabling or disabling raw HTML, syntax highlighting, math support, BibLaTeX citations, special list and table formats, etc.). It even has output options that make it resemble other implementations like PHP Markdown Extra, GitHub-Flavored Markdown, MultiMarkdown and also markdown.pl! Furthermore, it’s written by John MacFarlane, who is one of the guys behind CommonMark itself. In fact, if you look at the cmark contributors, he seems to be the most active maintainer.

            I usually use pandoc to generate .epub files or to quickly generate a PDF document (version 2.0 supports multiple backends, besides LaTeX, such as troff/pdfroff and a few html2pdf engines). But as I’ve mentioned, it’s a bit slow, so I tend to not use it for simpler texts, like when I have to generate a static website.

            I know a lot of people use framework-ish static site generators, which include markdown.

            Yeah, personally I use zodiac, which uses AWK and a few shell script wrappers. You get to choose the converter, which takes some format in and pipes HTML out. It’s not ideal, but short of writing my own framework, it’s quite ok.

            Maybe I don’t hear about them because I’m not a BSD user?

            Nor am I, at least not most of the time. I learned about those HTML/XML utilities because someone mentioned them here on lobste.rs, and I was surprised to see how powerful they are, and how seemingly nobody knows about them. hxselect queries specific elements in a CSS fashion, hxclean is an automatic HTML corrector, and hxpipe/hxunpipe convert (and reconvert) HTML/XML to a format that can be more easily parsed by AWK/perl scripts – certainly not useless or niche tools.

            But I do have to admit that a BSD user influenced me on adopting lowdown, and since it fits my use-case, I stick by it. Nevertheless, I might take a look at cmark, since it seems interesting.

          2. 2

            Unfortunately, it looks like lowdown is a fork of hoedown which is a fork of sundown which was originally based on the markdown.pl implementation (with some optional extensions), and is most likely not CommonMark compliant. Pandoc is nice because it can convert between different formats, but it also has quite a few inconsistencies.

            One of the biggest reasons I like CommonMark is because it aims to be an extremely solid, consistent standard that makes markdown more sane. It would be nice to see more websites move towards CommonMark, but that’s probably a long shot.

            Definitely check out babelmark if you get a chance; it lets you test different markdown inputs against a bunch of different parsers. There are a bunch of example divergences on the babelmark FAQ. The sheer variety of outputs for some simple inputs is precisely why CommonMark is useful as a standard.

            1. 3

              Lowdown isn’t CommonMark conformant, although it has some bits in place. The spec for CommonMark is huge.

              If you’re a C hacker, it’s easy to dig into the source to add conformancy bit by bit. See the parser in document.c and look for LOWDOWN_COMMONMARK to see where bits are already in place. The original sundown/hoedown parser has been considerably simplified in lowdown, so it’s much easier to get involved. I’d be thrilled to have somebody contribute more there!

              In the immediate future, my biggest interest is in getting an LCS implementation into the lowdown-diff algorithm. Right now it’s pretty ad hoc.

              (Edit: I’m the author of lowdown.)

              1. 2

                One of the biggest reasons I like CommonMark is because it aims to be an extremely solid, consistent standard that makes markdown more sane. It would be nice to see more websites move towards CommonMark, but that’s probably a long shot.

                I guess I can agree with you when it comes to websites like Stackoverflow, Github and Lobsters having Markdown formatting for comments and other text inputs, but I really don’t see the priority when it comes to using a not 100% CommonMark compliant tool for your own static blog generator. I mean, it’s nice, no doubt, but as long as you don’t intentionally use uncertain constructs and don’t over-format your texts to make them more complicated than they have to be, I guess that most markdown implementations are fine in this regard. Speed, on the other hand, is a different question.

                1. 1

                  Are you saying that CommonMark should be used for comments on websites, but not for your own blog?

                  I would say the opposite. For short comments, the ambiguity in Markdown doesn’t seem to be a huge problem, and I am somewhat comfortable with just editing “until it works”. I don’t use very many constructs anyway – links, bold, bullet points, code, and block code are about it.

                  But blogs are longer documents, and I think they have more lasting value than most Reddit comments. So although it probably wasn’t strictly necessary to switch to cmark, I like having my blog in a format with multiple implementations and a spec.

                  1. 3

                    At least in my opinion, it’s useful everywhere, but more so for comments, because it removes differences in implementations. Oftentimes the people using a static site generator are developers and can at least understand differences between implementations.

                    That being said, I lost count of how many bugs at Bitbucket were filed against the markdown parser because the library used resolves differences by following what markdown.pl does. I still remember differences in bbcode parsing between different forums - moving to a better standard format like markdown has been a step in the right direction… I think CommonMark is the next step in the right direction.

                    1. 1

                      The point has already been brought up, but I just want to stress it again. You will probably have a feeling for how your markup parser works anyway, and you will write accordingly. If your parser is CommonMark compliant, that’s nice, but it really isn’t the crucial point.

                      On the other hand, especially if one likes to write longer comments, and uses a bit more than the basic markdown constructs on websites, having a standard to rely on does seem to me to offer an advantage, since you don’t necessarily know what parser is running in the background. And if you don’t really use markdown, it doesn’t harm you after all.

              1. 3

                Nitpick:

                <p>"Oil"</p><p>&quot;Oil&quot;</p>. The former might be valid HTML, but the latter is better. (The former is also not valid XML.)

                The former is perfectly valid[1] XML; there’s nothing wrong with " outside tags.

                [1]: More correctly, it’s well-formed. “Valid” only has meaning against a specified DTD schema, which is absent here. But if we assume it’s HTML, then it’s also a valid HTML fragment.

                1. 3

                  Yes, you’re right, I made a correction:

                  http://www.oilshell.org/blog/2018/02/14.html#toc_2

                1. 1

                  Has anyone used Open Build Service here? I noticed that Alpine Linux is having some problems with getting a good continuous build service up on multiple platforms. Is this the intended use case for Open Build Service? Do you have to modify their server code to support a new distro, or is it “self serve”?

                  http://lists.alpinelinux.org/alpine-devel/6057.html

                  1. 7
                    • I got some rough static analysis of shell scripts working (for Oil shell). It figures out what external binaries you use, statically. If anyone has done any projects / research related to static analysis (in any language), I’m interested!
                    • I’m thinking of ways to make the Oil binary smaller without rewriting code. I think I figured out a good way to do it based on modifying/generating CPython’s PyMethodDef module tables – that is, get rid of functions that are not used by a given Python program. I plan to try it out this week. I had experimented with dynamic code coverage, but this more static strategy feels better for a first pass at chopping down 150K lines of C code.
                    1. 1

                      Are there other kernels that have a better culture, and are easier to approach if one would like a chance to work at this level? One of the BSDs? Something springing from L4 somewhere I don’t know about? Haiku? ReactOS? Something more esoteric? That Rust one whose name escapes me right now?

                      1. 5

                        Linux is so unique in its development style and pace that I think most of those comparisons are misleading. There are probably fewer than 5 or 10 projects you can compare Linux to, and even then it’s hard to.

                        Something like L4 is done by a small group of people, who probably have physical contact with each other. It’s more like Google’s Fuchsia than Linux. Likewise with Minix – it’s a relatively small group of people and thus cohesive, at least compared to Linux.

                        As far as I understand, FreeBSD is the “biggest” of the BSDs. Not sure if that means it has the most developers. But I would guess that Linux has 10x the developers of the next biggest one. And the developers come from 10x more diverse institutions (corporations, etc.)

                        This is my informal sense of things; I’m sure there are numbers, and if anyone has them I’d appreciate it. But I’d be surprised if the pace / #developers isn’t at least 10x, even 100x.

                        Redox OS is the Rust one. I only know a little bit about it, but it also has the property of being nascent, which will make for a very different culture than Linux. There is a lot “at stake” in Linux, hence the disagreements.

                        Also, for better or worse, almost all the projects you mention are kernel + user space, not just a kernel.

                        1. 1

                          I don’t mean “I want to make value judgements about which communities are better because they deal with more stuff”, I mean “I would love to hack on kernels but want no part in this kind of environment, where should I go?”

                          1. 2

                            I haven’t hung out here myself, but I’ve heard good things about: https://wiki.osdev.org/Expanded_Main_Page

                        2. 3

                          This is a good time to shill my favourite project (It’s NetBSD, and has great culture), but to be honest - I don’t know many projects with governance as insane as Linux. Even in the article, Daniel Vetter refers to group maintainership and handing out commit access as “more like a standard project”.

                          I’ve seen people who had potential to be toxic maintainers that drive away contributions but their impact was limited by them not having absolute power.

                          Also, by having a very weak hierarchy, nobody is immune from being kicked from the project. Toxicity will get you pulled aside and have someone ask you to stop/apologize, continuing will result in being kicked from the project, and I’ve seen it happen in practice.

                        1. 10

                          POSIX shell, on the other hand, has many competing implementations on many different operating systems - all of which are compatible with each other because they conform to the standard.

                          That’s not really true. dash and bash are incompatible in some basic ways, which are not resolved by POSIX. I list two here:

                          http://www.oilshell.org/blog/2018/01/28.html#limit-to-posix

                          Example:

                          $ dash -c 'x="a b"; readonly y=$x; echo $y'
                          a
                          $ bash -c 'x="a b"; readonly y=$x; echo $y'
                          a b
                          

                          The above is a very short POSIX shell script, but which one is right?

                          dash is actually the most “divergent” shell I’ve tested. bash, mksh, and busybox ash all tend to agree on the important stuff (although they implement different features, and wildly disagree on error cases). dash has basic disagreements like this.
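A quick way to check this kind of divergence on your own machine is to run the same snippet under every shell that happens to be installed. This is only a minimal sketch: the list of shells and the function name `compare_shells` are my own choices, and the output will depend on which shells and versions you have.

```shell
#!/bin/sh
# Run one snippet under each installed shell and label the output.
# The shell list below is an arbitrary selection, not a standard set.
compare_shells() {
    snippet=$1
    for shell in dash bash mksh zsh; do
        # Skip shells that are not installed on this machine.
        command -v "$shell" >/dev/null 2>&1 || continue
        printf '%s: ' "$shell"
        "$shell" -c "$snippet"
    done
}

# The readonly example from above, once per available shell.
compare_shells 'x="a b"; readonly y=$x; echo $y'
```

Any line that differs from the others points at behavior worth checking against the spec before relying on it.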

                          Also, I mention that Debian doesn’t even limit itself to POSIX – they forked/developed dash to replace bash, but apparently couldn’t do what they wanted without some extensions like local, which I agree is essential.

                          BTW, this is from my user experience, not just writing Oil. I tried to switch to dash maybe 4 years ago, but gave up because it was so incompatible (and harder to use than bash).

                          1. 2

                            The above is a very short POSIX shell script, but which one is right?

                            You could just put double quotes around the argument to readonly and be done with it:

                            dash -c 'x="a b"; readonly "y=$x"; echo $y'
                            a b
                            

                            Yes, it’s inconsistent compared to normal assignment. Yes, it’s portable.

                            Having worked on code that needs to run on multiple platforms and having had to deal with shell scripts incompatible between e.g. zsh and bash with people on either side arguing that their shell is superior, I feel that the least common denominator (POSIX) is the way to go. Pragmatic side-steps can still be made if the result is that it doesn’t cause more problems than it solves. local tends to work across shells and does solve a real-world problem.

                            1. 2

                              Sure, but my point is that “use POSIX”, and especially “read the POSIX spec”, is not great advice.

                              That’s not the only example; I could name at least a half dozen other divergences that occur in practice, which POSIX is silent on.

                              I’ve never heard anyone say “use POSIX, but make sure to quote your assignments, and … “.

                              There are very few people who learn programming languages by reading specifications. Usually it’s a matter of working with an existing codebase, googling and reading StackOverflow results, reading books, etc.

                              I have a whole book on POSIX shell, so I’m very familiar with this kind of advice. I just don’t think it’s useful. It’s just a big headache that very few people will follow. For example, echo vs. printf. 95%+ of shell scripts I see use echo, even though it’s not fully-specified and not portable.
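To make the echo vs. printf point concrete, here is a small sketch (the function names are made up for illustration). What `echo` does with a leading `-n` or with backslash sequences differs between shells, while `printf '%s\n'` prints its argument verbatim.

```shell
#!/bin/sh
# Illustrative helper names; not taken from any existing script.

# Portable: prints the argument verbatim, followed by a newline.
say() {
    printf '%s\n' "$1"
}

# Not portable: depending on the shell, echo may interpret the
# backslash sequence or treat a leading -n as an option.
say_with_echo() {
    echo "$1"
}

say '-n'               # prints the two characters -n
say 'a\tb'             # prints a, backslash, t, b (no tab)
say_with_echo 'a\tb'   # output varies by shell
```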

                              1. 1

                                YMMV. Following the advice in “Beginning Portable Shell Scripting” by Peter Seebach, the Open Group base spec and tools like ShellCheck has worked very well for me, deploying to Slackware with /bin/sh symlinked to bash (IIRC, or maybe it was a copied binary?), Debian with dash as /bin/sh, FreeBSD with /bin/sh being an Almquist shell (ash) descendant, and running the same scripts directly in bash and zsh as well.

                                I tend to stay away from csh, tcsh and I’ve never seen them being used as a /bin/sh substitute, though I would be interested in knowing if there are such systems.

                          1. 5

                            I plan to work on these things […] A “carrot” for Oil, i.e. a feature that existing shells don’t have. Ideas: [static analysis, app bundles, and crash reports]

                            If you’re looking for ideas, here’s another one that might interest you: nestable string literals, a.k.a. tagged string delimiters, a.k.a. I don’t think this construct has an official name. It’s a rare language feature AFAICT, but solves an annoying and error-prone task that is especially common in shell: quoting, escaping, and multiple escaping.

                            Examples of tagged string delimiters in other languages (I am aware of only two):

                            • Lua’s long brackets let you write string literals (and multiline comments) as [[...]], [=[...]=], [==[...]==], and so forth – the number of equals signs in the opening delimiter is counted, and the string ends when the matching closing delimiter is found.

                                --[[ This comments out an assignment to my_lua_string
                                my_lua_string = [==[one [=[inner]=] two]==]
                                ]]
                              
                                -- This is a string delimited with long brackets, that contains several other closing delimiters that are ignored without needing escaping
                                [=[one ]==]'" two]=]
                                --> 'one ]==]'" two'
                              
                                -- Using long brackets with loadstring (Lua's `eval`):
                                f = loadstring([[i = i .. "x"]])  -- `..` is Lua's string concatenation operator
                                i = 'a'
                                f()  -- i = 'ax'
                                f()  -- i = 'axx'
                              
                            • PostgreSQL’s dollar quoting:

                                $$Dianne's horse$$
                                $SomeTag$Dianne's horse$SomeTag$
                              
                                CREATE OR REPLACE
                                    FUNCTION increment(i integer) RETURNS integer AS $myAddOne$
                                        BEGIN
                                            RETURN i + 1;
                                        END;
                                    $myAddOne$ LANGUAGE plpgsql;
                              
                            • Heredocs don’t count, because you can’t write them inline.

                            Ways tagged string delimiters would improve the OSH/shell experience:

                            • Tagged string delimiters make it trivial to write any string literal without escaping the contents: you simply choose a delimiter that doesn’t occur in the string.
                            • Tagged delimiters make it easy to nest commands with quoted string arguments inside the quoted string argument of a higher-level command: for example, cp 'my file with.spaces' /etc within sudo sh -c 'multiple; commands; cp "..."' within ssh 'multiple; commands; sudo sh -c "..."'.
                            • Shell’s string-centric nature and interactive usage mean it is the context in which I most frequently quote strings, or pass quoted shell code that contains its own quotes that need escaping, and so on. The example in the previous item is one I have encountered in real life, albeit only once. It would be good to get away from escaping, and wonderful to get away from double escaping.
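As a small illustration of the nesting problem from the list above, the sketch below reproduces two levels of it, with plain `sh -c` standing in for `ssh` and `sudo` (that substitution is mine, so the example runs anywhere). Each extra level forces another round of escaping on the innermost string.

```shell
#!/bin/sh
# One level: the file name sits in double quotes inside single quotes.
one_level() {
    sh -c 'printf "%s\n" "my file with.spaces"'
}

# Two levels: the inner double quotes now have to be backslash-escaped,
# because the middle command is itself wrapped in double quotes.
two_levels() {
    sh -c 'sh -c "printf \"%s\n\" \"my file with.spaces\""'
}

one_level
two_levels
```

Both calls print the same file name. A third level would need the backslashes themselves escaped, which is where tagged delimiters would start to pay off.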

                            Sorry to dump such a long post on you (although turnabout is fair play :-P), but I thought you might be interested. For my part, I find your OSH project fascinating, and avidly read every post. Thank you for writing such good writeups!

                            1. 5

                              Perl has nestable strings with q{} / qq{} / qx{} as well.

                              print q{foo q{bar}};
                              
                              1. 3

                                Yes I totally agree that shell’s string-centric nature means that this is one of the most important features! I often have HTML snippets in my shell scripts, and I see tons of config file snippets in shell scripts, like /etc/resolv.conf, etc. Not to mention Awk, Perl, and Python snippets.

                                I think of this feature as “multiline strings”. It will subsume here docs, and as you mention you can use one as an argument to a command.

                                It will replace all the variants of here docs: << EOF , << 'EOF' (quoted), <<-EOF (indented), etc.
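For reference, here is a sketch of how two of those variants behave in a POSIX shell today. The function and variable names are illustrative only.

```shell
#!/bin/sh
var=world

# Unquoted delimiter: $var is expanded inside the here doc.
expanded() {
cat << EOF
hello $var
EOF
}

# Quoted delimiter: the body is taken literally, $var is left alone.
literal() {
cat << 'EOF'
hello $var
EOF
}

expanded   # prints: hello world
literal    # prints: hello $var
```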

                                I’m thinking of using a variant on Python’s syntax:

                                # operator << takes a string literal, not a filename, like <<< in shell
                                cat << 'normal string' 
                                
                                cat << '''
                                $var is NOT expanded in 3 single quotes, equivalent to quoting 'EOF'
                                '''
                                
                                cat << """
                                $var is expanded in 3 double quotes, like unquoted EOF
                                """
                                

                                I have thought about the “tag” problem too. I originally had a proposal to have some sort of tag, but someone pointed out to me that it’s not necessary for multiple here docs on a line. You can just use the order of the here docs.

                                I guess I have never really had an issue with Python’s multiline strings – e.g. embedding a multiline string in a multiline string! The fact that there are two different types of quotes helps. But I’m open to adding that later if it’s a problem.

                                I plan to write a preview of Oil syntax, as requested on lobste.rs. So I’ll make sure to highlight this part. Unfortunately I have at least 2-3 posts to write before that, so it might not come for a while.

                                There is also the issue of C escapes, e.g. $'\n' in shell.

                                Thanks for the feedback!

                                1. 1

                                  multiline strings with interpolation and also trimming of leading indentation is super handy, the way nix does it is pretty fun to use.

                                  1. 1

                                    OK interesting, I didn’t know Nix did that. That’s pretty much what I expect Oil to have – the indentation of the closing quote is what you strip off of every line.

                                    But '' already means something in both shell and Oil, so it will be ''' and """ like Python, except that single and double mean what they already do in shell.

                                    https://learnxinyminutes.com/docs/nix/

                              1. 2

                                I’m not real familiar with Oil.

                                I like what I read, so far. Looks very promising. I recently wanted to write a static site generator, something real barebones, and I decided not to write it in bash; instead, I chose to write most of it in CoffeeScript, with some bash one-liners here and there. Coming from my angle, I guess I have two questions: why not just compile down to .bash? Is the runtime itself lacking? And also, given that I tend to rewrite as much bash as I can into CS, why would living in an Oil shell be better than coffee> ?

                                I read the Replacing Shell Scripts with Python post, and I can’t say I disagree with the author. I guess I am just unconvinced by many arguments on the page. Maybe just focus on the positives and put the rest in a -hater’s handbook.

                                Oil can and should succeed bash. I hope it does. Every time I write a bash script at work, often in the name of getting things done, I wince a little bit. I know I’m that coworker writing that smelly glue.

                                1. 1

                                  Good questions. Yes, I want to improve upon the bash runtime too. A lot of it is syntax, but there are also semantic problems.

                                  Right now, there’s no reason to ditch CoffeeScript for Oil. But I’m very much targeting command line programs written in Python / Ruby / node.js, so eventually Oil could be a replacement for what you do with CoffeeScript. But that won’t happen for quite a while.

                                  It’s roughly accurate to say Oil is a hybrid of shell and Python.

                                1. 4

                                  I briefly looked at Vala for writing Oil [1], although I also looked at 5 or 10 other languages.

                                  For my purposes, Go is in a similar spot: they’re both C-like with more abstraction (interfaces in Go, classes in Vala). But Go has a complex runtime which starts threads and so forth.

                                  So Vala was interesting in that it compiles to C and, I believe, has no runtime. However, when I actually tried the C runtime (as opposed to the GUI runtime), it didn’t work for me. I think I got a compile error. I only spent about 10 or 15 minutes on it though.

                                  I’m still a bit curious if Vala is suited for writing low level / low-dependency code like a shell, vs. something like GUI code. My sense is that it’s more suited for the latter.

                                  [1] http://www.oilshell.org/

                                  1. 1

                                    what made you choose python over go for Oil?

                                    1. 1

                                      Go’s runtime is actually more complex than Python’s, in that it starts threads. Threads and processes don’t mix.

                                      Also, Go doesn’t use libc, which a shell pretty much needs. They have their own wrappers over raw syscalls.

                                      1. 1

                                        What’s wrong with it starting threads? As long as you don’t start more than one goroutine, you should be fine.

                                        Also, why does a shell need libc?

                                  1. 40

                                    Thanks to the Recurse Center for inviting me to speak and for making the video. I’m here if anyone has questions.

                                    1. 21

                                      A very non-technical question: Why should Xi “only” be an editor for the next 20 years? In terms of text editors, that’s not that long. People, like me, use Editors that are nearly twice as old as I am, and the reasons don’t seem to be tied to performance or the internal structure of the implementations, but rather a core “philosophy” regarding how things are done, or how the programmer should relate to text. What does Xi have to offer regarding these “practical” qualities, that have made, for example Emacs or Vi(m) last for so long? Does Xi see itself in such a certain tradition, having a certain ideal that you aspire to, or do you set your own terms? These could seem important if one intends to write an editor that should be practically used, which is what I gathered from the video, as opposed to being a “purely academic” experiment, which would obviously have different goals and priorities.

                                      1. 4

                                        Do you plan on doing a Linux frontend yourself and would it matter performance-wise? I saw that some people are working on a gtk+ frontend but I was wondering if it will be as fast as the mac one.

                                        1. 4

                                          In my ideal world, there’d be a cross-fertilization of code and ideas so the linux front-end would be just as nice and performant as the mac one, but it’s unlikely at this point I’ll take it on myself.

                                          1. 2

                                            I just tried xi-gtk and it’s very fast. Not sure what it’s like compared to the swift one but it’s a whole lot faster than gedit.

                                            1. 1

                                              nice, thanks!

                                          2. 4

                                            Also, here is a cool demo of async loading of big text files – you can navigate and I think even edit while loading:

                                            https://youtu.be/sPhpelUfu8Q?t=1601

                                            Using immer, Clojure-like immutable data structures in C++:

                                            https://github.com/arximboldi/immer

                                            The editor is a demo of the library: https://github.com/arximboldi/ewig

                                            1. 3

                                              I just watched the video. It looks really interesting, although a lot of it was over my head!

                                              I more or less understand the process model, async architecture, and distributed data structures. I like that part – very Unix-y.

                                              But there were a lot of rendering terms I didn’t understand. Maybe because some of it is Mac-specific. But also some of the OpenGL issues. Is there any background material on text rendering you’d recommend?

                                              Also, I don’t understand the connection to Fuchsia? I was under the impression that Fuchsia was more consumer-facing, and Xi is more developer-facing. That is, I imagine most consumers don’t have text editors installed. There is no text editor on Android or ChromeOS.

                                              Or is xi more general than a vi/emacs replacement – is it meant to be used as part of a browser for implementing text boxes?

                                              1. 4

                                                Glad you enjoyed the talk!

                                                Unfortunately, there really isn’t a lot of material on text processing, especially from a modern perspective. A lot of what I learned about rendering came from reading other code (alacritty in particular), and talking to people like Patrick Walton and my teammates on Chrome and Android.

                                                There is an EditText widget on Android (a good chunk of my career involved working on it), but you certainly wouldn’t want to write code (or long-form text) in it. My goal with xi is to make a core lightweight and performant enough it can be used in such cases, easily embedded in apps, yet powerful enough for cases where you really do need a dedicated editor application.

                                              2. 2

                                                I feel like it’s fairly out of my league, but I’ve been thinking about implementing a Sublime Text-like editor (multiple cursors, smart brackets/quotation marks) for arbitrary text fields in web sites. Would it be possible to use Xi as a backend for something like that? Perhaps via compilation to WASM?

                                                1. 8

                                                  Eventually it is my hope that something like that could work. There are some technical details (the current implementation uses threads), so it’s not an easy project. In the meantime, the excellent CodeMirror does multiple selections, and is very widely used embedded in websites.

                                              1. 2

                                                As someone who is just starting to dive deep into operating systems, especially Unix, I’m grateful for all the writing you’ve done about the Oil project.

                                                Oil is taking shell seriously as a programming language, rather than treating it as a text-based UI that can be abused to write programs.

                                                One question in response to this statement is at what point does the shell language become just another programming language with an operating system interface. This question seems especially important when the Oil shell language targets users who are writing hundreds of lines of shell script. If someone is writing an entire program in shell script, what is the advantage of using shell script over a programming language? You seem to anticipate this question by comparing the Oil shell language to Ruby and Python:

                                                …Python and Ruby aren’t good shell replacements in general. Shell is a domain-specific language for dealing with concurrent processes and the file system. But Python and Ruby have too much abstraction over these concepts, sometimes in the name of portability (e.g. to Windows). They hide what’s really going on.

                                                So maybe these are good reasons (not sure if they are or aren’t) why Ruby and Python scripts aren’t clearly better than shell scripts. You also provide a mix of reasons why shell is better than Perl. For example: “Perl has been around for more than 30 years, and hasn’t replaced shell. It hasn’t replaced sed and awk either.”

                                                But again, it doesn’t seem to clearly answer why the domain language for manually interacting with the operating system should be the same language used to write complex scripts that interact with the operating system. Making a language that is capable of both should provide a clear advantage to the user. But it’s not clear that there is an advantage. Why wouldn’t it be better to provide two languages: one that is optimized for simple use cases and another that is optimized for complex use cases? And why wouldn’t the language for complex use cases be C or Rust?

                                                1. 3

                                                  My view is that the most important division between a shell language and a programming language is what each is optimized for in terms of syntax (and semantics). A shell language is optimized for running external programs, while a programming language is generally optimized for evaluating expressions. This leads directly to a number of things, like what unquoted words mean in the most straightforward context; in a fluid programming language, you want them to stand for variables, while in a shell language they’re string arguments to programs.

                                                  With sufficient work you could probably come up with a language that made these decisions on a contextual basis (so that ‘a = …’ triggered expression context, while ‘a b c d’ triggered program context or something like that), but existing programming languages aren’t structured that way and there are still somewhat thorny issues (for example, how you handle if).
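
                                                  To make the contextual idea concrete, here is a toy sketch in Python. It is not how any existing shell decides this: `classify` and its regex are invented for illustration, and it sidesteps the thornier cases such as `if`.

```python
import re

# Toy rule (an assumption for this sketch): a line of the form "name = ..."
# switches to expression context; anything else is a command whose
# unquoted words are plain string arguments.
ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*\s*=(?!=)")

def classify(line: str):
    line = line.strip()
    if ASSIGNMENT.match(line):
        name, expr = line.split("=", 1)
        return ("expression", name.strip(), expr.strip())
    # Command context: the first word is the program, the rest are arguments.
    return ("command", line.split())

print(classify("a = 1 + 2"))   # ('expression', 'a', '1 + 2')
print(classify("a b c d"))     # ('command', ['a', 'b', 'c', 'd'])
```

                                                  Note that in an ordinary POSIX shell, `a = 1` would run a program named `a`, which is one reason a rule like this cannot simply be bolted onto existing syntax.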

                                                  Shell languages tend to wind up closely related to shells (if not the same) because shells are also obviously focused on running external programs over evaluating expressions. And IMHO shells grow language features partly because people wind up wanting to do more complex things both interactively and in their dotfiles.

                                                  (In this model Perl is mostly a programming language, not a shell language.)

                                                  1. 1

                                                    Thanks, glad you like the blog.

                                                    So maybe these are good reasons (not sure if they are or aren’t) why Ruby and Python scripts aren’t clearly better than shell scripts.

                                                    Well, if you know Python, I would suggest reading the linked article about replacing shell with Python and see if you come to the same conclusion. I think among people who know both bash and Python (not just Python), the idea that bash is better for scripting the OS is universal. Consider that every Linux distro uses a ton of shell/bash, and not much Python (below a certain level of the package dependency graph).

                                                    The main issue is that people don’t want to learn bash, which I don’t blame them for. I don’t want to learn (any more) Perl, because Python does everything that Perl does, and Perl looks ugly. However, Python doesn’t do everything that bash does.

                                                    But again, it doesn’t seem to clearly answer why the domain language for manually interacting with the operating system should be the same language used to write complex scripts that interact with the operating system.

                                                    There’s an easy answer to that: because bash is already both languages, and OSH / Oil aim to replace bash.

                                                    Also, the idea of a REPL is old and not limited to shell. It’s nice to build your programs from snippets that you’ve already tested. Moving them to another language wouldn’t really make sense.

                                                  1. 3

                                                    “Because the shell is terrible” is a sufficient answer. I am very excited to see where Oil and osh are going. I’d love to see what a new language environment designed under the same basic constraints as the shell would end up if bash compatibility weren’t important. I keep trying e.g. xonsh or eshell and always switch back to fish because as much as I prefer elisp or python to shell, the impedance mismatches are just too great.

                                                    Part of that, of course, could be the ~30yrs of Unix shell experience that I have burdened myself with, and were I starting from scratch, I might be able to escape the sad hexadecimal chains of unstructured 7-bit text.

                                                    1. 1

                                                      Thanks! By the way, I love the phrase “tilter at path-dependent windmills” in your profile :)

                                                      Shell is an extreme example of path dependence. Really, we ended up with this??? There is a 40+ year unbroken chain going back to the Thompson shell. I’m pretty sure that technologies of a similar age like Lisp and Fortran have seen multiple “overhauls” since then.

                                                      And I think of bash as a friend-but-enemy… on the one hand, it helps me get things done faster than any other tool. On the other hand, I want to kill it. :) I’ve worked with a lot of non-programmers doing programming, including technical artists and data scientists, and I feel I can’t recommend that they learn bash “with a straight face”, even though it will solve many of their problems. They usually prefer Python even though it’s not the best tool for the job.

                                                    1. 3

                                                      I agree that trying to maintain POSIX compatibility isn’t worth it, but I’m interested in the author’s decision to separate out an Oil shell and language. They’re trying to make a complete alternative for the shell that can also run classic shell code. The fish approach is that you can still run your bash/sh script, just like you can a Python or Ruby script, and that fish is for your interactive shell.

                                                      This post doesn’t really talk about fish, the shell I’ve personally been using since 2013. It’s really an amazing shell and has come a long way, and it’s designed primarily around usability features and command line highlighting. Tab completion can be derived from man pages, and the highlighting, reverse searching and directory stack navigation are all incredibly useful.

                                                      So I gave oil 0.3.0 a shot. A couple of interesting things: Ctrl+C actually breaks you out of the shell. Tab completion for directory navigation doesn’t seem to work, although I do like the status bar at the top with information a search time. Another bug, if a user doesn’t have access to a bin directory, I get an Unhandled exception while completing: [Errno 13] Permission denied: '/usr/games/bin .. didn’t realize my user wasn’t in the games group.

                                                      I realize it’s still really early, and shells are incredibly difficult to write. Early versions of fish would crash often enough that I’d wonder about the security implications. I applaud the contributors to this project and do hope we see more viable shell alternatives. I do suggest, though, that the authors spend some time with fish, as it comes with a lot of good stuff right out of the box whose concepts I’d like to see incorporated elsewhere.

                                                      1. 1

                                                        I’m not sure what you mean with respect to fish. As far as I know, fish doesn’t run sh or bash scripts itself. You can of course invoke bash from fish, and you can invoke bash from osh too.

                                                        The first and last FAQ implicitly address your question about fish. Fish is “friendly interactive shell”. OSH is more concerned about the language for now, but I believe that will lead to a good interactive shell later. See the answers for details.

                                                        Thanks for trying OSH. I made a note of the Ctrl-C issue here:

                                                        https://github.com/oilshell/oil/issues/36

                                                        Also filed:

                                                        https://github.com/oilshell/oil/issues/69

                                                        I have looked a bit at fish, and in particular I like how they parse man pages for completion. I will probably steal that Python script! But that won’t happen for a while, since I’m focused on the language.

                                                        At the very least, as I mention, OSH/Oil needs real functions so you don’t need to mutate globals to write completions!

                                                      1. 18

                                                        Some feedback:

                                                        I have seen many oil shell posts, but still don’t know what the heck the actual OIL language looks like.

                                                        1. 4

                                                          OK thanks, maybe I should link these at the very top of the “A Glimpse of Oil”:

                                                          http://www.oilshell.org/blog/tags.html?tag=osh-to-oil#osh-to-oil

                                                          They are linked somewhere in the middle, which is probably easy to miss.

                                                          It’s sort of on purpose, since Oil isn’t implemented yet, as I mention in the intro. But I think those posts give a decent idea of what it looks like (let me know if you disagree).

                                                          1. 7

                                                            I’ve seen your posts and hat around and never really understood what Oil was really about, but this link is really wonderful. The comparison to shell, the simplifications, the 4 different languages of shell vs the two of Oil, it all really clicked. Really cool project.

                                                            1. 3

                                                              I agree with the others. Until I see what’s your vision for the language, I’m not motivated to get involved.

                                                              The only example you give contains “if test -z $[which mke2fs]”, which can’t be what you’re aiming at.

                                                              IMHO, if you really want Oil to be easy to use, you should take as much syntax from Python or JavaScript as you can. And use similar semantics too.

                                                              1. 11

                                                                I’m willing to be convinced that a new syntax would be better for shell programming.

                                                                I’m not very confident that moving towards an existing non-shell scripting language will get us there.

                                                                The problem I have with writing shell programs in some non-shell language is that I expect to keep using the same syntax on the command line as I do in scripts I save to disk, and non-shell languages don’t have the things that make that pleasant. For example, a non-shell language has a fixed list of “words” it knows about, and using anything not on that list is a syntax error. That’s great in Python, where such a word is almost certainly a spelling error, but in a shell, most words are program names and I don’t want my shell constantly groveling through every directory in my $PATH so it knows all my program names before I try to use them.

                                                                I’ve also never seen a non-shell language of any type with piping and command substitution as elegant as bash and zsh, but I’m willing to be convinced. I’m afraid, though, anyone in the “Real Language” mindset would make constructions such as diff <(./prog1 -a -b) <(./prog1 -a -c) substantially more verbose, losing one of the main reasons we have powerful shells to begin with.
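
                                                                For a sense of the verbosity in question, here is a rough Python counterpart to the `diff <(...) <(...)` one-liner. It is only a sketch: `diff_outputs` is a made-up helper, the two commands are stand-ins for `./prog1 -a -b` and `./prog1 -a -c`, and it buffers both outputs in memory rather than streaming them through file descriptors as process substitution does.

```python
import difflib
import subprocess
import sys

def diff_outputs(cmd_a, cmd_b):
    """Run two commands and return a unified diff of their stdout, as lines."""
    out_a = subprocess.run(cmd_a, capture_output=True, text=True).stdout
    out_b = subprocess.run(cmd_b, capture_output=True, text=True).stdout
    return list(difflib.unified_diff(
        out_a.splitlines(), out_b.splitlines(),
        fromfile="a", tofile="b", lineterm=""))

# Hypothetical stand-ins for the two program invocations, so this runs anywhere.
cmd_b = [sys.executable, "-c", "print('same'); print('flag b')"]
cmd_c = [sys.executable, "-c", "print('same'); print('flag c')"]

for line in diff_outputs(cmd_b, cmd_c):
    print(line)
```

                                                                Even in this compact form, what is a single short line in bash or zsh takes a helper function plus a call.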

                                                                1. 3

                                                                  Yes it has to be a hybrid. I talk a little about “command vs expression” mode in the post. I guess you’ll have to wait and see, but I’m aware of this and it’s very much a central design issue.

                                                                  Of course “bare words” behave in Oil just as they do in bash, e.g.

                                                                  echo hi
                                                                  ls /
                                                                  

                                                                  I will not make you type

                                                                  run(["echo", "hi"])
                                                                  

                                                                  :-)

                                                                  One of the reasons I reimplemented bash from scratch is to be aware of all the syntactic issues. Process substitution should continue to work. In fact I’ve been contemplating this “one line” rule / sublanguage – that is, essentially anything that is one line in shell will continue to work.

                                                                  Also, OSH and Oil will likely be composed, and OSH already implements the syntax you are familiar with. This is future work so I don’t want to promise anything specific, but I think it’s possible to get the best of both worlds – familiar syntax for interactive use and clean syntax for maintainable programs.

                                                                  1. 1

                                                                    For example, a non-shell language has a fixed list of “words” it knows about, and using anything not on that list is a syntax error. That’s great in Python, where such a word is almost certainly a spelling error, but in a shell, most words are program names and I don’t want my shell constantly groveling through every directory in my $PATH so it knows all my program names before I try to use them.

                                                                    tclsh is an interesting example of not having this problem.

                                                                    I’m afraid, though, anyone in the “Real Language” mindset would make constructions such as diff <(./prog1 -a -b) <(./prog1 -a -c) substantially more verbose, losing one of the main reasons we have powerful shells to begin with.

                                                                    You get constructs that look like this in pipey libraries for functional languages (the likes of fs2 or conduit), though they’re controversial.

                                                                    1. 1

                                                                      Well put. There’s also loads of muscle memory built up that is hard to leave behind. That point keeps me off of fish; I like almost everything else about it, but I don’t see why it can’t have a separate bash-like syntax.

                                                                    2. 2

                                                                      OK that’s fair. I’m on the fence about outside contributions – some people have contributed, but I think most projects have more users and more “bones” before getting major contributions. I’m really looking for people to test OSH on real shell scripts, not necessarily adopt it or contribute. (although if you can figure out the code, I applaud you and encourage your contributions :) )

                                                                      As I mention in the post, the OSH language is implemented (it runs real shell scripts), but Oil isn’t.

                                                                      There will be a different way to test if a string is empty, but for the auto-conversions, if you have [ -z foo ], it will become test -z foo. The auto-conversion is going to make your script RUN, not make it idiomatic.

                                                                      As far as appearance, you can definitely think of Oil as a hybrid between shell and Python/JavaScript.

                                                                      I can probably write up a cheatsheet for those curious. I haven’t really done so because it feels like promising something that’s not there. But since I’ve written so many blog posts, it might be worth showing something in the style of:

                                                                      https://learnxinyminutes.com/docs/bash/

                                                                  2. 0

                                                                    Yes and I don’t think I’ll care about it until I do. It could look like APL for all we know.

                                                                  1. 3

                                                                    I’ve been seeing more and more about Oil Shell, but only now, after having read this, do I have to say that I have found actual interest in the project. Until now, you only had a pretty website, but ideas like these really intrigue me.

                                                                    But seeing that you’ve engaged with many different shell implementations, have you taken a look at stuff like Plan 9’s rc or related tools like mk? While I personally haven’t engaged too much with either, I know there are people who certainly see these as superior, and I’m guessing that there has to be something to it?

                                                                    1. 3

                                                                      Thanks for the feedback!

                                                                      I’ve read the rc and mk papers, and discussed them with a few people on /r/oilshell [1]. This comment has an analysis of mk. As expected, there are good things and bad things about it:

                                                                      https://lobste.rs/s/azldgm/oil_comments_about_shell_awk_make#c_rat8kv

                                                                      It’s impressive given its age, but I think it has been surpassed. Ditto for rc. Although rc and es certainly are cleaner and smaller than bash, bash can do everything they can. As I recall, both of them were under 10,000 lines of C code, and there’s only so much you can do in that space.

                                                                      More links here: https://github.com/oilshell/oil/wiki/ExternalResources

                                                                      [1] https://www.reddit.com/r/oilshell/

                                                                    1. 8

                                                                      Thank you for both the Oil project and this post. This is definitely the explanation I will point people to.

                                                                      I haven’t adopted Oil yet myself, and probably won’t until at least 1.0. I’ve tried zsh, fish, and xonsh, and have nice things to say about them all… but so far I always keep setting my login shell back to bash on linux, because there are just too many other people’s scripts for me to deal with. The net semantic complexity of $NEAT_NEW_SHELL plus that of $CRANKY_OLD_SHELL is always greater than the latter alone, so I find myself stuck with bash despite its irritations. It’s apparently just another one of these insoluble collective action problems.

                                                                      The embrace, extend, (eventually) extinguish approach that source translation enables is the only one I can endorse for having a hope of success in such an entrenched, messy, decentralized context as the Unix diaspora. There’s an important lesson here, and I hope similar projects take note.

                                                                      1. 8

                                                                        but so far I always keep setting my login shell back to bash on linux, because there are just too many other people’s scripts for me to deal with

                                                                        What does this have to do with the shell that you run? I run fish and that is no obstacle to running programs written in any other language, including bash.

                                                                        1. 6

                                                                          It’s not just the shell I run, it’s the shell “all the things” expect. I can easily avoid editing C++ or ruby source (to pick a couple of random examples) but, in my job at least, I can’t avoid working with bash. I can’t replace it, and I need to actually understand how it works.

                                                                          Of course, other people with other jobs, or those who have long since attained fluency in bash, may have better luck avoiding it in their personal environment. I can’t, because I have to learn it, ugly corners and all. I’d be happy to stick with fish, it’s just not a realistic option for me right now. My observation is that, for my current needs, two shells are worse than one.

                                                                          1. 3

                                                                            I’ve used fish for years now. Whenever I need to run a bash script I just run bash script.sh. The smallest hurdle I have to deal with is the small mental effort I have to make translating bash commands to fish equivalents when copying bash one liners directly into the shell.

                                                                            1. 3

                                                                              I don’t understand what working with bash scripts has to do with the shell that you run, though. Just because you run Python programs doesn’t mean your shell has to be a Python REPL; these things are separate. In the case you’re referring to, it sounds like bash is just a programming language like Ruby or Python.

                                                                          2. 2

                                                                            Thanks, yes I didn’t explicitly say “embrace and extend”, since that has a pretty negative Microsoft connotation :)

                                                                            But that’s the idea, and that’s how technology and software evolve. And that is how bash itself “won”! It implemented the features of every shell, including all the bells and whistles of the most popular shell at the time – AT&T ksh.

                                                                            Software, and programming languages in particular, have heavy lock-in / network effects. I mean, look at C and C++. There’s STILL probably 100x more C and C++ written every single day than Go and Rust combined, not even counting the 4 decades of legacy!

                                                                            It does seem to me that a lot of programmers don’t understand this. I suppose that this was imprinted on my consciousness soon after I got my first job, by reading Joel Spolsky’s blog:

                                                                            https://www.joelonsoftware.com/2004/06/13/how-microsoft-lost-the-api-war/

                                                                            There are two opposing forces inside Microsoft, which I will refer to, somewhat tongue-in-cheek, as The Raymond Chen Camp and The MSDN Magazine Camp.

                                                                            The most impressive things to read on Raymond’s weblog are the stories of the incredible efforts the Windows team has made over the years to support backwards compatibility:

                                                                            This was not an unusual case. The Windows testing team is huge and one of their most important responsibilities is guaranteeing that everyone can safely upgrade their operating system, no matter what applications they have installed, and those applications will continue to run, even if those applications do bad things or use undocumented

                                                                            This is a good post, but there are others that talk about the importance of compatibility, like the “never rewrite” post (although ironically I’m breaking that rule :) )

                                                                            https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/

                                                                            Another example of this that people may not understand is that Clang implemented GCC’s flags bug-for-bug! GCC has an enormous number of flags! The Linux kernel uses every corner of GCC, and I think even now Clang is still catching up.

                                                                            Building the kernel with Clang : https://lwn.net/Articles/734071/

                                                                          1. 11

                                                                            Thank you for the wonderful comments last week.

                                                                            I wrote an Earley parser. And a Pratt parser. The Pratt parser is what I’ve been looking for all this time: a modular recursive descent parser. What it lacks in formalism it makes up with in brevity and power-to-weight.
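
                                                                            For anyone who hasn’t met the technique, a minimal Pratt-style expression parser might look like the sketch below. This is illustrative Python, not the parser described above; the token set, the binding powers, and the tuple-based tree are all assumptions made for the example.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/^()])")

def tokenize(text):
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

# Binding powers as (left, right). A right power below the left power
# makes the operator right-associative (here: exponentiation).
INFIX = {"+": (10, 11), "-": (10, 11), "*": (20, 21), "/": (20, 21), "^": (30, 29)}
PREFIX_MINUS = 25  # binds tighter than * and /, looser than ^

def parse(tokens, min_bp=0):
    token = tokens.pop(0)
    if token == "(":
        left = parse(tokens, 0)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
    elif token == "-":
        left = ("neg", parse(tokens, PREFIX_MINUS))
    elif token.isdigit():
        left = int(token)
    else:
        raise SyntaxError(f"unexpected token {token!r}")

    # Keep absorbing infix operators that bind at least as tightly as min_bp.
    while tokens and tokens[0] in INFIX:
        left_bp, right_bp = INFIX[tokens[0]]
        if left_bp < min_bp:
            break
        op = tokens.pop(0)
        left = (op, left, parse(tokens, right_bp))
    return left

def parse_expression(text):
    tokens = tokenize(text)
    tree = parse(tokens)
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return tree

print(parse_expression("1 + 2 * 3"))   # ('+', 1, ('*', 2, 3))
print(parse_expression("2 ^ 3 ^ 2"))   # ('^', 2, ('^', 3, 2))
```

                                                                            The modularity shows in the `INFIX` table: supporting another binary operator is one more entry, with no change to the parsing loop.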

                                                                            Now, I need to choose a host language. I’d like to pick Rust, but I’m not sure it has a ready-made GC solution right now, and I don’t want to go down that rabbit hole. That leaves C++, JVM, or OTP. Any thoughts?

                                                                            1. 3

                                                                              What kind of language are you looking to interpret/execute? The three platforms you mention all have really different tradeoffs.

                                                                              1. 3

                                                                                A Lisp-esque language under the hood with a non-Lisp syntax on top. Idea is the functional paradigm can subsume the other two big paradigms (imperative/logic). Can use the CEK machine for proper tail call handling, so that isn’t a requirement of the host. Big thing I’m looking for is a GC (whether lib or built-in) and a language I like that I can target it with.
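
As a rough illustration of why a CEK machine removes the tail-call requirement from the host, here is a minimal call-by-value sketch in Rust for a tiny lambda calculus with integer literals. The term language and every name in it (`Term`, `Value`, `Env`, `Kont`, `run`) are assumptions for the example, not the design being described. The machine is a plain loop over (Control, Environment, Kontinuation) states, so the host stack stays flat, and applying a closure reuses the pending continuation instead of pushing a new frame.

```rust
use std::rc::Rc;

// Control: the term under evaluation (hypothetical mini-language).
enum Term {
    Num(i64),
    Var(&'static str),
    Lam(&'static str, Rc<Term>),
    App(Rc<Term>, Rc<Term>),
}

#[derive(Clone)]
enum Value {
    Num(i64),
    Closure(&'static str, Rc<Term>, Env),
}

// Environment: a persistent linked list of bindings.
#[derive(Clone)]
struct Env(Option<Rc<Binding>>);

struct Binding { name: &'static str, value: Value, rest: Env }

impl Env {
    fn lookup(&self, name: &str) -> Option<Value> {
        let mut cur = self;
        while let Some(b) = &cur.0 {
            if b.name == name { return Some(b.value.clone()); }
            cur = &b.rest;
        }
        None
    }

    fn extend(&self, name: &'static str, value: Value) -> Env {
        Env(Some(Rc::new(Binding { name, value, rest: self.clone() })))
    }
}

// Kontinuation: what remains to be done, kept on the heap.
enum Kont {
    Done,
    Arg(Rc<Term>, Env, Box<Kont>), // function being evaluated, argument pending
    Fun(Value, Box<Kont>),         // function known, argument being evaluated
}

enum State {
    Eval(Rc<Term>, Env, Kont),
    Apply(Kont, Value),
}

fn run(term: Rc<Term>) -> Result<Value, String> {
    let mut state = State::Eval(term, Env(None), Kont::Done);
    loop {
        state = match state {
            State::Eval(t, env, k) => match &*t {
                Term::Num(n) => State::Apply(k, Value::Num(*n)),
                Term::Var(x) => {
                    let v = env
                        .lookup(x)
                        .ok_or_else(|| format!("unbound variable {}", x))?;
                    State::Apply(k, v)
                }
                Term::Lam(x, body) => {
                    State::Apply(k, Value::Closure(*x, Rc::clone(body), env))
                }
                Term::App(f, a) => State::Eval(
                    Rc::clone(f),
                    env.clone(),
                    Kont::Arg(Rc::clone(a), env, Box::new(k)),
                ),
            },
            State::Apply(k, v) => match k {
                Kont::Done => return Ok(v),
                Kont::Arg(a, env, next) => State::Eval(a, env, Kont::Fun(v, next)),
                Kont::Fun(f, next) => match f {
                    // Entering the body reuses `next` as-is: no new frame.
                    Value::Closure(x, body, cenv) => {
                        State::Eval(body, cenv.extend(x, v), *next)
                    }
                    Value::Num(_) => return Err("cannot apply a number".to_string()),
                },
            },
        };
    }
}

// Small constructors to keep example programs readable.
fn lam(x: &'static str, body: Term) -> Term { Term::Lam(x, Rc::new(body)) }
fn app(f: Term, a: Term) -> Term { Term::App(Rc::new(f), Rc::new(a)) }
```

Note that the environments here are reference-counted, so this sketch sidesteps the GC question rather than answering it; a real implementation would still need a collector or a cycle strategy once recursive bindings are added.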

                                                                              2. 2

                                                                                For Rust, you can wrap everything in an Rc, or if you have multiple threads an Arc, or if you want tracing GC you can use this, or if you just need epoch-style reclamation there’s crossbeam-epoch, or if you just need hazard pointers there’s conc. I’ve had a lot of success with crossbeam-epoch in lock-free systems I’ve built.

                                                                                1. 1

                                                                                  Rc (and friends) would need cycle detection, no? Maybe the thing to do is just use Rc and do research on cycle-detection algorithms to see if they are hard or not.

                                                                                  I looked at Epoch and hazard pointers and wasn’t sure if they were ok as a general GC. I need to do more reading. Thanks!

                                                                                  1. 2

                                                                                    Yeah, you can create memory leaks with Rc cycles in rust. But this is rarely an issue in most use cases. Rust memory can feel a little confusing at first, but cycles tend not to come up once you learn some different idioms for structuring things in non-cyclical ways.
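
One common idiom for avoiding Rc cycles is to make back-references `Weak`, so only one direction of a relationship owns. A minimal sketch, assuming a hypothetical parent/child `Node` type (the type and the `make_family` helper are invented for illustration):

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical tree node: children are owned (Rc), the parent link is not (Weak).
struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn new_node(name: &str) -> Rc<Node> {
    Rc::new(Node {
        name: name.to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

// Build a two-node family with links in both directions.
fn make_family() -> (Rc<Node>, Rc<Node>) {
    let parent = new_node("parent");
    let child = new_node("child");
    // The back-reference is weak, so it does not keep the parent alive.
    *child.parent.borrow_mut() = Rc::downgrade(&parent);
    // The forward reference is strong: the parent owns its children.
    parent.children.borrow_mut().push(Rc::clone(&child));
    (parent, child)
}

// Name of a node's parent, if the parent is still alive.
fn parent_name(node: &Node) -> Option<String> {
    node.parent.borrow().upgrade().map(|p| p.name.clone())
}
```

Dropping the last strong handle to the parent frees it even though the child still points back at it; after that, `parent_name` on the child returns `None`.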

                                                                                    For example, if you want to build a DAG, you can quickly implement it with a HashMap from ID to Node, where ID is some monotonic counter that you maintain. Each Node can contain Vecs of incoming and outgoing edges. You can implement your own Rc-like thing that tracks the sum of indegree and outdegree, and when it reaches 0, you just remove the Node from the containing HashMap.

                                                                                    For the cases where performance or concurrency concerns rule out this approach (which are rare and should not be pursued until this is measured to be a bottleneck), you can always write Rust like C with unsafe pointers: Box::into_raw, dereferencing inside unsafe blocks, and freeing by calling Box::from_raw (you can call drop() on the result if you want to be explicit about what’s happening, but it will be dropped implicitly when it goes out of scope).

                                                                                    Use mutexes on shared state until… basically always, but if you REALLY want to go lock-free, that’s when you can benefit from things like crossbeam-epoch to handle freeing of memory that has been detached from mutable shared state but may still be in use by another thread.
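
The HashMap-based DAG idea can be sketched roughly like this. All names (`Graph`, `Node`, `NodeId`, `add_edge`, `remove_edge`) are made up for the example, and two simplifying assumptions are baked in: a fresh node with no edges is kept until an edge removal brings its degree back to 0, and nothing checks that the edges actually stay acyclic.

```rust
use std::collections::HashMap;

type NodeId = u64;

struct Node<T> {
    value: T,
    incoming: Vec<NodeId>,
    outgoing: Vec<NodeId>,
}

impl<T> Node<T> {
    // The "reference count": indegree plus outdegree.
    fn degree(&self) -> usize {
        self.incoming.len() + self.outgoing.len()
    }
}

struct Graph<T> {
    nodes: HashMap<NodeId, Node<T>>,
    next_id: NodeId, // monotonic counter, never reused
}

impl<T> Graph<T> {
    fn new() -> Self {
        Graph { nodes: HashMap::new(), next_id: 0 }
    }

    fn len(&self) -> usize {
        self.nodes.len()
    }

    fn value(&self, id: NodeId) -> Option<&T> {
        self.nodes.get(&id).map(|n| &n.value)
    }

    fn add_node(&mut self, value: T) -> NodeId {
        let id = self.next_id;
        self.next_id += 1;
        self.nodes.insert(id, Node { value, incoming: Vec::new(), outgoing: Vec::new() });
        id
    }

    // Returns false if either endpoint is missing or the edge is a self-loop.
    fn add_edge(&mut self, from: NodeId, to: NodeId) -> bool {
        if from == to || !self.nodes.contains_key(&from) || !self.nodes.contains_key(&to) {
            return false;
        }
        self.nodes.get_mut(&from).unwrap().outgoing.push(to);
        self.nodes.get_mut(&to).unwrap().incoming.push(from);
        true
    }

    // Removes one edge, then reclaims any endpoint whose degree dropped to 0.
    fn remove_edge(&mut self, from: NodeId, to: NodeId) -> bool {
        let removed = match self.nodes.get_mut(&from) {
            Some(n) => match n.outgoing.iter().position(|&id| id == to) {
                Some(i) => {
                    n.outgoing.swap_remove(i);
                    true
                }
                None => false,
            },
            None => false,
        };
        if !removed {
            return false;
        }
        if let Some(n) = self.nodes.get_mut(&to) {
            if let Some(i) = n.incoming.iter().position(|&id| id == from) {
                n.incoming.swap_remove(i);
            }
        }
        for &id in [from, to].iter() {
            let unreferenced = self.nodes.get(&id).map_or(false, |n| n.degree() == 0);
            if unreferenced {
                self.nodes.remove(&id);
            }
        }
        true
    }
}
```

Because nodes refer to each other by ID rather than by pointer, the borrow checker never sees a cycle of references, and all ownership stays with the one HashMap.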

                                                                                    Feel free to shoot me an email if you’re curious about how something can be done in Rust! I know it can be overwhelming when you’re starting to build things in it, and I’m happy to help newcomers get past the things I banged my head against the wall for days trying to learn :)

                                                                                2. 2

                                                                                  FWIW, many languages written in C or C++ use arenas to hold the nodes that result from parsing. For example, CPython uses this strategy. I’m pretty sure v8 does too. So you don’t manage each node individually, which is a large load on the memory allocator/garbage collector – you put them all in a big arena and then free them at once.
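
In Rust the same arena idea is often approximated with a Vec and integer indices. The sketch below is an assumption-laden illustration (the `Arena`, `Expr`, and `NodeId` names are invented, and this is not how CPython or v8 implement theirs): nodes refer to each other by index, and dropping the arena frees every node in one go.

```rust
// Index into the arena; stands in for a pointer to a node.
#[derive(Debug, Clone, Copy, PartialEq)]
struct NodeId(usize);

// A tiny expression AST whose children are arena indices.
#[derive(Debug)]
enum Expr {
    Num(i64),
    Add(NodeId, NodeId),
    Mul(NodeId, NodeId),
}

struct Arena {
    nodes: Vec<Expr>,
}

impl Arena {
    fn new() -> Self {
        Arena { nodes: Vec::new() }
    }

    // Allocation is just a push; no per-node free ever happens.
    fn alloc(&mut self, e: Expr) -> NodeId {
        self.nodes.push(e);
        NodeId(self.nodes.len() - 1)
    }

    fn len(&self) -> usize {
        self.nodes.len()
    }

    // Walk the tree by following indices.
    fn eval(&self, id: NodeId) -> i64 {
        match &self.nodes[id.0] {
            Expr::Num(n) => *n,
            Expr::Add(a, b) => self.eval(*a) + self.eval(*b),
            Expr::Mul(a, b) => self.eval(*a) * self.eval(*b),
        }
    }
}

// Builds (2 + 3) * 4 inside the given arena and returns the root.
fn build_example(arena: &mut Arena) -> NodeId {
    let two = arena.alloc(Expr::Num(2));
    let three = arena.alloc(Expr::Num(3));
    let sum = arena.alloc(Expr::Add(two, three));
    let four = arena.alloc(Expr::Num(4));
    arena.alloc(Expr::Mul(sum, four))
}
```

When the `Arena` value goes out of scope, the single Vec is deallocated and all nodes go with it, which is the "free them at once" part.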

                                                                                  1. 2

                                                                                    Save the earth, use C++ or OTP

                                                                                    1. 1

                                                                                      You also have Go and .NET Core as possible host runtimes.

                                                                                      1. 1

                                                                                        What about Nim? It seems to be a memory-safe language with low-latency GC and macros, and it produces C. I mean, the Schemes are ideal if you’re building a language with a Lisp underneath, since they start that way.

                                                                                      1. 9

                                                                                        Hm I’ve seen the original Python 0.1 sources posted to Usenet, but I hadn’t seen this Python 1.0.0 announcement.

                                                                                        After dissecting the Python interpreter for Oil, I have more of an appreciation for how amazing an achievement it is! I read the Python 0.1 sources, and they’re a significant amount of work that I doubt many people (including myself) could replicate now in the same period of time, even with 25 years of “progress”.

                                                                                        I think there is a meme that the Python interpreter is kinda crappy because it doesn’t support concurrency well, the object representation is a bit bloated, etc.

                                                                                        But interpreters are pretty large programs and require you to do multiple things well. They require more than one skill, and you have to be able to write a certain volume of code. For a while it seemed like anybody could write a little interpreter, and maybe even make it fast and concurrent. That might be true, but you might fail to write a good API or good data structures (and I think Python has the most powerful and convenient data structures of almost any language).

                                                                                        There’s a world of difference between a toy interpreter which you can run some benchmarks on, and one that people can use for real problems. (e.g. debugging support is already in Python 1.0.0)


                                                                                        It’s also interesting that Python’s syntax vs. Perl and Bourne shell are what Guido chose as headlines! That makes sense, and I think shell still needs to be fixed 25 years later :)

                                                                                        I made the observation elsewhere that Python 3 and Perl 6 are both worse shell replacements than their predecessors. They both migrated more into the space of “applications” rather than “shell scripts”, so I think there is a need for a new shell.

                                                                                        It’s also amazing to me that Perl was so much more popular than Python for 15+ years, and then the Perl interpreter “topped out” – it was no longer possible to add significant and important features to it. Python’s codebase has aged a lot better than Perl’s.

                                                                                        1. 1

                                                                                          Do you have a link for the 0.1 sources that got posted to usenet, or should I look into the git repository?

                                                                                          1. 1

                                                                                            Ah, I misremembered and it’s Python 0.9 from February 1991:

                                                                                            https://www.python.org/download/releases/early/

                                                                                            As far as I understand that is not long after Guido started Python (less than 2 years?), but it looks like 27K lines of C and 13K lines of Python. From what I can tell, most of it made it to the present day!

                                                                                        1. 7

                                                                                          FWIW when reading papers I have often come across the work of Mary Shaw, and liked it very much:

                                                                                          https://en.wikipedia.org/wiki/Mary_Shaw_(computer_scientist)