Oh, thanks for the interview with Bourne, that adds considerable context to my recent writing. In exchange, I offer the explanation of why Bell Labs got a miserable PDP rather than a powerful machine. Unix is partly Hamming’s fault.
I went in to Ed David’s office and said, “Look Ed, you’ve got to give your researchers a machine. If you give them a great big machine, we’ll be back in the same trouble we were before, so busy keeping it going we can’t think. Give them the smallest machine you can because they are very able people. They will learn how to do things on a small machine instead of mass computing.” As far as I’m concerned, that’s how UNIX arose. We gave them a moderately small machine and they decided to make it do great things. They had to come up with a system to do it on. It is called UNIX!
Ha, constraints breed creativity :) I think there’s something to that and the inefficiency of the modern cloud.
“Software is a gas; it expands to fill its container”, and the modern cloud is basically a container of unbounded size (since new capacity is being built as fast as people will pay for it).
It’s possible to require quotes around arguments like 'with spaces and | operator' without splitting unquoted substitutions like $var.
I can imagine that splitting unquoted substitutions was the sensible thing to do at the time. If you want to run a bunch of different commands with the same options (say, compiler include path flags or preprocessor defines), sticking them in a variable so you can say mycommand $COMMONARGS foo bar seems like a good idea. If you really want it to be expanded as a single variable you can quote it, so both behaviours are available. Sure, people might shoot themselves in the foot occasionally, but I don’t think the implementors of Unix had ergonomics on their mind.
The “proper” solution (which I believe Oil implements) is to support array variables, and provide syntax to splice an array into a command-line (if that’s what they want), but pervasive array support is more complicated than just tweaking the variable expansion rules, and Worse is Better, so…
Yeah I think they could have just changed the default though, so echo $var would not “molest” the variable, and echo @var would or something like that. You could still have string-only variables, quoting of operator characters like 'a;b', and word splitting off by default!
But yes Oil uses arrays so that the strings can contain any characters (and because everyone knows how to use arrays in Python, JS, Ruby, etc.). With the string-only solution you always have to think about the split character.
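To make the three options in this sub-thread concrete, here is a toy model in Python. It is only a sketch of the semantics being argued about, not how any real shell (or Oil) is implemented, and the function names are made up for illustration.

```python
# Toy model of three expansion rules. Not a real shell implementation;
# all names here are hypothetical.

def expand_split(value):
    """Bourne-style unquoted $var: whitespace splits the value into words."""
    return value.split()

def expand_whole(value):
    """Alternative default: $var is always exactly one argument."""
    return [value]

def splice(array):
    """Array splice: one argument per element, whatever characters it holds."""
    return list(array)

flags_string = "-I include -D NAME=a b"
flags_array = ["-I", "include", "-D", "NAME=a b"]

argv_split = ["cc"] + expand_split(flags_string) + ["foo.c"]
argv_whole = ["cc"] + expand_whole(flags_string) + ["foo.c"]
argv_array = ["cc"] + splice(flags_array) + ["foo.c"]

print(argv_split)  # the value "NAME=a b" is broken into two words
print(argv_whole)  # all flags arrive as a single argument
print(argv_array)  # four flags, embedded space preserved
```

With the string-only variants you have to pick between losing the flag boundaries and breaking values that contain the split character; the array version avoids both.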
I think there is a parallel between UNIX and network protocols. UNIX exposes byte streams and programs use that to communicate their specific information. Transport layer protocols also expose byte streams, either delimited (UDP) or undelimited (TCP), and application layer protocols use them to communicate application-specific information.
If you squint, two UNIX programs communicating via pipes is not that different from two computers talking HTTP over the network. You can even bridge UNIX pipes with a TCP or UDP connection with nc.
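A small sketch of the "bytes underneath, application convention on top" idea, using an in-process pipe. The tab-separated record format is an assumption chosen for the example; the pipe itself knows nothing about it.

```python
import os

def through_pipe(payload):
    """Push bytes into a pipe and read them back out, unchanged."""
    read_fd, write_fd = os.pipe()
    # A small payload fits in the kernel pipe buffer, so writing before
    # reading does not block here. A real producer would be another process.
    os.write(write_fd, payload)
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:  # an empty read means the write end is closed
            break
        chunks.append(chunk)
    os.close(read_fd)
    return b"".join(chunks)

def parse_rows(raw):
    """Application-level convention layered on top: tab-separated lines."""
    return [line.split("\t") for line in raw.decode("utf-8").splitlines()]

raw = through_pipe(b"alice\t3\nbob\t5\n")
rows = parse_rows(raw)
print(rows)
```

Swapping the pipe for a TCP socket would change `through_pipe` but not `parse_rows`, which is the parallel being drawn.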
IMO the Perlis-Thompson principle is a formulation of what layered software architecture looks like: build an application-agnostic, universal lower layer first, and place your domain-specific stuff above that layer.
For operating system and networking protocols, a sequence of bytes is a really good choice for the data structure of that lower layer; there is literally nothing that is more application-agnostic and more universal; that is why UNIX and the TCP/IP stack have succeeded. Generalizing this to higher-level domains is much harder, although there are some less straightforward parallels - intermediate languages in compiler design, VMs for runtime design, etc.
Yes exactly, that is the idea of the “narrow waist” or “hourglass model”, briefly mentioned in the post, which I hope to elaborate on. It probably needs a wiki page at this point. As far as I can see it’s a special case of the Perlis-Thompson principle – probably the most important one, as it governs the evolution of the very biggest and longest lived systems in software.
I noted the TCP/IP, LLVM, and Unix syscall examples in a comment below the “Unix text as narrow waist” comment linked in the post:
Someone else brought up Pandoc as having the same architecture. Instead of writing O(N^2) translators between document formats, you design a common IR and write O(N) translators to the IR and O(N) translators from the IR.
TCP/IP was explicitly designed as a narrow waist or “hourglass model”. It’s also called “the distinguished layer” in network architecture. I would define it as the “lowest common denominator”. In the old days, you would have an e-mail app that works over wired networks, and another e-mail app that works over wireless. That is not economical – you want e-mail, chat, web, etc. to go over TCP/IP, and TCP/IP works on both wired and wireless. This is not optimal, but it’s economical.
In the Kubernetes-Multics thread on Hacker News (in a reply to @brandonbloom who also wrote about O(M * N) problems recently), I said I traced the idea to Kleinrock via Eric Brewer, but I can’t tell if this paper was ever published and what the date was.
https://news.ycombinator.com/item?id=27914632 (If anyone has a more original reference for the narrow waist / hourglass model than Kleinrock, let me know, I will credit you in the blog posts)
Kleinrock was one of the inventors of packet switching. (Which reminds me that I think I first heard of the idea of the narrow waist from a talk about Named Data Networking by Van Jacobson at Google over 10 years ago. Van Jacobson being another early Internet architect.)
All of these are the same idea in software architecture, but people in different subfields use different words for it. In EE textbooks on network engineering and CS textbooks on compiler architecture they will both refer to this “hourglass” architecture. It’s a “standard” engineering concept, although for some reason many programmers seem to be unaware of it, and are content to code the area rather than the perimeter :) That is, write O(M * N) amounts of code rather than O(M + N). @brandonbloom has a good example with microservices in the cloud.
Chris Lattner said that LLVM IR is explicitly a narrow waist, and that comes straight from compiler textbooks. On the one side you have M languages, and on the other side you have N CPU architectures. Instead of writing O(M * N) separate compilers, you write O(M) front ends and O(N) back ends, for O(M + N) total work.
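The M + N accounting can be sketched with a toy converter. The three formats and the pair-list IR below are assumptions picked to keep the example tiny; they are not Pandoc's or LLVM's actual design.

```python
import json

# The "IR" is a list of (key, value) string pairs. Hypothetical, for illustration.

def read_json(text):
    return [(key, str(value)) for key, value in json.loads(text).items()]

def read_kv(text):
    return [tuple(line.split("=", 1)) for line in text.splitlines() if line]

def write_json(pairs):
    return json.dumps(dict(pairs), sort_keys=True)

def write_kv(pairs):
    return "\n".join("%s=%s" % pair for pair in pairs)

def write_tsv(pairs):
    return "\n".join("%s\t%s" % pair for pair in pairs)

READERS = {"json": read_json, "kv": read_kv}                      # M front ends
WRITERS = {"json": write_json, "kv": write_kv, "tsv": write_tsv}  # N back ends

def convert(text, src, dst):
    """Every (src, dst) pair goes through the same IR."""
    return WRITERS[dst](READERS[src](text))

routes = len(READERS) * len(WRITERS)     # conversions available
functions = len(READERS) + len(WRITERS)  # code actually written
print(routes, functions)
print(convert('{"a": 1, "b": 2}', "json", "kv"))
```

Adding a fourth output format costs one more writer, and every existing reader can immediately target it.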
You can never design a perfect narrow waist – it’s a compromise by definition. I’ve seen some good lists of design mistakes in LLVM IR, where a certain change would probably make life easier for everyone. But LLVM works well despite not being perfect.
Likewise, TCP/IP is also inefficient for some applications, Unix text processing can be cumbersome, and the POSIX file system API doesn’t fit all apps, but that doesn’t mean they are bad. They are necessary compromises for the overall structure of the system. You can have “out of band” optimizations, but you still want a coherent architecture overall.
You want to avoid writing O(M * N) code, and I think that’s exactly what Kubernetes has failed to do. And also Borg and Google in general. See the “Unix vs. Google” video that Brandon and I both reference: https://www.youtube.com/watch?v=3Ea3pkTCYx4 . I claim that Google execs actually started to notice this, but failed to fix it with the ~2012 pivot to the cloud. The “system design interview” concept started around this time, because they (correctly) thought that engineers were designing bad systems.
I have a feeling that in the networking and compiler case it was more explicitly designed than in operating systems / shells, but I’m not sure. Anyway I need to make a wiki page and I’ll be glad to have input from others :)
I’m curious about how one would design a Kubernetes alternative though. IMO a large part of its complexity comes from the constraint of having to integrate a lot of existing technologies: the Linux kernel, TCP/IP, DNS (or whatever else it uses for service discovery), runtimes of popular programming languages, etc. When you have to fit many heterogeneous technologies it’s very hard to come up with a coherent design without breaking backward compatibility in some components - which will be suicide in terms of adoption.
Fitting TCP/IP into UNIX got us BSD sockets, and fitting GUI into UNIX got us X. Both of them are more complicated and less flexible than Plan 9’s design, but you can take your existing UNIX program and put BSD socket and X support in it. Porting it to Plan 9 is a rewrite. You think that Kubernetes is cloud computing’s Multics, but I fear that it is actually cloud computing’s BSD sockets and X, and a radically simpler system will go down the same route as Plan 9 because it requires rewriting all your existing software.
I also think that in hindsight, one could say that Borg was a huge missed opportunity: you have a proprietary environment where a lot of new software will be created from scratch, so you can afford to reshape the Linux kernel, the network stack, RPC protocols and language runtimes to fit into a much more coherent design and reduce a lot of complexity that arose from mismatch in these components. The only problem is that people who developed Borg could not possibly know how large Google was going to grow. By the time Google’s scale became clear it was too late.
Something Google did do that resembles such an effort is AppEngine. But it’s too high level to be a universal cloud computing system.
Hm I actually think Kubernetes was the missed opportunity, because the strengths and weaknesses of Borg were relatively well-known at that point. Arguably, they made something that’s worse than Borg, not better! (Part of that is due to the awkward nesting of Kubernetes within VMs like Google Cloud.)
On the other hand Borg was done by a small team and Google was growing like crazy at that time. And Borg was understandably biased toward running web search (stateless services, no consistency guarantees needed). Nonetheless, Linux cgroups came directly out of Borg (from Paul Menage, tech lead / designer of Borg), so they did make changes to the kernel.
I actually sat next to the App Engine team in SF in 2007-2008 and was a pre-release user. I ported (a trivial) part of Google Code to it!
And from 2010-2015 I tried to write something like App Engine (more like a web server), and then tried a second version that looked something like Borg (more like a distributed init).
It was inspired by a very particular problem: neither App Engine nor Borg could run R code very well! They have what I call the “monoglot distributed OS” problem. Unix is polyglot, but App Engine and Borg were to various degrees coupled to specific languages. (App Engine was Python-only then, and it took a long time to even get Python 3, or PHP. It was a VERY long time, because of certain things baked into the architecture, and also limitations of running on top of Borg, and the feature set of the Linux kernel in those days)
I actually had about 20 apps from 10-20 R users using it. I learned a lot but realized it was too big a job, and security in particular was a bottomless pit of work (and it’s something the cloud still hasn’t solved!)
I briefly mentioned this on the blog over 4 years ago after Ilya Sher of NGS asked me why I was creating Oil.
(I also recently discussed it with the organizers of the HotOS shell panel, who are doing cloud / shell research.)
This leads perfectly into the next question – what would a distributed OS following the Perlis-Thompson principle look like? Well I claim it should be made of shell scripts :) I explicitly make that claim in a recent blog post, but I have some elaborations on that.
A radically simpler system will go down the same route as Plan 9 because it requires rewriting all your existing software.
This is actually why I’m sort of obsessed with shell and distros. Because any successful distributed OS should run all your existing software, and right now that is bound up with distros (creating a lot of headaches), which are bound up with shell (more headaches)!
With the “Poly” cluster manager / OS project I mentioned in the last comment, I spent a lot of time on the package management / container problem (this was pre-Docker, some color here: https://lobste.rs/s/kj6vtn/it_s_time_say_goodbye_docker#c_cjy5on). One reason was that Blaze doesn’t deal well with R code, and that’s part of the reason Borg doesn’t. (My coworker wrote the Blaze R support and I reviewed it; it’s kind of a Rube Goldberg machine.)
As for the question of a distributed OS following the Perlis-Thompson principle – this is going to be another really long reply, but you asked a really big question :) Also people on Hacker News asked the same question, so I might as well write it up.
This comes with all the same disclaimers as the “Better Kubernetes From the Ground Up” post I referenced: My experience is biased; this is a hard problem; and most likely I’m not actually going to do anything about this :) Honestly right now it’s more interesting to me to do as much as you can with one machine rather than build big clusters.
To repeat the bold claim: I would make a distributed OS out of shell scripts (obviously invoking some custom tools written in other languages). One reason is that it’s the control plane and not the data plane. If you think about having 10,000 machines, 1000 users, and 5000 apps, it’s not very much data to store and manage. The state of the entire system can fit in a 1 GB file system tree. I would think of it like a “distributed init”.
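A back-of-the-envelope check of the "not very much data" claim, assuming "1 GB" means 10^9 bytes and that the budget is spread evenly over every machine, user, and app (both assumptions are mine, just to get a number):

```python
machines, users, apps = 10_000, 1_000, 5_000

entities = machines + users + apps      # things the control plane tracks
budget_bytes = 10**9                    # "1 GB", taken as 10^9 bytes
per_entity = budget_bytes // entities   # bytes of state available to each

print(entities, per_entity)
```

That works out to roughly 60 KB of text per entity, which is a lot of room for configuration and status files.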
(The Borg master was a single machine with all cluster state in memory for many years – no Paxos. This was well past the point that hundreds of billions of dollars of revenue were made on top of it.)
The complement to shell scripts is a file system, but I would try not to use a distributed file system. Instead I would base the system on state synchronization.
Think of the scheduler making decisions about which apps go on which machines. It simply writes those decisions to a git repo. And the workers sync the repo. When they reboot, they do not contact the master again. They just use the sync’d state on the local file system.
So in this system there are no RPCs at all. It’s logically just “git push” and “git pull”. (I believe you need a mechanism for subscriptions like an inotify() on the repos, but let’s leave that aside for now.)
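Here is a minimal sketch of that flow. A plain directory copy stands in for "git pull", and the `machines/<host>` file layout is an assumption made up for this example, not something from the actual project.

```python
import shutil
import tempfile
from pathlib import Path

def schedule(repo, assignments):
    """Scheduler: record decisions as one small text file per machine."""
    for machine, apps in assignments.items():
        path = repo / "machines" / machine
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("\n".join(apps) + "\n")

def sync(src, dst):
    """Stand-in for 'git pull': copy the whole tree to the worker."""
    shutil.copytree(src, dst, dirs_exist_ok=True)

def apps_for(local_repo, machine):
    """Worker: consult only the local copy; no call back to the master."""
    return (local_repo / "machines" / machine).read_text().split()

with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "master"
    worker = Path(tmp) / "worker"
    schedule(master, {"host1": ["web", "cron"], "host2": ["db"]})
    sync(master, worker)
    shutil.rmtree(master)  # the master disappears; the worker still has its state
    host1_apps = apps_for(worker, "host1")

print(host1_apps)
```

The worker's read path never touches the network, which is the point of replacing RPCs with synchronized state.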
In order to really store the whole state of the system, you need another elaboration on git – something like git annex, which I started using. This is because all the binary images and containers are too big.
This leads to one of the central ideas: the “root” git repo contains “distributed pointers” to other repos, i.e. other versioned trees. Because of the versioning, these pointers are also values in the Hickey sense. It could simply be a text file with a (git repo URL, sha256) pair.
I called this system “Keg” – a wrapper around git. And let’s call the (URL, sha256) pairs a “KID” – this is a “distributed pointer”. Following the Perlis-Thompson principle, everything is a “KID” – images, running containers, users, machines, etc. It’s the analog of a file descriptor – an opaque ID that can represent many things.
And users can refer to apps, and apps can refer to image versions, and apps can refer to hosts, all by the uniform KID mechanism.
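A sketch of what a KID might look like as data. The one-line `URL digest` text format, the example URL, and the field names are all assumptions; the only thing taken from the description above is that a KID is a (git repo URL, sha256) pair stored as text.

```python
import hashlib
from typing import NamedTuple

class KID(NamedTuple):
    """A 'distributed pointer': where a versioned tree lives, and which version."""
    url: str
    sha256: str

def format_kid(kid):
    """Hypothetical on-disk form: one line, URL and digest separated by a space."""
    return "%s %s\n" % (kid.url, kid.sha256)

def parse_kid(text):
    url, digest = text.split()
    if len(digest) != 64 or set(digest) - set("0123456789abcdef"):
        raise ValueError("not a sha256 hex digest: %r" % digest)
    return KID(url, digest)

# Placeholder digest: a hash of some bytes, not a real tree hash.
image = KID("https://git.example.com/images/web.git",
            hashlib.sha256(b"image v1").hexdigest())

# An app refers to its image through the same mechanism everything else uses.
app = {"name": "web", "image": image}

round_trip = parse_kid(format_kid(image))
print(round_trip == image)
```

Because a KID is an immutable value, it can be compared, used as a dictionary key, or written into another file without any registry lookup.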
So two big simplifications I mentioned are:
A single storage namespace and mechanism for cluster state. We do not depend on separate Docker registries, Debian repositories, and storage systems for cluster membership like Chubby/etcd. Just like Unix doesn’t have 3 different file systems – it has one. It’s more of a “von Neumann architecture” than a Harvard architecture.
Unified storage and networking (basically like Plan 9)
Here are some other simplifications I would like:
No separation between WAN and LAN. It can run across a WAN. No AWS regions or Google data centers. The two-level hierarchy causes a lot of complication (arguably disrespecting the Perlis-Thompson principle). (The Better k8s article also mentioned this.)
The data centers still exist, but the cluster manager layer doesn’t care about them. (Apps like distributed databases will care, because they’re on the performance sensitive data plane, not the control plane. State sync is not expensive across a WAN because you have differential compression “for free”.)
The cluster is designed to be easily turned up / bootstrapped and doesn’t have an “inner platform effect”. There’s not another distributed OS below it that distributes its binaries and has its own auth system. (This is my beef about there being 3.5 distributed OSes: Kubernetes on top of Google Cloud, on top of Borg, on top of a system to distribute binaries and kernel images. All of these systems have a different concept of “user” and auth, leading to another O(M * N) problem.)
It’s a source-based OS. You push source code (like Heroku) and build configuration, and the system spins up processes to build it into a binary / OCI image that can be deployed to many machines. Both the source code and binary image are identified with KIDs.
Being source-based means that distributed debugging and tracing tools can always refer back to source code. The system knows more about what’s running on it than just opaque containers.
Controversial: no types or schemas! (at the lowest layer; they can be built on top)
Schemas are another possibly incommensurable “concept”; types inhibit metaprogramming and generic operations. All of this has a bearing on distributed systems. Instead I would think of Hickey’s style of interfaces: “strengthen a promise” and “relax a requirement”.
Remember most of it is written in shell; it’s dynamically typed
Everything is a container, AND the virtual machine monitor can even run in a container. This solves some composition problems with VMs and containers. (Right now we have different distributed OSes that are better at managing one or the other.) The Better k8s article also mentions this issue.
The data model is REST with the “uniform interface constraint”, which is like Plan 9. The resources/files are tables, objects, and documents. Because they are TEXT, they all can be sync’d with differential compression. A single algorithm handles all of them.
Objects are used for configuration (Oil configuration evaluates to JSON)
Tables/relations are used for cluster state (which apps are on which machines) and metrics (maybe there are streams which are like infinite tables)
Documents are used for the user interface (as well as docs/online help!). The state of every node has to be reflected in a web UI.
Uses only process-based concurrency. There are no threads and no goroutines. (This solves O(M * N) problems in distributed tracing and debugging, which are very important.) As mentioned, I think there does need to be some kind of git-inotify in order to trigger processes on events (or maybe you can literally use something like git hooks).
There is a single shell language for coordinating processes that also runs all your old shell scripts :) The same language is also used for configuration, with its Ruby-like blocks.
Overall this system should be a modest and humble extension of Unix, just like the web was. The new concepts should compose with the old concepts! That leads to something smaller and more stable.
There are undoubtedly a lot of holes here, but I hope it’s interesting or sparks new ideas. I obviously haven’t built this, but I did work on the problem for 5 years across 2 codebases that were each north of 20K lines of code from scratch, and the system had (a few) real users. So it’s at least based on something!
BTW everything that runs is a “contained process”. And then there is another idea on top of that. Since KIDs represent values that are versioned trees, you can come up with a deterministic computing abstraction:
(KID for code, KID for input data) -> contained process -> (KID for output data)
So basically the private namespace for each process (plan 9-like) is set up from versioned trees that are sync’d.
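The `(code, input) -> output` shape can be sketched with content hashes standing in for KIDs. Here the "contained process" is just a pure Python function and the stores are in-memory dicts; those are simplifications of mine, not part of the design described above.

```python
import hashlib

STORE = {}  # content id -> bytes (stand-in for sync'd versioned trees)
CACHE = {}  # (code id, input id) -> output id
RUNS = []   # which computations were actually executed

def put(data):
    """Store bytes under their sha256 and return that id."""
    digest = hashlib.sha256(data).hexdigest()
    STORE[digest] = data
    return digest

def to_upper(data):
    """Stand-in for a contained process: output depends only on input."""
    return data.upper()

# The code id is a hash of a placeholder label, not of real source.
CODE = {put(b"to_upper v1"): to_upper}

def run(code_id, input_id):
    """Same (code, input) ids always give the same output id, so cache it."""
    key = (code_id, input_id)
    if key not in CACHE:
        RUNS.append(key)
        CACHE[key] = put(CODE[code_id](STORE[input_id]))
    return CACHE[key]

code_id = next(iter(CODE))
input_id = put(b"hello")
first = run(code_id, input_id)
second = run(code_id, input_id)
print(STORE[first], first == second, len(RUNS))
```

Determinism is what makes the second call free: the result can be looked up by id instead of recomputed.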
Actually one of the authors of gg (mentioned on the Oil blog) described this same idea recently:
I think it’s obvious and clearly useful but cloud platforms don’t support it. (AWS Lambda seems overly code-centric, not data-centric.)
More examples of things a KID can represent:
the source for an app and ALL its dependencies. The source should have KID pointers to dependencies. (So I guess I should call it a Merkle tree)
the binary image for an app (an executable)
a shell script that invokes multiple executables, which is itself an executable. (this is just a recursive definition, which Unix also respects with shebang lines)
I also forgot to say that the whole web UI and any internal services (of which there are few because usually you consult your copy of sync’d state) are first implemented as FastCGI processes, and then eventually Oil coprocesses.
There is a direct analogy:
CGI : FastCGI :: Process : Oil Coprocess
It just solves “the problem of VMs that start slowly”. And you get to use process-based concurrency everywhere – no threads and no goroutines.
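A rough illustration of the analogy, with a Python child process standing in for the slow-starting VM. The line-based request/reply protocol is an assumption for this sketch; it is not FastCGI and not Oil's actual coprocess protocol.

```python
import subprocess
import sys

# The worker upper-cases each request line it reads from stdin.
WORKER = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(line.strip().upper(), flush=True)\n"
)

def one_shot(request):
    """CGI style: start a fresh interpreter for every request."""
    done = subprocess.run([sys.executable, "-c", WORKER],
                          input=request + "\n",
                          capture_output=True, text=True, check=True)
    return done.stdout.strip()

class Coprocess:
    """FastCGI/coprocess style: start once, then reuse the same process."""

    def __init__(self):
        self.proc = subprocess.Popen([sys.executable, "-c", WORKER],
                                     stdin=subprocess.PIPE,
                                     stdout=subprocess.PIPE, text=True)

    def call(self, request):
        self.proc.stdin.write(request + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def close(self):
        self.proc.stdin.close()  # EOF ends the worker's loop
        self.proc.wait()

single = one_shot("ping")
coproc = Coprocess()
replies = [coproc.call("a"), coproc.call("b")]
coproc.close()
print(single, replies)
```

Both variants are still ordinary processes talking over pipes; the second one just pays the startup cost once.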
Since I have been talking about this software architecture idea across many recent comments and blog posts, and there have been many recent lobste.rs articles touching on the ideas of scaling your codebase and O(N^2) and O(M*N) problems, I spent a while collecting ALL the links I could dig up:
I would appreciate any help digging up textbook references! I know I’ve seen these narrow waist diagrams in compiler and networking textbooks. I think they should also appear in OS textbooks.
And I also appreciate comment on whether there is any overreach here. Is it useful to think of all of these as the same idea? I think so but would be interested in arguments otherwise.
It’s a very dense set of links with counterpoints / fallacies too :)
Also @xiaq at the very end there is a point about the design of shells :) That is partially where this is going – I want to explain why Oil is a Bourne shell, where there is a minimal difference between external processes and internal procs.
One concrete thing it makes possible is the “$0 dispatch pattern”, which I mentioned in the recent xargs thread (another thing I need to make a blog post out of.)
I question the two-tiered design of PowerShell, Elvish, and nushell. I would say it doesn’t follow the Perlis-Thompson principle, although of course the principle is a tradeoff and not a hard rule.
But I actually want to get to your Kubernetes/Borg comments first; I may make another wiki page about it as it’s been something I’ve been thinking about. That is, what would a distributed OS that follows the Perlis-Thompson principle look like?
I question the two-tiered design of PowerShell, Elvish, and nushell. I would say it doesn’t follow the Perlis-Thompson principle, although of course the principle is a tradeoff and not a hard rule.
IMO the solution is not to impose on shell functions the same restrictions as external commands, but to allow processes to communicate in a more structured way, like you can pass distinct, typed arguments to functions in the same process. But that’s a problem that has to be solved at the OS level.
There is now a consensus on what sorts of data structures are considered the lowest common denominators - strings, numbers, arrays and maps; popularized by dynamic languages and JSON. I doubt the idea will ever take off, but it’s not hard to imagine a kernel with first-class support for exchanging such data between processes, and language runtimes that 100% match the kernel semantics. If this sounds like COM or Corba - the key here is that OS should focus exclusively on data, not any kind of function calls.
Getting back to the original topic of two-layered design a bit: even in the most basic shell languages, there is already a difference between internal and external commands. Internal commands have access to the shared memory in the process and can exchange data using variables, external commands can’t. And if you support array-typed and map-typed variables (which IIUC Oil does to some extent), that’s another mechanism accessible to internal functions but not external processes.
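The internal/external asymmetry is easy to demonstrate outside of any shell. In this sketch a Python function plays the internal command and a child interpreter plays the external one; the variable name `DEMO_KID_COUNT` is arbitrary and assumed not to be set already.

```python
import os
import subprocess
import sys

state = {"count": 0}

def internal_command():
    """Runs in the caller's process, so it can mutate the caller's variables."""
    state["count"] += 1

def external_command():
    """Runs in a child process; its changes die with it."""
    child = (
        "import os\n"
        "os.environ['DEMO_KID_COUNT'] = '1'\n"
        "print(os.environ['DEMO_KID_COUNT'])\n"
    )
    done = subprocess.run([sys.executable, "-c", child],
                          capture_output=True, text=True, check=True)
    return done.stdout.strip()

internal_command()
child_saw = external_command()
parent_sees = os.environ.get("DEMO_KID_COUNT")

print(state["count"], child_saw, parent_sees)
```

The only thing that comes back from the external command is what it wrote to stdout (plus an exit code), which is why structured values need some serialization to cross that boundary.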
Update: I see you already have a response to that article. I disagree with your analysis that nobody can agree on the LCD. I think the JSON data structures are the LCD.
Yes so this is an interesting point: Oil supports JSON, and Oil has Python-like data structures (recursive dict and list) precisely to support JSON.
But I don’t believe JSON is the right narrow waist! I still think of byte streams as the narrow waist – “level 0” if you want to label it that.
And then JSON, the TSV extension called QTT, and HTML are at “level 1”. They are all structured interchange formats, but they are also text that you can grep. They reduce to text in some sense, because they are text. That is part of what I’m defining as the Perlis-Thompson principle – when you introduce new concepts, they should reduce to the old ones. (Another reduction by design is UTF-8 to ASCII, a property that other encodings don’t share, and cause tremendous complexity, to the point of say forking the entire Python language …)
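Both reductions mentioned here can be checked in a few lines: ASCII text is byte-for-byte identical when encoded as UTF-8, and JSON in that encoding can be searched as plain bytes. UTF-16 is included only as a contrast.

```python
import json

line = json.dumps({"name": "oil", "level": 1})

as_utf8 = line.encode("utf-8")
as_ascii = line.encode("ascii")
as_utf16 = line.encode("utf-16-le")

same_bytes = as_utf8 == as_ascii            # ASCII text is already valid UTF-8
grep_hit = b'"name": "oil"' in as_utf8      # byte-level search, no JSON parser
grep_utf16 = b'"name": "oil"' in as_utf16   # NUL bytes sit between the characters

print(same_bytes, grep_hit, grep_utf16)
```

A byte-oriented tool like grep keeps working on the "level 1" format only because it still is the "level 0" format underneath.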
I don’t think you can claim that JSON-shaped data is a narrow waist simply because it’s pretty awkward for describing tabular data and documents. There is an amazing amount of tabular data in the world – i.e. every SQL database and every R program uses tabular data. Datalog is also built on relations / tabular data.
And the entire web uses semi-structured documents that are better represented by HTML than JSON (even if we were to start over, which is impossible). The whole JSON vs. XML debate was always silly, and I remember Steve Yegge has a good quote describing the difference: “Use XML when you have more text than data, and use JSON when you have more data than text”. They are just different things. Books are still represented as XML and that’s better than JSON. (Actually I just googled for this and my own comment from 2014 came up: https://news.ycombinator.com/item?id=7312572)
Here is a very long-winded meandering comment from January 2020 where I was thinking through these ideas for the Oil language. I claimed that Oil wouldn’t have types, and wouldn’t have serialization/deserialization, and it would deal with “concretions” directly. It’s the idea of directly manipulating serialized data, not doing a deserialize -> in-memory operations -> serialize dance.
But I’ve now gone back on that slightly. Oil does have Python/JS/Ruby-like types, i.e. a garbage collected heap, simply because they “won” (notably Perl and PHP got this wrong; they have poorly designed core data structures.)
However I still like the idea of using say CSS selectors (a DSL) to directly query documents. Not deserialize docs, then write code to traverse a DOM, then reserialize. Big graphs of pointers are expensive, and deserialization can and should be done lazily on portions of the input stream that are relevant to the query.
Ditto for tables. In SQL you don’t materialize an entire table in memory to query it – there is a VM that knows how to seek to only the parts of the table that matter (via indices, etc.)
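A very small sketch of "only deserialize what the query needs", using newline-delimited JSON. This is neither CSS selectors nor a SQL engine; it just shows a cheap text-level filter deciding which records are worth parsing. It assumes the value appears in the text exactly as `json.dumps` would write it.

```python
import io
import json

def select(stream, key, value):
    """Return matching records and how many lines had to be parsed."""
    needle = json.dumps(value)   # e.g. '"web"'
    hits, parsed = [], 0
    for line in stream:
        if needle not in line:   # text-level test, no object graph built
            continue
        parsed += 1
        record = json.loads(line)
        if record.get(key) == value:  # confirm the match on the parsed record
            hits.append(record)
    return hits, parsed

log = io.StringIO(
    '{"app": "web", "host": "h1"}\n'
    '{"app": "db", "host": "h2"}\n'
    '{"app": "web", "host": "h3"}\n'
)
hits, parsed = select(log, "app", "web")
print(hits, parsed)
```

Only two of the three lines are ever turned into dictionaries; the rest stay as text in the stream.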
In that comment I cited the paper “Unifying Tables, Objects and Documents” by Meijer et al. This influenced the design of the C# language, i.e. how it has built-in SQL tables with LINQ.
So Oil will also be about documents, objects, and tables – HTML, JSON, and QTT. This is actually an important refinement of the Perlis-Thompson principle – a single notion is not always a good thing! Documents, objects, and tables are different.
One of the examples I’m going to use in my blog post is Python/Clojure vs. Lua/Racket. Python has dicts and lists, while Lua only has tables. Racket has s-expressions, while Clojure adds maps.
I think Python and Clojure are better. So that directly contradicts single notion. So I’ve taken the liberty of softening the Perlis-Thompson principle: it says use fewer notions, and when you introduce a new one, it needs to reduce to an existing notion. (Python and Clojure protocols like iterators accomplish this effectively – they bridge concepts)
In the latest tour document, I describe Oil as oriented around “languages for data”.
The other part of the Oil language where the Perlis-Thompson principle plays a role is the design of procs, processes, coprocesses, and functions.
This is not done yet – it’s one of the biggest remaining pieces of the language. But I think of processes and coprocesses as the “physical layer” – coprocesses are a startup time optimization. I claim coprocesses in Oil are better than bash and ksh coprocesses because they follow the Perlis-Thompson principle.
And then logically procs and funcs are on top of processes and coprocesses. Actually I am taking a cue from Elvish and using the stdout of processes/coprocesses/shared libraries as the “return value” of the function.
Originally I had procs and funcs as separate but equal abstractions. Procs looked like processes, and funcs looked like Python functions. This causes all sorts of composition problems, because now you have exit codes and exceptions too, etc. It multiplies the complexity of the language.
So now I think that defining funcs as syntactic sugar on procs, and having coprocesses and shared libraries (which bash has) as a “physical” optimization, is a much better design.
As a thought experiment I’d be interested in how you can use xargs -P 8 with Elvish functions, or if that’s not considered idiomatic? Is it better to use Elvish’s own parallelism?
Oil will have an “each” builtin that takes a block, but you are also allowed to use xargs and it’s exactly as convenient as it is in bash!
Anyway this is not all done yet, but this is probably 70% of the reason I’m writing about the Perlis-Thompson principle. The other 30% is the Borg/Kubernetes stuff, which I have an answer for :)
More of a note to myself, but there is a parallel with k8s here in terms of trying our best not to introduce incommensurable concepts. Here is a feeling I agree with:
my problem with k8s, is that you learn OS concepts, and then k8s / docker shits all over them.
Likewise for Oil, I don’t want to have too much of a split between external and internal, old and new. There is going to be some, but it should be minimized. The new stuff (Python-like data types and functions) must compose with the old stuff (processes, argv, env, exit codes, signals). Oil is still a thin layer over the kernel, not a big VM on top.
(I don’t have enough direct experience with PowerShell, Elvish, or nushell to judge them in this regard, but I would be interested in learning more.)
Here are some areas where we have some divergence, but the benefit may be worth the cost. POSIX sh already has a difference in that shell functions can mutate parent scopes, but external processes can’t. Oil is going to have expression-like arguments to procs specifically to support JSON:
They have a 2x2 matrix of internal and external. This is exactly what Ken Thompson was talking about – exponential complication, although you can argue that 2^2 is not that bad, and 3^2 or 4^2 would be worse :) !
I’m not necessarily saying it’s bad – the benefit could be worth the cost. But there is a cost. I would have to use it more to weigh the benefits vs. costs.
They have a 2x2 matrix of internal and external. This is exactly what Ken Thompson was talking about – exponential complication, although you can argue that 2^2 is not that bad, and 3^2 or 4^2 would be worse :) !
FWIW, Elvish doesn’t have explicit handling of interfacing internal/external commands. External commands behave exactly like internal ones, they just don’t accept value inputs or write value outputs. For example, the external echo behaves pretty much identically to the internal echo command.
But I don’t believe JSON is the right narrow waist! I still think of byte streams as the narrow waist – “level 0” if you want to label it that.
I don’t dispute byte streams as “level 0”, but I say that the data types of numbers, strings, arrays and maps are a suitable “level 1” for the majority of applications. I use “JSON types” as a shorthand for these data structures, but I don’t imply anything about the actual encoding, and it should probably be a binary encoding for efficiency. Let me call it “universal exchange format” instead for clarity.
As an imperfect analogy, I think of byte streams as UDP/TCP and the universal exchange format as HTTP. UDP and TCP are generic enough to suit everything, but 90% of the applications can use HTTP rather than UDP/TCP. (This is not a good analogy because HTTP is mostly concerned about metadata rather than data, but it’s a good example of a “level 1” that can satisfy a lot of use cases.)
And the entire web uses semi-structured documents that are better represented by HTML than JSON (even if we were to start over, which is impossible). The whole JSON vs. XML debate was always silly, and I remember Steve Yegge has a good quote describing the difference: “Use XML when you have more text than data, and use JSON when you have more data than text”. They are just different things.
So Oil will also be about documents, objects, and tables – HTML, JSON, and QTT. This is actually an important refinement of the Perlis-Thompson principle – a single notion is not always a good thing! Documents, objects, and tables are different.
XML and HTML still use the same underlying structure of strings, arrays and maps (I don’t think they have numbers though). Attributes form a map and child nodes form an array. DOM API exposes these data structures; CSS selectors navigate these data structures. The surface representation is irrelevant.
Tables can be modelled as a list of maps where all the maps happen to have the same keys. This is less efficient than a specialized data frame format, but then this is exactly the kind of tradeoff encouraged by the Thompson-Perlis principle. A sophisticated universal exchange format implementation can support interned keys (so that the repeated keys take up minimal storage) and other features that help approximate the efficiency of a data frame.
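As a rough illustration of the list-of-maps idea, here is a minimal Python sketch. The names rows, to_columns and to_rows are made up for this example, and the columnar layout is only one assumed way an implementation might store the repeated keys once; it is not a description of any existing format.

```python
# A table modelled as a list of maps that all share the same keys.
rows = [
    {"name": "alice", "uid": 1000, "shell": "/bin/bash"},
    {"name": "bob", "uid": 1001, "shell": "/bin/zsh"},
]


def to_columns(rows):
    """Store each key once, with one list of values per key (data-frame-like)."""
    keys = list(rows[0]) if rows else []
    for row in rows:
        if row.keys() != rows[0].keys():
            raise ValueError("every row must have the same keys")
    return {key: [row[key] for row in rows] for key in keys}


def to_rows(columns):
    """Rebuild the list-of-maps view from the columnar layout."""
    keys = list(columns)
    length = len(columns[keys[0]]) if keys else 0
    return [{key: columns[key][i] for key in keys} for i in range(length)]


columns = to_columns(rows)
```

Both views carry the same information, so generic tooling could work on the list of maps while the storage layer keeps the compact form.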
There will always be a plethora of serialization formats in the world, but I envision a computing environment where you deserialize data exactly once (when you fetch it from elsewhere) into a universal exchange format, and then use a set of standard tools to manipulate it. Run CSS selectors on JSON documents, or SQL on XML documents. Use the same tool to extract values of the "name" key from a JSON document, values of the "href" attribute from an HTML document, or working directories of all running processes.
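A minimal Python sketch of "the same tool for JSON and HTML", assuming both are first turned into plain maps and arrays. The helper names (values_of, TreeBuilder, parse_html) and the node layout with "tag", "attrs" and "children" keys are invented for this example; the HTML handling is deliberately naive and ignores void elements such as br.

```python
import json
from html.parser import HTMLParser


def values_of(node, key):
    """Yield every value stored under `key`, anywhere in nested maps and arrays."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from values_of(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from values_of(item, key)


class TreeBuilder(HTMLParser):
    """Turn HTML into maps and arrays: attributes are a map, children an array."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#root", "attrs": {}, "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Naive: assumes well-nested input and pops the innermost open element.
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.stack[-1]["children"].append(data)


def parse_html(text):
    builder = TreeBuilder()
    builder.feed(text)
    builder.close()
    return builder.root


doc = json.loads('{"users": [{"name": "alice"}, {"name": "bob"}]}')
page = parse_html('<p><a href="/a">A</a> and <a href="/b">B</a></p>')

names = list(values_of(doc, "name"))
links = list(values_of(page, "href"))
```

The point is only that values_of knows nothing about JSON or HTML; it sees maps and arrays in both cases.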
Filesystems can also be modelled by the universal exchange format. A directory is a list of files, and a file is a map of attributes with keys such as name, owner, creation_time and content. You can find files with CSS selectors. Or batch rename files in the same way you would transform JSON documents.
You might say some of the use cases are going to be very inefficient, because you can’t just pass a whole filesystem between processes without risking OOM. But data can be materialized lazily and on demand, and consumers don’t need to hold on to the entirety of the data set (like how line-oriented UNIX tools typically only keep one line of data in memory). But this is something hard to implement purely in user space and that’s why I think kernel support is necessary. There is prior art - Clojure’s ISeq is lazy and all the sequence manipulations are implemented in terms of it. (At this point I realized that the universal exchange format is actually an abstract interface, not a concrete format. Like how UNIX files are actually an abstract interface.)
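To make the lazy, on-demand part concrete, here is a small user-space sketch using Python generators as a stand-in for the abstract interface. The function names files and where, and the chosen attribute keys, are assumptions for illustration; this says nothing about how kernel support would look.

```python
import os
import tempfile
from itertools import islice


def files(root):
    """Lazily yield one map per file under `root`, one directory at a time."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            info = os.stat(path)
            yield {"name": filename, "path": path, "size": info.st_size}


def where(records, predicate):
    """Lazily filter a stream of maps, holding one record at a time."""
    return (record for record in records if predicate(record))


# Demo on a throwaway directory so the sketch is self-contained.
with tempfile.TemporaryDirectory() as root:
    for name, body in [("a.txt", "hello"), ("b.log", ""), ("c.txt", "hi")]:
        with open(os.path.join(root, name), "w") as handle:
            handle.write(body)
    text_files = where(files(root), lambda f: f["name"].endswith(".txt"))
    first = list(islice(text_files, 1))  # stops after the first match
    first_names = [f["name"] for f in first]
```

The consumer asks for one record and the producer does only that much work, much like a line-oriented tool reading from a pipe.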
In fact I claim that an OS structured around the universal exchange format adheres more to the Perlis-Thompson principle than UNIX does. In UNIX files and filesystems are two entirely different things. You can’t edit the filesystem with vi, or find files with grep, or batch-rename files with sed.
As a thought experiment I’d be interested in how you can use xargs -P 8 with Elvish functions, or if that’s not considered idiomatic? Is it better to use Elvish’s own parallelism?
Oil will have an “each” builtin that takes a block, but you are also allowed to use xargs and it’s exactly as convenient as it is in bash!
You can’t use xargs on Elvish functions. I’m actually curious how it works in Oil - this only works with $0 dispatching, and it’s starting a new instance of the script which doesn’t share any state with the current script, right?
So the claim is that nearly ALL good languages have a narrow waist or lowest common denominator – they just don’t agree on what it is.
Even the 3 scientific languages disagree:
Matlab / Julia: everything is a matrix / vector (homogeneous typed N-dimensional array)
R: everything is a data frame / vector (heterogeneously typed 2D table)
Mathematica: everything is an M-expression (for symbolic computation)
ALL of these are structured data. And they fundamentally disagree on what the narrow waist is. Or rather they each have their own narrow waist which is domain specific, but the lowest common denominator between all of them is text.
And if you try to write Mathematica code in R, or Matlab code in Mathematica, then it will be apparent very quickly why they are separate languages, and why they choose a different waist. Julia is getting some data frame features but they had to make MAJOR changes to the language to support it, and it’s still not as convenient / functional / usable as R (despite Julia being a vastly better language in general).
So then the thing I didn’t realize is that this idea also explains shells!
Oil / Bourne shell: Everything is a byte stream. There is support for JSON, QTT and eventually HTML (objects, tables, documents), but they’re all byte streams.
Elvish: Everything is a “value”, which has a JSON-like data model (and lambdas are values, etc.). When you have tables, the idiom is to make them look tree-structured.
Nushell: everything is a table. But cells can have structured data to represent hierarchical structures.
PowerShell: everything is a .NET object. I’ve heard PowerShell described as “a really awkward syntax for writing C#” and I think that gets at the core of the issue. Likewise Microsoft broke VB6 and made VB.NET “another syntax for writing C#”.
So this is basically the claim on the wiki page: the lowest common denominator between an Elvish program, a Nushell program, and a PowerShell program is a bash or Oil program :) I predict that this will actually happen and isn’t theoretical.
This is not to say bash or Oil is better, just that it sits at a more basic level of the “hourglass”. People rightly complain that working with text is cumbersome. It absolutely is but it’s also necessary to send anything across the network or persist it to disk.
Another example of people not agreeing on the narrow waist: .gob files are not protocol buffers are not JSON. Which are not Python pickle files. Go developers would love it if everything were gob files, etc. but then they have to talk to services written in Java or Ruby, and those Java programmers wish everything were Java serialized objects.
In other words, every language is biased toward its own data format. JSON has definitely emerged as a strong interchange format over the last 15 years, but it’s far from universal, and there are millions of people who touch CSV way more than JSON (e.g. data scientists extracting from SQL databases and analyzing it with R. Ask them about JSON vs. CSV and you might get some blank stares).
I addressed the kernel issue here, but maybe it needs more elaboration.
My comment on text as a narrow waist was in response to a typical misunderstanding. The kernel does not need records or types. And they should be added to a shell with care. We don’t want to break the compositional properties of shell.
Anyway this thread is long and old but I’d definitely be interested in continuing the conversation via e-mail, https://oilshell.zulipchat.com/ , or elsewhere. I think it is a very interesting topic that gets to the core of language design!
Adding a bit to the part about Multics’s RUNCOM - “run command” is also what the rc in .bashrc, .vimrc, etc. means. Plan 9’s shell is also called rc for the same reason.
Oh, thanks for the interview with Bourne, that adds considerable context to my recent writing. In exchange, I offer the explanation of why Bell Labs got a miserable PDP rather than a powerful machine. Unix is partly Hamming’s fault.
Ha, constraints breed creativity :) I think there’s something to that and the inefficiency of the modern cloud.
“Software is a gas; it expands to fill its container”, and the modern cloud is basically a container of unbounded size (since new capacity is being built as fast as people will pay for it).
I can imagine that splitting unquoted substitutions was the sensible thing to do at the time. If you want to run a bunch of different commands with the same options (say, compiler include path flags or preprocessor defines), sticking them in a variable so you can say mycommand $COMMONARGS foo bar seems like a good idea. If you really want it to be expanded as a single variable you can quote it, so both behaviours are available. Sure, people might shoot themselves in the foot occasionally, but I don’t think the implementors of Unix had ergonomics on their mind.
The “proper” solution (which I believe Oil implements) is to support array variables, and provide syntax to splice an array into a command-line (if that’s what they want), but pervasive array support is more complicated than just tweaking the variable expansion rules, and Worse is Better, so…
Yeah I think they could have just changed the default though, so echo $var would not “molest” the variable, and echo @var would or something like that. You could still have string-only variables, quoting of operator characters like 'a;b', and word splitting off by default!
But yes Oil uses arrays so that the strings can contain any characters (and because everyone knows how to use arrays in Python, JS, Ruby, etc.). With the string-only solution you always have to think about the split character.
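A small sketch of the difference between the two approaches, with Python standing in for the shell. Here str.split() plays the role of word splitting on whitespace and * unpacking plays the role of splicing an array into a command line; the argument values are made up, and this is not Oil or bash syntax.

```python
# String-only: arguments packed into one string must be split on a delimiter,
# so an argument that itself contains the delimiter gets broken apart.
common_args = "-I include -D GREETING=hello world"
argv_from_string = ["mycommand"] + common_args.split() + ["foo", "bar"]

# Array: each element is exactly one argument, whatever characters it contains.
common_args_list = ["-I", "include", "-D", "GREETING=hello world"]
argv_from_list = ["mycommand", *common_args_list, "foo", "bar"]
```

With the string version, "GREETING=hello world" arrives as two arguments; with the list version it stays as one, and nobody has to pick a split character.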
I think there is a parallel between UNIX and network protocols. UNIX exposes byte streams and programs use that to communicate their specific information. Transport layer protocols also expose bytes, either as an undelimited stream (TCP) or as delimited datagrams (UDP), and application layer protocols use them to communicate application-specific information.
If you squint, two UNIX programs communicating via pipes is not that different from two computers talking HTTP over the network. You can even bridge UNIX pipes with a TCP or UDP connection with nc.
IMO the Perlis-Thompson principle is a formulation of what layered software architecture looks like: build an application-agnostic, universal lower layer first, and place your domain-specific stuff above that layer.
For operating system and networking protocols, a sequence of bytes is a really good choice for the data structure of that lower layer; there is literally nothing that is more application-agnostic and more universal; that is why UNIX and the TCP/IP stack have succeeded. Generalizing this to higher-level domains is much harder, although there are some less straightforward parallels - intermediate languages in compiler design, VMs for runtime design, etc.
Yes exactly, that is the idea of the “narrow waist” or “hourglass model”, briefly mentioned in the post, which I hope to elaborate on. It probably needs a wiki page at this point. As far as I can see it’s a special case of the Perlis-Thompson principle – probably the most important one, as it governs the evolution of the very biggest and longest lived systems in software.
I noted the TCP/IP, LLVM, and Unix syscall examples in a comment below the “Unix text as narrow waist” comment linked in the post:
https://lobste.rs/s/vl9o4z/case_against_text_protocols#c_03mx7g
Someone else brought up Pandoc as having the same architecture. Instead of writing O(N^2) translators between document formats, you design a common IR and write O(N) translators to the IR and O(N) translators from the IR.
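A toy sketch of the common-IR idea in Python. The "IR" here is an assumed, deliberately tiny one (a list of key/value string pairs), and the three formats and all function names are chosen only for illustration; this is not how Pandoc is implemented.

```python
import json

# Hypothetical IR: a list of (key, value) string pairs.


def read_json(text):
    return [(key, str(value)) for key, value in json.loads(text).items()]


def read_env(text):
    # KEY=value lines; lines without "=" are skipped in this sketch.
    return [tuple(line.split("=", 1)) for line in text.splitlines() if "=" in line]


def write_json(pairs):
    return json.dumps(dict(pairs))


def write_env(pairs):
    return "\n".join(f"{key}={value}" for key, value in pairs)


def write_tsv(pairs):
    return "\n".join(f"{key}\t{value}" for key, value in pairs)


READERS = {"json": read_json, "env": read_env}
WRITERS = {"json": write_json, "env": write_env, "tsv": write_tsv}


def convert(text, source, target):
    """Convert any source format to any target format via the shared IR."""
    return WRITERS[target](READERS[source](text))
```

Two readers plus three writers cover all six source/target pairs; adding a format costs one reader and one writer, not one translator per existing format.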
TCP/IP was explicitly designed as a narrow waist or “hourglass model”. It’s also called “the distinguished layer” in network architecture. I would define it as the “lowest common denominator”. In the old days, you would have an e-mail app that works over wired networks, and another e-mail app that works over wireless. That is not economical – you want e-mail, chat, web, etc. to go over TCP/IP, and TCP/IP works on both wired and wireless. This is not optimal, but it’s economical.
In the Kubernetes-Multics thread on Hacker News (in a reply to @brandonbloom who also wrote about O(M * N) problems recently), I said I traced the idea to Kleinrock via Eric Brewer, but I can’t tell if this paper was ever published and what the date was.
https://news.ycombinator.com/item?id=27914632 (If anyone has a more original reference for the narrow waist / hourglass model than Kleinrock, let me know, I will credit you in the blog posts)
Kleinrock was one of the inventors of packet switching. (Which reminds me that I think I first heard of the idea of the narrow waist from a talk about Named Data Networking by Van Jacobson at Google over 10 years ago. Van Jacobson being another early Internet architect.)
In this comment I elaborated more on the narrow waist moving to HTTP rather than TCP/IP: https://lobste.rs/s/mjo19d/unix_microservice_platforms#c_n9zwbw
I linked to an article which talks about “the hourglass model”
https://cacm.acm.org/magazines/2019/7/237714-on-the-hourglass-model/fulltext
All of these are the same idea in software architecture, but people in different subfields use different words for it. In EE textbooks on network engineering and CS textbooks on compiler architecture they will both refer to this “hourglass” architecture. It’s a “standard” engineering concept, although for some reason many programmers seem to be unaware of it, and are content to code the area rather than the perimeter :) That is, write O(M * N) amounts of code rather than O(M + N). @brandonbloom has a good example with microservices in the cloud.
Chris Lattner said that LLVM IR is explicitly a narrow waist, and that comes straight from compiler textbooks. On the one side you have M languages, and on the other side you have N CPU architectures. Instead of writing O(M * N) separate compilers, you write O(M) front ends and O(N) back ends, for O(M + N) total work.
You can never design a perfect narrow waist – it’s a compromise by definition. I’ve seen some good lists of design mistakes in LLVM IR, where a certain change would probably make life easier for everyone. But LLVM works well despite not being perfect.
Likewise, TCP/IP is also inefficient for some applications, Unix text processing can be cumbersome, and the POSIX file system API doesn’t fit all apps, but that doesn’t mean they are bad. They are necessary compromises for the overall structure of the system. You can have “out of band” optimizations, but you still want a coherent architecture overall.
You want to avoid writing O(M * N) code, and I think that’s exactly what Kubernetes has failed to do. And also Borg and Google in general. See the “Unix vs. Google” that Brandon and I both reference: https://www.youtube.com/watch?v=3Ea3pkTCYx4 . I claim that Google execs actually started to notice this, but failed to fix it with the ~2012 pivot to the cloud. The “system design interview” concept started around this time, because they (correctly) thought that engineers were designing bad systems.
I have a feeling that in the networking and compiler case it was more explicitly designed than in operating systems / shells, but I’m not sure. Anyway I need to make a wiki page and I’ll be glad to have input from others :)
I agree with pretty much everything you said!
I’m curious about how one would design a Kubernetes alternative though. IMO a large part of its complexity comes from the constraint of having to integrate a lot of existing technologies: a Linux kernel, TCP/IP, DNS (or whatever else it uses for service discovery), runtimes of popular programming languages, etc. When you have to fit many heterogeneous technologies together it’s very hard to come up with a coherent design without breaking backward compatibility in some components - which would be suicide in terms of adoption.
Fitting TCP/IP into UNIX got us BSD sockets, and fitting GUI into UNIX got us X. Both of them are more complicated and less flexible than Plan 9’s design, but you can take your existing UNIX program and put BSD socket and X support in it. Porting it to Plan 9 is a rewrite. You think that Kubernetes is cloud computing’s Multics, but I fear that it is actually cloud computing’s BSD sockets and X, and a radically simpler system will go down the same route as Plan 9 because it requires rewriting all your existing software.
I also think that in hindsight, one could say that Borg was a huge missed opportunity: you have a proprietary environment where a lot of new software will be created from scratch, so you can afford to reshape the Linux kernel, the network stack, RPC protocols and language runtimes to fit into a much more coherent design and reduce a lot of the complexity that arose from mismatches between these components. The only problem is that the people who developed Borg could not possibly have known how large Google was going to grow. By the time Google’s scale became clear it was too late.
Something Google did do that resembles such an effort is AppEngine. But it’s too high level to be a universal cloud computing system.
Hm I actually think Kubernetes was the missed opportunity, because the strengths and weaknesses of Borg were relatively well-known at that point. Arguably, they made something that’s worse than Borg, not better! (Part of that is due to the awkward nesting of Kubernetes within VMs like Google Cloud.)
On the other hand Borg was done by a small team and Google was growing like crazy at that time. And Borg was understandably biased toward running web search (stateless services, no consistency guarantees needed). Nonetheless, Linux cgroups came directly out of Borg (from Paul Menage, tech lead / designer of Borg), so they did make changes to the kernel.
I actually sat next to the App Engine team in SF in 2007-2008 and was a pre-release user. I ported (a trivial) part of Google Code to it!
And from 2010-2015 I tried to write something like App Engine (more like a web server), and then tried a second version that looked something like Borg (more like a distributed init).
It was inspired by a very particular problem: neither App Engine nor Borg could run R code very well! They have what I call the “monoglot distributed OS” problem. Unix is polyglot, but App Engine and Borg were to various degrees coupled to specific languages. (App Engine was Python-only then, and it took a long time to even get Python 3, or PHP. It was a VERY long time, because of certain things baked into the architecture, and also limitations of running on top of Borg, and the feature set of the Linux kernel in those days.)
I actually had about 20 apps from 10-20 R users using it. I learned a lot but realized it was too big a job, and security in particular was a bottomless pit of work (and it’s something the cloud still hasn’t solved!)
I briefly mentioned this on the blog over 4 years ago after Ilya Sher of NGS asked me why I was creating Oil.
http://www.oilshell.org/blog/2017/01/19.html
(I also recently discussed it with the organizers of the HotOS shell panel, who are doing cloud / shell research.)
This leads perfectly into the next question – what would a distributed OS following the Perlis-Thompson principle look like? Well I claim it should be made of shell scripts :) I explicitly make that claim in a recent blog post, but I have some elaborations on that.
https://www.oilshell.org/blog/2021/07/cloud-review.html
(another long reply coming up)
This is actually why I’m sort of obsessed with shell and distros. Because any successful distributed OS should run all your existing software, and right now that is bound up with distros (creating a lot of headaches), which are bound up with shell (more headaches)!
With the “Poly” cluster manager / OS project I mentioned in the last comment, I spent a lot of time on the package management / container problem (this was pre-Docker, some color here: https://lobste.rs/s/kj6vtn/it_s_time_say_goodbye_docker#c_cjy5on). One reason was that Blaze doesn’t deal well with R code, and that’s part of the reason Borg doesn’t. (My coworker wrote the Blaze R support and I reviewed it; it’s kind of a Rube Goldberg machine.)
As for the question of a distributed OS following the Perlis-Thompson principle – this is going to be another really long reply, but you asked a really big question :) Also people on Hacker News asked the same question, so I might as well write it up.
This comes with all the same disclaimers as the “Better Kubernetes From the Ground Up” post I referenced: My experience is biased; this is a hard problem; and most likely I’m not actually going to do anything about this :) Honestly right now it’s more interesting to me to do as much as you can with one machine rather than build big clusters.
To repeat the bold claim: I would make a distributed OS out of shell scripts (obviously invoking some custom tools written in other languages). One reason is that it’s the control plane and not the data plane. If you think about having 10,000 machines, 1000 users, and 5000 apps, it’s not very much data to store and manage. The state of the entire system can fit in a 1 GB file system tree. I would think of it like a “distributed init”.
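A back-of-envelope check of that sizing claim, assuming a decimal gigabyte and an even split across machines, users and apps. Both assumptions are mine and purely illustrative; this is not a measurement of any real system.

```python
machines, users, apps = 10_000, 1_000, 5_000

entities = machines + users + apps           # things the control plane tracks
budget_bytes = 10**9                         # 1 GB, decimal
bytes_per_entity = budget_bytes // entities  # even split, integer division
```

That works out to 62,500 bytes per entity, which is a lot of room for a text description of a machine, a user or an app.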
(The Borg master was a single machine with all cluster state in memory for many years – no Paxos. This was well past the point that hundreds of billions of dollars of revenue were made on top of it.)
The complement to shell scripts is a file system, but I would try not to use a distributed file system. Instead I would base the system on state synchronization.
Think of the scheduler making decisions about which apps go on which machines. It simply writes those decisions to a git repo. And the workers sync the repo. When they reboot, they do not contact the master again. They just use the sync’d state on the local file system.
So in this system there are no RPCs at all. It’s logically just “git push” and “git pull”. (I believe you need a mechanism for subscriptions like an inotify() on the repos, but let’s leave that aside for now.)
In order to really store the whole state of the system, you need another elaboration on git – something like git annex, which I started using. This is because all the binary images and containers are too big.
This leads to one of the central ideas: the “root” git repo contains “distributed pointers” to other repos, i.e. other versioned trees. Because of the versioning, these pointers are also values in the Hickey sense. It could simply be a text file with a (git repo URL, sha256) pair.
I called this system “Keg” – a wrapper around git. And let’s call the (URL, sha256) pairs a “KID” – this is a “distributed pointer”. Following the Perlis-Thompson principle, everything is a “KID” – images, running containers, users, machines, etc. It’s the analog of a file descriptor – an opaque ID that can represent many things.
And users can refer to apps, and apps can refer to image versions, and apps can refer to hosts, all by the uniform KID mechanism.
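A minimal Python sketch of what such a (URL, sha256) pair could look like as a value. The class name Kid, the one-line space-separated text layout, the helper kid_for and the example URL are all hypothetical; the actual Keg representation is not described beyond "a text file with a (git repo URL, sha256) pair".

```python
import hashlib
from dataclasses import dataclass

HEX_DIGITS = set("0123456789abcdef")


@dataclass(frozen=True)
class Kid:
    """Hypothetical 'distributed pointer': a repo URL plus a content hash."""

    url: str
    sha256: str

    def to_line(self):
        # Assumed text layout: URL, one space, hex digest.
        return f"{self.url} {self.sha256}"

    @classmethod
    def from_line(cls, line):
        url, digest = line.split()
        if len(digest) != 64 or not set(digest) <= HEX_DIGITS:
            raise ValueError(f"not a sha256 hex digest: {digest!r}")
        return cls(url, digest)


def kid_for(url, content):
    """Build a Kid whose hash pins one exact version of `content` (bytes)."""
    return Kid(url, hashlib.sha256(content).hexdigest())


image = kid_for("https://example.com/images/web.git", b"image bytes v1")
app = {"name": "web", "image": image}  # an app refers to an image version by KID
```

Because the dataclass is frozen and compares by value, two pointers to the same URL and hash are interchangeable, which is the "value" property the versioning is meant to give.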
So two big simplifications I mentioned are:
Here are some other simplifications I would like:
The data centers still exist, but the cluster manager layer doesn’t care about them. (Apps like distributed databases will care, because they’re on the performance sensitive data plane, not the control plane. State sync is not expensive across a WAN because you have differential compression “for free”.)
The cluster is designed to be easily turned up / bootstrapped and doesn’t have an “inner platform effect”. There’s no other distributed OS below it that distributes its binaries and has its own auth system. (This is my beef about there being 3.5 distributed OSes: Kubernetes on top of Google Cloud, on top of Borg, on top of a system to distribute binaries and kernel images. All of these systems have a different concept of “user” and auth, leading to another O(M * N) problem.)
It’s a source-based OS. You push source code (like Heroku) and build configuration, and the system spins up processes to build it into a binary / OCI image that can be deployed to many machines. Both the source code and binary image are identified with KIDs.
Controversial: no types or schemas! (at the lowest layer; they can be built on top)
Everything is a container, AND the virtual machine monitor can even run in a container. This solves some composition problems with VMs and containers. (Right now we have different distributed OSes that are better at managing one or the other.) The Better k8s article also mentions this issue.
The data model is REST with the “uniform interface constraint”, which is like Plan 9. The resources/files are tables, objects, and documents. Because they are TEXT, they all can be sync’d with differential compression. A single algorithm handles all of them.
Uses only process-based concurrency. There are no threads and no goroutines. (This solves O(M * N) problems in distributed tracing and debugging, which are very important.) As mentioned, I think there does need to be some kind of git-inotify in order to trigger processes on events (or maybe you can literally use something like git hooks).
There is a single shell language for coordinating processes that also runs all your old shell scripts :) The same language is also used for configuration, with its Ruby-like blocks.
Overall this system should be a modest and humble extension of Unix, just like the web was. The new concepts should compose with the old concepts! That leads to something smaller and more stable.
There are undoubtedly a lot of holes here, but I hope it’s interesting or sparks new ideas. I obviously haven’t built this, but I did work on the problem for 5 years across 2 codebases that were each north of 20K lines of code from scratch, and the system had (a few) real users. So it’s at least based on something!
BTW everything that runs is a “contained process”. And then there is another idea on top of that. Since KIDs represent values that are versioned trees, then you can come up with a deterministic computing abstraction:
So basically the private namespace for each process (plan 9-like) is set up from versioned trees that are sync’d.
Actually one of the authors of gg (mentioned on the Oil blog) described this same idea recently:
https://news.ycombinator.com/item?id=27951752
I think it’s obvious and clearly useful but cloud platforms don’t support it. (AWS Lambda seems overly code-centric, not data-centric.)
More examples of things a KID can represent:
I also forgot to say that the whole web UI and any internal services (of which there are few because usually you consult your copy of sync’d state) are first implemented as FastCGI processes, and then eventually Oil coprocesses.
There is a direct analogy:
It just solves “the problem of VMs that start slowly”. And you get to use process-based concurrency everywhere – no threads and no goroutines.
Since I have been talking about this software architecture idea across many recent comments and blog posts, and there have been many recent lobste.rs articles touching on the ideas of scaling your codebase and O(N^2) and O(M*N) problems, I spent a while collecting ALL the links I could dig up:
https://github.com/oilshell/oil/wiki/Perlis-Thompson-Principle
@brandonbloom and @mpweiher might be interested
I would appreciate any help digging up textbook references! I know I’ve seen these narrow waist diagrams in compiler and networking textbooks. I think they should also appear in OS textbooks.
And I also appreciate comment on whether there is any overreach here. Is it useful to think of all of these as the same idea? I think so but would be interested in arguments otherwise.
It’s a very dense set of links with counterpoints / fallacies too :)
Also @xiaq at the very end there is a point about the design of shells :) That is partially where this is going – I want to explain why Oil is a Bourne shell, where there is a minimal difference between external processes and internal procs.
One concrete thing it makes possible is the “$0 dispatch pattern”, which I mentioned in the recent xargs thread (another thing I need to make a blog post out of.)
https://lobste.rs/s/wlqveb/xargs_considered_harmful#c_7cax3s
I question the two-tiered design of PowerShell, Elvish, and nushell. I would say it doesn’t follow the Perlis-Thompson principle, although of course the principle is a tradeoff and not a hard rule.
But I actually want to get to your Kubernetes/Borg comments first; I may make another wiki page about it as it’s been something I’ve been thinking about. That is, what would a distributed OS that follows the Perlis-Thompson principle look like?
I also mentioned @jamii ’s comment on SQL as a narrow waist in the wiki page (https://github.com/oilshell/oil/wiki/Perlis-Thompson-Principle) Lots of good lobste.rs links and discussions recently.
Feedback appreciated :)
IMO the solution is not to impose on shell functions the same restrictions as external commands, but to allow processes to communicate in a more structured way, like you can pass distinct, typed arguments to functions in the same process. But that’s a problem that has to be solved at the OS level.
There’s an article that articulates this sentiment, which I mostly agree with: https://blog.rfox.eu/en/Programming/Programmers_critique_of_missing_structure_of_operating_systems.html.
There is now a consensus on what sorts of data structures are considered the lowest common denominators - strings, numbers, arrays and maps; popularized by dynamic languages and JSON. I doubt the idea will ever take off, but it’s not hard to imagine a kernel with first-class support for exchanging such data between processes, and language runtimes that 100% match the kernel semantics. If this sounds like COM or Corba - the key here is that OS should focus exclusively on data, not any kind of function calls.
Getting back to the original topic of two-layered design a bit: even in the most basic shell languages, there is already a difference between internal and external commands. Internal commands have access to the shared memory in the process and can exchange data using variables, external commands can’t. And if you support array-typed and map-typed variables (which IIUC Oil does to some extent), that’s another mechanism accessible to internal functions but not external processes.
Update: I see you already have a response to that article. I disagree with your analysis that nobody can agree on the LCD. I think the JSON data structures are the LCD.
Yes so this is an interesting point: Oil supports JSON, and Oil has Python-like data structures (recursive dict and list) precisely to support JSON.
But I don’t believe JSON is the right narrow waist! I still think of byte streams as the narrow waist – “level 0” if you want to label it that.
And then JSON, the TSV extension called QTT, and HTML are at “level 1”. They are all structured interchange formats, but they are also text that you can grep. They reduce to text in some sense, because they are text. That is part of what I’m defining as the Perlis-Thompson principle – when you introduce new concepts, they should reduce to the old ones. (Another reduction by design is UTF-8 to ASCII, a property that other encodings don’t share, and cause tremendous complexity, to the point of say forking the entire Python language …)
I don’t think you can claim that JSON-shaped data is a narrow waist simply because it’s pretty awkward for describing tabular data and documents. There is an amazing amount of tabular data in the world – i.e. every SQL database and every R program uses tabular data. Datalog is also built on relations / tabular data.
And the entire web uses semi-structured documents that are better represented by HTML than JSON (even if we were to start over, which is impossible). The whole JSON vs. XML debate was always silly, and I remember Steve Yegge has a good quote describing the difference: “Use XML when you have more text than data, and use JSON when you have more data than text”. They are just different things. Books are still represented as XML and that’s better than JSON. (Actually I just googled for this and my own comment from 2014 came up: https://news.ycombinator.com/item?id=7312572)
Here is a very long-winded meandering comment from January 2020 where I was thinking through these ideas for the Oil language. I claimed that Oil wouldn’t have types, and wouldn’t have serialization/deserialization, and it would deal with “concretions” directly. It’s the idea of directly manipulating serialized data, not doing a deserialize -> in-memory operations -> serialize dance.
https://news.ycombinator.com/item?id=22157587
But I’ve now gone back on that slightly. Oil does have Python/JS/Ruby-like types, i.e. a garbage collected heap, simply because they “won” (notably Perl and PHP got this wrong; they have poorly designed core data structures.)
However I still like the idea of using say CSS selectors (a DSL) to directly query documents. Not deserialize docs, then write code to traverse a DOM, then reserialize. Big graphs of pointers are expensive, and deserialization can and should be done lazily on portions of the input stream that are relevant to the query.
Ditto for tables. In SQL you don’t materialize an entire table in memory to query it – there is a VM that knows how to seek to only the parts of the table that matter (via indices, etc.)
In that comment I cited the paper “Unifying Tables, Objects and Documents” by Meijer et al. This influenced the design of the C# language, i.e. how it has built-in SQL-like queries with LINQ.
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.125.8343&rep=rep1&type=pdf
So Oil will also be about documents, objects, and tables – HTML, JSON, and QTT. This is actually an important refinement of the Perlis-Thompson principle – a single notion is not always a good thing! Documents, objects, and tables are different.
One of the examples I’m going to use in my blog post is Python/Clojure vs. Lua/Racket. Python has dicts and lists, while Lua only has tables. Racket has s-expressions, while Clojure adds maps.
I think Python and Clojure are better. So that directly contradicts the idea of a single notion. So I’ve taken the liberty of softening the Perlis-Thompson principle: it says use fewer notions, and when you introduce a new one, it needs to reduce to an existing notion. (Python and Clojure protocols like iterators accomplish this effectively – they bridge concepts)
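As a tiny illustration of a protocol bridging separate notions, here is a sketch in Python. The `longest` helper and the sample values are made up. The function relies only on the iterator protocol, so lists, dicts and generators all reduce to "something you can iterate over".

```python
def longest(items):
    """Return the longest item; depends only on the iterator protocol."""
    return max(items, key=len)

as_list = ["sh", "bash", "elvish"]
as_dict = {"sh": 2, "bash": 4, "elvish": 6}  # iterating a dict yields its keys
as_generator = (name for name in as_list)

results = [longest(as_list), longest(as_dict), longest(as_generator)]
```

Three different container notions, one consumer, and no conversion code in between.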
In the latest tour document, I describe Oil as oriented around “languages for data”.
https://www.oilshell.org/release/latest/doc/oil-language-tour.html#languages-for-data-interchange-formats
The other part of the Oil language where the Perlis-Thompson principle plays a role is the design of procs, processes, coprocesses, and functions.
This is not done yet – it’s one of the biggest remaining pieces of the language. But I think of processes and coprocesses as the “physical layer” – coprocesses are a startup time optimization. I claim coprocesses in Oil are better than bash and ksh coprocesses because they follow the Perlis-Thompson principle.
And then logically procs and funcs are on top of processes and coprocesses. Actually I am taking a cue from Elvish and using the stdout of processes/coprocesses/shared libraries as the “return value” of the function.
Originally I had procs and funcs as separate but equal abstractions. Procs looked like processes, and funcs looked like Python functions. This causes all sorts of composition problems, because now you have exit codes and exceptions too, etc. It multiplies the complexity of the language.
So now I think that defining funcs as syntactic sugar on procs, and having coprocesses and shared libraries (which bash has) as a “physical” optimization, is a much better design.
As a thought experiment I’d be interested in how you can use xargs -P 8 with Elvish functions, or if that’s not considered idiomatic? Is it better to use Elvish’s own parallelism?
Oil will have an “each” builtin that takes a block, but you are also allowed to use xargs and it’s exactly as convenient as it is in bash!
Anyway this is not all done yet, but this is probably 70% of the reason I’m writing about the Perlis-Thompson principle. The other 30% is the Borg/Kubernetes stuff, which I have an answer for :)
More of a note to myself, but there is a parallel with k8s here in terms of trying our best not to introduce incommensurable concepts. Here is a feeling I agree with:
https://news.ycombinator.com/item?id=27910897 (and see my reply agreeing with this)
Likewise for Oil, I don’t want to have too much of a split between external and internal, old and new. There is going to be some, but it should be minimized. The new stuff (Python-like data types and functions) must compose with the old stuff (processes, argv, env, exit codes, signals). Oil is still a thin layer over the kernel, not a big VM on top.
(I don’t have enough direct experience with PowerShell, Elvish, or nushell to judge them in this regard, but I would be interested in learning more.)
Here are some areas where we have some divergence, but the benefit may be worth the cost. POSIX sh already has a difference in that shell functions can mutate parent scopes, but external processes can’t. Oil is going to have expression-like arguments to procs specifically to support JSON:
https://lobste.rs/s/wprseq/on_unix_composability#c_gsvugk
But I have some semantic rules in mind to make things gracefully degrade / reduce, and not create separate worlds that need to be bridged.
This part of the nushell docs raised some eyebrows for me: https://www.nushell.sh/contributor-book/commands.html#communicating-between-internal-and-external-commands
They have a 2x2 matrix of internal and external. This is exactly what Ken Thompson was talking about – exponential complication, although you can argue that 2^2 is not that bad, and 3^2 or 4^2 would be worse :) !
I’m not necessarily saying it’s bad – the benefit could be worth the cost. But there is a cost. I would have to use it more to weigh the benefits vs. costs.
FWIW, Elvish doesn’t have explicit handling of interfacing internal/external commands. External commands behave exactly like internal ones, they just don’t accept value inputs or write value outputs. For example, the external echo behaves pretty much identically to the internal echo command.
I don’t dispute byte streams as “level 0”, but I say that the data types of numbers, strings, arrays and maps are a suitable “level 1” for the majority of applications. I use “JSON types” as a shorthand for these data structures, but I don’t imply anything about the actual encoding, and it should probably be a binary encoding for efficiency. Let me call it “universal exchange format” instead for clarity.
As an imperfect analogy, I think of byte streams as UDP/TCP and the universal exchange format as HTTP. UDP and TCP are generic enough to suit everything, but 90% of the applications can use HTTP rather than UDP/TCP. (This is not a good analogy because HTTP is mostly concerned about metadata rather than data, but it’s a good example of a “level 1” that can satisfy a lot of use cases.)
XML and HTML still use the same underlying structure of strings, arrays and maps (I don’t think they have numbers though). Attributes form a map and child nodes form an array. DOM API exposes these data structures; CSS selectors navigate these data structures. The surface representation is irrelevant.
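Here is a rough sketch of that claim using only Python's standard library. `TreeBuilder` and `values_for_key` are hypothetical helpers, the sample documents are invented, and void elements such as `<br>` are deliberately not handled. HTML is parsed into nested dicts and lists (attributes as a map, children as a list), and then one generic function pulls "href" values out of the HTML tree and "name" values out of a JSON tree.

```python
import json
from html.parser import HTMLParser

class TreeBuilder(HTMLParser):
    """Build nested dicts/lists from HTML: attrs form a map, children a list.

    Assumes well-formed input with no void elements (a simplification).
    """

    def __init__(self):
        super().__init__()
        self.root = {"tag": "root", "attrs": {}, "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.stack[-1]["children"].append(data.strip())

def values_for_key(node, key):
    """Yield every value stored under `key` anywhere in a dict/list tree."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from values_for_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from values_for_key(item, key)

builder = TreeBuilder()
builder.feed(
    '<p>See <a href="https://example.com/a">A</a> and '
    '<a href="https://example.com/b">B</a></p>'
)
builder.close()
html_tree = builder.root

json_tree = json.loads('{"users": [{"name": "alice"}, {"name": "bob"}]}')

# The same tool works on both, because both are just maps and lists.
hrefs = list(values_for_key(html_tree, "href"))
names = list(values_for_key(json_tree, "name"))
```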
Tables can be modelled as a list of maps where all the maps happen to have the same keys. This is less efficient than a specialized data frame format, but then this is exactly the kind of tradeoff encouraged by the Perlis-Thompson principle. A sophisticated universal exchange format implementation can support interned keys (so that the repeated keys take up minimal storage) and other features that help approximate the efficiency of a data frame.
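A minimal sketch of the table-as-list-of-maps idea, with made-up process data and hypothetical helpers `to_columns` and `to_rows`. Storing each key once and keeping one list per column is one simple way to approximate what key interning buys you, and the conversion is lossless in both directions.

```python
def to_columns(rows):
    """Convert a list of same-keyed maps into one list per column."""
    if not rows:
        return {}
    keys = list(rows[0])
    for row in rows:
        if set(row) != set(keys):
            raise ValueError("all rows must have the same keys")
    # Each key is now stored once, instead of once per row.
    return {key: [row[key] for row in rows] for key in keys}

def to_rows(columns):
    """Convert the columnar form back into a list of maps."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*(columns[k] for k in keys))]

# Hypothetical table: running processes.
rows = [
    {"pid": 1, "name": "init", "cwd": "/"},
    {"pid": 42, "name": "sshd", "cwd": "/"},
    {"pid": 77, "name": "vi", "cwd": "/home/alice"},
]

columns = to_columns(rows)
round_trip = to_rows(columns)
```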
There will always be a plethora of serialization formats in the world, but I envision a computing environment where you deserialize data exactly once (when you fetch from elsewhere) into a universal exchange format, and then use a set of standard tools to manipulate it. Run CSS selectors on JSON documents, or SQL on XML documents. Use the same tool to extract values of the "name" key from a JSON document, values of the "href" attribute from an HTML document, or working directories of all running processes.
Filesystems can also be modelled by the universal exchange format. A directory is a list of files, and a file is a map of attributes with keys such as name, owner, creation_time and content. You can find files with CSS selectors. Or batch rename files in the same way you would transform JSON documents.
You might say some of the use cases are going to be very inefficient, because you can’t just pass a whole filesystem between processes without risking OOM. But data can be materialized lazily and on demand, and consumers don’t need to hold on to the entirety of the data set (like how line-oriented UNIX tools typically only keep one line of data in memory). But this is something hard to implement purely in user space and that’s why I think kernel support is necessary. There is prior art - Clojure’s ISeq is lazy and all the sequence manipulations are implemented in terms of it. (At this point I realized that the universal exchange format is actually an abstract interface, not a concrete format. Like how UNIX files are actually an abstract interface.)
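To sketch the lazy, on-demand part within a single process, here is a Python generator pipeline. This is a user-space analogy, not the kernel support argued for above, and `fake_filesystem`, `where` and `rename` are invented names with invented data. The "filesystem" is unbounded, yet the consumer only ever materializes the three records it asks for.

```python
import itertools

def fake_filesystem():
    """Lazily yield one file record (a map) at a time; unbounded on purpose."""
    for i in itertools.count():
        yield {
            "name": f"file{i}.txt",
            "owner": "alice" if i % 2 == 0 else "bob",
            "size": i * 10,
        }

def where(records, key, value):
    """Keep only records whose `key` equals `value`, one at a time."""
    return (r for r in records if r[key] == value)

def rename(records, old_suffix, new_suffix):
    """Batch rename, expressed as a transformation on maps."""
    for r in records:
        if r["name"].endswith(old_suffix):
            yield {**r, "name": r["name"][: -len(old_suffix)] + new_suffix}
        else:
            yield r

pipeline = rename(where(fake_filesystem(), "owner", "bob"), ".txt", ".md")

# Only as many records are produced as the consumer pulls.
first_three = list(itertools.islice(pipeline, 3))
renamed = [r["name"] for r in first_three]
```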
In fact I claim that an OS structured around the universal exchange format adheres more to the Perlis-Thompson principle than UNIX does. In UNIX files and filesystems are two entirely different things. You can’t edit the filesystem with vi, or find files with grep, or batch-rename files with sed.
You can’t use xargs on Elvish functions. I’m actually curious how it works in Oil - this only works with $0 dispatching, and it’s starting a new instance of the script which doesn’t share any state with the current script, right?
(late reply)
OK so this is interesting and it actually justifies all the effort I spent writing about the Perlis-Thompson Principle (which still isn’t done yet)
On the wiki page I quote “everything is an X” by Luke Plant and my own comment on /r/ProgrammingLanguages
https://lukeplant.me.uk/blog/posts/everything-is-an-x-pattern/
https://old.reddit.com/r/ProgrammingLanguages/comments/lliyuo/are_there_any_interesting_programming_languages/
So the claim is that nearly ALL good languages have a narrow waist or lowest common denominator – they just don’t agree on what it is.
Even the 3 scientific languages disagree:
ALL of these are structured data. And they fundamentally disagree on what the narrow waist is. Or rather they each have their own narrow waist which is domain specific, but the lowest common denominator between all of them is text.
And if you try to write Mathematica code in R, or Matlab code in Mathematica, then it will be apparent very quickly why they are separate languages, and why they choose a different waist. Julia is getting some data frame features but they had to make MAJOR changes to the language to support it, and it’s still not as convenient / functional / usable as R (despite Julia being a vastly better language in general).
So then the thing I didn’t realize is that this idea also explains shells!
So this is basically where the claim on the wiki page comes from: the lowest common denominator between an Elvish program, a Nushell program, and a PowerShell program is a bash or Oil program :) I predict that this will actually happen and isn’t theoretical.
This is not to say bash or Oil is better, just that it sits at a more basic level of the “hourglass”. People rightly complain that working with text is cumbersome. It absolutely is but it’s also necessary to send anything across the network or persist it to disk.
Another example of people not agreeing on the narrow waist: .gob files are not protocol buffers are not JSON. Which are not Python pickle files. Go developers would love it if everything were gob files, etc. but then they have to talk to services written in Java or Ruby, and those Java programmers wish everything were Java serialized objects.
In other words, every language is biased toward its own data format. JSON has definitely emerged as a strong interchange format over the last 15 years, but it’s far from universal, and there’s millions of people who touch CSV way more than JSON (e.g. data scientists extracting from SQL databases and analyzing it with R. Ask them about JSON vs. CSV and you might get some blank stares)
I addressed the kernel issue here, but maybe it needs more elaboration.
http://www.oilshell.org/blog/2021/08/history-trivia.html#the-first-paper-about-unix-shell-thompson
Anyway this thread is long and old but I’d definitely be interested in continuing the conversation via e-mail, https://oilshell.zulipchat.com/ , or elsewhere. I think it is a very interesting topic that gets to the core of language design!