1. 8

    For the example, it’s actually really useful to allow a program to listen on either a Unix socket or a TCP socket just by changing a command-line flag. The string arg is useful when it’s passed straight through from users.
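
    A minimal sketch of that pattern in Go (the flag names here are mine, not from the example under discussion): net.Listen takes both the network and the address as plain strings, so the same call covers both cases.

    package main

    import (
        "flag"
        "log"
        "net"
    )

    func main() {
        // One listener call, switched entirely by command-line flags.
        network := flag.String("network", "tcp", `"tcp" or "unix"`)
        addr := flag.String("addr", "127.0.0.1:8080", "host:port or socket path")
        flag.Parse()

        ln, err := net.Listen(*network, *addr)
        if err != nil {
            log.Fatal(err)
        }
        defer ln.Close()
        log.Printf("listening on %s %s", ln.Addr().Network(), ln.Addr())
    }

    Run it with -network unix -addr /tmp/app.sock and nothing else about the program changes.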

    1. 9

      Yes. The fact that an address is a mostly-opaque string to programs allowed Plan 9 to add support for IPv6 without touching userspace networking APIs at all.

      Often, the best thing your program can do is treat data the user passes it as an opaque blob, for as much of its travel through your code as possible.

      1. 3

        Yeah I would call that a form of polymorphism, just like file descriptors.

        Unix could have had different data types for disk files, pipes, and sockets, instead of using integers, but then you would lose the polymorphism. You wouldn’t be able to write select(), and the shell couldn’t have redirects.

        It also has the same caveats, because there are some things you can do on disk files that you can’t do on the others (seek()).
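
        Here’s a rough Go rendering of both halves of that point (Go interfaces standing in for file descriptors): one generic copy works across files, pipes, and sockets, and the seek() caveat shows up as a runtime check rather than a compile-time guarantee.

        package main

        import (
            "fmt"
            "io"
            "net"
            "os"
        )

        // copyAll works for file->socket, socket->file, pipe->file, etc.,
        // because it only sees the shared interface, never the concrete type.
        func copyAll(dst io.Writer, src io.Reader) error {
            _, err := io.Copy(dst, src)
            return err
        }

        func main() {
            conn, err := net.Dial("tcp", "example.com:80")
            if err != nil {
                fmt.Println(err)
                return
            }
            defer conn.Close()

            io.WriteString(conn, "GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
            copyAll(os.Stdout, conn) // same call as for any other pairing

            // The caveat: *os.File is an io.Seeker, but a TCP conn is not.
            _, seekable := conn.(io.Seeker)
            fmt.Println("socket seekable?", seekable) // false
        }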

        Still, given the constraints of C, the untyped design is better.

        This relates to my comment about textual protocols in Unix: https://lobste.rs/s/vl9o4z/case_against_text_protocols#c_wsdhsm (which will hopefully appear on my blog in the near future)

        Basically text is “untyped” and therefore you get generic operations / polymorphism (copy, diff, merge, etc.).

        Types can inhibit composition.


        (I know somebody is going to bring up more advanced type systems. I still want to see someone write a useful operating system that way. I’m sure it can be done but it hasn’t so far AFAIK. Probably somewhat because of the expression problem – i.e. you want extensibility in both data types (files, sockets) and operations (read, write, seek) across a stable API boundary. )
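
        To spell out the expression problem in Go terms (a sketch, not a claim about any particular OS): with an interface, adding a new data type is cheap, but adding a new operation means touching the interface and every implementation on the other side of the API boundary.

        package main

        import "fmt"

        type Resource interface {
            Read() string
            Write(s string)
            // Adding Seek(offset int64) here would break File, Pipe,
            // and every third-party implementation at once.
        }

        type File struct{ buf string }

        func (f *File) Read() string   { return f.buf }
        func (f *File) Write(s string) { f.buf += s }

        // A new data type slots in without touching any existing code:
        type Pipe struct{ buf string }

        func (p *Pipe) Read() string   { return p.buf }
        func (p *Pipe) Write(s string) { p.buf += s }

        func main() {
            for _, r := range []Resource{&File{}, &Pipe{}} {
                r.Write("hello")
                fmt.Println(r.Read())
            }
        }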

        1. 1

          Basically text is “untyped” and therefore you get generic operations / polymorphism (copy, diff, merge, etc.).

          “Sir, this is an Arby’s.”

          (Go already has this problem. I do not see your point, nor your point about the expression problem, as the number of address types is enumerable and set in stone.)

          1. 2

            The “challenge” is to show me an operating system WITH fine-grained types that does NOT have the O(M * N) explosion of code. That is, show me how it solves the polymorphism problem. Here’s a canonical situation of M data types and N operations, and you can generalize it with more in each dimension (M and N both get bigger):

            • The data types: the operating system has persistent disk files, IPC like pipes, and networking to remote machines.
            • Now for the “operations”
              • How do you simultaneously wait on events from files, IPC, and networks? Like select() or inotify(). (There’s a Go sketch of this one just after the list.)
                • Ditto with signals and process exits – waitfd(), signalfd(), etc. (Unix fails here so Linux invented a few more mechanisms).
              • How do you do redirects? A shell can redirect from a file or a pipe. (Both Rob Pike and DJB complained about the lack of compositionality for sockets: https://cr.yp.to/tcpip/twofd.html. Pike doesn’t like the Berkeley socket API because it’s non-compositional. That’s why both Plan 9 and Go have something different AFAIK)
              • How do you copy from a disk to IPC, disk to network, IPC to network, etc.? In Unix, cat can work in place of cp. netcat also sort of works, but I think Plan 9 does better because networking is more unified.
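
            On the first operation, here’s the promised Go sketch (goroutines and a channel standing in for select(); not a syscall-level answer): because every source satisfies io.Reader, one wrapper and one wait loop cover files, pipes, and sockets alike.

            package main

            import (
                "bufio"
                "fmt"
                "io"
                "net"
                "os"
                "sync"
            )

            type event struct{ src, line string }

            // watch is one generic operation over many concrete types:
            // anything that satisfies io.Reader gets the same wrapper.
            func watch(name string, r io.Reader, ch chan<- event, wg *sync.WaitGroup) {
                defer wg.Done()
                sc := bufio.NewScanner(r)
                for sc.Scan() {
                    ch <- event{name, sc.Text()}
                }
            }

            func main() {
                ch := make(chan event)
                var wg sync.WaitGroup

                wg.Add(1)
                go watch("stdin", os.Stdin, ch, &wg) // a file, or a pipe via redirection

                if conn, err := net.Dial("tcp", "example.com:80"); err == nil {
                    fmt.Fprintf(conn, "GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
                    wg.Add(1)
                    go watch("net", conn, ch, &wg) // a socket, same wrapper
                }

                go func() { wg.Wait(); close(ch) }()

                for ev := range ch { // one wait loop multiplexes every source
                    fmt.Printf("[%s] %s\n", ev.src, ev.line)
                }
            }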

            The connection is:

            • When you have fine-grained types, you get O(M * N) problems, and the expression problem arises out of that. You have M data types and N operations.
            • If you have a single type like a file descriptor or a string, then you don’t have an O(M * N) problem. So you don’t have a composition problem, and the resulting explosion of code.

            In other words, types can inhibit composition. Again, not saying it can’t be done, but just that I haven’t seen an OS that addresses this.

            Plan 9 is more compositional precisely because it has FEWER types, not more. As mentioned, opaque strings are used as addresses.

            (Rich Hickey also has a great talk on this about the Java HTTP Request interface. It’s a type in the Java standard library that inhibits composition and generic operations. Link appreciated from anyone who knows what I’m talking about.)


            Kubernetes has a similar issue as far as I can tell:

            https://twitter.com/n3wscott/status/1355550715519885314

            I’m not familiar with the details, but I used the “predecessor” Borg for many years and it definitely has some composition problems. The Kubernetes ecosystem has a severe O(M * N) code explosion problem. Unix has much less of that, and Plan 9 probably has even less.

            Widely mocked diagram: https://twitter.com/QuinnyPig/status/1328689009275535360

            Claim: This is an O(M * N) code explosion due to lack of compositionality in Kubernetes.


            Also read Ken Thompson’s “sermonette” in his paper on the design of the Unix shell:

            https://lobste.rs/s/asr9ud/unix_command_language_ken_thompson_1976#c_1phbzz

            A program is generally exponentially complicated by the number of notions that it invents for itself. To reduce this complication to a minimum, you have to make the number of notions zero or one, which are two numbers that can be raised to any power without disturbing this concept. Since you cannot achieve much with zero notions, it is my belief that you should base systems on a single notion.

            “Single notion” means ONE TYPE. Now this is taken to an extreme – Unix obviously does have both file descriptors and strings. But it uses those concepts for lots of things that would be modelled as separate types in more naive systems.


            Many familiar computing ‘concepts’ are missing from UNIX. Files have no records. There are no access methods. User programs contain no system buffers. There are no file types. These concepts fill a much-needed gap.

            Records are types. Unix lacks records. That’s a feature and not a bug. Records should be and ARE layered on top.

            In distributed systems, types should be layered on top of untyped byte streams.
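
            A small Go sketch of that layering (the record type and field names are mine): the transport only ever sees opaque bytes, and the typed view is imposed at the edge by the decoder.

            package main

            import (
                "encoding/json"
                "fmt"
                "strings"
            )

            type Record struct {
                Name string `json:"name"`
                Size int64  `json:"size"`
            }

            func main() {
                // Any io.Reader works here: a file, a pipe, a socket. The
                // stream itself is untyped; Record exists only in this program.
                stream := strings.NewReader(`{"name":"a.txt","size":42}{"name":"b.txt","size":7}`)

                dec := json.NewDecoder(stream)
                for dec.More() {
                    var r Record
                    if err := dec.Decode(&r); err != nil {
                        panic(err)
                    }
                    fmt.Printf("%+v\n", r)
                }
            }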

            (This is going to be on the Oil blog. Many people have problems seeing this because it’s an issue of architecture and not code. It’s a systems design issue.)

            1. 1

              The problem is that different instances support different functionality. Seek works on some file descriptors, but not others. Ioctl is a horrific mess.

              Even COM was a better solution.

        2. 2

          without touching userspace networking APIs at all.

          This kind of statement sounds amazing, but has about 10 *’s next to it pointing out all the caveats. User space code still needed to change to account for new address inputs, otherwise who’s to say that “window.alert()” isn’t a valid address? You’re gonna pass arbitrary strings to that syscall?

          No. Create a constraint on the interface that ensures the caller isn’t an idiot.

          1. 4

            who’s to say that “window.alert()” isn’t a valid address?

            “window.alert()” IS a valid address for a unix domain socket. Maybe other things too.
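
            You can check this on any Unix-y system; a throwaway Go sketch:

            package main

            import (
                "fmt"
                "net"
            )

            func main() {
                // "window.alert()" is just a filename as far as AF_UNIX cares;
                // this creates a socket file by that name in the current directory.
                ln, err := net.Listen("unix", "window.alert()")
                if err != nil {
                    panic(err)
                }
                defer ln.Close()
                fmt.Println("listening on", ln.Addr()) // listening on window.alert()
            }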

            1. 1

              Unfortunate random string choice. :)

            2. 3

              You’re gonna pass arbitrary strings to that syscall?

              To the (userspace) connection server, which decides how to best reach the address specified, and returns an appropriate error if the caller provides garbage – yes. Why not? Having one place that does all the work shared across the system makes it easy to have a single, consistent, interface with one location to search for bugs.

              There are, of course, a few asterisks: for example, the DNS resolver needs to be taught how to handle AAAA records, 6in4 tunnels need to know how to encapsulate, etc. – but the programs that need this knowledge are generally the programs that provide the userspace APIs.

              No. Create a constraint on the interface that ensures the caller isn’t an idiot.

              Opacity is the strictest possible constraint on an interface. You may do nothing with the data, other than pass it on to something else. If the caller may not do anything, then the caller will not do anything stupid.

              1. 1

                Opacity is the strictest possible constraint on an interface. You may do nothing with the data, other than pass it on to something else. If the caller may not do anything, then the caller will not do anything stupid.

                OK? But it’s not at all opaque in these examples. The two parameters actually relate to each other…

                1. 1

                  Yes. That’s a poor choice in the Go API – it should have been a single opaque blob, instead of two dependent ones. Plan 9 does not make this mistake.
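
                  For flavor, here’s what a single-blob API could look like in Go, in the spirit of Plan 9’s dial(2) strings like tcp!1.2.3.4!9090 (dialString is hypothetical, not a real Go API):

                  package main

                  import (
                      "fmt"
                      "net"
                      "strings"
                  )

                  // dialString accepts one opaque net!host!service blob and
                  // does all the interpretation in a single place.
                  func dialString(addr string) (net.Conn, error) {
                      parts := strings.SplitN(addr, "!", 3)
                      if len(parts) != 3 {
                          return nil, fmt.Errorf("bad dial string %q", addr)
                      }
                      return net.Dial(parts[0], net.JoinHostPort(parts[1], parts[2]))
                  }

                  func main() {
                      conn, err := dialString("tcp!1.2.3.4!9090")
                      if err != nil {
                          fmt.Println(err)
                          return
                      }
                      defer conn.Close()
                  }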

              2. 3

                How much risk does that actually mitigate in practice, and how much toil does it create?

                It is not true that stronger type systems, or stronger constraints, are strictly better in all circumstances.

                1. 5

                  and how much toil does it create?

                  In the example, the senior programmer spent nearly a day on this. It could have been a type error.

                  It is not true that stronger type systems, or stronger constraints, are strictly better in all circumstances.

                  I am not a static types apologist, and prefer to write stuff in Lispy languages, which are often quite dynamic, but I will never concede that two parameters that have a dependency on each other, as in a pair such as (socket type, address), are best represented by two arbitrary strings instead of an enumeration carrying a constrained value. That’s an absurd thing to argue. You’ve created an infinitely large set of potential inputs and told the programmer “Don’t worry! We’ll tell you if you’re wrong when you run the program.” How silly, especially when that state space can be reduced significantly, and the usage checked by the compiler we’re already paying for.

                  1. 2

                    In the example, the senior programmer spent nearly a day on this. It could have been a type error.

                    Yep, that’s definitely toil, and it definitely could have been prevented by the compiler. But what about the toil that those type system features would bring to every other aspect of programming in the language? How do you measure that, and how do you weigh it? And not in the abstract, either, but in the specific context of Go as it exists today. Honest questions. I don’t know! But I do know it’s important to do, if you want a true reckoning of the calculus.

                    Of course, this could also have been caught and prevented by a unit test. Or an integration test. Or during code review. Or by using an architectural model that was a better fit to the problem. There are many tools available to software engineering teams to mitigate risks, and each tool carries its own costs and benefits as it is applied in different contexts. If the game is mitigating risk — which it is — then not everything needs to be solved at the language level.

                    I will never concede that two parameters that have a dependency on each other, as in a pair such as (socket type, address), are best represented by two arbitrary strings instead of an enumeration carrying a constrained value.

                    Sure, that sounds good to me, but it’s much more abstract than the specific claim made in the article, which is that

                    net.Dial("tcp", "1.2.3.4:9090")
                    

                    should become

                    net.Dial(net.TCPIPAddr{
                        IP:   net.IPv4(1, 2, 3, 4),
                        Port: 9090,
                    })
                    

                    This would certainly convert a class of bugs that are currently runtime errors into compile time errors. But it would also make every use of net.Dial substantially more laborious. Is the benefit worth the cost? I don’t think the answer is obviously yes or no.

                  2. 3

                    I don’t think it’s possible to have a clear answer to this—but FWIW, my experience is that in ecosystems where unioned string literal types are deployed widely (notably in TypeScript), there is effectively no meaningful dissent about whether that solution is strictly better than naked-string-only typing, in every practical measurable dimension. It makes docs better, it makes tooling and autocomplete better, it seems to prevent bugs, people generally just seem to like using it, and there are no appreciable downsides in practice (compile time impact is negligible, no implications for generics, etc.).

                    I understand that in Go this does not quite pass muster, because Go’s methodology for evaluating language features implicitly (and significantly) discounts what most other languages would call “ergonomic” improvements. Other ecosystems are willing to rely on intuition for deciding what sorts of day-to-day activities are worth improving, and in those ecosystems, it is harder to argue that we should not make better type amenities around the most important data type in the field of CS (i.e., the string) for things that people do all the time (e.g., pass in “stringly-typed” arguments to functions), especially when there are no significant downsides (e.g., compilation speed, design subtlety, etc.).
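
                    For what it’s worth, Go can approximate part of this today with a defined string type plus constants; a sketch (the dial wrapper is hypothetical, and nothing stops a caller from converting an arbitrary string):

                    package main

                    import "net"

                    type Network string

                    const (
                        TCP  Network = "tcp"
                        UDP  Network = "udp"
                        Unix Network = "unix"
                    )

                    // dial documents intent and helps autocomplete, but it is
                    // not a closed union the way a TypeScript literal type is.
                    func dial(n Network, addr string) (net.Conn, error) {
                        return net.Dial(string(n), addr)
                    }

                    func main() {
                        if conn, err := dial(TCP, "1.2.3.4:9090"); err == nil {
                            conn.Close()
                        }
                    }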

            1. 7

              The irony is that he’s now trying to build better tools that use embedded DSLs instead of YAML files, but the market is so saturated with YAML that I don’t think the new tools he’s working on have a chance of gaining traction, and that seems to be the major source of the angst in that thread.

              One of the analogies I like for the software ecosystem is yeast drowning in the byproducts of their own metabolic processes after converting sugar into alcohol. Computation is a magical substrate, but we keep squandering the magic. The irony is that Michael initially squandered the magic, and in the new and less magical regime his new tools don’t have a home. He contributed to the code-less mess he’s decrying, because Ansible is one of the buggiest and slowest infrastructure-management tools I’ve ever used.

              I suspect that, as with all hype cycles, people will figure it out eventually: Ant used to be a thing, and now it is mostly accepted that XML for a build system is a bad idea. Maybe eventually people will figure out that infrastructure-as-YAML is not sustainable either.

              1. 1

                What alternative would you propose to DSLs or YAML?

                1. 4

                  There are plenty of alternatives. Pulumi is my current favorite.

                  1. 3

                    Thanks for putting Pulumi on my radar; I hadn’t heard of it before. It seems quite close to what I’m currently trying to master, Terraform. So I ended up here: https://pulumi.io/reference/vs/terraform.html – where they say

                    Terraform, by default, requires that you manage concurrency and state manually, by way of its “state files.” Pulumi, in contrast, uses the free app.pulumi.com service to eliminate these concerns. This makes getting started with Pulumi, and operationalizing it in a team setting, much easier.

                    Which to me seemed rather dishonest. Terraform’s way seems much more flexible and doesn’t tie me to HashiCorp if I don’t want that. Pulumi seems like a modern SaaS money vacuum: https://www.pulumi.com/pricing/

                    The positive side, of course, is that doing many programmatic things in Terraform HCL is quite painful, as all non-Turing-complete programming tends to be when you stray from the path the language designers built for you… Pulumi obviously handles that much better.

                    1. 3

                      I work at Pulumi. To be 100% clear, you can absolutely manage a state file locally in the same way you can with TF.

                      The service does have a free tier, though, and if you can use it, I think you should, as it is vastly more convenient.

                      1. 3

                        You’re welcome to use a local state file the same way as in Terraform.

                      2. 1

                        +100000

                  1. 3

                    This is a description of Bing’s production search engine.

                    1. 6

                      I’m an author on this paper. 2 things:

                      1. A copy-paste-with-some-cleanup version of the code that serves production traffic in Bing is available here: https://github.com/BitFunnel/BitFunnel. This was mostly done by Dan and Mike.
                      2. Bing is actually multiple search indexes. So, BitFunnel serves every production query, but it “only” maintains the “SuperFresh” index, which is to say, the index of documents that need to be updated really frequently.
                    1. 7

                      I find it very odd that he’s completely chucking out PowerShell. PowerShell, both as a scripting language and as an interactive shell, is actually one of the best environments I’ve ever used. It’s definitely not perfect, but it honestly gets a lot of things right. The trivial extensibility from .NET, the entire remoting/workflow system, things like Out-GridView, the Integrated Scripting Environment (ISE) for writing scripts… seriously, they really got a hell of a lot of things right.

                      I’m really excited to have bash on Windows because it means that bash is now the lowest common denominator for a quick script (v. writing separate PowerShell/batch and bash scripts), but if you’re just talking about day-to-day usability, I don’t actually think bash helps a ton.

                      1. 11

                        Author here.

                        I’m not sure what you mean when you say I’m completely chucking out PowerShell. I called it “the cure for polio” and Jeff Snover “the Jonas Salk of the Windows ecosystem”. To be completely honest, I feel like I was a little hard on Bash if anything.

                        1. 5

                          I think I took the first one quite differently: in context, I took it to mean “a great cure for something else” (i.e., fixing the wrong problem). I’ve heard a lot of devs say that (“it’s a better WSH, but we didn’t need a better WSH”, for example, or “it’s a better shell, but the console subsystem is still crap”, and so on), so maybe that’s where my head was at. The post being called “the Windows command line,” and not “cmd.exe”, seemed to cement that.

                          At any rate, I wasn’t trying to mischaracterize your writing. People always glom onto random parts of my posts, extracting a meaning I not only didn’t intend but actively disagree with. Sorry I was the one doing it here.

                        2. 1

                          I don’t think he’s throwing it out per se, I just think he’s not really talking about it (and using a slightly click-bait-y headline). I mean, at the end he says your choices are batch, bash (now), and PowerShell. Fairly clearly, only one of those is no longer the right choice.

                        1. 4

                          OSv is a new kernel written in C++ with Linux compatibility. They claim 2x throughput for unmodified Redis.

                          http://osv.io/benchmarks/

                          1. 2

                            Yes, but the point is that we want lower latency, not higher throughput.

                            1. 3

                              I think 2x throughput will also mean 0.5x latency in this case.

                            2. 1

                              Oof. Looks like that OS doesn’t support users, which might be at least moderately reasonable. But it might not support processes, which might make certain servers difficult to handle. Maybe it silently translates all forks into thread spawns?

                              Anyway, interesting idea, but there are some tradeoffs I’m not yet comfortable with.

                            1. 4

                              Very interesting, thanks for posting this.

                              For someone who hasn’t had the chance to read through all the documentation (yet), what are the main ways Bond differs from Protocol Buffers?

                              1. 9

                                Hey, OP here.

                                The current offerings (Thrift, Protobuf, Avro, etc.) tend to have similar opinions about things like schema versioning, and very different opinions about things like wire format, protocol, performance tradeoffs, etc. Bond is essentially a serialization framework that keeps the schema logic the same, but makes things like wire format, protocol, etc., highly customizable and pluggable. The idea is that instead of deciding Protobuf isn’t right for you, tearing it down, and starting Thrift from scratch, you just change the parts you don’t like and keep the underlying schema logic the same.

                                In theory, this means one team can hand another team a Bond schema, and if they don’t like how it’s serialized, fine: they can change the protocol, and the schema doesn’t need to change.

                                The way this works, roughly, is as follows. For most serialization systems, the workflow is: (1) you declare a schema, and (2) they generate a bunch of source files to de/serialize data, which you add to your project and compile into the programs that need to serialize and deserialize data.

                                In Bond, you (1) declare a schema, and then (2) instead of generating source files, Bond will generate a de/serializer using the metaprogramming facilities of your chosen language. So customizing your serializer is a matter of using the Bond metaprogramming APIs to change the de/serializer you’re generating.
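
                                For reference, the schema side looks roughly like this (adapted from the shape of the examples in the project’s README; the fields here are illustrative):

                                namespace examples

                                struct Record
                                {
                                    // Ordinals version the schema; types map per target language.
                                    0: string Name;
                                    1: vector<double> Constants;
                                }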

                              1. 9

                                The short, unhelpful answer is that it depends on what you’re doing.

                                The somewhat longer, more useful answer is that there’s no good way to do this, so you should do it only a couple of times if you can. Concretely, in the OSS web infrastructure world (which is where I am from), you will typically pick a small set of very flexible infrastructure projects that you know really well, and deploy them everywhere, for as many things as you can. It’s easier to locate errors. It’s easier to deploy. It’s simpler to reason about the infrastructure.

                                A concrete example is, if you have a really kickass Hadoop team, then it’s worth it to phrase your problems as MapReduce jobs if you can, even if it’s a slight abuse of Hadoop, because then you can just farm it out to your Hadoop cluster, and your problem is solved incidentally by your Hadoop team. Same goes for Redis, Riak, RabbitMQ, whatever.

                                Another thing to consider is that, in most cases, your team’s competence will limit you much sooner than your stack will. This is another reason to make big infrastructure choices as rarely as possible: it lets you deal primarily with one issue (your team’s competence) rather than two issues (competence AND a crazy stack that you don’t understand).

                                1. 4

                                  Reminds me that a coworker at Fog Creek fixed a bug using reflection once, which is perhaps a shade simpler than rebuilding the whole DLL. We reported a bug to MS; they basically sent us here and said the referenced reflection fix was as good as we would get for our version of .NET, since they only backport the super-serious fixes.

                                  1. 6

                                    Author here.

                                    One of the advantages of working at MS (btw, I work at MS) is that I can just bother my friends until they fix it, and then grab the newest version on the release branch!

                                    Or that’s how it would work if I was not an incredibly impatient man.

                                  1. 4

                                    The abstracts are here[1] – it’s sort of hard to tell which talks you’ll like a priori without them.

                                    [1] http://bangbangcon.com/speakers.html

                                    1. 1

                                      Continuation-passing style is a powerful and mind-warping technique that lets code play with its own control-flow (its “future”, so to speak). For example, it lets you elegantly express backtracking search algorithms such as regular expression matching. This curious technique also has deep connections to topics as diverse as compiler optimization, programming language design, and classical versus constructive logic.

                                      I’ve been interested in this for a while but didn’t know the name for it. Thanks.
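
                                      If it helps anyone else, here’s a minimal CPS sketch in Go (my own toy, not from the talk): each matcher takes a continuation meaning “what to do with the rest of the input,” and backtracking falls out of an alternative’s continuation failing.

                                      package main

                                      import "fmt"

                                      // cont is the continuation: what to do with the remaining input.
                                      type cont func(rest string) bool

                                      type matcher func(s string, k cont) bool

                                      // lit matches one literal byte, then hands the rest to k.
                                      func lit(c byte) matcher {
                                          return func(s string, k cont) bool {
                                              return len(s) > 0 && s[0] == c && k(s[1:])
                                          }
                                      }

                                      // alt tries a; if a (and its whole continuation) fails, it backtracks to b.
                                      func alt(a, b matcher) matcher {
                                          return func(s string, k cont) bool {
                                              return a(s, k) || b(s, k)
                                          }
                                      }

                                      // seq runs a, then b, by threading b into a's continuation.
                                      func seq(a, b matcher) matcher {
                                          return func(s string, k cont) bool {
                                              return a(s, func(rest string) bool { return b(rest, k) })
                                          }
                                      }

                                      func main() {
                                          // (ab|a)c: matching "ac" forces backtracking out of the "ab" branch.
                                          re := seq(alt(seq(lit('a'), lit('b')), lit('a')), lit('c'))
                                          match := func(s string) bool {
                                              return re(s, func(rest string) bool { return rest == "" })
                                          }
                                          fmt.Println(match("abc"), match("ac"), match("ab")) // true true false
                                      }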