1. 42

    1. 9

      Interesting that E is cited under “capabilities”, but not under “loosen up the functions”. E’s eventual-send RPC model is interesting in a number of ways. If the receiver is local then it works a bit like a JavaScript callback in that there’s an event loop driving execution; if it’s remote then E has a clever “promise pipelining” mechanism that can hide latency. However E didn’t do anything memorable (to me at least!) about handling failure, which was the main point of that heading.

      For “capabilities” and “A Language To Encourage Modular Monoliths”, I like the idea of a capability-secure module system. Something like ML’s signatures and functors, but modules can’t import, they only get access to the arguments passed into a functor. Everything is dependency injection. The build system determines which modules are compiled with which dependencies (which functors are passed which arguments).
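      A minimal TypeScript sketch of the idea (all names here are hypothetical): a “module” is a factory that receives every capability it may use, and the composition root plays the role of the build system wiring functors to their arguments.

      ```typescript
      // Hypothetical sketch: a "module" can't import anything ambient; it only
      // sees the capabilities handed to it, like a functor receiving arguments.
      interface Clock { now(): number; }
      interface Log { write(line: string): void; }

      function makeGreeter(deps: { clock: Clock; log: Log }) {
        return {
          greet(name: string): string {
            const msg = `hello ${name} at ${deps.clock.now()}`;
            deps.log.write(msg);
            return msg;
          },
        };
      }

      // The composition root (standing in for the build system) decides which
      // implementations each module receives.
      const lines: string[] = [];
      const greeter = makeGreeter({
        clock: { now: () => 0 },
        log: { write: (l) => lines.push(l) },
      });
      ```

      The greeter can’t reach the real clock, the filesystem, or anything else it wasn’t handed, which is the capability-security property.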

      An existing “semi-dynamic language” is CLOS, the Common Lisp object system. Its metaobject protocol is designed so that there are clear points when defining or altering parts of the object system (classes, methods, etc.) at which the result is compiled, so you know when you pay for being dynamic. It’s an interesting pre-Self design that doesn’t rely on JITs.

      WRT “value database”, a friend of mine used to work for a company that had a Lisp-ish image-based geospatial language. They were trying to modernise its foundations by porting to the JVM. He had horror stories about their language’s golden image having primitives whose implementation didn’t correspond to the source, because of decades of mutate-in-place development.

      The most common example of the “value database” or image-based style of development is in fact your bog standard SQL database: DDL and stored procedures are very much mutate-in-place development. We avoid the downsides by carefully managing migrations, and most people prefer not to put lots of cleverness into the database. The impedance mismatch between database development by mutate-in-place and non-database development by rebuild and restart is a horribly longstanding problem.

      As for “a truly relational language”, at least part of what they want is R style data frames.

      1. 4

        Something like ML’s signatures and functors, but modules can’t import, they only get access to the arguments passed into a functor. Everything is dependency injection. The build system determines which modules are compiled with which dependencies (which functors are passed which arguments).

        MirageOS does exactly this with Functoria! I think MirageOS unikernels largely use this pattern to ensure that they can run on a wide range of targets, be it hypervisors, bare metal, or Unix binaries, without modifications to the application code, but it could also be seen as a form of capabilities.

        1. 2

          [The CLOS MOP is] an interesting pre-Self design that doesn’t rely on JITs.

          I believe all reasonably fast MOP implementations rely on compiling dispatch and method combination lambdas at runtime, in response to actual method call arguments. When is runtime compilation not just-in-time? ;)

          Of course, with image dumping, it’s possible to warm CLOS caches and save the result to disk, so runtime code generation isn’t mandatory.

          1. 2

            Yeah, maybe I should have qualified JIT with Self-style too :-) The distinction I was getting at is whether compilation happens at class or method mutation time (which is roughly what CLOS is designed for) or at call time (which is the Self / Strongtalk / Hotspot / v8 tradition).

            1. 1

              When you call a CLOS generic with a combination of types that you haven’t used before, it may need to figure out a new effective method, which could involve compiling a new lambda and caching it.
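              A toy model of that call-time path, sketched in TypeScript (this illustrates the caching pattern only, it is not real CLOS): the first call with a new combination of argument types “compiles” an effective method and caches it; later calls hit the cache.

              ```typescript
              // Toy model of effective-method caching: the first call with a new
              // combination of argument types builds the effective method; later
              // calls reuse the cached one.
              type Method = (a: unknown, b: unknown) => string;
              const methods = new Map<string, Method>([
                ["number,number", (a, b) => `sum ${(a as number) + (b as number)}`],
                ["string,string", (a, b) => `concat ${(a as string) + (b as string)}`],
              ]);
              const cache = new Map<string, Method>();
              let builds = 0; // counts how often we "compile" an effective method

              function dispatch(a: unknown, b: unknown): string {
                const key = `${typeof a},${typeof b}`;
                let effective = cache.get(key);
                if (effective === undefined) {
                  builds++; // stands in for compiling a new lambda at call time
                  effective = methods.get(key)!;
                  cache.set(key, effective);
                }
                return effective(a, b);
              }
              ```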

        2. 7

          Regarding E and capabilities, those ideas are still alive! Some of E’s concepts like promises made their way into JavaScript because people like Mark Miller were/are involved in both, and basically the rest of E is being brought to JavaScript with the Endo project. Spritely, the org I work for, is bringing many ideas from E to Scheme with our Goblins project.

          1. 6

            I’ve had success using Julia as a semi-dynamic language. It was designed with dynamic semantics and the capability to JIT compile idiomatic code to efficient machine code. I use it (in production) for both staged programming and dynamically “compiling” user-provided logic at runtime (unlike e.g. C++ plus Lua, the user-provided logic runs at full speed, albeit with JIT latency the first time it runs).
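            For comparison, the same “compile user logic at runtime” pattern in a JS/TS runtime looks roughly like the sketch below (illustrative only; the engine’s JIT then compiles the generated function like any other):

            ```typescript
            // Sketch of compiling user-provided logic at runtime.
            // WARNING: new Function executes arbitrary code -- trusted input only.
            function compileUserFormula(expr: string): (x: number) => number {
              return new Function("x", `return (${expr});`) as (x: number) => number;
            }

            const f = compileUserFormula("x * x + 1");
            ```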

            1. 5

              Julia has one of the most interesting compilation models of recent languages, the way it uses multiple dispatch is really clever. It’s a terrible shame it is so undermined by poor startup performance. (It’s getting better, I think?)

              1. 2

                Definitely getting much better since it now supports precompilation (native code caching). I doubt you’d be troubled at all using it in a Jupyter notebook or REPL for interactive analytics, plotting, experimenting with calling APIs, etc.

                The tradeoff is that there is a bit of compilation time when you update a whole bunch of external packages. The experience is a bit like running cargo in Rust: it only happens once. (It feels faster to me than Rust compilation, but it’s hard to come up with an equivalent test.)

                And you can precompile your app to a binary now (it’s quite large, but work is being done on intelligent pruning, which will reduce RAM usage since you won’t bring in LLVM, BLAS, etc. unless you call for them).

              2. 1

                Also, I think Zig might be good for the modular monoliths? It supports informal but verified interfaces for dependency injection: you don’t define a contract/interface in advance, yet the code is type-checked however it is invoked.

                People seem to either love or hate that part of Zig, on the basis that a formalized interface is good developer documentation and appropriate for writing non-monolithic software. I actually think it makes perfect sense in Zig, where for non-monolithic programs (dynamic linking and the like) you’d probably write a concretely-typed extern “C” interface rather than specify trait bounds.

                1. 1

                  Do you have a link to some description of how that works?

                  1. 4

                    What you are looking for is the anytype keyword. You can allow a function argument to be literally anything; when the code is compiled, a version of the function is generated and type-checked for each concrete type passed in, using automatic type inference.

                    https://ziglang.org/documentation/0.13.0/#Function-Parameter-Type-Inference

                    Together with the comptime keyword that (amongst other things) lets you work with generic types (parameterized types) you can do basically anything you are used to in other languages with generics.

                    The downside is that there are no explicit constraints on what type can be passed into an anytype slot. It’s all implicit from usage and whether the “inferred” function compiles. To see what I mean by “either love that or hate that part of Zig”, here’s a post from someone who is not a fan:

                    https://typesanitizer.com/blog/zig-generics.html
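                    For contrast, here’s the explicit-constraint route in TypeScript, where the required shape is written down up front (the `Writer` interface here is made up for illustration); Zig’s anytype leaves this contract implicit:

                    ```typescript
                    // The explicit-contract alternative: callers can read the
                    // constraint (`Writer`) instead of inferring it from how
                    // the parameter happens to be used inside the function.
                    interface Writer { write(s: string): number; }

                    function logTo<W extends Writer>(w: W, msg: string): number {
                      return w.write(msg + "\n");
                    }

                    // Any structurally compatible object satisfies the bound.
                    const buf: string[] = [];
                    const fakeWriter = {
                      write(s: string) { buf.push(s); return s.length; },
                    };
                    ```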

              3. 5

                I think MUMPS has the global database feature, which was its killer feature for a long time.

                1. 3

                  I see capabilities as more of an API feature than something baked into the language. They’re just unforgeable tokens that act as credentials for services. Ordinary objects work well for that, provided the language prevents forging them. (Or if the capabilities are managed by something outside the process, like Unix file descriptors or Mach ports.)

                  The cool thing about E-like languages/libraries is that they support RPC using long-lived capabilities, and the capabilities can be transferred & delegated securely over the network. (But again, this doesn’t need to be part of the language itself. Spritely is implementing it in Scheme. Cap’nProto does part of it in C++ with many language bindings.)
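                  A small sketch of that in TypeScript (illustrative names): a closure-held counter is an unforgeable capability, and handing out only a read facet is attenuated delegation.

                  ```typescript
                  // An ordinary object as a capability: nothing outside the
                  // closure can reach `n`, and the read-only facet grants
                  // strictly less authority than the full object.
                  function makeCounterCap() {
                    let n = 0;
                    const full = { read: () => n, incr: () => { n += 1; } };
                    const readOnly = { read: full.read }; // attenuated facet
                    return { full, readOnly };
                  }

                  const cap = makeCounterCap();
                  ```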

                  1. 3

                    Ordinary objects are indeed (or can be) capabilities. See Rees’s W7, http://mumble.net/~jar/pubs/secureos/secureos.html .

                  2. 3
                    • Loosen up the functions: while I blogged about this six years ago, it was even longer ago (around twenty years) that I first thought about this. The two axes I identified: one is value (copy) vs. reference (pointer); the other is synchronous (wait for reply) vs. asynchronous (send, continue running). QNX does a synchronous call by value; AmigaOS does asynchronous by reference. The idea is that a function call is similar to a synchronous call (it can be either by value or by reference depending upon the ABI), and we don’t really have a model (at the CPU level) of an asynchronous function call.

                    • Capabilities: nice in theory; in practice, I hated it. Back in the early 2000s, whenever I installed a Linux system, the very first thing I would do was disable SELinux. For me, it caused more problems than it ever solved, probably because it assumed a particular usage of Linux that never matched what I wanted to do. I’m not sure I would want to use a language with capabilities baked in.

                    • Production-Level Releases: this and the section A Language To Encourage Modular Monoliths are similar to me. A way to say “I want this functionality, but I don’t care how I get it.” To me, this is what Enterprise Java is all about—FactoryFactoryFactory objects and what not. Or rather, perhaps this is better abstracted as a “protocol” vs. an “interface”. Or perhaps “protocol” and “interface” are similar—I don’t care about the implementation as long as it follows the “protocol/interface”. Thinking this way, it also brings in the Modular Linting section.

                    • Semi-Dynamic Language: I’ve been wondering lately about gradual typing in a language. In my experience with dynamically typed languages, rare is the type of a parameter or variable that changes—it’s usually either a number, or a string, or some collection (array, hash, what have you). It might be nice to have a language where you don’t have to supply the type up front, but as the project matures, you can add types (or have the compiler/IDE inform you of the inferred type via usage). One of my largest complaints against dynamic languages is the collection type—it’s hard to enforce that this object is the “user object”.

                    • Value Database: this to me reads as “an initialized data segment of a program that is saved upon exiting”. A type of persist unsigned int blah = 5 where the underlying system knows to load and save this segment (somehow) of the program when running it. One problem I see with this is, as he says, entropy over time. How do you refresh this?

                    • A Truly Relational Language: way back in college, when I took the database course, I recall embedding SQL in C using a special compiler. You could include SQL statements in C, and the compiler would translate the embedded SQL to C. It could also be that I’m misremembering how this worked but I was always curious as to why this wasn’t done more often. Probably because using yet another tool on source code wasn’t seen as a good idea. But the tool (as I recall) would ensure the input was sanitized and map the results of an SQL query to a C structure.

                    • A Language To Encourage Modular Monoliths: see above

                    • Modular Linting: see above
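                    On the “value database” bullet: a toy version of that persist idea can be faked in userland. The sketch below (TypeScript, invented API) loads the value from a file if a saved copy exists and writes it back on every update; a real value database would do this in the runtime.

                    ```typescript
                    import * as fs from "node:fs";

                    // Toy "persist unsigned int blah = 5": the value survives
                    // restarts because every update is written back to disk.
                    function persist<T>(path: string, initial: T): { get(): T; set(v: T): void } {
                      let value: T = fs.existsSync(path)
                        ? (JSON.parse(fs.readFileSync(path, "utf8")) as T)
                        : initial;
                      return {
                        get: () => value,
                        set: (v: T) => { value = v; fs.writeFileSync(path, JSON.stringify(v)); },
                      };
                    }

                    const file = "/tmp/persist-demo.json";
                    try { fs.unlinkSync(file); } catch {}   // start fresh for the demo
                    const blah = persist(file, 5);
                    ```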
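                    And on embedded SQL: the modern descendant of that preprocessor trick is a tagged template that keeps query text and parameters separate, so input is sanitized by construction. A sketch (this `sql` helper is made up, though real libraries along these lines exist):

                    ```typescript
                    // Interpolated values never enter the query text; they become
                    // numbered parameters, which is what made the embedded-SQL
                    // tool safe against injection.
                    function sql(strings: TemplateStringsArray, ...values: unknown[]) {
                      const text = strings.reduce(
                        (acc, s, i) => acc + s + (i < values.length ? `$${i + 1}` : ""),
                        "");
                      return { text, values };
                    }

                    const name = "O'Brien";
                    const q = sql`SELECT id FROM users WHERE name = ${name}`;
                    ```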

                    1. 15

                      SELinux is not a capability-based system. The important thing about a capability-based system is that there is no “ambient authority”, i.e. there is nothing available everywhere (ambient, for example the filesystem API) that gives you access to everything (authority, e.g. the filesystem root). A capability-based filesystem API such as Capsicum instead uses file descriptors as capabilities, and provides calls like openat() and fstatat() that access the filesystem relative to an existing capability, instead of using a global namespace (Unix style) or a per-process namespace (Plan 9 style).

                      Inside a programming language, capabilities look like dependency injection instead of module imports. In an OO language, capability-security means you must not have APIs that allow code to traverse the object hierarchy. Everything is sandboxed by default so (in the absence of bugs!) you can mix and match code with different levels of trust in different scopes.
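                      A sketch of what that looks like as code (TypeScript, invented API): file access happens only through a directory handle you already hold, and there is no call that accepts a global path.

                      ```typescript
                      // No ambient authority: the only way to reach a file is via
                      // a DirCap you were handed; no API takes an absolute path.
                      interface FileCap { read(): string; }
                      interface DirCap {
                        openFile(name: string): FileCap;  // relative to this dir only
                        openDir(name: string): DirCap;
                      }

                      function makeDirCap(tree: Record<string, unknown>): DirCap {
                        return {
                          openFile(name) {
                            const v = tree[name];
                            if (typeof v !== "string") throw new Error("not a file: " + name);
                            return { read: () => v };
                          },
                          openDir(name) {
                            const v = tree[name];
                            if (typeof v !== "object" || v === null) throw new Error("not a dir: " + name);
                            return makeDirCap(v as Record<string, unknown>);
                          },
                        };
                      }

                      // A sandboxed component might receive only the "etc" capability.
                      const root = makeDirCap({ etc: { motd: "hello" } });
                      const etc = root.openDir("etc");
                      ```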

                      1. 6

                        There’s some overloading going on with the word “capability”. SELinux capabilities have nothing to do with the capabilities mentioned in this article that were part of the E language.

                        1. 4

                          Capabilities: nice in theory, in practice, I hated it. Back in the early 2000s, whenever I install a Linux system, the very first thing I would do is disable SELinux

                          That’s a confusing thing to say, because SELinux is a MAC framework, not a capability framework. Most programming languages have things that are almost capabilities: pointers are (approximately) capabilities that grant access to an object. If you don’t hold a pointer to an object, you can’t access the object (unless your language is not memory safe). You cannot fabricate pointers out of thin air (unless the language is unsafe). You can pass a pointer to other functions and grant them access to the object.

                          If you can understand pointers, you can understand capabilities.

                          Most capability systems also have rich permissions on capabilities. For example, being able to hand someone a (deep or shallow) read-only or write-only view of an object is useful. Unlike ACL-based approaches, the permissions are directly associated with the thing that you’re using, so the principle of intentional use is trivial.

                          Capsicum is in a similar problem space to SELinux and is a capability system. It removes ambient authority (no access to global namespaces from your process) and requires you to have a file descriptor to an OS object before you can open it. File descriptors get a richer set of permissions. If you want to open a new file that isn’t in a directory for which you have a descriptor, for example, you send a message to a power box process and that process provides a file dialog and asks the user to select the file, then hands you back a file descriptor. At each step, it’s obvious in the source code what rights you are exercising.

                          Contrast this with Apple’s sandbox framework, which rewrites ACLs in the background to grant you access to things. Will an open system call work? It depends on what you did beforehand. In contrast, the Capsicum version just hands you a file descriptor, and if you have a file descriptor then you can use it.

                        2. 2

                          I’ve laid claim to modular linting

                          1. 2

                            I have some thoughts about the “modular monolith” points.

                            I’d be interested in something that strikes a middle ground: a static language with compile-time guarantees, but one where all function parameters are automatically interfaces, even if they are given an “exemplar” type in their type signature.

                            I think languages with a structural type system kind of get you there.

                            In TypeScript if you define a function that takes a Rectangle, even if that Rectangle is defined as a class in an external library you could technically pass in any object that satisfies the interface that the Rectangle class defines. One of the issues with this approach is that even if the function only uses methods a, b and c of that interface, you’re gonna have to define all the methods the interface defines to satisfy the type checker.

                            The author argues that this could be solved by extracting the subset of the methods that are used by the function into a bespoke interface at type checking time. To me that sounds like a recipe for extremely long compile times (it reminds me of Swift’s problems with overloading) and inscrutable inference errors, but it’s an interesting idea.
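                            Today you can at least do that extraction by hand with `Pick`; a sketch:

                            ```typescript
                            // Manual version of the "bespoke interface" idea: the
                            // function asks only for the members it actually uses,
                            // so any structurally compatible object passes, not
                            // just a real Rectangle.
                            class Rectangle {
                              constructor(public w: number, public h: number) {}
                              area(): number { return this.w * this.h; }
                              perimeter(): number { return 2 * (this.w + this.h); }
                            }

                            function describe(shape: Pick<Rectangle, "area">): string {
                              return `area=${shape.area()}`;
                            }

                            const fake = { area: () => 12 };  // never touches Rectangle
                            ```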

                            Another issue with TypeScript’s (and basically every other language out there’s) approach is that there’s no separation between the type interface and implementation when interacting with other modules. To get something like what the author wants only the interfaces should be available globally, while the concrete objects that satisfy those interfaces should only be available… at the entry point, I suppose? If you wanted to do this at a global level there would have to be clear boundaries over who can provide the implementation for something.

                            Still, I’ve been dreaming of a language with a more ergonomic approach to ML-style functors: each module could declare an “abstract” dependency on a set of module interfaces at the top level where normal imports would be, and importers would be able to either pass that dependency through to the next importer or fill it in with a concrete implementation. Maybe libraries could even specify configuration parameters that could result in something like conditional compilation this way.

                            1. 2

                              Production one: Erlang does 90% of that. It has releases and logging. OpenTelemetry is nearly there for metrics and structured logging, and packaging is there.

                              1. 1

                                Capabilities have also made their way into the WebAssembly WASI interface.

                                1. 1

                                  Re “Modular linting”, I did a deeper dive on the underlying ideas here: https://neugierig.org/software/blog/2022/01/rethinking-errors.html

                                  For the reasons described there, it seems obviously correct to me to instead fold linting into the language tooling, much in the same way providing a standard library helps every other library agree on what a String is.

                                  1. 2

                                    That’s an interesting post. I think it’s missing a discussion of what to do when shipping source code to others. In that situation you lose control over which version of the language is used: it might be a different implementation with a different collection of warnings, or it might be a future or past version that you haven’t adapted your warning configuration for. That’s one of the big reasons for having a separate dev build or lint rule that’s much stricter than the normal build for end users.