1. 3

    Everything old is new again; I’m heavily reminded of Zephyr ASDL, which Oil, Python, and Monte all use for AST schemata. It seems like there should be a simple adapter from Preserves to ASDL.

    1. 2

      Yes, Zephyr ASDL was something I looked into when thinking about schemas. It was definitely an influence. Another strong influence is RELAX NG, though I’ve not properly lifted the stuff about sequencing and interleave from the latter yet! One difference that I can see between Preserves schemas and Zephyr is that Preserves schemas include not only a description of (roughly) algebraic types, but also a (kind of invertible) connection between those types and surface syntax.

      So for example NamedAlternative = [@variantLabel string @pattern Pattern] in the metaschema matches a two-element sequence (array) with a string in slot 0 and something that parses to a Pattern in slot 1, but the parse of a NamedAlternative results in a record containing a string-valued “variantLabel” field and a Pattern-valued “pattern” field; whereas NamedSimplePattern_ = <named @name symbol @pattern SimplePattern> again results in a record with a “name” and a “pattern” field, but matches records with label named that have two fields parsing as a symbol and a SimplePattern respectively. And serializing the two types produces syntax that parses back to them.
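
      To make the "invertible connection" concrete, here is a rough Python sketch of the two definitions above. It is not the Preserves implementation: the `Record` class, the function names, the use of plain `str` for both strings and symbols, and the opaque treatment of patterns are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a labelled record: a label plus positional fields."""
    label: str
    fields: tuple


@dataclass(frozen=True)
class NamedAlternative:
    variant_label: str
    pattern: object  # patterns are left opaque in this sketch


@dataclass(frozen=True)
class NamedSimplePattern:
    name: str  # a symbol in the real schema; a plain str here (assumption)
    pattern: object


def parse_named_alternative(value):
    # Surface syntax: a two-element sequence with a string in slot 0.
    if isinstance(value, list) and len(value) == 2 and isinstance(value[0], str):
        return NamedAlternative(variant_label=value[0], pattern=value[1])
    return None  # the value does not match the schema


def unparse_named_alternative(alt):
    # Inverse direction: produce syntax that parses back to the same value.
    return [alt.variant_label, alt.pattern]


def parse_named_simple_pattern(value):
    # Surface syntax: a record labelled `named` with exactly two fields.
    if isinstance(value, Record) and value.label == "named" and len(value.fields) == 2:
        name, pattern = value.fields
        if isinstance(name, str):
            return NamedSimplePattern(name=name, pattern=pattern)
    return None


def unparse_named_simple_pattern(nsp):
    return Record("named", (nsp.name, nsp.pattern))
```

      Both parsed types end up with two named fields, but they come from different surface syntax (a sequence versus a labelled record), and unparsing followed by parsing gives back an equal value.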

      It’d probably be straightforward to extract something very close to an ASDL definition from a Preserves schema.

    1. 2

      It would be great if this article and the one on Preserves’ data model had more information on the rationale and goals, as it’s not immediately apparent what problem Preserves is addressing and what makes it different from the rest. Looks interesting, though!

      1. 2

        Thanks! That’s a comment that keeps coming back; it’s good feedback. I’ll try to write something up on the motivation.

      1. 5

        Can it be less ambient?

        A recent topic in capability theory is whether Dataspaces’ way of modeling ambient shared scratchpads (the “data spaces” themselves) needs to have its exemption from capability-safety. At the same time, a common complaint of systemd/D-Bus/GNOME/etc. is that there is an ambient message bus which has many powerful clients connected by default. These have the same taste to me, when imagining a capability-aware language being used to implement PID 1; is there some ambient authority that could be removed here?

        I might well not understand Dataspaces, and I’m happy to learn more about your plan.

        1. 3

          Absolutely. So the theory up until recently had exactly one dataspace as execution context & communications medium for each actor in a tree.

          But now I’ve reworked things to include capabilities, I’ve also moved away from that perspective. Now a dataspace is an object-in-a-vat like any other. Capabilities secure access to dataspaces or to any other object-in-a-vat.

          There’s no longer exactly one privileged dataspace per actor. My early impressions of this new style of “syndicated actor” programming are that it will lead to many more, smaller, more tightly-focussed task- or domain-specific dataspaces interconnected in a loose web, within and among machines.

          Programs connect to a server and upgrade access from Macaroon-like datastructures (basically sturdyref-like) to more ephemeral references.

          There’s a little (completely undocumented) proof-of-concept in https://git.syndicate-lang.org/syndicate-lang/novy-syndicate.

          Hang on, I’ll do a quick screencast and post it here.

          1. 3
        1. 2

          This is a very nice project! Now I feel very curious about what a daemon soup looks like in Syndicate-lang, and about the extension of object-capabilities to syndicated actors. The latter may have suggestions for many system designs outside the current scope of your project – I was just reading about Matrix Spaces, which seem to be trying to couple spatial intuitions with access/permissions, and I wonder what Chris Webber makes of ocaps in the Fediverse.

          1. 1

            Yes indeed! I’m excited to find out what it will be like.

            I’ve actually been discussing all this stuff somewhat regularly with Chris Webber. I really like his stuff and our discussions are always useful and interesting.

          1. 2

            Thanks @gasche :-)

            Hi, I’m the author, AMA!

            1. 2

              Self taught dev here. I’ve been really enjoying reading your dissertation but I’m getting stuck at the type theory. What’s out there for getting up-to-speed on how to read CS proofs?

              1. 3
                1. 2

                  Hi! @gasche’s recommendation is solid, and I’d also like to recommend the Redex book [1] [2], the first half of which is a course in modern small-step operational semantics. That book, plus the standard type systems text [3], should be heaps to be getting on with :-)


                  [1] Felleisen, Matthias, Robert Bruce Findler, and Matthew Flatt. Semantics Engineering with PLT Redex. Cambridge, Massachusetts: MIT Press, 2009. https://mitpress.mit.edu/books/semantics-engineering-plt-redex

                  [2] https://redex.racket-lang.org/

                  [3] Pierce, Benjamin C. Types and Programming Languages. MIT Press, 2002. https://www.cis.upenn.edu/~bcpierce/tapl/

              1. 2

                Can’t we just implement actors inside the Linux kernel?

                1. 2

                  Sure; we can model actors as Linux processes. This gives us the mutable state, isolated turn-taking and I/O, and ability to connect to other actors.

                  In terms of improving the security around each process so that they behave more like isolated actors, Capsicum/CloudABI was a possibility and is available on e.g. FreeBSD, but on Linux, eBPF is the API that folks are currently using.

                  1. 2

                    This is the complaint I commonly hear about processes in Linux: they allocate too much memory. Can that be solved? Assuming each one allocates even 1kb, you can’t run millions of actors. Supervising is also tricky, it seems.

                    1. 3

                      Yes, the kernel isn’t designed for millions of processes. (Though it is only software, so it could be changed…) One approach I find interesting is recursive layering of actor systems: where a runtime-for-actors is itself an actor. This gives you a tree of actors. With a non-flat system, the vanilla actor model of addressing and routing doesn’t work anymore, so you have to do something else; that’s part of what motivated my research (which in turn led to this Java actor library…). But it does solve the issue of not being able to run millions of actors directly under supervision of the kernel. You’d have one actor-process acting (heh) as an entire nested actor system with millions of actor-greenthreads within it.

                      (Edited to add: 1kb * 1 million = 1 gigabyte. Which doesn’t seem particularly excessive.)
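
                      As a toy illustration of the layering idea, the sketch below has a runtime that is itself an actor and hosts child actors, so one OS process could stand in for a whole nested tree. The class names and the path-based routing are assumptions chosen to keep the example short; they are not the addressing scheme from the research mentioned above.

```python
from collections import deque


class Actor:
    """A lightweight actor: a mailbox plus a behavior called once per message."""

    def __init__(self, behavior):
        self.behavior = behavior
        self.mailbox = deque()

    def send(self, msg):
        self.mailbox.append(msg)

    def run_turn(self):
        """Process at most one message; return True if any work was done."""
        if not self.mailbox:
            return False
        self.behavior(self, self.mailbox.popleft())
        return True


class ActorSystem(Actor):
    """A runtime for actors that is itself an actor, giving a tree of actors."""

    def __init__(self):
        super().__init__(behavior=ActorSystem._route)
        self.children = {}

    def spawn(self, name, child):
        self.children[name] = child
        return child

    def _route(self, msg):
        # Hypothetical addressing: msg is (path, payload), path a tuple of names.
        path, payload = msg
        head, rest = path[0], path[1:]
        child = self.children[head]
        # Forward the remaining path to nested systems, the bare payload to leaves.
        child.send((rest, payload) if rest else payload)

    def run_turn(self):
        did_work = super().run_turn()
        for child in list(self.children.values()):
            did_work = child.run_turn() or did_work
        return did_work
```

                      A leaf is created with something like `Actor(lambda self, msg: ...)`, and the root is driven with `while root.run_turn(): pass`. From the kernel's point of view there is only one process, however many actors live inside it.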

                1. 5

                  This core library is very similar to the promise and vat cores of E, Monte, etc. I am not really surprised that it is relatively small and simple. The difficulty will come when wiring up vats to I/O, and converting blocking I/O actions into intra-vat actions which don’t mutate actors.

                  1. 3

                    What difficulties did you have in mind? I’ve done it before, largely following the Erlang playbook, and didn’t have any particular trouble. It does mean that users of the system have to really buy in to the Actor style - naively using java.net.Socket won’t work well - but that’s rather a benefit than a drawback :-)

                    1. 6

                      Many issues come to mind. It’s quite possible that we overcomplicated the story of I/O in Monte, and of course our reference implementation is horribly messy, like a sawblade caked with sawdust. I don’t think that you have to deal with any of these, but they all came up in the process of making a practical flavor of E, so I figure that they’re worth explaining.

                      • A few community members wanted Darwin and Windows support, so rather than writing my own glue over ((RPython’s version of) Python’s version of) BSD sockets, I bound libuv. This probably was a mistake, and I constantly regret it. But we don’t have to write many reactors/event-loops, so it’s a wash.
                      • We have a rather disgusting amount of glue. IPv4 and IPv6 are distinct; sockets, files, and pipes are all streams but with different implementation details.
                      • I/O needs to be staged. For example, on file resources, the method .getContents() :Vow[Bytes] will atomically read a file into memory, returning a promise for a bytestring, but the underlying I/O might need to be broken up into several syscalls and the intermediate state has to live somewhere. Our solution to this was a magic with io: macro system which turns small amounts of Python into larger amounts of Python, adding state storage and error-handling.
                      • Backpressure isn’t free. Our current implementation of perfect backpressure requires about 900 lines of nasty state machines, promise-routing, and error handling (in Monte) and mostly it makes slowness.
                      • I/O must happen between vat turns. This is probably obvious to you, but it wasn’t obvious to us that we can’t just have a vat where all the I/O happens, and instead we have to explicitly interleave I/O. The scheduler I wrote is a baroque textbook algorithm which can gallop a bit but is not great at throughput.

                      But there are also thoughts on how to make it better.

                      • We used to joke that TLS is difficult but stunnel is easy. Similarly, HTTP is hard, but shelling out to curl is easy. Maybe we should use socat instead of binding many different pipe-like kernel APIs. After all, if we ever have Capsicum or CloudABI support, then we’ll have all I/O multiplexed over a single pipe anyway.
                      • We can do per-platform support modules, and ask community members to be more explicit in indicating what they want. If we stopped using libuv, then I think that we could isolate all of the Darwin-specific fixes.
                      • Because Monte is defined as a rich sugar on a kernel language, including high-level promise-handling idioms, we can redefine some of those idioms to improve the semantics of promises. We’ve done it before, simplifying how E-style when-blocks are expanded.
                      1. 2

                        Thanks for these! Very interesting. And looking at the Monte website, there’s a lot for me to learn from there too.

                        Re: backpressure: good generic protocols for stream-like interactions are, it seems to me, still an open research area. So far I’ve had good-enough results with explicit credit-based flow control, and circuit-breaker-like constructs at system boundaries, but it still feels pretty complex and I find it easy to get it wrong.
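
                        For readers unfamiliar with the term, a minimal sketch of credit-based flow control might look like the following. The class and method names are made up for illustration and do not come from any of the libraries discussed here; a real version would also need error handling and a bound on the pending queue.

```python
from collections import deque


class CreditedSender:
    """Delivers items only while the receiver has granted credit."""

    def __init__(self, deliver):
        self.deliver = deliver  # callable that hands one item to the receiver
        self.credit = 0         # number of items the receiver is willing to accept
        self.pending = deque()  # items waiting for credit

    def grant(self, n):
        """Called on behalf of the receiver to allow n more deliveries."""
        self.credit += n
        self._flush()

    def send(self, item):
        """Queue an item; it is delivered as soon as credit allows."""
        self.pending.append(item)
        self._flush()

    def _flush(self):
        while self.credit > 0 and self.pending:
            self.credit -= 1
            self.deliver(self.pending.popleft())
```

                        The receiver stays in control of the rate: nothing is delivered until it calls `grant`, and unused credit carries over to later sends.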

                        Re: I/O staging: in the Erlang world, there’d likely be a new actor for managing the state of the transfer. Is this kind of approach not suitable for Monte?

                        Re: I/O and turns. My approach to this is that “real” I/O is managed by special “device driver” actors(/vats), like Erlang, but that the I/O calls themselves actually take place “outside” the actor runtime. Special gateway routines are needed to deliver events from the “outside world” into the actor runtime. In the OP Java library, the “gateway” is to schedule a Turn with Turn.forActor() (if you don’t have a properly-scoped old Turn object already) or myTurn.freshen() (if you do).
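
                        The general shape of such a gateway can be sketched in a few lines of Python. This is not the API of the Java library (it does not model `Turn.forActor()` or `freshen()`); the `Runtime` class and its method names are hypothetical, and actors are reduced to plain callables.

```python
import queue
import threading  # used by callers that perform blocking I/O on outside threads


class Runtime:
    """Toy actor runtime: events are only ever handled on the runtime's own thread."""

    def __init__(self):
        self._inbox = queue.Queue()  # thread-safe handoff from the outside world

    def gateway(self, actor, event):
        """Safe to call from any thread: schedules a turn instead of running it."""
        self._inbox.put((actor, event))

    def run_pending(self, count):
        """Run `count` turns, blocking until each event has arrived."""
        for _ in range(count):
            actor, event = self._inbox.get()
            actor(event)  # the actor sees the event inside a turn, never mid-I/O
```

                        A "device driver" would do its blocking reads on an ordinary thread and call `runtime.gateway(actor, data)` when something arrives, so the blocking call itself never runs inside an actor's turn.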

                        1. 3

                          I/O staging: in the Erlang world, there’d likely be a new actor for managing the state of the transfer. Is this kind of approach not suitable for Monte?

                          It probably would work in a hypothetical future Monte-in-Monte compiler which could replace various low-level data objects (Double in particular) with pure-Monte implementations. In that world, we’d have to generically solve the state-storage problem as a function of memory layouts, and then we could implement some complex I/O in terms of simpler I/O.

                          Thanks for your hard work. I like learning from other actor systems.

                  1. 5

                    As ever, homoiconicity isn’t the point. Automated, scriptable refactoring tools, though - those are nice! Good to see an example here from Clojure land. Smalltalk also has (or, can have) good support for such things. For an example, see the library that underpins the RefactoringBrowser in Squeak Smalltalk, the Refactoring Engine. You can use it for ad-hoc refactorings from a Workspace window.

                    1. 2

                      Super cool! I have wanted to get into ST recently. At the moment the thing that most prevents me is not having a recent VM on OpenBSD.

                      I did pick up an M1 Mac recently though, maybe working under Rosetta will be fast enough.

                      1. 2

                        Looks like Cog can be built for OpenBSD: https://github.com/OpenSmalltalk/opensmalltalk-vm/issues/413

                        I haven’t tried it myself! But there’s a screenshot in that thread that shows Squeak running, so it could be a promising line of investigation.

                        Also, the aarch64 build of Cog works pretty well. (Not sure about M1 specifically.) The aarch64 build of Cog is what’s driving squeak-on-a-phone.

                        1. 1

                          Oh awesome! ty for digging this up! Last I knew it took a bunch of patches :D