It seems to me that the motivations for using capabilities and effect systems overlap, but the two communities don’t seem to be very well aware of each other?
I hadn’t thought of the overlaps, but I think you’re right. Both are mechanisms for expressing the set of side effects that a piece of code can have. With an effects system, you cannot call anything whose effects you do not declare, which is similar to the narrowing aspects of capabilities. Capabilities can be captured, though, which is probably hard to express in an effects system. For example, a TLS layer may hold the capability to make network connections while the caller has only the capability to make TLS connections. This lets you express ideas like ‘this component may make a TLS connection to this server but may not make unencrypted connections anywhere’. In an effects system, the fact that the component will, transitively, be able to create network connections would have to be modelled somehow. I’m not overly familiar with modern effects systems, but I haven’t seen one that expresses this notion of narrowing and widening; the only ones I’ve seen have only narrowing operations.
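The narrowing-by-capture idea can be sketched in a few lines. Everything below (NetworkCap, TlsCap, the connection functions) is an invented toy for illustration, and the TLS handshake itself is elided:

```haskell
-- A broad capability: can open a raw connection to any host.
newtype NetworkCap = NetworkCap { rawConnect :: String -> IO () }

-- An attenuated capability: can only connect, over TLS, to one
-- pre-chosen server.
newtype TlsCap = TlsCap { tlsConnect :: IO () }

-- The TLS layer captures the broad capability in a closure and never
-- exposes it; callers receive only the narrowed TlsCap. (The real TLS
-- handshake is elided from this sketch.)
makeTlsCap :: NetworkCap -> String -> TlsCap
makeTlsCap net server = TlsCap (rawConnect net server)
```

A caller holding only a TlsCap can reach exactly one server, even though the TLS layer transitively holds full network access; this is the widening-inside-narrowing situation described above.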
More generally, I believe capabilities are designed to allow modelling mutual distrust relationships but effects systems have a hierarchical trust model. An effects system designed to support mutual distrust would be interesting.
It’s standard folklore in the community that simply-typed lambda calculi are capability-safe. (There’s most of a proof in Mark Miller’s thesis.) It’s also proven that every Turing-complete lambda calculus has a simply-typed lambda calculus in its core, with the property that the provably-total programs in the Turing-complete calculus are all implementable in the simply-typed core. For example, the untyped lambda calculus can be restricted to simple typing, and all of the expressible programs are total polymorphic combinators. (See the references on the nLab page for restriction categories for more details; sorry for not putting more text on that page!)
This means that, when we think of a statically-typed functional programming language as having capabilities, we should think that its simply-typed core is capability-safe by default. It only loses capability-safety when we add various features without carefully proving that they don’t provide new ways of forging capabilities. IO, FFI, and most of the effects that people want to use in practice all tend to violate capability-safety unless tamed.
Yeah, it’s interesting! The author of Crochet seems interested in that at least. The language supports both effects and capabilities, but I’m not sure how much these systems actually overlap (more info can be found in the reference).
Scala 3’s experimental capture checking feature, along with its (in)famous implicit parameters, provides a way to create an effect system through the tracking of capability objects.
Yeah, this connection occurred to me a few weeks ago. It might provide a way to simplify the calling conventions so code using capabilities didn’t have to explicitly pass the capability as a parameter to every function that needs it … which is, for example, one of my gripes with Zig when it comes to memory allocators.
The downside of this is that you lose the principle of intentionality, which is one of the two big wins for capabilities. It’s not enough that the caller has the right to perform some action; it is specifically giving you the right to do so. For example, a component that has a capability to perform filesystem access may wish to delegate access to a temporary directory or read-only access to the configuration file location. Implicit delegation would lead to it passing access to the whole filesystem, which breaks the other big win from capability systems: the principle of least privilege.
For example, a component that has a capability to perform filesystem access may wish to delegate access to a temporary directory or read-only access to the configuration file location. Implicit delegation would lead to it passing access to the whole filesystem […]
Would it? I’ll use the notation _*_ where the first blank is the capability and the second is the return type.
foo : FileSystem * ()
foo = bar
bar : Read ./config * ()
This can be made to typecheck as long as Read is a subeffect of FileSystem. Or did you mean something else?
That works if you can express the intention in the filesystem, but it’s likely to lead to complex APIs. I’m not sure I understand your syntax: is bar indicating that it requires a capability to read the ./config directory? If so, that means I’m now hard-coding my config file path into my type system (what happens if I want a command-line flag to specify it?) and it also means that this capability is a different type. I now need some subtyping relationship between it and the kind of thing that I can pass to my TOML/UCL/JSON/whatever library to actually open and parse the file.
I think we only need a subeffect/subcapability relation (the left-hand side of the *-notation), not a subtype relation, which should be a simpler problem.
(Btw, the E * X notation comes from free monads and is also inspired by the Kleene star from regex, i.e. use zero or more operations from some effect signature (given by some functor E) and then return some type X.)
I think I’m going to need a lot more explanation than your three short comments to be able to understand what you mean. Would you mind writing up a commented and explained example? It sounds like you might have an interesting model but I’m lacking a lot of the context that I’d need to be able to comment intelligently on it.
The main idea is as old as universal algebra. Some signature describes your operations, e.g. group/ring operations, and from this signature one can freely generate the syntax of the theory. By quotienting the syntax by the axioms of the theory we get groups, rings, etc.
We can do the same for effects in a functional programming language. Let’s take the filesystem example, with the operations Open and Read. The signature would be:
data FileSystem x
  = Open FilePath (Handle -> x)
  | Read Handle (String -> x)
(Note that Handle -> x is an exponent, which can also be written as x^Handle, i.e. Handle is the arity of the operation.)
We can then create the free term algebra, a.k.a. the free monad, of this signature as follows:
data e * x = Return x | Op (e (e * x))
Think of this as a tree with xs in the leaves and e-operations in the nodes (with the arity being the branching factor).
To make programming a bit easier we can define:
open : FilePath -> FileSystem * Handle
open fp = Op (Open fp Return)
read : Handle -> FileSystem * String
read h = Op (Read h Return)
Also note that * is a monad, which lets us combine computations using:
_>>=_ : E * X -> (X -> E * Y) -> E * Y
Finally we define a semantics/algebra for our syntax which gives it meaning:
alg : FileSystem (IO x) -> IO x
alg (Open fp k) = cOpen fp >>= k
alg (Read h k)  = cRead h >>= k
This is used to fold/iterate over our syntax tree to actually perform the filesystem IO; cOpen and cRead here are FFI bindings or primitives of the RTS.
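For concreteness, here is a runnable Haskell transliteration of the sketch so far, writing Free e x for the e * x notation. The handler folds the tree with a pure in-memory “filesystem” (an association list) instead of real cOpen/cRead primitives; that substitution is an assumption made purely so the example is self-contained:

```haskell
{-# LANGUAGE DeriveFunctor #-}

type Handle = Int

-- The signature: Open and Read, each carrying its continuation.
data FileSystem x
  = Open FilePath (Handle -> x)
  | Read Handle (String -> x)
  deriving Functor

-- The free monad over a signature e (e * x in the comments above).
data Free e x = Return x | Op (e (Free e x))

instance Functor e => Functor (Free e) where
  fmap f (Return x) = Return (f x)
  fmap f (Op op)    = Op (fmap (fmap f) op)

instance Functor e => Applicative (Free e) where
  pure = Return
  Return f <*> m = fmap f m
  Op op    <*> m = Op (fmap (<*> m) op)

instance Functor e => Monad (Free e) where
  Return x >>= k = k x
  Op op    >>= k = Op (fmap (>>= k) op)

-- Smart constructors, as in the comment above.
open :: FilePath -> Free FileSystem Handle
open fp = Op (Open fp Return)

readF :: Handle -> Free FileSystem String
readF h = Op (Read h Return)

-- Fold the syntax tree with an algebra; here a pure simulation where
-- a Handle is an index into an in-memory list of files.
runFS :: [(FilePath, String)] -> Free FileSystem x -> x
runFS _  (Return x)       = x
runFS fs (Op (Open fp k)) = runFS fs (k (length (takeWhile ((/= fp) . fst) fs)))
runFS fs (Op (Read h k))  = runFS fs (k (snd (fs !! h)))

-- A small program written against the effect signature.
readConfig :: Free FileSystem String
readConfig = open "./config" >>= readF
```

runFS plays the role of the alg fold above; replacing it with the IO algebra recovers the FFI-backed interpreter.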
Higher-level APIs can be built using refinement mappings between free monads.
So far all of the above is pretty standard functional programming; see for example Swierstra’s Data types à la carte (2008).
Now adding implicit casts between subeffects, i.e. weaken : E <= F -> F * X -> E * X, shouldn’t be much harder than having safe implicit casts between numeric types. It would require language-level support though; it’s not something you can tack on as an afterthought (just like safe implicit casts between numeric types).
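One way to model the subeffect evidence explicitly is as an injection of operations, in the Data types à la carte style; a language with subeffect support would then insert weaken automatically. (The comment above writes the cast as F * X -> E * X; this sketch takes the opposite convention, injecting the smaller signature into the larger one. ReadOp and FsOp are invented toy signatures.)

```haskell
{-# LANGUAGE RankNTypes #-}

data Free e x = Return x | Op (e (Free e x))

-- Evidence that e is a subeffect of f: every e-operation is also an
-- f-operation.
newtype Sub e f = Sub (forall x. e x -> f x)

weaken :: Functor e => Sub e f -> Free e x -> Free f x
weaken _           (Return x) = Return x
weaken s@(Sub inj) (Op op)    = Op (inj (fmap (weaken s) op))

-- Toy signatures: read-only operations form a subeffect of full
-- filesystem access. All names invented for illustration.
data ReadOp x = ReadFile String (String -> x)
data FsOp x   = FsRead String (String -> x)
              | FsWrite String String x

instance Functor ReadOp where
  fmap f (ReadFile p k) = ReadFile p (f . k)

readSub :: Sub ReadOp FsOp
readSub = Sub (\(ReadFile p k) -> FsRead p k)

-- A pure handler for the larger effect, so the cast can be exercised.
runFs :: Free FsOp x -> x
runFs (Return x)           = x
runFs (Op (FsRead _ k))    = runFs (k "contents")
runFs (Op (FsWrite _ _ k)) = runFs k

prog :: Free ReadOp String
prog = Op (ReadFile "./config" Return)
```

With compiler support, weaken would be applied silently wherever a Read ./config * () is used at type FileSystem * (), as in the foo = bar example above.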
Regarding restricting Read to some file, it can perhaps be done using dependent types as follows:
data Read (fp : FilePath) (X : Type) where
  open : (fp' : FilePath) (proof : fp == fp') -> (Handle -> X) -> Read fp X
The proof would need to be supplied when opening the file, but it should be trivial assuming the filepaths are in fact equal.
Because capabilities are linear, they are not copyable.
Capabilities were originally studied in the Cartesian-closed context, where they are copyable. Today, cryptographic capabilities, which are difficult-to-guess (“unguessable”) rather than unforgeable or uncopyable, are commonplace.
Capabilities are, by default, irrevocable.
This isn’t a limitation, but a good design choice. Revoking any sort of built-in capability is something that must be done via a runtime API, rather than inside user code; users can build revocable tokens themselves using a single mutable cell, and Capability Myths Demolished claims that this pattern was known in 1974, around the time of the Lambda Papers.
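The mutable-cell pattern mentioned above can be sketched directly; makeRevocable and the error message are invented names for illustration:

```haskell
import Data.IORef

-- A revocable capability built from one mutable cell: the cell holds
-- Just the underlying capability until the creator clears it. The
-- holder gets an action that consults the cell on every use; the
-- creator keeps the revoke action.
makeRevocable :: a -> IO (IO a, IO ())
makeRevocable cap = do
  cell <- newIORef (Just cap)
  let getCap = do
        m <- readIORef cell
        case m of
          Just c  -> return c
          Nothing -> ioError (userError "capability revoked")
      revoke = writeIORef cell Nothing
  return (getCap, revoke)
```

The holder never sees the cell itself, only the getCap action, so revocation cannot be bypassed by stashing the wrapper.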
Every programming language needs an escape hatch. When interacting with the outside world, you have to do unsafe things.
One of the goals of E was to show that this isn’t true. In Monte, a flavor of E that I helped build, there are no escape hatches; every foreign and unsafe feature of the outside world is carefully given a handcrafted wrapper, a process known as taming. We didn’t implement FFI. Network access, filesystem access, system timers and clocks, cryptographic routines, subprocessing, even examining caught exceptions was privileged, and capabilities had to be explicitly imported. The least safe thing in the prelude was access to the unit-test and benchmark runners!
I want some kind of capability system for JS so bad…. just… by default don’t let code do anything and I have to explicitly pass in (or otherwise grant) capabilities for network access, fs access, etc. Even a rough, flawed system would be a huge boon. Do you have any recommendations?
You want to search for “Secure ECMAScript”, often just called SES. E’s authors have been steadily improving ECMAScript for nearly a decade, too. Outside of SES, common style improvements are actually also capability-safety improvements: Use modules, freeze immutable objects, ponder WeakMap, etc.
There was Caja developed by Mark Miller at Google. I gather it failed because it was too difficult to tame the browser APIs into a capability-secure subset.
Deno’s permissions seem to be too coarse to me. AIUI they apply to the entire program, so they don’t help with things like restricting third party libraries.
I’m not a JS expert, but given how dynamic it is, it might not be possible to create and enforce real capabilities in it (that sort of concern was raised in the article too.)
Instead of allowing arbitrary FFI calls, though, we simply removed FFI altogether, and the capabilities can only enter the execution space via anonymous injection. I may have missed something in the linked article, but I just don’t see the purpose of having a capabilities model when the code can still use an FFI.
I just don’t see the purpose of having a capabilities model when the code can still use an FFI.
You might be interested in the Verona sandbox code. This is one of my goals for Verona: there should be no unsafe escape hatch. This comes from the Chrome team’s rule of two, which says that code may satisfy no more than two of: written in an unsafe language, processing untrusted data, or running unsandboxed. I want to take this further and say it must be either sandboxed or type-safe. This project has the baseline, which works on Linux and FreeBSD, to run a library in an unprivileged context with some nice affordances. The library has no access to any global namespace but can access files or sockets if explicitly provided by the trusted code. If you integrate this with a safe language runtime such that the only way of accessing global namespaces in the safe code is via a capability mechanism, then you can flow these things all of the way through the system into FFI.
This is very interesting, and I’ve thought along similar lines (language features to mitigate supply chain vulnerabilities). I’ve written about that here, more from the context of a package manager but covering language features as well. I considered purity (i.e. lack of side effects) a powerful defence but hadn’t considered capabilities/effect systems, which are basically a more fine-grained version of purity.
I’m not sure how fundamental linear types are to the capability system in Austral. It sounds like all you need is a way to hide type constructors? I don’t understand why it would be bad if you could, for example, duplicate a NetworkCapability value.
They need to be linear types if you want to use them as a “conch shell” that gives access to an underlying system resource. If they could be duplicated then there’s no way to know that there aren’t multiple logical threads with access to it.
I can understand a resource handle being linear, and I guess you’re saying that you might want to wrap up a handle and its associated capability into a single object, which would need to be linear. But that still doesn’t require the capability to be linear - the combined handle-capability pair would inherit linearity from the handle.
If the capability is a linear type, you know whoever presents it has the right to that exclusive resource and you don’t need any further locking. If capabilities can be duplicated, you need locking.
The OCaml eio library seems to be in the overlap. See also a recent blog post on Lambda Capabilities by one of the eio authors.
One way to solve this would be via dependent types, i.e. bar : (config : FilePath) -> Read config * ().
That’s part of the elevator pitch for Deno.
I agree, but it’s an important first step.
A capabilities-based approach is the security model that we selected for Ecstasy: https://xtclang.blogspot.com/2019/04/on-god-turtles-balloons-and-sandboxes.html
OT, but admiring the author’s cojones in using such an out-of-vogue syntax as Ada’s as inspiration for his language.
There’s an easy way for the language to make terminal access unique: it could define RootCapability as a struct whose fields include a TerminalCapability, where TerminalCapability is a linear type. This does require all of the top-level capabilities to be known up front.
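A hedged sketch of that shape in plain Haskell (linearity here is by convention only, since the point concerns a linearly-typed language; all names are invented for illustration):

```haskell
-- Each subsystem gets exactly one capability value, all bundled into
-- the root capability handed to the program's entry point.
data TerminalCapability   = TerminalCapability   deriving Show
data FilesystemCapability = FilesystemCapability deriving Show

data RootCapability =
  RootCapability TerminalCapability FilesystemCapability

-- In a linear language, destructuring the root consumes it, so each
-- component capability can be obtained exactly once.
splitRoot :: RootCapability -> (TerminalCapability, FilesystemCapability)
splitRoot (RootCapability t f) = (t, f)
```

In a language that actually enforces linearity, splitRoot would consume the root, making double extraction of the terminal capability a type error rather than a convention.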