I thought codata was related to corecursion, in fact serving a similar purpose to Python generators. My notion of codata comes from the Idris construct of that name:

```idris
codata Stream : Type -> Type where
  (::) : (e : a) -> Stream a -> Stream a
```

This gets translated into the following by the compiler.

```idris
data Stream : Type -> Type where
  (::) : (e : a) -> Inf (Stream a) -> Stream a
```

This is similar to what Wikipedia has to say. Are these two separate notions with the same name, or am I missing a connection here? (Granted, I kind of skimmed both texts.)
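The `Inf` in the translated type marks the tail as delayed. That delaying can be sketched by hand in an eager language; here is a minimal Python analogue (all names here are mine, not any library's API):

```python
# A stream cell: a head plus a *delayed* tail, mimicking Idris's Inf.
def cons(head, tail_thunk):
    """tail_thunk is a zero-argument function producing the next cell."""
    return (head, tail_thunk)

def nats_from(n):
    # The recursive call is guarded behind a lambda, so building one
    # cell does not recurse forever.
    return cons(n, lambda: nats_from(n + 1))

def take(n, stream):
    out = []
    for _ in range(n):
        head, tail_thunk = stream
        out.append(head)
        stream = tail_thunk()   # force the tail only when needed
    return out

print(take(5, nats_from(0)))  # [0, 1, 2, 3, 4]
```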

Codata is indeed related to corecursion. Let’s say you have a functor F and let’s look at two things in category theory: a map `F(X) -> X` (an algebra for F) and, by inverting the arrow, a map `X -> F(X)` (a coalgebra for F).

Let’s look at a simple functor like `F_A(X) = 1 + A x X`. It defines the algebra for lists over A: when we consider `F_A(List_A) -> List_A`, we either have the empty list (`nil`, the `1` in the notation) or a pair of an element and a list that we can `cons` together. These maps are called the constructors, and we use them for data. The other direction of the arrow, `X -> F_A(X)`, defines the stream over A with termination: `StreamT -> 1 + A x StreamT`. In this direction we observe either the stream ending (the `1`) or a head and a tail (`A x StreamT`). These maps are the destructors (the dual of constructors); they are used to observe the stream, and they are indeed codata. The idea is that data and codata are the two directions of the functor relation!
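Both directions can be made concrete with a small sketch (my own encoding, assuming we represent `1 + A x X` in Python as `None` or a pair `(a, x)`): an algebra interprets the list constructors, while running a coalgebra step by step unfolds a stream of observations.

```python
from itertools import islice

# F_A(X) = 1 + A x X, encoded as: None (the "1") or a pair (a, x).

# Algebra F_A(list) -> list: interprets the constructors nil and cons.
def alg(fx):
    if fx is None:          # the "1" case: nil
        return []
    a, xs = fx              # the "A x X" case: cons
    return [a] + xs

# Coalgebra X -> F_A(X): one observation step of a stream of naturals.
def coalg(n):
    return (n, n + 1)       # never returns None, so the stream is infinite

def unfold(step, seed):
    """Run a coalgebra observation by observation, yielding each head."""
    while True:
        fx = step(seed)
        if fx is None:      # the stream signalled termination
            return
        a, seed = fx
        yield a

print(alg((1, alg((2, alg(None))))))      # [1, 2]
print(list(islice(unfold(coalg, 0), 4)))  # [0, 1, 2, 3]
```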

The paper abstract:

> Computer scientists are well-versed in dealing with data structures. The same cannot be said about their dual: codata. Even though codata is pervasive in category theory, universal algebra, and logic, the use of codata for programming has been mainly relegated to representing infinite objects and processes. Our goal is to demonstrate the benefits of codata as a general-purpose programming abstraction independent of any specific language: eager or lazy, statically or dynamically typed, and functional or object-oriented. While codata is not featured in many programming languages today, we show how codata can be easily adopted and implemented by offering simple inter-compilation techniques between data and codata. We believe codata is a common ground between the functional and object-oriented paradigms; ultimately, we hope to utilize the Curry-Howard isomorphism to further bridge the gap.

Induction gives us a way to define larger and larger objects by building them up, but we need to start somewhere (a base case). A classic example is `successor : Nat -> Nat`, which constructs a natural number one larger than its input; `zero : Nat` is the base case we start with.
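As a toy illustration (my own encoding, not a standard library), here are Peano naturals as data built up from the base case, with a recursive function that needs that base case in order to stop:

```python
# Peano naturals as data: built up from a base case by induction.
zero = None                       # the base case
def successor(n):                 # constructor: one larger than its input
    return ("S", n)

def to_int(n):
    """Recursion over data terminates by reaching the base case."""
    if n is None:                 # base case reached: stop
        return 0
    _, pred = n
    return 1 + to_int(pred)

three = successor(successor(successor(zero)))
print(to_int(three))  # 3
```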

Coinduction gives us a way to define objects by how we can “destruct” them into smaller parts, or alternatively by the ways they can be used/observed. The interesting thing is that coinduction doesn’t need a base case, so we can define infinite objects like streams, or infinite binary trees, or whatever. We can still have base cases if we like, e.g. we can make a corecursive list with `nil` and `cons`, but in that case we can’t assume that every list ends in `nil`. As an example, recursive types in Haskell are coinductive: they might be infinite (they might also contain errors or diverge, since Haskell is unsound).

Recursive (total) functions require a base case, since that’s the point where they return a value. Corecursive functions don’t require a base case, as long as they are “productive”, defining some part of the return value. For example, a corecursive function returning a stream can define one element and recurse for the rest; this will never end, but each part of the return value can be produced in some finite amount of time (i.e. it doesn’t diverge).
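A minimal sketch of such a productive corecursive definition, written as a Python generator (my own example): there is no base case and the recursion never bottoms out, yet any finite prefix arrives in finite time.

```python
from itertools import islice

def naturals(n=0):
    # No base case: this definition never "finishes". It is productive
    # because each element takes only a finite amount of work to yield.
    yield n
    yield from naturals(n + 1)

print(list(islice(naturals(), 5)))  # [0, 1, 2, 3, 4]
```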

AFAIK Python generators are inherently linear, like a stream; e.g. they can’t define an infinite binary trie like we could in Haskell (`data Trie = Node Trie Trie`). Another wrinkle is that Python uses side-effecting procedures rather than pure, total functions, so we might want to consider those effects as part of the “productivity”. For example, we might have a recursive procedure which has no base case and never returns any value, but which causes some effect on each call; that can still be a useful thing to do, so we might count it as productive even though it diverges from the perspective of calculating a value. (I don’t think Python eliminates tail calls, so Scheme might be a better example for recursing without a base case, since there it is essentially the same as iteration.)
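The contrast can be sketched in Python: a single generator hands out one linear sequence, but the branching Haskell `Trie` can still be mimicked by delaying both children until they are accessed (my own sketch; the `path` label is added just so there is something to observe at each node):

```python
class Node:
    """An "infinite" binary tree: both children are constructed only on
    access, mimicking the lazy Haskell type data Trie = Node Trie Trie."""
    def __init__(self, path=""):
        self.path = path  # the sequence of left (0) / right (1) turns taken

    @property
    def left(self):
        return Node(self.path + "0")   # built lazily, on demand

    @property
    def right(self):
        return Node(self.path + "1")   # built lazily, on demand

root = Node()
print(root.left.right.left.path)  # "010"
```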
