“Read it as: If the property Q holds for all values of type T, then if every object of type S maps to exactly one object of type T, the property Q holds for all values of type S.”

IOW: if S is a subset of T, then every property of elements of T is also a property of elements of S.

In other words, a subset is a subset.

As far as I can tell, this is elementary school algebra, with a bunch of silly terms added. The original results in formal logic are kind of interesting, but what do we learn about programming here? I don’t see it - which may just mean I’m not seeing it.

Two things make it interesting:

1) We’re dealing with types, not sets, so set-based intuition can be a guide but doesn’t always hold
2) OO types sometimes behave really weirdly

In particular, in an OO system you can implicitly treat any value s : S as if it had type T, so long as S is a subtype of T. This creates exactly the map we’re asking for: each value s : S can be considered as a value of T.

Unfortunately, this mapping may not always convey all of the properties we’re interested in. LSP warns about this and says we ought to be careful. In particular, it’s up to us to ensure that all supertype coercions respect all possible properties of the supertype. That’s not obvious.

I think the whole concept of type hierarchy is dubious - it’s an attempt to simplify too much.

What I did not like about the posting, however, was that it tossed around terms from logic and category theory as if they were obviously significant in programming, and yet the use of those terms seemed limited to making some simple stuff appear complicated and perhaps also making some complex stuff appear solved. What more do we know about programming or programs after we invoke a lot of vaguely correlated terms from foundational logic and category theory? I don’t see any additional insight.

What I did not like about the posting, however, was that it tossed around terms from logic and category theory as if they were obviously significant in programming…

Type theory (which more or less exists to discuss programming) draws a lot on category theory and logic, not just set theory. That’s the link.

So, what do you think of this one from Meyer?

http://se.ethz.ch/~meyer/publications/proofs/top.pdf

I thought it was interesting in that it had simplified, but usable, models of all sorts of things in programming expressed with elementary set theory. It’s clean enough that a non-formal-methodist like me can follow a lot of it. I also have a bunch of papers in my collection using FOL or set theory to build CA’s, certifying compilers, automated proving, etc. So, I’d like a logic expert’s opinion on whether it’s worth further research or development, given that it models the foundations of programming in a way that’s easy for both people and computers to grasp.

I don’t get how it is “usable” - and I’m not saying that as a way of saying it is not usable, it’s that I read these things and don’t see how all that apparatus produces any insights or assurance.

Appreciate the feedback. :)

Sorry for being so terse. I get why it’s useful to think of a (um) program as a map or relation from states to states, but once you get further down into that explanation, it seems to just restate programming-language concepts in a clumsy special notation.

Ok. That’s more helpful. I’m just surveying opinions on some of this niche work. The most interesting one so far was Hehner’s Formal Methods of Software Design. Quite a few liked that.

The embedded folks are preferring model-driven stuff with code generation to SPARK or statically-checkable C. Rockwell-Collins has a nice toolchain that ties together Simulink, provers, SPARK, and AAMP7G.

So, I just dig up stuff like this to run it by diverse crowds of pros to see what they think is worth passing on to specialists in CompSci wondering what to tackle or improve.

The Liskov substitution principle is as much of a logical triviality with types as it is with sets. And the following corollary is also a triviality: Absent any laws governing how methods can be overridden (an object-oriented counterpart to Haskell’s type class laws), the real meaning of a non-final method in a non-final class is simply undefined.

Yet most object-oriented programmers still define APIs full of non-final methods without associated laws. This is why we need to restate the Liskov substitution principle over and over.

Maybe I miss the point, but the Liskov substitution principle seems to me to indicate that type hierarchies are too simple minded. Squares are obviously a type of rectangle, but operations on rectangles that are closed in the set of rectangles are not necessarily closed on squares. What’s the problem? The problem comes up because there is a desire to allow all methods defined for rectangles to be applied to squares, but that seems like an odd idea. All odd numbers are numbers, and doubling a number produces a number, but doubling an odd number takes us out of the set of odd numbers. So what?

Squares are obviously a type of rectangle, but operations on rectangles that are closed in the set of rectangles are not necessarily closed on squares. What’s the problem?

The problem is that object-oriented programmers call Rectangle and Square classes whose instances don’t quite behave like rectangles or squares. Mathematical objects don’t “adjust themselves” in response to method calls. If you perform an operation on a mathematical object, what you get is another mathematical object (unless the given object happens to be a fixed point of the given operation, of course). If you could actually define types of rectangles and squares, rather than pointers to some memory blob whose current state happens to contain the parameters of a rectangle or square, you’d see how beautifully Square is a subtype of Rectangle.

This is what I don’t get. If you perform x times 2 on an odd number, you get an even number. That’s why we use groups, rings, fields, etc. to capture the notion of a set with a collection of operations. If you increase the length of a pair of sides on a square, you get a rectangle. These are both mathematical objects, but the operation: increase length of parallel sides is not closed for “squares”. I am undoubtedly missing the point, but to me it seems like the difficulty here is that the abstraction of a type hierarchy is wrong. The reason one might want to inherit rectangle is code reuse. That makes sense. Shouldn’t we really be asking: does subtyping efficiently reuse code, not is the subtype closed under all type methods? In Liskov’s paper, one issue she brings up is aliasing - where an object “square” is also an element of “rectangle” so you could apply a rectangle operation unaware. But that seems like the real problem: if I constrain a type, shouldn’t the language at least complain if I evade the constraints?

These are both mathematical objects, but the operation: increase length of parallel sides is not closed for “squares”.

Right, it has type enlarge :: Number -> Rectangle -> Rectangle (where the number specifies by how much the length of parallel sides must grow), but not Number -> Square -> Square. However, if you get your subtypes right, you can pass a Square to enlarge, and while the output isn’t guaranteed to be a Square, it’s guaranteed to be a Rectangle.
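A minimal Java sketch of this point, with a hypothetical immutable Rectangle/Square pair (the class names and enlarge are assumptions for illustration, not anyone’s actual API):

```java
class Main {
    // enlarge :: Number -> Rectangle -> Rectangle — grows one pair of parallel
    // sides, so the result type is Rectangle even when the input is a Square.
    static Rectangle enlarge(double by, Rectangle r) {
        return new Rectangle(r.w + by, r.h);
    }

    public static void main(String[] args) {
        // Subsumption lets us pass a Square where a Rectangle is expected:
        Rectangle out = enlarge(1, new Square(2));
        System.out.println(out);  // a Rectangle, no longer a Square
    }
}

// Hypothetical immutable value types (not from the original post):
// a Square is just a Rectangle that happens to have equal sides.
class Rectangle {
    final double w, h;
    Rectangle(double w, double h) { this.w = w; this.h = h; }
    @Override public String toString() { return "Rectangle " + w + " x " + h; }
}

class Square extends Rectangle {
    Square(double side) { super(side, side); }
}
```

Because the values are immutable, nothing here can ever mutate a Square into something non-square; the operation simply returns a Rectangle.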

I am undoubtedly missing the point, but to me it seems like the difficulty here is that the abstraction of a type hierarchy is wrong.

Type hierarchies are totally fine. The problem is what types you’re putting in the type hierarchy. What an OOPer calls a Rectangle (resp. Square) is actually a mutable cell whose state can be interpreted as the parameters of a rectangle (resp. square). A Haskell programmer would call such things IORef Rectangle (resp. IORef Square). Haskell doesn’t have subtyping, but you can obviously write a function upcast :: Square -> Rectangle. And a very naïve form of the Liskov substitution principle can be expressed: contramap upcast :: Predicate Rectangle -> Predicate Square. However, you can’t lift upcast to another function of type IORef Square -> IORef Rectangle, because IORef isn’t a Functor. This is why logic provides insight about computation.
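A rough Java analogue of the same contrast, under the same assumption of hypothetical immutable types; here Java’s invariant generics stand in for the fact that IORef isn’t a Functor:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class Main {
    public static void main(String[] args) {
        // contramap upcast: a predicate on Rectangle is automatically a
        // predicate on Square, because every Square is a Rectangle.
        Predicate<Rectangle> positiveArea = r -> r.area() > 0;
        System.out.println(positiveArea.test(new Square(2)));  // prints true

        // But mutable containers don't inherit the subtyping:
        List<Square> squares = new ArrayList<>();
        // List<Rectangle> rects = squares;  // rejected by the compiler; if it
        // were allowed, rects.add(new Rectangle(2, 3)) would smuggle a
        // non-square into a list of squares.
        squares.add(new Square(1));
        System.out.println(squares.size());
    }
}

// Hypothetical immutable value types (not from the original post):
class Rectangle {
    final double w, h;
    Rectangle(double w, double h) { this.w = w; this.h = h; }
    double area() { return w * h; }
}

class Square extends Rectangle {
    Square(double side) { super(side, side); }
}
```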

Well, I’m glad that gives you some insight about computation, but I have no idea what it could be.

In this particular case: a mutable cell containing a rectangle isn’t the same thing as a rectangle, thus you can’t expect properties of rectangles to apply to mutable cells containing rectangles. Phrased this way, it’s an utter triviality, but apparently most programmers won’t see the obvious.

It is logically trivial with types, but to state it one must accept that the notion of subtype isn’t as obvious as it is with sets. If you use set intuition you’ll skip right over the interesting misconception since it’s been so thoroughly hammered out.

Half-agree and half-disagree. You correctly note that the Liskov substitution principle is a triviality. It holds in any type-safe language with subtyping: Java, OCaml, etc. However, believe it or not, not everyone understands this! For example, many, perhaps most object-oriented programmers think they can somehow “break” the Liskov substitution principle by overriding in a derived class a non-abstract method from a base class. Here’s the classical example:

class Animal {
    public void makeNoise() { System.out.println("generic noise"); }
}

After reading this snippet, many programmers conclude that:

(0) If animal is an Animal, then animal.makeNoise() produces a generic noise.

But then we show them the following snippet:

class Dog extends Animal {
    public void makeNoise() { System.out.println("bark"); }
}

And then they conclude that:

(1) Dog is a subtype of Animal.

(2) If dog is a Dog, then dog.makeNoise() doesn’t produce a generic noise.

After putting the pieces together, the only thing we can conclude is that we just broke the Liskov substitution principle. But, wait, didn’t I just say that the Liskov substitution principle always holds? Where did we go wrong?
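The puzzle is easy to make concrete: the two classes above, wrapped in a hypothetical Main, show dynamic dispatch at work.

```java
class Main {
    public static void main(String[] args) {
        Animal animal = new Dog();  // subsumption: a Dog may be used as an Animal
        animal.makeNoise();         // dynamic dispatch: prints "bark", not "generic noise"
    }
}

// The two classes from the snippets above:
class Animal {
    public void makeNoise() { System.out.println("generic noise"); }
}

class Dog extends Animal {
    public void makeNoise() { System.out.println("bark"); }
}
```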

I thought the classic example was a type hierarchy that has a square as a subclass of a rectangle, since it is a “special type of rectangle that happens to have equal sides”. It violates LSP since a square cannot be substituted for a rectangle in some situations, such as when the width and height are being varied independently.

The problem is that you’re wrongly relying on names to determine the meaning of your types. Just because your classes are called Rectangle and Square, it doesn’t automatically mean that their instances actually behave like mathematical rectangles and squares.

“properties of functions (which is what programming really is about)”

It’s common to mistake a description of the thing for the thing itself.

There are a few things in this post that I’m unhappy with.

Translating subset relationship as →.

If you have two sets A and B, defined as A = {x ∈ X | p x} and B = {x ∈ X | q x}, then ∀ (x:X), p x → q x means the same as “A is a subset of B”. But that’s not what this post uses.

To demonstrate the mistake made: “Real numbers are not a subset of the integers.”
Proper formalization: ¬(∀ (n:numbers), is_real n → is_integer n)
What this post uses: ¬(real → integer)

Since I have a proof for real → integer, namely floor, the latter can be disproven. It can be disproven even though the original sentence is true, because they don’t represent the same proposition.

Mixing the language being analyzed and the language in which the proof is made.

On the level of the proof S and T are not types, they are values. Values of a type I’ll call OOType.

You can of course define a function lift : OOType → Type but then your signatures on proof level would look like lift S -> lift T not S -> T.

Missing the point of the LSP.

LSP is not a consequence. It’s an axiom. It defines what subtyping means.
Though axiom is probably flattering because despite its formal look it has a “should” in it for a reason.
Without the “should” you can always construct a property so strict that only T fulfills it (and every S must be T), making it quite useless.

LSP is not a consequence. It’s an axiom. It defines what subtyping means.

The definition of subtyping is the rule of subsumption: “if S <: T and t : S, then t : T”. The Liskov substitution principle is an immediate corollary.
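In Java terms, subsumption is exactly what makes an upcast assignment type-check; a minimal sketch with hypothetical types used only to illustrate the rule:

```java
class Main {
    public static void main(String[] args) {
        Square sq = new Square(3);      // sq : Square
        Rectangle r = sq;               // Square <: Rectangle, so sq : Rectangle too
        System.out.println(r.w * r.h);  // any consumer of Rectangle accepts sq
    }
}

// Hypothetical immutable types (not from the post):
class Rectangle {
    final double w, h;
    Rectangle(double w, double h) { this.w = w; this.h = h; }
}

class Square extends Rectangle {
    Square(double side) { super(side, side); }
}
```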

Though axiom is probably flattering because despite its formal look it has a “should” in it for a reason.

No, there’s no reason for it to have a “should”. The Liskov substitution principle always holds in a type-safe language with subtyping. What actually doesn’t hold is that you can make logically valid assumptions about things that can be arbitrarily redefined. Such as the behavior of non-final methods.

After putting the pieces together, the only thing we can conclude is that we just broke the Liskov substitution principle. But, wait, didn’t I just say that the Liskov substitution principle always holds? Where did we go wrong?

What’s your conclusion? The way I understand it, we did break it. Exactly because methods can be overridden.

Maybe I’m missing the point. Allow me to reword my third paragraph:

There is the concept of subtyping, and what individual languages use as “subtyping”. A language designer can choose to include a subtype relation in their language that fulfills the requirements of LSP.
With dynamic type dispatch (which overrideable methods offer) you can leak dynamic type info to the program. Every property you construct around that information will naturally differ between S and T and violate LSP.

You also said:

The definition of subtyping is the rule of subsumption: “if S <: T and t : S, then t : T”. The Liskov substitution principle is an immediate corollary.

Subsumption rule states that the property “is of type T” propagates from T to its subtypes.
LSP states that every decidable property propagates from T to its subtypes.
So I’d say it’s the other way round. LSP implies the subsumption rule because “is of type T” is just another property about which LSP makes a statement. In fact, I have a proof for it: http://lpaste.net/246224

What’s your conclusion? The way I understand it, we did break it. Exactly because methods can be overridden.

Nope, we didn’t break it. What’s actually broken is my assumptions: (0) and (2) are wrong. Not every Animal produces generic sounds, precisely because makeNoise isn’t a final method. In fact, without some analog of Haskell’s type class laws for [EDIT: oops!] overriding methods, you can’t guarantee anything about the behavior of non-final methods.

Subsumption rule states that the property “is of type T” propagates from T to its subtypes.

Subsumption says that, if S is a subtype of T, then every term t that has type S also has type T. The Liskov substitution principle is just a very trivial corollary of it.

LSP states that every decidable property propagates from T to its subtypes.

Every property, decidable or not, propagates from T to its subtypes.

The issue, really, is that programmers have little control over how predicates are formed. If you want a type that is honestly a tuple of Doubles to behave morally like a point in R^2, then you must outlaw certain (many) predicates. Once this is done we can have a real conversation about LSP.

This kind of theory proves results about theory and not about programming.

Can you explain what the distinction is you’re making? I find theory very insightful when it comes to programming.

[Comment removed by author]

Can you explain this in concrete terms?

To me it seems like: all integers can be expressed as a unique product of primes. Even numbers are integers. Therefore even numbers can be expressed as a unique product of primes. QED.

Or perhaps: “all records in this database are consistent with their copies on replicas”, and employee records are records, therefore all employee records are consistent on all replicas.

I think the concept of “type” in these kinds of discussions is not connected in any meaningful way to proof theory.
But I’d be interested to find out differently.
