I looked through your introduction (well, the beginning and then searched the page) but didn’t find what ^ or explicit means so this post was harder to follow.
For the naive package manager, we’ll say it works as follows:
As soon as I read this list, I was thinking of version SAT. I was pleasently surprised that this is the first issue addressed (although stated differently) and that you predicted readers would see the problem.
By the way, do you manually set the colours on the diffs or send it through a different highlighter? I want to do that sometimes.
It’s always nice to see such a bright optimism, both in using a new generation of tools as a go-to learning tool, and in estimating the current situation.
«This is table stakes for a package manager» — if only it was true for general-purpose (not single-ecosystem) package managers, Nix would probably try to solve very different problems… You are right it is nice to have it, but it is not always there when you need it.
And an interesting property of verification is that it forces you to pay attention to details. So it doesn’t even matter if your model describes things correctly, you are still forced to think about the small details. If you ever look at the real thing, you have a prepared state of mind to consider it.
«This is table stakes for a package manager» — if only it was true for general-purpose (not single-ecosystem) package managers, Nix would probably try to solve very different problems…
I’m just gonna call this a point in favor of formal methods, since it made the upgrade bug so incredibly obvious that I assumed nobody would fall for it!
Nobody have failed for the bug you have correctly described.
Instead, a lot of people use package managers that specify compatibility ranges for packages, and also can upgrade a lot of packages by the logic «new GCC needs new glibc that forces and upgrade to a lot of things».
Then they use SAT-solvers to find out if the requested changes to the current system are possible to implement among all the version mess. Then it turns out that a typical installation is too large for precise SAT-solving, and they use heuristics to limit the scope and sometimes miss obvious solutions (as in: «you complain that you cannot find a solution where an unrelated package X is still installed, let me add it to the explicit part of my request, much better now»). Then a user wants two versions of a program that are not supported to be installed at once, and finds out it is impossible.
And along the way the package manager developers have always used whatever tools for this task they could take from automated formal methods (which was «none» in the beginning of the journey). The formal methods cannot save you from inherited axioms about tradeoffs in resources (these axioms were obviously true in the 1980s, but nowadays the situation is more complicated).
Since you’re bringing logic into this, you might find this paper interesting which implemented make in Prolog. I found it when I was looking into the concept of cheating around verified implementations of some software by using first-order logic for specification with a verified Prolog instead. Mostly knock out the coding step. Keep the specs and code closer together if nothing else.
Separately, re: value of formal methods. Your article actually shows their value more as a mental discipline, a method to force attention to details — after all, you explicitly acknowledge you didn’t check if your model matches what you verify. I would say that implementing your final approach would result in something closer to Flatpak or Docker than Nix.
An example is that you naive model with explicit and requires has a problem roughly symmetrical to what you found but less interesting. It requires a formally longer demonstration, so you still were pushed to look for a workaround for a more realistic problem. This actually happens a lot — after all, a lot of proofs in inconsistent naive set theory are directly applicable in many other set theories, hopefully including consistent ones! You don’t know what your solution corresponds to in technical terms, but if you ever need to learn the model of a similar package manager, you are already mentally prepared.
I think what matters a lot is that you think about the details, but are prevented from coupling that thinking with code. At least a formal specification by a person prohibited from coding helps understanding even when it is impossible to formally verify. I would expect that going into polar direction opposite to formal methods — something like hiring people who can translate from many languages, including ones they don’t actually speak, but who cannot program because program languages are too rigid and unforgiving to translate — might also allow you to get a code documentation with attention to details but separate from the code, and reading that translation would also be quite enlightening. I guess the intersection of companies who want to experiment in crazy directions and companies who want to spend money on code comprehension and correctness is too small, so nobody tried.
I looked through your introduction (well, the beginning and then searched the page) but didn’t find what
^orexplicitmeans so this post was harder to follow.As soon as I read this list, I was thinking of version SAT. I was pleasently surprised that this is the first issue addressed (although stated differently) and that you predicted readers would see the problem.
By the way, do you manually set the colours on the diffs or send it through a different highlighter? I want to do that sometimes.
Looks like op is using pygments. It has a builtin diff lexer.
The version SAT thing was worth a submission itself. Thanks for linking to it.
It’s always nice to see such a bright optimism, both in using a new generation of tools as a go-to learning tool, and in estimating the current situation.
«This is table stakes for a package manager» — if only it was true for general-purpose (not single-ecosystem) package managers, Nix would probably try to solve very different problems… You are right it is nice to have it, but it is not always there when you need it.
And an interesting property of verification is that it forces you to pay attention to details. So it doesn’t even matter if your model describes things correctly, you are still forced to think about the small details. If you ever look at the real thing, you have a prepared state of mind to consider it.
I’m just gonna call this a point in favor of formal methods, since it made the upgrade bug so incredibly obvious that I assumed nobody would fall for it!
Nobody have failed for the bug you have correctly described.
Instead, a lot of people use package managers that specify compatibility ranges for packages, and also can upgrade a lot of packages by the logic «new GCC needs new glibc that forces and upgrade to a lot of things».
Then they use SAT-solvers to find out if the requested changes to the current system are possible to implement among all the version mess. Then it turns out that a typical installation is too large for precise SAT-solving, and they use heuristics to limit the scope and sometimes miss obvious solutions (as in: «you complain that you cannot find a solution where an unrelated package X is still installed, let me add it to the explicit part of my request, much better now»). Then a user wants two versions of a program that are not supported to be installed at once, and finds out it is impossible.
And along the way the package manager developers have always used whatever tools for this task they could take from automated formal methods (which was «none» in the beginning of the journey). The formal methods cannot save you from inherited axioms about tradeoffs in resources (these axioms were obviously true in the 1980s, but nowadays the situation is more complicated).
Since you’re bringing logic into this, you might find this paper interesting which implemented make in Prolog. I found it when I was looking into the concept of cheating around verified implementations of some software by using first-order logic for specification with a verified Prolog instead. Mostly knock out the coding step. Keep the specs and code closer together if nothing else.
Separately, re: value of formal methods. Your article actually shows their value more as a mental discipline, a method to force attention to details — after all, you explicitly acknowledge you didn’t check if your model matches what you verify. I would say that implementing your final approach would result in something closer to Flatpak or Docker than Nix.
An example is that you naive model with explicit and requires has a problem roughly symmetrical to what you found but less interesting. It requires a formally longer demonstration, so you still were pushed to look for a workaround for a more realistic problem. This actually happens a lot — after all, a lot of proofs in inconsistent naive set theory are directly applicable in many other set theories, hopefully including consistent ones! You don’t know what your solution corresponds to in technical terms, but if you ever need to learn the model of a similar package manager, you are already mentally prepared.
I think what matters a lot is that you think about the details, but are prevented from coupling that thinking with code. At least a formal specification by a person prohibited from coding helps understanding even when it is impossible to formally verify. I would expect that going into polar direction opposite to formal methods — something like hiring people who can translate from many languages, including ones they don’t actually speak, but who cannot program because program languages are too rigid and unforgiving to translate — might also allow you to get a code documentation with attention to details but separate from the code, and reading that translation would also be quite enlightening. I guess the intersection of companies who want to experiment in crazy directions and companies who want to spend money on code comprehension and correctness is too small, so nobody tried.
Great post, I really enjoyed it!
I like the diff based code snippets, makes it easy to follow. This is the first time I see this, good idea! I’ll probably steal it :)