1. 46
  1.  

  2. 13

    In the last rust thread four days ago I talked about how I was getting into using rust’s type system to try to follow the principle of making invalid state non-representable, and extending it to making invalid state transitions non-representable. I regret to report I’ve hit a sort of roadblock here: every additional state I add at the type level makes the ergonomics of the actual code I write worse and worse, even though there are more useful guard-rails. As an example, I am using type-states to model a lexer consuming text. The primary action is to call advance(), which will result in either a Lexer<LexingState> instance or a Lexer<EndState> instance; only the former has the function advance() defined. I had hit a few cases of accidentally entering an infinite loop when not actually calling advance() in the main lexer loop, so wanted to add another state Lexer<AdvancedLexingState> that is constructed only by calling advance(), and then requiring the main lexer loop return an instance of Lexer<AdvancedLexingState> thus ensuring it has called advance() at least once. Pretty neat, right? Very close to proving termination at the type level. Unfortunately this shifted the lexing code I was writing from tolerably horrible to absolutely horrible; here is the code before adding Lexer<AdvancedLexingState>, and here it is afterward (the code in lexer_state.rs is even worse; the whole thing doesn’t quite compile yet). Endless, endless pattern-matching. Having to write nearly identical code to function when we are in state Lexer<LexingState> vs. Lexer<AdvancedLexingState>.

    I don’t think I’ll keep it. Maybe I’m going about this wrong and there are language features I could be using to make this easier on myself (macros?) or maybe I really am just trying to use the type system in a way it isn’t meant to be used, and if I really want to program this way I should use some weird dependently-typed research language instead. Maybe reading The Little Typer has permanently broken my brain. Anyway, I’m rambling but I really do love rust, I just need to moderate my worst (best?) impulses to codify everything at the type level.

    1. 15

      I suspect that code would get a lot cleaner if parsing state was represented as an enum. Match is your friend. Type params are best saved for open (user definable) types.

      Having proved termination at the type level before, trust me, you don’t want to go there if you don’t have to.

      1. 3

        I ended up reaching a similar conclusion in a project at work. We decided to use sm, which has a similar design to the code @ahelwer described but with even more complicated trait-level machinery to represent transitions that could act on multiple states. The most painful part turned out to be the fact that .transition consumes the previous state, which is theoretically great for helping prove that the code terminates, but in reality resulted in a lot of repetitive and extremely ugly code and unnecessary clones.

        I ended up rewriting the code generation portion of sm over a weekend and ported our internal project to it in a day or so, and deleted half the code in the module in the process. The new design was less “provably correct” but had much better ergonomics, at least for my use case. I also wrote a bit about this in the crate’s readme.

        1. 1

          Yeah, state machines are one area where this sort of pattern should be really nice. If it’s not making life easier, that’s a problem.

      2. 9

        I ran into a similar issue writing Haskell where I got lost down the rabbit hole of type-driven design. Ultimately I gave up on that and restricted myself to writing procedural-style programs, basically just using functions and basic types with some monads defined by third party libraries for parsing. It worked OK for the applications I wrote and I found it a productive approach. I hear folks say this is what writing OCaml is like but I haven’t tried that myself. Personally I think type-driven design is best suited at module boundaries to define how to consume/interact with a public interface. But otherwise it’s too much cognitive overhead getting in the way of getting things done. I’m interested to hear others’ thoughts though.

        1. 1

          When you say you hear this is what writing OCaml is like you mean it’s designed around writing programs in the type-driven style I am trying to write one here?

          1. 4

            No, more so I get the impression that OCaml makes a design compromise between pure type-driven style and procedural/OOP, just as I have done consciously when writing Haskell

        2. 5

          Simple types (not just a descriptive adjective) have an upper bound on what you can represent with them. Their purpose isn’t to encode all constraints of a program, only to form a base set of types as a foundation. Dependent types are of course more powerful, but come with their own tradeoffs.

          I’m all for making illegal states unrepresentable, but at the same time it’s not actually possible in an ergonomic way in any language that we have available to us. So it’s more of an ideal than something actually achievable in practice.

          1. 4

            This seems to be a phase lots of people go through. Keep in mind that this kind of type metaprogramming saves you the most work when there’s parts of the state which are complicated, you don’t control, or can’t predict. Usually at the edges of subsystems where you want to prevent other subsystems from getting or giving you invalid state. If the un-representable state never escapes the module then you usually don’t need the compiler’s help to keep it that way. For things like “ensuring it has called advance() at least once”, if your program is pretty small you can just eyeball it and say “yep that’s true”, and maybe throw in an assertion or doc comment so you know not to accidentally change it later.

            TLDR, type metaprogramming is a spice, not the main course. Use it when it’s helpful, don’t try to build every single piece of logic around it.

            1. 1

              That’s a good heuristic. So it might still make sense to wrap the iterator in this type metaprogramming style to ensure I don’t call advance() on an ended iterator, but ensuring I have called advance() at least once bleeds this into the actual logic I’m writing myself.

            2. 3

              Thanks for sharing this experience. I have not tried Rust, but very comfortable with C++ and Java type systems.

              I have tried to model my Android UI work to follow React’s paradigm (render function, and states for each component in an Android activity handeled with explicit state). What I was not able to do – is to get my system to work in a way where I would get compile time errors when I would introduce a state that had no representation handling in the ‘render’ function, or having a compile time error where a render function does not handle correctly the state (that is a composite state of the components in activity).

              I have tried the Android Lifecycle system, I have tried RxJava – but there is something missing. I can get the functionality, ok – but it is, then, similar to Javascript where I cannot get a compile time error when I should.

              There is something missing in these languages, I feel. I am thinking what is missing is a notion of a ‘sequence’ as a first class citizen, and a notion of ‘object projection’ as a first class citizen.

              The Sequence trait (or at least the way I am thinking about it) – needs to allow me to specify what state changes can follow what state changes (at a language level). And, therefore, what values can be accessed or modified at which state in a lifecycle.

              And a ‘Projection’ capability is something where i can say that ‘this representation/class’ is really a projection of something else – so when that ‘original’ changes – projection will change, and the compiler checks that the projection state is in synch with the original, and that the sequence rules of what value can be access at which point in a state are still followed (whether I use a projection or the original)

              Oh, and it would be nice that the documentation of code (comments) can clearly include the above capabilities, so that I can explicitly comment why I have the states I have, and why I have the relationships across states .

              1. 1

                to specify what state changes can follow what state changes (at a language level). And, therefore, what values can be accessed or modified at which state in a lifecycle.

                I recall “Let’s Prove Leftpad” including an example of formal verification for Java that looked fairly lightweight. I wonder whether that could help.

              2. 3

                Type level termination proofs is basically formal verification, and you know how hairpullingly frustrating formal verificatiion can be!

                1. 2

                  I’ve had similar experiences in F#. Trying to encode invariants at the type level is appealing, but then your code basically becomes the proof of the invariants. Which is the whole point of type first development, but the code ergonomics suffers. I couldn’t find a reasonable work around.

                2. 11

                  Second of all, enums are insanely powerful. Result and Option are useful concepts, but what makes them fantastic is the fact that everybody uses them. These two enums are common language for all libraries (including the standard one) to express fallible computation and optional value.

                  This reminds me of a hard earned TLA+ lesson. As an expert, the things I find most interesting live at the very limits of the language: partial refinements, hyperproperties, test extraction. The things that get my students excited are basic features: invariants and simple concurrency. It’s too easy to get wrapped up in the advanced stuff and lose sight of what’s important.

                  1. 1

                    Did you consider any other language? Especially Ada?

                    1. 16

                      I’ve been looking into this lately, and Ada is cool but if you look online the only jobs for it are at, like, Lockheed Martin and Raytheon. If I could learn Ada and work on train control software or something that would be cool. But it seems my only option in the US is using it to more effectively murder people around the world. When selecting a programming language at your job your employability elsewhere is an important factor. Network effects, I guess.

                      1. 2

                        Your point is spot on recently there’s been post on reddit with same question https://www.reddit.com/r/ada/comments/11qov17/ada_developer_looking_for_a_job_as_software/