1. 21
  1. 18

    Interestingly, I’ve recently been thinking that Rust is redundant here. Most of the time, when I take a shortcut and use, eg, a tuple struct, I come to regret it later when I need to add a 4th field.

    I now generally have the following rules of thumb:

    • Don’t use tuple structs or tuple enum variants. Rationale: if you need to tuple at least two fields, you’ll likely want to tuple three and four in the future, at which point refactoring to record syntax would be a chore. Starting with record syntax isn’t really all that verbose, especially now that we have field shorthand.
    • For newtype’d structs, use record syntax with raw / repr as a field name: struct Miles { raw: u32 }. Rationale: x.0 looks ugly at the call-site, x.raw reads nicer. It also gives a canonical name for the raw value.
    • “newtype” enum variants are OK. Rationale: you don’t .0 an enum variant, there’s only pattern matching (trivia: originally, the only way to work with a tuple struct was to pattern-match it; the .0 “cute” hack was added later (which, several years down the line, required breaking the lexer/parser boundary to make foo.0.1 work))
    • In fact, probably most enums which are not just a private impl detail of some module should contain only unit/newtype variants. Rationale: no strong rationale here, this is the highest-FPR rule of all, but, for larger enums, you’ll often find that you need to work with a specific variant, and splitting off an enum variant into a dedicated type often messes up naming/module structure.
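
    A minimal sketch of the first two rules, nothing authoritative. The names Span, Miles, to_km, span_len and make_span are made up for illustration. The record struct picks up a fourth field without shifting any positions, and the newtype exposes its payload as .raw rather than .0:

    ```rust
    /// Record syntax from the start. Adding `file_id` later does not
    /// renumber anything; code that reads fields by name is unaffected.
    #[derive(Debug, Clone, PartialEq)]
    pub struct Span {
        pub start: u32,
        pub end: u32,
        pub line: u32,
        pub file_id: u32, // the "4th field" that showed up later
    }

    /// Newtype with a named field instead of `.0`.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct Miles {
        pub raw: u32,
    }

    impl Miles {
        /// Rough conversion using integer arithmetic only.
        pub fn to_km(self) -> u32 {
            self.raw * 1609 / 1000
        }
    }

    pub fn span_len(span: &Span) -> u32 {
        span.end - span.start
    }

    pub fn make_span(start: u32, end: u32) -> Span {
        let line = 1;
        let file_id = 0;
        // Field init shorthand keeps record syntax terse.
        Span { start, end, line, file_id }
    }
    ```

    With a tuple struct the same change would turn every `Span(a, b, c)` pattern and constructor into a positional puzzle; here only the struct literals need the new field.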
    1. 8

      My rule of thumb has been that tuples (struct or otherwise) should only very rarely cross module boundaries. However, if I’m just wiring a couple of private methods together, it’s often not worth the verbosity to define a proper struct.

      1. 3

        That’s also reasonable! My comment was prompted by today’s incident, where I had to add a bool field to a couple of tuple-variants to a private enum used in tests :)

      2. 2

        My experience directly, almost word-for-word. I think the importance of having a convention for the “default” name instead of .0 makes all the difference and removes the biggest source of inertia when using a record struct instead of a tuple struct (you picked .raw, I picked .value - tomato tomato).

        In fact, if I had to pick globally, I would take anonymous strongly-typed types (see C#) over tuples in a heartbeat. We just need a shorthand for how to express them as return types (some sort of impls?)

        As an aside, again to steal prior art from C# (which, for its time and place, is incredibly well-designed in many aspects), record types and tuple types don’t have to be so different. Recent C# versions support giving names to tuple members that act as syntactic sugar for .0, .1, etc. If present at the point of declaration they supersede the numbered syntax (intellisense shows names rather than numbers) but you can still use .0, .1, etc if you want; they also remain “tuples” from a PLT aspect, eg the names don’t change the underlying type and you can assign (string s, int x) to a plain/undecorated (string, int) or a differently named (string MyString, int MyInt) (or vice-versa).

        1. 2

          This illustrates for me the kinds of decisions Swift makes in the language that you would otherwise have to settle on by policy and diligence. If you can’t combine two features in a certain way because there’s a better way to solve all the relevant real world problems, they would consider that a benefit to the programmer. That’s especially true when safety is involved, like Rust, but also just ease of use, readability, etc. I think it’s partly a reaction to C++ in which you can do anything and footguns are plentiful.

          1. 2

            Honestly, that’s my least favorite part of Rust. Since there’s always a bunch of ways to do the same thing, you have to be very careful to simultaneously avoid bike shedding and also avoid a wildly inconsistent codebase.

          2. 1

            Could we put this in the docs somewhere?

            1. 2

              I don’t think this kind of advice is good for docs: it’s very opinionated, and Rust’s philosophy is very pluralistic. It’s also a bit dangerous, as it needs nuance to not be misapplied.

              I do find that I’ve accumulated a lot of similar rules of thumb and heuristics which help clarify thinking and make decisions when coding in the small; perhaps I’ll compile them in some long read one day.

            2. 1

              Huh, if only a tool existed that made refactoring these things easy and quick! (Jk, I bet rust-analyzer can do it!)

              1. 1

                The distinction between “newtype” and “tuple” is an interesting one I hadn’t considered, and I agree with it. Just(T) seems fine; my example with (T, U, V) not so much. I don’t share the dislike for newtype structs, but I also get where you’re coming from; Foo { field } is fine for both destructuring in a pattern match and for constructing an instance, so I agree that it doesn’t add that much these days.
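
                A small sketch of that distinction for enum variants, with made-up names (Shape, Circle, Rect, area, square): the single-payload variant stays a newtype, while the multi-field one uses record syntax, and the { field } shorthand covers both matching and construction:

                ```rust
                #[derive(Debug, PartialEq)]
                pub enum Shape {
                    // Newtype variant: one payload, nothing to mix up.
                    Circle(f64),
                    // Record variant instead of `Rect(f64, f64)`: the names
                    // say which number is which.
                    Rect { width: f64, height: f64 },
                }

                pub fn area(shape: &Shape) -> f64 {
                    match shape {
                        Shape::Circle(radius) => std::f64::consts::PI * radius * radius,
                        // Shorthand in the pattern binds `width` and `height`.
                        Shape::Rect { width, height } => width * height,
                    }
                }

                pub fn square(side: f64) -> Shape {
                    let width = side;
                    let height = side;
                    // The same shorthand works when constructing.
                    Shape::Rect { width, height }
                }
                ```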

                I’m not entirely sure I follow your last bullet point; the stated preference and the argument seem to contradict each other, but probably I’m just misreading you?

                1. 5

                  Yeah, that’s confusing, more directly:

                  // DO
                  pub enum Expr {
                    AddExpr(AddExpr),
                    // ...
                  }
                  // Box is needed because the type is recursive.
                  pub struct AddExpr { lhs: Box<Expr>, rhs: Box<Expr> }

                  // DON'T
                  pub enum Expr {
                    AddExpr { lhs: Box<Expr>, rhs: Box<Expr> },
                    // ...
                  }
                  

                  If you do the latter, you might find yourself wanting to refactor it later. But, again, this is very weak guideline, as adding an extra struct is very verbose. Eg, rust-analyzer doesn’t do this for expressions:

                  https://github.com/rust-lang/rust-analyzer/blob/2836dd15f753a490ea5b89e02c6cfcecd2f32984/crates/hir-def/src/expr.rs#L175-L179

                  But it does this for a whole bunch of other structs, to the point where we have a dedicated macro there:

                  https://github.com/rust-lang/rust-analyzer/blob/2836dd15f753a490ea5b89e02c6cfcecd2f32984/crates/stdx/src/macros.rs#L22-L47
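
                  To make the “work with a specific variant” point concrete, here is a rough sketch with hypothetical names (Lit, Add, eval, eval_add). The From impl is written by hand as an example of the boilerplate this style invites; it is an assumption for illustration, not a description of what the linked macro expands to:

                  ```rust
                  pub enum Expr {
                      Lit(i64),
                      Add(AddExpr),
                  }

                  pub struct AddExpr {
                      pub lhs: Box<Expr>,
                      pub rhs: Box<Expr>,
                  }

                  // Hand-written conversion from the payload type to the enum.
                  impl From<AddExpr> for Expr {
                      fn from(add: AddExpr) -> Expr {
                          Expr::Add(add)
                      }
                  }

                  // Because the payload is its own type, a function can ask
                  // for exactly this variant in its signature.
                  pub fn eval_add(add: &AddExpr) -> i64 {
                      eval(&add.lhs) + eval(&add.rhs)
                  }

                  pub fn eval(expr: &Expr) -> i64 {
                      match expr {
                          Expr::Lit(value) => *value,
                          Expr::Add(add) => eval_add(add),
                      }
                  }
                  ```

                  With the record-variant form there is no type to name in eval_add’s signature, so it would have to take the two operands separately or accept any Expr.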

                  1. 1

                    Ah, so basically your rule here would be “just do what Swift requires you to do here.” I understand but… well, see the original post; I obviously disagree. 😂 I get why, from the refactoring POV and the “it’s useful to name the inner type sometimes” POV, but I find that to be much more of a judgment call on how the type is used. I think I would find that design constraint more compelling as an option if there were a way to limit where you can construct a type that gets inherently exposed that way, while still allowing it to be matched against. You’d need something like Swift’s ability to overload its ~= to make that actionable, though, I think. 🤔

              2. 5

                So when did an “enum” go from a series of abstract values:

                enum month { January, February, March, April, ... };
                

                to a union of structures? Or is my C bias showing here?

                1. 9

                  Note that a Rust enum is not quite a C union either; it is a discriminated/tagged union. I think the idea behind calling these enums in Rust is that you can explain it as “similar to the enums you’re used to, but maybe with data attached to the variant.” I suspect union would still have been a better starting point pedagogically, but all I’ve got is a gut instinct based on a couple of interactions, and anyway that ship has sailed.

                  1. 5

                    Also, it allowed Rust to reserve union for something which (unsafe-ly!) has the semantics of C-style unions. To the grandparent post: Rust’s enums do exactly what your example shows in the base case: if you write what you wrote there it will have the same behavior it would in C.

                    1. 3

                      I think union would have been even more confusing given the C/C++ baggage there. I think Rust reused that keyword for exactly what it should be used for.

                      1. 2

                        Enumerating values?

                      2. 3

                        The enums I am used to are just named integer values, like the example I gave in C, or in Pascal:

                        TYPE
                            color = (red , green , blue);
                            month = (January, February, March, ...);
                        

                        The C keyword enum is short for “enumeration”. As for discriminated/tagged unions, you can do that too in C:

                        typedef struct
                        {
                          int type;
                          unsigned long serial;
                          /* ... */
                        } XAnyEvent;

                        /* Every member starts with an int type field, which acts as the tag. */
                        typedef union _XEvent
                        {
                          int type;
                          XAnyEvent xany;
                          /* ... */
                        } XEvent;
                        

                        Yes, it’s not automatic, but it can still be done. And I don’t mind it being done automatically, it’s just in my head, that’s still a union, not an enum.

                        1. 2

                          Yep, of course you can build it with C! The key here (as I said on a different reply) is that the simplest form of enum does have the same semantics as the C/C++ idiom. This is perfectly valid Rust:

                          enum Month { January, February, March, April, /* ... */ }
                          

                          And you can even, in the case where none of the variants carries data, explicitly give them discriminant values, just as in C:

                          enum Month {
                              January = 1,
                              February = 2, // doesn't have to be specified, as in C/C++
                              // ...
                          }
                          

                          When you do this, you are enumerating a closed set of choices. It’s not a far jump to the idea that you might also want to enumerate a closed set of choices which include data—which is exactly what the rest of Rust’s enum capability is. It is as if you wrote two things to give yourself the ability to do exhaustiveness checking with a manually tagged union in C:

                          • an enum for the tag
                          • a union which always uses an enum value to distinguish the union fields, and where those are carefully maintained as disjoint

                          Rust chose enum here because it allows this broader case (where it carries data) to be a generalization of the narrow case (where it is basically just a discriminant for the case), rather than having to switch modes entirely when you want to start carrying around data.
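
                          A sketch of both ends of that spectrum, using made-up names (Event, KeyPress, Resize, Quit, describe). The first enum is the C-style case with integer discriminants; the second attaches data to some variants, and the compiler maintains and checks the tag that the manual C idiom keeps by hand:

                          ```rust
                          #[derive(Debug, Clone, Copy, PartialEq)]
                          pub enum Month {
                              January = 1,
                              February, // implicitly 2, as in C
                              March,
                          }

                          // Same keyword, now with data on some variants.
                          #[derive(Debug, PartialEq)]
                          pub enum Event {
                              KeyPress { keycode: u32 },
                              Resize { width: u32, height: u32 },
                              Quit,
                          }

                          pub fn describe(event: &Event) -> String {
                              // The match must be exhaustive: deleting an arm
                              // is a compile error.
                              match event {
                                  Event::KeyPress { keycode } => format!("key {}", keycode),
                                  Event::Resize { width, height } => format!("resize {}x{}", width, height),
                                  Event::Quit => String::from("quit"),
                              }
                          }
                          ```

                          A data-free enum can still be cast to its discriminant, so `Month::February as i32` evaluates to 2.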

                          1. 1

                            Yeah, this is more or less the perspective I’ve gotten from basically every C programmer I’ve talked to who has a strong opinion.

                        2. 3

                          When Rust needed a non-jargon term for a sum type.

                          1. 1

                            And so they took a jargon term (enum) for another concept from their main “replacement target” language (C/C++) and used that, just to keep things interesting for all those C/C++ developers that wish to switch to Rust?

                            To me there are only two plausible explanations for overloading enum like this: whoever made this decision didn’t think this through or it’s a big middle finger to C/C++. It’s in all likelihood the former and admitting mistakes were made is IMO a better way to deal with it.

                            1. 5

                              Why are you interpreting this as some kind of attack from Rust? As if C owned the word and had some exclusive use-patent for it. There is a common vocabulary used across programming languages and math, and terms are adopted between them when it helps with familiarity.

                              C adopted some mathematical terms too, e.g. it has “integers” even though they’re not infinite! But int x is more understandable than e.g. ring x or some jargon about ℤ subsets, congruence relations, or modulo arithmetic.

                              Naming is hard. There is a trade-off between being specific and familiar. In the case of sum types, “rebranding” as enum makes them more familiar and easier to understand for the target audience. It’s close enough that people get it “oh, it’s like the C enum, but can have non-integer values”. There are cases where Rust chose the other way: trait could have been interface or abstract class. There’s also a case where use of a familiar term IMHO backfired: reference. They’re sort-of like C++ references, but the name doesn’t capture the crucial ownership aspect. IMHO Rust would have been easier to learn if & was called a loan.

                              1. 1

                                Why are you interpreting this as some kind of attack from Rust?

                                I do not and I don’t see how my comment could be interpreted as such. If anything, I am expressing my bewilderment at the decision to overload the enum meaning like this while pitching Rust as an alternative to C/C++ and trying to attract developers familiar with these languages.

                                It’s close enough that people get it “oh, it’s like the C enum, but can have non-integer values”.

                                I can tell you that when I (having 20+ years developing in C++) first read about Rust’s enum, that was not my reaction. And judging by this thread, I am not the only one.

                                Naming is hard. […]

                                I agree with the overall sentiment, but I think in this case it was not hard to foresee. In fact, I wonder who made this decision and whether it’s documented somewhere? I would really like to see if this was even a consideration. Does anyone have any pointers?

                                1. 5

                                  There was discussion on the mailing list (a bit hard to read, as there are two threads intertwined):

                                  The general algorithm for finding this sort of thing:

                                  • go to rust pre-history: https://github.com/graydon/rust-prehistory
                                  • find the relevant test
                                  • find the test in the modern rust repo (test files are not renamed)
                                  • blame to find the commit that introduced the change
                                  • along the way, fish for keywords to search the mailing list (tag and enum, in this case)

                                  My summary of the discussion:

                                  • the thing was actually called tag; this felt weird
                                  • variant was considered, but was too long (at that time, there was a hard limit of 5 chars for keywords)
                                  • union was considered, but it had the wrong semantics: a C programmer typing union in Rust would be confused
                                  • enum was considered: a C programmer typing enum in Rust would find that it just works
                                  • there’s precedent for using enum for data-carrying variants in Java

                                  Note how, in my other comment, I re-created all the arguments (including the Java one!) working backwards from the language values. It’s weak evidence that this is indeed a value-driven design, rather than an arbitrary decision.

                                  1. 5

                                    I do not and I don’t see how my comment could be interpreted as such.

                                    TBH, the “there are only two plausible explanations” bit rubbed me the wrong way too: it has an unfortunate combination of being arguably wrong (there is a possibility that someone thought hard about this and made an informed decision for values other than spite) and emotionally loaded (can’t say that “middle finger” reads neutral to me).

                                    1. 3

                                      Ok, fair enough, I guess the enum choice seemed so obviously wrong to me that it made me think of less charitable explanations. But from your comment I can see how/why someone could reasonably arrive at such a choice (but still wrong, IMO). Also thanks for digging up the history on this, much appreciated.

                                2. 5

                                  Well, there’s at least a third possible explanation: the designer thought carefully, and it was the best choice according to their values.

                                  To me, the choice to use enum seems pretty defensible.

                                  • There’s already a precedent in attaching data to variants: Java’s enums do that (they require the same data for all variants, so they are not sum types, but they are decidedly not a C-style bunch of integer constants).
                                  • Another possible choice of keyword is union, but that arguably has even more different semantics. In particular, without associated data Rust’s enum is pretty close to C’s enum, while a C union without associated data is just an incoherent thing. Even more directly, Rust has a union keyword to declare unions, which are a different thing from enums.
                                  • Another choice would be variant. That’s probably the right term here. The problem with it is that it’s unfamiliar and doesn’t have precedent in C-style languages. This also leaves the enum keyword unused, which creates a needless difference from C-style enumerations.
                                  • Rust also, by design, has short keywords, and it’s difficult to truncate variant without it being confused with variable.
                                  • Besides union, variant, and enumeration, I don’t think there are any other pre-existing terms for this semantic space. And inventing a new term (eg, pub choice Month { Jan, … }) goes way against the goal of familiarity.
                                  • Another option would be to get rid of the struct/enum keywords altogether and use ML-style sigil syntax for specifying enums, but that seems to stray even further from C-style curly-braced type declarations.

                                  Given Rust’s values of:

                                  • being familiar to C/C++ developers
                                  • bringing good “functional” ideas to practice without high-brow academic jargon
                                  • preference for terse keywords

                                  I personally don’t see any other choice Rust could’ve made here (but maybe I’ve missed some non obviously horrible alternative?)

                                  1. 1

                                    Well, there’s at least a third possible explanation: the designer thought carefully, and it was the best choice according to their values

                                    Wouldn’t that be bad? Shouldn’t they use the agreed upon language values (e.g., those you list at the end of your comment) rather than their personal values?

                                    To comment on your other points: variant is pretty familiar to C++ developers: std::variant is in C++17 and was long before that in Boost and it was clear to everyone in the community that Boost’s design and name are being used as a base for the standardization. I also find the argument that the variant keyword is too long unconvincing: it’s only one character longer than struct and is a relatively low-frequency keyword, unlike, say, let and mut.

                                    1. 2

                                      Shouldn’t they use the agreed upon language values (e.g., those you list at the end of your comment) rather than their personal values?

                                      I use “language values” and “language designer” values interchangeably: these are very similar, especially in the early days of Rust (it can be argued that today Rust strays away a bit from Graydon’s original value design).

                                    2. 1

                                      If Rust valued being familiar to C/C++ developers, I think they failed in this case. I’m also failing to see why keywords have a limit of five characters (I’m not interested in tracking that decision down, but it seems rather arbitrary to me). One other question: what is a Rust union then?

                                      1. 2

                                        It’s an unsafe construct which has exactly the semantics of a C (untagged) union and was reserved for that exact purpose for interop.

                                        See also my other comments in this discussion (on sibling threads, esp. here) showing how the C definition of an enum copied over to Rust does exactly what it would in C, which is a pretty good way of being “unsurprising” in my view!
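
                                        For the union half of the answer, a minimal sketch (IntOrFloat and bits_of are made-up names): the two fields share the same four bytes, nothing records which one was written last, and reading a field therefore needs unsafe:

                                        ```rust
                                        // An untagged union, laid out as in C.
                                        #[repr(C)]
                                        pub union IntOrFloat {
                                            pub int: u32,
                                            pub float: f32,
                                        }

                                        pub fn bits_of(value: f32) -> u32 {
                                            let u = IntOrFloat { float: value };
                                            // Reading a union field is unsafe: the compiler
                                            // cannot know which field is currently valid.
                                            // Any 32-bit pattern is a valid u32, so this
                                            // particular read is sound.
                                            unsafe { u.int }
                                        }
                                        ```

                                        Writing a field is safe; only reads need the unsafe block, which is where this differs from an enum’s checked match.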

                                      2. 1

                                        How about data, which is used for sum types in Haskell?

                                      3. 4

                                        As far as I can see, Swift uses the term enum in the same way Rust does:

                                        https://docs.swift.org/swift-book/LanguageGuide/Enumerations.html

                                    3. 1

                                      This is what happens if you force C++ syntax on an ML-like language.

                                    4. 5

                                      I think this points out that Rust features are often built in a way that is complementary to the rest of the Rust language in a way that is more cohesive than just adding features.

                                      1. 4

                                        Well, the “tuple” fields in a Swift enum’s case can be named as well.

                                        1. 2

                                          Ah, good point – I had forgotten that, because I’ve not usually seen it… except in function definitions, since that’s how args are modeled, which had entirely fallen out of my head until you mentioned this.

                                          1. 2

                                            In addition to that, if you really wanted to use structs there, you could move them inside the enum if that’s the only place you want to use them. This reduces the clutter in your top level namespace and makes it obvious that a value of that struct type came from that enum when the value is bound to a variable through a pattern match for example.

                                        2. 2

                                          Since, as can be easily deduced from this article, a “struct” is equivalent to an “enum” with one branch, what’s the point of having this distinction at all?

                                          1. 2

                                            There are things you can do with structs you cannot do with enums; in particular, creating a zero-sized struct which has no runtime cost but is useful for type-checking (struct Empty;). I don’t believe the emitted code for a single-variant discriminant-only enum is the same there: it still emits the discriminant, I believe.

                                            There’s also user-facing value in being able to define the data structure itself standalone, in terms of both convenience and usability, and that goes extra when composing together types.
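
                                            A sketch of the zero-sized struct idea, with hypothetical names throughout (Meters, Feet, Length, marker_size, tagged_size). The unit structs carry no data and exist only so the type checker can tell two kinds of length apart:

                                            ```rust
                                            use std::marker::PhantomData;
                                            use std::mem::size_of;

                                            // Zero-sized marker types.
                                            pub struct Meters;
                                            pub struct Feet;

                                            // A length tagged with its unit; the tag takes no space.
                                            pub struct Length<Unit> {
                                                pub value: f64,
                                                unit: PhantomData<Unit>,
                                            }

                                            impl<Unit> Length<Unit> {
                                                pub fn new(value: f64) -> Self {
                                                    Length { value, unit: PhantomData }
                                                }
                                            }

                                            // Size of the marker alone.
                                            pub fn marker_size() -> usize {
                                                size_of::<Meters>()
                                            }

                                            // Size of a tagged length: just the f64.
                                            pub fn tagged_size() -> usize {
                                                size_of::<Length<Feet>>()
                                            }
                                            ```

                                            Passing a Length<Feet> where a Length<Meters> is expected is a type error, even though both are laid out as a single f64.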

                                          2. 2

                                            That’s neat that enum cases are automatically inner types. Swift doesn’t offer that.

                                            But in everyday use I like Swift’s tradeoff: An enum case can have many named properties, and you can bind each to a variable right in the switch case. To the article’s point, there’s no reason you can’t use a whole struct as a Swift enum case’s sole unnamed property. But there’s a good chance that type exists already and doesn’t need to be created just for the enum case. If so, the enum case name is likely to be different from the property type name, to give it meaning in context with the other cases.

                                            Over the years Swift has removed some features that didn’t get used, like automatically using a tuple as a function argument list, and curried function declaration syntax. Swift was highly influenced by Rust and surely got its enums from Rust, so I wonder what decisions led from that to this?