1. 13
  1. 5

    The relevant type system features would be “refinement types” and any form of a “record type”. It would also be worth checking out “row polymorphism”. (NB: refinement types are more general than just restricting the fields of a record.) Generally the problems are as follows:

    • Adding records and refinements might require width and/or depth subtyping. And then subtyping adds complexity to the type system, to the point where it may no longer be able to infer or accept some useful types. (In particular, combining subtyping with a method of abstraction, like a bounded generic or associated type, tends to be difficult.)
    • When you have subtyping, the runtime layout of the objects involved may no longer be static, and require you to have a method of resolving fields/methods at runtime.

    You can define a refined record type of the same style in TypeScript:

    type Foo = { bar: number, baz: string };
    type FooWithOnlyBar = Pick<Foo, "bar">; // i.e. { bar: number }
    
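    To make the subtyping point concrete, here's a small self-contained sketch of my own (the `useBar` name is illustrative, not from anything above). TypeScript's structural typing gives you width subtyping implicitly: a full `Foo` is accepted anywhere only `bar` is required.

    ```typescript
    type Foo = { bar: number, baz: string };
    type FooWithOnlyBar = Pick<Foo, "bar">; // just { bar: number }

    // Width subtyping: any value with *at least* a `bar: number` field
    // is assignable to FooWithOnlyBar.
    function useBar(value: FooWithOnlyBar): number {
      return value.bar;
    }

    const foo: Foo = { bar: 1, baz: "hi" };
    console.log(useBar(foo)); // accepted implicitly, no cast needed
    ```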

    Usually, you want to add at least some implicit subtyping to such a type system, or else it becomes unergonomic to use. For example:

    struct Foo {
      f1: usize,
      f2: usize,
      f3: Vec<usize>,
    }
    
    impl Foo {
      fn print_f1(&{f1} self) {
        println!("f1 is {}", self.f1);
      }
    
      fn use_f3(self) {
        // Move out of `f3` here. After this, the type of `self` should
        // be refined to `{f1, f2} Foo`.
        let f3 = self.f3;
    
        // Surely this call should be allowed? But without implicit subtyping,
        // you only have a `&{f1, f2} Foo`, and that's not directly compatible
        // with a `&{f1} Foo`.
        self.print_f1();
      }
    }
    

    One solution is to force the user to write a type annotation/downcast in any place where the types don't match up exactly. This is potentially reasonable: Rust already forces you to write type annotations in some cases, like writing `as &dyn Foo` to accomplish subtyping with trait objects. But it seems clunky if you have to do this with `self`.

    If you do choose to support implicit subtyping, then this kind of subtyping would be called “width” subtyping. Type inference gets a little bit more complicated — but, then again, the compiler will already give up in some cases where it gets too difficult.

    There are also some cases where people will probably want to be polymorphic over a `T` with captured fields, like this:

    fn print_foo<T>(value: &{foo: usize} T) {
      println!("foo is {}", value.foo);
    }
    

    It's a little difficult to implement in a language like Rust, since the layout of `T` will be different for different types, which means that the accessor `value.foo` would also generate different code. (Whereas in many object-oriented dynamic languages, since all values are boxed and admit arbitrary property lookup, adding a static type system on top that supports this kind of analysis is easier.) It is quite possible, though; see what OCaml did for row polymorphism.
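    As a rough analogue of what that signature wants (my sketch, not true row polymorphism): TypeScript can say "any `T` with at least a `foo` field" via a generic bound, and, unlike plain width subtyping, the rest of `T`'s fields are carried along in the type. This is easy there precisely because every value is a uniformly boxed object supporting property lookup.

    ```typescript
    // T must have at least `foo: number`; the remaining fields of T
    // survive in the return type rather than being erased.
    function printFoo<T extends { foo: number }>(value: T): T {
      console.log(`foo is ${value.foo}`);
      return value;
    }

    const result = printFoo({ foo: 7, extra: "kept" });
    // `result.extra` is still well-typed: the extra field wasn't lost.
    console.log(result.extra);
    ```

    Strictly speaking this is bounded quantification over a structural subtype, not row polymorphism, but it captures the "polymorphic over the rest of the record" flavor.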

    Most of the rest of the complications from using these type system features together come from “depth” subtyping, in which a record type is a subtype of another record type if it has the same fields and all of the fields are subtypes, respectively. But if you can’t be polymorphic over the fields themselves (as with row polymorphism), and you can’t be polymorphic over the base type, then it doesn’t seem that depth subtyping is useful.
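    For reference, here's what depth subtyping looks like in practice; a hedged TypeScript sketch of my own (TypeScript permits this for reads, since property types are treated covariantly):

    ```typescript
    type Inner = { a: number, b: string };
    type Outer = { x: Inner };

    // Depth subtyping: Outer is assignable to { x: { a: number } }
    // because Inner is itself a subtype of { a: number }.
    function readA(value: { x: { a: number } }): number {
      return value.x.a;
    }

    const outer: Outer = { x: { a: 3, b: "hi" } };
    console.log(readA(outer));
    ```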

    1. 2

      There are also some cases where people will probably want to be polymorphic over a `T` with captured fields, like this:

      fn print_foo<T>(value: &{foo: usize} T) {
        println!("foo is {}", value.foo);
      }
      

      It's a little difficult to implement in a language like Rust, since the layout of `T` will be different for different types, which means that the accessor `value.foo` would also generate different code.

      I think Rust is principally based around zero-cost-y abstractions, so they wouldn't do this, but this seems to be a similar problem to generics in general, where you have something like `value.into().thingy` for various types, and the method call for `into` is definitely not going to be at the same address.

      I guess the way Rust ends up implementing this is by passing in some sort of side structure to provide the right metadata for the `into` method, and you could imagine something like that being generated anonymously here by the compiler in the case of these views.

      This would be messy and probably not acceptable for Rust, but I’d be fine with it. That or just monomorphising everything that gets touched.
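      The "side structure" idea is essentially dictionary passing, which can be hand-rolled to show the shape of it (my construction; nothing any compiler actually generates). The caller supplies an accessor, so the generic body never needs to know the concrete layout:

      ```typescript
      // A "dictionary" bundling the operations the generic code needs;
      // it plays the role of the compiler-generated side structure.
      type FooDict<T> = { getFoo: (value: T) => number };

      function printFooWith<T>(value: T, dict: FooDict<T>): number {
        // Field access goes through the dictionary, not a fixed offset.
        const foo = dict.getFoo(value);
        console.log(`foo is ${foo}`);
        return foo;
      }

      type Point = { foo: number, y: number };
      const pointDict: FooDict<Point> = { getFoo: (p) => p.foo };
      console.log(printFooWith({ foo: 42, y: 1 }, pointDict));
      ```

      A compiler could synthesize `pointDict` anonymously at each call site, which is the "messy" part being alluded to; monomorphization instead bakes the offset into a per-type copy of the function.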

      1. 2

        There’s also some cases where people will probably want to be polymorphic over a T with captured fields, like this:

        fn print_foo<T>(value: &{foo: usize} T) {
          println!("foo is {}", value.foo);
        }

        It's a little difficult to implement in a language like Rust, since the layout of `T` will be different for different types, which means that the accessor `value.foo` would also generate different code. (Whereas in many object-oriented dynamic languages, since all values are boxed and admit arbitrary property lookup, adding a static type system on top that supports this kind of analysis is easier.) It is quite possible, though; see what OCaml did for row polymorphism.

        I don't think this is any different from the case where a function is polymorphic over any kind of generic type, which Rust supports currently - the compiler would create a monomorphized version of the function for every concrete type actually used in the crate.

        1. 1

          You’re right.

      2. 2

        One positive thing from Rust's current restrictions is that it has sometimes encouraged me to factor a single large type into multiple smaller ones, where the smaller ones encapsulate a group of logically related fields that are accessed together. On the other hand, I've also encountered situations where such refactorings feel quite arbitrary – I have groups of fields that, yes, are accessed together, but which don't form a logical unit on their own.

        As an example of both why this sort of refactoring can be good and bad at the same time, I introduced the `cfg` field of the MIR `Builder` type to resolve errors where some methods only accessed a subset of fields. On the one hand, the CFG-related data is indeed conceptually distinct from the rest. On the other, the `CFG` type isn't something you would use independently of the `Builder` itself, and I don't feel that writing `self.cfg.foo` instead of `self.foo` made the code particularly clearer.

        This is something that I've also noticed myself being led to by Rust's reference semantics. One idea that I've had about this is to allow defining nested types. E.g. if the `CFG` type only ever makes sense in the context of the `MIRBuilder` type, allow `CFG` to be defined such that it is scoped "within" `MIRBuilder` (e.g. `MIRBuilder::CFG`), and can only be used within the `MIRBuilder` type unless explicitly exported. This doesn't solve the problem of `self.cfg.foo` being more verbose, however.
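        The "scoped unless explicitly exported" idea can be approximated with module privacy; a sketch under that assumption (the `CFG`/`MIRBuilder` names are just borrowed from the example above):

        ```typescript
        // CFG is deliberately not exported from this module, so it is
        // only usable alongside MIRBuilder, approximating a type scoped
        // "within" MIRBuilder.
        class CFG {
          blocks: number[] = [];
        }

        class MIRBuilder {
          private cfg = new CFG();

          addBlock(id: number): void {
            // the self.cfg.foo verbosity remains, as noted
            this.cfg.blocks.push(id);
          }

          blockCount(): number {
            return this.cfg.blocks.length;
          }
        }

        const builder = new MIRBuilder();
        builder.addBlock(0);
        console.log(builder.blockCount());
        ```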