1. 31
  1. 4

    I agree that newtypes theoretically decrease type safety when used improperly, and if there’s a (reasonably) better way to represent a type you should absolutely use that, but short of switching to a language with a dependent type system I don’t think it’s a problem that can be solved.

    And even with dependent types, the difference between structurally equivalent values like a Distance and a Duration, or a String containing a valid GraphQL name and a GraphQL.Name is only in their name, so giving them a name in the type system is the only way to represent them.

    1. 2

      I don’t think using newtypes is necessarily a problem, but they should be treated properly. They are fundamentally less type safe than a constructive representation, and should be handled as such. Even with dependent types there are likely situations that are far trickier to model with constructive types than newtypes. It’s just tradeoffs all the way down.

      1. 1

        The post gives the NonEmpty datatype as an example that does not require dependent types. The argument is that instead of wrapping [T] (list of T) in a NonEmpty newtype, one should model NonEmpty as a pair of a head and a tail list, that is, (T, [T]).

        The example itself is pretty compelling to me, but I wonder how common such cases are. Maybe they are rare without dependent types. I am not sure.
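
        For concreteness, the head-plus-tail idea could be sketched in Haskell roughly as follows. The helper names (neHead, fromList, toList) are made up for illustration and are not taken from the post:

        ```haskell
        -- A non-empty list built constructively: one guaranteed element plus a tail.
        data NonEmpty a = a :| [a]
          deriving (Show, Eq)

        infixr 5 :|

        -- Total: there is always a first element, so no Maybe and no runtime error.
        neHead :: NonEmpty a -> a
        neHead (x :| _) = x

        -- The only place emptiness has to be handled is at the boundary.
        fromList :: [a] -> Maybe (NonEmpty a)
        fromList []       = Nothing
        fromList (x : xs) = Just (x :| xs)

        -- Forgetting the guarantee is always safe.
        toList :: NonEmpty a -> [a]
        toList (x :| xs) = x : xs
        ```

        Here the empty case is simply not representable, so neHead needs no hidden invariant to be correct.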

        1. 1

          The other example in the post is an enum, which is a very common example.

      2. 3

        Perhaps languages should have a strict divide between structural types and behavioral types. The safety we desire is sometimes about knowing that certain otherwise-illegal states are unrepresentable, but also sometimes knowing that certain otherwise-undesired behaviors are impossible.

        1. 3

          Is there a family of type systems that lets me say something like: this validation function returns a proof of sorts that a value is safe, all functions that operate on that value assuming it is safe have to take that proof as a parameter, the compiler checks that the proof is of the correct type and corresponds to the value you provided, and only one function can return that proof? Essentially something like this pseudo-ocaml:

          let validate_html str : option html_string =
            if is_valid_html str then
              Some (html_string str)
            else None
          with proof html_string : string
          let do_something (html : html_string) = (* *)
          (* type error: only validate_html is allowed to generate a fresh html_string value *)
          let validate_html2 str : option html_string = (* *)
          (* type error: expected html_string but got string *)
          do_something "abcde"
          let s : html_string = validate_html "<div />" |> Option.get
          String.substr s 0 2 (* html_string could gracefully degrade to string *)
          |> do_something (* but this would be a type error *)

          I’ve heard of refinement types and Liquid Haskell, but that’s not quite what I’m thinking of. I think this looks more like Rust’s borrow checker.
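
          For comparison, the closest thing I know of in plain Haskell is an abstract type with a hidden constructor. This is only a sketch of that approximation, not the feature described above: the module name, isValidHtml (a placeholder check, not a real validator), and doSomething are all assumptions made up for the example.

          ```haskell
          module HtmlString
            ( HtmlString      -- type is exported, its constructor is not
            , validateHtml
            , toString
            , doSomething
            ) where

          -- The constructor is private to this module, so outside code cannot forge a value.
          newtype HtmlString = HtmlString String

          -- Placeholder check (assumption): a real validator would parse the input.
          isValidHtml :: String -> Bool
          isValidHtml s = case s of
            ('<' : _) -> last s == '>'
            _         -> False

          -- The single exported way to obtain an HtmlString.
          validateHtml :: String -> Maybe HtmlString
          validateHtml s
            | isValidHtml s = Just (HtmlString s)
            | otherwise     = Nothing

          -- Explicit "degrade to String"; the result can no longer be passed to doSomething.
          toString :: HtmlString -> String
          toString (HtmlString s) = s

          -- Only accepts values that came out of validateHtml.
          doSomething :: HtmlString -> Int
          doSomething (HtmlString s) = length s
          ```

          The guarantee here comes from the module boundary rather than from a compiler-checked proof: any code inside the module can still build an HtmlString without validating, so it is weaker than what the pseudo-ocaml asks for.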

          1. 1

            In the general case that is a dependent type system, like Idris.

        2. 3

          I think the underlying idea here—that it’s preferable to model data so that impossible states are eliminated internally—is an important one. But focusing on newtype seems misguided.

          You can have this level of safety with newtype:

          newtype NonEmpty a = NonEmpty (a, [a])

          And you can lack this safety with data:

          data NonEmpty a = NonEmpty [a]

          It’s not about newtype vs. data. It’s about how you model that data within the constructor.

          Lexi does mention newtype wrappers specifically in a couple of places, trying to call out the specific anti-pattern of wrapping an existing type and adding an external constraint that isn’t enforced internally. Within that context, this advice makes sense, but I think it’s not as clear as it could be.

          I think this nuance is important because you should be looking at the individual constructors within your data types as well.

          1. 2

            I’m not convinced. There’s a bit of a strawman feel to this argument, setting up a particularly narrow definition of type safety and tearing down newtypes based on that. Am I missing the point?

            Specifically, regarding the GraphQL.Name example, there seems to be a clear benefit to newtyping that I’d be happy to label type safety:

            Say we have notions of GraphQL variables and functions. (No idea if GraphQL really has that, but it shouldn’t be relevant to the argument.) At some point, we have relatively unstructured GraphQL data that includes definitions of both variables and functions, which both have identifiers of type GraphQL.Name. Then we have some parsing step that distinguishes between variables and functions, and maps the former’s identifiers to a newtype wrapper VarName, the latter’s identifiers to a separate newtype wrapper FuncName. And then somewhere else in the codebase, we have a function apply :: FuncName -> VarName -> String. Now the type system is pretty helpful in ensuring you don’t accidentally call apply var func, isn’t it?
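
            A minimal sketch of that setup, assuming a stand-in Name type in place of GraphQL.Name and an invented body for apply (only its signature comes from the scenario above):

            ```haskell
            -- Hypothetical stand-in for GraphQL.Name.
            newtype Name = Name String deriving (Show, Eq)

            -- Two wrappers over the same representation, distinct to the type checker.
            newtype VarName  = VarName Name  deriving (Show, Eq)
            newtype FuncName = FuncName Name deriving (Show, Eq)

            -- The body is made up for illustration; the point is the argument types.
            apply :: FuncName -> VarName -> String
            apply (FuncName (Name f)) (VarName (Name v)) = f ++ "(" ++ v ++ ")"

            example :: String
            example = apply (FuncName (Name "f")) (VarName (Name "x"))

            -- Rejected at compile time, because the arguments are swapped:
            -- bad = apply (VarName (Name "x")) (FuncName (Name "f"))
            ```

            Swapping two already-wrapped values is caught, which is the benefit I have in mind.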

            1. 4

              My takeaway was not that this post was tearing down newtypes but instead putting them in the correct place with the tradeoffs defined. Constructive modeling is better than newtypes if it can be reasonably done, since it provides more type safety, but newtypes are still better than nothing; they just have to be treated as potentially dangerous.

              The GraphQL.Name point I think is somewhat unrelated. There are definitely use-cases for a newtype pattern here that actually provides safety. But for this particular case the author saw a newtype being used which didn’t provide any additional safety, which I think we can all agree is just noise.

              Maybe the takeaway is that simply using a newtype does not provide any benefits by itself, and requires creating and maintaining an API around it to see benefits. Whereas a constructive type provides benefit by default.

              1. 1

                How do you know that the ArgumentName newtype didn’t provide any safety, e.g. along the lines of the example I laid out? I don’t see the author arguing that. Instead we get a flippant “This newtype is useless noise.”

                1. 5

                  I guess I’m not completely following. The example the author chose was specifically a scenario where the newtype was effectively no different from the encapsulated type, and the usage in the codebase was entirely just wrapping and unwrapping. In this scenario the author is arguing that the newtype was just providing noise.

              2. 1

                And then somewhere else in the codebase, we have a function apply :: FuncName -> VarName -> String. Now the type system is pretty helpful in ensuring you don’t accidentally call apply var func, isn’t it?

                There are cases where this can happen, but with the example given I think the point was that the constructor is exported (or at least an unwrapper function is) and every single codepath immediately unwrapped. So it’s still easy to say:

                apply (FuncName "someVar") (VarName someFunc)

                because you’re cutting and pasting stuff around or being lazy and keeping things in text/string too long or taking them from direct user input or whatever.

              3. 1

                For the OneToFive type, is there a middle ground between the enum and newtype approaches? Is there any way to achieve the benefits of both?

                1. 3

                  Sure! One way could be this:

                  data OneToFive = One | Two | ... deriving (Bounded, Enum)

                  Such that you can convert to/from Int and do Int-y operations when needed, but you always have to convert back to a OneToFive when you’re done. And make sure you have enough helpers around for the common actual purpose of the enum that you can usually use it without the conversion to Int!
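
                  Roughly what I mean, as a sketch. The constructors after Two are spelled out by assumption, and toInt, fromInt and add are hypothetical helper names:

                  ```haskell
                  data OneToFive = One | Two | Three | Four | Five
                    deriving (Show, Eq, Ord, Bounded, Enum)

                  -- Derived Enum numbers constructors from 0, so shift by one.
                  toInt :: OneToFive -> Int
                  toInt x = fromEnum x + 1

                  -- Checked conversion back: out-of-range Ints become Nothing.
                  fromInt :: Int -> Maybe OneToFive
                  fromInt n
                    | n >= toInt minBound && n <= toInt maxBound = Just (toEnum (n - 1))
                    | otherwise                                  = Nothing

                  -- Do the arithmetic on Int, then re-validate the result.
                  add :: OneToFive -> OneToFive -> Maybe OneToFive
                  add a b = fromInt (toInt a + toInt b)
                  ```

                  The enum stays the source of truth, and the Int only exists between toInt and fromInt, so an out-of-range result has to be handled explicitly.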