1. 11
  1.  

  2. 5

    C# made the opposite choice: to update their VM, and invalidate their existing libraries and all the user code that depended on them. They could do this at the time because there was comparatively little C# code in the world; Java didn’t have this option at the time.

    I think in retrospect, the C# choice was completely correct, but indeed, not available to Java. This is one of several places where C# benefited from learning from Java’s mistakes. But there are also a lot of other places where it didn’t, which are more obvious today as the alternatives become better known. Reference types being always nullable is probably the most obvious one.

    1. 4

      This is a very odd article.

      The discussion of erasure is all true but doesn’t feel relevant. A compiler erases details when lowering to a lower-level abstraction in such a way that semantics are preserved. The fact that a signed and unsigned integer end up in the same CPU register doesn’t matter because the compiler is enforcing the guarantee that it will only perform signed or unsigned operations on those registers (for any operations where the difference would be observable) for the duration that those registers hold the signed / unsigned value. If this kind of erasure happens at the same level of abstraction then it is typically considered an escape hatch from the type system and not something that you want to provide.
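      To make the register point concrete, here is a minimal Java sketch. Java has no unsigned integer type, so the programmer picks the operation by hand, which is roughly the bookkeeping a compiler does for you in a language with distinct signed and unsigned types. The class and method names are made up for illustration.

      ```java
      final class SignedVsUnsigned {
          // Treats the 32 bits as a signed two's-complement value.
          static int signedHalf(int bits) {
              return bits / 2;
          }

          // Treats the very same 32 bits as an unsigned value.
          static int unsignedHalf(int bits) {
              return Integer.divideUnsigned(bits, 2);
          }

          public static void main(String[] args) {
              int bits = 0xFFFFFFFF; // -1 if signed, 4294967295 if unsigned

              System.out.println(signedHalf(bits));               // 0
              System.out.println(unsignedHalf(bits));             // 2147483647
              System.out.println(Integer.toUnsignedString(bits)); // 4294967295
          }
      }
      ```

      The bits never change; only the operation applied to them does. That is safe as long as something (here the programmer, normally the compiler) keeps track of which interpretation is in force.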

      The homogeneous vs heterogeneous discussion is conflating two things: are generics fully reified at compile time or are they dynamically dispatched, and is the static type of Foo<X> different from the static type of Foo<Y>? These are largely orthogonal choices. You can do dynamic dispatch if each Foo instance captures the type of the generic parameter and asserts that objects passed to methods of the type specified by the generic parameter are of that type (or doesn’t if the static type system can guarantee it). As an optimisation, an implementation may then choose to perform full reification of Foo<T> for certain values of T and avoid the field containing the generic parameter and specialise methods for this case. If the generic parameters are defined at a fixed location in the class structure then this can even be done in a JIT on a per-method basis, reifying only hot-path methods and filling in all of the others with a single implementation that alters its behaviour dynamically based on the generic parameter that it reads from the class definition.
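      The "each instance captures the type of the generic parameter" idea can be approximated in today's Java by hand. This is only a sketch: `CheckedBag` and its `elementType` field are hypothetical names, and a real VM would keep the parameter in the object or class metadata rather than in a user-visible field.

      ```java
      import java.util.ArrayList;
      import java.util.List;

      final class CheckedBag<T> {
          // The captured generic parameter, carried by every instance.
          private final Class<T> elementType;
          private final List<Object> items = new ArrayList<>();

          CheckedBag(Class<T> elementType) {
              this.elementType = elementType;
          }

          // Accepts Object on purpose: callers that bypassed the static
          // checks (raw types, reflection) are still validated here.
          void add(Object item) {
              items.add(elementType.cast(item)); // ClassCastException on mismatch
          }

          T get(int index) {
              return elementType.cast(items.get(index));
          }

          public static void main(String[] args) {
              CheckedBag<String> bag = new CheckedBag<>(String.class);
              bag.add("hello");
              System.out.println(bag.get(0)); // hello
              try {
                  bag.add(Integer.valueOf(42));
              } catch (ClassCastException e) {
                  System.out.println("rejected at insert");
              }
          }
      }
      ```

      One implementation serves every T, and the check runs on the way in. A specialising implementation could drop the field and the check for the values of T it has proven statically.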

      Indeed, some Java implementations do this kind of reification, even for non-generic types. If you are always using a specific concrete type at a point in the code then they will specialise for this. You don’t need to embed anything in Java bytecode to make this possible in a JIT but doing so can provide useful hints both for early JIT compilation and for AOT compilers.

      The problem with Java generics is that they don’t provide guarantees that you can rely on. A Java Array<T> may contain types that are not subtypes of T. This has the knock-on problem that an Array<int> may actually contain pointers to other objects (Java generics auto-box primitives) and so must reserve enough space for pointers, even though that may double the memory usage.

      The flag day problem is more of an issue. You could say that Foo<Object> is equivalent to Foo, but that just pushes the problem to API consumers and any interface that takes a Foo will break when it’s changed from Foo<Object> to Foo<SpecificThing>. My understanding of the history of Java generics is that this was the overriding concern for generics and, with that constraint, type-erasure is the only thing that you can do. The only real question in my mind is whether, given that constraint, it’s worth having generics at all. If the programmer can’t rely on them because the type checking is advisory and the optimiser can’t rely on them because it has to do type tracking and can apply only the same transforms that it would be able to do for Object fields / arrays, then is there actually any benefit?

      1. 4

        > The only real question in my mind is whether, given that constraint, it’s worth having generics at all. If the programmer can’t rely on them because the type checking is advisory

        I think it’s misleading to call the types “advisory”. They are enforced in all code that chooses not to disregard them, by using unsafe casts, calling into non-generic APIs, or by using reflection APIs in a way that obviously subverts the type system. I would prefer a language which did not have those holes, but calling it “advisory” is a bit much. Rather, there is a relatively clear coding style you can generally adhere to that prevents the problem.

        The result is that, in practice, I can count the number of erasure related bugs I’ve dealt with in the past 8 years on my fingers, while I use generics every single day to help understand the behavior of the code I’m dealing with (and I do, sadly, have to deal with some older code that wasn’t fully genericized, so I know that I’m missing something).

        1. 3

          > I think it’s misleading to call the types “advisory”. They are enforced in all code that chooses not to disregard them, by using unsafe casts, calling into non-generic APIs, or by using reflection APIs in a way that obviously subverts the type system. I would prefer a language which did not have those holes, but calling it “advisory” is a bit much. Rather, there is a relatively clear coding style you can generally adhere to that prevents the problem.

          They’re advisory because those checks appear only in new code. Old code doesn’t see a Map<String>, it sees a Map, which is equivalent to a Map<Object>. Old code can add arbitrary objects to the map. The new code will then get a run-time bad-cast exception when it tries to get the non-String object out of the collection but it may not be catching or correctly handling that exception.

          If they were separate types, even if old code didn’t see the type, then the check would happen inside the insert methods and so the old code would get the exception when it did the wrong thing.
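          A small sketch of the difference, assuming a hypothetical `legacyAdd` method standing in for pre-generics code. The first case is a plain erased list, where the failure surfaces in the new code on read. The second uses `Collections.checkedList` from the standard library, which approximates the "check inside the insert methods" behaviour by carrying the element type at run time.

          ```java
          import java.util.ArrayList;
          import java.util.Collections;
          import java.util.List;

          final class ErasureDemo {
              // Stands in for old code: it only ever sees the raw List type.
              @SuppressWarnings({"rawtypes", "unchecked"})
              static void legacyAdd(List list) {
                  list.add(Integer.valueOf(42));
              }

              // Plain erased list: the bad insert succeeds, the read fails.
              static String plainListFailsOn() {
                  List<String> names = new ArrayList<>();
                  legacyAdd(names); // no exception here
                  try {
                      String first = names.get(0); // compiler-inserted cast fails
                      return "nothing: " + first;
                  } catch (ClassCastException e) {
                      return "read";
                  }
              }

              // Checked wrapper: the bad insert itself is rejected.
              static String checkedListFailsOn() {
                  List<String> names =
                          Collections.checkedList(new ArrayList<>(), String.class);
                  try {
                      legacyAdd(names);
                      return "nothing";
                  } catch (ClassCastException e) {
                      return "insert";
                  }
              }

              public static void main(String[] args) {
                  System.out.println(plainListFailsOn());   // read
                  System.out.println(checkedListFailsOn()); // insert
              }
          }
          ```

          With the checked wrapper the exception lands in the code that did the wrong thing, at the cost of a run-time check on every insert.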

          > The result is that, in practice, I can count the number of erasure related bugs I’ve dealt with in the past 8 years on my fingers, while I use generics every single day to help understand the behavior of the code I’m dealing with (and I do, sadly, have to deal with some older code that wasn’t fully genericized, so I know that I’m missing something).

          It’s been a long time since I wrote Objective-C on a daily basis, but Objective-C back then didn’t have any kind of generics and all collections could store any object that conformed to the NSObject protocol. I think I encountered one or two cases where the wrong type was put in a collection and this introduced a bug. Java and Objective-C Generics are a form of documentation with some minimal automated checks. Documentation is good. It’s not clear to me that they’re a better form of documentation than anything else.

          How many times in the past 8 years have you had code fail to compile because you tried to put the wrong type in a generic? That should tell you the value that generics add.

          1. 2

            Yes, the interface with old code is unsafe. But a tremendous amount of code has been generified. It’s the nature of the thing that adding generics leads to a dynamic where less and less code is unsafe over time. The code base I work on dates back to 2000 or 2001, so substantial parts were written prior to generics, but all the new code is genericized, and older code is genericized as it’s touched (an alternate path would’ve been to genericize it in a focused effort).

            > Java and Objective-C Generics are a form of documentation with some minimal automated checks. Documentation is good. It’s not clear to me that they’re a better form of documentation than anything else.

            It’s a form of documentation that is enforced by the compiler in the ordinary case. That has a leg up on other forms of documentation.

            > How many times in the past 8 years have you had code fail to compile because you tried to put the wrong type in a generic? That should tell you the value that generics add.

            Putting the wrong type in a generic is probably not that common, but what is common is passing a List of one element type to a method that requires a List of a different element type. Those sorts of errors happen constantly, too frequently for me to estimate how often I’ve encountered them. The IDE reports the error, I fix it, then I move on, before I’ve had to run any code.
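            One hypothetical instance of that kind of mismatch, with made-up names: generic types are invariant, so a `List<Integer>` is not a `List<Number>`, and the compiler rejects the call before anything runs. The usual fix is a bounded wildcard on the method.

            ```java
            import java.util.Arrays;
            import java.util.List;

            final class InvarianceDemo {
                // Accepts exactly List<Number>, nothing else.
                static double sumExact(List<Number> values) {
                    double total = 0;
                    for (Number n : values) {
                        total += n.doubleValue();
                    }
                    return total;
                }

                // Accepts a list of any subtype of Number.
                static double sumAny(List<? extends Number> values) {
                    double total = 0;
                    for (Number n : values) {
                        total += n.doubleValue();
                    }
                    return total;
                }

                public static void main(String[] args) {
                    List<Integer> ints = Arrays.asList(1, 2, 3);

                    // sumExact(ints); // does not compile: List<Integer> is not a List<Number>
                    System.out.println(sumAny(ints)); // 6.0
                }
            }
            ```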

            The vast majority of Java developers use generics and seem to find them useful. Nothing prevents you from writing code that doesn’t use generics, but I’ve never really seen this option promoted. This is not definitive, as peer pressure is a thing, but it’s not meaningless either. In the past, I had one old-school coworker who did not consistently use them, and his code was a huge pain to maintain.

        2. 4

          > The only real question in my mind is whether, given that constraint, it’s worth having generics at all.

          I learned Java in the 1.4 days and, despite their shortcomings, the introduction of generics was a huge improvement at the time. Removing explicit casts all over the place was worth it.
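          For anyone who never wrote pre-generics Java, a rough before-and-after sketch (method names are made up for illustration). The first method is written the way 1.4-era code had to be, with a raw collection and a cast at every read. The second is the same logic with generics.

          ```java
          import java.util.Arrays;
          import java.util.Iterator;
          import java.util.List;

          final class CastsBeforeAndAfter {
              // 1.4 style: raw types, explicit iterator, explicit cast.
              @SuppressWarnings("rawtypes")
              static int totalLengthRaw(List words) {
                  int total = 0;
                  for (Iterator it = words.iterator(); it.hasNext();) {
                      String word = (String) it.next(); // cast written by hand
                      total += word.length();
                  }
                  return total;
              }

              // With generics: the compiler inserts the cast for you.
              static int totalLengthGeneric(List<String> words) {
                  int total = 0;
                  for (String word : words) {
                      total += word.length();
                  }
                  return total;
              }

              public static void main(String[] args) {
                  List<String> words = Arrays.asList("ab", "cde");
                  System.out.println(totalLengthRaw(words));     // 5
                  System.out.println(totalLengthGeneric(words)); // 5
              }
          }
          ```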