1. 20
  1. 5

    Wow! It’s a shame the 64-bit version of NT wasn’t finished and released a few years earlier; that might have given people a compelling reason to buy the Alpha machines. I remember seeing these advertised in a computer magazine and I wanted one so much. I think the manufacturer that bought the first ad page in the magazine was selling 266MHz–500MHz Alphas in the same product line as Intel chips that topped out at 200MHz. Unfortunately, the Windows NT versions of the time were 32-bit and mostly ran software under FX!32 (an x86 emulator), so you didn’t get nearly as much of a performance boost as you’d like.

    The thing I most remember was that they weren’t that expensive. They were far more than I could afford, but it wasn’t the jump from PC to UNIX workstation (add a zero to the end of the price tag), just an incremental increase similar to the ones within the x86 line.

    In some ways I’m glad that the Alpha died (the memory model was completely insane - C++11 explicitly gave up on the idea of supporting the Alpha in its memory model - and the floating point behaviour could be fun), but it would have been nice if x86 market share had dropped below 80% in the ’90s.

    1. 2

      In some ways I’m glad that the Alpha died (the memory model was completely insane - C++11 explicitly gave up on the idea of supporting the Alpha in the C++11 memory model)

      I used to own a cast-off Alphastation which was a fun experience but I only ran OpenBSD on it.

      What was the “goal” of Alpha? I know it was developed by DEC, so was it an evolution of their VAX philosophy? Was VAX “native” to Alpha? I don’t know much about VAX but from what I’ve read I believe its memory model was way simpler than x86.

      For example, I’ve read that the Itanium would have required new compilers to be really effective, but no-one was prepared to develop these. Was this the case with Alpha too?

      1. 5

        Alpha was probably the most extreme RISC chip ever made. The overriding goal was to design an ISA that maximised the possible instruction throughput in an implementation. This led to some fun things:

        • There were only 32- and 64-bit loads and stores. 8- or 16-bit loads had to be extracted from a wider load with masking; 8- or 16-bit stores became read-modify-write sequences. Alpha compilers generally just expanded every stack variable to 32 bits, but an array of shorts or chars could be slow to write.
        • The memory model was basically ‘anything goes in terms of ordering’ (slight exaggeration, not much) with a load of different barriers to enforce serialisation. A lot of things required barriers that no subsequent CPU has ever needed because the hardware interlocks are a lot cheaper than the large number of barriers that you need for correctness.
        • There was no implicit synchronisation between the integer and floating point pipelines at all, and floating point exceptions were imprecise (I think you could implement the precise exception mode by dispatching a floating point instruction and then doing a serialising integer instruction, but that was really slow).
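        To make the first point concrete, here is a rough C sketch of what a compiler for a machine with only aligned word-sized loads and stores (like early Alpha, before the BWX extension) has to emit for a single byte store: load the containing 64-bit word, insert the byte, store the word back. The function name is mine for illustration, not a real compiler intrinsic; the actual Alpha code used instruction sequences along the lines of LDQ/MSKBL/INSBL/STQ.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative sketch (not real compiler output): storing one byte when
 * the hardware only has aligned 64-bit loads and stores. `mem` is
 * treated as an array of aligned quadwords. */
static void store_byte_rmw(uint64_t *mem, size_t byte_index, uint8_t v) {
    size_t word = byte_index / 8;                    /* containing quadword */
    unsigned shift = (unsigned)(byte_index % 8) * 8; /* byte position in it */
    uint64_t q = mem[word];                          /* load the quadword */
    q &= ~((uint64_t)0xff << shift);                 /* clear the target byte */
    q |= (uint64_t)v << shift;                       /* insert the new byte */
    mem[word] = q;                                   /* store the quadword back */
}
```

        This read-modify-write is also part of why the platform was hostile to threads: two CPUs storing to adjacent bytes of the same quadword can silently undo each other’s writes, something C11/C++11 forbid for stores to distinct objects.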

        The Alpha was designed to support UNIX and VMS and so intended to make fast everything that compilers and these two operating systems needed (though, in typical ‘90s fashion, didn’t really speak to the compiler people and so missed some things). VMS and UNIX wanted quite different sets of privileged primitives, which Alpha implemented via pluggable PALcode. PALcode is very similar to RISC-V’s Machine Mode or Arm’s EL3: a set of firmware routines that ran with interrupts and (optionally?) address translation disabled. I think the VMS ones had things like atomically-add-to-queue; I don’t recall if they were in the UNIX ones.

        This also let the Alpha provide something like a consistent set of privileged interfaces across versions with significant microarchitecture changes between them, as well as different privileged interfaces for different operating systems. Most operating systems now have a hardware abstraction layer, so the latter part of this is less important.

        I don’t know if NT on Alpha supported multiple threads on different cores. Most of the big UNIX Alpha systems were sold to run multiple processes (and predated pthreads) and so all of the single-threaded userspace programs could ignore the fun of the memory model, as long as the kernel was correct.

        I’m not really sure what the VAX memory model was. The VAX handbook just says that multiple threads may not concurrently access the same data. It looks as if it had some CISCy instructions for implementing locks and then made it undefined behaviour to do any racy accesses (modern memory models are all about specifying the kinds of racy accesses that are allowed).
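        As a concrete example of what “specifying the kinds of racy accesses that are allowed” means: in C11, two threads concurrently incrementing a relaxed atomic is a well-defined race (no increment is lost), whereas the same code on a plain int is undefined behaviour - which is roughly all the VAX handbook gave you. A minimal sketch using POSIX threads:

```c
#include <pthread.h>
#include <stdatomic.h>

/* Sketch: a deliberate but well-defined race under the C11 memory
 * model. Both threads hammer the same atomic with relaxed ordering,
 * yet every increment still takes effect exactly once. */
enum { ITERS = 100000 };
static atomic_int counter;

static void *bump(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERS; i++)
        atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
    return NULL;
}

static int run_racy_counter(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, bump, NULL);
    pthread_create(&b, NULL, bump, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return atomic_load(&counter);
}
```

        Replace the atomic with a plain `int counter` and the standard makes no promise at all about the final value - that is the “undefined behaviour on any racy access” position, which appears to be where the VAX stopped.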

        1. 3

          VMS is the operating system; VAX is the architecture. The VAX did indeed have INSQUE (insert entry in queue) and REMQUE (remove entry from queue) instructions that operated on a doubly linked list. The VAX seems like it would be fun to write assembly for (I never got the chance myself).
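          For the curious, INSQUE and REMQUE each performed, in a single instruction, the classic four-pointer doubly-linked-list update. A rough C equivalent of the pointer manipulation (the struct layout and function names are mine; the real instructions just took entry addresses and carried guarantees about interruptibility that plain C code does not):

```c
#include <stddef.h>

/* Each queue entry starts with forward and backward links, and the
 * queue header is itself an entry linked to itself when empty. */
struct qentry {
    struct qentry *flink; /* forward link */
    struct qentry *blink; /* backward link */
};

/* Roughly INSQUE: insert `e` immediately after `pred`. */
static void insque_c(struct qentry *e, struct qentry *pred) {
    e->flink = pred->flink;
    e->blink = pred;
    pred->flink->blink = e;
    pred->flink = e;
}

/* Roughly REMQUE: unlink `e` from whatever queue it is on. */
static void remque_c(struct qentry *e) {
    e->blink->flink = e->flink;
    e->flink->blink = e->blink;
}
```

          Doing all four pointer writes as one instruction is what made these attractive for OS queue handling, and it is the same flavour of primitive the Alpha comment above says ended up in the VMS PALcode instead.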

          1. 1

            Raymond’s blog has a lot more about what NT on Alpha could and couldn’t do, if you’re genuinely interested enough to read through them: https://devblogs.microsoft.com/search?query=Alpha+&blog=%2Foldnewthing%2F&sortby=relevance

            1. 1

              Unlike other RISC processors of its era, the Alpha AXP does not have branch delay slots. If you don’t know what branch delay slots are, then consider yourself lucky.

              The Alpha AXP, part 1: Initial plunge