1. 17
  1.  

  2. 14

    The 5th level of achievement as a programmer is architecting your code in such a way that errors are impossible to compile. In other words, use a language with a sophisticated type system (and other forms of static checking).

    This article was written in 2005, meaning it was somewhat before the current era of actually using the good ideas from academic programming language research, and enough time has passed since then that it shows. Few people talk about Hungarian notation anymore, because ML-style types allow programmers to accomplish a much more general and robust version of the same thing without needing a special variable-naming convention - and languages with these type systems are pretty easy to use in production these days.

    1. 15

      While having a more sophisticated type system is definitely useful, you can use older type systems to enforce this type safety as well. E.g.

       typedef struct { char *s; } unsafe_string_t;
       typedef struct { char *s; } safe_string_t;
       unsafe_string_t read_input(socket_t s);
       void write_output(safe_string_t s);
       safe_string_t encode(unsafe_string_t s);
      
       ...
       unsafe_string_t in = read_input(sock);
       write_output(in); // whoops! won't compile
       write_output(encode(in)); // will compile and is correct
       ...
      
      1. 7

        You don’t need ‘ML-style types’ for this. You need opaque typedefs, a simple enough feature that I think it could be proposed for and added to C itself without kicking up a fuss.

        1. 4

          I’m pretty sure opaque typedefs are literally what was meant by ML-style types. I’m not sure how it would work in C since you still need to be able to add to indices even when they have a non-basic integer type.

          1. 4

            There’s also tagged unions and generics.

            1. 1

              ML-style types would normally refer to algebraic data types with (at least) first order parametric polymorphism. That means you need tagged unions, first class functions and closures.

              “I’m not sure how it would work in C since you still need to be able to add to indices even when they have a non basic integer type.”

              I’d like to just be able to define a typedef where you have to explicitly cast to and from it in order to use it as its underlying type. It would be super extra double nice if you could somehow apply it to enums too.
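
              Something close to this can be approximated in standard C today by wrapping the value in a single-member struct. A minimal sketch, where `index_t`, `index_from`, `index_to`, `index_add` and `element_at` are hypothetical names chosen for illustration; it also shows one way to keep index arithmetic available:

              ```c
              #include <assert.h>
              #include <stddef.h>

              /* Hypothetical opaque index type: a distinct struct never converts implicitly. */
              typedef struct { size_t v; } index_t;

              /* Explicit "casts" to and from the underlying type. */
              static inline index_t index_from(size_t raw) { return (index_t){ raw }; }
              static inline size_t index_to(index_t i) { return i.v; }

              /* Arithmetic has to be spelled out, which is the price of the wrapper. */
              static inline index_t index_add(index_t i, size_t n) { return index_from(i.v + n); }

              /* Only accepts an index_t; passing a bare size_t is a compile error. */
              static inline int element_at(const int *items, index_t i) {
                  assert(items != NULL);
                  return items[index_to(i)];
              }
              ```

              The wrapper costs nothing at runtime on typical compilers, but every operation you want on the type needs its own helper, which is exactly the boilerplate a real opaque typedef would remove.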

            2. 2

              Opaque non pointer typedefs?

              1. 2

                As in being able to say opaque_typedef int foo; and then being unable to do

                int x = 1;
                foo f = x;
                

                You would need to explicitly cast to and from the typedef and the underlying type. Would be useful for enums too, although how exactly it would be worded is a little complicated as I think you’d want to allow the values of the enum to be used without casting them, even though they’re integers.

                enum option {
                    OPTION_0,
                    OPTION_1,
                    OPTION_2,
                    OPTION_3,
                };
                opaque_typedef enum option option_t;
                
                option_t opt_1 = OPTION_1; /* ought to be legal */
                option_t opt_2 = 2;        /* ought not to be legal */
                /* however OPTION_1 is just an integer constant in C I'm pretty sure */
                

                Perhaps in this wild fantasy world I’m suggesting the C standard could allow explicit underlying type annotation for enums and then allow you to use opaque_typedef there:

                opaque_typedef int option_t;
                enum option : option_t {
                    OPTION_0,
                    OPTION_1,
                    OPTION_2,
                    OPTION_3,
                };
                
                option_t opt_1 = OPTION_1; /* legal, OPTION_1 is of type option_t */
                option_t opt_2 = 2;        /* not legal, 2 is not of type option_t */
                

                but then what about flags where you want to be able to bitwise-or them without casting and set the values of the options explicitly, would you need to do enum : flags_t { ... FLAG_3 = (flags_t)4, FLAG_4 = (flags_t)8, ...}, etc.?

                Anyway I think the idea has merit and really just needs someone to write it up and submit it to the committee probably…
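
                Until something like that exists, the flags case can be approximated in current C by wrapping the bits in a struct and providing a helper for the bitwise-or. This is only a sketch: `flags_t`, the `FLAG_*` macros, `flags_or`, `flags_has` and `flags_demo` are made-up names, and it relies on C99 compound literals.

                ```c
                #include <assert.h>
                #include <stdbool.h>

                /* Hypothetical opaque flag set; a bare integer will not convert to it. */
                typedef struct { unsigned bits; } flags_t;

                /* Named flag values. Compound literals are not constant expressions,
                   so these cannot be used in static initializers or case labels. */
                #define FLAG_1 ((flags_t){ 1u })
                #define FLAG_2 ((flags_t){ 2u })
                #define FLAG_3 ((flags_t){ 4u })
                #define FLAG_4 ((flags_t){ 8u })

                /* Combining flags goes through a helper instead of the | operator. */
                static inline flags_t flags_or(flags_t a, flags_t b) {
                    return (flags_t){ a.bits | b.bits };
                }

                /* True if every bit of flag is set in set. */
                static inline bool flags_has(flags_t set, flags_t flag) {
                    return (set.bits & flag.bits) == flag.bits;
                }

                /* Small self-check of the helpers above. */
                static inline void flags_demo(void) {
                    flags_t f = flags_or(FLAG_1, FLAG_3);
                    assert(flags_has(f, FLAG_3));
                    assert(!flags_has(f, FLAG_2));
                }
                ```

                With this, `flags_t f = 4;` is rejected by the compiler, at the cost of writing `flags_or(a, b)` instead of `a | b`.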

            3. 4

              As far as I know type systems can’t ensure that a program conforms to a specification and can’t check the soundness of that specification. In my experience, type systems are the way to go but in order to make all errors impossible to compile, formal methods are likely required.

              1. 2

                Or on the other side you can write Clojure where you name your variables with English words like list-of-names and capitalized-list-of-names that make them clear.

              2. 4

                I think this quote is on the money:

                “The way to write really reliable code is to try to use simple tools that take into account typical human frailty, not complex tools with hidden side effects and leaky abstractions that assume an infallible programmer.”

                1. 3

                  As we all know, C never does anything surprising when you multiply integers. There are no cases where the types are widened or converted from unsigned to signed underneath you in an expression.

                  1. 3

                    The C conversion rules are stated in a really complicated way but are actually relatively simple, and very straightforward if you assume a modern system with char/short/int/long as 8/16/32/64-bit. Most systems use that set nowadays; the main exception is 64-bit Windows, where long is 32-bit, but the same rules apply there.

                    1. Integer promotion: Anything smaller than int is promoted to int. signed char, unsigned char, short and unsigned short are all promoted to int when operations are done on them.
                    2. The obvious things happen: if both sides are the same type, the result is obviously that type, if both sides are signed, the result is obviously the larger type, and if both sides are unsigned, the result is obviously the larger type. Now you’re only left with unsigned+signed operations.
                    3. unsigned+signed pt 1: If the signed type is smaller or equal size, convert it to the unsigned type.
                    4. unsigned+signed pt 2: If the unsigned type will fit entirely into the signed type, convert it to the signed type.
                    5. unsigned+signed pt 3: Otherwise, convert both to the unsigned version of the signed type.

                    Once you’re aware of integer promotions, almost every case follows the obvious rule anyone would choose for implicit conversions. The only remaining cases are: int + unsigned (i32+u32), long + unsigned (i64+u32), int + unsigned long (i32+u64) and long + unsigned long (i64+u64). Rule 3, rule 4, rule 3, rule 3. Results are u32, i64, u64 and u64.
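
                    These results can be checked at compile time with C11's `_Generic`. A sketch, assuming an LP64 target (32-bit int, 64-bit long); `TYPE_CODE` is a made-up helper macro, not anything standard:

                    ```c
                    #include <assert.h>

                    /* Maps an expression to a small code naming its type (C11 _Generic). */
                    #define TYPE_CODE(x) _Generic((x), \
                        int: 1, unsigned int: 2, long: 3, unsigned long: 4, default: 0)

                    /* Integer promotion: operands narrower than int become int. */
                    static_assert(TYPE_CODE((short)1 + (short)1) == 1, "short + short is int");
                    static_assert(TYPE_CODE((unsigned char)1 * (unsigned char)1) == 1, "uchar * uchar is int");

                    /* The four mixed-signedness cases, assuming 32-bit int and 64-bit long. */
                    static_assert(TYPE_CODE(1 + 1u) == 2, "int + unsigned is unsigned");
                    static_assert(TYPE_CODE(1L + 1u) == 3, "long + unsigned is long");
                    static_assert(TYPE_CODE(1 + 1uL) == 4, "int + unsigned long is unsigned long");
                    static_assert(TYPE_CODE(1L + 1uL) == 4, "long + unsigned long is unsigned long");
                    ```

                    If any of these assumptions is wrong for your platform, the file simply fails to compile, so it doubles as a portability check.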

                    In other words, once you’re aware of integer promotion, they’re all very obvious (int + long gives long, wow!), with the only non-obvious ones being those last four which you can just learn.

                    And IMO the only one of the last four that really is unintuitive is int + unsigned = unsigned, which is unfortunate as it’s probably the most common as well…
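
                    A small illustration of why that case bites, as a sketch with hypothetical function names (`naive_less`, `safe_less`, `check_less`) and assuming 32-bit int:

                    ```c
                    #include <assert.h>
                    #include <stdbool.h>

                    /* Looks harmless, but n is converted to unsigned before the comparison,
                       so a negative n becomes a huge value and the test comes out false. */
                    static inline bool naive_less(int n, unsigned limit) {
                        return n < limit;
                    }

                    /* Handles the negative case before the conversion can happen. */
                    static inline bool safe_less(int n, unsigned limit) {
                        return n < 0 || (unsigned)n < limit;
                    }

                    /* Self-check documenting the behaviour described above. */
                    static inline void check_less(void) {
                        assert(!naive_less(-1, 1u)); /* -1 "is not less than" 1 */
                        assert(safe_less(-1, 1u));
                    }
                    ```

                    GCC and Clang can flag the first version with -Wsign-compare, which is worth turning on for exactly this reason.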