1. 72
    1. 11

      That means Blink has an approachable codebase since we’ve only got 63,500 lines of ANSI C11 code

      While I get that this is impressive for what they do, 63k lines of C is in no way “approachable”.

      1. 5

        I recently started hacking on a 75k line C codebase, and I was able to get up and running in a few hours. It’s definitely easier to get into than larger codebases, since you can read 5% of the codebase in a short time. You can e.g. read through entire files to get an idea of what functions are available, and be reasonably certain you’ve seen most of them.

      2. 4

        …especially when that code looks like

        double DeserializeLdbl(const u8 b[10]) {
          union DoublePun u;
          u.i = (u64)(MAX(-1023, MIN(1024, ((Read16(b + 8) & 0x7fff) - 0x3fff))) + 1023)
                    << 52 |
                ((Read64(b) & 0x7fffffffffffffff) + (1 << (11 - 1))) >> 11 |
                (u64)(b[9] >> 7) << 63;
          return u.f;
        }
        

        or

        static int xed_modrm_scanner(struct XedDecodedInst *x, int *disp_width,
                                     int *has_sib) {
          u8 b, rm, reg, mod, eamode, length, has_modrm;
          xed_set_has_modrm(x);
          has_modrm = x->op.has_modrm;
          if (has_modrm) {
            length = x->length;
            if (length < x->op.max_bytes) {
              b = x->bytes[length];
              x->length++;
              rm = b & 0007;
              reg = (b & 0070) >> 3;
              mod = (b & 0300) >> 6;
              x->op.rde &= ~1;
              x->op.rde |= mod << 22 | rm << 7 | reg;
              if (has_modrm != XED_ILD_HASMODRM_IGNORE_MOD) {
                eamode = kXed.eamode[Asz(x->op.rde)][Mode(x->op.rde)];
                *disp_width = xed_bytes2bits(kXed.has_disp_regular[eamode][mod][rm]);
                *has_sib = kXed.has_sib_table[eamode][mod][rm];
              }
              return XED_ERROR_NONE;
            } else {
              return xed_too_short(x);
            }
          } else {
            return XED_ERROR_NONE;
          }
        }
        
        1. 6

          The first function is pretty obvious to anyone who understands x86: it transforms an 80-bit long double into a double (which is probably not a good thing to do). The second bit looks like part of an x86 instruction parser that handles ModR/M, which appears in many x86 instruction encodings. The bit twiddling here splits the one-byte ModR/M value into its mod, reg, and rm fields.

          Context is important for these things. I would prefer code like this to contain some comments and references to the bit of the spec that they’re implementing, but for someone with a passing familiarity with x86 it’s not that hard to read. I would expect that most people that might want to usefully contribute would know enough about x86 to be able to follow that code.

          In a slightly higher-level language (C++, Rust, and so on), the bit twiddling would be replaced by helpers to extract specific bit ranges (and the methods would be attached to the class that they’re operating over), but the code probably wouldn’t be very different overall.
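          As a rough illustration of the "helpers to extract specific bit ranges" idea, here is a minimal sketch in plain C. The names `bits`, `struct ModRm`, and `decode_modrm` are hypothetical and do not come from Blink; only the field layout (mod in bits 7-6, reg in bits 5-3, rm in bits 2-0) matches the octal masks in the quoted function.

```c
#include <stdint.h>

/* Hypothetical helper: extract `width` bits of `value`, starting at bit `lo`. */
static inline uint8_t bits(uint8_t value, unsigned lo, unsigned width) {
  return (uint8_t)((value >> lo) & ((1u << width) - 1u));
}

/* The three fields of an x86 ModR/M byte. */
struct ModRm {
  uint8_t mod; /* bits 7-6: addressing mode */
  uint8_t reg; /* bits 5-3: register or opcode extension */
  uint8_t rm;  /* bits 2-0: register or memory operand */
};

/* Hypothetical decoder: split one ModR/M byte into its fields. */
static inline struct ModRm decode_modrm(uint8_t b) {
  struct ModRm m;
  m.mod = bits(b, 6, 2); /* same result as (b & 0300) >> 6 */
  m.reg = bits(b, 3, 3); /* same result as (b & 0070) >> 3 */
  m.rm = bits(b, 0, 3);  /* same result as b & 0007 */
  return m;
}
```

          For example, `decode_modrm(0xD8)` yields mod 3, reg 3, rm 0. The logic is unchanged from the masks and shifts in the original; the helper only gives the bit range a name.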

        2. 3

          some days I think “maybe I should go back to my college days and write some C,” but comments like this keep me on the holy path

        3. 2

          This is the kind of code that needs copious comments.

          1. 2

            It’s easy to cherry pick bad examples. Here’s a good one:

            Intel’s ALU instructions are the most important thing. You won’t find a more succinct elegant implementation of them.

      3. 3

        It’s probably relative. How many lines in Qemu?

        1. 8

          Qemu has 1,730,794 lines of code.

          1. 3

            That’s a slightly misleading comparison because a lot of that is driver emulators. The remainder also covers several different architectures. The x86 portion is quite small and TCG is pretty approachable (in spite of starting life as an obfuscated C competition entry).

    2. 3

      I would really like it better if it implemented POSIX syscalls instead of Linux syscalls, but I suspect the ship of “POSIX-compatible as a baseline” sailed a few years ago…

      1. 1

        POSIX doesn’t specify system calls; it specifies libc interfaces. An implementation needs to decide what is implemented at each layer. As a baseline, POSIX is also annoying because they never standardised a good event source API. Solaris has /dev/poll, FreeBSD introduced kqueue which XNU inherited, and then Linux decided to implement something worse than either with epoll and then something better with io_uring.

        If you want to provide a POSIX (rather than Linux) x86 layer, then you need to provide a libc, and at that point you’re requiring people to build their binaries specifically against your platform, rather than build Linux binaries. If you’re doing that, why target x86 and not some portable ISA that’s designed to be easy to emulate?
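        On the event API point above: the portable baseline that POSIX does specify is `poll()` (alongside `select()`), exposed as a libc interface. A minimal sketch, assuming a hypothetical helper name `wait_readable`:

```c
#include <poll.h>

/* Hypothetical helper built on POSIX poll().
   Returns 1 if fd has data to read within timeout_ms,
   0 if it does not (timeout, or only error/hangup events were reported),
   and -1 if poll() itself failed. */
static int wait_readable(int fd, int timeout_ms) {
  struct pollfd p;
  p.fd = fd;
  p.events = POLLIN; /* ask only about readability */
  p.revents = 0;
  int n = poll(&p, 1, timeout_ms);
  if (n < 0) return -1;
  if (n == 0) return 0;
  return (p.revents & POLLIN) ? 1 : 0;
}
```

        The caller passes the whole descriptor set on every call, which is the usual complaint about `poll()` with many descriptors and part of why kqueue, epoll, and io_uring exist as platform-specific alternatives.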

      2. 1

        I kinda agree, but tbf, one of the cool things about this is being able to run an entire distro with a working package manager. Would that be possible without any Linux-specific syscalls? Is there some pure-POSIX distro that can run on all kernels?