
    Hmm, the premise is that bytecode can be denser than native code. I’d agree. I see p-code mentioned at the end; AFAIK, p-code was also used by Microsoft for DOS versions of programs like Excel. It allowed more code to fit in tiny segments.

    The conclusion seemed to be that Squeak bytecode is the most efficient. But the interpreter was comparatively huge? Shouldn’t a little more investigation be done? The design may yield the densest bytecode, but if such a design requires a large, complicated interpreter, isn’t it a net loss for anything but the largest of programs? Maybe I missed something, but assuming an efficient implementation of Squeak seems hasty.


      Excel used P-Code (bytecode) for a long time, and not just for DOS. It also allowed more code to fit on a floppy.

      You’re right that interpreter size is a big concern. I don’t think that the bytecode dispatch is particularly space-hungry, though. We’re not talking about bytecodes here like the VAX’s “evaluate polynomial” or “BitBlt” or “packed sum of 8 absolute pairwise byte differences” (Intel’s PSADBW); they’re things like “push self” (0x70), “push temporary location #3” (0x13), “pop and store receiver variable #5” (0x65), or “send literal selector #7 with one argument” (0xE7). (See p. 596 (the 618th page) of the Smalltalk Blue Book, “Smalltalk-80: The Language and its Implementation”, for more detailed information.)

      As an example, the code for decoding that last bytecode might look like this, in C:

      /* 0xE0-0xEF: "send literal selector #N with one argument" */
      if ((insn & 0xf0) == 0xe0) {
        send(literals[(int)(insn & 0x0f)], 1);
      } else
      

      which generates this assembly listing on amd64 with -g -Wa,-adhlns=foo.lst:

        10:squeak-bytecode-skeleton.c ****     if ((insn & 0xf0) == 0xe0) {
        24                    .loc 1 10 0
        25 0012 89C2          movl    %eax, %edx
        26 0014 81E2F000      andl    $240, %edx
        26      0000
        27 001a 81FAE000      cmpl    $224, %edx
        27      0000
        28 0020 7516          jne .L3
        11:squeak-bytecode-skeleton.c ****       send(literals[(int)(insn & 0x0f)], 1);
        29                    .loc 1 11 0
        30 0022 83E00F        andl    $15, %eax
        31 0025 8B3C8500      movl    literals(,%rax,4), %edi
        31      000000
        32 002c BE010000      movl    $1, %esi
        32      00
        33 0031 E8000000      call    send
        33      00
        34 0036 EB1C          jmp .L4
      

      That is, even on amd64, it’s 0x24 (=36) bytes, of which 20 bytes are 5 32-bit literals. On an 8-bit microcontroller (the context of my post), those would probably be 8-bit literals instead, so maybe twenty or thirty bytes.

      If you multiply this by the 40 or so distinct cases a Smalltalk-style decoder needs to handle, you get an estimate of about 800–1200 bytes of machine code needed to decode compact Smalltalk-like bytecodes.
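      To make that multiply-by-40 step concrete, here’s a compilable sketch of what the surrounding dispatch skeleton might look like. The bytecode ranges are the Blue Book’s; everything else (the operand stack, the variable arrays, the stub send()) is a hypothetical stand-in for the rest of the interpreter:

      #include <stdio.h>

      typedef int oop;                     /* "object pointer": just an int in this sketch */

      static oop stack[32], *sp = stack;   /* tiny operand stack */
      static oop rcvr_vars[16], temps[16], literals[16], self;

      static void push(oop x) { *sp++ = x; }
      static oop  pop(void)   { return *--sp; }

      static void send(oop selector, int nargs)
      {
        printf("send selector %d with %d arg(s)\n", selector, nargs);
      }

      static void dispatch(unsigned char insn)
      {
        if (insn < 0x10) {                   /* 0x00-0x0F: push receiver variable #N */
          push(rcvr_vars[insn & 0x0f]);
        } else if (insn < 0x20) {            /* 0x10-0x1F: push temporary location #N */
          push(temps[insn & 0x0f]);
        } else if ((insn & 0xf8) == 0x60) {  /* 0x60-0x67: pop and store receiver variable #N */
          rcvr_vars[insn & 0x07] = pop();
        } else if (insn == 0x70) {           /* 0x70: push self */
          push(self);
        } else if ((insn & 0xf0) == 0xe0) {  /* 0xE0-0xEF: send literal selector #N, one argument */
          send(literals[insn & 0x0f], 1);
        } /* ...and so on for the remaining few dozen cases */
      }

      int main(void)
      {
        unsigned char code[] = { 0x70, 0x13, 0x65, 0xe7 };  /* the four example bytecodes above */
        for (int i = 0; i < (int)sizeof code; i++)
          dispatch(code[i]);
        return 0;
      }

      Each case is a compare, a mask, and a call or a memory move, which is where the twenty-or-thirty-bytes-per-case figure comes from.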

      Now, you could argue that these bytecodes are concealing some huge amount of machinery related to dynamic dispatch and inheritance and so forth. But in fact dynamic dispatch is usually pretty simple, and can be extremely simple; Piumarta’s COLA is tiny, and if you just implement the actor model, maybe you don’t even need that. You might need another few hundred bytes of code to implement objects.
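      To see how little code that can be, here is a minimal sketch of single-inheritance method lookup, assuming each class carries a flat selector-to-method table and a superclass pointer (all of these names are hypothetical, and a real interpreter would put a method cache in front of the loop):

      #include <stddef.h>

      typedef struct object object;
      typedef void (*method_fn)(object *self);

      struct method { int selector; method_fn fn; };

      typedef struct class_s {
        struct class_s *superclass;  /* NULL at the root of the hierarchy */
        struct method  *methods;     /* flat selector/function table */
        int             nmethods;
      } class_s;

      struct object { class_s *isa; };  /* every object points at its class */

      /* Walk the class chain looking for the selector; this loop is
         essentially all there is to single-inheritance dynamic dispatch. */
      static method_fn lookup(class_s *c, int selector)
      {
        for (; c != NULL; c = c->superclass)
          for (int i = 0; i < c->nmethods; i++)
            if (c->methods[i].selector == selector)
              return c->methods[i].fn;
        return NULL;                 /* "message not understood" */
      }

      void send1(object *receiver, int selector)
      {
        method_fn fn = lookup(receiver->isa, selector);
        if (fn != NULL)
          fn(receiver);              /* a real send would pass the arguments too */
      }

      That compiles to a handful of loads and compares, which is consistent with the “another few hundred bytes” estimate above.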

      So whether it’s “a net loss for anything but the largest of programs” may depend on what your “largest programs” are. Intuitively I am guessing that bytecode saves you space for any program over about 2kiB of code, so if your program ROM is 2K or 4K, then maybe you’re right. But if you’re talking about humongous ROM spaces like 16kiB, 32kiB, or even more, I would say no.
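      (As a rough breakeven check: assume bytecode is about half the size of the equivalent native code, which is an assumption on top of the estimates above. Then an N-byte native program becomes N/2 bytes of bytecode plus the roughly 1kiB of decoder and object machinery estimated above, and N/2 + 1kiB < N exactly when N > 2kiB.)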