Hmm, the premise is that bytecode can be denser than native code. I’d agree. I see P-code mentioned at the end; afaik, P-code was also used by Microsoft for DOS versions of programs like Excel. It allowed more code to fit in tiny segments.
The conclusion seemed to be that Squeak bytecode is the most efficient. But the interpreter was comparatively huge? Shouldn’t a little more investigation be done? The design may result in the densest bytecode, but if such a design requires a large, complicated interpreter, isn’t it a net loss for anything but the largest of programs? Maybe I missed something, but assuming an efficient implementation of Squeak seems hasty.
Excel used P-Code (bytecode) for a long time, and not just for DOS. It also allowed more code to fit on a floppy.
You’re right that interpreter size is a big concern. I don’t think that the bytecode dispatch is particularly space-hungry, though. We’re not talking about bytecodes here like the VAX’s “evaluate polynomial” or “BitBlt” or “packed sum of 8 absolute pairwise byte differences” (Intel’s PSADBW); they’re things like “push self” (0x70), “push temporary location #3” (0x13), “pop and store receiver variable #5” (0x65), or “send literal selector #7 with one argument” (0xE7). (See p. 596 (the 618th page) of the Smalltalk Blue Book, “Smalltalk-80: The Language and its Implementation”, for more detailed information.)
As an example, the code for decoding that last bytecode might look like this, in C:
That is, even on amd64 (compiled with -g -Wa,-adhlns=foo.lst to get an assembly listing), it’s 0x24 (= 36) bytes, of which 20 bytes are 5 32-bit literals. On an 8-bit microcontroller (the context of my post), those would probably be 8-bit literals instead, so maybe twenty or thirty bytes.
Multiplying this by 40 gives an estimate of about 800–1200 bytes of machine code needed to decode compact Smalltalk-like bytecodes.
Now, you could argue that these bytecodes are concealing some huge amount of machinery related to dynamic dispatch and inheritance and so forth. But in fact dynamic dispatch is usually pretty simple, and can be extremely simple; Piumarta’s COLA is tiny, and if you just implement the actor model maybe you don’t even need that. You might need another few hundred bytes of code to implement objects.
So whether it’s “a net loss for anything but the largest of programs” may depend on what your “largest programs” are. Intuitively I am guessing that bytecode saves you space for any program over about 2kiB of code, so if your program ROM is 2K or 4K, then maybe you’re right. But if you’re talking about humongous ROM spaces like 16kiB, 32kiB, or even more, I would say no.