The Z80 (used in the contemporaneous ZX Spectrum, amongst others) has a block-copy instruction called LDIR. But it turns out that wasn’t the fastest way to copy memory.
A game (whose name escapes me, but I think it was a wireframe 3D game with a bunch of spheres, possibly set on Mars?) came out with a higher framerate than should have been possible. It used what I thought was a pretty amazing technique. The stack instructions on the Z80 were faster: a POP plus a PUSH moves two bytes in roughly 21 T-states, about the time LDIR takes to move one. So to move a bunch of memory you could:
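disable interrupts (an interrupt would push its return address through the borrowed stack pointer and corrupt the data)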
move the stack pointer to src
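pop all the 16-bit registers you can (4 of them I think; more if you also use IX, IY and the alternate register set)
move the stack ptr to the end of the matching chunk in dst (push writes downwards)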
push your registers
repeat until blit finished
put the stack pointer back
enable interrupts again
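To make the steps above concrete, here is a rough Python simulation of the idea rather than real Z80 code. The function name `stack_copy`, the `regs` parameter and the flat `bytearray` standing in for memory are all made up for illustration, and it assumes the source and destination don't overlap. The main thing it shows is the bit that's easy to get wrong: POP walks upwards, PUSH walks downwards, so you aim the stack pointer at the end of the destination chunk and push the registers in reverse order.

```python
def stack_copy(mem, src, dst, length, regs=4):
    """Copy `length` bytes from src to dst inside `mem`, imitating POP/PUSH.

    Hypothetical helper: `regs` is how many 16-bit register pairs get
    filled per round (4 here, an assumption, not a hardware limit).
    """
    chunk = 2 * regs  # bytes moved per round
    if length % chunk:
        raise ValueError("length must be a multiple of 2 * regs")

    for offset in range(0, length, chunk):
        # "move the stack pointer to src"
        sp = src + offset
        pairs = []
        for _ in range(regs):
            # POP: read low byte, then high byte, SP goes up by 2
            lo, hi = mem[sp], mem[sp + 1]
            pairs.append((hi << 8) | lo)
            sp += 2

        # "move the stack ptr to dst": the END of the chunk, because
        # PUSH decrements SP before it writes
        sp = dst + offset + chunk
        for value in reversed(pairs):
            # PUSH: SP goes down by 2, low byte ends up at the lower address
            sp -= 2
            mem[sp] = value & 0xFF
            mem[sp + 1] = value >> 8
    return mem


# 64 bytes of test data followed by 64 bytes of empty "screen"
memory = bytearray(range(64)) + bytearray(64)
stack_copy(memory, src=0, dst=64, length=64)
assert memory[64:128] == memory[0:64]
```

On the real machine each round would be fully unrolled with hard-coded addresses, and the original stack pointer saved before the first round and restored after the last.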
This link describes it in more detail and also suggests that unrolling a bunch of LDI instructions was also faster than LDIR:
https://chuntey.wordpress.com/2013/10/02/how-to-write-zx-spectrum-games-chapter-13/
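For a rough feel of how the three approaches compare, here is a back-of-the-envelope T-state count in Python. The instruction timings are the standard documented Z80 ones (LDIR 21 per byte and 16 for the last, LDI 16, POP 10, PUSH 11, LD SP,nn 10). Everything else is an assumption for the sake of the sketch: the function names are invented, the stack version reloads SP with an immediate address twice per round and uses 4 register pairs, and the count ignores setup code, DI/EI, saving and restoring SP, and the Spectrum's contended-memory delays.

```python
# Documented Z80 timings, in T-states
LDIR_REPEAT, LDIR_LAST = 21, 16   # per byte while looping / final byte
LDI = 16                          # one byte per instruction, no looping
POP, PUSH = 10, 11                # two bytes each
LD_SP_NN = 10                     # load SP with an immediate address


def ldir_cost(n):
    """T-states for a single LDIR copying n bytes (n >= 1)."""
    return LDIR_REPEAT * (n - 1) + LDIR_LAST


def unrolled_ldi_cost(n):
    """T-states for n LDI instructions written out back to back."""
    return LDI * n


def stack_copy_cost(n, pairs=4):
    """T-states for the POP/PUSH trick, assuming `pairs` registers per round."""
    chunk = 2 * pairs
    if n % chunk:
        raise ValueError("n must be a multiple of 2 * pairs")
    per_round = 2 * LD_SP_NN + pairs * (POP + PUSH)
    return (n // chunk) * per_round


if __name__ == "__main__":
    n = 6144  # size of the Spectrum's screen bitmap in bytes
    for name, cost in (("LDIR", ldir_cost(n)),
                       ("unrolled LDI", unrolled_ldi_cost(n)),
                       ("stack copy", stack_copy_cost(n))):
        print(f"{name:>12}: {cost} T-states, {cost / n:.1f} per byte")
```

Using more register pairs per round spreads the cost of the two SP reloads over more bytes, which is why you'd grab as many as you can.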