1. 20

  2. 4

    About ten years ago, when I was still active on Stack Overflow, I asked a similar question: https://stackoverflow.com/questions/2760794/x86-cmp-instruction-difference

    Turns out someone then commented that these bits of freedom can be used to fingerprint executables: “These 1-bit degrees of freedom also provide a covert channel for compilers to “phone home” - they can “watermark” the binaries they produce, and the compiler vendor can ask you to please explain if they find your software with their watermark, but with no license on file.” – Bernd Jendrissek
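To make the "1-bit degree of freedom" concrete, here is a minimal Python sketch of how such a watermark could work for register-to-register instructions. It relies on the fact that x86 offers two encodings for operations like `mov eax, ebx` (opcode `89` with the destination in the ModRM r/m field, or `8B` with the destination in the reg field). The helper names `encode`, `watermark`, and `extract` are hypothetical, the sketch covers only three 32-bit opcodes, and it assumes one hidden bit per instruction; it is not how any particular compiler or assembler does it.

```python
# Register numbers as used in the ModRM byte (32-bit general-purpose registers).
REGS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
        "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

# Per mnemonic: (opcode for "op r/m32, r32", opcode for "op r32, r/m32").
OPCODES = {"mov": (0x89, 0x8B), "cmp": (0x39, 0x3B), "add": (0x01, 0x03)}


def encode(op, dst, src, bit):
    """Encode `op dst, src` between two registers; `bit` picks the form."""
    rm_form, reg_form = OPCODES[op]
    if bit == 0:
        # op r/m32, r32: destination in the r/m field, source in the reg field.
        return bytes([rm_form, 0xC0 | (REGS[src] << 3) | REGS[dst]])
    # op r32, r/m32: destination in the reg field, source in the r/m field.
    return bytes([reg_form, 0xC0 | (REGS[dst] << 3) | REGS[src]])


def watermark(instructions, bits):
    """Assemble (op, dst, src) tuples, hiding one bit in each instruction.

    Assumes `bits` has the same length as `instructions`.
    """
    return b"".join(encode(op, dst, src, bit)
                    for (op, dst, src), bit in zip(instructions, bits))


def extract(code):
    """Recover the hidden bits from a stream of 2-byte reg-reg instructions."""
    reg_forms = {reg_form for _, reg_form in OPCODES.values()}
    return [1 if code[i] in reg_forms else 0 for i in range(0, len(code), 2)]


program = [("mov", "eax", "ebx"), ("cmp", "ecx", "edx"), ("add", "esi", "edi")]
hidden = [1, 0, 1]
blob = watermark(program, hidden)
print(blob.hex(" "))
print(extract(blob))
```

Both encodings of each instruction execute identically, so the only observable difference is in the bytes themselves, which is what makes the channel covert.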

    Nice to still see interest in this stuff!

    1. 1

      Thanks for asking that question. Oddly enough, I was wondering about this topic a few days ago. I knew about the dual encodings, but I was wondering whether disassemblers ever differentiated the mnemonics. That is how I happened upon your Stack Overflow question.

      It turns out that NASM and similar assemblers do not differentiate. However, gas does, and one of the answers to your question showed the .s suffix modifier. This was a nice find, since I couldn’t find anything in my normal Google searches.

      1. 2

        There used to be an excellent shareware assembler named a86 back in the DOS days, when I actually wrote assembler. It used this technique for fingerprinting. I remember hearing that a number of viruses were found to have been assembled with a86.

    2. 1

      This is a nice summary of the technique. See my other comment on this thread.

      I was researching a related topic a few days ago: whether disassemblers do the right thing with these opcodes, and in particular whether the exact binary can be reproduced after disassembly.

      Another trick might be the use of multibyte nop sleds, but that would require much more work.

      1. 1

        “Like every ISA, x86 (and AMD64) have multiple ways to encode the semantics of a particular (higher level, conceptual) operation.”

        While true, the “like every ISA” irked me: CISC architectures (particularly x86 and AMD64) have an excessive number of ways to achieve the same thing, whereas sane (RISC) architectures minimize this, as a side effect of adding ISA complexity only with strong justification.

        1. 9

          This is true even on RISC, though, as long as there are general-purpose registers; the compiler can build a side channel by permuting the GPRs used in every procedure prologue and epilogue, with a minimum of one bit per procedure when there are two GPRs. (Specifically, n GPRs used in a procedure can be ordered in n! ways, so we should be able to encode about log2(n!) bits in the side channel.) The difference is in the necessary conditions for a disassembler to be able to find or visualize the side channel.
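As a rough illustration of the register-permutation channel, the following Python sketch maps an integer onto an ordering of registers and back, using the factorial number system. Counting orderings directly, n registers give n! permutations, so the capacity is floor(log2(n!)) whole bits (one bit for two registers). The function names and the example register names are hypothetical, and a real compiler would also have to respect calling conventions and register constraints, which this sketch ignores.

```python
import math


def capacity_bits(n):
    """Whole bits that can be hidden in an ordering of n distinct registers."""
    return math.floor(math.log2(math.factorial(n)))


def encode_permutation(value, registers):
    """Map an integer in [0, n!) to a unique ordering of `registers`."""
    pool = list(registers)
    if not 0 <= value < math.factorial(len(pool)):
        raise ValueError("value does not fit in this many registers")
    order = []
    for remaining in range(len(pool), 0, -1):
        # Each factorial-base digit selects one of the registers still unused.
        index, value = divmod(value, math.factorial(remaining - 1))
        order.append(pool.pop(index))
    return order


def decode_permutation(order, registers):
    """Recover the integer from an ordering produced by encode_permutation."""
    pool = list(registers)
    value = 0
    for remaining, reg in zip(range(len(pool), 0, -1), order):
        index = pool.index(reg)
        value += index * math.factorial(remaining - 1)
        pool.pop(index)
    return value


callee_saved = ["r4", "r5", "r6", "r7"]  # hypothetical register names
order = encode_permutation(13, callee_saved)
print(order, decode_permutation(order, callee_saved))
print(capacity_bits(2), capacity_bits(4))
```

The decoder needs to know the canonical register order to read the value back, which matches the point above: the channel exists on any ISA with interchangeable registers, but spotting it requires knowing what the "default" allocation would have been.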

          That said, it would be interesting to consider The Mill, which will have a belt of GPRs instead of slots and so would not have this flexibility. Instead, The Mill would require a procedure to be entirely rescheduled in order to change the numbering of GPRs.

          1. 2

            Absolutely! The x86 family is particularly guilty of this, and its strange operand encoding is why the technique discussed in the post works.

            The “like every ISA” is there for completeness, and because it’s important to the compiler-based technique that the post mentions but doesn’t use.