This is a cool technique that I was not aware of.
Like many people, I eventually settled on doing this via the .incbin assembler directive. The following macro shows the technique, and is pretty portable:
(Unfortunately, using extern seems to be necessary; using static works on Linux, but fails on the Mac, for example.)
BTW, the __asm__(#symbol) tags are there to counteract platforms that prepend underscores to linkable symbols. If you’d rather have the underscores, you could instead prefix every symbol in the assembler with an underscore, and then use __asm__("_" #symbol) to counteract platforms that don’t prepend underscores.
An even more portable version of this technique: https://github.com/graphitemaster/incbin
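For readers who want to see the shape of the .incbin approach described above, here is a minimal sketch. It is not the commenter's original macro: the names EMBED_FILE, embed_*_start/_end and self_size are hypothetical, and it assumes GCC or Clang targeting ELF (e.g. Linux), since the section directives differ on Mach-O and COFF. To stay self-contained it embeds its own source file via __FILE__.

```c
#include <stddef.h>

/* Hypothetical sketch, not the macro from the comment above.
 * Assumes GCC or Clang on an ELF target; .pushsection/.popsection and
 * the .rodata section name are ELF-specific.
 * A NUL byte is emitted after the end label so text blobs can also be
 * read as C strings; it is not counted in the size. */
#define EMBED_FILE(name, file)                                              \
    __asm__(".pushsection .rodata\n"                                        \
            ".balign 16\n"                                                  \
            ".globl embed_" #name "_start\n"                                \
            "embed_" #name "_start:\n"                                      \
            ".incbin \"" file "\"\n"                                        \
            ".globl embed_" #name "_end\n"                                  \
            "embed_" #name "_end:\n"                                        \
            ".byte 0\n"                                                     \
            ".popsection\n");                                               \
    /* The __asm__ labels pin the exact linker-level names, so a platform  \
     * that prepends underscores to C symbols still finds them. */         \
    extern const unsigned char name##_start[]                               \
        __asm__("embed_" #name "_start");                                   \
    extern const unsigned char name##_end[]                                 \
        __asm__("embed_" #name "_end")

/* Embed this very source file, so the example needs no extra input. */
EMBED_FILE(self, __FILE__);

/* Number of embedded bytes (the trailing NUL is not included). */
size_t self_size(void)
{
    return (size_t)(self_end - self_start);
}
```

With a real asset the call would look like EMBED_FILE(logo, "logo.png"); the assembler resolves that path, usually relative to the directory the compiler is run from.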
I remember going down the same “write a hex converter script, get annoyed with it, find obj generator” path almost 20 years ago.
C still doesn’t have anything like include_bytes!(), and probably never will. In 20 years people will still be printing arrays in hex and doing digital archeology searching for these linker flags.
There’s a proposal to have #embed in C2x, at least. Hard to tell what its fate will ultimately be, however.
I suspect that a big part of the reason why it isn’t in C is that it’s trivial to handle the simple case, but complex cases are outside of the C abstract machine. Most times I’ve wanted to do this, I’ve wanted to put it in a specific section, with specific alignment requirements and so on. These sorts of things fit naturally in a linker script (and are very easy to do with a linker script) but in C require a lot of vendor-specific extensions.
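To illustrate the point about vendor-specific extensions, here is a minimal sketch of placing data in a named section with a fixed alignment. It assumes GCC or Clang targeting ELF; __attribute__ is not standard C, and the section name .rodata.assets, the array and the helper function are all made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical blob. Both attributes are GCC/Clang extensions:
 *   section(...) picks the output section (ELF naming assumed here),
 *   aligned(64)  requests a 64-byte alignment. */
__attribute__((section(".rodata.assets"), aligned(64)))
const unsigned char asset_blob[] = { 0xde, 0xad, 0xbe, 0xef };

/* Returns 1 if the blob really starts on a 64-byte boundary. */
int asset_is_aligned(void)
{
    return ((uintptr_t)asset_blob % 64) == 0;
}

/* Size of the blob in bytes. */
size_t asset_size(void)
{
    return sizeof asset_blob;
}
```

On Mach-O the section name would need the "segment,section" form instead, which is the kind of per-platform variation the comment is pointing at.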
Another approach is to use objres from elfkickers.
We have elfwrap(1) for doing this in illumos. It produces an object file (*.o) directly that can be used by anything that can access C style ELF symbols.
See also
Also a nice summary: https://gareus.org/wiki/embedding_resources_in_executables
(bonus, this link also considers osx and win32/mingw)
This is the kind of trick that is so cool you MUST use it, but will hardly have any valid reason to actually do so on any toy project.
This makes me sad, but I’ll put it with some old friends like unions … Pretty cool still !
I’m probably being dumb, but I don’t quite follow how this works. How does ld know _binary_quine_c_start and _binary_quine_c_end should specify the first byte and last byte of the text blob? If I link multiple blobs, how do I specify which one is referenced there?
These symbols are created by ld, based on the input filename. Per https://www.devever.net/~hl/incbin:
You can see them by running nm myself.o to dump out the symbol table for the generated object file.
This website somehow doesn’t render on iOS.
I suspect it would be better to contact the author rather than posting on some aggregation site.
Worked fine on iOS for me.
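Back on the question of how ld names the _binary_..._start and _binary_..._end symbols: GNU ld builds each name from the input file name as given on the command line, replacing every non-alphanumeric character with an underscore, so each blob gets its own pair of symbols. The sketch below mimics that rule to make it concrete. The helper name binary_symbol_name is hypothetical, and this is an illustration rather than ld's own code.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the symbol name that GNU ld's binary input
 * format would generate for `filename` and `suffix` ("start", "end" or
 * "size") into `out`. Returns 0 on success, -1 if `out` is too small. */
int binary_symbol_name(const char *filename, const char *suffix,
                       char *out, size_t cap)
{
    int n = snprintf(out, cap, "_binary_%s_%s", filename, suffix);
    if (n < 0 || (size_t)n >= cap)
        return -1;

    /* Every character that is not a letter or digit becomes '_'. */
    for (char *p = out; *p != '\0'; p++) {
        if (!isalnum((unsigned char)*p))
            *p = '_';
    }
    return 0;
}
```

For quine.c this yields _binary_quine_c_start, matching the symbol in the question; a second blob passed as assets/logo.png would get _binary_assets_logo_png_start, so the two never collide.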