1. 9
    1. 2

      Re: figuring out code size: a really helpful tool in this regard (in my experience) is the linker map (-Wl,-Map=outfile.map). It’s very verbose, but also very detailed.

      If you don’t need the features, -fno-stack-protector -fomit-frame-pointer -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -fno-exceptions -fno-threadsafe-statics can also yield a few gains. If you’re doing floating point stuff and have an FPU available, make sure it’s actually being used: -marm -mfloat-abi=hard -march=armv7-a+neon+vfpv3, otherwise the compiler will generate softfloat code.

      Also, if you can afford it, rewrite the crt* code and reimplement a few stdlib functions that can be made much smaller (eg. the newlib strcmp seems to be made for performance and thus needs to do some unaligned access magic, but it can also be done with a simple ldrb/cmp loop).
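
      For illustration, a byte-at-a-time compare in C might look like the sketch below. This is not newlib’s code, and `small_strcmp` is a made-up name; on ARM a compiler would typically turn the loop into something close to the ldrb/cmp loop described above, though the exact output depends on the compiler and flags.

      ```c
      /* Minimal string compare that reads one byte at a time, so it never
       * needs word-sized or unaligned accesses.
       * small_strcmp is a hypothetical name used only for this sketch. */
      int small_strcmp(const char *a, const char *b)
      {
          const unsigned char *p = (const unsigned char *)a;
          const unsigned char *q = (const unsigned char *)b;

          /* Advance while both strings agree and neither has ended. */
          while (*p != '\0' && *p == *q) {
              p++;
              q++;
          }

          /* Negative, zero, or positive, like the standard strcmp. */
          return (int)*p - (int)*q;
      }
      ```

      The trade-off is speed: this does one load per character, which is the price of the smaller code.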

      (EDIT: markup fix)

      1. 2

        Re: figuring out code size: a really helpful tool in this regard (in my experience) is the linker map (-Wl,-Map=outfile.map). It’s very verbose, but also very detailed.

        Cyril Fougeray wrote a post about the linker map file on Interrupt a few weeks ago FWIW: https://interrupt.memfault.com/blog/get-the-most-out-of-the-linker-map-file

        If you don’t need the features, -fno-stack-protector -fomit-frame-pointer -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -fno-exceptions -fno-threadsafe-statics

        In our case most of those are disabled by default since we’re writing C code on an ARM MCU (-fno-stack-protector, -fomit-frame-pointer, -fno-rtti, -fno-exceptions, -fno-threadsafe-statics). I didn’t think to check whether the unwind tables ended up in my bin file; I’ll look into it! In any case, your suggestions make a lot of sense in many contexts.

        If you’re doing floating point stuff and have an FPU available, make sure it’s actually being used: -marm -mfloat-abi=hard -march=armv7-a+neon+vfpv3, otherwise the compiler will generate softfloat code.

        I initially had a section about FPU code, but my test code targets the cortex-m0 which does not have an FPU. The next blog post in the series will talk about floating point code.

        Also, if you can afford it, rewrite the crt* code and reimplement a few stdlib functions that can be made much smaller (eg. the newlib strcmp seems to be made for performance and thus needs to do some unaligned access magic, but it can also be done with a simple ldrb/cmp loop).

        I file this under “desperate measures”. There are smaller implementations of libc than newlib-nano (note we’re not using standard newlib), but they’re much less robust.

        1. 1

          In our case most of those are disabled by default since we’re writing C code on an ARM MCU

          I’ve also done stuff for an FPU-less ARM chip (good ol’ ARM7), but the toolchain still enabled these by default, so ¯\_(ツ)_/¯.

          I file this under “desperate measures”. There are smaller implementations of libc than newlib-nano (note we’re not using standard newlib), but they’re much less robust.

          True, but I’ve been exposed to places where things like these (and even worse) were needed.
