I’ve watched friends try Go and immediately uninstall the compiler when they see that the resulting no-op demo program is larger than 2 MiB.
Something about the wording on this kind of bothered me. If said friends were evaluating a language solely on the size of the output binary, perhaps they weren’t really evaluating Go for what it’s good for?
As it turns out, a statically-linked Go binary with libraries and the runtime costs a few megabytes [1]. That’s a lot of functionality for a few megabytes. As mentioned in [1], a similar statically-linked hello-world C binary is also pushing a MB (admittedly, I haven’t tried it, but I have no reason to doubt it). There are a lot of things Go is good for, but golfing binary size isn’t one of them. Though, you can find plenty of examples [2][3] of ways to shrink Go binaries (and account for binary size) if that’s what you’re really looking for.
I’m not really a Go advocate; I spend my professional time in Java/C++ and my personal time in Rust. I even like the look of Zig (I’m firmly in the “anything but C” camp), but this felt like a bit of an invented example.
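For anyone who would rather check the few-megabytes figure than take it on faith, here is a minimal sketch: a Go program that does nothing except report the size of its own executable. The helper name `selfSize` is made up for this example and does not come from any of the linked posts. Building it once with plain `go build` and once with `go build -ldflags="-s -w"` (which omits the symbol table and DWARF debug information) shows roughly how much of the size is debug data; the exact numbers depend on the Go version and platform.

```go
package main

import (
	"fmt"
	"os"
)

// selfSize returns the on-disk size in bytes of the running executable.
// (Hypothetical helper name, used only for this illustration.)
func selfSize() (int64, error) {
	path, err := os.Executable()
	if err != nil {
		return 0, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

func main() {
	size, err := selfSize()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot stat executable:", err)
		os.Exit(1)
	}
	// Report in MiB to match the units used in the discussion.
	fmt.Printf("binary size: %.2f MiB\n", float64(size)/(1<<20))
}
```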
The large binaries make it impractical to use Go WASM in the browser. You can work around it by using the TinyGo compiler, but that’s not fully compatible with regular Go. OTOH, WASM is still a sort of niche use case, so in practice it’s not a big deal.
I remember compiling a Motif application in the ’90s on some platform where the Motif libraries weren’t particularly well broken up and shared libraries weren’t well supported either, and a “hello Motif world!” application was >1 MB… and this was back when 1 MB was a significant portion of your disk quota.
It depends a lot on the use case. If you wanted to write a small command-line tool in the language then a fixed 2 MiB overhead per binary would be huge. Even something like BusyBox would likely see a large increase. FreeBSD ships with a BusyBox-like single statically linked binary that includes all of the core utilities in /rescue and it’s 13 MiB; an extra 2 MiB there would be quite noticeable. That said, in FreeBSD’s libc, jemalloc is around 1 MiB and that’s needed for any non-trivial statically linked binary (snmalloc is smaller).
Even that’s not a great comparison though, because most programs do something. The really important question is how rapidly that grows. If that 2 MiB includes a load of functionality that you’d otherwise bring yourself, then a 2 MiB fixed overhead may end up being better than a small multiplier on binary size as you increase complexity. C++ templates, for example, can be used well with inlining to give tiny, incredibly specialised code (the fast path for snmalloc’s malloc function is split across several templated functions in the source code and compiles down to around a dozen instructions). But they can also cause rapid growth in code size if you use templates too aggressively and don’t explicitly factor out common code into a non-templated superclass for multiply instantiated templates (the Windows linker will discard identical template functions by default, but that’s technically a standards violation as the address of these functions should compare not equal).
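The fixed-overhead-versus-multiplier trade-off can be made concrete with a little arithmetic. The sketch below assumes a deliberately crude model invented for illustration: one toolchain adds a constant overhead on top of the program's own logic, the other adds nothing up front but inflates the logic by a constant factor. The 1.25 factor and the function names are hypothetical and not based on any measurement.

```go
package main

import "fmt"

// fixedOverheadSize models a toolchain with a constant runtime cost:
// total = overhead + logic (all sizes in MiB).
func fixedOverheadSize(overheadMiB, logicMiB float64) float64 {
	return overheadMiB + logicMiB
}

// multiplierSize models a toolchain with no fixed cost but a constant
// growth factor on the program's own code: total = factor * logic.
func multiplierSize(factor, logicMiB float64) float64 {
	return factor * logicMiB
}

// breakEvenMiB returns the logic size at which both models give the same
// binary size: overhead + x = factor * x  =>  x = overhead / (factor - 1).
// It is only meaningful for factor > 1.
func breakEvenMiB(overheadMiB, factor float64) float64 {
	return overheadMiB / (factor - 1)
}

func main() {
	const overhead, factor = 2.0, 1.25 // hypothetical numbers
	x := breakEvenMiB(overhead, factor)
	fmt.Printf("break-even at %.1f MiB of program logic\n", x)
	fmt.Printf("fixed: %.1f MiB, multiplier: %.1f MiB\n",
		fixedOverheadSize(overhead, x), multiplierSize(factor, x))
}
```

Under these made-up numbers the two approaches cross at 8 MiB of program logic; below that the multiplier model wins, above it the fixed overhead does.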
If you wanted to write a small command-line tool in the language then a fixed 2 MiB overhead per binary would be huge.
If you wanted to write it for a constrained environment, yes. If it’s something that will run on users’ desktops, like some internal tool for developers, it’s nothing. The statically linked nature of Go binaries, and the resulting ease of deployment in uncontrolled environments, is a MUCH bigger advantage (believe me, I wrote internal CLI tools in Python).
Yup, though ‘constrained environment’ covers a lot. For example, I think the base container image for Alpine is around 4 MiB. A 2 MiB overhead in a single tool that runs there is not a big problem but a 2 MiB overhead in each of 100 tools that you run in there will have a noticeable impact on deployment times. Typically, you don’t have 100 tools in a single container (if you do, there’s a good chance that you’ve missed the point of containers).
I probably wouldn’t want an extra 2 MiB on the 13 MiB binaries in /rescue. I definitely wouldn’t want it in the stand-alone versions in /sbin and /bin and so on, because that would roughly double the size of a base VM image, which would increase costs noticeably for cloud deployments. I would be completely fine with it in something like containerd or git. For anything running on a developer desktop, 2 MiB is completely in the noise.
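To put rough numbers on the container case, here is a small sketch that adds a fixed per-binary overhead to a base image. It uses the approximate 4 MiB Alpine base and 2 MiB overhead figures from above and ignores the tools' own code entirely; the function name `totalMiB` is just for this example.

```go
package main

import "fmt"

// totalMiB returns the image size for a base image plus n tools,
// each carrying the same fixed per-binary overhead (sizes in MiB).
// The tools' own code is deliberately left out of the model.
func totalMiB(baseMiB, perToolOverheadMiB float64, tools int) float64 {
	return baseMiB + perToolOverheadMiB*float64(tools)
}

func main() {
	for _, n := range []int{1, 10, 100} {
		total := totalMiB(4, 2, n)
		fmt.Printf("%3d tools: %5.0f MiB (%.1fx the 4 MiB base)\n", n, total, total/4)
	}
}
```

With one tool the image grows from 4 MiB to 6 MiB; with 100 tools the overhead alone brings it to 204 MiB.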
Well, at my last job I was building and deploying images that were sometimes well over 1 GB, so I guess my perspective is biased there.
Was it kind of a pain in the ass sometimes? Sure, but it worked, we ran our shit. So, if that’s doable, I’m not gonna prioritize a 2 MB overhead unless I REALLY need to.
technically a standards violation as the address of these functions should compare not equal).
Could be done conservatively. But that seems like a bug in the standard to me. See e.g. Lisp permitting coalescing of literals.
(There are formal definitions of equality, which can be applied to functions. It is impossible to determine all cases when two functions are equivalent; nevertheless such a definition is appropriate for a language standard. It is obvious that when two functions comprise exactly the same code they are equivalent.)
Busybox-like single statically linked binary (…) and it’s 13 MiB - an extra 2 MiB there would be quite noticeable.
I think that comparison misses an important thing: it wouldn’t be 2 MB of extra code. That space contains useful runtime stuff which larger programs have to implement one way or another. A lot of that runtime already exists in BusyBox, so rewriting it in Go could just as well not change the size.
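A related point is that a BusyBox-style multi-call binary pays for the runtime once, however many tools it contains, by dispatching on the name it was invoked as. A minimal sketch of that dispatch in Go follows. The applet names (`hello`, `count`) and the `dispatch` function are made up for illustration, and this is not a description of how BusyBox or /rescue is actually implemented, only the general shape.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// applets maps an invocation name to its implementation. Every applet
// shares the one copy of the runtime linked into this binary.
var applets = map[string]func(args []string) int{
	"hello": func(args []string) int {
		fmt.Println("hello")
		return 0
	},
	"count": func(args []string) int {
		fmt.Println(len(args))
		return 0
	},
}

// dispatch picks an applet from argv[0] (e.g. a symlink named "hello"),
// falling back to the first argument, as in "multi hello".
func dispatch(argv []string) int {
	if len(argv) == 0 {
		return 2
	}
	name := filepath.Base(argv[0])
	rest := argv[1:]
	if _, ok := applets[name]; !ok && len(rest) > 0 {
		name, rest = rest[0], rest[1:]
	}
	run, ok := applets[name]
	if !ok {
		fmt.Fprintf(os.Stderr, "unknown applet %q\n", name)
		return 127
	}
	return run(rest)
}

func main() {
	os.Exit(dispatch(os.Args))
}
```

Installed under several names via symlinks, the same file serves every applet, so the fixed overhead is counted once rather than per tool.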
This was a great technical analysis but I felt the antagonistic tone let it down.
and I think it would be nice to have a language that pushes people to be a bit more mindful of the amount of resources they’re using.
Zig asks you to provide an allocator, that’s pretty mindful!
Instead, functions which need to allocate accept an *Allocator parameter. Likewise, data structures such as std.ArrayList accept an *Allocator parameter in their initialization functions
I also find it pretty hard to ding Zig for including arg parsing code when main didn’t have any args. It’d be a nice improvement if Zig detected this, but I suspect most (all?) useful programs will want to accept arguments.
Great analysis.
I don’t know Zig myself, but have been interested in it ever since a similar comparison with other language runtimes from @ddevault:
Hello world.