My favorite highlight from the article is the “Building code doesn’t execute it” section:
It is an explicit security design goal of the Go toolchain that neither fetching nor building code will let that code execute, even if it is untrusted and malicious. This is different from most other ecosystems, many of which have first-class support for running code at package fetch time.
This is something unique among all programming languages, something that even Rust (which puts “security” among its core attributes) doesn’t provide.
I can safely build a Go application and then run it in a separate account or under bubblewrap without the concern that the build process will trash my workbench or account. (At the other extreme, there was one time when a Ruby dependency decided to overtly sudo without even notifying or asking for permission; I was saved by the fact that, on all my systems, the default sudo user is not root but nobody…) :)
The Ruby situation is especially dire because Gemfiles are themselves Ruby programs, so even resolving the dependencies of a project opens you up to remote code execution!
That said, I think there are reasons why projects may sometimes need build-time logic, and my long-term preference is for this to be available in Rust and other languages, but only in a sandbox with strong limitations, or even with the ability for end-users to place additional sandbox constraints or (more ideally) to relax the by-default-strict sandbox constraints.
I don’t think this will ever happen… I think most Rust developers come from two legacies: one is former C/C++ developers who are used to auto*, CMake or plain make, and thus don’t want to give up those abilities; the other part of developers seems to come from Ruby, Python and other interpreted languages where security is not a top priority…
I would love it if cargo (the Rust build tool) had a build option that disables the usage of build.rs.
Now, getting back to Go, I think it’s fair to say that this decision (of not running code at build time) is also helped by the fact that a lot of libraries are written in “pure Go” and thus there is no need for any “external build” facilities.
Also it is worth mentioning that even Go has go generate, but it is usually invoked manually by the developer, and its outputs are usually committed alongside the code, thus there is no need for it at build time.
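To make the go generate point concrete, here is a minimal sketch, assuming the stringer tool from golang.org/x/tools as the generator. The Color type and the colorName helper are made up for illustration; colorName stands in for the String method that stringer would normally write into a generated file committed next to the source.

```go
package main

import "fmt"

// Color is a hypothetical enum used only for this illustration.
type Color int

// The directive below is acted on only when a developer explicitly runs
// `go generate`. To `go build` it is an ordinary comment, so building this
// file never runs stringer (or anything else).

//go:generate stringer -type=Color

const (
	Red Color = iota
	Green
	Blue
)

// colorName is a hand-written stand-in for the generated String method,
// so that this sketch compiles without running any generator.
func colorName(c Color) string {
	switch c {
	case Red:
		return "Red"
	case Green:
		return "Green"
	case Blue:
		return "Blue"
	default:
		return fmt.Sprintf("Color(%d)", int(c))
	}
}

func main() {
	fmt.Println(colorName(Green))
}
```

The generated output is checked in like any other source file, which is why a plain go build of someone else's module needs neither the generator nor any code execution.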
Note that you’d also need to disable proc macros. And I fear that the number of crates which transitively use neither build.rs nor proc macros is vanishingly small :(
However, at least with regard to proc macros, I assume most of them only process the AST given as input, and thus could be restricted (either by forbidding the use of certain APIs, or by something like seccomp). For build scripts, perhaps access should be limited to the current workspace (and output) directory, with any other OS interactions (sockets, processes, etc.) disallowed.
As for those that need to invoke external processes or connect to remote endpoints, perhaps their place is not in the build life-cycle, and, just like go generate, they should be extracted into a completely separate step.
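The "limit access to the workspace directory" idea could be sketched roughly like this. withinWorkspace is a hypothetical helper, not part of cargo or any real sandbox, and the paths are made up. It only does a lexical path check; a real sandbox would need OS-level enforcement (and symlink handling) on top of it.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// withinWorkspace reports whether target, after lexical cleaning, stays
// inside root. It does not resolve symlinks, so it is only an illustration
// of the policy, not an enforcement mechanism.
func withinWorkspace(root, target string) bool {
	rel, err := filepath.Rel(filepath.Clean(root), filepath.Clean(target))
	if err != nil {
		// For example, one path is absolute and the other relative.
		return false
	}
	// A relative path that climbs out of root starts with "..".
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	root := "/work/project" // hypothetical workspace directory

	fmt.Println(withinWorkspace(root, "/work/project/target/out.o"))     // true
	fmt.Println(withinWorkspace(root, "/work/project/../../etc/passwd")) // false
}
```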
On the same subject, I have the feeling that Python fits in the same category with its setup.py.
Python’s “wheel” (.whl) packages do not have and have never had the ability to run code during installation; they only run setup.py when building the package for distribution.
And more recently, people have been working on moving to pure declarative package-build configuration anyway.
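As a rough illustration of why installing a wheel does not need to run code: a wheel is a ZIP-format archive, so an installer can read and unpack its entries as plain data. The sketch below builds a tiny in-memory ZIP and lists it. The entry names and the buildArchive and listEntries helpers are made up for the example, and this is not how pip itself is implemented.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"sort"
)

// buildArchive writes a small in-memory ZIP from the given name -> content map.
func buildArchive(files map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for name, body := range files {
		f, err := w.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := f.Write([]byte(body)); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// listEntries returns the sorted entry names of a ZIP archive.
// Nothing inside the archive is executed; entries are treated as data.
func listEntries(data []byte) ([]string, error) {
	r, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(r.File))
	for _, f := range r.File {
		names = append(names, f.Name)
	}
	sort.Strings(names)
	return names, nil
}

func main() {
	// Hypothetical wheel-like layout: package files plus a metadata directory.
	data, err := buildArchive(map[string]string{
		"example_pkg/__init__.py":            "VERSION = '1.0'\n",
		"example_pkg-1.0.dist-info/METADATA": "Name: example-pkg\nVersion: 1.0\n",
	})
	if err != nil {
		panic(err)
	}
	names, err := listEntries(data)
	if err != nil {
		panic(err)
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```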
It’s an unfortunate fact of the world that there are still a lot of sdist-only packages, even ones that are pure Python and could easily distribute a universal wheel.
Elm is right up there with Go in not executing code during fetch and build. I’ve even seen experiments with CLIs written in Elm where you can restrict at the type level as to what the code has access to so that were you to run a CLI written in Elm you can know that it’s only touching approved files/directories.
You could maybe include Deno in here too, though it’s a runtime and not a language, because in order to execute something that wants to do IO or such, you need to explicitly allow it. You can even restrict it to the directory or URL it has access to.
Huh, doesn’t Go tend to make heavy use of code generation? I guess if you check in the generated code, you technically don’t have to execute any code at build-time… but avoiding compile-time code execution by shipping build artifacts in the source repo feels like cheating.
Better than literally distributing binaries, mind you, because generated source is theoretically human-readable! But still, it feels like they only manage to build from source with no code execution by taking a bizarre definition of what “source” is.
Actually I prefer having pre-generated stuff in the repository, as opposed to having to install (and fiddle with) various obscure tools for code generation or documentation… This way, if I only need to patch some minor bug, or make some minor customization to the code, I can rebuild everything by just having Go / Rust / GCC installed.
I have the opposite experience with lots of other projects, where in order to build them you need a plethora of Python or Ruby tools, or worse, other more esoteric ones, most of which are not available by default on many distributions…
Just imagine that you want to patch a tool that relies on serving some JS bundle. Do I want to also build an entire NodeJS project for this? Hell no! I’ll just move to another alternative… (In fact this is my preferred way to interact with the NodeJS based ecosystem: as long as it runs only in the browser, and as long as I don’t have to touch the NodeJS tooling, great! Just give me a “magic” blob! Thus I also keep a close eye on Deno…)
This is fair, but in some cases quite a pain, particularly for cross-compilation (or support for other hardware platforms in general). In the Rust crates I maintain, we generate FFI bindings for the most common targets. It would be a complete hassle to (re)generate them for all possible targets, and new ones get added regularly, so we’d have to keep on top of that as well. So we offer a feature to do that at build time, if you want to build for a platform we don’t “support”, or if you have some special sauce in your bindgen or the other tooling around it.
I agree that one can’t possibly generate artifacts for all platforms under the sun. (My observation mainly applies to portable artifacts such as JavaScript bundles, or Java jars, or man-pages, or other such resources.)
However, in your case I think it’s great that you at least generate the artifacts for the most common targets! As long as you’ve made the effort to cover 90%+ of the users, I think it’s enough.
My issue is with other projects out there that don’t even make this effort!
I think the author should have mentioned that this is the current state and quite recent. Go 1.16 was only released on 2021-02-16, and I actually worked with Go before go.mod, when it was a complete mess.
So saying “look at Go, we do it properly”… well, no. They FINALLY do it properly (I hope, I’ve not used it in years).
I still have the source code for a 2017 project, and apparently we used glide, which put a glide.lock file there. I don’t remember if it had good pinning, but it was a third-party add-on.
I agree with you about the state of Go pre-modules.
FWIW, Go 1.16 just made some commands fail if go.mod was out of date. Modules became the default in Go 1.13 and were available opt-in starting in Go 1.11 (or maybe 1.10, I don’t remember). Go modules have had a number of these properties for longer than the past year.
Yeah, I admit I was too lazy to read the whole history of Go releases and find out which one exactly did what - thanks for the addition.
I did mean “before 1.11” for the complete fail then, when Go was already around for a good while and 1.16 apparently fixed the last problems, as you clarified.
FWIW, when they created go.mod, they made it automatically import and convert glide files and a few other prior formats (dep being the most prominent).
I forgot about proc macros…
watt tries to sandbox proc macros by compiling them to WebAssembly, and then executing those.
That’s what Watt does by compiling proc macros to WebAssembly (which is naturally sandboxed).
(Funny enough, I think that Java, at least through Maven, doesn’t suffer from this kind of install-time code execution…)