I am somewhat skeptical that this is a problem that should be solved at a language runtime level. It seems to me that the container level itself should be reporting the constraints it runs under as the result of normal system calls. I don’t think it’s reasonable to ask every programming language to become more environmentally aware than what machine/OS it thinks it runs on. The VM/container layer seems like a more natural place for this sort of effort.
FWIW my cursory eyeballing indicates internally at Google this is also implemented at a package level much in the same way as the linked https://pkg.go.dev/go.uber.org/automaxprocs from the article.
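For anyone who hasn’t used the linked package, its documented usage is just a side-effecting import whose init() lowers GOMAXPROCS to match the CFS quota; roughly:

package main

import (
	"fmt"
	"runtime"

	// Adjusts GOMAXPROCS at init time to match the container's CPU quota.
	_ "go.uber.org/automaxprocs"
)

func main() {
	// With the blank import above, this should print the quota-derived value
	// inside a CPU-limited container, not the host's core count.
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}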
The problem with lowering this to a kernel-level API is that the Linux kernel doesn’t have a concept of containers. Unlike Solaris zones or FreeBSD jails, Linux containers exist entirely in userspace. Container runtimes pull together many different low-level kernel APIs to create a container.
In this case, the CPU quotas are dynamic at the kernel level, not at all like the physical CPU count. As far as I know, no Linux container runtime takes advantage of quotas being changeable on the fly, but the kernel doesn’t technically know that, and therefore can’t just go around reporting a physical CPU count that could change.
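To make the “dynamic at the kernel level” point concrete, here is roughly what a library (or a runtime) has to do on cgroup v2: read cpu.max and divide quota by period. This is a simplified sketch that ignores cgroup v1, nested cgroups, and quotas changing after startup.

package main

import (
	"fmt"
	"math"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// effectiveCPUs returns the CPU limit implied by the cgroup v2 cpu.max file,
// falling back to the machine's core count when no quota is set.
func effectiveCPUs() int {
	data, err := os.ReadFile("/sys/fs/cgroup/cpu.max")
	if err != nil {
		return runtime.NumCPU()
	}
	fields := strings.Fields(string(data)) // e.g. "200000 100000" or "max 100000"
	if len(fields) != 2 || fields[0] == "max" {
		return runtime.NumCPU()
	}
	quota, err1 := strconv.ParseFloat(fields[0], 64)
	period, err2 := strconv.ParseFloat(fields[1], 64)
	if err1 != nil || err2 != nil || period == 0 {
		return runtime.NumCPU()
	}
	n := int(math.Ceil(quota / period))
	if n < 1 {
		n = 1
	}
	return n
}

func main() {
	fmt.Println("effective CPUs:", effectiveCPUs())
}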
As for whether language runtimes ought to support this tomfoolery, I agree it should be in a library. But I also think much of the Go language runtime should be a library, like Tokio is for Rust. Since Go already has loads of stuff baked into the language runtime, and tries to make it all as self-tuning as possible, I don’t think it’s unreasonable to expect Go to detect Linux containers natively. And Go does already have loads of platform-specific code, opting to implement syscall interfaces itself rather than use libc.
The whole reason we have containers is that stuff like this was leaking too much through the APIs of the operating system. The cycle is never-ending, it seems.
Doesn’t really seem noteworthy to me that a programming language does that. It feels like the expected behavior to ask the OS about available resources, rather than making assumptions based on all the ways it could be run. Docker, though popular, is a special case, and even CFS is just one of many Linux schedulers. It certainly won’t be the last.
I’m not arguing for or against it, but you can also run everything in a Docker container on a desktop, and there is software that does so.
I think languages (implementations, that is) should primarily target operating systems (or standards like POSIX) and architectures. Everything else should be considered a special case or an optimization, and those are debatable.
What I mean by that is that this is also about expectations. I would expect that this is what happens; if I use containers, I would (and should) be aware of it. As a developer, however, if I got a bug report, or if I were moving to containers (maybe even on a desktop) and the behavior changed, including performance or resource usage, I’d be surprised. I expect that from VMs, but not from containers.
On the other hand, software of course changes behavior based on where and how it runs, but I wonder if this shouldn’t be in the hands of the developer. I’m really not sure, though, what the better behavior is. Java’s huge number of settings doesn’t seem to be the right way either; in the early days it was a stated goal that one should not have to deal with such things, though that changed a bit as developers here and there started interacting with the GC. So that would be a reason for Go to self-optimize. If that happens, though, I think it would make sense to ensure it doesn’t just happen for Docker, but for pretty much every way one executes a Go application, under every scenario, on every platform and operating system, in keeping with the cross-platform support that Go promises.
As a slightly off-topic side note: with Go being statically compiled, having embed, and it even being trivial to use things like OpenBSD’s pledge and unveil, I wonder if containers are the right way to execute Go. I feel the same way about Java’s “fat JARs”. I think the Nomad approach is fairly interesting here: alongside Docker containers, it supports an “isolated” execution mode (unlike raw_exec) and a way to execute Java applications. Depending on your use case, cutting out the container engine might be interesting. And given that even things like Deno can create self-contained executables out of the box, I think Docker might have accelerated separating state and how one thinks about services, but might be unnecessary complexity in many situations.
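Since the pledge/unveil point can sound abstract: golang.org/x/sys/unix exposes both calls, so a sketch like the one below (OpenBSD only; the unveiled path and the promise strings are illustrative, not a recipe) locks a Go binary down without any container engine in the picture.

//go:build openbsd

package main

import (
	"log"
	"net/http"

	"golang.org/x/sys/unix"
)

func main() {
	// Only this directory remains visible to the process, read-only.
	if err := unix.Unveil("/var/www", "r"); err != nil {
		log.Fatal(err)
	}
	if err := unix.UnveilBlock(); err != nil {
		log.Fatal(err)
	}
	// Limit the process to stdio, read-only file access, and sockets.
	if err := unix.Pledge("stdio rpath inet", ""); err != nil {
		log.Fatal(err)
	}
	log.Fatal(http.ListenAndServe(":8080", http.FileServer(http.Dir("/var/www"))))
}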
This article would shoot to #1 on HN if the author had instead talked about how the Go team are evil and stupid and hate kittens, instead of just linking to an issue in the Go repo and another repo from Uber that fixes it.
Yeah, this article is way too constructive. He’ll never get anywhere with that attitude.
I flagged this as off-topic because comments complaining about some unrelated website seem off-topic.
Rob Pike famously said that all Hacker News users are terrible programmers and not smart enough to do anything useful. That’s why they don’t deserve to know about CFS quotas.
He also came into my house and turned off all my syntax highlighting :-(((
This seems familiar; it’s the same problem as with JVM applications, IIRC?
It looks like a special case of the general problem that nested schedulers always interact in surprising ways and, unless a nested scheduler has full visibility into its parent, will always behave poorly. This is true for JVMs and for guest kernels on hypervisors’ VCPU schedulers, and for every other attempt to build nested schedulers.
I believe the JVM has been updated to use cgroups info, but I don’t know the internals and if that totally solves this problem.
Oh neat! Yeah, this looks like it solves the main complaints about JVM apps in containers. It at least removes the need for a lot of hacky workarounds.
Good example of “Google doesn’t use this tech and therefore the issue isn’t important to us” from the Go team. Meanwhile much of the tech world is deploying on Docker.
You didn’t bother reading the article. The article links a proposal here: https://github.com/golang/go/issues/33803#issuecomment-1024492050
The last comment from Ian Lance Taylor was:
We’re certainly open to it. I, at least, have no idea what is required to implement it. It would be helpful if someone could move this issue past “we should do this” to “this is how it can be done.” Thanks.
So it’s actually nothing to do with “not important”; it’s “please help us understand containerization better”.
I read the article. Docker has been in widespread use for a decade now; they’re years late to the party.
A similar issue happened in the Ruby community: ruby-core was mainly in Japan and historically didn’t really use Rails. For its first decade, Ruby basically ignored Rails. Now there is far greater integration, since many ruby-core developers are now daily Rails users and/or paid engineers at Rails shops (e.g. Shopify, Heroku).
I think both are fair ways to see it. It’s not important to them, so Go core doesn’t really care, but if they’re shown a good way to do it correctly, it’s a useful fix to implement.
Did you forget that Google created Kubernetes? They “use this tech” plenty. It’s just that the “problem” is far less of a problem, and the “solution” is far less clear-cut, than this article makes it out to be.
Adding to that, Google also created cgroups.
Related: https://github.com/KimMachineGun/automemlimit
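That library is to GOMEMLIMIT what automaxprocs is to GOMAXPROCS. The manual version is small enough to sketch; assume cgroup v2 only, and note that the 90% headroom below is an arbitrary choice, not anything the runtime prescribes.

package main

import (
	"os"
	"runtime/debug"
	"strconv"
	"strings"
)

// setMemLimitFromCgroup reads the cgroup v2 memory.max file and, if a limit
// is set, points the Go runtime's soft memory limit at ~90% of it.
func setMemLimitFromCgroup() {
	data, err := os.ReadFile("/sys/fs/cgroup/memory.max")
	if err != nil {
		return
	}
	s := strings.TrimSpace(string(data)) // either a byte count or "max"
	if s == "max" {
		return
	}
	limit, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return
	}
	debug.SetMemoryLimit(limit * 9 / 10) // leave headroom for non-heap memory
}

func main() {
	setMemLimitFromCgroup()
	// ... rest of the application
}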
Regardless of arguments about whether or not the Go runtime should be CFS-aware, I guess the one subject that comes to my mind instantly is: how many Go devs are writing applications expecting the Go runtime to be contextually aware of the CPU limit, and are thus deploying less efficient solutions to the cloud?
My understanding of Go is that it shouldn’t be such a complex language to master, hence the easy-to-learn syntax. But maintaining Go applications is seemingly more complex, when people write up how they have to tune the Go GC, or even do things like this, where you check whether the runtime acknowledges the host’s CPU limitations.
I guess this would fall under a leaky abstraction of some sort.
GOMAXPROCS is very often used to decide the upper bound on the number of worker goroutines to spawn. That way each goroutine is guaranteed a full CPU core.
How does the code that spawns GOMAXPROCS goroutines ensure that each of them are guaranteed a core? Can other goroutines not be spawned elsewhere, which would compete for OS threads within the scheduler?
I don’t think “each goroutine is guaranteed a full CPU core” is accurate at all, but GOMAXPROCS is still (probably) the right number of workers. Dealing efficiently with (many) more goroutines than there are threads is the point of the scheduler — to some extent it’s the point of the language. Having more than GOMAXPROCS goroutines is a normal condition, and you should expect them to get scheduled fairly, and not to contend with each other significantly unless you’re actually using nearly all of the available CPU (which usually isn’t the case).
But if you have fewer than GOMAXPROCS goroutines in your worker pool, then if a bunch of work comes in and nothing else is going on, you’re leaving performance on the table by not bursting onto all of the available cores immediately.
And if you follow that logic, then you can see why, although setting GOMAXPROCS equal to your quota share is a somewhat reasonable thing to do, it’s not the only reasonable thing to do, or even the obviously correct thing to do. Setting GOMAXPROCS equal to your quota share means that every Go thread can use 100% CPU all the time and you won’t get throttled by the OS scheduler. But if you set GOMAXPROCS to the number of actual cores, you can potentially use all of those cores at once… as long as you do it in a burst that’s short enough that you don’t get throttled.
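To put made-up numbers on that: say the container has a cpu.max of “200000 100000”, i.e. a quota of two CPUs’ worth of time per 100ms period, on a 16-core host. With GOMAXPROCS=16, a fully parallel burst burns through the 200ms of allotted CPU time in about 12.5ms of wall clock and then sits throttled for the remaining ~87.5ms of the period; with GOMAXPROCS=2 the same work runs at a steady two cores and never gets throttled, but a short burst takes longer to complete.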
When you’re dealing with a “web” sort of workload (requests arrive randomly, you do a tiny bit of work to parse them, and then you quickly block on a database server or an API or something), it’s very much possible that you will get faster response time numbers by not lowering GOMAXPROCS, or by setting it to some intermediate value. The best answer really does depend on the nature of your code, what other loads it’s sharing the machine with, and whether you’re trying to optimize for latency or throughput.
So why even use GOMAXPROCS as the size for a worker pool?
A lot of times, you don’t need a worker pool to begin with; you can just do stuff. If an HTTP server uses a goroutine to service each request, and each of those spawns off a dozen goroutines to do this and that, no big deal, even up to high client counts. But somewhere around a million goroutines you may start to notice some scheduling degradation, or feel the pinch of having that many stacks in memory. So if you were doing something like scraping the entire web, you wouldn’t want to write

func crawlURL(url string) {
	page := downloadURL(url)
	links := getLinks(page)
	for _, link := range links {
		go crawlURL(link) // unbounded fan-out: one new goroutine per link
	}
}
because sooner or later it would explode. Instead you would do the downloading in a worker pool sized to the number of parallel requests you want to make (which has nothing to do with GOMAXPROCS; constraints like that come from elsewhere), and the parsing in another pool (which might be GOMAXPROCS-sized), and feed them using channels.
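A rough sketch of that shape, reusing the hypothetical downloadURL/getLinks helpers from the snippet above (the in-flight request cap is an arbitrary stand-in):

package main

import (
	"fmt"
	"runtime"
	"sync"
)

// Stand-ins for the helpers in the snippet above; not real functions.
func downloadURL(url string) string { return "<html>" + url + "</html>" }
func getLinks(page string) []string { return nil }

func main() {
	urls := make(chan string)
	pages := make(chan string)

	// Download pool: sized by how many requests we want in flight,
	// which has nothing to do with GOMAXPROCS.
	const maxInFlight = 64
	var downloaders sync.WaitGroup
	for i := 0; i < maxInFlight; i++ {
		downloaders.Add(1)
		go func() {
			defer downloaders.Done()
			for u := range urls {
				pages <- downloadURL(u)
			}
		}()
	}

	// Parse pool: CPU-bound work, so GOMAXPROCS workers is a reasonable size.
	var parsers sync.WaitGroup
	for i := 0; i < runtime.GOMAXPROCS(0); i++ {
		parsers.Add(1)
		go func() {
			defer parsers.Done()
			for p := range pages {
				fmt.Println(getLinks(p))
			}
		}()
	}

	for _, u := range []string{"https://example.com"} {
		urls <- u
	}
	close(urls)
	downloaders.Wait()
	close(pages)
	parsers.Wait()
}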