This topic is worth addressing, and this blog post does a good job of summarizing the problem. Dealing with cancellation is one of the bigger warts in Go. The context package is a nice attempt, but it’s no panacea.
This is frustrating, and it made me wonder why something like the following interface isn’t more common:
type PreemptibleReader interface {
    Read(ctx context.Context, p []byte) (n int, err error)
}
The author leaves this a bit unanswered. There are a few reasons io.Reader looks this way (and doesn’t use context.Context):
The io package is older than the context package; io.Reader predates context.Context by years. Because of the Go compatibility guarantees, io.Reader wouldn’t be changed after the introduction of context.Context even if the Go team thought it was a good idea.
The io.Reader API as it exists right now essentially maps 1 to 1 to the read syscall on most operating systems. That is: provide a buffer, and read will fill that buffer with data and tell you how much data it filled in. The Linux read syscall docs are strikingly similar: http://man7.org/linux/man-pages/man2/read.2.html
It is, as the author wrote, intended to be as universal as possible, perhaps at the expense of usability.
The provided example has a few oddities:
It can only read in chunks of up to 1024 bytes, so depending on your data source this implementation may result in a much larger number of syscalls than you need.
It has to copy all of the data in userspace. That kinda defeats the purpose of slices being reference types; in the majority of situations, you want to read some data, then process that data, then read some more data, reusing the same slice buffer as you go, in order to avoid continually allocating and freeing memory. The benefit of being able to read and process at the same time has to be weighed against the cost of allocating new slices and copying all of the data. You could toss the buffers in a sync.Pool, but now you have additional coordination work to do to manage the buffers between the reader and the consumer. In some cases this sort of additional work may be faster, in some cases it may be slower, depending on your data source and what sort of processing you’re doing.
So, yes, it’s a real problem, and the provided example is a solution, but there are a lot of tradeoffs being made here. This gets back to the original question of why io.Reader is blocking by default: the alternatives involve tradeoffs that probably wouldn’t be a great fit for the standard library.
Anyway, I agree with the author that this is a challenging situation in Go.
SetDeadline on a net.Conn or os.File will cancel the read or write, and deadlines can be reset to resume later, unlike Close. Go 1.15 is introducing os.ErrDeadlineExceeded to make it easy to distinguish a deadline-induced error.
Hm, then I’m starting to wonder whether it would make sense for me to try and use some kinda “DeadlineReader/Writer” interfaces everywhere now.
The approach described here does not actually cancel the underlying read call; it is fine if it is the sole reader of something and you are not worried about reading too much data.
In Elvish (https://elv.sh), a shell implemented in Go, I have to solve a very similar problem reading from a terminal. You never want to read more data than necessary since the terminal is shared with other programs run from the shell. I implement cancellable read with a pipe and select: https://github.com/elves/elvish/blob/905447eda5d406ed147b8a1485f57b9549d4b345/pkg/cli/term/file_reader_unix.go (the sys.WaitForRead function is a thin wrapper over select). The synchronization semantics is slightly tricky.
Cool, thanks for sharing! I was looking for something like this. 😀
Oh, I hit this problem once in a while, and it isn’t Go-specific. Here’s a horrifying example from Rust’s jobserver, which uses SIGUSR1 to interrupt a blocked read: https://github.com/alexcrichton/jobserver-rs/blob/e6701fe3b642252be7ca592654b5b45804daa4eb/src/unix.rs#L260-L270.
In some cases (but not in the above one) it is possible to solve this by flipping the file description into a non-blocking mode and using poll instead of read, but that, at least in Rust’s case, isn’t wrapped in a convenient API (and I am not sure how an API for selectable reads should look in the first place).
Neither is there any real API for it in Go. I imagine if there were, it would be a completion port kind of thing. Issue a read along with a channel you want the result delivered to; then you can select on completion or cancellation. Then you just need a non-racy way to cancel delivery of the completion.
The article describes quite accurately a common issue and pattern when mixing IO and channels. It quickly becomes complicated, especially when multiple IO operations are involved.
Every time I encounter that issue I can’t shake the feeling that the code would be much simpler if Go’s select keyword could be extended to handle both channels and IO objects at the same time.
Just following up on this. I found what seems to be a good solution here:
https://medium.com/@zombiezen/canceling-i-o-in-go-capn-proto-5ae8c09c5b29
Or at least it solves the problem for me fairly elegantly. I can now cancel my context and it cancels my read.