Betteridge’s law of headlines strikes again. No, not everything is a file, that’s silly. You can slap a file and folders metaphor on top of almost any data type or interface, but that doesn’t make it a good idea or make the underlying thing much like a file.
If everything is still an fd, I’m not sure what the difference is between actually implementing this approach and acting as if it had been implemented. Nothing is stopping you from using pread/pwrite on files (obviously) and just never calling lseek. An error from read on a file is not much different from an error from lseek on a pipe, socket, or named pipe, not to mention platform-specific things like timerfds and whatnot.
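To make the "act as if it were implemented" point concrete, here is a minimal sketch in today's POSIX C. The function names and the /tmp path template are made up for illustration. The first function does all its file I/O with pwrite/pread and calls lseek only to observe that the kernel offset never moved; the second reports the errno that lseek produces on a pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: writes and reads a temp file using only
 * pwrite/pread. Copies "world" into out (6 bytes) and returns the
 * kernel file offset afterwards, or -1 on error. */
long positional_io_offset(char out[6])
{
    char path[] = "/tmp/posio-XXXXXX";   /* assumes /tmp is writable */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                        /* file lives until fd is closed */

    long offset = -1;
    if (pwrite(fd, "hello world", 11, 0) == 11 &&
        pread(fd, out, 5, 6) == 5) {
        out[5] = '\0';
        /* lseek is used here only to inspect the offset, not to move it. */
        offset = (long)lseek(fd, 0, SEEK_CUR);
    }
    close(fd);
    return offset;
}

/* Hypothetical helper: returns the errno set by lseek() on the read
 * end of a pipe, 0 if lseek unexpectedly succeeded, -1 if pipe() failed. */
int lseek_errno_on_pipe(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;
    errno = 0;
    int err = (lseek(p[0], 0, SEEK_CUR) == (off_t)-1) ? errno : 0;
    close(p[0]);
    close(p[1]);
    return err;
}
```

On a POSIX system the first call should report an offset of 0 and the second should report ESPIPE, which is the runtime failure being compared against.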
Also unless you also remove dup altogether you just shift the problem to when you duplicate the post-streamified fd. Even if lseek is gone reads on the two fds will interfere with the current position in the same way.
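The interference is easy to reproduce with current POSIX semantics, no lseek involved. A minimal sketch, with a made-up function name and temp path: two descriptors related by dup each read one byte, and the second read picks up where the first left off because both refer to the same open file description.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical demo: reads one byte through fd and one through dup(fd).
 * Stores the two bytes in out[0] and out[1]; returns 0 on success. */
int dup_shares_offset(char out[2])
{
    char path[] = "/tmp/dupdemo-XXXXXX";  /* assumes /tmp is writable */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    int rc = -1;
    int copy = -1;
    /* pwrite leaves the shared offset at 0. */
    if (pwrite(fd, "AB", 2, 0) == 2 && (copy = dup(fd)) >= 0) {
        if (read(fd, &out[0], 1) == 1 &&    /* advances the shared offset */
            read(copy, &out[1], 1) == 1)    /* continues from that offset */
            rc = 0;
    }
    if (copy >= 0)
        close(copy);
    close(fd);
    return rc;
}
```

The copy reads 'B', not 'A', even though nobody ever called lseek.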
I could see this working if fds and sds (“stream descriptors”) were different types but I think the existence of fifos means open can’t return just fds (non-streamified descriptors).
You can avoid calling lseek yourself but if you dup a descriptor and hand it off to a library or another process you can’t control whether or not it calls lseek on its descriptor. I guess if it decides to do that you’d still be fine as long as you only used pread/pwrite and never did anything that read the file position.
I’m not entirely clear on the author’s proposal but it sounds like the idea is that if you dupped a “streaming view” of a file then the duplicated copy would have its own independent file position? Or maybe dup on a “streaming view” works the same way that things do now (with a shared file position) but if that bothered you then you could choose to not call dup on the streaming view. Instead you’d create a brand new streaming view from the same underlying file. Then each streaming view would have its own position and you could hand one to other code without worrying about the effects on your own streaming view.
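The closest thing POSIX has today to "a brand new streaming view from the same underlying file" is probably just calling open a second time, since each open creates its own open file description with its own offset. This is only an analogy to the proposal, not the proposal itself; the function name and temp path below are invented for the sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical demo: opens the same file twice and reads one byte
 * through each descriptor. Stores the bytes in out[0] and out[1];
 * returns 0 on success. */
int separate_opens_have_own_offsets(char out[2])
{
    char path[] = "/tmp/viewdemo-XXXXXX";  /* assumes /tmp is writable */
    int a = mkstemp(path);
    if (a < 0)
        return -1;
    int b = open(path, O_RDONLY);          /* second, independent "view" */
    unlink(path);

    int rc = -1;
    if (b >= 0 &&
        pwrite(a, "AB", 2, 0) == 2 &&
        read(a, &out[0], 1) == 1 &&        /* moves only a's offset */
        read(b, &out[1], 1) == 1)          /* b still starts at 0 */
        rc = 0;

    if (b >= 0)
        close(b);
    close(a);
    return rc;
}
```

Here both reads return 'A', so handing b to other code would not disturb a's position.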
Of course none of this solves the issue of what to do if you have a real stream (not a streaming view of a file) like a pipe. If you dup it then a read from any copy will consume the data and that data won’t be available on other copies of the descriptor. Maybe this is simply defined as the expected behavior and thus as OK?
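A rough sketch of that consumption behavior on a current POSIX system (function name invented here): the data is read through a dup'ed copy of the pipe's read end, and a non-blocking read on the original descriptor then finds nothing left.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: drains a pipe through a dup'ed read end, then
 * tries a non-blocking read on the original read end.
 * Returns the errno from that read, 0 if it did not fail,
 * or -1 if the setup failed. */
int pipe_dup_drains(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    int result = -1;
    int copy = dup(p[0]);
    char buf[2];
    if (copy >= 0 &&
        write(p[1], "AB", 2) == 2 &&
        read(copy, buf, 2) == 2) {          /* consumes both bytes */
        int flags = fcntl(p[0], F_GETFL);
        if (flags != -1 && fcntl(p[0], F_SETFL, flags | O_NONBLOCK) == 0) {
            char c;
            errno = 0;
            /* The write end is still open, so this is "no data", not EOF. */
            ssize_t n = read(p[0], &c, 1);
            result = (n == -1) ? errno : 0;
        }
    }
    if (copy >= 0)
        close(copy);
    close(p[0]);
    close(p[1]);
    return result;
}
```

The result should be EAGAIN: the bytes went to whichever copy read first, and no amount of per-descriptor position tracking brings them back.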
Named pipes (FIFOs) would complicate things. But this article seems like it is proposing an alternative OS design that is not POSIX but is instead “POSIX-like”. In this alternative world we could say that named pipes are not supported. Or that they have to be opened with a special open_named_pipe syscall. Or that the file descriptor returned by calling open on a named pipe is a special stub descriptor. Attempting to call pread/pwrite on the stub will fail. The only way to do anything useful with the stub would be to create a streaming view from it and then call read/write on that streaming view. This is admittedly kind of ugly but that’s the price for maintaining the ability to open named pipes with open.
There are probably other complications. How do you handle writes to files opened with O_APPEND? Does pwrite write to the file at the requested offset or does it ignore that offset and write at the end? If it does write at the requested offset, how can you atomically append some data to the file? You can’t ask for the current size and then write at that offset because the file size might change between the first call and the second.
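For reference, this is the behavior that would have to be preserved somehow. A minimal sketch (function name and temp path are made up): two independently opened descriptors each write four bytes to the same file, once with O_APPEND and once without. As far as I know, existing systems already disagree on the pwrite question: POSIX says O_APPEND should not affect where pwrite writes, while the Linux man page documents that pwrite on an O_APPEND descriptor appends regardless of the offset. The sketch sticks to plain write to stay out of that.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: two separately opened writers each write 4 bytes.
 * extra_flags is OR'ed into the open flags (pass O_APPEND or 0).
 * Returns the final file size, or -1 on error. */
long two_writers_size(int extra_flags)
{
    char path[] = "/tmp/appenddemo-XXXXXX";  /* assumes /tmp is writable */
    int tmp = mkstemp(path);
    if (tmp < 0)
        return -1;
    int a = open(path, O_WRONLY | extra_flags);
    int b = open(path, O_WRONLY | extra_flags);
    unlink(path);

    long size = -1;
    struct stat st;
    if (a >= 0 && b >= 0 &&
        write(a, "aaaa", 4) == 4 &&
        write(b, "bbbb", 4) == 4 &&   /* without O_APPEND this starts at 0 */
        fstat(tmp, &st) == 0)
        size = (long)st.st_size;

    if (a >= 0)
        close(a);
    if (b >= 0)
        close(b);
    close(tmp);
    return size;
}
```

With O_APPEND the file ends up 8 bytes long; with 0 the second writer overwrites the first and the file is 4 bytes long. A design that drops the kernel-side offset still needs some primitive that gives the first result.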
What do you do about select and poll and friends? Do these take streaming views instead of file descriptors now?
Overall I don’t hate the idea. If we were going to put this in object-oriented terms then the current system would have pread and pwrite methods on the file descriptor interface. But some classes that implement that interface (like pipes, sockets, etc.) don’t support those operations so they just fail at runtime if you try to call those methods. Usually this is a sign that you’ve got your interfaces designed poorly. The most obvious fix for this type of thing would be to split things up into two (or more) interfaces and have each class implement only the methods that make sense for that particular class, and maybe create some adapter classes to help out. That seems to be what’s being proposed here, with the role of the adapter class being played by the “streaming view”. The most significant difference that I can see is that constructing new wrapper objects would normally be considered fairly cheap but constructing the streaming view would require an extra syscall which could be expensive enough that people would want to avoid it.
I wonder if it would be possible to emulate the streaming view in userspace in some place like the C library. That would get the kernel entirely out of the business of maintaining file offsets and leave them up to the process to track. The C library would be allowed to call read and write on objects like pipes and sockets but for real files it would only be allowed to call pread and pwrite. If the user code creates a streaming view of a file and tries to call read on it then the C library would have to translate that request to an equivalent pread call and then update the file position stored in the streaming view. Doing this for any POSIX environment would probably be somewhere between difficult and impossible but maybe one can imagine an OS design where it could be made to work.
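A very rough sketch of what that userspace emulation might look like, layered on today's pread/pwrite. Everything here (struct stream_view, sv_read, sv_write, sv_demo, the temp path) is hypothetical naming, and it ignores thread safety, O_APPEND and non-seekable descriptors entirely.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical userspace "streaming view": the position lives in the
 * process, not in the kernel's open file description. */
struct stream_view {
    int   fd;
    off_t pos;
};

/* read() emulated as pread() plus a userspace position update. */
ssize_t sv_read(struct stream_view *sv, void *buf, size_t len)
{
    ssize_t n = pread(sv->fd, buf, len, sv->pos);
    if (n > 0)
        sv->pos += n;
    return n;
}

/* write() emulated as pwrite() plus a userspace position update. */
ssize_t sv_write(struct stream_view *sv, const void *buf, size_t len)
{
    ssize_t n = pwrite(sv->fd, buf, len, sv->pos);
    if (n > 0)
        sv->pos += n;
    return n;
}

/* Demo: three views over one descriptor. Each reader gets the first
 * byte. Returns the kernel offset afterwards, or -1 on error. */
long sv_demo(char out[2])
{
    char path[] = "/tmp/svdemo-XXXXXX";   /* assumes /tmp is writable */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    struct stream_view writer = { fd, 0 };
    struct stream_view a = { fd, 0 };
    struct stream_view b = { fd, 0 };
    long offset = -1;
    if (sv_write(&writer, "AB", 2) == 2 &&
        sv_read(&a, &out[0], 1) == 1 &&
        sv_read(&b, &out[1], 1) == 1)
        offset = (long)lseek(fd, 0, SEEK_CUR);  /* observe only */
    close(fd);
    return offset;
}
```

Both views read 'A' and the kernel's own offset stays at 0, so the views do not interfere even though they share a single descriptor. The hard parts mentioned above (inherited descriptors, exec, code that calls the real read directly) are exactly what this sketch leaves out.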
My point isn’t that “this isn’t necessary because discipline”, it’s “the amount that this helps doesn’t reduce the discipline required in any significant way.” Everything is still read(Object, … ), pread(Object, …), ioctl(Object, …) etc. Removing lseek doesn’t stop two processes or threads from interfering with each other with read and its implicit seeks on a pipe, socket or streamed file.
Everything is a file but some things are more fcntl than others