1. 14

  2. 1

    Woah, what a coincidence. There’s a hackathon this Saturday and I was planning on implementing a FAT fs, which I’ve been wanting to do for a while but haven’t had the time and patience to put my mind to.
    I love how those kinds of things happen in life! Thanks for the interesting article.

    1. -1

      People use cat in the weirdest ways…

      1. 8

        I’m aware of useless uses of cat, but in this case I wanted to use it to ensure that wc -c wasn’t relying on the filesystem’s count of the number of bytes in the file - sending it through a pipe ensures that.

        1. 3

          wc -c < foo

          Also, POSIX specifies that wc shall read the file.

          1. 5

            If you check the GNU coreutils wc source, you’ll see that when it’s only counting bytes, it tries to lseek on the fd to find the length. wc -c < foo is not the same as cat foo | wc -c in this case, because the seek will succeed in the first case and not in the second.

            1. 8

              I still prefer cat |. I actually prefer cat | in almost every case, because the syntactic flow matches the semantic flow precisely. With an infile, or even having the first command open the file, there’s this weird hiccup where the first syntactic element of the pipeline isn’t the initial source of the data, but the first transformation thereof.

              The main argument against it seems to be “but you’re wasting a process”, which, uh, with all due respect, I can’t see ever being a problem on any machine you’d run a full multiprocessing Unix on. If your system were constrained enough that that was an issue, a multiprocessing Unix would be too much overhead in and of itself, extra cats notwithstanding.

              1. 2

                < foo

                This does not guarantee that bytes are actually being read(); redirecting a file to stdin like that lets the process call fstat() on it if it wants. A naughty implementation of wc -c could call fstat(), check st_mode to verify that stdin is a regular file rather than a pipe or something, and then return the filesystem’s reported size from the st_size field without actually reading any bytes from stdin. Having some other process like cat or dd or something read the bytes onto a pipe does prevent wc -c from being able to see the original file & hence prevents it from being able to cheat and return st_size.

                Also, POSIX specifies that wc shall read the file.

                I guess this does. :)

                1. 0

                  … and this is, indeed, how I would have done it.

                2. 1

                  Interesting. Thank you for the great response.
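
The difference debated in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GNU coreutils code: `count_bytes` and `demo` are hypothetical names, and the shortcut simply mirrors the fstat()/st_size idea described above. It assumes a POSIX-like system. Handed a regular file (as with `wc -c < foo`), it never reads the data; handed a pipe (as with `cat foo | wc -c`), it has no choice but to read().

```python
import os
import stat
import tempfile


def count_bytes(fd):
    """Return (byte_count, method) for an open file descriptor.

    Hypothetical helper: takes the st_size shortcut for regular files
    and falls back to reading every byte for anything else.
    """
    st = os.fstat(fd)
    if stat.S_ISREG(st.st_mode):
        # Shortcut: trust the filesystem's reported size, minus whatever
        # has already been consumed from this descriptor.
        offset = os.lseek(fd, 0, os.SEEK_CUR)
        return max(st.st_size - offset, 0), "fstat"
    # Pipes, sockets, terminals: st_size is meaningless, so read it all.
    total = 0
    while True:
        chunk = os.read(fd, 65536)
        if not chunk:
            return total, "read"
        total += len(chunk)


def demo(payload=b"hello, FAT\n"):
    # Case 1: like `wc -c < foo`, stdin would be the regular file itself.
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        f.seek(0)
        from_file = count_bytes(f.fileno())

    # Case 2: like `cat foo | wc -c`, the consumer only ever sees a pipe.
    # The payload is small enough to fit in the pipe buffer.
    r, w = os.pipe()
    os.write(w, payload)
    os.close(w)
    from_pipe = count_bytes(r)
    os.close(r)

    return from_file, from_pipe


if __name__ == "__main__":
    print(demo())
```

Both cases report the same count, but only the pipe case proves that the bytes were actually readable, which is the point of the original `cat foo | wc -c`.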