1. 35

  2. 15

    But otherwise, try to stick to more standard Unix tools.

    I think dd is pretty darn standard…

    1. 2

      Not in the form of its arguments or the ability to pipe data. I found the example with cat and pv quite telling.

      1. 4

        The default for dd is stdin/stdout, so you can easily do dd if=example | pv | dd of=example.
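
        A minimal sketch of that pipeline shape, assuming throwaway temp files in place of a real device. The names src, dst and meter are made up for illustration, and pv is swapped for cat when it is not installed:

        ```shell
        src=$(mktemp); dst=$(mktemp)
        printf 'hello, dd\n' > "$src"

        # Use pv as the progress meter if it is installed; otherwise fall back
        # to cat so the pipeline keeps the same shape.
        if command -v pv >/dev/null 2>&1; then meter=pv; else meter=cat; fi

        # With no of=, the first dd writes to stdout; with no if=, the second reads stdin.
        dd if="$src" 2>/dev/null | "$meter" 2>/dev/null | dd of="$dst" 2>/dev/null

        copied=$(cat "$dst")
        rm -f "$src" "$dst"
        echo "$copied"
        ```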

      2. 1

        You can quibble with his wording but IMO his usage of cat here is pretty darn compelling.

        1. 2

          Not really: in the progress meter version cat has no reasonable way to know whether it is writing to a device where larger writes than the block size are desirable for throughput, or to grep where latency should be minimized.

      3. 12

        dd is the only tool in the POSIX-specified minimal set of utilities that can read an exact number of bytes, and so shows up in some surprising places (Midnight Commander’s “Files transferred over Shell protocol”, for example).

        dd does other useful things too, allowing you to seek to a certain offset in the output file before writing, do big-little endian conversion (only for two-byte values), etc, etc.
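
        For illustration, a small sketch of those three uses (exact byte count, output seek, two-byte swap), using only POSIX dd operands. The variable names and sample data are invented for the example:

        ```shell
        # Read exactly 4 bytes from a stream (bs=1 sidesteps short reads on pipes).
        first4=$(printf 'abcdefghij' | dd bs=1 count=4 2>/dev/null)

        # Swap each pair of bytes: the two-byte endian conversion.
        swapped=$(printf 'abcd' | dd conv=swab 2>/dev/null)

        # Overwrite 2 bytes at offset 3 of an existing file, leaving the rest intact.
        tmp=$(mktemp)
        printf '0123456789' > "$tmp"
        printf 'XY' | dd of="$tmp" bs=1 seek=3 conv=notrunc 2>/dev/null
        patched=$(cat "$tmp")
        rm -f "$tmp"

        echo "$first4 $swapped $patched"   # prints: abcd badc 012XY56789
        ```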


        1. 4

          Indeed. cat != dd.

          I wrote a GOFF [*] file reader using standard Unix tools and cat will not work (even with GNU extensions) because you have to read in blocks of 80 bytes. Same goes for the DLL export files.

          If you think dd can be replaced with cat, then I suggest you probably only do what the author does with it: copy entire files.

          [*] GOFF is the object code file format on mainframes.
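
          A rough sketch of the fixed-record idea, assuming a made-up file of three 80-byte records rather than real GOFF data (the record contents and variable names are hypothetical):

          ```shell
          # Build a stand-in file: three 80-byte records filled with A, B and C.
          f=$(mktemp)
          for c in A B C; do
            printf '%080d' 0 | tr 0 "$c"
          done > "$f"

          # Pull out record 1 (zero-based): skip one 80-byte block, then read one.
          rec=$(dd if="$f" bs=80 skip=1 count=1 2>/dev/null)
          rm -f "$f"

          echo "${#rec}"   # prints: 80
          ```

          There are no newlines between records here, which is why line-oriented tools have nothing to split on.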

        2. 8

          dd < source > target

          Author seems ignorant of the origin of dd specifically, and UNIX conventions generally.

          1. 5

            While we’re at it: why does the head(1) command even exist?

            sed 11q

            1. 29

              Like a great many “oh it’s so simple” replacement commands, it’s not actually the same. sed 10q *.c and head *.c produce quite different results. I’m sure with enough work, you could cook up a shell script that does about the same. In the meantime, I’ll be using head.

              This reminds me that for a long time FreeBSD had a note in the ls manpage explaining that there was no option for sorting by file size because look how easy it is to pipe the output to this sort command. Oh, but remember if you use ls -i the column count changes, so use this other sort command. And if you use ls -h for human readable numbers, first pipe the output through awk so the numbers are scaled properly. Anyway, while you’re trying to piece this all together, take a moment to reflect on how fortunate you are to bask in such pure unix essence.

              Oh, I forgot the best part. ls escapes control characters in output, but only when writing to a terminal. sort isn’t a terminal, so unless you want your terminal getting jacked up, you also have to introduce everybody’s favorite command, cat -v, into the mix.
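
              To make the sed versus head difference concrete, here is a small sketch with two invented three-line files, using a limit of 2 lines instead of 10 to keep it short:

              ```shell
              d=$(mktemp -d)
              printf 'a1\na2\na3\n' > "$d/a.c"
              printf 'b1\nb2\nb3\n' > "$d/b.c"

              # sed sees its inputs as one concatenated stream and quits after 2 lines total.
              sed_out=$(cd "$d" && sed 2q *.c)

              # head restarts the count for each file and prints a "==> name <==" header.
              head_out=$(cd "$d" && head -n 2 *.c)
              rm -rf "$d"

              printf '%s\n---\n%s\n' "$sed_out" "$head_out"
              ```

              The sed output stops inside a.c and never reaches b.c, while head shows the first two lines of both files.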

              1. 1

                There is no substitute for typing dumb things into the computer except not typing dumb things into the computer. There will always be trade-offs. Understanding what a given program does with its input is the first step.

                1. 1

                  I would posit the commands have dumb requirements (using the ls example given above by tedu)

              2. [Comment removed by author]

                1. 4

                  Some of the things in that list are a bit of a stretch. My favourite is the positively masochistic:

                  while (! ~ (`{ date }) (specific-time)); commands

                  as an apparent stand-in for at(1).

                  1. 3

                    I think the assumption is that you’d write your own little shell (rc shell, obviously) script containing something similar to the suggested command and call it “at”.

                    The nice thing about Plan 9 is you could put it in /bin and it would be there no matter where your code was actually running…

                    1. 9

                      I’m not great at rc, but isn’t that a busy loop? If I want to schedule a job for after hours, I’m not sure my coworkers will appreciate pegging the CPU running date 10000 times per second. So you can add a sleep and some stuff, but now you’re just reinventing a square wheel.

                      I’ll add that the script’s error handling also really sucks. Accidentally mistype the specific time? Loops forever…

                      1. 2

                        A simple shell script is not in every case a superior replacement for a featureful program (especially if the shell language is complex and full of easy-to-trip gotchas), but the point of that list of comparison commands is to illustrate that with carefully considered primitives you can go a long way with very little. Probably not all of those examples are golden.

                        1. 1

                          I think there’s also a bit of tongue-in-cheek humor there too.

                          1. 7

                            Like a lot of the plan9 holy scriptures, not all of the true believers seem to be in on the joke. :)

                            1. 3

                              I don’t think the cat -v paper or THE UNIX PROGRAMMING ENVIRONMENT were intended as jokes. None of this is really about Plan 9, specifically. Plan 9 comes up in these discussions because its primary architect was the same guy who co-wrote those earlier texts, and incorporated his strongly argued preferences into the new system. Those preferences include things like: don’t write unnecessary code. The guys in 1127 at the Labs never bought into a lot of what went into BSD or even other Labs versions of UNIX, anyway. I don’t think it’s fair to dismiss the (admittedly, famously misunderstood) pov of UNIX because some individuals forty years later made bad arguments on the Internet or because a decade-old article in the user-contributed Plan 9 wiki made silly comparisons between then-modern UNIX commonplaces and obviated similar functions in Plan 9. Reading for content, the underlying point remains the same: Keep it simple, stupid.

                              1. 2

                                Funny you mention the cat -v paper. Towards the end, they build a columnizer out of pr, but of course they give it an alias instead of always typing a bunch of arbitrary arguments. Then they go on to suggest rewriting useful utilities in C. So to return to the original question of why we have head instead of telling users to memorize sed commands, I think there’s your answer.

                                1. 2

                                  Are you misunderstanding me on purpose?

                    2. 2

                      HISTORY The head utility first appeared in 1BSD.

                      AUTHORS Bill Joy, August 24, 1977.

                  2. 5

                    There are lots of things dd can do other than output to a raw device. It can read non-seekable file descriptors without destroying them. It may not be as fast as cat, but it operates at the block level instead of just STDOUT.

                  3. [Comment removed by author]

                    1. 1

                      I’m not sure I understand. Why is one any better at trashing disk images than the other?

                      1. 2

                        cat truncates the output to the size of the input, dd conv=notrunc doesn’t.

                        1. 3

                          Strictly it’s the shell that’ll do the open(O_TRUNC) rather than cat or head themselves, but yes…dd (the GNU version, anyway) also offers other open(2)-time options that shell redirections don’t, such as O_DIRECT and O_NOATIME.
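
                          A quick sketch of the difference, assuming a ten-byte temp file standing in for a disk image (img and the sample bytes are made up for the example):

                          ```shell
                          img=$(mktemp)

                          # The shell opens the redirection target with O_TRUNC before cat even runs.
                          printf '0123456789' > "$img"
                          printf 'AB' | cat > "$img"
                          after_cat=$(cat "$img")

                          # dd with conv=notrunc overwrites in place and leaves the tail alone.
                          printf '0123456789' > "$img"
                          printf 'AB' | dd of="$img" conv=notrunc 2>/dev/null
                          after_dd=$(cat "$img")
                          rm -f "$img"

                          echo "$after_cat $after_dd"   # prints: AB AB23456789
                          ```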

                    2. 5

                      I wonder if the author has noticed that first he asks for cat to figure out the correct blocksize (which, by the way, is usually larger than the device block size to let the drive use its buffer), but then he adds pv to the pipeline between cat and the target. So now cat definitely cannot pick the block size correctly.

                      No, I don’t want all my pipelines to use the write block size of 1MiB by default, thank you.

                      1. 3

                        dd is an oddball in both behavior and syntax because it’s a reference to the JCL command that shipped with IBM mainframes. It’s not useless, but it is slightly silly on purpose.