1. 15

A counterpoint to “Find is a beautiful tool”: “find sucks, but is inarguably useful”.

It’s an old Plan 9 mailing list post in which someone deconstructs find and rebuilds it into something more modular, flexible, and powerful, yet still compact enough to include the C and shell source code.


  2. 6

    lr is an easy-to-use alternative with an expression language.

    1. 5

      See also the followup email about usage, including examples.

      1. 4

        It needs to be a full-blown Awk-like language, with -v name=value for passing parameters. I don’t think there’s any other way to subsume all of its functionality with command line flags. It’s already a huge abuse of the command line syntax. Operators like -a and actions like -printf look the same, and neither is what a flag usually means (an option passed to a program). This is confusing.

        It already is a pattern/action or predicate/action language, just with a bad syntax.
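
        As a rough sketch of the predicate/action idea using tools that exist today, find can emit the metadata and awk can act as the evaluator, with -v passing the parameter. This assumes GNU find (for -printf) and file names without tabs or newlines. The function name bigger_than is made up for illustration, and this is not the proposed language itself.

        ```shell
        # bigger_than DIR MIN: print regular files under DIR of at least MIN bytes.
        # Hypothetical helper; relies on GNU find's -printf (%s = size, %p = path).
        bigger_than() {
            find "$1" -type f -printf '%s\t%p\n' |
                awk -F '\t' -v min="$2" '$1 + 0 >= min + 0 { print $2 }'
        }
        ```

        Something like `bigger_than /usr/ports 1000000` would then list the large files, with the threshold passed as an awk variable rather than encoded in a flag.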

        As I mentioned here:


        In fact it would be pretty easy to add this to an existing Awk codebase. You just add a simple tree walk function that exposes file system metadata to the boolean evaluator and action executor. If anyone wants to work on this in Oil [1] let me know :)

        [1] http://www.oilshell.org/

        1. 3

          The suggestion later in the thread to use du instead seems odd to me. That’s some Plan 9 shenanigans there.

          I love find but it definitely is a pain to work with. walk looks a lot nicer, although with Subversion I definitely need a prune ability that is better than grepping out the results post-facto. find’s prune is insane though.
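
          For reference, the usual prune idiom for skipping Subversion metadata looks like the sketch below. The function name is invented for illustration; the find expression itself uses only standard primaries.

          ```shell
          # files_outside_svn DIR: list regular files under DIR without
          # descending into any .svn directory.
          # For a path named .svn, -prune stops the descent and evaluates true,
          # so the right-hand side of -o is skipped. For every other path the
          # left side is false and -type f -print is evaluated instead.
          files_outside_svn() {
              find "$1" -name .svn -prune -o -type f -print
          }
          ```

          The explicit -print matters: without it, find would apply an implicit -print to the whole expression and list the pruned .svn directories too.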

          1. 2

            The pure Plan 9 version of that by Nemo is further down the thread: http://marc.info/?l=9fans&m=111558860717279&w=2

            1. 2

              Sor (Stream OR, get it?) is an rc script that reads a set of filenames from its input, and applies a set of tests to them, echoing those names that pass a test, discarding the rest.

              I’m not too familiar with rc, but it looks like the script executes the given command once for each input line. I couldn’t get the script to run here, so I used an equivalent sh script instead to run some quick tests on a Subversion checkout of the FreeBSD ports tree. The times given below are the ones the commands converged towards after numerous runs.

              First, only find is used to list all the regular files inside the ports tree.

              % env time find /usr/ports -type f -print > /dev/null
                      1.86 real         0.12 user         1.74 sys

              It does all the work with a single process and performs the fastest.

              The second command is the sh version of the rc script. I would guess the rc version would perform similarly, since it’s doing the same thing.

              % env time find /usr/ports | sh -c 'while IFS= read -r file; do if "$@" "$file"; then echo "$file"; fi; done' '' "`command which test`" -f > /dev/null
                    181.15 real         0.10 user         1.06 sys

              It’s about 100 times slower than the find equivalent.

              “It isn’t the fastest thing in the world, but it works, and seems to work pretty well.”

              Not “the fastest thing in the world” is quite an understatement. Spawning a new process for every input line is probably not very efficient. We can test this hypothesis by forcing find to do the same.

              % env time find /usr/ports -exec test -f '{}' ';' -print > /dev/null
                    221.08 real        61.76 user       160.86 sys

              And indeed the performance is not too different from the previous command. So the excellent composability of the sh (and presumably rc) variant comes at a hefty price.

              Finally, there’s stest, which comes with dmenu.

              % env time find /usr/ports | stest -f > /dev/null
                      1.88 real         0.10 user         1.03 sys

              It’s basically as fast as the regular find version and demonstrates that it’s possible to move the application of the predicates to the filenames into a separate command in the pipeline while maintaining good performance. It nicely avoids naively spawning a new process for every input line.
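
              The same separation can be sketched in plain sh by using the shell’s built-in test instead of an external command, so that no process is spawned per line. This is only an illustration of the idea, not how stest is implemented, and the function name keep_regular is hypothetical. It was not timed here.

              ```shell
              # keep_regular: read one file name per line on stdin and echo only
              # those that name regular files. Uses the shell's built-in [ so the
              # loop does not fork for each input line.
              keep_regular() {
                  while IFS= read -r file; do
                      if [ -f "$file" ]; then
                          printf '%s\n' "$file"
                      fi
                  done
              }
              ```

              It would slot into the pipeline in the same position, for example `find /usr/ports | keep_regular`.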