1. 87

  2. 20

    Man… I’ve reached the point where techniques I promoted are now historical questions.

    1. 1

      Man, I felt old because I remember doing this 20 years ago, and the answer was obvious to me :)

    2. 7

      I have wondered this for so long!

      1. 3

        I have a faint memory that one of the really, really old CVEs was for this. A privileged script that expected its arguments to be file names, I think, and could be made to do Wrong Things by passing in shell metacharacters or -.

        1. 10

          People (rightly) talk a lot about the security implications of software being written in memory-unsafe languages like C. But, while there’s a lot of talk about the gotchas in shell programming, there’s shockingly little talk about the security implications of having so much software written in a language where it’s almost impossible to handle files correctly and almost impossible to handle strings correctly.

          1. 7

            I think that if a language’s safest way to compare strings uses syntax as weird as "x$a" = "x$b", that’s a warning sign with letters big enough to read from space.

            The sign reads “do not use this language to write code to be run by strangers”, and if there’s a footnote, the footnote says “this is a fine language to write a plugin for git bisect and one-offs of that kind, but not for writing things to be run by strangers, okay? you understand now?”

            1. 3

              Yeah, I agree. Another interesting challenge is to iterate over all the files in a directory, correctly, and without doing the actual loop in a subprocess; it’s probably possible, but most people’s first 3 attempts will probably be dangerously broken.

              I agree that shell is a fine language for tiny one-off things, but I believe it’s being used way more broadly than that. I just ran a find /usr -name '*.sh' | xargs cloc [1] on my system, and found 114501 lines of shell across 1050 files. I don’t have a great level of confidence that none of those shell scripts ever consume untrusted input.

              [1] As an interesting footnote, find /usr -name '*.sh' | xargs cloc is incorrect; it won’t work if files contain whitespace, quotes or backslashes. The “correct” (AFAIK) command would be find /usr -name '*.sh' -print0 | xargs -0 cloc. I sure hope nobody has ever made that mistake in a context where files may contain “weird” characters! Also, that approach only works for iterating over a relatively small number of files; how many is dependent on your particular POSIX system!
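
              For what it’s worth, here is a minimal sketch of the in-process loop, assuming a POSIX sh. The helper names count_files and sh_line_counts are made up for illustration. The first lets the shell’s own glob expansion produce one word per file name, so nothing is ever re-split; the second hands names from find straight to the command with -exec … {} +, avoiding the pipe entirely.

              ```shell
              # Hypothetical helper: count regular files directly inside directory $1.
              # Each glob match arrives as a single word, so whitespace, quotes and
              # backslashes in names are never re-parsed by the shell.
              count_files() {
                  n=0
                  # The second and third patterns pick up dotfiles, skipping . and ..
                  for f in "$1"/* "$1"/.[!.]* "$1"/..?*; do
                      # An unmatched glob is left as the literal pattern; skip it.
                      [ -e "$f" ] || [ -L "$f" ] || continue
                      if [ -f "$f" ]; then
                          n=$((n + 1))
                      fi
                  done
                  printf '%s\n' "$n"
              }

              # Alternative when recursion is needed: find passes the names to the
              # command as arguments, with no pipe and no re-parsing of names.
              # Like xargs, find may split a very long list over several invocations.
              sh_line_counts() {
                  find "$1" -type f -name '*.sh' -exec wc -l {} +
              }
              ```

              Calling count_files /some/dir prints the count; note that n and f are plain globals here, since POSIX sh has no local variables.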

              1. 4

                It also ignores the staggering number of shell scripts that are involved in the init process that aren’t named *.sh!

            2. 3

              Bash is like glue, only a fool would make a whole building out of it.

        2. 3

          I thought it was to prevent unwanted behavior in test when $var is not set and/or to test if $var is set.

          1. 4

            That was almost the context in which I first encountered the idiom: when $var expands to the empty string, whether because unset or because explicitly empty.

            Running [ "$foo" = "needle" ] would somehow lose the empty parameter to the left of the = even though it’s explicitly still there as an argv item, so [ "x$foo" = "xneedle" ] was needed.

            I think I tend to use the x form when writing single-square brackets tests in portable shell, but skip it when using the conditional expressions [[ ... ]] feature of bash/zsh.
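
            A minimal sketch of the x form, assuming a POSIX sh; is_needle is a made-up helper name. The loop feeds it the kinds of values that used to confuse old test implementations: the empty string and strings that look like operators.

            ```shell
            # Hypothetical helper: succeeds when $1 equals "needle".
            # The leading x guarantees that neither operand is empty and that
            # neither can be mistaken for an operator such as -f, "(" or "!".
            is_needle() {
                [ "x$1" = "xneedle" ]
            }

            for candidate in "" "-f" "(" "!" "needle"; do
                if is_needle "$candidate"; then
                    printf 'match:    [%s]\n' "$candidate"
                else
                    printf 'no match: [%s]\n' "$candidate"
                fi
            done
            ```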

            1. 4

              As far as I’m concerned this is the only correct answer. If you didn’t use “x$foo” you could get an error about one side of the comparison being empty.

              Thankfully test got smarter, but for those of us who’ve been around the block a few times old habits die hard.

          2. 3

            Related: Problems With the test Builtin: What Does -a Mean?

            The POSIX spec did indeed make things cleaner; the last section quotes it and gives some style advice.

            1. 2

              However, test and [ were also available as separate executables, and appear to have retained a variant of the buggy behavior:

              This statement, and the argument supporting it, are incomplete.

              When you call an executable test or [, any shell has to expand the parameters of the form $var. The shell cannot do any magic special treatment here. If you call [ "$var" = -f ] with var=-f, when [ is not builtin, the executable [ (or test) is going to see a list of four command line arguments, -f, =, -f, and ]. In this case, how does the executable tell if you want the unary test of -f or the binary test of =? POSIX worked around this issue by specifying that if it sees three arguments (excluding the final ]), it should do the binary test, and if it only sees two arguments, it should do the unary test.

              Of course, the above is still not complete. Once you start using (, ), -a, and -o, the POSIX specification of choosing binary/unary by counting arguments breaks down.

              Reference: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html

              For those who hate shell scripting because of this, it’s not really the shell’s fault. Once you ask a standalone executable to do something this complex using only its (int argc, char *argv[]), the argument list simply doesn’t carry enough information to disambiguate, given what POSIX specifies plus -a and -o. Prefixing x to your $var is really a simple and effective way of quoting your variables, similar to \ escaping in most other string handling cases.
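
              The counting rule described above can be sketched as follows. This is only an illustration, assuming a POSIX-conforming test; the variable names three_args and two_args are made up for the example. It runs whichever test the shell resolves (usually the builtin), though env test … should apply the same rule with the standalone executable.

              ```shell
              var=-f

              # Three arguments: the middle one is checked as a binary operator
              # first, so this is a string comparison and not a -f file test.
              if test "$var" = -f; then
                  three_args=equal
              else
                  three_args=different
              fi

              # Two arguments: the first is taken as a unary operator, so the
              # second is just an operand. Here -n checks that the string "-f"
              # is non-empty; the -f is never interpreted as an operator.
              if test -n "$var"; then
                  two_args=nonempty
              else
                  two_args=empty
              fi

              printf 'three arguments: %s\n' "$three_args"
              printf 'two arguments:   %s\n' "$two_args"
              ```

              With more arguments, for example two comparisons joined by -a, the count alone no longer tells the operands from the operators, which is where values such as ( or ! in a variable can change the parse.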

              1. 1

                An alternative recommendation (that still avoids “x$var”) is to use the more general && and ||, which work with any command, instead of -a and -o, which only work for test/[. I also prefer to spell the command test rather than [, because that name is honest about being a command, making it less tempting to fall into the trap of using -a, -o and negation.

                if test "$var1" = val && ! test "$var2" = val2; then
                    …
                fi
                

                It is also a sidetrack whether test/[ are builtins. That doesn’t matter: They are builtins, yet they are equally broken, for the same reasons, because they are commands, not syntax. That’s what matters.

                > type [
                [ is a shell builtin
                > type [[
                [[ is a shell keyword
                
              2. 2

                The conclusion says “The last one managed to stay until 2015, but only in the very specific case of comparing opening parenthesis to a closed parenthesis in one specific non-system shell.” but then the epilogue says “You can still see this on macOS bash today”. I get that macOS’s /bin/sh is probably pdksh or ash or something, but there are enough scripts that hard-code bash that I’d worry about tickling this issue on macOS. Even though newer versions of bash are fixed, macOS is never going to update, so this issue will probably be around until macOS stops shipping bash entirely, or people stop trying to use it as a Unix workstation.