I assume it is a feature that they would like to remove but have kept around to ensure backwards compatibility:
On Linux, argv and envp can be specified as NULL. In both cases, this has the same effect as specifying the argument as a pointer to a list containing a single null pointer. Do not take advantage of this nonstandard and nonportable misfeature! On many other UNIX systems, specifying argv as NULL will result in an error (EFAULT). Some other UNIX systems treat the envp==NULL case the same as Linux.
I believe this is a red herring. It’s talking about e.g. execv("foo", NULL), which Linux will translate to execv("foo", (char *[]){ NULL }). The latter invocation seems to be correct by all the relevant standards, as far as I can tell, and is sufficient to exploit pkexec.
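For anyone who wants to poke at this, here is a minimal sketch of the second form in C. The helper name run_with_empty_argv is made up for illustration, and the 127 exit code for a failed exec is just the shell convention, borrowed here as an assumption. It forks, then calls execv with a list that holds only the terminating null pointer, so the child gets no argv[0] from us. As far as I know, newer Linux kernels substitute a single empty string in this situation, so the child may see argc == 1 with an empty argv[0] rather than argc == 0.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from any library): fork, then exec `path` with an
 * argument list that contains only the terminating null pointer.
 * Returns the child's exit status, 127 if the exec itself failed, or -1 on
 * fork/wait errors or abnormal termination. */
int run_with_empty_argv(const char *path)
{
    assert(path != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Same shape as execv("foo", (char *[]){ NULL }). */
        char *empty_argv[] = { NULL };
        execv(path, empty_argv);
        _exit(127); /* only reached if execv failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Pointing it at a small test program that prints argc is the easiest way to see what your kernel actually hands to the child.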
I’ve spent too much of today arguing about this already, but for my money it’s intuitively reasonable that a list of arbitrary length can be empty. If I’d been designing this API, I’d have taken quite a lot of convincing to impose a minimum size of parameter list. Sure, argv[0] is supposed to be our program name, but since we can’t assume our caller was honest enough to fill it correctly, why are we happy assuming it was filled at all?
Sure, argv[0] is supposed to be our program name, but since we can’t assume our caller was honest enough to fill it correctly
Well, keep in mind a single binary can have multiple possible correct values for argv[0]. busybox is one example.
I personally always check the argc == 0 case, but I’m persuaded that it would be good for it to be impossible.
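A rough sketch of what such a check could look like, combined with the busybox-style habit of dispatching on the last path component of argv[0]. The function name invocation_name and the fallback parameter are hypothetical, and treating an empty argv[0] the same as a missing one is my own assumption, not something any standard asks for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: pick the name this process was invoked under without
 * assuming the caller supplied argv[0] at all. Returns `fallback` when argv
 * is empty, null, or starts with a null or empty string; otherwise returns
 * the part of argv[0] after the last '/'. */
const char *invocation_name(int argc, char *const argv[], const char *fallback)
{
    assert(fallback != NULL);

    if (argc <= 0 || argv == NULL || argv[0] == NULL || argv[0][0] == '\0')
        return fallback;

    const char *slash = strrchr(argv[0], '/');
    return slash != NULL ? slash + 1 : argv[0];
}
```

A multi-call binary would compare the result against its applet names, and anything else would just use it in error messages.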
I think of the “argc must be > 0” requirement as equivalent to “we have a new argument to main/execve called char *progname, and now argv[0] is your first argument instead of argv[1]”.
we can’t assume our caller was honest enough to fill it correctly
It’s not that obvious that the caller is actually another program. If you don’t think too hard, this argv[0] thing is “always there” right in the program entry point — so that sure feels like something coming from “the system”!
Note that ANSI C didn’t require that argc be greater than zero; all the rules about argv interpretation are predicated upon argc > 0. It’s Unix (POSIX) which adds requirements that argc be greater.
IIRC, on the Amiga if argc was non-zero you were in a CLI and if argc was 0 then you were launched from Intuition, the GUI, and should use a library call to get the relevant data needed. That was “a neat hack”.
Note that CheriABI would prevent these problems. With the SysV ABI, the stack is pre-populated with three things:
The ELF auxiliary arguments.
The arguments.
The environment variables.
Startup code is expected to walk up the stack to identify these points. With CheriABI, walking the stack like this isn’t great for memory safety and so the kernel passes pointers to all three of these in the first three argument slots (registers on most architectures). On a CHERI system, the attempt to access argv[-1] would then trap as an out-of-bounds read, even on a zero-length argv.
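To make the stack walk concrete, here is a simplified sketch in C that works on a simulated stack image rather than the real one. The struct startup_info and the function walk_initial_stack are hypothetical names, and the layout assumed here (argc word, argv pointers, null, envp pointers, null, then the auxiliary entries) is my reading of the SysV convention, not a full implementation of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what startup code recovers from the stack. */
struct startup_info {
    long argc;
    char **argv;
    char **envp;
    char **auxv; /* start of the auxiliary entries, left uninterpreted here */
};

/* Walk a simulated SysV-style initial stack image, assumed to be laid out as
 * argc, argv[0..argc-1], NULL, envp[0..], NULL, auxiliary entries. */
struct startup_info walk_initial_stack(char **sp)
{
    assert(sp != NULL);

    struct startup_info info;
    info.argc = (long)(intptr_t)sp[0];     /* first word holds argc */
    info.argv = sp + 1;
    info.envp = info.argv + info.argc + 1; /* skip argv's null terminator */

    char **p = info.envp;
    while (*p != NULL)                     /* find the end of envp */
        p++;
    info.auxv = p + 1;
    return info;
}
```

With argc == 0 in this layout, argv[0] is the terminating null and argv[1] is the same slot as envp[0], because nothing but adjacency separates the two arrays. Bounded pointers passed separately by the kernel would make that kind of overrun trap instead.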
Huh, Linux actually documents supporting null argv[0] as a feature? Interesting.
The “not thinking too hard” about environmental assumptions is why we have had interesting bugs in the past like:
I mean it’s certainly useful if you ever need to run something as root, but don’t have the necessary permissions :D