The issue with double dash (--) is one “string safety” problem I’ve meant to address on my blog, but haven’t yet.
I’m not sure I’m following the argument in this post though.
I don’t see the need to distinguish between paths and strings.
I do see the need to distinguish between args and flags, which is what the -- convention does.
As far as serializing arrays, it seems like the NUL convention of find -print0 and xargs -0 is (surprisingly) sufficient? I didn’t quite see that until writing the two Git Log in HTML posts.
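To make the NUL convention concrete, here is a minimal bash sketch of round-tripping an array through NUL-terminated records. The helper names `serialize` and `deserialize` and the global array `items` are made up for illustration; only `printf` and `read -d ''` are real bash features.

```shell
#!/usr/bin/env bash
# Serialize an array as NUL-terminated records, then read it back.
# NUL is the one byte that cannot appear in a filename or an argv string.

serialize() {  # hypothetical helper
  # Emit nothing for an empty list; printf would otherwise emit one empty record.
  [ "$#" -eq 0 ] || printf '%s\0' "$@"
}

deserialize() {  # hypothetical helper
  # Fill the global array 'items' from NUL-terminated records on stdin.
  items=()
  local rec
  while IFS= read -r -d '' rec; do
    items+=("$rec")
  done
}

# Spaces, newlines, leading dashes and glob characters all survive the trip.
original=('two words' $'line\nbreak' '-rf' '*.c')
deserialize < <(serialize "${original[@]}")
printf 'round-tripped %d items\n' "${#items[@]}"
```

The same record format is what `find -print0` writes and `xargs -0` reads, so a pipeline between those two tools carries a list of arbitrary filenames without any quoting.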
I think there does need to be some kind of lint check or function wrapper for --, like flagging the former but not the latter (somehow):
grep foo $file # oops, $file could be a flag, and both GNU and OS X grep accept it
grep -- foo $file # this is more correct
mygrep() { # mygrep only accepts args
grep -E --color -- "$@"
}
# However, sometimes you do want the user to be able to append flags, so this function is useful too
mygrep2() {
grep -E --color "$@"
}
mygrep2 -v pattern *.c # invert match
The rule I have for Oil is to avoid “boiling the ocean” – i.e. there can’t be some new protocol that every single command has to obey, because that will never happen. Heterogeneity is a feature.
However there should be some recommended “modern” style, and I haven’t quite figured it out for the flag/arg issue.
I think one place Oil will help is that you can actually write a decent flags parser in it. So maybe the shell can allow for wrappers for grep/ls/dd etc. that actually parse flags and then reserialize them to “canonical form” like:
command @flags -- @args # in Oil syntax
command "${flags[@]}" -- "${args[@]}" # in bash syntax
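As a rough idea of what such a wrapper could look like in plain bash, here is a sketch that splits its arguments into flags and args and then re-emits them in the canonical form above. `split_flags` and `mygrep3` are hypothetical names, and the sketch assumes every flag is a single word starting with `-`; flags that take a separate value (like `grep -e PATTERN`) would need a real per-command flag table.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split "$@" into the global arrays 'flags' and 'args'.
# Assumption: each flag is one word starting with '-'; flags that take a
# separate value are NOT handled here.
split_flags() {
  flags=() args=()
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --)  shift; args+=("$@"); return ;;  # everything after -- is an arg
      -?*) flags+=("$1") ;;                # looks like a flag
      *)   args+=("$1") ;;                 # plain arg (a lone '-' too)
    esac
    shift
  done
}

# Hypothetical wrapper: re-serialize the call as  command flags -- args
mygrep3() {
  split_flags "$@"
  grep -E "${flags[@]}" -- "${args[@]}"
}

printf 'foo\nbar\n' | mygrep3 -v foo   # prints: bar
```

Note that this does not remove the ambiguity, it only moves it: a caller who wants to pass a file named `-weird.c` still has to write `mygrep3 pattern -- -weird.c`, but everything downstream of the wrapper sees an unambiguous command line.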
I addressed quoting and array types in various places, including this post:
I don’t see the need to distinguish between paths and strings.
They’re different types: a path is a string which is or could be the pathname of a file or directory on disk. Since they’re different types, treating them differently yields all the standard benefits of strong typing.
One thing you might find interesting is IBM i’s command language. Arguments to commands are named and typed, and one very interesting property is that you can hit the F4 key and pop up a form with the command’s available options; and prompt further into those, with pop-up help. It then creates the command line to run. This series of toots is probably the best quick primer on why it’s interesting.
Every time someone mentions VMS, AS/400, AIX, HP-UX I wonder if there is any way to goof around with these systems to see for myself how they differ from the current mainstream servers.
There’s an outfit called VMS Software that claims to be working on an x86_64 OpenVMS port. According to this roadmap document from 2017 they expect to have OpenVMS 9.2 ready for production on x86_64 sometime in 2020. It’ll still be proprietary, of course.
If you want to get a feel for an advanced shell with non-text streams, dive into Powershell. It runs on Linux these days, and it’s got a considerable ecosystem of snippets, articles, etc out there - more than Fish or the other alternative shells.
a fusion of scripting and programming; many commands are functions
Quick suggestion on this one: put a link to your first article in the sentence where you say you previously covered quotation. That way everyone landing on this one first can get to it in a click.
I’m not sure the blame falls entirely on the shell on this one. The shell has no way to signal to an external program what the type of an argument is. You would need some kind of exec() with typed arguments for that, or as the author suggests some type of in-band convention, but for that you need apps to support it.
For critical commands like rm/ls and shell expansion (the example of a file called -ls), can’t this be solved if those commands are builtins in the shell? If ls is a builtin and its arguments are typed, then it could avoid that corner case.
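For anyone who hasn't run into the -ls corner case, here is a small bash sketch of it. `classify` is a made-up stand-in for an external command that decides flag versus argument by looking at the first character, which is all a real program can do with an untyped argv. The `./` prefix shown at the end is a common workaround today, not something proposed in this thread.

```shell
#!/usr/bin/env bash
# Show how a file named "-ls" changes what a glob-expanded command sees.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch -- -ls notes.txt

# Hypothetical stand-in for an external command: report how each word is read.
classify() {
  local w
  for w in "$@"; do
    case "$w" in
      -*) printf 'flag: %s\n' "$w" ;;
      *)  printf 'arg:  %s\n' "$w" ;;
    esac
  done
}

classify *      # the shell passes "-ls" as-is, so it looks like a flag
classify ./*    # "./-ls" can no longer be mistaken for a flag
```

The shell knows that both words came from a glob and are therefore paths, but that knowledge is lost at the exec() boundary, which is the point being made above.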
I’m not sure the blame falls entirely on the shell on this one
Absolutely. This convention would go across boundaries and mean changes to both the shell and the command line tools. Perhaps we could go further and embed metadata about unix programs in the ELF binary like a computer parseable description of the args and their types.
The ELF metadata would be neat.
As an alternative, I think the fish shell grabs some information on command argument completion from the manpages. Maybe you could do something similar to detect compliant commands and cache a table of commands somewhere.
For all this trouble, we gain substantial semantic simplifications. There is no need for the distinction between $* and $@. There is no need for four types of quotation, nor the extremely complicated rules that govern them. In rc you use quotation marks when you want a syntax character to appear in an argument, or an argument that is the empty string, and at no other time. IFS is no longer used, except in the one case where it was indispensable: converting command output into argument lists during command substitution.
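For readers who don't have the $* versus $@ distinction in their head, this is roughly what it looks like in bash or any POSIX shell. `count` and `demo` are throwaway names for the illustration.

```shell
#!/usr/bin/env bash
# Print how many words a command actually receives.
count() { echo "$#"; }

demo() {
  count "$@"    # each parameter stays one word
  count "$*"    # all parameters joined into a single word
  count $@      # unquoted, so IFS splitting breaks up 'a b'
  count $*      # same as unquoted $@
}

# Two parameters, the first containing a space.
demo 'a b' c    # prints 2, 1, 3, 3 on separate lines
```

Only the quoted `"$@"` form preserves the argument list, which is why it shows up in every wrapper function in this thread. In rc, lists are a real data type, so there is only one way to expand them.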
Great article! I’d love it if you’d expand on some of these ideas. Can you sketch out what the new world order of typed differentiation between commands, arguments, and flags would actually look like?
Or maybe shells were pretty much built to be primitive User interfaces and not serious “typed” languages. Stop trying to see shells as “scripting” and a lot of your problems pretty much become secondary.
Seriously, just use a REAL scripting language, then you can readdir() or whatever without having to depend on ls and all the weird ways you can invoke it.
While I understand your argument, you should consider that we use the term “script” instead of, say, “command” exactly because scripts were originally a sequence of shell commands saved in a file for future convenience.
So there is a continuum between the glue provided by a shell and a full interpreted programming language.
Shells were designed to glue small programs providing specific features into larger ones.
Interpreted languages are designed to write larger programs in the first place, by composing the available libraries that provide the specific features.
I addressed quoting and array types in various places, including this post:
Thirteen Incorrect Ways and Two Awkward Ways to Use Arrays
There are also related string safety problems here – is -a an argument or an operator to test?
Problems With the test Builtin: What Does -a Mean?
There’s a hobbyist program for OpenVMS to get licenses. Gotta get an emulator or used hardware on eBay from there. I’m not sure about the rest.
If you want to get a feel for an advanced shell with non-text streams, dive into Powershell. It runs on Linux these days, and it’s got a considerable ecosystem of snippets, articles, etc out there - more than Fish or the other alternative shells.
You don’t have to dream about this stuff. It’s got that Microsoft flavor, but it’s available today.
Welcome back, buddy! Missed ya!
Quick suggestion on this one: put a link to your first article in the sentence where you say you previously covered quotation. That way everyone landing on this one first can get to it in a click.
Thanks nick, that’s a good idea. I have added a link like you suggested.
http://doc.cat-v.org/plan_9/4th_edition/papers/rc
Great article! I’d love it if you’d expand on some of these ideas. Can you sketch out what the new world order of typed differentiation between commands, arguments, and flags would actually look like?
Maybe a follow on post?