Lately I find myself writing fewer and fewer programs, because I can do a lot in the shell, and it’s usually a couple of lines at worst. Saves me time, saves me work… UNIX as an IDE is incredibly satisfying to use.
Someone shared with me a lovely Unix koan:
Master Foo once said to a visiting programmer: “There is more Unix-nature in one line of shell script than there is in ten thousand lines of C.”
The programmer, who was very proud of his mastery of C, said: “How can this be? C is the language in which the very kernel of Unix is implemented!”
Master Foo replied: “That is so. Nevertheless, there is more Unix-nature in one line of shell script than there is in ten thousand lines of C.”
The programmer grew distressed. “But through the C language we experience the enlightenment of the Patriarch Ritchie! We become as one with the operating system and the machine, reaping matchless performance!”
Master Foo replied: “All that you say is true. But there is still more Unix-nature in one line of shell script than there is in ten thousand lines of C.”
The programmer scoffed at Master Foo and rose to depart. But Master Foo nodded to his student Nubi, who wrote a line of shell script on a nearby whiteboard, and said: “Master programmer, consider this pipeline. Implemented in pure C, would it not span ten thousand lines?”
The programmer muttered through his beard, contemplating what Nubi had written. Finally he agreed that it was so.
“And how many hours would you require to implement and debug that C program?” asked Nubi.
“Many,” admitted the visiting programmer. “But only a fool would spend the time to do that when so many more worthy tasks await him.”
“And who better understands the Unix-nature?” Master Foo asked. “Is it he who writes the ten thousand lines, or he who, perceiving the emptiness of the task, gains merit by not coding?”
Upon hearing this, the programmer was enlightened.
I’ve been going in the other direction. I now use libraries like lens and lens-aeson inside of ghci more and more often. These provide a consistent API for whatever data I’m working with.
I can work with YAML, TOML, JSON, CSV, XML, etc. with the exact same functions. No need to learn separate tools like jq, yq, xmlstarlet, etc.
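For a flavour of it, a minimal ghci sketch (the JSON value here is illustrative; it assumes the lens and lens-aeson packages are installed):

ghci> :set -XOverloadedStrings
ghci> import Control.Lens
ghci> import Data.Aeson.Lens
ghci> "{\"user\": {\"name\": \"Tom\", \"repos\": 30}}" ^? key "user" . key "name" . _String
Just "Tom"
ghci> "{\"user\": {\"name\": \"Tom\", \"repos\": 30}}" ^? key "user" . key "repos" . _Integer
Just 30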
This is my wish too. From time to time I reach for ghci, but I have not been able to put together a toolbox that would let me replace the shell for daily operations (navigating directories, copying, etc., good history). But learning Haskell is a journey in itself. Maybe I should start refactoring some of my shell scripts in Haskell to get more used to it?
What libraries can you suggest for mundane unix tasks, like navigating directories, operating on files, and using other processes? I tried some in past years, but none convinced me.
I usually just use the directory, process, etc. packages. They’re a bit non-ergonomic for a shell, but I’m familiar enough with them.
I’ve also used the Turtle package; it’s probably a better tool for the things I do, and I should probably use it more often.
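For what it’s worth, a small sketch of that style using just directory and process (the file names are made up):

import System.Directory (copyFile, listDirectory)
import System.Process (callProcess)

main :: IO ()
main = do
  files <- listDirectory "."                      -- roughly `ls`
  mapM_ putStrLn files
  copyFile "notes.txt" "notes.bak"                -- roughly `cp notes.txt notes.bak`
  callProcess "grep" ["-i", "todo", "notes.txt"]  -- run another process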
What I’ve observed is that sometimes these shell pipelines are only executed once, and then writing them saves a lot of time. But sometimes someone wants to run them again. Or even a few times. Occasionally, someone puts them in a crontab somewhere to be run repeatedly until further notice. And then maybe, just maybe, they become part of a production system and they are supposed to run in various environments.
At some point along this chain of events, it would have been easier from a maintenance perspective to just write the program to begin with. But I’m still terrible at judging ahead of time whether my shell pipelines are truly one-off tasks, or if they’re going to grow into a part of the production system.
“But why don’t you just write the shell pipeline first, and then port it to a proper program at some point in the future?” I hear you ask. That’s a perfectly valid solution. It’s also a really boring task to translate a shell pipeline into a program – all the fun of the implementation design work has already been done, and what’s left is a grudging mechanical task of translation. So I tend to put it off…
If you truly believe shell pipelines do not have a place in your production system, go talk to one of your systems administrators/ops/SREs or whatever you call them - your eyes will be opened ;-)
Alternative approach for you to consider: why is a shell pipeline not a program?
But, would it?
What stops a well-written pipeline from being run again, by a cron or any other program?
True, I missed some important context in my comment: I am the only person in the company who is anywhere near capable of writing something that approaches a well-written pipeline. Now that I write it out, though, I guess there’s an argument that training everyone else in shell scripting would be the more efficient long-term strategy.
If you like awk and sed (and other unix tools designed to be inserted into a pipeline), the best tool for json isn’t jq, it’s gron (https://github.com/TomNomNom/gron). gron turns tree-structured json into the line-structured data that unix tools expect, and allows you to search and edit it using familiar tools like sed, rather than having to learn jq’s mini-language (which I have to look up every time I need it). It creates a one-per-line entry for each field, with a dot-separated hierarchical key.
gron -u undoes gron-formatted data back into json, so you can make pipelines like <json source> | gron | <commands in pipeline> | gron -u.
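For example, a sketch of the round trip (the JSON here is purely illustrative):

$ echo '{"user": {"name": "Tom", "repos": 30}}' | gron
json = {};
json.user = {};
json.user.name = "Tom";
json.user.repos = 30;
$ echo '{"user": {"name": "Tom", "repos": 30}}' | gron | sed 's/Tom/Bob/' | gron -u
{
  "user": {
    "name": "Bob",
    "repos": 30
  }
}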
I do prefer it over jq, but the = and ; in the output are annoying.
I thought that too; there’s already some discussion about it.
The comment field there doesn’t permit editing and correcting typos…
So let me try again here…
In a galaxy far, far away…
Larry Wall wondered why he needed to learn three pretty bad languages (sh, awk, sed, …) and devised Perl as the Grand Unifying Language.
Perl sadly borrowed too much from its inspirations, and wasn’t much more readable.
Then Matz came along and resolved to borrow the best from Perl and Scheme and more, and make something more powerful than them all, yet more readable.
It’s called Ruby.
And yes, you can do everything in Ruby, in one line if you must, that you can do in bash, awk, sed, jq, Perl… but in a more powerful and maintainable form.
All this has been available for decades, why are we (still) bashing (pun intended) our heads against the Lowest Common Denominator?
Serious question: what does “doing some awk in Ruby” look like? This might be a pretty big motivator for me to finally figure out Ruby for scripting. (I’m more of a Python guy myself, but awk works nicely for small scripts on line-oriented stuff when I want a one-liner.)
Relevant links:
https://nithinbekal.com/posts/ruby-sed-awk/
https://tomayko.com/blog/2011/awkward-ruby (ruby’s awk history)
Compare:
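Roughly, a pipeline of this shape, with re and glob standing in for a pattern and a file glob (the stages after grep are a sketch based on the replies below):

grep -i ^re glob | cut -d' ' -f2 | sort | head -n 10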
Versus Ruby:
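And a Ruby version along these lines (again a sketch; only the Dir[glob] stage is quoted verbatim in a reply below):

Dir[glob].map do |f|
  File.open(f)
      .readlines
      .select { |l| l[0..1].downcase == re }
end.flatten
   .map { |line| line.split(' ')[1] }  # like cut -d' ' -f2
   .sort
   .take(10)                           # like head
   .each { |line| puts line }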
Simple example, but I think it demonstrates that doing various basic and common tasks are quite a bit more complex to do in Ruby than in the shell.
That doesn’t mean I’m always in favour of shell scripts – I got that example from an article I wrote saying you shouldn’t use shell scripts – but there are definitely reasons shell scripting persists, even though we have things like Perl and Ruby.
In that article I wrote “I regret writing most shell scripts [..] and my 2018 new year’s pledge will be to not write any more”. I’ve mostly failed at that new year’s pledge, and have happily continued shelling about. I have started rewriting shell script prototypes in other languages at the first sign of them getting hairy, though, and that seems like a middle ground that is working well for me (I should update/amend that article).
To be fair, it looks like most of the additional complexity in the Ruby code comes from reading files: the first command in the pipeline, grep -i ^re glob, is what becomes

Dir[glob].map do |f|
  File.open(f)
      .readlines
      .select { |l| l[0..1].downcase == re }
end.flatten

The rest of the script contributes very little to the Ruby code.
I suspect this is a recurring theme when trying to replace shell pipelines with programs. Only Perl avoids some of this additional complexity for reading files, I think.
At least with Ruby I don’t have to constantly cross-reference the man page and my cargo-culted knowledge of Unix’s multitude of text-manipulation DSLs, each unlike the others. It’s pretty obvious what the code is doing.
Actually you used very little shell there in your first example.
You also used grep, cut, sort and head.
Why do you assume the backtick operator and the | operator for IO don’t exist in Ruby? In fact, why do people assume shell and jq do not exist if you use Ruby?
Personally, I tend to reduce the number of tools involved, to reduce the cognitive load of needing to understand each tool in order to understand the one-liner.
I balance that against considerations like the fact that IO.read("|sort -u fileName") can be a huge performance boost.
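For instance, a sketch (the file name is illustrative):

sorted = `sort -u names.txt`            # backticks capture the command's stdout
IO.popen("sort -u names.txt") do |io|   # or stream it as pipe I/O
  io.each_line { |line| print line }
end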
Anyhoo… some examples of Ruby one-liners:
http://reference.jumpingmonkey.org/programming_languages/ruby/ruby-one-liners.html
Because code in sed or awk that worked a decade ago (or, hell, two years ago) still works. Ruby code seems to bit-rot faster than any other language I’ve used for nontrivial work.
Also, with awk, I could put it down for a year, then use it again, and everything I’d need to be productive fits in a small man page. (The same seems to be true of sed, though I don’t use it much.) The Ruby ecosystem moves a lot faster, and if you haven’t been following it closely, catching up will add extra work. (Whether it’s actually going anywhere is neither here nor there.)
Yes, awk is a more limited language, but that’s a trade-off – there are upsides, and I know which I’d prefer.
Not true.
The awk scripts I wrote decades ago were written in Solaris awk, which is not quite the same thing as GNU awk.
Well thought out growth in a language is good.
I find the maintenance burden of rolling Ruby forward with language versions is very low.
Doubly so since rubocop will often autocorrect stuff.
I don’t know Ruby. But for me these are the reasons why I am writing more and more bash programs:
Bash is my command line. So I am doing a lot of small steps: file modifications, comparing, searching, analysing. At some point I can see that some of the steps can be composed, so I pull them out of the history, try them out on the console, and eventually put them into a script. If Ruby had a REPL in which I could do all the operations that I do on the command line, with less typing and more comfort, I would maybe give it a try.
Bash is on every Linux box. Ruby is not.
Ruby does have a REPL. It’s called IRB and it comes with every Ruby installation. I use it exactly as you describe, for composing small programs iteratively.
Are you using the Ruby REPL as your daily all-time console, or just when you have a program in mind? I am asking honestly, because I do not know anything about Ruby or its REPL, and I am quite interested in how good it is as a replacement for daily shell life.
My point is that shell scripts are a by-product of using the shell for manual tasks. I get better and better at my shell usage, and even after 20 years I am still discovering new features or more efficient ways to do things. The shell language is really ugly, but it is very succinct, and together with the composition of unix commands, the history, the prompt customization, and the possibility of vi mode for editing (I have probably forgotten a lot of features), all this makes the shell a very efficient tool.
Well, no, not as my daily shell. I dislike shell scripting enough that I switch to Ruby pretty quickly if I’m having to spend any amount of time or effort on a task, but it’s not meant to be a replacement for bash/zsh/fish.
Let’s not limit ourselves here. For those not using Bash and/or Linux, how about this: $SHELL is on every Unix box. Ruby is not.
So is ed.
However, sudo apt install ruby solves that problem.
And yes, ruby does have a REPL.
apt: command not found.
sudo: permission denied
$
Have fun with ed then, it’s the Standard!
https://www.gnu.org/fun/jokes/ed-msg.html
I have written scripts in ed before to do some sufficiently tricky text manipulation. It’s a good tool.
Mostly, because picking up enough jq, awk and sed to be useful is faster than learning the ins and outs of Ruby?
I suppose you could make a similar argument about learning Ruby one-liners, but by the time I’m writing a very long bash script, I’m probably writing a larger program anyway, either in Go or Python. Ruby as a language doesn’t have much appeal to me, at least at the moment.
Awk, at least, fits very nicely into a small space right next to regex. jq is a bit fiddlier to pick up, but very nice for basic stuff. Sed I still don’t have down very well, but it is also nicely regex-adjacent.
I regularly write sed one liners to do refactorings on my Ruby code. Usually the sed call is fed by the result of grep or find. I could write a Ruby one liner to do the same, but it would be a much longer line and escaping would be much more difficult. Ruby is simply not a replacement for the convenience of sed.
And maintainability is a red herring here: the whole point of something like sed is that you use it for one-off commands.
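A sketch of what that tends to look like (the class names are made up):

# rename a constant across the codebase: grep finds the files, sed edits in place
# (GNU sed; with BSD sed use `sed -i ''`)
grep -rl 'OldWidget' lib spec | xargs sed -i 's/OldWidget/NewWidget/g'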
I’m not that experienced with jq, but when it comes to awk (and sed), one of their benefits is that you can easily write a program in the shell, since they act as glue between pipe operations.
For example, to filter out all lines that have fewer than five characters, all you have to write is

... | awk 'length >= 5' | ...

no imports or types required. It was made for stuff like this, which makes it easy to use. I’ve only read a book about Ruby a few years ago, but to process stdin/stdout this way would require a bit more overhead, wouldn’t it?
One part of your history lesson is missing: Paul McCarthy and Steve Russell saw what was going to happen and pre-emptively invented Lisp. And yes, you can do everything in Lisp, in one line if you must, that you can do in bash, awk, sed, jq, perl… but in a more powerful and maintainable form.
;)
s/Paul/John/
This has gotta be one of my most common brainfarts…
It was Yoko’s fault.
Ruby equivalents of the basic awk and sed examples from the article, as examples of Ruby one-liner structure:

awk '{print $1}' logs.txt
cat logs.txt | ruby -ne 'puts $_.split[0]'
cat logs.txt | ruby -ane 'puts $F[0]'

sed 's/^[^ ]*//' logs.txt | sed 's/"[^"]*"$//'
cat logs.txt | ruby -ne 'puts $_.gsub(/^[^ ]*/, "").gsub(/"[^"]*"$/, "")'
I wrote what’s perhaps a more accurate explanation of the differences between three common text processing commands: https://two-wrongs.com/grep-sed-and-awk-the-right-tool-for-the-job
I don’t want to discourage folks from blogging what they’ve learned, but I was disappointed by this article.
WHY should I learn jq? (I already know why, but I want to see it in the article, which had no examples.) WHY should I learn sed and awk?
Where’s the beef?