I develop and use a lot of different scientific code in my work (I’m a theoretical chemist). I have pipelines with components written in C/C++, Python, AWK, Haskell, tcl, various shell dialects, and more. All of these programs and their associated languages aren’t going anywhere, not for a long time. When I add to them, I try to follow the “Unix philosophy” for my interfaces and build up single-purpose wrappers (sometimes via input redirection) where I can. To stave off reproducibility issues and to document what I’m doing, I aim for everything possible to be automated. Building initial “inputs” to a pipeline is fine, but I want to be able to run everything after that without fiddling by hand.
I end up writing lots of ad hoc shell scripts with pipes for orchestrating preparation, simulation, and analysis. Unfortunately, there are a lot of ways to shoot oneself in the foot with shell scripts, even with "strict mode" and linters like shellcheck. (Crustaceans directed me to both; thank you!) Using Python's subprocess helps in some ways, but I find it unwieldy for chaining a lot of processes together. Surely there's a better way to do all of this that is straightforward to read, exploits the composability of programs, but doesn't suffer from all the deficiencies of Bash. What tool do/would folks here reach for in such a situation, and why?
Any thoughts much appreciated.
tl;dr: What's your preferred Bash replacement language for composing lots of other programs, and why?
https://rash-lang.org
Full disclosure: I wrote it.
The documentation isn’t great, and I need to write a new line editor to improve interactivity. But it’s got a lot of strengths that are rare among shells or unique to Rash.
Wow, this looks neat!
I am very impressed by this concept and look forward to kicking the tires, if I can scrape together some time.
If your line editor is generic, you could just point users to rlwrap
Users can use rlwrap currently, and the current repl implementation uses a libreadline ffi wrapper library that comes with Racket. So there is some line editing and even very basic completion for file names. However, neither of those is great. While rlwrap and readline provide basic editing, rlwrap provides no good completion. The readline library is more programmable, but the ffi library in Racket doesn’t expose its full power.
Either way, I ultimately want a much more powerful line editor that’s more like a little emacs. If you look at zsh, for example, it has a fairly fancy line editor that is programmable and provides fancy programmable completion, cool filtering tools like zaw, various different ways to search through history, etc. Zsh’s line editor is basically the reason it’s way cooler than Bash. At the same time, you have to program it in the zsh shell language. Ugh.
If I write a nice programmable line editor in Racket, it will not only be programmable in a nicer language (Racket, Rash, or most any Racket-based language), but it will be able to hook into the Rash parser, analyze the resulting syntax objects, etc, for richer editing, highlighting, and completion. And while my main use for it will be for Rash, I intend to make it usable for essentially arbitrary Racket languages. The line editor for the default Racket language isn’t particularly great, and there is currently no off-the-shelf support for repls that use non-s-exp syntax. So a nice new line editor is generally needed in the Racket world.
You should try Oil and let me know what happens! You can run your bash scripts with it, so you don’t have to switch wholesale.
It will give you a “sane subset” of bash. If you’re using ShellCheck, and strict mode, and find that to be incomplete, then Oil will do even more for you.
Some blog posts that talk about Oil’s increased strictness: http://www.oilshell.org/blog/tags.html?tag=real-problems#real-problems
The error handling is much more reliable, AND it even alerts you when you would have problems in bash: http://www.oilshell.org/blog/2020/10/osh-features.html#reliable-error-handling
That is, I took care not to add error handling that would not be present in bash, which would “reverse break” your shell scripts. The improved behavior is all OPT IN.
To get the strict behavior, add this one line to your program:
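(The line itself was lost in extraction; per the Oil docs it is most likely the following:)

```
shopt -s strict:all 2>/dev/null || true    # enables strictness in Oil; bash ignores it
```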
It will still work under bash.
https://www.oilshell.org/release/0.8.5/doc/oil-options.html
A list of the strictness options: https://www.oilshell.org/release/0.8.5/doc/oil-help-topics.html#option (the list is complete, but the docs aren’t)
The latest release is here: https://www.oilshell.org/release/0.8.5/
It’s also packaged in a few places: https://github.com/oilshell/oil/wiki/Oil-Deployments (not up to date, I could use help with this)
Forgot to link this: https://github.com/oilshell/oil/wiki/Where-To-Send-Feedback
I’m interested in any problems people have, big or small… I might not be able to address it immediately but it’s useful to know what obstacles there are. I dogfood this on Oil’s scripts, and it caught real bugs, so it works for me. If it doesn’t work for others I want to know.
Also let me know if it does work… I have been seeing a lot of downloads lately, but not that much feedback :)
If you like lisp, this may be of interest: https://acha.ninja/blog/dsl_for_shell_scripting/
I haven’t used it but it seems amazing. You get a concise syntax and proper data structures
I’ve always loved the expressivity of LISPs and was always saddened at how annoying some operations were…
I've picked up Janet for its performance, dead-simple ABI, and ability to compile to native code, and I must thank you, because this article is really cool and is opening up lots of cool possibilities for me right now.
Thanks!
Have you had much success with Janet? I’ve heard good things and some less-good things. The less-good stuff has been primarily about ecosystem immaturity (which takes time), and some gripes about departure from standard Lispiness?
I have had a lot of success using Janet for small/CLI programs, and Advent of Code. It’s definitely a departure from a standard lisp, it’s more semantically related to Lua, all told. For me, that’s a very nice thing, because it means that it’s a small enough language to fit in my head (well, almost). It doesn’t have cons cells as a central data structure. If you come from Python or Lua, that’s not a big surprise, if you come from a lisp, it might be a surprise. It has macros, however, and various quoting setups, and (the thing that brought me back a lot), a really nice PEG api.
If you’re at all comfortable with C, you can add your own bindings for anything that exposes a C api.
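For example, a native-module skeleton looks roughly like this (a minimal sketch from Janet's documented C API; the module and function names are made up):

```c
#include <janet.h>

/* A C function exposed to Janet: takes no arguments, returns a string. */
static Janet hello(int32_t argc, Janet *argv) {
    (void) argv;
    janet_fixarity(argc, 0);
    return janet_cstringv("hello from C");
}

static const JanetReg cfuns[] = {
    {"hello", hello, "(hello)\n\nReturn a greeting from C."},
    {NULL, NULL, NULL}
};

/* Entry point used when importing the compiled module. */
JANET_MODULE_ENTRY(JanetTable *env) {
    janet_cfuns(env, "mymod", cfuns);
}
```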
The language is more than a toy, but you’d want to have C expertise to build especially large things with it.
It also has a better setup for error handling than Scheme's call/cc and dynamic-wind stuff, IMO.
I haven’t written that much Janet yet but there’s a pretty good ecosystem out there (maybe not stable, but it exists):
I've only used the first 2, but they've been pretty solid. I'm working on libmagic bindings (filetype/mimetype detection) too (as a dabbler in C), and the docs are pretty good so far. I'm considering rewriting my file/URL-opening scripts in it.
I have been using Raku at work for writing useful utilities with some success. I can easily create a command line interface without any libraries and call shell programs using qqx which I find easier than dealing with bash quoting rules. Also there are some libraries like raku-shell-piping which I’d like to try someday.
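For instance, qqx looks something like this (a hypothetical snippet, not one of my actual utilities):

```raku
# qqx interpolates the string, runs it via the shell, and returns stdout:
my $branch = qqx{git rev-parse --abbrev-ref HEAD}.chomp;
say "Currently on $branch";
```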
The next-generation regexes in particular have saved me a lot of time at work. Our test runs were printing a lot of output (megabytes of text), making it difficult to use Ctrl-F to search for failed tests, so I wrote a tool that parses this output and prints the essential information a programmer needs to fix the failed tests.
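To give a flavour of the kind of matching involved (a hypothetical snippet with a made-up log format, not the actual tool):

```raku
my $output = "ok 1\nFAILED t/smoke.t at line 42\nok 2";

# Raku regexes quote literals explicitly; captures are positional ($0, $1, ...).
for $output.lines -> $line {
    say "failed: $0" if $line ~~ / 'FAILED' \s+ (\S+) /;
}
```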
When there's a script to write, Raku is the first language I reach for. At some point I do intend to learn Perl as well, due to the large number of available modules on CPAN, and also because it is easy to interface with Raku using `Inline::Perl5`.

Regarding performance, Raku is known to be slow, but performance is something the team is actively working on. If I can complete a task in an hour of hacking up a Raku solution vs. 1.5 hours (or more) in a different language, does it really matter if the program takes 5 seconds or 10 seconds to run?
I mostly use shell scripts because the shell is almost always available and doesn't change much; although I don't really like it, I don't really know of anything better, as most alternatives are:
Ruby does support running shell commands with backticks:
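(The example itself was lost in extraction; a minimal sketch of the idea:)

```ruby
# Backticks run a command via the shell and return its stdout as a String.
branches = `git branch --merged`.lines.map(&:strip)
puts branches.reject { |b| b.start_with?("*") }
```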
And general string processing etc. in Ruby is fairly nice as well. I don’t really use it for stuff like this, mostly because of point 2.
Perl can do this as well, I believe (and was originally conceived to replace shell scripts), but I never really got to grips with the syntax. For many purposes, I consider Ruby to be "Perl done better", but Perl is much more likely to be installed on most systems (and Ruby still has a rather complicated syntax and a bit of a learning curve).
Tcl is nice for interacting with shell commands as well:
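(Again, the example didn't survive; a minimal sketch:)

```tcl
# exec runs a command directly (no shell) and returns its stdout.
set branches [split [exec git branch --merged] "\n"]
puts [llength $branches]
```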
But I never worked much with it, and it suffers from point 2 as well (although it’s a lot more light-weight than Ruby, and easier to install).
There are other shells as well, such as rc and oil, but they’re even less likely to be already installed. Alpine Linux, for example, doesn’t even seem to have packages for either in the standard repos.
rc is available in alpine linux: it’s in the 9base package
Ah right; I just searched for `rc`.

`pacman -F` is good for finding files in packages that you have not installed.

tcl is GREAT, and whilst tclsh isn't quite as likely to be installed as perl, I think expect (especially autoexpect) is a killer application for operations/composition purposes.
Demonstrates two bombs in one place: (1) `$USER` can be expanded into two arguments; (2) when `$USER` is a substring of another user … or group … or anything.

That's why I prefer data structures. Yes, I know: Unix philosophy, text is the common denominator with a ton of tools, composable, etc.
Usually python, just because I know it well, and can get results within a predictable timeframe. My personal rule of thumb is when I need something like a hashmap or array – I give up on bash that instant.
However, with time I've also gotten more comfortable with bash, so often it's okay to mix and match. E.g., say you want to find out the amount of free memory:
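(The command was lost in extraction; presumably something along these lines, with illustrative output:)

```
$ grep MemFree /proc/meminfo
MemFree:         4967124 kB
```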
Right, how do we pick out the number? Normally you'd use `cut` or `awk`, but what if you forgot, or need something more elaborate? Well, why not use Python?
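(The original one-liner didn't survive; a sketch of the idea:)

```
$ grep MemFree /proc/meminfo | python3 -c 'import sys; print(sys.stdin.read().split()[1])'
```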
Not as concise as awk, but you can type it quicker than the time you’d spend googling how to use awk.
Note that you can also use multiline input if you press enter after the opening quote, so even if you need imports etc., it doesn't have to look horrible. Also, if you have some sort of vi mode or a hotkey to edit the command in an editor (`Ctrl-X Ctrl-E` in bash), it helps a lot for messing with long shell commands.

I also tried using xonsh a few times, a shell combining Python & bash syntax. Cool idea, but I tend to forget how to use it, so I never got into the habit.
Ah, someone who shares my shame.
Err, am I weird in that awk isn’t that hard to use?
$ awk '/MemFree/ {print $2}' < /proc/meminfo
Two fewer fork()/exec()s, and it does the same thing as all that Python. Why break out the combine when the hedge trimmer will do to cut the grass?
I’m not gonna lie, this falls under learning to use your tools. If you always reach for python/scripting languages for these simple tasks, I’m going to argue your general unix knowledge is too low.
Also, that second cat | grep | awk has a bug (print $1 versus $2), so I'm not sure the GP actually ran that shell.
Probably not.
I would disagree on a technicality: if you don’t know it, it isn’t your tool.
This I do agree with. I can't say that it is difficult to use, because I never took the time to really learn awk. Instead, I just try to pick up what I need to do a particular task. To a large extent, my relationship with awk is governed by apathy. It is an exceedingly practical tool and I just don't really care. I love those little transcendental moments with software where you feel like you know something more about the world. awk doesn't do that for me, so I haven't really given it the time it deserves.
That said, my comment about shame comes from responses like:
I may be missing a little bit of context from your comment, but things can be read this way, and it doesn't feel so good. My comment isn't about how difficult awk actually is, but about how people assume that you should just know these things, and that if you don't, you are deficient.
To be clear, I don’t think that that there is any malice on your part.
For me, Julia recently completely replaced bash & Python for shell scripting. There are two reasons for that:
Of course, you need to have Julia installed. It also starts a bit slower than Python. This used to be a deal breaker, but it was mostly fixed in one of the versions; it is now fast enough for short interactive scripts.
Here's an example from my dotfiles: a script to remove merged branches:
https://github.com/matklad/config/blob/master/bin/gbda
Hm that’s a good use case. FWIW in Oil this will look like:
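The snippet itself didn't survive here; below is a rough, untested sketch assembled from the differences listed next (treat the exact method names and operators as assumptions about Oil 0.8.5):

```
# Rough reconstruction, untested; exact expression syntax may differ.
git branch --merged | readarray :branches
for b in @branches {
  var name = b.strip()            # assumes a strip() method exists
  if (name != 'master' and not name.startswith('*')) {
    git branch -D $name
  }
}
```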
(Not tested, but the only thing that might not be there is the `startswith` method.)

Differences from bash:

- `read --line` fills `$_line`, instead of `read -r`
- `readarray` works at the end of a pipe because `shopt -s lastpipe` is on by default
- `:branches` has a `:` sigil to remind you it's a variable name
- `@branches` for splicing, instead of `"${branches[@]}"`
I might make a blog post out of this! I posted some notes on Zulip here https://oilshell.zulipchat.com/#narrow/stream/266575-blog-ideas/topic/Julia.20Shell.20Script (requires login)
Yeah, pipeline concrete syntax for function composition reads better to me here. Alas, it is almost available in Julia: https://discourse.julialang.org/t/why-are-there-no-curried-versions-of-map-filter/47478
Curious about your workflow for this. I tried to use Julia to replace shell scripting for myself but found that many scripts (especially anything with any dependencies) would take 30s+ to run.
I don’t have any workflow, it’s just fast enough for me:
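(The timing output that went here was lost; a hypothetical way to measure it, with the numbers from below:)

```
$ time julia -e 'println("hi")'    # ~120 ms for me; python3 -c 'print("hi")' takes ~20 ms
```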
In absolute terms, 120 ms latency here is unreasonable (Python's equivalent finishes in 20); relative to a human, it is OK enough for me. Unlike Clojure, Julia owns its runtime, so I also hope this will improve in the future (one low-hanging fruit, IIRC, is not initializing BLAS & friends at startup).
30s does seem unreasonable. Either this is some kind of config problem, or the dependencies are particularly heavy.
I managed to get it down to a few seconds, IIRC, using some tips from https://discourse.julialang.org/t/can-julia-really-be-used-as-a-scripting-language-performance/40384
Thanks for the link! Yeah, looks like I am severely underestimating the cost of deps in Julia :-(
Seems theoretically fixable by creating & caching sysimages on the fly…
Did you try PackageCompiler.jl, or does your issue not fit its usage / isn't it worth the time to dive into it?
I've looked at it, but it is too complicated for my use case. I want a "stick this into the `#!` line" solution, not a Project.toml solution. I don't have a Project.toml at all.

Sorry I'm late to the party.
I’m author of the Next Generation Shell. It was born out of frustration with bash and Python.
Here is how the branch-deletion code would look in NGS.
sh is a cute-sy interface for shell scripts in Python. Examples from the docs:
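(The examples themselves were lost; here are a couple of representative ones, paraphrased from memory rather than copied from the docs:)

```python
import sh

# Commands are functions; output comes back as a string-like object.
print(sh.uname("-a"))

# Piping: pass one command's output as another's input.
print(sh.wc(sh.ls("-1"), "-l"))
```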
Plumbum is usually the Python shell DSL I see recommended. Haven't used either, though, as `subprocess.run` is generally good enough.

I feel like I've written 60% of that library in a less useful way several times. I need to remember that for next time. Thanks!
Also, it begs for a sibling that wraps paramiko in an `ssh` interface.

Doesn't look as convenient as

[ That was a shameless plug; that's NGS ]
Terra. It lets you call C library functions with a simple include. Example from my own code:
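(The code block was lost in extraction; a minimal sketch of the same idea, not the commenter's actual code:)

```lua
-- terralib.includec parses a C header and exposes its functions directly.
local C = terralib.includec("stdio.h")

terra hello()
  C.printf("hello from C's printf\n")
end

hello()
```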
Well that seems neato. Have you had much success with it?
After 20 years of avoiding it, I’ve found myself using Perl again a bit for this kind of problem.
I have yet to adopt it, but execline is a possible alternative to traditional shell utilities.
Bash, but as soon as the logic seems to get more complex, I use Python.
Lately, I've been using Go for small command-line tools as well. This flag library is essential, since the built-in flag library is wonky. It's fast to compile, and it's easy to make API calls in parallel and to parse or spit out JSON.
I have a similar approach, but I have started using urfave/cli for command line applications in Go.
I really like rc (plan9/v10 unix shell). It’s not too surprising compared to bash and fixes a lot of the issues I have with it. The only downside to it is that interactive execution is lacking compared to more modern shells.
Interesting; I presume this is what you mean? All variables as lists that can easily be indexed and explicit concatenation look promising. Anything else stand out to you? Thank you!
Yup, exactly. The more explicit redirection is also a plus IMO, and it might be especially useful to you. No more `esac`s or `fi`s is also pleasing, but that's more an aesthetic thing. All in all, it's not a huge paradigm shift from bash, which I think is a good thing.

On the other hand, I think fish is the best shell for interactive use; it might be worth examining that too.
If the Bourne shell frustrates you … welcome to the club. It frustrates a lot of people.
Some of your frustration is probably rooted in the process model itself, the operating system, rather than in BASH. Pipes. Exit codes. Signals. Environment variables. Scattershot argument and output conventions. Languages don’t really solve this mess, so much as hopefully expose it faithfully, so you can try and manage it yourself. The language determines what abstractions you can use, and use conveniently, to try and do that. In my experience, those that try to “hide” the complexity below usually fail to expose it completely. The popular scripting languages have all figured it out, eventually.
Frankly, if you're doing something delicate, or need it to be more reliable, there's nothing wrong with `#!/usr/bin/python` or `#!/usr/bin/ruby` or `#!/usr/bin/node` or the like. Yeah, there are efficiency and startup-time trade-offs. But if you're on a desktop, laptop, or server, the cost of one interpreter probably ain't gonna break you.

The problem with those other options starts when you want to get that script onto your coworker's machine. Which version of python are they running? Do they have ruby or node installed?
If you’re going to write a full-featured application, I agree that versioning remains wise. But core process handling and string manipulation APIs haven’t changed much in ten-plus-year-old popular scripting languages.
The exception's probably Python 2 to 3, but most distros I'm aware of know this and suffix their binaries on `$PATH`. So you can do `#!/usr/bin/python2` or `#!/usr/bin/python3`.

If the second choice is Bash, it's not such a stretch to write for the common denominator and avoid newish bells and whistles. I've shipped deployment, testing, and other scripts this way without significant issue.
Python has a habit of casually introducing backwards-incompatible changes in new releases, and that's not only python2 vs python3. E.g., in Python 3.6 there was a general overhaul where all the modules handling file paths moved from taking strings to taking pathlib paths. That broke my code (and updating it to work on 3.6 broke it on 3.5). That's just one point; the interwebs have plenty of such stories. The library ecosystem seems to have a similar attitude. Python2 vs Python3 is far from being the problem for me. But each time I find an old Python script that I need to use, I cringe in expectation of what I'm going to have to do with its dependencies and the Python version itself. A month ago, e.g., I needed to make a two-year-old Python script from the Chromium repos work today. It took me almost a day, hunting down historical dependency versions that weren't available anymore on PyPI. I also have a similar story with the most popular yaml library. With Python, any version change of just about anything, be it the interpreter or any library you're using, is likely to cause pain. At least it has always been like that in my experience.
For shell scripts on my local machine I use https://elv.sh, a shell with a rather neat scripting language that borrows from all kinds of languages.
ruby - I’ve mostly moved on to other languages for full fledged applications but for small scripts and exploratory coding it remains unmatched.
To be honest, there is no "better", as gluing together programs will always give you a bullet for each toe. Try to provide a consistent interface between your ad hoc scripts so that weird stuff pops up fast.
At my previous work in research, we tended to use either bash or python, depending on who set up the glue and the project. Most of the time, the idea was to provide a CLI with a few arguments to run the chain.
For a time, I looked at [Common Workflow Language](https://www.commonwl.org/) to make the gluing more accessible, and the main dev there wanted to create a GUI based on nodes to plug together (like the Node Editor in Blender) so that any researcher could create a workflow and document it. All of that had to play nice with slurm. I also provided a few glue scripts with a simple config file. Maybe stick to any task runner you like to provide a common interface to link everything.
Nowadays, learning Raku, I think I may use it a lot if needed in the future. Probably Go if I need to ship a binary somewhere, and Python if I can't install anything in the env. I will avoid Bash because I am less comfortable with it when things get more complicated. However, for short pipes, where you have one entry point and everything follows until the final output, Bash is the easiest.
It sounds like you would benefit from a workflow runtime. I really like snakemake, because it is Make-like but allows dropping in Python code to fill in whatever is missing.
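A minimal sketch of what that looks like (hypothetical filenames; the `run:` block is plain Python):

```
rule summarize:
    input: "data/{sample}.csv"
    output: "results/{sample}.summary.txt"
    run:
        # Arbitrary Python fills the gap where no CLI tool fits.
        with open(input[0]) as f, open(output[0], "w") as out:
            out.write(f"{sum(1 for _ in f)} rows\n")
```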
Lua and shell script make for amazing glue languages
Fun fact: lead (which pipes used to be made from, hence plumbing) is “luaidhe” in Irish
Same here. Bash -> Lua -> C, where “->” means “if the left operand isn’t enough…”
Same, but these days I tend to favor fennel + shell script.
https://fennel-lang.org/
Just wanted to say that I am super-grateful to everyone who chimed-in. There are a lot of useful ideas here; thank you!
I use bash.
When things get too complex, I do turn to a scripting language. Perl is useful and almost optimal for the task, but some of its structures around abstraction are (were?) lousy. Python is decentish. My gut feel is that Ruby is the right language here, but dinking with ruby libraries is yet another task I have to deal with, and I haven’t mustered the energy to care.
In time, I actually start rewriting the internal components into the scripting language so there’s a cohesive system and without concern about the interop between supervisor and component. Things like “write file, read file” are slow and require essentially a protocol.
I don’t know what your components look like, but the number of different technologies gives me pause; it seems that you’re risking some serious bugs due to the interop issues.
For myself, I have started moving away from the idea of files and pipelines for interesting work and data processing and prefer to use Postgres SQL if feasible. It has guarantees around writing and concurrent access that can really help. That might be more than what you can do due to third party software and so forth.
Ruby is the glue that never sets.
I started using Ruby in 2001 to replace a bunch of shell scripts in a really complex build system and I loved it. I rode it through all the popularity of rails, and the decline as the world became more polyglot, people decided Javascript on the server was ‘good enough’, and Python won in the data space… but for scripting, gluing, command line apps, etc. Ruby is still the first tool I reach for.
Where I work at the moment, the choices are essentially bash or node, with TS or JS. I’d personally be happy with say Python or Ruby but I work in JS because every programmer here knows it.
I despise bash as a programming language and try to minimise the use of it.
For most glue-like things, I use Node with plain JS, but usually make everything synchronous. Editing files is `fs.readFileSync()` or `fs.writeFileSync()`. Running commands is `child_process.execFileSync()`.

If anything has to be async, I write `async function main() { try { ... } catch (error) { console.error(error); process.exit(1); } } main();`, because promises are okay but callbacks are hell.
I prefer CLI one-liners where possible. These can get complex, so I save them in an all-purpose notes file relevant to the project I'm working on. If something is general-purpose and not restricted to a project, I'll see if I can get away with an alias/function. Otherwise, I see if I can manage with a bash script.
If the above is not suitable, only then I use Python/Ruby. For such cases, I try to avoid shell commands and use built-in features as much as possible.
A different perspective: this may not be a language problem. Of course, it depends on what you mean by shooting yourself in the foot. For example, maybe you are accidentally deleting files, or running long pipelines that end up crashing because there were silly mistakes in the initial inputs. How would a different language help? You can delete files in any language, and you can skip input validation in any language. Although, to be fair, perhaps it's easier to do input validation in another language. Still, what makes Bash so productive in general is maybe also the thing that makes it so hazardous: it's really easy to compose any set of programs with a few keystrokes. I personally haven't found a replacement I like more for this purpose.
Some programs offer flags for doing dry runs. What if you added something like that to your pipeline steps?
I usually reach for Bash to start, and if it starts to get complex then migrate into Ruby. Depends on the application or target audience though. If I’m writing a generic cli tool and Bash isn’t the correct fit then I’d likely reach for Go over Ruby, just because deploying it as a single binary is simpler.
Inside applications (at $work mostly) I stick to ruby scripts inside the rails apps, and have been using JS scripts inside the couple of Node apps I’ve touched. Infrastructure repos tend to get bash or ruby, just because I know those folks are on a Mac and have Ruby > 2.5 available with the standard library present.
It depends on what I want to "glue" together.
a) commands/programs – I use Bash – or a shell in general, but usually it is Bash.
b) C libraries – I use C++ or Java (especially with JNA it is quite convenient). It provides type safety (not perfect, but reasonably good and much better than dynamic/scripting languages). RAII and exceptions in C++ help a lot with closing resources and error handling. The same goes for try-with-resources (`AutoCloseable`) in Java. The D language also seems to be a good option for this use case (integration with C or even C++).

One more note on Shell/Bash: this glue works quite well, but there are several pitfalls (see the classic pipeline example). However, they arise from under-specified text streams rather than from the language itself.
For me, usually the pain is in the lack of multi-threading (which `bash` can do if you really try, but it's clumsy), so when I can't patch things up with `parallel` or by wrapping it in `make -j`, I reach for Node.js. `INSTALL` instructions don't get more complicated than running `npm i`.
Around 2013, I started working on Next Generation Shell. Frustrated with bash and Python, which I was using extensively, I looked at the niche at the intersection of "modern" and "for Ops". I saw nothing there.
While bash was definitely for Ops, it did not meet any expectations for “modern”: horrible syntax, poor error handling and missing (OK, very limited) data structures. Using jq from bash felt like a symptom.
On the other hand, there was a bunch of general-purpose programming languages such as Python, Ruby, etc. Well... they are not specifically for Ops, and in practice that means that dealing with files and processes is unnecessarily verbose. In NGS, for example, if you run a process, it's an exception if it exits with an error exit code. Note that an error exit code is not just anything above zero: there are commands such as `test` where exit code 1 is fine.

Over time I've discovered several projects in that modern-for-Ops niche. Somehow the feeling is that they are trying to improve over the existing shell, while NGS is based on the idea of creating a programming language for Ops... with a shell.
PowerShell was not on my radar for some time, but when I did look at it, I saw horrible language design choices, such as the `-eq` comparison, which is not equality comparison. (Emphasis mine.)
Back to NGS.
Project status: useful programming language which we use at work; starting working on the shell.
I hope that other people will find NGS to be a helpful tool.
For anything parallel which would otherwise be done in bash, (gnu) make is great. With 4.3, it now has support for grouped targets, so stamp files can (mostly) finally die.
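A minimal sketch of grouped targets, assuming hypothetical file names (note the `&:` separator, new in GNU Make 4.3; the recipe line must be indented with a tab):

```make
# One invocation of the recipe is known to produce both files,
# so no intermediate stamp file is needed.
out/a.dat out/b.dat &: generate.sh
	./generate.sh out/a.dat out/b.dat
```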
For everything else I just use python :)
I have been meaning to learn perl, simply because of the vast number of existing scripts written in it.