1. 3

    Another method that I like is a HELP variable at the start of the script, which you can then reuse for usage output:

    #!/usr/bin/env bash
    HELP=$(
    	cat <<-EOF
    		This is a git prepare-commit-msg hook which queries tmux for a list of
    		attached users and adds them as co-authors based on their email addresses found
    		in ldap. It is called by git with 1-3 params: commit_file_path, commit_source &
    		commit_old_sha, see githooks(5) for more info. This script also installs and
    		uninstalls the hook, -i & -u.
    	EOF
    )
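
    The variable can then be reused for the usage output, e.g. (a minimal sketch; the usage function and -h/--help handling are illustrative, not part of the quoted hook, which uses -i & -u):

        usage() {
            echo "$HELP"
        }
        case "${1:-}" in
            -h|--help) usage; exit 0 ;;
        esac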
    
    1. 16

      I find all the admonishments to use POSIX shell rather than bash befuddling. Why would you not want to use Bash language features such as arrays and associative arrays? If your goal was portability across a wide swath of UNIX variants, then I understand that you might want to target POSIX shell, but that is a very unusual goal. In most instances why not leverage a shell language with very useful features?

      1. 31

        Because arrays and associative arrays are two of the most broken features in bash. You can use them if you’re an expert, but your coworker who has to maintain the script will likely not become an expert. (I’ve been using shell for 15 years and implementing a shell for 4, and I still avoid them)

        I implemented them more cleanly in Oil and some (not all) of the differences / problems are documented here.

        https://www.oilshell.org/release/latest/doc/known-differences.html

        • ${array} being equivalent to ${array[0]} is very confusing to people. Strings and arrays are confused.
        • associative arrays and arrays are confused. Try doing declare -A array and declare -a assoc_array and see what happens.
        • dynamic parsing of array initializers is confusing and buggy
        • in bash 4.3 and below (which is deployed on Ubuntu/Debian from 2016, not that old), empty arrays are confused with unset arrays (set -e). Not being able to use empty arrays makes them fundamentally broken

        I could list more examples, but if you’ve used them even a little bit, you will run into such problems.
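
        To make the string/array confusion concrete, a quick sketch in plain bash (output noted in comments):

            arr=(one two three)
            echo "$arr"        # prints just "one": an unsubscripted array is ${arr[0]}
            s=hello
            echo "${s[0]}"     # prints "hello": a plain string also answers to index 0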

        In Oil arrays and assoc arrays are usable. And you even have the repr builtin to tell you what the state of the shell is, which is hard in bash.


        edit: I forgot that one of the first problems I wrote about is related to array subscripts:

        Parsing Bash is Undecidable

        The security problem I rediscovered in 2019 is also related to array subscripts:

        https://github.com/oilshell/blog-code/tree/master/crazy-old-bug

        A statement as simple as echo $(( a )) is in theory vulnerable to shell code injection.

        • Shells that do not have arrays do not have the security problem.
        • You can inject code into shells with arrays with a piece of data like a[$(echo 42; rm -rf /)].

        All ksh derived shells including bash have this problem. Note that you don’t even have to use arrays in your program to be vulnerable to this. Simply using arithmetic in a shell that has arrays is enough! Because arithmetic is dynamically parsed and evaluated, and accepts $(command subs) in the array index expression.

        https://unix.stackexchange.com/questions/172103/security-implications-of-using-unsanitized-data-in-shell-arithmetic-evaluation/172109#172109
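
        A minimal sketch of the problem on bash and other ksh-derived shells (the variable name and injected command are made up for illustration):

            #!/usr/bin/env bash
            x='a[$(touch /tmp/owned)]'   # pretend this came from untrusted input
            echo $(( x ))                # bash evaluates x recursively as an expression;
                                         # the command substitution in the subscript runs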


        bash is poorly implemented along many dimensions, but I would say arrays are one of the worst areas. But I learned that it got a lot of that bad behavior from ksh, i.e. “bash-isms” are really “ksh-isms”. So the blame for such a bad language to some extent goes back to AT&T and not GNU.

        1. 1

          Thanks. Without arrays, how do you recommend handling cases where you need to build up a series of arguments to be passed to another command?

          1. 3

            I oscillate between:

            • Using strings and implicit splitting for known/trusted arguments. If you’re building up flags to a compiler like -O3 or -fsanitize=address, this is a reasonable option, and the one I use.
            • Using arrays and taking care not to use empty arrays (which is annoying); a sketch of the usual workaround follows this list.
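
            A minimal sketch of the second option with the common empty-array workaround (the gcc flags are just placeholders):

                set -u
                flags=()
                [ -n "${SANITIZE:-}" ] && flags+=(-fsanitize=address)
                flags+=(-O3)
                # ${flags[@]+"${flags[@]}"} expands to nothing when the array is
                # empty, so bash <= 4.3 doesn't complain about an unset variable
                gcc ${flags[@]+"${flags[@]}"} main.c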

            The fact that it’s awkward is of course a motivation for Oil :)

          2. 1

            in bash 4.3 and below (which is deployed on Ubuntu/Debian from 2016, not that old), empty arrays are confused with unset arrays (set -e). Not being able to use empty arrays makes them fundamentally broken

            Did you mean set -u? And to be fair, I would consider this more of a case of set -u being broken; it seems that their fix for it was to not trigger (even when unset) on * and @ indexes.

            1. 1

              Yes set -u. Either way, you can’t use empty arrays and “strict mode” together, which makes arrays virtually useless in those recent versions of bash IMO.

              I have a memory of running into that the first time, and I couldn’t understand what was going on. And I ran into it many more times after that, until I just stopped even trying to use arrays.

          3. 15

            Why would you not want to use Bash language features such as arrays and associative arrays?

            If you need them, that’s a good sign it’s time to rewrite the script in a real programming language.

            1. 12

              It’s not as “unusual” as you might think though. There are many people running OpenBSD on their work machines, where bash is simply not included in the default install. I could see why you’d want to make your script work without issue when bash isn’t an option. At the very core, a shell script should not do much: clean up your local mailbox, aggregate multiple RSS feeds or perform a quick backup using rsync, … I have run OpenBSD on a laptop for some time, and I was delighted to see that all my personal scripts were still working like a charm (besides calls to seq(1), this one killed me…). I’ve also written some simple C software, and wrapper scripts for it using POSIX shell only, and I was glad that nobody from the BSD world bothered me with these scripts.

              Another argument is that POSIX shell is much simpler to use and understand than the myriad of corner cases and specificities that bash can have. It sure is a strength to be able to use proper arrays, but the bash manpage is almost five times longer (48746 words) than the sh one (10224). Mastering bash as a language is definitely harder than sticking to POSIX, especially when you only write scripts that are < 20 lines.

              1. 4

                Even on Linux bash isn’t always included by default, especially on simple server setups/distros such as Alpine Linux.

                The seq thing is annoying; BSD has jot, with a different syntax. I wish they could agree on one or the other. Same with stat.
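
                For simple counting, a portable loop sidesteps both seq and jot (a small POSIX sh sketch):

                    i=1
                    while [ "$i" -le 10 ]; do
                        printf '%s\n' "$i"
                        i=$((i + 1))
                    done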

              2. 10

                In my experience, the moment my shell script becomes complex enough for a bash array is the moment it gets rewritten in Python.

                The only features from bash that have convinced me to require it in recent memory are herestrings (<<<) and the regex match operator (=~).
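
                For example (a small sketch; the version string is made up):

                    if [[ "v1.2.3" =~ ^v([0-9]+)\.([0-9]+) ]]; then
                        echo "major=${BASH_REMATCH[1]} minor=${BASH_REMATCH[2]}"
                    fi
                    wc -w <<< "just these four words"   # herestring: no echo | needed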

                1. 1

                  In my experience, the moment my shell script becomes complex enough for a bash array is the moment it gets rewritten in Python.

                  Same here. Even for simple scripts, since a script tends to become more complicated over time.

                2. 4

                  I use systems without bash installed by default. If you’re going to write a complex script that is too awkward to write in POSIX sh, then why not use a better (less quirky) language in the first place? Python for example is probably available in almost as many places as bash now. If you’re writing a shell script, you’re probably aiming for portability anyway, so please write POSIX shell if you can.

                  1. 2

                    I write many bash scripts where portability is not a concern and I think the language can be quite functional and even elegant when you are pulling together different cli tools into a new program.

                    1. 2

                      Certainly if it’s not a concern and you enjoy the language then I don’t see a problem with using bash! My main issue is that I don’t think it’s at all an unusual goal to use POSIX sh for portability.

                      1. 2

                        Ubuntu shaved quite a bit off their (pre-systemd) boot times by just switching /bin/sh from bash to dash (although I can’t find the exact number right now). The increased performance may be a concern (or is at least nice) in some cases (bending over backwards for it would of course be silly).

                      2. 1

                        Because there are plenty of use cases, like installers, where relying on another language isn’t possible.

                        1. 4

                          Why not do the heavy lifting in awk, if you’re in such a restricted environment?

                          1. 1

                            When I talk about writing shell applications, I personally am talking about using bash plus all the standard UNIX tools that come with every UNIX base install.

                            Awk is one of those tools.

                            1. 5

                              And when I say do the heavy lifting in awk, I mean more or less ignore the shell, and use awk. You get proper arrays, reasonable syntax for writing algorithms, relatively little shelling out, relatively sane string manipulation, and sidestep a bunch of issues around shell variable reexpansion.

                              And your code is more portable.
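
                              As a small sketch of that approach (the log format and field numbers here are made up):

                                  # sum bytes per client address from a whitespace-separated log
                                  awk '
                                      { bytes[$1] += $3 }
                                      END { for (ip in bytes) printf "%s %d\n", ip, bytes[ip] }
                                  ' access.log

                              The shell only launches awk; the arrays, arithmetic, and string handling all live in one awk program.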

                              1. 2

                                I believe that. I haven’t used awk that way and I’ll admit that’s because I don’t know it well enough.

                                I know I’m asking a lot but might you be able to link to an example of awk being used in this way?

                                I mostly end up using stupid simple awk invocations like awk ‘{print $3}’ :)

                                1. 1

                                  Here’s an example from the FreeBSD tree. Lines 107 through 788 are one big awk script. https://github.com/freebsd/freebsd/blob/master/sys/kern/makesyscalls.sh

                                  It parses the syscall definitions here, and generates C: https://github.com/freebsd/freebsd/blob/master/sys/kern/syscalls.master

                          2. 1

                            Why isn’t it possible and why can’t it be done in POSIX sh?

                            1. 1

                              The OP’s post has two points.

                              1. If you’re writing anything complex at all, use a different language.

                              That’s what I was addressing.

                              To your point, there are customer environments where shipping UNIX systems with languages like Python is prohibited, either because of security constraints, maybe disk usage, etc.

                              2. bash versus POSIX shell? If bash ships as /bin/sh on a given UNIX system, I don’t see the difference as important enough to be worth any gymnastics to get one installed, but you may have a higher dimensional understanding of the differences than I do.
                              1. 1

                                I don’t think it’s really a big issue. Certainly for myself it’s only a minor frustration if I need to install bash for a simple script which gains nothing from using bash. I’ve just seen the #!/bin/bash shebang far too many times when it’s not necessary, and I think that a lot of the time that’s just because people don’t know the difference. It’s certainly not the end of the world and if bash feels like the right tool for whatever your use case is then I’m not going to argue! It would just be nice if there was a little awareness of other systems and that people would default to #!/bin/sh and only consider bash if they find a specific need for it. I imagine that in most cases shell scripts are used there actually is no need.

                                I’m a little obsessed with building my workflow around shell scripts and I have never found a need for bash extensions (YMMV). The other big benefit, besides portability as others have suggested, is the digestibility of the manpages for, e.g., dash.

                      1. 1

                        TIL: Seems to work really nicely! Thanks for sharing; this will help ease my dependence on using watch as a hammer! It’s also already packaged in Debian, which is convenient.

                        1. 1

                          Go: no need to worry about memory management, deployment, or dependency management. Vendor all your modules, deploy as a static binary & reduce the stress in your life!

                          1. 3

                            This seems actually useful; it could replace a good part of my web searches that inevitably end with me searching for the copy-and-pastable symbol within a fileformat.info result.

                            The searching seems to need some tweaks, though. E.g. looking for a regular smiley, none of “smile”, “smiley”, “happy” give the wanted result, while “face” lists too many. It turns out the right search word is “smiling”, but maybe there should be some form of aliases?

                            I also had trouble with the regular red heart, but that may be of a different kind?

                            $ echo "❤️ " | ~/go/bin/uni identify
                                 cpoint  dec    utf-8       html       name
                            '❤'  U+2764  10084  e2 9d a4    &#x2764;   HEAVY BLACK HEART (Other_Symbol)
                            '◌️'  U+FE0F  65039  ef b8 8f    &#xfe0f;   VARIATION SELECTOR-16 (Nonspacing_Mark)
                            

                            How would I find this using search?

                            Regarding search, some more ideas:

                            • looking up by emoticon, e.g., uni identify "(:" or uni identify "<3"
                            • looking up by short code, e.g. uni identify :heart: (are these standardized?)

                            And a bit of a bug regarding that other stdin UX thread:

                            $ echo "" | ~/go/bin/uni identify
                            $ i: reading from stdin...
                            
                            1. 2

                              The searching seems to need some tweaks, though. E.g. looking for a regular smiley, none of “smile”, “smiley”, “happy” give the wanted result, while “face” lists too many. It turns out the right search word is “smiling”, but maybe there should be some form of aliases?

                              Yeah, adding more search terms is marked as “TODO” in the code. It’s a bit tricky as it’s very easy to get way too many matches and/or pollute the output with a lot of keywords, which isn’t useful either. This is one reason I worked on a GUI emoji picker based on this code last week, but I had a lot of problems getting GTK to show ZWJ sequences well, so I kind of gave up on that for now; basically I’m running into the limitations of dmenu’s plain text filtering.

                              I rarely use uni e <search> by the way, but instead use the “emoji-common” groups from dmenu-uni which reduces the number of emojis to a more manageable number (from about 1600 to 200).

                              I also had trouble with the regular red heart, but that may be of a different kind? [..] How would I find this using search?

                              Just in case this wasn’t clear – and the documentation should probably make this a bit clearer – the print, search, and identify commands work only on codepoints. They have no concept of multiple codepoints combining to form a single character (or “grapheme”, if you wish). I basically use identify mostly as a “Unicode-aware hexdump -C”.

                              At any rate, it shows up with e.g. uni emoji heart, or uni emoji ‘red heart’ for an exact match. It’s a bit hidden in there, because apparently we need hearts in 20 shapes and colours 🤷‍♂️ You have the same when you type :heart: in e.g. WhatsApp, but because the emojis are shown in colour and quite large it’s reasonably obvious. This is again kind of running into the limits of what you can do with this kind of plain text search.

                              1. 2

                                HEAVY BLACK HEART is the name of the red heart, as it was named as such before emoji gained color. For older Unicode characters (before color), “white” means outlined and “black” means filled in.

                                1. 1

                                  The search problem is pretty tough to solve, as some of the Unicode descriptions use a particular English dialect, for instance:

                                  $ uni s poop
                                  no matches
                                  

                                  damn British! :)

                                  One possible fix would be to augment the descriptions with information from another free source, like Wikipedia.

                                  1. 3

                                    That’s actually specified in the Unicode CLDR (“Common Locale Data Repository”):

                                    $ grep poop en.xml
                                    <annotation cp="💩">dung | face | monster | pile of poo | poo | poop</annotation>
                                    

                                    It contains many useful aliases, for example for the pirate flag:

                                    <annotation cp="🏴‍☠️">Jolly Roger | pirate | pirate flag | plunder | treasure</annotation>
                                    

                                    I just haven’t added support for that.

                                    1. 1

                                      oh very cool

                                1. 4

                                  Thanks so much for this tool, I love having a command line utility to query the unicode database!

                                  1. 2

                                     bash & shellcheck & shfmt: I have come to love that combination of tools for quickly automating parts of my workflow. I often pipe my shell history into a file and then create a quick script to automate something I am working on, fc -l > script. I think of bash more as a REPL than just as a command shell.
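
                                     A minimal sketch of that flow (the file name and count are arbitrary):

                                         fc -ln -10 > snippet.sh   # last 10 commands, without history numbers
                                         "$EDITOR" snippet.sh      # trim to the commands worth keeping
                                         shellcheck snippet.sh && shfmt -w snippet.sh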

                                    1. 2

                                         Like the author, I struggled with users who were used to emacs mode and so they couldn’t switch to vi mode without losing their muscle memory. But I discovered that the vi bindings in INSERT mode are largely a subset of those in emacs mode, so I created a readline config with a vi mode which has all of the emacs key bindings as well. This way folks can use emacs and vi keybindings. I also change the cursor to indicate which vi mode you are in.
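
                                         A minimal ~/.inputrc sketch along those lines (assumes readline 7.0 / bash 4.4+ for the mode strings, and a VT-style terminal for the cursor escapes; the exact binding list is illustrative):

                                             set editing-mode vi
                                             set show-mode-in-prompt on
                                             # bar cursor in insert mode, block cursor in command mode
                                             set vi-ins-mode-string "\1\e[6 q\2"
                                             set vi-cmd-mode-string "\1\e[2 q\2"
                                             # restore familiar emacs bindings while in vi insert mode
                                             $if mode=vi
                                             set keymap vi-insert
                                             "\C-a": beginning-of-line
                                             "\C-e": end-of-line
                                             "\C-w": unix-word-rubout
                                             "\C-l": clear-screen
                                             $endif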