1. 60
  1.  

  2. 7

    Neat! I got a buzz out of him using my inih library, though I found it humorous that he says inih is what does the “heavy lifting”. I wouldn’t call it exactly heavy lifting: it’s a single C file with ~300 lines of fairly straight-forward code. :-) This is one of the cooler inih handlers I’ve seen, though!

    1. 2

      Looks like I fell into my own trap! I try to avoid using terms like light and heavy when describing software projects, because I don’t think they are sufficiently descriptive. The Devil’s Dictionary of Programming does a great job of highlighting the ambiguity of some of those terms. What I wanted to convey was that your inih library was well tested. That we are relying on an established library, rather than rolling our own. I’ll see if I can reword that phrase to better convey my intent.

    2. 6

      This is so cool. I had no idea I could extend bash this way. Will be really fun to see if this can be extended to higher level languages too.

      1. 5

        I’m glad you enjoyed the post and I’m also glad to hear you had not heard of Bash’s custom builtins before. As I mention at the end of the post I would love to see more builtins being built and improvements in how you deploy a custom builtin along with your Bash code.

        1. 2

          I’ve worked with my fair share of Bash, leaning on it rather often, and I too didn’t know about this! Thanks for sharing this info!

      2. 3

        Thanks for writing this. I spent a little time fiddling with builtins after stumbling on a module that supported writing a FUSE in Bash, which leans on a loadable builtin. It was tricky to search/find people using/discussing them, so it’s nice to have more literature on it.

        I was a little interested in looking into whether loadable builtins for some common external utilities could save much time in Nix/nixpkgs build-infra scale. I was largely dissuaded by some of the points made near the end of this IRC discussion https://gist.github.com/abathur/74e7a63b25b7bbd4a6fa9ad7e728ab70 though I’m still vaguely curious about whether there’s a safer subset that could be done this way.

        The distribution of custom Bash builtins is not well paved, limiting their utility?

        I’m not sure I’d call it well-paved, but Nix makes it simple enough to cross this bridge (and it can be paved). I think I still have the Nix expression(s) from playing with these and don’t recall it being too hard to set up once I fumbled through figuring out how it all worked.

        1. 2

          I spent a little time fiddling with builtins after stumbling on a module that supported writing a FUSE in Bash, which leans on a loadable builtin.

          Heh, was that this by any chance? It’s been a while (longer than I’d realized) since I wrote that, and I think it’s bit-rotted to incompilability at this point; perhaps I should dust it off and try to update it to work with recent bash releases. (It was originally meant as more an amusing hack than anything else, but I actually ended up getting some real use out of it once.)

          1. 2

            Yep.

            In retrospect, I should have realized the kind of person who would write a Bash FUSE and the kind of person who’d read this thread would overlap, but still–glad you spotted it. :)

            I don’t recall how much fiddling it took, but I did get it working on macOS.

            I still have the directory, so on the off chance it helps, I took a little time tonight to commit what I had and set up CI as proof. It didn’t readily work on Linux, but I didn’t really try.

            I can’t really attest to why I made specific changes–that knowledge is already lost to time.

          2. 1

            Glad you enjoyed it. We had a Bash application at my work that we were struggling to speed up and I found it very difficult to determine what was consuming most of the wall time. It would be great if Bash had a profiler that showed how much time was spent running each command. If you are interested in speeding up Bash you might also be interested in the route the Ada folks went to speed up their build process, https://github.com/AdaCore/gsh! I have not yet dived into the world of Nix, but that is an interesting idea as well.

          3. 2

            the git-config format is INI-ish and you can use it as a generic parser if you have it. There’s more about it on this serverfault answer: https://serverfault.com/a/985355/360506

            1. 6

              There are definitely some workable options in that thread, but I was hoping to demonstrate in the post how you can build a builtin and what capabilities a builtin has beyond a typical command executed by Bash.

              1. 1

                for sure, I’ve just run into the same issue and was able to get git to do the work for me, it’s certainly a less interesting solution than the one you presented

            2. 1

              This is really cool, and I think it means that one could also extend bash with Nim or other languages that can compile to the right .so structure.

              Small nit: You have -a for the name of the TOC variable in the code, but you have -t in the commentary?

              1. 2

                I would love to write a followup post trying to write builtins in other languages such as Nim or Rust

                1. 1

                  glad you enjoyed the post, and thanks for finding that error, fixed!

                2. 1

                  I really wanted to try this after seeing this post. Thank you!

                  One thing of note that I found though.

                  At least for my version (5.1.8(1)-release), instead of loadables.h, you need to include bash/builtins.h. Also instead of EXECUTION_SUCCESS, you need to use EXIT_SUCCESS.

                  1. 1

                    strange, I was testing the code on a Debian Sid box, running Bash 5.1-3, what operating system are using?

                    1. 1

                      Oh I didn’t even think about distro differences.

                      I’m currently on Gentoo. The Gentoo package for it doesn’t contain any loadables.h file.

                      Plus all of the bash header files are in /usr/include/bash and /usr/include/bash-plugins.

                      1. 1

                        Oh I didn’t even think about distro differences.

                        Ah, Debian…

                        It looks like the entire contents of loadables.h (which lives in /usr/lib/bash(!)) is

                        #include "builtins.h"
                        #include "shell.h"
                        #include "bashgetopt.h"
                        #include "common.h"
                        

                        so you can just include these headers directly. You can also do something like

                        CFLAGS := $(shell pkgconf --cflags bash) # etc...
                        

                        to try and even out some of these differences (though it wouldn’t have helped in this case).

                        1. 1

                          good idea, I reworked the code based on your suggestion and it now work on Gentoo, thanks

                        2. 1

                          Based on the suggestion from from @Forty-Bot I included the content of loadables.h directly. I also tested it on Gentoo with podman. I would love to know if the new code works for you.

                          https://github.com/lollipopman/bash-ini-builtin-blog-post/blob/main/Dockerfile.gentoo

                          1. 1

                            It works perfectly. :)

                            1. 2

                              awesome, good to hear

                    2. 0

                      ….K&R syntax?

                      1. 2

                        I had a feeling my limited C experience would expose itself somehow! Thanks for the feedback. The K&R syntax comes from me copying and pasting from Bash’s source code which often still uses the K&R syntax. I have updated the examples with the modern function definition syntax.

                        1. 3

                          I had a feeling my limited C experience would expose itself somehow!

                          In the spirit of education, a few nits:

                          char *sleep_doc[] = {"Patience please, wait for a bit!", (char *)NULL};
                          

                          NULL is just (void *)0, and

                          A pointer to void can be implicitly converted to and from any pointer to object type…

                          so it doesn’t need to be explicitly casted to char *.

                          return (EX_USAGE);
                          

                          EX_USAGE is just a plain old number, so you don’t need any parentheses. In general, macros should include their own parenthesis (unless you have a truly sadistic coding style, or are doing something funny with blocks).

                          sleep(secs);
                          

                          Note that sleep can fail, so you might want to stick this in a loop (yes, I know this was probably abbreviated for the sake of example, but you handled all the other errors).

                          struct builtin sleep_struct = {
                              "sleep",         /* Builtin name */
                              sleep_builtin,   /* Function implementing the builtin */
                              BUILTIN_ENABLED, /* Initial flags for builtin */
                              sleep_doc,       /* Array of long documentation strings. */
                              "sleep NUMBER",  /* Usage synopsis; becomes short_doc */
                              0                /* Reserved for internal use */
                          };
                          

                          Note that with C99, you can use designator expressions to initialize your structs like

                          struct builtin sleep_struct = {
                              .name      = "sleep",         /* Builtin name */
                              .function  = sleep_builtin,   /* Function implementing the builtin */
                              .flags     = BUILTIN_ENABLED, /* Initial flags for builtin */
                              .log_doc   = sleep_doc,       /* Array of long documentation strings. */
                              .short_doc = "sleep NUMBER",  /* Usage synopsis; becomes short_doc */
                          };
                          

                          but, alas, it appears that bash does not use this style.

                          (variable_context > 0) && (global_vars == false)
                          

                          Note that && has lower precedence than both > and ==, so this would typically be written

                          variable_context > 0 && global_vars == false
                          

                          In addition, I believe that although variable_context is an int, it should always be positive or zero (see e.g. execute_function). And any comparison to zero like x == 0 may be replaced by !x, so this could also be rewritten like

                          variable_context && !global_vars
                          
                          SHELL_VAR *toc_var = NULL;
                          

                          Note that if you really are writing in K&R style, then declarations should come before assignments.

                          char *sep = "_";
                          size_t sec_size = strlen(toc_var_name) + strlen(section) + strlen(sep) +
                                            1; // +1 for the NUL character
                          char *sec_var_name = malloc(sec_size);
                          char *sec_end = sec_var_name + sec_size - 1;
                          char *p = memccpy(sec_var_name, toc_var_name, '\0', sec_size);
                          if (!p) {
                            builtin_error("Unable to create section name");
                            return 0;
                          }
                          p = memccpy(p - 1, sep, '\0', sec_end - p + 2);
                          if (!p) {
                            builtin_error("Unable to create section name");
                            return 0;
                          }
                          p = memccpy(p - 1, section, '\0', sec_end - p + 2);
                          if (!p) {
                            builtin_error("Unable to create section name");
                            return 0;
                          }
                          

                          This is not very idiomatic. A more typical formulation would be

                          static const char fmt[] = "%s_%s";
                          size_t sec_size = snprintf(NULL, 0, fmt, toc_var_name, section) + 1; /* +1 for the NUL character */
                          char *sec_var_name = xmalloc(sec_size);
                          
                          snprintf(sec_var_name, sec_size, fmt, toc_var_name, section);
                          

                          Note also the use of xmalloc, which bash appears to use a lot, and which just dies if you run out of memory. And of course, C++ style comments are far too modern for bash ;)

                          ini.so: libinih.o ini.o
                          	$(CC) -o $@ $^ $(LDFLAGS)
                          
                          sleep.so: sleep.o
                          	$(CC) -o $@ $^ $(LDFLAGS)
                          

                          Note that you can reduce this to something like

                          %.so: %.o
                          	$(CC) -o $@ $^ $(LDFLAGS)
                          
                          ini.so: libinih.o
                          

                          and similarly, you can reduce

                          libinih.o: inih/ini.c
                              $(CC) $(CFLAGS) $(INIH_FLAGS) -o libinih.o inih/ini.c
                          

                          to

                          ini/ini.o: CFLAGS += $(INIH_FLAGS)
                          

                          (but of course you have to update your dependencies).

                          1. 4
                            (variable_context > 0) && (global_vars == false)
                            

                            Note that && has lower precedence than both > and ==, so this would typically be written

                            variable_context > 0 && global_vars == false

                            For the sake of readability I would keep the parentheses. Precedence rules are complicated.

                            Everybody understands the version with parentheses. Not everybody (especially novices) can correctly parse the version without parentheses.

                            1. 1

                              Precedence rules are complicated.

                              Not really. The precedence rules are specifically designed to make writing expressions like the above easy. The only time when you really need parentheses is for bitwise operations which have lower precedence than comparisons for some reason, or when you want to mix operators with the similar precedence. That is, I would not write

                              a && b || c
                              

                              instead of

                              (a && b) || c
                              

                              even though they are equivalent. However, it reduces the burden on the reader to use less parentheses where possible.

                            2. 1

                              In the spirit of education, a few nits:

                              Thanks for the careful code review…

                              A pointer to void can be implicitly converted to and from any pointer to object type… so it doesn’t need to be explicitly casted to char *.

                              fixed, I was unaware, thanks.

                              return (EX_USAGE); EX_USAGE is just a plain old number, so you don’t need any parentheses. In general, macros should include their own parenthesis (unless you have a truly sadistic coding style, or are doing something funny with blocks).

                              fixed thanks

                              sleep(secs); Note that sleep can fail, so you might want to stick this in a loop (yes, I know this was probably abbreviated for the sake of example, but you handled all the other errors).

                              I was trying to handle all errors, so thanks for spotting this, fixed.

                              Note that with C99, you can use designator expressions to initialize your structs like

                              struct builtin sleep_struct = {
                                  .name      = "sleep",         /* Builtin name */
                                  .function  = sleep_builtin,   /* Function implementing the builtin */
                                  .flags     = BUILTIN_ENABLED, /* Initial flags for builtin */
                                  .log_doc   = sleep_doc,       /* Array of long documentation strings. */
                                  .short_doc = "sleep NUMBER",  /* Usage synopsis; becomes short_doc */
                              };
                              

                              but, alas, it appears that bash does not use this style.

                              Those are nice, fixed

                              (variable_context > 0) && (global_vars == false) Note that && has lower precedence than both > and ==, so this would typically be written

                              variable_context > 0 && global_vars == false In addition, I believe that although variable_context is an int, it should always be positive or zero (see e.g. execute_function). And any comparison to zero like x == 0 may be replaced by !x, so this could also be rewritten like

                              variable_context && !global_vars

                              This last form seems really nice, so I switched to it.

                              SHELL_VAR *toc_var = NULL; Note that if you really are writing in K&R style, then declarations should come before assignments.

                              I was actually trying to write in a modern style, but I still have quite a bit to learn about what all that entails. The K&R bits slipped in through my copy pasta ignorance.

                              char *sep = "_";
                              size_t sec_size = strlen(toc_var_name) + strlen(section) + strlen(sep) +
                                                1; // +1 for the NUL character
                              char *sec_var_name = malloc(sec_size);
                              char *sec_end = sec_var_name + sec_size - 1;
                              char *p = memccpy(sec_var_name, toc_var_name, '\0', sec_size);
                              if (!p) {
                                builtin_error("Unable to create section name");
                                return 0;
                              }
                              p = memccpy(p - 1, sep, '\0', sec_end - p + 2);
                              if (!p) {
                                builtin_error("Unable to create section name");
                                return 0;
                              }
                              p = memccpy(p - 1, section, '\0', sec_end - p + 2);
                              if (!p) {
                                builtin_error("Unable to create section name");
                                return 0;
                              }
                              

                              This is not very idiomatic. A more typical formulation would be

                              static const char fmt[] = "%s_%s";
                              size_t sec_size = snprintf(NULL, 0, fmt, toc_var_name, section) + 1; /* +1 for the NUL character */
                              char *sec_var_name = xmalloc(sec_size);
                              
                              snprintf(sec_var_name, sec_size, fmt, toc_var_name, section);
                              

                              Note also the use of xmalloc, which bash appears to use a lot, and which just dies if you run out of memory. And of course, C++ style comments are far too modern for bash ;)

                              I really struggled with the right way to do string concatenation with modern C. I tried using snprintf but clang-tidy flags it with this warning:

                              • “Call to function ‘snprintf’ is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as ‘snprintf_s’ in case of C11”

                              Which led me to this blog post:

                              The posts advocates for using memccpy, which I thought was worth trying, though I found the api hard to love.

                              ini.so: libinih.o ini.o
                                $(CC) -o $@ $^ $(LDFLAGS)
                              
                              sleep.so: sleep.o
                                $(CC) -o $@ $^ $(LDFLAGS)
                              

                              Note that you can reduce this to something like

                              %.so: %.o
                                $(CC) -o $@ $^ $(LDFLAGS)
                              
                              ini.so: libinih.o
                              

                              and similarly, you can reduce

                              libinih.o: inih/ini.c
                                  $(CC) $(CFLAGS) $(INIH_FLAGS) -o libinih.o inih/ini.c
                              

                              to

                              ini/ini.o: CFLAGS += $(INIH_FLAGS)
                              (but of course you have to update your dependencies).kjj
                              

                              I still have much to learn about creating concise Makefiles, thanks for advice, fixed.

                              Thanks again for the careful feedback.

                              1. 2

                                Have you considered using asprintf for allocating strings?
                                This removes the need to compute the length, allocate memory, and copy individual strings.
                                The bash source’s autoconf script checks for asprintf so there aren’t any compatibility issues.
                                I think it would be something like the following untested code.

                                asprintf(&sec_var_name, "%s%s%s", toc_var_name, sep, section);
                                
                                1. 1

                                  I had not, though that seems like a good option, I wonder if the C standards org ever considered including them in the next revision.

                                  1. 1

                                    Other than malloc(), calloc() and realloc(), I don’t recall any standard C function [1] that will allocate memory, and I suspect that’s intentional.

                                    [1] I haven’t fully read through the C11 standard, so there may be new functions that do allocate memory as part of their function.

                                    1. 1

                                      yeah, that is evidently the rub, they mentioned that difficulty in the review of snprintf_s and related functions.

                                      1. 1

                                        I think str(n)dup is being added for C23.

                                        1. 1

                                          good point, it seems they are open to some functions which allocate memory, https://en.wikipedia.org/wiki/C2x, I wonder what the rationale is on which ones are allowed.

                                  2. 2

                                    Don’t let clang-tidy scare you off from using snprintf(). All snprintf_s() does is error out at runtime if the format string is NULL or any string passed in for a “%s” specifier is NULL and may restrict the size of the output buffer [1]. The code Forty_Bot provided is fine.

                                    [1] The size parameter for snprintf_s() is of type rsize_t, which may not exceed RSIZE_MAX (which may be smaller than SIZE_MAX, the maximum of size_t).

                                    1. 1

                                      Thanks for the additional info, I wish clang-tidy had more information on its warnings, something akin to shellcheck’s SCXXXX codes and wiki would be much appreciated.

                                2. 1

                                  Ah! Classic GNU.

                              2. -9

                                “They are the 🔋🔋 included with Bash”

                                I stopped reading here because it was so hard to tell what emoji this was that I had no idea what the sentence was saying. Stop using fucking emojis in technical writing.

                                1. 4

                                  My apologies for reducing the readability, I thought it made the analogy a bit more tangible, but I’ll pull it out.

                                  1. 18

                                    The tone of the parent comment is needlessly harsh, but I agree with the sentiment.

                                    I appreciate greatly that you accepted the feedback even though it was delivered so poorly.

                                    This is also a good post generally so please don’t let nit-picking get you down, the world is better for this writing to be out there!

                                    1. 2

                                      Thanks for the compliment, very much appreciated!