1. 13

  2. 5

    Nice to see some comparison between those 3, seeing how often they are used together.

    Small note on the “sed flags for regexp addressing”: “I” is a regex modifier, whereas “e”, “p”, “w”, “d” etc. are commands, much like “s” is. Commands can be used on their own, whereas “I” cannot, so the two work quite differently.

    For instance /foo/Id means run “d” command if pattern space matches foo with the case insensitive modifier, we can see this as /foo/I{ d; }. Whereas /foo/dI is a syntax error since it would mean /foo/{ dI; } which is incorrect since the “d” command does not accept an “I” parameter (sed: -e expression #1, char 9: extra characters after command)
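A minimal sketch of the difference described above, assuming GNU sed (the I address modifier is a GNU extension). The sample input and the variable names are made up for illustration.

```shell
# Assumes GNU sed; the I address modifier is a GNU extension.
input='Foo
bar'

# I modifies the address /foo/, then d deletes the matching lines.
kept=$(printf '%s\n' "$input" | sed '/foo/Id')
printf '%s\n' "$kept"   # prints: bar

# The same thing written with an explicit command group.
kept_grouped=$(printf '%s\n' "$input" | sed '/foo/I{ d; }')

# Putting I after d is a parse error, because d takes no arguments.
rejected=no
if ! sed '/foo/dI' </dev/null 2>/dev/null; then
  rejected=yes
fi
printf 'sed rejected /foo/dI: %s\n' "$rejected"
```

Both accepted forms drop the line "Foo" because the address is matched case-insensitively.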

    1. 2


      Regarding your note, the sed manual refers to them as flags when explaining their behavior in the s command. But my post is about regexp, and these commands do not change the behavior of regexp, so I’ve now removed them.

    2. 5

      FWIW I stopped using BRE in favor of ERE because I want to remember fewer things, and I can always use:

      1. awk – ERE by default
      2. egrep – opt in to ERE with one char :) (I also sometimes use fgrep, so it’s funny that plain old grep is the thing I do NOT reach for)
      3. sed --regexp-extended on systems with GNU sed.

      I think #1 and #2 are POSIX, so all systems should have them. The third isn’t, but at this point I would just install/compile GNU sed on those systems if I encountered them.
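As a rough illustration of the list above, here is one ERE pattern passed unchanged to all three tools. The sample words and variable names are invented for the example; grep -E stands in for egrep, and sed -E is used as the short form of --regexp-extended, which assumes a sed that accepts it (GNU sed does).

```shell
text='color
colour
colr'
pattern='colou?r'   # ERE: the "u" is optional

# awk uses ERE by default.
awk_out=$(printf '%s\n' "$text" | awk -v re="$pattern" '$0 ~ re')

# grep -E is the option form of egrep.
grep_out=$(printf '%s\n' "$text" | grep -E "$pattern")

# sed -E switches sed from BRE to ERE.
sed_out=$(printf '%s\n' "$text" | sed -n -E "/$pattern/p")

printf '%s\n' "$awk_out"
```

All three select "color" and "colour" and skip "colr", so the same pattern can be carried between tools without rewriting it.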

      FWIW this is why Eggex [1] only compiles to ERE at the moment. But BRE can be added if anyone thinks it’s worth it and is motivated :)

      [1] https://www.oilshell.org/release/latest/doc/eggex.html

      1. 3

        I think #1 and #2 are POSIX, so all systems should have them

        I took a quick look. It’s not very clear, but it looks like egrep and fgrep as separate binaries/scripts/symlinks are deprecated; the -E and -F options, however, are indeed specified by POSIX.

        Looking at my current system (Ubuntu), those are just shell scripts calling exec grep -E/F "$@" right now.

        POSIX spec about fgrep/egrep [1]:

        This grep has been enhanced in an upwards-compatible way to provide the exact functionality of the historical egrep and fgrep commands as well. It was the clear intention of the standard developers to consolidate the three greps into a single command.

        The old egrep and fgrep commands are likely to be supported for many years to come as implementation extensions, allowing historical applications to operate unmodified.

        And GNU grep manual [2]:

        Direct invocation as either egrep or fgrep is deprecated, but is provided to allow historical applications that rely on them to run unmodified.

        [1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html#tag_20_55_18

        [2] https://www.gnu.org/software/grep/manual/html_node/grep-Programs.html

        1. 1

          Hm thanks for the info! It is good to know what’s POSIX, and I didn’t know they were deprecated.

          But long term I hope Oil can gain some lightweight containers / app bundle support, and people may worry less about what exact binaries are installed! I guess you can even do something like

          if ! command -v egrep >/dev/null; then
            egrep () { grep -E "$@"; }
          fi

          in any script. Also I just checked busybox and it supports egrep and fgrep. I think that those shortcuts are trivial and useful and so they’re probably here to stay.

        2. 2

          I’m strongly in the “never use BREs” camp and view the existence of POSIX BREs as a bug that is baked into too many places to fix now.

          1. 1

            I use BRE by default and use ERE only when needed. Easier for me since I use only GNU versions (no feature difference, only syntax changes for certain metacharacters).

            Regarding sed, I think there are other implementations which support the -E option as well (this used to be -r, but now -E is more portable).

            1. 2

              If you’re not concerned about sed, is there any reason to use BRE at all?

              ERE is also closer to Perl-style regexes, so I think it’s easier to remember. I definitely have to remember 2 regex dialects: ERE for awk and Perl-style for Python/C++/etc.

              I don’t want to remember 3! OK eggex makes it 3, but it compiles to ERE, so maybe one day I can forget it :)

              1. 2

                is there any reason to use BRE at all?

                Just a preference for me, because I often have to match () or {} or +| literally, so default BRE works out better in terms of escapes needed (sometimes none at all compared to ERE).
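A small sketch of the escaping trade-off mentioned above, using grep for both dialects. The sample line and variable names are made up; behaviour is as I understand it for GNU grep.

```shell
line='f(x) {a+b|c}'

# BRE: ( ) { } + | are all ordinary characters, so no escapes are needed.
bre_count=$(printf '%s\n' "$line" | grep -c 'f(x) {a+b|c}')

# ERE: the same literal text needs ( ) { + | escaped.
# A closing } with no matching unescaped { is not special, so it stays bare.
ere_count=$(printf '%s\n' "$line" | grep -E -c 'f\(x\) \{a\+b\|c}')

# ERE without escapes: (x) becomes a group, so the pattern means "fx".
# grep exits non-zero when nothing matches, hence the || true.
ere_unescaped=$(printf '%s\n' "$line" | grep -E -c 'f(x)' || true)

printf 'BRE: %s  ERE escaped: %s  ERE unescaped: %s\n' \
  "$bre_count" "$ere_count" "$ere_unescaped"
```

The BRE pattern is the literal text itself, while the ERE version needs five backslashes to say the same thing.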

                Using ERE always does help you to stay sane across multiple languages and tools.