1. 9
    1. 4

      std::regex is notoriously slow and should not be used: https://stackoverflow.com/questions/70583395/why-is-stdregex-notoriously-much-slower-than-other-regular-expression-librarie

      The author would have been better off depending on PCRE2.

      1. 2

        Or RE2. It’s not the fastest, non-movable objects are annoying, and their use of an in-place Compile instead of a builder is infuriating to me, but it’s reasonably easy to use, and the lack of backtracking makes it very attractive as a default.

        1. 1

          You’re probably alluding to this with “lack of backtracking”, but I’ll point out that the most notable property of RE2 is that its matching time is linear in both the length of the pattern and the length of the string (O(m*n)), making it much less vulnerable to DoS attacks if one or both of them is user-generated input. Russ Cox has a series of articles on this.

          1. 1

            That is exactly what I was alluding to, yes.

      2. 3

        This really feels like a mixture of non-idiomatic C and non-idiomatic C++ and it was a bit of a pain to follow. I ended up just reading the example program to understand it a bit better. I do realize it’s more in the “cool hack” category of code snippets 🙂.

        Also, I suggest adding the “c” tag.

        1. 1

          Can you elaborate on why you consider the C part unidiomatic? On the C++ side, I’m not huge on overriding the global operator new and operator delete, but I couldn’t find anything wrong on the C side.

          1. 1

            Maybe they are small things. Allocating 2 MB for the arena with:

            int   cap = 1<<21;
            

            Also lots of ptrdiff_t everywhere.
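
To make the O(m*n) point about RE2 from earlier in the thread concrete, here is a minimal C++ sketch. It is a toy, not RE2's implementation: it only supports literal characters with an optional `?` suffix, and the names `Item`, `parse`, `backtrack`, and `simulate` are hypothetical. It compares a backtracking matcher with one that tracks the set of reachable pattern positions, counting the steps each takes.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// One pattern element: a literal character, optionally followed by '?'.
struct Item { char c; bool optional; };

// Parse a toy pattern such as "a?a?aa" into items (hypothetical helper).
std::vector<Item> parse(const std::string& pat) {
    std::vector<Item> items;
    for (std::size_t i = 0; i < pat.size(); ++i) {
        bool opt = i + 1 < pat.size() && pat[i + 1] == '?';
        items.push_back({pat[i], opt});
        if (opt) ++i;  // skip the '?'
    }
    return items;
}

// Backtracking: try to consume a character first, then try skipping an
// optional item. Failing branches are re-explored, so steps can blow up.
bool backtrack(const std::vector<Item>& p, std::size_t i,
               const std::string& s, std::size_t j, long& steps) {
    ++steps;
    if (i == p.size()) return j == s.size();
    bool hit = j < s.size() && s[j] == p[i].c;
    if (hit && backtrack(p, i + 1, s, j + 1, steps)) return true;
    return p[i].optional && backtrack(p, i + 1, s, j, steps);
}

// State-set simulation: keep every reachable pattern position at once.
// Each input character costs one pass over the pattern, so O(m*n) total.
bool simulate(const std::vector<Item>& p, const std::string& s, long& steps) {
    const std::size_t k = p.size();
    std::vector<char> cur(k + 1, 0), next(k + 1, 0);
    // Positions reachable by skipping optional items without consuming input.
    auto close = [&](std::vector<char>& set) {
        for (std::size_t i = 0; i < k; ++i)
            if (set[i] && p[i].optional) set[i + 1] = 1;
    };
    cur[0] = 1;
    close(cur);
    for (char c : s) {
        std::fill(next.begin(), next.end(), 0);
        for (std::size_t i = 0; i < k; ++i) {
            ++steps;
            if (cur[i] && p[i].c == c) next[i + 1] = 1;
        }
        close(next);
        cur.swap(next);
    }
    return cur[k] != 0;
}
```

For the pattern `a?` repeated n times followed by n copies of `a`, matched against n copies of `a`, both functions report a match, but `backtrack` needs a number of steps that grows exponentially with n, while `simulate` needs exactly (number of pattern items) × (input length) steps. That bounded cost is what makes a non-backtracking engine safer when the pattern or the input is user-supplied.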