1. 3

    I’ve had to write a couple over the years. I enjoyed it immensely. I think the other thing to keep in mind is that even when you’re not designing or implementing an algorithm, good engineering occasionally requires just knowing tradeoffs for correct algorithm selection.

    So over the years, here’s a few datastructures and algorithms I’ve needed to use while building software.

    1. KMP search. I also had unique requirements to work on a stream rather than a string so I had to re-implement. Why? I needed to search for something in a HTTP response without buffering and this use case was not well supported by any libs I could find. I ended up borrowing a java implementation from twitter.
    2. Good old binary search. Why? Needed to find a good compression ratio for an image that didn’t fall below a certain visual similarity score.
    3. trie: Didn’t implement this one myself, but I needed to rapidly match IP blocks and a simple hash or list wouldn’t have worked. I just used a library called cidranger, but I selected that library because I knew I needed that datastructure.
    4. Tree traversal. It happens sometimes in surprising contexts (dom manipulation, certain types of reporting, etc)

    To be clear my position is that software engineering interviews focus too much on CS. Just like Physics and mechanical engineering are separate but related disciplines, CS and Software engineering are not the same thing.

    1. 1

      Good old binary search

      And even that is pretty easy to get wrong.

      1. 1

        I also had unique requirements to work on a stream rather than a string so I had to re-implement. Why? I needed to search for something in a HTTP response without buffering and this use case was not well supported by any libs I could find. I ended up borrowing a java implementation from twitter.

        Note that, unless the stream is somehow interactive (that is, “wait until we have N bytes to read (or there’s EOF)” is unacceptable and we need to find match in what we have right now), this has a simpler solution that is both faster (with a fast memmem() — e.g. from glibc) and lighter on memory than KMP (unless len(pattern) <= 256 and you use 1-byte values to encode the values of prefix function — but then the difference in memory is at most 256 bytes). Here’s the pseudocode:

        def contains(stream, pattern):
            prefix = ''
            while True:
                chunk = stream.read(len(pattern))
                # assume 'memmem()' which is O(N+M) is used for 'in' operation
                if pattern in (prefix + chunk):  
                    return True
                if len(chunk) < len(pattern):
                    # EOF
                    return False
                prefix = chunk
        
      1. 3

        I wonder what the applications of this “least fixed points of monotone functions by bottom-up iteration” thing, outside of compilers/formal methods, are.

        1. 2

          https://shdown.github.io — no frills :)

          1. 1

            Looks handy. I saw this supports prerelease builds of Lua 5.4; did you have any trouble supporting that, or was it pretty straightforward?

            1. 2
              1. 1

                lol

                1. 1

                  Holy smokes; that’s awful. Is it disabled by default at least?

                  1. 1

                    As far as I know, yes.

                    1. 1

                      Here’s hoping it either gets removed before the full release, or never gets enabled ever by anyone compiling it.