1. 30
  1. 7

    From the comments:

    Code, if given enough time, can be optimized a lot.

    1. 1

      Given enough time, it can be optimized to the point of not being needed at all, thus achieving infinite optimization.

    2. 6

      The bitwise comparison trick is pretty clever.
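
      If I read it right, each word becomes a bit mask of its letters (bit 0 for 'a' up to bit 25 for 'z'), and two words share no letters exactly when the AND of their masks is zero. A rough sketch in Haskell:

      import Data.Bits (setBit, (.&.))
      import Data.Char (ord)
      import Data.Word (Word32)

      -- bit i is set iff letter ('a' + i) occurs in the word
      wordMask :: String -> Word32
      wordMask = foldl (\m c -> setBit m (ord c - ord 'a')) 0

      -- two words are letter-disjoint iff their masks share no set bits
      disjoint :: Word32 -> Word32 -> Bool
      disjoint a b = a .&. b == 0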

      1. 11

        It’s also a great representation for decks/hands of playing cards. You can use a 64-bit integer for the hand, with one bit per card index n, where the rank is represented by n % 16 and the suit by n / 16, for example.
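
        Roughly something like this (just a sketch; the helper names and the 16-wide suit blocks are my own choice):

        import Data.Bits (setBit, testBit, popCount)
        import Data.Word (Word64)

        type Card = Int     -- card index 0..63: rank = n `mod` 16, suit = n `div` 16
        type Hand = Word64  -- bit n is set iff card n is in the hand

        card :: Int -> Int -> Card
        card suit rank = suit * 16 + rank

        addCard :: Hand -> Card -> Hand
        addCard = setBit

        hasCard :: Hand -> Card -> Bool
        hasCard = testBit

        handSize :: Hand -> Int
        handSize = popCount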

        1. 8

          A lot of chess engines do something similar using bitboards, where 64-bit values represent board state, etc.
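
          In miniature the idea looks something like this (a sketch only, not how any particular engine lays it out):

          import Data.Bits (setBit, testBit)
          import Data.Word (Word64)

          type Bitboard = Word64  -- one bit per square, a1 = bit 0 .. h8 = bit 63

          square :: Int -> Int -> Int
          square file rank = rank * 8 + file

          -- e.g. the two white rooks on their starting squares a1 and h1
          whiteRooks :: Bitboard
          whiteRooks = setBit (setBit 0 (square 0 0)) (square 7 0)

          occupied :: Bitboard -> Int -> Bool
          occupied = testBit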

        2. 4

          Packing things into integers is a very useful trick if you care about things going fast. The video talks about solutions in Rust and C outperforming Python by a lot, but I’ve used similar tricks in languages like Ruby and Erlang for the last few Advents of Code and got times on the order of the C and Rust solutions. You can do data-oriented programming in any language with a bit of knowledge of the language’s runtime and of when it’s going to allocate memory.
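
          For example, a 2D grid coordinate can live in a single machine integer instead of an allocated pair, which makes for cheap map keys. A sketch in Haskell to match the other snippets here (I actually did this sort of thing in Ruby and Erlang; the 16-bit field width is arbitrary and assumes non-negative x below 2^16):

          import Data.Bits (shiftL, shiftR, (.&.), (.|.))

          -- pack (x, y) into one Int: y in the high bits, x in the low 16 bits
          packXY :: Int -> Int -> Int
          packXY x y = (y `shiftL` 16) .|. (x .&. 0xFFFF)

          unpackXY :: Int -> (Int, Int)
          unpackXY k = (k .&. 0xFFFF, k `shiftR` 16)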

        3. 5

          The fairly naïve Haskell solution to this, using the Word32-of-bits representation, is quite satisfying, and actually performs quite well (IIRC under a second) if you can do the file processing efficiently too:

          import Data.Bits ((.&.), (.|.))
          import Data.List (tails)
          import Data.Word (Word32)

          -- each Word32 is a bit mask of the letters in a word
          findQuintuples :: [Word32] -> [(Word32, Word32, Word32, Word32, Word32)]
          findQuintuples xs = do
            (a:as) <- tails xs
            (b:bs) <- tails $ filter (\b -> a .&. b == 0) as
            let ab = a .|. b
            (c:cs) <- tails $ filter (\c -> ab .&. c == 0) bs
            let abc = ab .|. c
            (d:ds) <- tails $ filter (\d -> abc .&. d == 0) cs
            let abcd = abc .|. d
            (e:_es) <- tails $ filter (\e -> abcd .&. e == 0) ds
            pure (a,b,c,d,e)
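
          And a sketch of the file-processing side that pairs with findQuintuples above (my guesses: a lowercase word list in a file I’m calling words.txt, keep only 5-letter words with 5 distinct letters, and collapse anagrams to a single mask):

          import Data.Bits (setBit, popCount)
          import Data.Char (ord)
          import Data.Word (Word32)
          import qualified Data.Set as Set

          -- bit 0 for 'a' .. bit 25 for 'z'
          toMask :: String -> Word32
          toMask = foldl (\m c -> setBit m (ord c - ord 'a')) 0

          main :: IO ()
          main = do
            ws <- lines <$> readFile "words.txt"
            -- keep 5-letter words with 5 distinct letters; anagrams share a mask
            let masks = Set.toList (Set.fromList
                          [ m | w <- ws, length w == 5
                              , let m = toMask w, popCount m == 5 ])
            print (length (findQuintuples masks))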
          
          1. 3

            I found the comments on Python to be a bit of a mischaracterization:

            Very rarely will Python go into production …. [in this context]

            Even in this specific context of number crunching I’m not sure this is accurate…

            At the start of the video, someone had quickly submitted a Python solution that already solves the problem in 900 seconds (a huge improvement), and it doesn’t appear to use all the optimisations. There are a few open PRs that further reduce this time.

            I wonder what a like-for-like comparison would look like where they compared a Python solution with all the optimisations against an equivalent C/C++/Rust program. I’m too lazy to Google or write my own.

            1. 4

              For simple single-core code (loops, addition, and so on), Python is roughly 100× slower than compiled languages. That is also roughly the speed-up you usually get when compiling Python with Cython, Numba, or mypyc.

              But Python can be used as a wrapper around compiled languages; NumPy and OpenCV are good examples of this. NumPy (mostly) uses all available cores very efficiently and is probably faster than anything someone could code up on the spot. So as long as you can express your computation in one of those wrapper frameworks, Python is really, really fast. However, those frameworks are not applicable here.

              Then, on the other hand, with lower-level languages you can use much more low-level tricks to gain performance.

            2. 2

              Previously on lobste.rs, Java, in 30 seconds, and also an update, in 3 seconds.