1. 18
    1. 5

      I don’t always get opportunities to do deep dives on the software I write for my research, but when I do it tends to be pretty fun. In this case I had a simulation that was taking about 8 hours to run, which made me grumpy. I decided to see how fast I could make it and ended up making HUGE improvements. Read on to see how I did it and how much faster it was in the end!

      1. 2

        I enjoyed following your iterative process (rather than a post-mortem “here is how I ended up doing this”). Thanks for sharing.

        I would have reached for Numba before rewriting in Rust to see how far it would get me, but I don’t write Rust so this choice would be an ergonomic/practical one. Do you have any comment on taking that approach? I’d also probably reach for Cython before a full rewrite (in C or Rust or anything else).

        As an aside re: finding bottlenecks - my first pass after profiling is always to look for the Python loops that can be replaced by NumPy. Most of the time, that’s as far as I need to go in practice.
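
        As a rough illustration of the loop-to-NumPy replacement described above, here is a minimal sketch. The function names and the sum-of-squares workload are made up for the example and have nothing to do with the simulation in the post.

        ```python
        import numpy as np


        def sum_of_squares_loop(values):
            """Pure-Python version: one interpreter-level iteration per element."""
            total = 0.0
            for v in values:
                total += v * v
            return total


        def sum_of_squares_numpy(values):
            """Same calculation with the loop pushed down into NumPy."""
            arr = np.asarray(values, dtype=float)
            return float(np.dot(arr, arr))


        # Hypothetical input data, only here to show the two versions agree.
        data = [0.5 * i for i in range(1000)]
        assert np.isclose(sum_of_squares_loop(data), sum_of_squares_numpy(data))
        ```

        Whether this kind of change is enough depends on how much of the runtime the loop accounts for, which is why profiling comes first.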

        1. 1

          I enjoyed following your iterative process

          Thanks! I wanted it to be an end-to-end description of the process so that new-ish programmers can see what the process is like (measure, tweak, measure, tweak, etc). It’s also meant to show that optimization is often a case of compounding smaller optimizations rather than one BIG honking optimization.

          I would have reached for Numba before rewriting in Rust to see how far it would get me, but I don’t write Rust so this choice would be an ergonomic/practical one. Do you have any comment on taking that approach?

          Sure! I was aware of things like Numba, PyPy, etc, but I remembered that at least with PyPy there were some restrictions on which Python versions you could use, which libraries could be used with it, etc. I didn’t do a ton of research into this. I also think I mixed up Numba and Dask in my head and immediately discounted it because I only wanted to run this program on my 2-core laptop, not a cluster, so I figured the additional overhead would eat into any performance gains. I’ve actually had more than one person recommend Numba to me since I published this, so I’ll probably go back to a previous version in git and test it out to see how far that would have gotten me in comparison. I’ll post an update on the piece whenever I get around to that.

          Writing a Python extension has been on my programming bucket-list for a while, so this was just enough of an excuse to actually do it. I already knew Rust well enough to get by, so that wasn’t a big concern. I also kind of welcomed the opportunity to see how I could tweak the low-level details to squeeze out some performance here and there.
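
          For anyone who wants to try the measure, tweak, measure loop mentioned above, a minimal sketch using only the standard library might look like the following. The two `*_norms` functions and the `measure` helper are hypothetical stand-ins, not code from `fmo_analysis`.

          ```python
          import math
          import timeit


          def slow_norms(points):
              """Baseline: explicit loop with repeated append calls."""
              norms = []
              for x, y in points:
                  norms.append((x ** 2 + y ** 2) ** 0.5)
              return norms


          def faster_norms(points):
              """Tweak: list comprehension plus math.hypot."""
              return [math.hypot(x, y) for x, y in points]


          def measure(func, arg, repeats=5):
              """Return the best of several timings, in seconds, for 10 calls."""
              return min(timeit.repeat(lambda: func(arg), number=10, repeat=repeats))


          # Hypothetical workload, just to have something to time.
          points = [(float(i), float(i + 1)) for i in range(2000)]

          # Check that the tweak did not change the results before trusting the timing.
          assert all(
              math.isclose(a, b) for a, b in zip(slow_norms(points), faster_norms(points))
          )

          print(f"baseline: {measure(slow_norms, points):.4f} s")
          print(f"tweaked:  {measure(faster_norms, points):.4f} s")
          ```

          The numbers will vary by machine, so the point is the habit of re-measuring after every change rather than any particular speedup.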

    2. 2

      Did you ever consider using Julia? I’ve never used the language for something serious, but from the talks I listened to this sounds like a prime example for it.

      From my understanding Julia is especially well suited if you have hot loops, meaning many small calculations happening over and over again. Your core loop takes milliseconds, but you execute it for hours.

      It would be great if some experienced Julia programmer could share their opinion here.

      Funnily enough, I’ve written a Rust extension for Python myself the other day. Like you, mainly because I just wanted to, and it was certainly an enjoyable experience :)

      1. 1

        I’m aware of Julia but I actually didn’t consider Julia at all. The reasons aren’t technical. I’ve heard good things about Julia, especially the differential equation libraries.

        The context here is that I have this program fmo_analysis which does the number crunching, loading input files, saving results, etc, and is also part of a CLI application, but I also have a bunch of scripts/simulations that I’ve written that use fmo_analysis as a library. These simulations are little one-off things I did to test out various ideas, prove whether certain physical effects matter, etc.

        I don’t know the Julia ecosystem at all, so I would need to (1) learn Julia, (2) rewrite fmo_analysis, then (3) rewrite all of the little scripts and simulations that use fmo_analysis. Instead what I chose to do was rewrite just one piece of fmo_analysis (the actual number crunching) in Rust, a language I already knew. That saves me a bunch of time, which is nice considering I want nothing more these days than to finish my PhD as fast as possible :)