A lot has happened since this was written.
For political reasons, governments are spending a whole lot more money on HPC, and the tools have gotten far better.
Spark has faded in importance as most of the stuff people were doing with it now fits on a single big node. But the Chevy Spark (the car) became very popular, which makes the Google Trends results for “spark” look misleadingly positive.
Big non-HPC datacenters have adopted InfiniBand and message-passing architectures.
HPC clusters can now easily run Kubernetes, which opens up all the non-HPC distributed applications like databases and queues.
And MPI is still there and still extremely popular for physical simulations. Most of the simulation “codes” are relatively simple programs that have been rigorously and thoroughly checked and re-checked over decades of work by brilliant scientists. They would be very hard to rewrite against a different API.
Yep, I think articles like this fail to account for the mechanisms behind how and why HPC software gets written!
I did a three-year stint writing number crunching code way back. I used a lot of MPI code, but wrote very little MPI code myself (realistically, I wrote truly non-trivial MPI code just once, and it was not the right tool for that level of abstraction so the next, better version of that code used a new, very cool library called ZeroMQ :-D).
That’s because the typical flow for what we did was:
1. Study a hard engineering problem/physical phenomenon/whatever
2. Devise a mathematical model for it (tl;dr turn that beautiful thing into a bunch of ugly PDEs, then figure out how to turn that into a discrete model)
3. Write a program that implements the discrete mathematical model
Lots of people on the software side (and, sadly, on the journalistic side) think #3 is the hard part, which requires the most effort and the most focus.
It’s not. #1 and #2 are, and #2 is actually the fundamental one – coming up with a quantitative understanding (in the form of a model) of a phenomenon, so that you can understand it and/or put it to practical use.
#3 is the (usually trivial) part that you unfortunately have to engage in primarily in order to validate your work in #2, because lots of phenomena just don’t result in a mathematical model that you can solve with pen and paper in practical cases. Sometimes, if the thing at #1 is really relevant on a wide enough scale that people are willing to pay for a program that solves it, it gets turned into a commercial endeavour and maybe gets a little more attention, but that’s not very common.
That’s where the MPI part comes in handy. There’s lots and lots of code, some of it dating back to the 1980s, that very reliably implements all the mathematical tools you need – code that e.g. solves huge sparse systems of linear equations through all sorts of ingenious methods, optimised for various types of constraints.
All that was code written by a lot of smart math/CS PhDs who worked specifically on that – devising those kinds of mathematical tools. Everyone else – physicists, chemists, engineers (in my case, of the electrical kind) – is forever in their debt. I did not want to spend time writing that kind of code. First of all, I probably couldn’t – I was certainly not inept at math, but nowhere near good enough to write the kind of code that people who’d spent years studying and advancing the field of numerical methods were writing. Second, even if I’d tried, the folks upstairs would’ve probably been pissed that someone who was supposed to be working on solving engineering problems was now writing abstract math code instead.
So there was a huge amount of code (“was” = almost 15 years ago, I don’t know what’s trending these days) that used MPI only in the sense that it used, e.g., MUMPS (the sparse solver, not the language). That was code written by teams like the one I was in, which tried to crack tough engineering problems, and wanted to spend as much time as possible working on cracking engineering problems and as little time as possible implementing our cracks.
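To make that concrete, here is a rough sketch of what “using MPI via MUMPS” looks like, loosely modelled on the tiny example that ships with the MUMPS C interface; the struct fields and constants are from memory and vary a bit between MUMPS versions, so treat this as an illustration, not a reference. The point is that the application only initialises MPI and hands the solver a matrix in coordinate form; all the parallel numerics happen inside dmumps_c().

/* Hypothetical sketch, loosely following the small example shipped with
 * MUMPS. A real code would assemble millions of entries from a mesh;
 * here it is a toy 2x2 diagonal system. */
#include <stdio.h>
#include <mpi.h>
#include "dmumps_c.h"

#define JOB_INIT       -1
#define JOB_END        -2
#define JOB_SOLVE       6        /* analyse + factorise + solve */
#define USE_COMM_WORLD -987654   /* "use MPI_COMM_WORLD" */

int main(int argc, char **argv) {
    DMUMPS_STRUC_C id;
    int myid;

    MUMPS_INT n = 2, nz = 2;
    MUMPS_INT irn[2] = {1, 2};      /* 1-based row indices */
    MUMPS_INT jcn[2] = {1, 2};      /* 1-based column indices */
    double    a[2]   = {1.0, 2.0};  /* matrix entries */
    double    rhs[2] = {1.0, 4.0};  /* right-hand side, overwritten with x */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    /* Initialise a MUMPS instance on MPI_COMM_WORLD. */
    id.comm_fortran = USE_COMM_WORLD;
    id.par = 1;          /* host process also does numerical work */
    id.sym = 0;          /* unsymmetric matrix */
    id.job = JOB_INIT;
    dmumps_c(&id);

    /* The host describes the (centralised) problem... */
    if (myid == 0) {
        id.n = n;      id.nz = nz;
        id.irn = irn;  id.jcn = jcn;
        id.a = a;      id.rhs = rhs;
    }

    /* ...and one call runs the whole distributed solve. */
    id.job = JOB_SOLVE;
    dmumps_c(&id);

    if (myid == 0)
        printf("solution: %g %g\n", rhs[0], rhs[1]);

    id.job = JOB_END;    /* release the MUMPS instance */
    dmumps_c(&id);
    MPI_Finalize();
    return 0;
}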
You do need to write some MPI boilerplate to use those – e.g. you have to preprocess and partition simulation data. That’s kind of at the interface between #2 and #3 above, and it’s pretty nutty work. The actual programming side of it is easy; it’s the math it arises out of that’s hard, and it’s often hard to check the code against the math.
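As an illustration of the kind of boilerplate meant here (heavily simplified, with made-up names and sizes): rank 0 owns the global data and deals a contiguous chunk to every rank with MPI_Scatterv. Real preprocessing partitions unstructured meshes with graph partitioners like METIS, which is where the hard math lives; the MPI calls themselves look roughly like this.

/* Hypothetical, deliberately minimal partitioning boilerplate. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int ncells = 1000;                   /* global cell count (made up) */
    int *counts = malloc(size * sizeof(int));  /* cells per rank */
    int *displs = malloc(size * sizeof(int));  /* offset of each rank's chunk */

    /* Split as evenly as possible: the first (ncells % size) ranks get
     * one extra cell. */
    for (int r = 0, offset = 0; r < size; r++) {
        counts[r] = ncells / size + (r < ncells % size ? 1 : 0);
        displs[r] = offset;
        offset += counts[r];
    }

    double *global = NULL;
    if (rank == 0) {
        /* In a real code this would be read from a mesh/input file. */
        global = malloc(ncells * sizeof(double));
        for (int i = 0; i < ncells; i++) global[i] = (double)i;
    }

    /* Every rank receives only its own slice of the data. */
    double *local = malloc(counts[rank] * sizeof(double));
    MPI_Scatterv(global, counts, displs, MPI_DOUBLE,
                 local, counts[rank], MPI_DOUBLE, 0, MPI_COMM_WORLD);

    printf("rank %d owns %d cells starting at global index %d\n",
           rank, counts[rank], displs[rank]);

    free(local); free(counts); free(displs); free(global);
    MPI_Finalize();
    return 0;
}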
It wasn’t very uncommon for some of this software to use MPI at “higher” levels of abstraction, too, e.g. to share configuration data. MPI isn’t great at that, but when you literally just have to send a couple of bools over the network, it’s good enough, and it beats having to pull in another library. It wasn’t done out of laziness about learning another API: it was done after a few generations of PhDs learned the hard way that, while a lot of number crunching code survived for 10, 20 or even 30 years, lots of networking code didn’t, so you wanted to stick with the things that were most likely to survive the next dotcom boom, lest you have to start porting the least important part of your code fifteen years from now.
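A minimal, hypothetical sketch of that kind of usage: rank 0 decides a couple of flags and MPI_Bcast pushes them to every rank, no extra library required. The flags are made up and stored as three contiguous ints so a single broadcast covers them all.

#include <mpi.h>
#include <stdio.h>

struct config {
    int use_adaptive_mesh;   /* hypothetical flags */
    int write_checkpoints;
    int verbose;
};

int main(int argc, char **argv) {
    int rank;
    struct config cfg = {0, 0, 0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* In a real code these would come from a config file or argv. */
        cfg.use_adaptive_mesh = 1;
        cfg.write_checkpoints = 1;
    }

    /* Three ints, sent in one collective call from rank 0 to everyone. */
    MPI_Bcast(&cfg, 3, MPI_INT, 0, MPI_COMM_WORLD);

    printf("rank %d: adaptive=%d checkpoints=%d verbose=%d\n",
           rank, cfg.use_adaptive_mesh, cfg.write_checkpoints, cfg.verbose);

    MPI_Finalize();
    return 0;
}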
Edit: OP is right to point this out:
They would be very hard to rewrite against a different API.
And lots of people on the software end of things underestimate just how hard it is. This isn’t an abstract “second system syndrome” kind of problem, where the main challenge is keeping ambitions in check, and as long as you can do that it’s a very simple project because you already have a reference implementation.
Dijkstra once pointed out that numerical code is the most trivial kind of code, and he was right in principle. What you quickly learn later, however, is that debugging parallel numerical code is insanely hard. You’re literally stuck trying to dissect the step-by-step progress of a program that you’ve written because you can’t do that kind of math step-by-step in the first place, which is made even harder by all sorts of numerical weirdness. Even recognizing bugs is difficult, because they often arise out of new mathematical models, which may themselves be wrong in the first place. The equivalent of “it’s never the compiler” is “it’s never the solver”: if your program insists on a gate current of 43034892 A, you tend to suspect the program, not the library that does the math.
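One concrete habit behind that rule (my own illustration, not something from the article): after a solve, check the residual of the system you actually handed to the library. If ||Ax − b|| is tiny, the solver did exactly what you asked, and the 43-million-ampere gate current is coming from your model or your assembly code. Dense and serial here purely to keep it short; real codes do the same check on distributed sparse data.

#include <math.h>
#include <stdio.h>

/* Infinity norm of r = A*x - b for a dense n-by-n row-major matrix. */
static double residual_norm(int n, const double *A,
                            const double *x, const double *b) {
    double worst = 0.0;
    for (int i = 0; i < n; i++) {
        double r = -b[i];
        for (int j = 0; j < n; j++)
            r += A[i * n + j] * x[j];
        if (fabs(r) > worst) worst = fabs(r);
    }
    return worst;
}

int main(void) {
    /* Toy 2x2 system whose exact solution is x = (1, 2). */
    double A[] = {2.0, 0.0,
                  0.0, 3.0};
    double b[] = {2.0, 6.0};
    double x[] = {1.0, 2.0};  /* pretend this came back from the solver */

    double res = residual_norm(2, A, x, b);
    printf("||Ax - b||_inf = %g -> %s\n", res,
           res < 1e-10 ? "blame the model or the assembly, not the solver"
                       : "maybe it really is the solver this time");
    return 0;
}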
Rewriting both the libraries and the programs that use them against a new API is an extraordinarily costly endeavour. It’s not that there’s no value in it, it’s just that said value is dwarfed by the cost. It looks good on paper right up until you have to do the budget, and then it’s an absolutely terrible idea.
the tools have gotten far better … HPC clusters can now easily run Kubernetes
From the perspective of a lowly user (in academia), I feel the tools have gotten more… heterogeneous, I guess? Which is overall not necessarily better, though some things are easier than they used to be.
Some clusters run the classic Slurm/LSF job submission system with shared persistent NFS storage. Other clusters are Kubernetes-only, like the NSF-funded National Research Platform. And as a third option, some labs are all-in on AWS/GCP/Azure tooling and workflows, funded by research credits or CloudBank or similar. In practice I find myself spending a lot of time repackaging and porting software to run it in a way that it wasn’t expecting to be run, which is not too fun.
On the plus side, it’s nice that these resources exist in the first place. :-) The National Research Platform in particular opens up moderate HPC capability for individual researchers even at small liberal arts colleges, for no monetary cost (just the cost of figuring out how to put your stuff into Kubernetes jobs), which is pretty neat.
Really interesting, but it could use a [2015] in the title. Now I’m curious if this has become even worse!
Orange site links
I don’t think the article has legs. It conflates things and really no-one uses MPI directly. MPI and Spark/Hadoop solve totally different problems.