1. 43

  2. 35

    I have very mixed feelings about this article. Some parts I agree with:

    It’s you, the software engineering community, that is responsible for tools like C++ that look as if they were designed for shooting yourself in the foot.

    There is very little impetus to build tools that are tolerant of non-expert programmers (golang being maybe the most famous semi-recent counterexample) without devolving entirely into simple toys (say, Scratch).

    Some of you have helped with a first round of code cleanup, which I think is the most constructive attitude you can adopt in the short term. But this is not a sustainable approach for the future.

    […] always keeping in mind that scientists are not software engineers, and have neither the time nor the motivation to become software engineers.

    Yep, software engineers pitching in to clean up academic messes after the fact definitely doesn’t work. One of the issues I’ve run into when doing this is that you can totally screw up a refactor in ways that aren’t immediately obvious. Further, honestly, a lot of “best practices” can really hamper the explorative liminality required to do research spikes and feel out a problem.

    But then, there’s a lot of disagreement I have too:

    The scientists who wrote this horrible code most probably had no training in software engineering, and no funding to hire software engineers.

    We expect people doing serious science to have a basic grasp of mathematics and statistics. When they don’t, we make fun of them (that is, when the peer review system works properly). If you’re doing computational models, you damned well should understand how to use your tools properly. No experimental physicist worth a damn that I’ve known couldn’t solder decently well; likewise, nobody doing science that relies on computers should be excused from knowing how to program competently and safely.

    clear message saying “Unless you are willing to train for many years to become a software engineer yourself, this tool is not for you.”

    Where’s the clear messaging in the academic papers saying “Yo, this is something that I can only reproduce on my Cray Roflcluster with the UT Stampede fork of Python 1.337”? Where’re the warnings “Our university PR department once again misrepresented our research in order to keep sucking at the teat of NSF and donors, please don’t discuss this incredibly subtle work you’re probably gonna misrepresent.” Where’s the disclaimer for “This source code was started 40 years ago in F77 and lugged around by the PI, who is now tenured and doesn’t bother to explain things to his lab anymore because they’re smart and should just get it, and it has been manhandled badly by generations of students who have been under constant pressure to publish results they can’t reproduce using techniques they don’t understand on code they don’t have the freedom to change.”?

    The core of that research is building and applying the model implemented by the code; the code itself is merely a means to this end.

    This callous disregard for the artifact that other people will use is alarming. Most folks aren’t going to look at your paper with PDEs and sagely scratch their chins and make policy decisions–they’re going to run your models and try to do something with the results. I don’t think it is reasonable to disavow responsibility for how the work is going to be used in the future if you also rely on tax dollars and/or bloated student tuition to fund your adventures.

    There’s something deeply wrong with academic research and computing, and this submission just struck me as an attempt to divert attention away from it by harnessing the techlash.

    1. 18

      I’m someone who’s done their (extremely) fair share of programming work in academia, but outside a CS department: I can guarantee that anyone insisting that the solution was simple and that it’s just “they should have hired real software engineers” has had zero exposure to “real software engineers” trying to write simulation software. Or if they had, it was either in exceptional circumstances, or they didn’t actually pay attention to what happens there.

      (This is no different to CS, by the way. The reason why you can’t just hire software engineers and expect they’ll be able to understand magnetohydrodynamics (or epidemiology, or whatever else) is the same reason why you can’t just hire electrical engineers or mechanical engineers and expect them to write a Redis clone worth a damn in less than two years – let alone something better.)

      As Dijkstra once remarked, the easiest machine applications are the technical/scientific computations. The programming behind a surprising proportion of simulation software is trivial. By the time they’re done with their freshman year, all CS students know enough programming to write a pretty convincing and useful SPICE clone, for example. (Edit: just to be clear, I’m not talking out of my ass here. For two years I’ve picked the short straw and ended up herding CS first-years through it, and I know from experience that two first-year students can code a basic SPICE clone in a week, most of which is spent on the parser). I haven’t read it in detail but from glossing over it, I think none of the techniques, data structures, algorithms and tools significantly exceed a modest second-year CS/Comp Eng curriculum.

      Trouble is, most of the domain-specific knowledge required to understand and implement these models far exceeds a CS/Comp Eng curriculum. You think epidemiologists who learned C++ on their own and coded by themselves for 10 years write bad simulation code? Wait ’til you see what software engineers who have had zero exposure to epidemiology can come up with.

      “Just enough Python” to write a simple MHD flow simulator is something you can learn in a few afternoons. Just enough electromagnetism to understand how to do that is a four-semester course, and the number of people who can teach themselves how to do that is very low. I know a few and I know for a fact that most companies, let alone public universities, can’t afford their services.

      This isn’t some scholastic exercise. No one hands you a two-page description of an algorithm for simulating how the flu spreads and says hey, can you please turn this mess of pseudocode into C++, I’m not that good at C++ myself. The luckiest case – which is how most commercial-grade simulation software gets written – is that you get an annotated paper and a Matlab implementation from whoever developed the model.

      (Edit: if you’re lucky, and you’re not always lucky, that person is not an asshole. But if you think translating Matlab into C++ isn’t fun, wait until you have to translate 4,000 lines of uncommented Matlab from someone who doesn’t like talking to software engineers because they’re not real engineers).

      However, by the time that happens, the innovation has already happened (i.e. the model has been developed) months before, sometimes years. If you are expected to produce original results – i.e. if you do research – you don’t get a paper by someone else and a Matlab implementation. You get a stack of 80 or so (to begin with) papers on – I’m guessing, in this case, epidemiology, biochemistry, stochastic processes and public health policies – and you’re expected to come up with something better out of them (and, of course, write the code). Yeah, I’m basically describing how you get a PhD.

      1. 7

        I can guarantee that anyone insisting that the solution was simple and that it’s just “they should have hired real software engineers” has had zero exposure to “real software engineers” trying to write simulation software.

        I totally agree with this. That’s also why my argument is “researchers need to learn to write better code” and not “we should hire software engineers to build their code for them”.

      2. 13

        …no funding to hire software engineers.

        Speaking as a grant-funded software engineer working in an academic research lab, it’s amazing what you can get money for if your PI cares about it and actually writes it into grant applications.

        My suspicion, and I have zero tangible evidence for this, just a handful of anecdotal experiences, is that labs outside of computer science are hesitant to hire software engineers. It’s better for the PI’s career to bring in a couple more post-docs or PhD students and expect them to magically become software engineers than to hire a “real” one.

        Another interesting problem, at least where I work, is that the pay scale for “software engineer” is below market. I’m some kind of “scientist” on paper because that was the only way they could pay the position enough to attract someone out of industry.

        1. 5

          Speaking as a grant-funded software engineer working in an academic research lab, it’s amazing what you can get money for if your PI cares about it and actually writes it into grant applications.

          Oh, totally agree. I’ve made rent a few times by being a consulting software engineer, and it’s always been a pleasure to work with those PIs. Unfortunately, a lot of PIs just frankly seem to have priorities elsewhere.

          I’ve heard also that in the US there’s less of a tradition around that, whereas European institutions are better about it. Am unsure about this though.

          Also, how to write code that can survive the introduction of tired grad students or energetic undergrads deserves its own consideration.

          1. 6

            Yeah, “Research Software Engineering” is a pretty big thing in the UK at least… https://society-rse.org.

            1. 11

              It is (I’m an RSE in Oxford). Within the University’s bizarre economic rituals, it costs a researcher as much to put (the equivalent of) one of us on a project full time (though what they usually get is that time shared across a team of people with various software engineering skills and experiences) as it would to hire a postdoc research assistant, and sometimes less. Of course they only do that if they know that they have a problem we can help with, and that we exist.

              Our problems at the moment are mostly that people are finding out about us faster than we’re growing our capability to help them. I was on a call today for a project that we couldn’t start before January at the earliest, which is often OK in the usual run of research funding rounds, but less OK for spin-out and other commercial projects. We have broken the emergency glass for scheduling Covid-19 related projects by preempting other work: I’ve been on one since March, and another was literally a code review and plan for improvement, like the one the linked project got after it was shared. We run about 3 surgery sessions a week helping researchers understand where to take their software projects; again, that only lands with people who know to ask. But if we told more people they could ask, we’d be swamped.

              While we’re all wildly in agreement that this project got a lot of unfair context-free hate from the webshits who would gladly disrupt epidemiology, it’s almost certainly the case that a bunch of astrophysicists somewhere are glad the programming community is looking the other way for a bit.

              1. 3

                I’m an RSE in Oxford

                A lot of UK universities don’t have an RSE career track (I’ve been helping work to get one created at Cambridge). It’s quite difficult to bootstrap. Most academics are funded out of grants. The small subset with tenure are funded by the department taking a cut of all grants to maintain a buffer for when they’re not funded on specific ones. Postdocs are all on fixed-term contracts. This is just about okay if you regard postdoc as a position like an extended internship, which should lead to a (tenured) faculty position but increasingly it’s treated as a long-term career path. RSE, in contrast, does not even have the pretence that it’s a stepping stone to a faculty job. A sustainable RSE position needs a career path, which means you need a mechanism for funding a pool of RSEs between grants (note: universities often have this for lab technicians).

                The secondary problem is the salary. We (Microsoft Research Cambridge) pay starting RSEs (straight out of university) more than the UK academic salary scale pays experienced postdocs or lecturers[1]. RSEs generally expect to earn a salary that is comparable to a software engineer and that’s very hard in a university setting where the head of department will be paid less than an experienced software engineer. The last academic project I was on had a few software engineers being paid as part-time postdocs, so that they had time for consulting in the remaining time (a few others we got as contractors, but that was via DARPA money that is a bit more flexible).

                The composition of these two is a killer. You need people who are paid more than most academics, who you are paying out of a central pool that’s covered by overhead. You can pay them much less than an industry salary but then you can’t hire experienced ones and you get a lot of turnover.

                [1] Note for Americans: Lecturer in British academia is equivalent to somewhere between assistant and associate professor: tenured, but junior.

                1. 2

                  Postdocs are all on fixed-term contracts.

                  Happy to talk more: what we’ve done is set up a Service Research Facility, which is basically a budget code that researchers can charge grant money against. So they “put a postdoc” on their grant application, then give us the money and get that many FTEs of our time. It also means that we can easily take on commercial consultancy, because you multiply the day rate by the full economic cost factor and charge that to the SRF. A downside is that we have to demonstrate that the SRF is committed to N*FTE salaries at the beginning of each budget year to get our salaries covered by the paymasters (in our case, the CS department), making it harder to be flexible about allocation and side work like software surgeries and teaching. On the plus side, it gives us a way to demonstrate the value of having RSEs while we work to put those longer-term streams in place.

                  The secondary problem is the salary […] so that they had time for consulting

                  You’re not wrong :). I started by topping mine up with external commercial consultancy (I’ve been in software engineering much longer than I’ve been in RSE), but managed to get up to a senior postdoc grade so that became unnecessary. I’m still on half what I’ve made elsewhere, of course, but it’s a livable salary.

                  Universities and adjacent institutions (Diamond Light Source, UKAEA, and Met Office/ECMWF all pay more, but not “competitive” more) aren’t going to be comparable any time soon to randomly-selected public companies or VC-funded startups in terms of “the package”, and in fact I’d hate to think what changes would be made in the current political climate to achieve that goal. That means being an RSE has to have non-monetary incentives that being at a FAANG doesn’t give: I’m here for the intellectual stimulation, not for the most dollars per semicolon.

                  A sustainable RSE position needs a career path, which means you need a mechanism for funding a pool of RSEs between grants (note: universities often have this for lab technicians).

                  I’m starting a DPhil (same meaning as PhD, different wording because Oxford) on exactly this topic in October: eliciting the value of RSEs and providing context for hiring, training, evaluating and progressing RSEs. I’ve found in conversations and panel discussions at venues like the RSE conference that some people have a “snobbish” attitude to the comparison with technicians, BTW. I’m not saying it’s accurate or fair, but they see making the software for research as a more academically-valid pursuit than running the machines for research.

                  1. 2

                    Thanks, that’s very informative. Let me know if you’re in Cambridge (and pubs are allowed to open again) - I’ll introduce you to some of our RSEs.

                2. 2

                  Seeing as you seem to have experience in the field, from a very high level view, do the complaints about this project seem valid or not? I understand that one could only make an educated guess considering this is 15K lines, hotly debated, and also a developing situation (the politics… Whoo boy!), but I would love to have someone with experience calibrate the needle on the outrage-o-meter somewhat.

                  1. 1

                    I haven’t examined the code, which is perhaps a lesson in itself.

                    1. 1

                      As a baseline I put the code through clang’s scan-build and it found 8 code flows where uninitialized variables may affect the model early in the run. It’s possible that not all of them can realistically be triggered (it doesn’t know all dependencies between pieces of external data), but it’s not a great sign.

                      Among other things, that’s a reasonable explanation for why people report that even with well-defined random seeds they see different results, and I wouldn’t count “uninitialized variables” as a source of uniform randomness, so I’d be wary of just averaging it out.
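
                      For readers who haven’t used it: scan-build wraps a normal build command (e.g. `scan-build make`) and traces code paths symbolically. Below is a minimal made-up sketch of the kind of flow it flags; it is not taken from the covid-sim sources.

                      ```c++
                      // Hypothetical example of the kind of flow the clang static analyzer flags;
                      // it is NOT from the actual covid-sim code.
                      #include <cstdio>
                      #include <cstdlib>

                      int main(int argc, char** argv) {
                          double seed_infections;                    // no default value
                          if (argc > 1)
                              seed_infections = std::atof(argv[1]);  // only assigned on this branch
                          // With no argument, the line below reads an uninitialized value;
                          // scan-build reports the read as a garbage or undefined value.
                          double scaled = seed_infections * 1.5;
                          std::printf("%f\n", scaled);
                          return 0;
                      }
                      ```

                      Whether a particular flagged flow can actually happen depends on the external data, as noted above, but this is the shape of the reports.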

                  2. 2

                    If you cannot pay somebody much, give them a fancy title, e.g., “Research Software Engineering”. It’s purely an HR ploy.

              2. 6

                It’s you, the software engineering community, that is responsible for tools like C++ that look as if they were designed for shooting yourself in the foot.

                There is very little impetus to build tools that are tolerant of non-expert programmers (golang being maybe the most famous semi-recent counterexample) without devolving entirely into simple toys (say, Scratch).

                I actually agree with the author on this.

                Let’s not even pretend that the only alternative to the absolutely mind-boggling engineering and design shit show that is C++ is “devolving entirely into simple toys”.

                1. 1

                  Rust?

                  1. 1

                    One option.

                2. 4

                  I think you put it very well. Look: if there’s a hierarchy of importance, I’m happy to put science far ahead of software development. But the fact remains: when it comes to producing scientific results using software, software developers do know a thing or two about how easy it is to fool yourself, and we are rightly horrified at someone handwaving away the lack of tests and input validation with “a non-programmer expert will look at this code and make sure not to hold it wrong”.

                  I guess in that sense it’s not much different than the rampant misuse of statistics in science, it’s just that software misuse might be currently flying a little below the radar.

                  1. 4

                    Exactly. It is the job of the researcher to be aware of the limits of his own ability to implement his model with a particular tool. To implement something badly, then make the grandiose claim that the results of said badly implemented model should inform decisions that affect millions, is his own fault.

                    You can’t blame a screwdriver ‘community’ if you use it badly and poke yourself in the eye. Not even the lack of a “do not poke eye with screwdriver” warning label counts as a failure.

                    1. 1

                      This plays out in an interesting way at Google’s Research division. Whatever else you might think about the company, Google software engineers (SWEs) are generally pretty decent. Many of them are interested in ML research projects because they’re all the rage these days. The research teams, of course, just want to do research. But they can get professional SWEs to build their tools for them by letting them feel like they’re part of cutting edge research. So they end up with a mix of early-career SWEs building tools that aren’t inherently all that interesting or challenging but get used to do very interesting and impactful research work and a few more experienced SWEs who want to make the transition into doing research.

                    2. 12

                      It’s not our fault that software is looked down upon in academic circles as work for unskilled, undertrained students. You can build the safest language in the world and still have a sufficiently foolish fool create a disaster. The issue isn’t with software, it’s with culture.

                      And no tooling can solve a culture problem.

                      1. 3

                        It’s not our fault that software is looked down upon in academic circles

                        Considering the dismal state of commercial and open source software, I’d say it kind of is. Too much widely-used snake oil and superstition.

                        1. 9

                          That’s not the reason it’s looked down upon. It’s looked down upon because of the hierarchical nature of academia.

                          You can find plenty of reasons to flagellate yourself. Go ahead, I won’t stop you. But that doesn’t address the hierarchical issue at play: academics seeing themselves as “being above” such dirty work.

                        2. 1

                          It’s not our fault that software is looked down upon in academic circles as work for unskilled, undertrained students.

                          Admittedly, the feeling is quite mutual on the software industry side.

                        3. 4

                          A clear message saying “Unless you are willing to train for many years to become a software engineer yourself, this tool is not for you.”

                        Is there a single well-established general purpose programming language that we wouldn’t, in good conscience, have to slap that warning sticker on? I mean, a relatively small amount of due diligence should have revealed that C++ isn’t, by far, the obvious first choice here (Python, Julia, and Go seem far more sensible), but I’d argue even the most “novice friendly” first “real” programming language takes 3-5 years of solid effort to master to the point of being able to reliably write sensible, well reasoned, well structured, and well tested code. It’s just not a smaller discipline than that, but the author seems to somehow think “we” can possibly reduce the minimum activation energy for one niche consumer group with highly specialized needs and no budget.

                          1. 5

                            Python, Julia, and Go seem far more sensible

                            The model evolved over 13 years - the initial coding predates both Go and Julia’s first releases by 5 years.

                            1. 2

                            Ok, so Go and Julia weren’t options, but Python + NumPy + SciPy certainly were in that time frame.

                              1. 4

                              NumPy has really only existed since late 2006 (Numeric and numarray existed before the merge into NumPy). SciPy started in 2001, and NumPy became a part of that ecosystem in late 2006. Honestly, the Python scientific environment has changed a hell of a lot during the last ten years.

                                And also, you have to reflect on which infrastructure you had available at that time. C++ was maybe not the sanest choice but back in those times if you needed something that could run fast and handle that amount of RAM/processing power you didn’t have a lot of choices.

                                1. 1

                                I went with a back-of-the-envelope calculation based on the comment made; 2020 minus 13 years seemingly puts you within the era of both SciPy and NumPy. Regardless, the point remains: if C++ was the right choice at the time, then complaining about its footguns – never mind still using them – 13 years later is nonsensical; learn the tool, then use the tool, not the other way round.

                                  1. 3

                                  Sometimes the “if it ain’t broke, don’t fix it” mentality applies a lot to code produced during research and to its reuse. As said in another reply, it also boils down to resources/time management/infrastructure. Big rewrites are time-consuming and don’t give you anything to publish, so if you can keep using the same code that’s been lying around for a while, why not. It ain’t pretty, but I have the feeling that most of the time, that’s it.

                                  Sometimes you get code out for a paper and that’s it. It will lie around until someone needs to build on top of it. R libraries are a really good example of that: look at the number of them that die the moment the PhD student leaves or the researcher moves on to something else.

                                  The tooling for scientific programming is way better now, but legacy code is, and will remain, everywhere. Moreover, as already said, that code doesn’t rely on external libraries, has stood the test of time, and can scale correctly with OpenMP. You cannot say that about a lot of code I have seen around academia in recent years, where library changes broke programs all the time.

                                  Most of the time, you only really have the time to learn the subset of the language you need to get the thing done. There is always that funny question in my mind: Is it easier for a scientist in X field to learn to program or for a software dev to learn X field? Honestly I don’t know, it depends. But most of the time, it is not yet common to have the two skillsets at the same level.

                                    1. 2

                                      Is it easier for a scientist in X field to learn to program or for a software dev to learn X field? Honestly I don’t know, it depends. But most of the time, it is not yet common to have the two skillsets at the same level.

                                      Should it be? I’d argue no.

                                    From experience I’d say it takes roughly the same amount of time to reach an equivalent breadth and depth of competency in any complex field, and “programming” isn’t a single discipline any more than “science” is, so you can’t expect the two skillsets to coexist to the same degree unless the individual has had twice the time to dedicate to learning.

                                    Edit: BTW, I completely agree on the “if it ain’t broke, don’t fix it” point, and I’m not saying anyone should have rewritten this particular bit of spaghetti: if it’s producing meaningful models then it could have been programmed in brainfuck for all I care. BUT, if someone put 13 years of effort out in public view for the first time and it was written in brainfuck, I’d justifiably point out to them that maybe adopting more of an industry-best-practices language and development approach would make their lives both a lot easier and more fruitful.

                            2. 4

                            As the sibling comment already points out, the development of the model predates Go and Julia by a few years, and even at that time the state of Python or R for scientific computation was not there yet. Porting that kind of program to another programming language, for whatever reason, is a huge amount of work that most of the time you don’t have in academia, so you are stuck with whatever the first researcher/PhD student has already put a massive amount of work into on their own and validated the results of.

                            I had to port a kind of hand-tuned rule engine from Matlab to Python, written by a researcher who had since left the lab, and it was tedious to say the least (and you cannot quickly check what you have done, because it ran on satellite imagery of the whole world at 100m resolution, split into multiple strata with specific rules for each). You had to look at the 22 files, diff them to find the common parts, extract those, and so on, to go from tens of thousands of lines of Matlab to a few hundred lines of Python (and we only ported half of the engine because we only needed that part; if we ever need the other part, someone else will have to do the port and merge it with the existing Python program).

                            I am no C++ expert (I can barely read it, from the habit of looking at source code of interest), but what I mainly see in the epidemic simulation code on GitHub is that they use almost no external libraries. I can’t offer an opinion about good or bad practices, but I can read it. If you want to do something similar in Python/R/Julia/Go, you’ll have to use some to a lot of external libraries just to reach the speed and data volume needed, and it will be C++/C/Fortran behind them. Honestly, the code is more than a lot of what you will usually get in academia. More often than not, you just get a bunch of papers, some supplementary files linking to other papers, and sometimes some old code lying around.

                            Meanwhile, C++ is everywhere in scientific programs, either in front of you or behind. I mean, R and C++ are so intertwined it’s crazy if you look at it: Rcpp, or stuff like Template Model Builder. For example, in a field I know a bit better, like Remote Sensing, you will see heavy use of GDAL, OTB, GEOS, and PROJ, to name a few, and all are written in C++ even if they provide APIs to various extents (mostly thanks to SWIG). If you want to do high-level and prototype stuff, Python/Julia/R/YouNameIt works only because of those low-level libraries (and I’m missing a lot of them in between: GSL, BLAS, LAPACK, Eigen, etc., being a mixed bag of C++/C/Fortran). Honestly, I keep asking myself whether I should skip what I want to learn (Rust/Go/Clojure/Whatever) and just focus on learning C++ to add it to my skillset. I am not a software dev, just a researcher doing mostly programming. I taught myself programming and it got me jobs in academia (I was not a good student :)).

                              1. 1

                              Ok, well I am a software dev, and unless what you’re developing IS low-level mathematics and numerical computing libraries (as opposed to applications and models that use those libraries), C++ is very obviously the wrong tool for you, especially if you’re going to write terrible C++.

                                Yes, of course, numerical and scientific computing leverages a lot of existing FORTRAN and C (and some C++ that tends to primarily be C)… that’s what I leverage all the time in my work in visual effects; but once you’ve got those underlying tools you can wrap them up in a well-formed FFI and actually using them becomes infinitely easier.

                              So my guess is that you can use every one of those libraries you’ve mentioned in Python / Julia at roughly the same speed you’ll get out of them in C++, but you’ll be doing that through an interface that’s designed to relieve you of all the incredibly numerous and well-known footguns of C++.

                                The whole thesis of the open letter is that software developers should give scientists better tools without lots of footguns. We have! They’re Python + NumPy + SciPy, as a more mature option, with Julia being less mature but more especially designed for science. On the purely mathematical end of the spectrum there’s Idris and Coq, derived from Haskell… language developers have already tried to give you the tools you need. Apparently you’re either unaware of that as a community or you’re refusing to use them, then complaining when people with actual software industry experience look at you like you’ve pooped the bed.

                                1. 2

                                ----- EDIT -----

                                I have found your blog by looking around, and I think the post about Hugo is partly on point about how to find tools (and libraries and programming languages). You have two situations: inherit already-done work in X or Y and deal with it (so you have to learn, on the spot, the subset you need to make it work), or begin from scratch. You have your list for the Minimal Viable Tool and look first to co-workers and to what is used in the field. Based on your constraints, you end up with a choice. Python + NumPy + SciPy is not always what you want or need (in fact, only part of the time). Need to scale up? You have to learn how to use Dask, Xarray, Numba, etc., based on what you think you will need. Need stats tools? You get stuck with R (yes, you can call R libraries from Python, but it clearly will not be your first reflex to look for that). Need to go faster, bigger, stronger? Hello, C++. I don’t know a thing about the Java ecosystem, but I assume it implies the same digging until a choice is made.

                                ----- END OF EDIT -----

                                  The whole thesis of the open letter is that software developers should give scientists better tools without lots of footguns. We have! They’re Python + NumPy + SciPy, as a more mature option, with Julia being less mature but more especially designed for science. On the purely mathematical end of the spectrum there’s Idris and Coq, derived from Haskell… language developers have already tried to give you the tools you need. Apparently you’re either unaware of that as a community or you’re refusing to use them, then complaining when people with actual software industry experience look at you like you’ve pooped the bed.

                                First of all, looking down at scientists as one unified community is flawed. It depends so much on the field, the field’s exposure to software, and its need for it. Python + NumPy + SciPy are just part of the equation, and you have to learn those skills somewhere to begin with. I have known a few fantastic students and researchers who were not able to use a computer properly beyond basic usage. They are brilliant, but they never had to learn how before being forced to do so for a project. My opinion is mainly that the way to get better software practice into scientific research is to provide better education and saner tools that hide the footguns and can still scale to larger problems.

                                I think the whole thesis of the article stands when you see new tools coming along like Stan and other declarative approaches to modelling. Most of the time, scientists don’t know how to program at the beginning of their career and learn whatever is done in the lab they are working in. I personally think most of us should learn things like database management and querying, because you can go a long way with SQLite and Postgres, for example. Some fields are locked into commercial products too, like public health/epidemiology with the SAS environment. Heck, in R, I am fond of data.table because I know SQL, and that mindset lets me use it and share with people who have weaker R skills but know Excel (or any spreadsheet program).

                                Honestly, I understand your point of view and partially agree with it, but it all boils down to resources, time management, and access to infrastructure. I would love to see more education around sane options and easy ways to handle scaling up a prototype, but the more you try to work at a higher data resolution, the more you begin to hit a wall on performance, knowledge, or resources.

                                PS: I am clearly biased by my work experience in specific fields. All I can say is that I can understand why one ends up writing C++ code because of constraints and lab culture. Some fields are in better shape than others; look at remote sensing for agriculture/environment, or bioinformatics. Epidemiology is hard to do at large scale (even country scale), not only because the statistics are a solid piece of work to manage, but also because the quality of the data, and the voodoo magic you have to do to get sane datasets, are most of the time not easy. The amount of work needed by the IHME, and the effort needed to produce the global map of malaria, is honestly crazy; the results can be seen at the Malaria Atlas Project. Another anecdotal story: the Global Forest Watch project was initially written in Clojure with heavy use of Spark. Because of the lack of knowledge in the broader scientific community in the field, it had to be ported to JavaScript to be maintained by the community. It was heart-breaking for me, because I am well aware of the various tools and programming languages that exist (strengths and weaknesses). I really hope we can find an (open source) set of tools and good practices for programming models/simulation/computation/data analysis, like we have found for lab work, but honestly I don’t see that happening at all in the near future.

                                  1. 2

                                    I’m not looking down at scientists, at all, I want to empower scientists. Moreover, I want scientists (at least those without concomitant interest in computing) to spend their time doing actual science, and when they need to do things that require any substantial expertise in programming then find (by whatever means) the way to tap into that expertise, because we already exist and are HAPPY to help. So bring the two communities together, rather than live apart.

                                    The open letter is specifically complaining about the lack of good warning signs on the tools that exist and a lack of science-specific tools coming out of the software engineering community. Well, the whole point of my original comment is that becoming a competent software developer (in any language) is a significant investment in both time and energy, it’s a deep subject and developing real competency in it is at least the same sort of commitment as is becoming a specialist in any given scientific subject. To truly understand all the footguns (and therefore be able to give that sort of “here be dragons” warning sticker on a language) requires on the order of 10-15 years of solid and polyglot professional experience.

                                    So it’s heart-breaking to me to see that those fields are having so much difficulty with computing, but the tools we (the software people) know you (the scientists) need are out there … we, no doubt, are ignorant of a great many needs you have but have not communicated. Please communicate them … but it’s particularly terrifying (and elucidating) to find that your community (and especially those with tutorial-hell-only skills) has been using the wrong language(s) for what are (very likely) the wrong reason(s).

                                    How do we embed more software engineers and programmers in labs? Or get your side of these questions the education / help they really need? Cause there’s a whole lot of resources available and interested and simply unaware of your real needs, and frankly terrified that epidemiologists are tripping over things like global variables.

                                    1. 2

                                      […] How do we embed more software engineers and programmers in labs? Or get your side of these questions the education / help they really need? Cause there’s a whole lot of resources available and interested and simply unaware of your real needs, and frankly terrified that epidemiologists are tripping over things like global variables.

                                    A few ideas accumulated over the last three years.

                                    On how to get more SEs and programmers into labs and achieve better code quality in labs:

                                    • Create and maintain a proper RSE track or facility that can be taken into account when competing with other universities/teams on big grant proposals. This should be seen as an asset, not as a cost liability. See this comment. Money is a big struggle in this area, IMHO.
                                    • Find a way to attract SEs that is not based on salary, because academia can’t and won’t compete with industry in this area.
                                    • Value a culture of writing programs that can be combined together, with proper format specifications for input and output, instead of reinventing the wheel or tweaking your data to fit the program.
                                    • Create and value IT infrastructure, and the technical staff working on it, at the university level or higher. Getting AWS credits or some obscure-cloud credits is not a full solution and may not provide a clear way to reproduce the results of a large-scale study.
                                    • Push for Open Science and campaigns such as “Public Money = Public Code” (not especially FSF-only campaigns, but you get the idea).
                                    • Push, and even force, teams to learn Git or any DVCS, and tools like Singularity (kind of a Docker for science).
                                    • Document your analyses and use tools that can make them understandable at multiple levels. I really like the idea behind the Common Workflow Language, even if I’m not a fan of the YAML syntax.
                                    • Find a way to give budget and resources to maintain and update research code and libraries, instead of relying only on volunteers and their free time.
                                    • Take an “Automate the Boring Stuff” approach, because we will never, ever have uniform data formats.

                                    As you can see, a lot of these are basic things on the SE side, but sometimes I have the feeling that in the lab we don’t push the idea that your code will outlive you. The turnover is so fast, and people do what they need to get done and then go away to find another fixed-term contract (for post-docs at least).

                                    On how to improve study programs in the sciences, or in any other field relying more and more on software:

                                    • Stop teaching tools that are not on par with the de facto tools used in research. I have seen too many students learning Matlab when it has not been used in research in their specific field for the last ten years. It gives a wrong sense of what programming can do and locks them into specific tools that cannot be generalized. Learn Python: it’s free, and you can use the money saved on the Matlab contract to do something meaningful for your students. There will be time to learn proprietary tools in research once you are sure they are needed.
                                    • Teach programming with an “Automate the Boring Stuff” mindset, to show how it can make your time more valuable and let you focus on what really matters.
                                    • Teach an introductory database course so students can really get a feel for what a database is and stop viewing it as some foreign construction for high-scale websites. They will stop being stuck thinking only about files and formats and begin to abstract their data.
                                    • Students don’t have to be trained as full programmers, but they should have a basic understanding so they are able to communicate meaningfully about it. You need to know how to form a question properly before you can find any answer.
                                      1. 3

                                      Thanks for taking the time to write all that out. Funny how much of that looks pretty much identical to a whole series of culture-change documents and discussions we had in VFX over the last decade or so as we slowly and painfully transitioned towards smaller teams and more automation. Public money = public code, indeed … I’d love us to get more into the idea of public compute as well. I’ve personally reached the point in my career where salary stopped being an attraction and making a meaningful contribution has taken over as my driver, and I’d love to move into the sciences instead of continuing to help enable Michael Bay’s destruction of human culture. It would be great if I could see, on a day to day basis, more of what it is scientists have to deal with; showing people why sqlite can be their friend would be huge. Personally I think there’s something to be said for simply asking us SE people to come look over your shoulders and take notes. Fundamentally that’s what I do to empower visual effects artists; I look for inefficiencies and redundancies and then I optimize them away. At the end of your life, though, there’s only so much sense of satisfaction to looking back and knowing you’ve primarily succeeded in helping bring terrible movies to market.

                                  2. 0

                                I have flagged this comment as ‘unkind’. It is more combative than helpful.

                                    1. 1

                                      Fair enough, though I have flagged that as ‘incorrect’. I’m very much trying to help the individual I’m talking with.

                                      1. 1

                                        I do think you’re correct, just being a bit of a jerk communicating it.

                                        Specifically this sentence:

                                        Apparently you’re either unaware of that as a community or you’re refusing to use them, then complaining when people with actual software industry experience look at you like you’ve pooped the bed.

                                        1. 3

                                          I agree, I’m being a bit of a jerk. I don’t think, however, that the colorful language is either unwarranted, nor the point being made unhelpful.

                                          Think of it this way: if someone who was a software developer spent 13 years writing a paper on, I dunno, physics, and then plopped it on a preprint server and all of the methodology was utterly wrong, impossible to follow, and clearly against every known best practice in the field, wouldn’t actual physicists understandably think the software developer had made a mess in his/her proverbial sleeping area?

                              2. 4

                              Eh, would you say the same thing about statistical methods (another common area of scientific misuse and failure, that requires some level of competence to wield responsibly), blaming it on the tool rather than the practitioner? They seem analogous to me.

                                1. 4

                                  For all the shitstorm, I see no actual bug report invalidating results in the open issues. Can anyone please point it out?

                                Otherwise it feels like all the testards and drive-by team leaders will teach researchers better than to open up their code.

                                  1. 7

                                  This pretty much matches my impression as well – it’s hard not to wonder if any of the people who wrote those “analyses” ever used – let alone wrote – simulation software.

                                    There’s enough incorrect material in them that writing a rebuttal would be tedious (plus I really ought to stop wasting time on Lobste.rs and get back to work…). But I just want to point out that the rhetoric matches the analysis.

                                    For example, in this article you see things like:

                                    “A few people have claimed I don’t understand models, as if Google has no experience with them.”

                                    unless the author of the article is Google, that’s pretty much irrelevant. Google has hundreds of thousands of employees, I bet quite a few of them don’t understand models. (The author of this article is definitely one of them, by the way).

                                    Edit: it’s nothing to be ashamed of, at any given moment there’s an infinite amount of things any of us doesn’t understand. But ranting about things one does understand usually gives better results.

                                    1. 4

                                      Are you saying that nondeterminism doesn’t matter because the model is supposed to be nondeterministic? Then why are they nevertheless fixing the nondeterminism bugs?

                                      Do you understand the value of reproducible research, which logically implies making the source code open in this case? Are you aware that Ferguson’s code wasn’t open source for over a decade, and that is part of the problem?

                                      1. 8

                                        To answer the nondeterminism part, normally you take a large set of runs and analyze them as a group.

                                        For example, a monte carlo of a gamma particle tunneling through radiation shielding is inherently non deterministic, however a large number of runs allows you to find the distance necessary for most if not all particles to be stopped safely. Nondeterminism is not an issue if the behaviors involved allow you to derive reproduceable results from the aggregate.

                                      That said, software bugs like incorrect branching can also be nondeterministic. The degree to which they affected the simulation is often assessed through error propagation analysis or by comparing the results before and after. Not all bugs are created equal - many can be obviously wrong but not “infect” the results enough to trash them. Still can muddy it tho.

                                      That’s why, yes, you fix bugs in nondeterministic models: the model is meant to be the only source of nondeterminism. Bugs have to be reduced enough to avoid tainting the result set.
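
                                      A toy sketch of that workflow, with entirely invented numbers (this is not the actual shielding code): each run is seeded differently and is random on its own, but what you analyze is the aggregate over many runs.

                                      ```c++
                                      // Toy Monte Carlo sketch (made-up numbers): estimate the mean penetration
                                      // depth of a particle with a fixed chance of absorption per 1 cm slab.
                                      #include <cstdint>
                                      #include <cstdio>
                                      #include <random>

                                      double one_run(std::uint64_t seed) {
                                          std::mt19937_64 rng(seed);
                                          std::bernoulli_distribution absorbed(0.3);   // 30% absorption chance per slab (invented)
                                          double depth_cm = 0.0;
                                          while (!absorbed(rng)) depth_cm += 1.0;      // keep going until absorbed
                                          return depth_cm;
                                      }

                                      int main() {
                                          const int runs = 100000;
                                          double total = 0.0;
                                          for (int i = 0; i < runs; ++i)
                                              total += one_run(1000 + i);              // distinct, recorded seed per run
                                          std::printf("mean depth over %d runs: %.3f cm\n", runs, total / runs);
                                          return 0;
                                      }
                                      ```

                                      Individual runs differ (that is the nondeterminism that belongs to the model), but the mean over many runs converges, here to roughly 2.3 cm, and that aggregate is the reproducible quantity you compare.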

                                        1. 3

                                          To answer the nondeterminism part, normally you take a large set of runs and analyze them as a group.

                                        If your simulation is meant to be nondeterministic, then good reproducible science uses a strong PRNG and takes a seed from a configuration. You run it with a fixed set of seeds but can then reproduce the same results by providing the same set of seeds. If it’s not meant to be nondeterministic then it’s a bug and it’s impossible to know its severity without knowing more (but in C++, it can be any kind of undefined behaviour and so the end result can be complete nonsense).
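
                                        A minimal sketch of that pattern (illustrative only; the actual RNG plumbing in covid-sim is its own thing):

                                        ```c++
                                        // Sketch: take the PRNG seed from configuration so any run can be reproduced
                                        // exactly by re-supplying the same seed. Illustrative, not the project's code.
                                        #include <cstdint>
                                        #include <cstdio>
                                        #include <cstdlib>
                                        #include <random>

                                        int main(int argc, char** argv) {
                                            // Seed comes from the command line / config, never from time(); log it so
                                            // the exact run can be repeated later.
                                            std::uint64_t seed = (argc > 1) ? std::strtoull(argv[1], nullptr, 10) : 42;
                                            std::fprintf(stderr, "seed = %llu\n", (unsigned long long)seed);

                                            std::mt19937_64 rng(seed);                        // well-defined, seedable PRNG
                                            std::uniform_real_distribution<double> u(0.0, 1.0);
                                            double acc = 0.0;
                                            for (int i = 0; i < 1000; ++i) acc += u(rng);
                                            std::printf("result = %.12f\n", acc);             // identical for identical seeds
                                            return 0;
                                        }
                                        ```

                                        One caveat worth knowing: the mt19937_64 engine itself is fully specified by the standard, but the `<random>` distributions are implementation-defined, so bit-for-bit reproducibility across compilers also needs the distribution code pinned down.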

                                          1. 2

                                            For example, a monte carlo of a gamma particle tunneling through radiation shielding is inherently non deterministic, however a large number of runs allows you to find the distance necessary for most if not all particles to be stopped safely.

                                            Sorry if I misunderstand, but surely being careful with when and who is calling your PRNG helps limit this, especially in a single-threaded case?

                                            Over in game development the issues around non-determinism are a pretty well-known if not always well-solved problem and have been for near two decades, at least.

                                            1. 8

                                              (Note: not parent).

                                              There are processes – I’m not sure if gamma particle tunneling is one of them because solid-state physics isn’t exactly my field, but if I recall things correctly, it is – which are inherently probabilistic. It’s supposed to give different results each time you run it, otherwise it’s not a very useful simulator, and I’m pretty sure I read at least one paper discussing various approaches to getting a useful source of randomness for this sort of software.

                                            (Edit: there are various ways to cope with this and reconcile the inherent determinism of a machine with the inherent probabilistic character of a physical process, assuming you really do have one that’s probabilistic. It’s not as simple as yeah, we just write simulators that give different results each time you run them.)

                                              In this particular (i.e. Ferguson’s code) case, the non-determinism (fancy name for a bug. It’s a bug) manifests itself as a constant-ish extra error term – you get curves that have the same shape but don’t coincide exactly, at least not over the duration where the model is likely to give useful results..

                                            Unfortunately, that’s exactly what you expect to get when doing stochastic process simulation, which is probably a plausible reason why it wasn’t caught for a long time. This kind of error gets “folded” under the expected variation. That can have two outcomes:

                                              • If the errors are random, then averaging several runs will indeed cancel them out
                                              • If the errors are systematic, then averaging several runs will yield an extra (likely time-dependent) error factor, but it’s hard to say if that actually changes the simulation outcome significantly without doing an actual analysis.

                                            Thing is, the latter case is usually swept under the rug because these models are meant to investigate trends, not exact values. If you look at the two graphs ( https://github.com/mrc-ide/covid-sim/issues/116#issuecomment-617304550 – that’s actually the only substantial example of “non-determinancy” that the article cites), both of them say pretty much the same thing: there’s a period of modest, then accelerated, growth that settles into linear growth after 50-60 days.

                                              It’s not really relevant if you reach 200,000 deaths in 62 or in 68 days – not because “reproducible outcomes don’t matter” but because there is an inherent expectation that a model that’s supposed to tell you how a flu will spread over 90-150 days in a non-homogenous population of 40,000,000 people is not going to be accurate down to a few days.

                                            Edit: to clarify – I’m not saying that’s not a bug; it is. But it’s definitely not clear that its impact on the simulation results is enough to invalidate them – in fact, if I were to speculate (which is exactly what the authors of these critical articles do, since they don’t actually run any numbers, either) I’d say it probably isn’t.

                                            Edit: also to clarify – what the parent comment is saying is, IMHO, completely correct. The only source of non-determinism in the result should be the non-determinism in the model, and bugs that introduce extra error factors should absolutely be fixed. However – and this is the erroneous message that these articles are sending – tainted result sets can still provide valid conclusions. In fact, many result sets from actual, physical measurements – let alone simulations – are tainted, and we still use them to make decisions every day.
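
                                            To make the random-vs-systematic distinction concrete, here is a toy numerical sketch with entirely invented numbers (nothing to do with the model’s real output):

                                            ```c++
                                            // Toy sketch: averaging many runs cancels zero-mean random error, but a
                                            // systematic offset (e.g. from a bug) survives the averaging untouched.
                                            // All numbers are invented for illustration.
                                            #include <cstdio>
                                            #include <random>

                                            int main() {
                                                const double true_value = 200000.0;                   // the answer the model "should" give
                                                const int runs = 10000;
                                                std::mt19937_64 rng(7);
                                                std::normal_distribution<double> noise(0.0, 5000.0);  // zero-mean random error
                                                const double bias = 3000.0;                           // systematic error term

                                                double sum_random = 0.0, sum_biased = 0.0;
                                                for (int i = 0; i < runs; ++i) {
                                                    sum_random += true_value + noise(rng);            // averages back to ~200000
                                                    sum_biased += true_value + noise(rng) + bias;     // averages to ~203000 instead
                                                }
                                                std::printf("random-only mean: %.1f\n", sum_random / runs);
                                                std::printf("with-bias mean:   %.1f\n", sum_biased / runs);
                                                return 0;
                                            }
                                            ```

                                            Whether a shift of that kind actually changes the conclusions is exactly the “needs an actual analysis” question raised above.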

                                          2. 3

                                            Do you understand the value of reproducible research, which logically implies making the source code open in this case? Are you aware that Ferguson’s code wasn’t open source for over a decade, and that is part of the problem?

                                            There is a culture problem in academia around this, but it is getting better and more journals are requiring source code with paper submissions.

                                            In this case, the model has been reproduced by researchers using different Probabilistic Programming Languages (Turing.jl and STAN), which is the bar it needed to reach. Discussion of the implementation quality isn’t really useful or scientifically interesting. It’s the inputs and modelling assumptions that are interesting.

                                            (Draft?) replication post here: https://turing.ml/dev/posts/2020-05-04-Imperial-Report13-analysis Code for that post is here: https://github.com/cambridge-mlg/Covid19

                                        2. 4

                                          There’s coverage from the first link in the submission.

                                          1. 0

                                            “Lockdown sceptics”, seriously? “Stay sceptical, but presuppose the conclusion you want to reach and find facts in support of it”?

                                            1. 6

                                      That’s neither here nor there; let’s keep the discussion on the issues they’ve found.

                                              1. -2

                                        Yeah, they may have a perfectly good breakdown of issues in the simulation which affect results; I’m not discussing that. I didn’t take the time to read it (and probably won’t; the topic doesn’t interest me that much), and I should’ve been clearer that I’m not saying their findings are invalid. I just thought it was worth pointing out, and it’s probably something people should keep in mind while reading their review.

                                        3. 2

                                  Ooh… the reason I left Hacker News is that it has this category of posts a lot… please try to avoid this type of content here :)

                                          1. 3

                                            I am sympathetic. This is what we sometimes call a “lobster boil”. The worst of them do get flagged and removed; this one’s not all that bad.

                                            But, you should be aware, that’s what the “hide” button is for: individual temperature control.

                                            1. 2

                                              So far no-one has flagged it off-topic, and to my mind, it’s generated a lot of good comments and discussion.