1. 88
    1. 27

      What a delightful read. I want to highlight a bit that resonated with me:

      It could be that the biggest disadvantage is professional. Every year you spend in the Minerva monoculture, the skills you need to interact with normal software atrophy. By the time I left I had pretty much forgotten how to wrestle pip and virtualenv into shape (essential skills for normal Python). When everything is in the same repo and all code is just an import away, software packaging just does not come up.

      I have observed this phenomenon across every single company I’ve worked at, made somewhat worse by places that were early adopters of webshit. Everybody has their own way of using React. Everybody has their own basic Elixir application that then mutates. Everybody has some custom Jenkins clusterfuckery with docker and k8s. Invariably, you come to a new job “Oh hey they use Express this should be–oh dear christ what the hell is going on this is not normal.”

      I think that’s more an indictment of our industry and a weird aversion to standardized practices than anything else.

      This is also why I’m a little hesitant about people who talk about the super productive devs at Twitter or Google or whatever…they are building on top of very tuned ecosystems. I’ve noticed myself that the early days at a place without the developer tooling I’m used to are always miserable and slow, and after a year or however long it takes to drag things kicking and screaming along it’s like “oh gee getting stuff out the door is so easy”; upon switching, it’s back to walking on bare glass.

      1. 8

        Your comment wonderfully summarizes what was brewing in my mind while reading this article, and comparing it with my current situation. Beginnings are miserable until we have well established pipelines and workflows, as we spend more time swimming upstream and finishing side quests than solving the actual problems, which I find super annoying and demoralizing.

        Also, I agree that we spend a lot of time trying to use well-known tools that become amalgamations of responses to everyone’s needs, instead of crafting/modifying something to directly address our immediate needs.

    2. 15

      Seeing the examples use floats for currency made my eye twitch uncomfortably. None of the code I’ve written for financial institutions did that. Is that really done in this space?

      1. 37

        I used to work at a firm that did asset valuation for bonds, and we used floats :).

        It’s generally fine to use floats when it comes to asset valuation, since the goal is to provide an estimate for the price of a security. Our job was to estimate the value of a bond, discounting its price based on 1) the chance that the issuer will go bust or 2) the chance that the issuer will pay off their debt early.

        My understanding is that floats are never used in places where a bank is dealing with someone’s physical assets (it would be a disaster to miscalculate the money deposited into someone’s account due to rounding errors). Since our firm was not dealing with money directly, but instead selling the output of statistical models, floats were acceptable.
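
        As a toy illustration (entirely made up, not the firm’s actual model), that kind of estimate can be sketched as the expected present value of the bond’s cash flows, weighting each payment by the probability the issuer has survived to that date:

```python
# Made-up toy model (not any firm's actual code): value a bond as the
# expected present value of its cash flows, where each payment is
# weighted by the probability that the issuer has survived to that date.

def toy_bond_value(coupon, face, years, rate, annual_default_prob):
    """Expected present value of a bond's cash flows, in plain floats."""
    survival = 1.0  # probability the issuer has not yet defaulted
    value = 0.0
    for t in range(1, years + 1):
        survival *= 1.0 - annual_default_prob
        cash = coupon + (face if t == years else 0.0)
        value += survival * cash / (1.0 + rate) ** t
    return value

# A 5-year bond paying a 5.0 coupon on 100 face, 4% discount rate,
# 2% assumed annual default probability:
print(round(toy_bond_value(5.0, 100.0, 5, 0.04, 0.02), 2))
```

        Plain floats are fine here because the output is a statistical estimate; nothing reconciles it against a ledger to the penny.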

        1. 9

          That makes absolute sense to me. Thanks for sharing the difference. We were dealing with transactions (and things like pro-rated fees, etc.) so even for things where it made sense to track some fraction of a cent, it was “millicents” and integer arithmetic. I wasn’t thinking in terms of model output.
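
          A minimal sketch of that “millicents” approach (names and numbers are illustrative, not from any real codebase):

```python
# Represent money as an integer count of thousandths of a cent, so that
# pro-rated fees stay exact and rounding happens exactly once.
# All names and figures here are illustrative.

MILLICENTS_PER_DOLLAR = 100_000  # 100 cents * 1000 millicents

def dollars_to_millicents(d):
    return round(d * MILLICENTS_PER_DOLLAR)

def prorate(fee_mc, days_used, days_in_period):
    # Integer arithmetic throughout; the +half term rounds to nearest.
    return (fee_mc * days_used + days_in_period // 2) // days_in_period

fee = dollars_to_millicents(9.99)  # 999_000 millicents
daily = prorate(fee, 11, 30)       # 11 of 30 days used
print(daily)                       # 366300 millicents = $3.663 exactly
```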

        2. 4

          it would be a disaster to miscalculate the money deposited into someone’s account due to rounding errors

          IME the really really hard thing is that summing floats gives different answers depending on the order you do it in. And summation operations appear everywhere.
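
          A three-line demonstration of the problem:

```python
# Float addition is not associative: the grand total of a book of trades
# can depend on the order in which the rows happen to be summed.
a, b, c = 1e16, -1e16, 1.0

print((a + b) + c)  # 1.0 -- the big terms cancel first
print((a + c) + b)  # 0.0 -- the 1.0 is absorbed into 1e16 and lost

# math.fsum tracks the lost low-order bits and returns the exact total:
import math
print(math.fsum([a, c, b]))  # 1.0
```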

      2. 7

        @jtm gave you a bit more detail; the original post offers this in the Other notes section:

        One of the things that tends to boggle programmer brains is that while most software dealing with money uses multiple-precision numbers to make sure the pennies are accurate, financial modelling uses floats instead. This is because clients generally do not ring up about pennies.

        1. 6

          Ah I missed this, but yes – exactly this.

          This is because clients generally do not ring up about pennies.

          An amusing bit about my old firm: oftentimes, when a bond is about to mature (i.e. the issuer is about to pay off all of their debt on time), the value of the bond is obvious, since there is a near-zero chance of the issuer defaulting. These bonds would still get run through all the models, and accrue error. We would often get calls from clients asking “why is this bond priced at 100.001 when it’s clearly 100?” So sometimes we did get rung up about pennies :).

        2. 2

          If that was there when I read it, I overlooked it because my eye was twitching so hard.

          1. 2

            It’s completely possible they added the Other notes section later! Just wanted to share since it addressed your question directly.

      3. 3

        I never wrote financial code, but I also never understood the desire to avoid floats / doubles. They should have all the precision you need.

        Decimal is a display issue, not a calculation issue. I think the problem is when you take your display value (a string) and then feed it back into a calculation – then you have lost something.

        It’s like the issue with storing time zones in your database vs. UTC, or storing escaped HTML in the database (BAD), etc.

        Basically, if you do all the math “right”, with full precision, then you should be less than a penny off at the end. I don’t see any situation where that matters.

        Although on the other side, the issue is that “programmers make mistakes and codebases are inconsistent”, and probably decimal can ameliorate that to some extent.

        I also get that it’s exact vs. inexact if you advertise a 0.1% interest rate, but I’d say “meh” if it’s a penny. It’s sort of like the issue where computer scientists use bank account balances as an example of atomic transactions, whereas in real life banks are inconsistent all the time!

        1. 11

          I also never understood the desire to avoid floats / doubles.

          Addition isn’t associative, so the answers you get from summations are less predictable than you would like.

        2. 7

          I think in practice the issue may actually be that floats can be too precise. Financial calculations are done under specific rules for e.g. rounding, and the “correct” result after multiple operations may actually be less mathematically accurate than if you’d just used 64-bit floats, but the auditors aren’t going to care about that.

          1. 4

            It’s not just that, it’s that the regulations are usually written to require that they be accurate to a certain number of decimal digits. Both the decimal and binary representations have finite precision and so will be wrong, but they’ll be differently wrong. Whether the binary floating-point representation is ‘too precise’ is less important than the fact that it will not give the answer that the regulators require.
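
            For example, in Python the binary and decimal routes can disagree on the very penny the regulator cares about (assuming a round-half-up rule purely for illustration):

```python
from decimal import Decimal, ROUND_HALF_UP

# Suppose a rule says half-cents round up. 1.005 has no exact binary
# representation, so the float route rounds the "wrong" way:
as_float = round(1.005 * 100) / 100
print(as_float)  # 1.0 -- 1.005 * 100 is really 100.49999999999999

# The decimal route gives the answer the rule actually specifies:
as_decimal = Decimal("1.005").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(as_decimal)  # 1.01
```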

        3. 4

          like @lann and @david_chisnall mentioned, it’s not about being precise, it’s about getting the answer expected by the accountants and bookkeepers and finance people. Back when they were doing it all on paper, they built certain rules for handling pennies, and you have to do it the same way if you want to be taken seriously in the finance/banking/accounting industries. Back then they couldn’t cut a physical penny in half, so they built rules to be fair about it. Those rules stuck around and are still here today and are sometimes codified into law[0]

          As for “meh” it’s a penny, they generally don’t care much about anything smaller than a penny, but they absolutely care about pennies. I regularly see million dollar transactions held up from posting because the balancing was off by 1 penny. They then spend the time it takes to track down the penny difference and fix it.

          0: PDF paper about euro rounding
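
          One illustrative version of such a rule (a sketch, not any particular jurisdiction’s law): split the amount into whole cents and hand the leftover cents out one at a time, so every single penny is accounted for:

```python
# Split total_cents into n whole-cent shares that sum exactly to the
# total: the first `leftover` shares each get one extra cent.
# Illustrative only; real allocation rules vary by jurisdiction/contract.

def split_cents(total_cents, n):
    base, leftover = divmod(total_cents, n)
    return [base + 1 if i < leftover else base for i in range(n)]

print(split_cents(100, 3))  # [34, 33, 33] -- no penny lost to rounding
```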

        4. 1

          How do you store 1/3 with full precision?

          1. 1

            Not with a decimal type either :)
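
            To make that concrete: a decimal type truncates 1/3 at its precision limit just like a binary float does, whereas a rational type holds it exactly (Python’s fractions.Fraction, shown here as one example):

```python
from decimal import Decimal
from fractions import Fraction

# Decimal truncates 1/3 at its configured precision, just in base 10:
print(Decimal(1) / 3 * 3)       # a string of nines, not exactly 1
# A rational type stores numerator and denominator, so 1/3 is exact:
print(Fraction(1, 3) * 3 == 1)  # True
```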

            1. 1

              sorry I misread your post

    3. 7

      This sounds wonderful to me.

      1. 7

        I was thinking just the same. Data-oriented, dataflow, data persistence and discoverability, an easy way to run and release… Getting to the same outcomes with current cloud tech is a year-long mission on constantly moving ground.

        1. 4

          Oh, definitely. And the ground moves at someone else’s whim, completely outside of our control, making us play catch-up.

        2. 4

          It’s like a view into a world that developed without being trapped in the Unix zeitgeist. I’m sure that there’s a load of it that’s nightmarish but it would at least be differently terrible.

          1. 7

            I’ll disagree a little–it’s not the Unix zeitgeist, but (I posit) the tolerance of bikeshedding and worrying about the “right way” of getting things done instead of just, you know, solving the immediate business problem.

            The large evolutionary pressure seems to be “we need to be a force multiplier for the folks that use Excel and know more than us”. Such a world doesn’t get bogged-down in React version rewrites, devops fads, or whatever.

            (It does, of course, have other nightmares as you note!)

            1. 1

              That’s fair. I’m a noted anti-Unix zealot, so.

              1. 2

                No harm in it!

        3. 3

          I agree! If an organization is focused on solving things in their problem domain, they will build software that helps them solve their problems. In fact, they will build software that helps them build software. In time, there will be a very custom programming environment that has been tuned to addressing needs in that domain and nothing more.

          This is also, as I understand it, the mindset behind things like k and kdb+. It’s a toolset that’s evolved in a very specific niche and has accumulated years of evolutionary adaptations to performing in that niche. From what I understand, this is also how Jane Street runs their technical org - you first go through a bootcamp, learning how to develop “jane street” code in the “jane street” ide (emacs + all the things they’ve added). If someone from JS is reading this, please corroborate if you can.

          Not being beholden to whatever google (k8s) or facebook (graphql, react) tell you to use this year is a big plus in my book. Focus on the problem and how to solve it. The bonus is that folks in your org can actually master a large part of the stack they use to run the show.

    4. 4

      (Parts of) this sound a LOT like Zope and ZODB. Code living inside the DB was totally a thing in Zope.

      But likely not quite the same, because ZODB has snapshot transactions, it’s ACID-ish rather than eventually consistent, and there’s no zipping of pickles.

    5. 4

      This is a previous discussion about the Tables from this article.

      The article is even funnier knowing the real names of the tools mentioned!

    6. 4

      My understanding is that Wes McKinney wrote and open-sourced Pandas while working at an investment bank, so at least some of this stuff has escaped into the real world. Pandas is one of the most popular Python libraries.

      Another note is that SAP is another major environment where code lives in the database alongside data.

      Finally, Prolog-based authorization is coming into vogue in the open-source world with Open Policy Agent.

      1. 4

        Pandas is one of the most popular Python libraries

        The value of Pandas really can’t be overstated. Python is one of my least favourite languages, Pandas is the reason that I use it for data manipulation.

    7. 3

      Very interesting. I have thought before that someone should write “an oral history of data journalism JavaScript” or some such for my field. My biggest takeaway though is that if you can get a bank programmer to “vouch” for you, you could round off a lot of pennies…

    8. 3

      One of the slightly odd things about Minerva is that a lot of it is “data-first”, rather than “code-first”.

      This inherently makes sense to me when the data is worth much more money than the code.

      “Medium-sized” is big enough that you cannot create an object for every row but not so big that you are going to need some distributed compute cluster thingy.

      This is a compelling sounding thing because the upper bound on “medium” keeps growing over time as SSD capacity and throughput get cheaper.

    9. [Comment removed by author]

    10. 1

      I very much want to hear about the Prolog-based auth system. We recently started using such a thing and it’s better than RBAC in so many ways.

      Is that system perhaps based on soutei?

    11. 1

      I’ve mentioned Barbara overlays. They also work for source code. You can tell Walpole to mount your own ring in front of sourcecode when it’s importing code for a job and then you can push source files to that instead of getting them vouched into sourcecode.

      Sounds like Nix overlays to me :-)