1. 52

  2. 76

    Interesting article. Yeah, some advocates are going off the rails with their praise of static typing. I find it helps to look to prior successes and tradeoffs to try to understand how to best do engineering. The people who invented the engineering of software were, independently, Bob Barton of the Burroughs B5000, Dijkstra w/ THE OS, and Margaret Hamilton on Apollo and the 001 Tool Suite. Each had pieces of the problem, with some of it culminating in later work on the TCSEC/Orange Book’s B3/A1 classes of high-assurance security. Big companies like IBM came up with formal inspections and Cleanroom’s cost-effective approach. Cleanroom consistently resulted in low-defect software in commercial applications, even on first use. That became my baseline. Meanwhile, the safety-critical embedded community kept using all kinds of variations of these with lessons learned. CompSci kept working on formal methods, type systems, testing strategies, and runtimes trying to figure stuff out, with numerous successes. Private companies differentiating on high-quality systems used a combination of formal methods (esp Altran/Praxis or Galois), safer languages (all of them), and DSLs (iMatix and Galois). So, we have plenty to draw on in the lessons-learned reports.

    Let’s start with what high-assurance projects, through many real-world demonstrators, showed would help:

    1. Ambiguous English requirements created problems. The requirements, functional or otherwise, must be specified in a precise, unambiguous form. These projects used formal specifications for that, combined with English descriptions so people knew what they were looking at.

    2. Design should be modular with easy-to-follow control flow, plus careful attention to the correctness of interfaces and any assumptions modules make about each other. All of them did this. The B5000 did it with argument and type checks at both the compiler level and the CPU level at runtime. Hamilton noted interface errors were 80+% of the problems in Apollo development. Her USL and 001 Tool Suite used formal specs of functions and interfacing to generate code free of interface errors so long as the specs were correct, potentially knocking out 80+% of flaws. This showed up in larger, commercial use with the Eiffel method/language; SPARK Ada later combined it with static analysis. (There’s a tiny sketch of the contract idea right after this list.)

    3. The highest-assurance projects had formal proof that the design matched the requirements or security policy. Medium-assurance did this informally, where each part mapped to other parts with clear, convincing arguments that they corresponded. Almost every project that applied formal proofs, whether it helped a lot or barely, noted that simply redoing the specs and code to be simple enough for provers caught errors by itself, before any proving was done. Lasting lesson there.

    4. Since Bob Barton, it was known that there would be fewer problems if the language, CPU, and OS made common activities safe to perform by default. His B5000 did that; it remains immune to most forms of code injection seen today. MULTICS’s reverse stack and non-null-terminated strings, with PL/I, made it immune to some problems. Wirth’s languages checked bounds, argument types, etc. Hansen had one free of race conditions. Eiffel’s SCOOP took that further, with Rust doing it again. Various MLs, Haskell, and Idris took it even further in various ways. The focus was designing things so that ordinary use would make entire classes of problems impossible. More checking meant more compile time, so tools often did prototyping in unsafe mode with quick tests to make sure it worked initially. Full type-checks and thorough tests could run in the background, on clusters, or overnight to get more thorough results.

    5. Review of all of that. Burroughs did it implicitly, with Barton seeming to invent the concept in an off-hand comment or two. The Apollo team did it religiously. The commercial sector got the message when Fagan introduced the Software Inspection Process at IBM, dedicating a percentage of time to finding bugs based on checklists of the kinds that keep cropping up. Worked wonders. Later, the OpenVMS team alternated one week building with one week reviewing/fixing based on priority, and all tests ran over the weekend. One of the most reliable OSs ever built. Microsoft did a 180 on security when a high-assurance engineer, Steve Lipner, gave them a watered-down, ship-quick version of this list as the SDL. The reviews & QA tests knocked 0-days down lower than most FOSS, as far as I can tell. So, review, review, review!

    6. Testing. It came in many forms with many benefits. Acceptance tests were about customers as much as proving the tech works. Unit tests checked every module and improved quality, but the trade-off between building/running tests and other QA is still debated. Cleanroom decided bugs users see are worse than those they don’t, so they did usage-driven testing based on models of all the ways users might use the interfaces to the product. Combining it with formal specs or Design-by-Contract allowed test generation based on specs of actual behavior. You don’t manage those tests at all but still get the benefit. Fuzz testing caught errors manual tests often missed, so it’s a good idea. So far, automated testing from formal specs and/or types, combined with manual tests for what those can’t easily express, seems like the strongest combo. Run fuzz tests, QuickCheck, etc. overnight, prioritizing fixes by importance or severity. (A small property-testing sketch follows this list.)

    7. It was known since the early days that compilers were complicated beasts that could trash your software’s correctness. Wirth went with simplified ones that did quick iterations with OK performance. Ada compilers tried to do everything under the sun for high safety & performance, with high build times and costs resulting. LISPs had the ideal setup: a REPL for instant results, incremental checks/compilation of individual functions for quick iteration on performing code, and full compilation with static types for best results, which could be batched. They also showed that complex functionality could be macro’d onto a simpler language that’s easy to compile. 4GLs did that for BASIC as well, with tremendous productivity boosts. Certifying compilers showed up that converted HLL to assembly with evidence they’d always be correct. My recent proposal is to add macros, Design-by-Contract, Rust’s temporal safety, and a few other things to a language like Modula-3 that’s already ultra-fast to compile, and turn that into production code. Give it a LISP-like combo of fast interpreter, incremental compilation, and fully-optimizing compilation, caching and integrating the results as development happens. Fully test it in overnight runs that do spec-based test generation, acceptance/regression testing, fuzz testing, full analysis, and certifying compilation, with tests against the fast versions for correctness checking. Come into work the next day to find either the problems to fix or ultra-optimized, correct code ready to integrate into whatever you’re about to build.

    8. Information-flow analysis. The early version was detecting covert channels, the bane of high-assurance security’s existence and the only thing we couldn’t reliably, consistently eliminate. The initial method modeled the system, software and hardware, as interacting parts to find what components they shared. If two parties can read/write a shared component, it’s a storage channel; if they can modify and time it, it’s a timing channel. Later on, this turned into information flow control: label your variables in accordance with your correctness or security model, then track via type systems or other analysis how information flows through your system. Violations of your expectations vs. the code’s reality are problems to be corrected. Design-by-contract can do some of this but maybe not all; open problem. It’s already practical enough that tools like SIF can do web applications that automatically find or prevent leaks, then generate client- and server-side code with specific algorithms in the right spot for acceleration or security. Such methods could benefit software in general. (A toy label-as-type sketch follows this list.)

    9. Recovery-oriented computing and fault tolerance. Hamilton’s people proved this out in Apollo using asynchronous communicating processes, hardware redundancy, code to handle failures of processes/components, and a human-in-the-loop method to let people do what they do best. Tandem NonStop used similar features to design a massively-parallel, five-nines platform for mission-critical assets. OpenVMS got close to that, with decade-plus uptimes, by just focusing on code review, error handling in the kernel, versioned files, standardized locking/batching/whatever to do the hard stuff for apps, and bulletproof clustering. Byzantine fault tolerance and all kinds of exotic stuff went from there, but recent work like the Copilot monitoring scheme and the micro-restarts paper shows it can be really simple to react to lots of failures properly. :) The strategy for the rest was to fail fast, safe, and as obviously as possible, with error messages pointing in the right direction if possible.

    10. Build systems. I’ll mention this only briefly, as it doesn’t cause most problems. They should protect the integrity of the software, keeping track of all changes so you know where problems came from. For a security-critical project, the build system should run on a strong TCB with access controls per project, or maybe even per file, to reduce malicious input. Automate any tedious things while letting people decide on the rest, as “managing” software in build systems has so many angles I’m still learning basic things in new contexts. An immutable store with per-commit integrity, access controls for confidentiality/malice, and replication for availability are the basics.

    11. Static and dynamic analysis. These came later in research as lighter formal methods. The best static analysis tools… probably Astree Analyzer for C and SPARK Ada’s built-in one… can prove software free of common errors if you can write it in the constrained way necessary. Many model checkers and SMT solvers were developed that could automatically prove properties about specific software. This ties into spec- or property-based testing, where the solvers handle what they’re good at and tests handle the rest. Such thinking led to dynamic analysis that combined specs and/or static-analysis techniques with actually running the program. The benefits of that, from safety to extra optimization, led to my idea of a staged compilation process that does quick checks in an interpreter early on, slightly slower ones on a sub-optimized EXE in the background, and the full battery with full optimization last. In any case, developers should be using whatever automated analysis or testing tools they can for their language/app.

    12. Functional/OOP programming. Each group claimed their methods of development prevented flaws common in structured, imperative programming, esp as things got big. More flexibility and safety, they say. Subsets of these principles were shown to do exactly that. Smalltalk was the early success and is the reigning king of the OOP model in productivity, with Eiffel maybe the champ on the productivity, speed, and correctness tradeoff. Functional finally matured enough, with the OCaml and Haskell ecosystems, to do amazing things. Using functional in as pure a style as possible, combined with automated analysis and testing methods, seems to offer great promise in work involved vs. correctness obtained. OOP I can’t speak on, as I avoided it and have less data on how experts in, say, Smalltalk or Eiffel might do that.

    13. CPU or ASIC level. I left this for last as it’s often the least likely to be available. The B5000 architecture made it immune by default to stack overflows, out-of-bounds accesses, mismatched arguments in function calls, and malicious data becoming new code due to app flaws. Wirth and Jurg’s Lilith computer designed the CPU, assembly, compiler, and HLL to work together ideally so no abstraction gaps posed a problem. The Secure Ada Target and Army Secure OS (ASOS) projects made tagged hardware plus high-security runtimes that enforced at runtime many protections Ada may or may not catch at compile time. They failed, but numerous Java CPUs doing something similar succeeded in the marketplace. Recently, at the high end, Cambridge’s CHERI makes C apps capability-secure, with any violation of POLA caught by the CPU. More important here, the SAFE processor at crash-safe.org has a policy-neutral metadata engine built in that can enforce at runtime something like 50 different policies, in any combination, on running software. A combination of the above spec and testing strategies with such a CPU in the build system could have tools produce the EXEs and policies to check correctness, with the CPU doing that natively at no penalty. The most important policies, like memory safety or leak prevention, can be used in production as well to be sure any unknown vectors won’t screw with you. The CPUs themselves, or at least the critical parts, can be made with correct-by-construction techniques that often ensure first-pass success, with asynchronous design knocking out timing issues, triplicated checking with a voter knocking out general issues, and SOI/rad-hard processes knocking out many EM or radiation issues.
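
    To make point 2 a bit concrete: here’s a minimal, made-up sketch of contract-style interface checking in Go. It’s nowhere near USL/001 or SPARK, just the idea of stating interface assumptions as executable checks instead of prose:

    ```go
    package account

    import "fmt"

    // Withdraw states its interface assumptions as checks, Design-by-Contract
    // style. The function and its rules are invented for illustration.
    func Withdraw(balance, amount int) (int, error) {
        // Precondition: a positive amount no larger than the current balance.
        if amount <= 0 || amount > balance {
            return balance, fmt.Errorf("precondition violated: amount=%d, balance=%d", amount, balance)
        }
        newBalance := balance - amount
        // Postcondition: the balance decreased by exactly amount and stayed non-negative.
        if newBalance < 0 || balance-newBalance != amount {
            panic("postcondition violated: balance accounting is wrong")
        }
        return newBalance, nil
    }
    ```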
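
    For point 6, the cheapest form of spec-based test generation is property-based testing. Here’s a tiny sketch using Go’s standard testing/quick as a stand-in for QuickCheck; the encode/decode pair and the round-trip property are invented for illustration:

    ```go
    package codec

    import (
        "testing"
        "testing/quick"
    )

    // Stand-ins for whatever pair of functions is actually under test.
    func encode(s string) []byte { return []byte(s) }
    func decode(b []byte) string { return string(b) }

    // The spec ("decode undoes encode") is stated once; testing/quick
    // generates the inputs, QuickCheck-style.
    func TestEncodeDecodeRoundTrip(t *testing.T) {
        property := func(s string) bool {
            return decode(encode(s)) == s
        }
        if err := quick.Check(property, nil); err != nil {
            t.Error(err)
        }
    }
    ```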
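
    And for point 8, a toy version of the labeling idea using distinct Go types, so a bad flow shows up as a compile error. Real information-flow tools like SIF track far more than this; all names here are made up:

    ```go
    package flow

    // Secret and Public are distinct types, so a Secret value cannot reach a
    // public sink without an explicit, reviewable declassification step.
    type Secret string
    type Public string

    // SendToClient is a public sink: it only accepts Public data.
    func SendToClient(p Public) { /* write to the network */ }

    // Declassify is the single audited point where a Secret may become Public.
    func Declassify(s Secret) Public { return Public(s) }

    func Example() {
        token := Secret("api-token")
        greeting := Public("hello")

        SendToClient(greeting) // fine: public data to a public sink
        // SendToClient(token) // compile error: Secret does not flow to Public
        SendToClient(Declassify(token)) // allowed only through the audited path
    }
    ```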

    So, there’s what over a decade of studying high-assurance systems taught me had a big payoff across various projects. Methods like Cleanroom, formal specification (esp of interface expectations), spec-based testing, safe languages, static analysis, and easy iteration w/ full-on testing done later have collectively always paid off. Combinations of these were applied on various projects by small teams with limited expertise in them and still got better-than-usual success at correctness. Companies like Altran or Galois that consistently apply some of this rarely had failures in production, with most defects being non-critical. Such empirical evidence of specific methods consistently getting good results in terms of low defects argues strongly to start with what they do in coming up with next method with better time/work/defect tradeoffs.

    It’s about as quick a summary as I can put in. I’ll be busy today chilling with family but will chime in to attempt to answer more detailed questions as time permits.

    1. 8

      Interesting article. Yeah, some advocates are going off the rails with their praise of static typing. I find it helps to look to prior successes and tradeoffs to try to understand how to best do engineering.

      So I basically agree that these are all good ideas for improving the quality of software. But a lot of them work by increasing the cost of software development to the point where it’s not viable for most people to program this way.

      Which is the context in which I’m making this point: Given an unlimited (or at least relatively large) budget allocated to software quality, I’d definitely take types + tests, but given a more typically modest one I don’t think the types are adding much.

      Such empirical evidence of specific methods consistently getting good results in terms of low defects argues strongly to start with what they do in coming up with next method with better time/work/defect tradeoffs.

      I don’t think this is necessarily true if the goal is to actually see widespread adoption. The approach I’m taking is trying to take the tools that people are already using and make them more powerful, which seems a lot more likely to bear fruit than taking the tools that people aren’t already using and trying to make them cheaper/easier to use.

      1. 9

        But a lot of them work by increasing the cost of software development to the point where it’s not viable for most people to program this way.

        The same principle applies to doctors and engineers: true depth of experience and proper protocol means that not every person gets to perform heart surgery or build a bridge. As a society, we’ve accepted that this is desirable.

        The goal of widespread adoption is a bit at odds with the goal of secure and reliable software. Taken to even a slight extreme, you end up with:

        • Deploying code on hardware architectures that do not afford protection against certain attacks or bad input.
        • Requiring deployments to be managed by fewer people than is required to have proper operational checks-and-balances (consider having the developer, the admin, and the security/compliance person all being the same job when management comes calling for an “urgent” feature request).
        • Providing simplified services to people that cannot understand the importance or significance of what the software is doing and cannot detect when it is operating incorrectly in time to fix or at least deactivate it.
        • Purposefully breaking automation to keep users from consolidating workforces, or to avoid losing upsell opportunities (Frank is our reporting guy, who won’t sign off on your new BI platform that makes him redundant…better sell Frank on some obscure voodoo he can be master of and hope he doesn’t break the product).

        These things all happen, every day.

        I think that there are kinda two asymptotic views we can take about software:

        • Accept that all software is disposable, and accept that we will need to replace/upgrade it regularly. At the same time, we should be open about making frequent upgrades and tolerant that occasionally things will break and that every 5 years the business should just budget for new code in the same way that they’d budget for, say, carbon copy paper or new laptops.
        • Accept that all software needs to be built as securely and reliably as possible, and accept that this makes software expensive. At the same time, we should learn how to specify software well so that it can actually be built to match what we need, and we should become comfortable dealing with minor rough edges when a given package doesn’t do exactly the piddling little thing we ask. Nobody decides to repaint highway spans every year to match the color of their Tesla, because that’d be really stupid; and yet, in software, people expect that this is reasonable.
        1. 7

          Good points, except your dichotomy at the bottom, which seems false to me like most dichotomies. Call me extra skeptical whenever someone tells me to choose between two options. Usually a trap. ;) So, here are your options:

          1. Software will be crap and change too much. The result should be designed so we can throw it away. Easy updates and integrations are the preference here.

          2. Software needs to be built robustly with strong specifications. Build it to last.

          There will be projects that fall into one extreme or the other. Reality says most won’t, though. The fact is that robust, cathedral-style software can be made under the principles of No. 1. You just need to clearly decompose it, do interface specifications, write tests for the probable successes/failures of key functions, be agnostic to storage method, and be agnostic to communication method. Most of those are straight from medium-assurance development methods. The last two I devised when I saw how Microsoft did data lock-in: formats and protocols. They have to be easily interchangeable to counter that, whether it’s intentional or incidental. To counter platform lock-in, the platform’s API needs to be open, ideally while also being simple enough for a small team to port or re-implement the whole thing. Hence my early love for industrial BASICs, Turbo Pascal, and Oberon derivatives for future-proof software. (A tiny sketch of the storage-agnostic part follows.)
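
          Concretely, the storage-agnostic part can be as small as the following hypothetical Go sketch (the interface and names are invented): callers code against a tiny interface, and swapping files for SQLite or a remote store means writing another implementation rather than rewriting callers.

          ```go
          package notes

          import "fmt"

          // Store is the only thing the rest of the program knows about persistence.
          type Store interface {
              Put(key string, value []byte) error
              Get(key string) ([]byte, error)
          }

          // memStore is one interchangeable implementation: an in-memory map.
          type memStore struct{ data map[string][]byte }

          func NewMemStore() Store { return &memStore{data: map[string][]byte{}} }

          func (m *memStore) Put(key string, value []byte) error {
              m.data[key] = value
              return nil
          }

          func (m *memStore) Get(key string) ([]byte, error) {
              v, ok := m.data[key]
              if !ok {
                  return nil, fmt.Errorf("no such key: %q", key)
              }
              return v, nil
          }
          ```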

          Simple OOP a la Smalltalk or Component Pascal. Close-to-purely-functional development like sane OCaml or Haskell. These two can be combined with the above tricks to make correct code easier to crank out, easy to extend, easy to replace, and easy to throw away, since everything else is easy vs. a vanilla, legacy stack. Also, the fact that just using it in the common way creates fewer defects (esp catastrophic defects) means getting stuck on it (worst case) ain’t as bad as COBOL, J2EE, or MFC/C++. ;)

          Note: Many people arguing against adopting Ada 2012 point out that most of AdaCore’s customers are probably legacy businesses who can’t move. That legacy is a future-proof-so-far language with safe defaults, built-in concurrency, good performance, safer extensions, and safer integrations. Customers probably think it’s a nice thing to be “stuck” on vs. what veterans of other stacks write about. So, I think about how I can do the same with something that isn’t Ada and has even more power/safety/ecosystem. Julia, Rust, some Schemes, ParaSail, and OCaml are all in my mind as I wonder about that.

          1. 4

            I believe we need to start treating software development more like engineering and less like the anti-intellectual hacker’s dream that many of us are currently in. All devices are internet connected now and therefore software is not disposable unless the hardware becomes disposable. That would be terrible!

            1. 2

              This is a very good point. We can also see what’s happening due to poor security in these internet connected devices. At the very least, it’s extremely wasteful (think about all the resources wasted on generating and then mitigating the Dyn DDoS), and at the other end of the spectrum, it’s unworkable. Throwaway software is definitely not an option when everything is a part of one massively interconnected graph.

              1. 2

                “Throwaway software is definitely not an option when everything is a part of one massively interconnected graph.”

                Interesting you mention that. One looking-back paper I often post is from one of the pioneers of the INFOSEC field, David Bell. His paper traces the start, successes, shutdown, and effective disappearance of high-assurance security in government. There’s a graph and a theorem in his paper where I’m not mathematical enough to be sure whether he was overreaching with it: Section B, on the Intermediate Security Value Theorem.

                http://lukemuehlhauser.com/wp-content/uploads/Bell-Looking-Back-Addendum.pdf

                One early counter to high-assurance was that we only need to use it for the most critical systems. So, we decide which systems we can run single-level, low security and which need multi-level, high security. Thing is, once you interconnect them, Bell et al predicted that the middle systems where any overlap happens are effectively multi-level. They have to be secure or they start compromising the others in some way, from attacks to covert channels. The modern net uses protocols where some nodes depend on others, including strangers. There are also many direct connections through various paths, plus things going through firewalls due to over-reliance on specific ports. A graph of that might look like Bell’s on steroids (or LSD). That even the minimal networks ended up needing protection for all the systems others depend on means the modern, Internet-connected systems probably do as well.

                That’s how the theory would’ve played out. In practice, we’ve seen hackers abuse the identity/transport aspects of hosts to compromise others more easily, and use low-assurance hosts to DDoS those with greater protection. Maybe Bell was right. They might all have to be high-security against code injection & internetworking attacks. I’ve been recommending protecting them all, as that’s what the Orange and Red Books taught us anyway. I just remembered that Bell’s claim was based on a graph when you said that.

            2. 3

              The same principle applies to doctors and engineers: true depth of experience and proper protocol means that not every person gets to perform heart surgery or build a bridge. As a society, we’ve accepted that this is desirable.

              Right. To be clear, when I say “these ideas are not suitable for widespread adoption” I do not mean “these ideas are bad”. I mean “these ideas are great for improving the situation in areas to which they are applicable and should be actively encouraged there, but they will not do much to raise the average quality of software because of the comparative size of those areas”.

              Accept that all software needs to be built as securely and reliably as possible, and accept that this makes software expensive.

              I don’t think we get to have this option, because what happens in most cases is instead that end users don’t value security and reliability enough to pay the amount that it actually costs, so we end up not developing the software because someone else got there first/cheaper instead of developing the software securely and reliably.

              …which is why I’m interested in techniques suitable for widespread adoption, because they allow you to make changes to the quality of software within that constraint. If you can make better software on the same budget then software quality improves. If you can make better software on a slightly higher budget software quality might improve. If you can make better software on a substantially higher budget, only software that is already very good is likely to improve.

            3. 4

              “Given an unlimited (or at least relatively large) budget allocated to software quality, I’d definitely take types + tests, but given a more typically modest one I don’t think the types are adding much.”

              Types are a small part of assurance. Investing more won’t add much if we’re talking about the average developer. It could still help where third parties in the ecosystem build specific components that way, giving an extra boost in quality. That already happens to a degree. However, static typing itself probably won’t do it. They’re better off with lightweight, formal specifications that feed into checks on their code, drive test generation, and aid compiler optimization. Types are a form of formal spec that can sometimes do that, but general ones like Hoare logic or Z are better suited.

              “The approach I’m taking is trying to take the tools that people are already using and make them more powerful, which seems a lot more likely to bear fruit than taking the tools that people aren’t already using and trying to make them cheaper/easier to use.”

              It’s sound advice per the Worse is Better theory. It even worked with C++, VB.NET, the PHP enhancements at FB, especially SPARK for Ada, etc. Thing is, most of the good stuff hasn’t worked. Think Spec# for C# plus Microsoft’s other tools. The safe subsets or variants of C combined with static analysis. Java with JML and all its checkers (esp concurrency). Ironclad C++ with vanilla C++. These have all failed to gain any traction. The average developer simply will not use them. The problem is with developers, the tooling available, or both. There’s also almost no effort by developers to improve those tools. A little chicken-and-egg problem.

              So, I’m not entirely sure where to go if we’re starting with broken tools in mainstream languages. I was thinking about reimplementing the Python interpreter and standard library in SPARK Ada, with Rust checking temporal safety. Then, modify one of those nice web frameworks to do things fast and securely. A few mods to the language could optionally help the programmer with specific correctness properties. Developers would get more safety than average simply by using the product as they normally do, with an easy route to increased safety. Something like that might take off, at least in industry. It needs to painlessly integrate with an existing ecosystem, though, as Scala, Clojure, F#, and Julia do. It’s a tall order given the size of all these projects, plus the fact that the APIs weren’t designed for correctness or security from the get-go.

              1. 5

                The safe subsets or variants of C combined with static analysis.

                Well, Cyclone, which I suppose is one of these safe variants of C, eventually led to the development of Rust, but it’s still too early to tell whether that will affect the general quality of software. I’m hopeful.

                On the other hand, a corollary of your paragraph is that Rust may very well be too much for your average developer. Does the “average developer”, if such a thing exists — let alone exists in such a manner that we would be able to characterise it without sounding too elitist or patronising — really care about the reliability of software? I think the answer is decidedly no. For now. Maybe things will change.

                I don’t really like Uncle Bob, nor do many people here, for his patronising tone; but I think he was right in pointing out that one day, a careless programmer will code a mistake that will cost lives or a significant amount of money. When this happens, society will begin to expect programmers to possess the same rigorous training as engineers, doctors, and lawyers do.

                1. 4

                  Does the “average developer”, if such a thing exists — let alone exists in such a manner that we would be able to characterise it without sounding too elitist or patronising — really care about the reliability of software? I think the answer is decidedly no. For now. Maybe things will change.

                  I’m not really a fan of the term “average developer” either, but you don’t need to care about reliability to find Rust attractive. You just need to not like debugging seg faults, data races or other memory safety related bugs.

                  1. 3

                    “but I think he was right in pointing out that one day, a careless programmer will code a mistake that will cost lives or a significant amount of money.”

                    Unfortunately, this sort of thing has happened plenty of times with mixed results. It’s often ignored after some money is paid out, with no changes, because it’s so rare. Barely impacts the profitable status quo. The times when change happened were in safety-critical industries like aerospace or railways, where the average developer is expected to try to prevent whatever they can. They have regulations where people have to code in safer ways, with the docs, design, and code getting inspected. These don’t apply to people outside those sectors, though.

                    This leaves little opportunity to change the average case. One possibility is regulations enforcing a baseline of a few sane techniques that shouldn’t add much to the cost of everything; just a few items on my list would be a great start. The other is liability legislation so they have to pay for their misdeeds. Fewer screwups, fewer lawsuits they lose. Ranum notes the lawyers will have a field day with that, to the point it might be more trouble than it’s worth. That’s why I’m in favor of an established baseline of good engineering and safety techniques in whatever law passes. Like we had with TCSEC and do with DO-178C.

                2. 2

                  Is there data on the relative cost of defects in software? Cuz in the end, that’s what it all boils down to. Will writing mostly defect-free software result in less cost? Of course, you need to consider the loss in revenue caused by the lag in the initial release, but it only takes one catastrophic defect to kill you dead permanently. I really wanna side with science.

                  1. 3

                    Outside of first-mover advantage, there’s evidence that software that meets expectations can sell more due to better user experience or reliability. That niche is actually a large market, even though it pales in comparison with the normal software market.

                    The other angle to look at is the maintenance phase. This is actually the best selling point where no liability legislation exists. The maintenance phase involves fixes and enhancements of production systems that can’t change too much from the user’s perspective. The further into the lifecycle you discover a bug, the more it costs to both develop and field the fix. Empirical evidence is clear on fixing them as early as possible. Technical debt shows us codebases get harder to fix, extend, and port if a certain amount of upfront or refactoring work isn’t put into their structure. These combine to show that a methodology that forces cleaner design, with defects knocked out early, will cost less over time while allowing more agility in development. It just logically follows from proven issues. The question is what kinds of effort, and how much, to put in to get an acceptable tradeoff.

                    The methodology that did that best early on was Cleanroom. It enforces clean structure, does code reviews, ties the amount of verification to the specific item in question, catches many problems before the software even runs, centers testing on what happens in real-world use, keeps control flow easy to analyze by hand, and consistently delivers low defects at reasonable cost (sometimes cost savings). The defect rate was so low and predictable that teams were even statistically certified to specific defect rates, in such a way that firms would sometimes offer a warranty on the software where that number of fixes was free. I’m not a believer in statistical certification, but bug rates were low enough for warranties. Often non-critical, too. Same with Altran/Praxis’s Correct-by-Construction method, which charges a 50% premium on average for low-defect code warrantied at a specific bug count.

                    My base recommendation for the low-cost version is Cleanroom done with industrial-strength, functional programming, plus interface checks, static analysis (esp QuickCheck-style), and usage-driven testing. Add, over time, battle-hardened libraries for common tasks that do them correctly and give users clear guidance on proper use with sane defaults. Not just stuff like crypto but also things like strings, names, or date/time handling. Developers need to be able to just reach for a working library that takes almost no brains to use right. The language and checkers are already like that. Learning how and what to formally spec at interfaces or algorithms will be the hard part. Worst case, they can do asserts.

                3. 3

                  nick – one of the best comments I’ve ever read. Nice work, keep it up.

                  1. 4

                    Much thanks. Figuring out the above has been my life’s work, pretty much. I’ve been trying to condense it all into forms like that so people don’t have to dig through 100+ papers in IEEE or ACM or blogs. Still a work in progress.

                    1. 2

                      If you wrote a book, I’d read it. I enjoy reading your comments, they provide a lot of historical background I don’t usually see.

                4. 22

                  This came up again in the context of the recent Dyn DDoS and comes up regularly in discussion of the Internet of Things and its general total lack of software or security.

                  Link? I’m wondering why people think this. Do you mean people are arguing it would have made taking over the IoT devices impossible in the first place, or that a static type system would somehow make handling a DDoS possible? In any case, I think any reasonable person would see through these statements pretty quickly. You can find someone who has said just about anything.

                  As static typing becomes more elaborate the scope of the bugs it can prevent becomes wider, but it comes with an additional cost: Build times.

                  Ocaml has pretty reasonable build times. IMO, I would say Ocaml build times are close to Go given how much more expressive the type system is than Go’s.

                  My general experience of the correctness of software written in fancy statically typed languages is not overwhelmingly positive compared to that of software written in dynamic languages. If anything it trends slightly negative. This suggests that for the scale of many projects the costs and benefits are close enough that this actually matters.

                  Having written and inherited code in a dynamically typed language, I can say the big win for static types, in my experience, is refactoring. IME, a lot of people (on both sides) talk about producing code the first time. IME, that is pretty easy no matter what language one is in. Coming back to code 6 months or 6 years later is a much different story.

                  1. 14

                    Coming back to code 6 months or 6 years later is a much different story.

                    Oh gosh this so much.

                    I’ve inherited various chunks of under-documented under-tested code over time. Type-y assistance where present has made all the difference in these situations!

                    1. 11

                      Maintaining nearly 80KLOC of OCaml over three years, I was able to use repeated edit-compile-error-scratch-head sessions as a means to re-acquaint myself with the system. But this is such a low bar; we need better. Maintaining more than 1MLOC of Smalltalk over the past twelve years, I found leaning on a type system–even OCaml’s–to be quite impoverished compared to a live, self-navigating system of uniform, recursively designed objects and messages.

                      1. 4

                        I’ve heard that refactoring Smalltalk is a great experience. I don’t hear the same about Erlang. Is this just a matter of the Smalltalk folks focusing more on tooling? They’re such similar languages in a lot of ways, I would think you could achieve similar things, but I don’t hear people talking about refactoring Erlang systems with such reverence.

                        1. 4

                          I can’t do it justice in a few lines, and I don’t want to clutter the discussion here. I will say that’s not just about Smalltalk but the mindset that Smalltalk fostered and, in turn, fed back into the evolution of Smalltalk. Go back, at least, to the writings of Ward Cunningham, Kent Beck, Ralph Johnson, and Brian Foote. As for Smalltalk itself, tooling almost emerges spontaneously from it because–I think–of the attributes I noted in the last line of my previous reply.

                        2. 3

                          to be quite impoverished compared to a live, self-navigating system of uniform, recursively designed objects and messages.

                          Do you mean that the code bases are uniform in this way or that any program written in Smalltalk will have this quality? One reason I like a good type system is because most code is not uniform and the type system lets me take messes and make them clean, pretty safely. But a type system is no panacea, of course.

                          1. 5

                            You can write bad code in Smalltalk, as in any other language. One thing that sets Smalltalk apart is that it’s as much a system as it is a language. Further, it’s a system represented solely in terms of objects sending messages to other objects. Further still, it’s a running, fully reflective, live system with that entire object graph readily introspected and manipulated. So, in Smalltalk–as Alan Kay said–architecture (uniform recursive object system) dominates materials (code). In Smalltalk, even bad code is easily navigated, understood, and refactored.

                      2. 7

                        “Ocaml build times are close to Go given how much more expressive the type system is than Go’s.”

                        Build time is measured the same, no matter the language. One could say a longer build time is justified because of certain benefits, but it’s still a longer build time.

                        1. 4

                          I’m not sure what I said that you’re disagreeing with or correcting me on. I am saying Ocaml builds fast, but slower than Go, and you get a lot of typing out of that extra time.

                          1. 3

                            That seems reasonable. The original statement might be more clear to me if it said “close enough to Go given…”

                        2. 3

                          Ocaml has pretty reasonable build times. IMO, I would say Ocaml build times are close to Go given how much more expressive the type system is than Go’s.

                          That’s a good example, thanks. I’ll add it in.

                        3. 8

                          This seems like a classic straw man argument. “This frustration stems from the idea that static typing will solve all our problems, or even one specific problem: The ubiquity of broken software.”

                          No-one says that static typing will solve “all our problems” or “the ubiquity of broken software”. By setting static typing up on too high a pedestal he’s making it an easy target to shoot down. Static typing can be useful but let’s not pretend it’s more than it is just for the sake of using it as a piñata.

                          1. 6

                            As you point out in your article, it is a bit sad we’re missing the data on this. My experience is almost diametrically opposed to yours, and I find that especially in contexts where you have no quality budget, static types are a godsend. Then again, this is just me.

                            1. 1

                              One strength has been that if my opportunities to increase quality are small, but numerous (say, one day every few weeks), static typing lets me make those changes with more confidence. IME, dynamically typed languages require a fairly expensive context-switch time period, where I have to read tests (if they exist) or read through the code to figure out what the APIs are and whether my change makes sense, and anyone reviewing that code has to do the same. But, to support @DRMacIver’s point, I enforce my own quality budget and I use static typing to make that happen when I can.

                            2. 5

                              This whole article is based on a false premise. The reason some of us want to use statically typed languages is not that it will automatically save us from broken software, but that it helps prevent a whole category of errors. It helps promote better design and better programming.

                              It’s like the lines on the road. They don’t automatically prevent head on collisions, but they help guide you away from them.

                              1. 4

                                at the level of expense most people are willing to spend on software correctness your technical solution has to approach “wave a magic wand and make your software correct” levels of power to make much of an impact

                                This doesn’t make sense - the one who makes his program correct or incorrect is the programmer himself. Reliably writing correct programs has a computational cost: coming up with proofs of correctness. If you(r customer) can’t afford this cost, then you have to accept that you can’t reliably write correct software. That much is fine. By and large, the world has accepted “good enough” as good enough.

                                Haskell, Scala or Rust all have interesting and powerful type systems and horrible build times.

                                You’re right about (GHC) Haskell and Scala, because they have type-level programming facilities that Haskell and Scala programmers are eager to use. But, in Rust’s case, type checking takes a small fraction of build times. The real bottleneck is the compiler’s backend: LLVM bytecode generation and optimization.

                                1. 1

                                  GHC’s type checking also takes a small fraction of build time as you can verify by running with -fno-code.

                                2. 4

                                  I’m not at all convinced that static typing is inherently more costly at the level of a programmer choosing their approach to a project (my personal experience is quite the opposite), but I guess the argument is then that it’s more cost-effective to hire programmers who use untyped or not-very-typed languages/styles (otherwise why would businesses keep doing it?)

                                  I mean my career has consisted of, broadly, working for companies that were using static typing to outcompete companies that weren’t (though I suppose everyone’s career has consisted of something similar for their own preferred technique). Maybe effective static typing is a recent innovation and the market is taking a while to absorb it - software development is a very noisy business and it’s hard to see clear trends in which companies succeed and which don’t; even assuming there is a high “evolutionary pressure” in favour of static typing. Certainly if we’re making a popularity argument then the trend is very much in favour of static typing - there is much more and better static typing going around than 5 or 10 years ago, more programmers are studying the techniques, and more businesses are talking about it.

                                  The site says that the ways to get non-broken software would be to make it more expensive to write broken software and cheaper to write correct software. I’d argue that increasingly sophisticated internet attackers are already doing the first, and recent innovations in strong typing are doing the second. Indeed I think the main reason we’re having this conversation at all - the reason that strong typing is part of the zeitgeist - is that costs and benefits have shifted to the point that people are much more interested in writing correct software.

                                  I think the idea of a “quality budget” is putting the cart before the horse; the business position is much closer to “we need to achieve x level of quality as cheaply as possible” rather than “we have x to spend on as much quality as possible”.

                                  The idea that static typing catches “less critical” bugs is very contrary to my experience. The outage-causing, money-losing bugs I’ve seen have generally been incredibly dumb bugs that could have been caught by a simple type check. There are subtle bugs that are harder to track down, but that’s not correlated with criticality, and if anything the subtler bugs are where type systems shine, because they usually result from confusing two related-but-slightly-different things.

                                  Talking about catching bugs is missing the opportunity to eliminate them at an earlier stage, by designing without that possibility; TDD can achieve this to a certain extent, but types are better at it. The idea that “literally nobody is going to encode it in the type system” is simply not my experience. If I believe it’s true, I should have a simple chain of reasoning for why it’s true, and then that’s easy to translate into types. If it’s hard to encode into the type system, that’s a giant red flag that whatever I was doing was probably a bad idea, and there’s a simpler way to do it.
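
                                  A sketch of what I mean, with all names invented for illustration: if the only way to obtain an EmailAddress is through its constructor, then “an unvalidated string reached the mailer” stops being a runtime surprise and becomes a type error.

                                  ```go
                                  package user

                                  import (
                                      "errors"
                                      "strings"
                                  )

                                  // EmailAddress can only be built via NewEmailAddress, so any value of this
                                  // type carries the "was validated" invariant; callers never re-check it.
                                  type EmailAddress struct {
                                      value string
                                  }

                                  func NewEmailAddress(raw string) (EmailAddress, error) {
                                      if !strings.Contains(raw, "@") {
                                          return EmailAddress{}, errors.New("not an email address: " + raw)
                                      }
                                      return EmailAddress{value: raw}, nil
                                  }

                                  func (e EmailAddress) String() string { return e.value }
                                  ```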

                                  1. [Comment removed by author]

                                    1. 7

                                      It’s not that I’m saying you’d comment without having read the post, but uh ok yes it is. Have you read the post? Because it’s literally a discussion of that.

                                      1. [Comment removed by author]

                                        1. 6

                                          It’s the responsibility of internet commenters to not make facile dismissive comments in response to things they haven’t read, right?

                                    2. 3

                                      But when a program can’t be compiled unless it’s verified to be correct, you’ve averted a large portion of broken-software cases.

                                      1. 3

                                        There is no single development, in either technology or in management technique, that by itself promises even one order-of-magnitude improvement in productivity, in reliability, in simplicity.

                                        – Brooks, 1986

                                        1. 3

                                          I don’t buy the claim that Go’s type system fights you every step of the way.

                                          1. 5

                                            The lack of generics doesn’t help. You end up abusing interface{} and littering code with if typed, isTyped := foo.(bar); isTyped { style checks.
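
                                            For illustration (the function is made up), this is the shape the interface{} workaround tends to take; every caller rediscovers the type at runtime and the compiler can’t check that the cases are exhaustive:

                                            ```go
                                            package main

                                            import "fmt"

                                            // describe re-discovers the concrete type at runtime via type assertions.
                                            func describe(v interface{}) string {
                                                if s, ok := v.(string); ok {
                                                    return "string: " + s
                                                }
                                                if n, ok := v.(int); ok {
                                                    return fmt.Sprintf("int: %d", n)
                                                }
                                                return "something else" // no compiler help if a case is missing
                                            }

                                            func main() {
                                                fmt.Println(describe("hi"), describe(42), describe(3.14))
                                            }
                                            ```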

                                            1. 3

                                              This is only true if you are trying to create your own generic containers. I find I’m able to just use slices and maps for just about anything I need.

                                              1. 4

                                                Even more interesting than generic containers are generic algorithms that work over multiple data structures exposing a common interface. The ability to do this gives me freedom to begin projects with simple, dumb data structures, and swap them with possibly more complicated ones when the need arises.

                                                1. 1

                                                  That’s a fair point; writing generic algorithms in Go does require jumping through some annoying hoops with interfaces.

                                                  1. 3

                                                    It’s not just jumping through hoops. Sometimes it’s outright impossible, without either duplicating code or bypassing the type system. Try implementing a linear algebra library in Go that works equally well with rational, real and complex scalars, without duplicating logic (bad for maintainability) or using runtime type assertions (bad for performance).

                                                    1. [Comment removed by author]

                                                      1. 8

                                                        There’s plenty of tooling you can use to do template style substitution into your source.

                                                        Template-style substitution is the caveman alternative to generic programming. With genuine generic programming, you don’t need to wait until template instantiation time to find type errors in your code.

                                                        1. 1

                                                          Just because it’s a “caveman alternative” doesn’t mean it isn’t a valid choice. At the very least, it suggests your dichotomy is misleading:

                                                          … without duplicating logic (bad for maintainability) or using runtime type assertions (bad for performance)

                                                          1. 2

                                                            You are duplicating logic. The template system just happens to automate the process for you. If you weren’t duplicating logic, then the type checker could semantically analyze your template itself. But it can only analyze the template’s instantiations.

                                                            1. 1

                                                              The template system just happens to automate the process for you.

                                                              So does monomorphization.

                                                              You are duplicating logic.

                                                              If I only need to write and maintain the logic in one place, then for all practical matters, there is no duplication.

                                                              then the type checker could semantically analyze your template itself. But it can only analyze the template’s instantiations.

                                                              The downsides of a template approach are not something I’m questioning. You’ve already pointed them out and I agree, but your dichotomy is still misleading.

                                                              1. 2

                                                                So does monomorphization.

                                                                Monomorphization doesn’t duplicate source language code. It duplicates target language code and happens after type checking, so it’s a compiler implementation detail.

                                                                If I only need to write and maintain the logic in one place, then for all practical matters, there is no duplication.

                                                                You seem to be arguing that eliminating duplication in the source code is the only thing that matters. I disagree with this: If I need to duplicate code in my head to make sense of a template, there is still duplication. The only way to eliminate duplication in my sense is to provide an abstraction (genericity) that can be reasoned about in its own terms, rather than in terms of specific implementations (uniform representation, monomorphization, type passing, etc.).

                                                                1. 1

                                                                  I disagree. I’ve used templating in Go to reduce logic duplication before and it works fine (subject to the downsides you’ve already listed). People do it in C++ too.

                                                                  I know how monomorphization works. Nevertheless, your dichotomy remains misleading.

                                                                  You seem to be arguing that eliminating duplication in the source code is the only thing that matters.

                                                                  No, I’m saying that if you only need to maintain the logic in one place, then it isn’t duplicated. That’s all. This just means your dichotomy requires revision.

                                                                  I find it perfectly reasonable to oppose templating based solutions for a variety of reasons. They have significant usability problems. But, they do solve the logic duplication and performance issues you’ve cited with existing “generic” approaches in Go, and that’s worth acknowledging properly, which I don’t think you’ve done.

                                              2. 3

                                                No, you don’t. I have written hundreds of thousands of lines of Go code over the past 7 years, much of it in the Go runtime and standard library, and if you architect your data model correctly, you don’t end up with many type assertions (and coincidentally, not many if err != nil { checks either).