I feel like these conversations are often had through the lens of static typing meaning Java-esque static typing. When static types are just a means of making sure you don’t pass an integer into a function expecting a String, the argument for them is weakened significantly (although, for large projects, I would still argue for that over an untyped language 9 times out of 10).
I’m sure that 2% number would grow significantly if you considered the bugs that a sufficiently expressive static type system can prevent (e.g. unsafe concurrent operations, failures to check for nil, failures to bounds-check, a forgotten case in a match, an unhandled exception a function can throw, etc.).
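To make the “a case wasn’t forgotten” point concrete, here’s a minimal OCaml sketch (the `payment` type and `describe` function are invented for illustration): the compiler checks the match for exhaustiveness, so deleting an arm is flagged before the program ever runs.

```ocaml
(* A variant type: the compiler knows every possible case. *)
type payment = Cash | Card of string | Transfer of string * int

(* Exhaustive match: removing any arm below triggers a
   non-exhaustive-match warning (an error with -warn-error +8). *)
let describe = function
  | Cash -> "cash"
  | Card last4 -> "card ending " ^ last4
  | Transfer (bank, _) -> "transfer via " ^ bank

let () = print_endline (describe (Card "1234"))
```

Adding a new constructor to `payment` later has the same effect: every match in the codebase that doesn’t handle it gets flagged at compile time.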
Bad data is worse than no data, because at least the latter doesn’t carry unearned connotations of legitimacy.
The study in question looked at a problem in a specific domain of a specific size, meant to be solved by a specific number of contributors, using a specific language (whose type system wasn’t especially powerful to begin with). The conclusions from that study can’t reasonably be extrapolated to other problem domains, other languages and toolsets, other team sizes, not to mention aspects such as long-term maintenance, engineer turnover, or refactoring. Saying one typing discipline is “better” in the broadest sense than another based on this one study (or even fifty like it) is not compelling.
The GitHub analysis is pointless because bugs that are caught and fixed are, by definition, not captured within the issue tracker. These bugs might have been caught and fixed during development, during testing, or as part of a prior release, depending on the project’s best practices or even the carefulness of individual developers. That’s not to mention bugs that aren’t discovered because certain paths within the application or library are rarely executed, or the fact that popular, well-used projects probably have more known defects than lesser-known projects of similar quality, simply because more end users have exercised them and uncovered those errors. I mean, the authors of the study even point to the temperament of individual contributors as an exogenous factor they couldn’t account for, for crying out loud.
The GitHub analysis is pointless because bugs that are caught and fixed are, by definition, not captured within the issue tracker.
The GitHub analysis, according to the post (I haven’t watched the video), covers only dynamic languages. As I understood it, this was to see how often type problems cause bugs in comparison to other issues. So I don’t believe it is pointless, but I concur that there are many factors to consider. For example, you have to write more tests (particularly around parameter validation if you’re writing a public library). Conversely, you may gain e.g. looser coupling, allowing for greater reuse.
I think this only really matters if you try to argue that static typing is the only way to catch type issues. An analysis of the issue tracker doesn’t tell us anything about how much more time it took to write unit tests to reach the ‘same’ level of coverage, nor the amount of time and effort that individual contributors spent debugging issues (including type issues) before pushing what they believed to be working code to the repository.
What if someone performed this analysis using memory-related bugs? Maybe C and C++ projects only have 2% of their bugs related to memory issues (leaks, use after free, buffer overruns, etc). Would you believe them if they used that to argue that automatic memory management is useless for writing correct code, and valgrind and careful coding is all anyone really needs?
(I’m not trying to argue against dynamic languages per se; I’d rather use Clojure or Erlang than any of the mainstream statically typed languages for a large number of problems. I think the methodology of the studies mentioned above is so badly flawed that we need to take a stand, lest engineering management decide one day that they now have ‘objective data’ to base their decisions around.)
Give me a minute to put on my tartan kilt, because no true statically typed language would leave it until run time to discover that your code doesn’t handle a null pointer in that one spot.
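For what it’s worth, here’s a small OCaml sketch of that point (the `find_user` and `greeting` functions are invented for illustration): absence is represented as an `option`, and the compiler insists the `None` case be handled wherever the value is consumed, so a “forgot to handle null” bug can’t survive to run time.

```ocaml
(* There is no null; a lookup that can fail returns an option. *)
let find_user id =
  if id = 1 then Some "alice" else None

(* The match must cover None; omitting that arm is a compile-time
   complaint, not a NullPointerException in production. *)
let greeting id =
  match find_user id with
  | Some name -> "hello, " ^ name
  | None -> "unknown user"

let () =
  print_endline (greeting 1);
  print_endline (greeting 2)
```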
I would be interested to see a comparison of dynamically typed languages with statically typed languages like Haskell and OCaml rather than C#/Java. As someone who programs in Ruby for work and recently completed the OCaml MOOC, I suspect the productivity differences would be much smaller.
How was your experience with the OCaml MOOC? Do you have any insights you’d share with others?
I enjoyed the course a great deal, but I recommend it with reservations. I’d written some elementary Haskell (working through the H-99 exercises) and read John Whitington’s OCaml From The Very Beginning before taking this course, so I was already familiar with writing tiny, beginner-level typed functional programs. For me it was a fun collection of exercises that familiarized me with writing small programs in OCaml. I particularly enjoyed writing the arithmetic interpreter and the toy database.
But sometimes I struggled to understand the instructions for the exercises, and without the benefit of the community forum I would have stayed stuck. From reading the frustrated posts of some students, it seemed like those without prior exposure to functional programming were not having a good time. The pacing also felt uneven: some weeks I finished the exercises in hardly any time at all, while others took a great deal of time.
So I think it was a good course for motivated students with some prior experience with functional programming basics like map and fold. But someone with no previous functional programming experience might want to look at other resources first. My favorite is The Little Schemer, which isn’t about typed functional programming but does a tremendous job getting you comfortable with recursion and higher order functions. I also enjoyed OCaml From The Very Beginning and think that would also be a good introduction.
I saw that the organizers of the course are planning a second edition. Hopefully they can even out some of the rough patches and make it a more enjoyable experience to programmers brand new to functional programming.
So, for example, take Python: out of 670,000 issues, only 3 percent were type errors (errors a statically typed language would have caught).
This depends enormously on what one thinks a type system can encode. For instance, writing to a read-only file descriptor in Python or in Java is a run-time error, but in OCaml it’s a type error, because you cannot pass an in_channel to a function expecting an out_channel. And indeed, a common “design pattern” in statically typed languages is to capture invariants and encode them in the type system, making the compiler responsible for catching possible misuses. (Of course, a certain degree of good taste and design is required to not take this overboard.)
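A minimal sketch of that OCaml point (the `roundtrip` helper is invented for illustration): the write end and the read end of a file have different types, so using one where the other is expected is rejected at compile time rather than at run time.

```ocaml
(* Write text to a temp file, read it back, clean up. *)
let roundtrip text =
  let path = Filename.temp_file "demo" ".txt" in
  let oc = open_out path in   (* oc : out_channel *)
  output_string oc text;
  close_out oc;
  let ic = open_in path in    (* ic : in_channel *)
  (* output_string ic "oops"   <- rejected by the type checker:
     output_string expects an out_channel, but ic is an in_channel *)
  let line = input_line ic in
  close_in ic;
  Sys.remove path;
  line

let () = print_endline (roundtrip "hello")
```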
But a common pattern in Java is to use Readers and Writers and classes that wrap them. That would result in compile time errors as well.
True, albeit in a more verbose way, as hinted at in the program-length part of this post.
True, so perhaps C would’ve been a better example (fopen returns a FILE* that can be passed to both fread and fwrite)?
That would be the absolute purr-fect answer. Meow.
I think stdio.h is a good catalogue of things to avoid when building new C APIs in 2015. You could just as easily use a library that provides distinct input and output types; streams, or channels, or whatever.
Which is to say: C the language is not the problem per se.
Three critical misses here.
The first is that no one who understands this issue in any detail would hold up Java or C++ as an example of static typing. You have to compare best-in-class against best-in-class. If it’s C++ or Java up against Clojure, Clojure will win. I’m a huge fan of static typing, but I’d rather use Clojure than C++ or Java. (I might even prefer it over Scala, but that’s another discussion.) Java’s type system is pretty awful. In fact, if you look closely, it has two of them: a bottom-up type system for primitives and arrays that isn’t all that bad (although it’s limited), and a top-down type system where everything inherits from Object. They don’t work well together, and generics were bolted on after a lot of key decisions were made. Then you have covariant arrays (which aren’t type-safe: they fail with an ArrayStoreException at run time). Don’t get me started. The point is: if your experience with static typing is limited to Java, C#, and C++, you’re just not qualified to enter this discussion, just as I’m not qualified to hold strong opinions about (say) racehorses or 9th-century Chinese poetry or the history of gardening.
Second, the proximate cause of an exception or failure isn’t always the logical error (this is the dreaded error-failure distance). Your NullPointerException (or its analogue in a dynamic language) might have a cause 7 levels up the stack. Then you get what’s actually a type error but looks like something else. Also, dynamically typed languages and cultures tend to have an “accept partial failure” attitude. Misspelled record field? Return nil. I’m not going to debate where that is right and where it is wrong, or even say that there’s one right answer. I’m only going to say that, even if only 2% are obvious type errors, there’s probably a much larger percentage that stem from inaccurate assumptions a type checker could catch at compile time.
Third, the performance of “static typing” has much to do with the skill of the person writing the code. Naive Haskell is probably significantly safer than most code in dynamic languages, but unsafe, bad, or error-prone Haskell can certainly be written (see unsafePerformIO, undefined, unsafeCoerce). It takes a few months to get a sense of how to use the type system to anticipate and block classes of errors, and maybe a couple of years before you can see how to block most errors (even ones that wouldn’t be seen as type-system problems, such as concurrency bugs). And sure, no type system is “good enough” to block 100% of possible errors, but once you get good at a language like Haskell, you can block a surprisingly large percentage of them, and that helps you rule out whole classes of misbehavior when you’re debugging, which means the debug sessions you do encounter are often damn quick.
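One common form of “blocking a class of errors” that needs nothing Haskell-specific is an abstract type that makes the invalid state unrepresentable. A minimal OCaml sketch (the `NonEmpty` module is invented for illustration): every value of `NonEmpty.t` holds at least one element, so `head` can never fail, whereas `List.hd []` raises at run time.

```ocaml
(* The signature hides the representation, so the only way to build
   a NonEmpty.t is through make, which demands a first element. *)
module NonEmpty : sig
  type 'a t
  val make : 'a -> 'a list -> 'a t
  val head : 'a t -> 'a
end = struct
  type 'a t = 'a * 'a list
  let make x xs = (x, xs)
  let head (x, _) = x   (* total: there is always a head *)
end

let () =
  Printf.printf "%d\n" (NonEmpty.head (NonEmpty.make 42 [1; 2]))
```

The “empty list” error isn’t checked for; it simply has no value that could trigger it, so it disappears from the set of things you might have to debug.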
There are advantages to dynamic typing on small codebases, and also for exploratory projects like much of what’s called “data science” these days, but generally I prefer static typing, because I really enjoy when I can spend more of my time coding than debugging runtime bugs whose manifestations (failures) are often several stack frames away from the actual logical error.
Comparing initial write costs may well favour dynamic typing. Much of the advantage of static typing comes when you have to refactor in the face of changing requirements.
None of the languages compared have good type systems.
If you find yourself embedding a scripting language you should reconsider whether your primary language is good enough for your problem, sure.
Types are not anti-modular, they just make coupling explicit. In fact by allowing language-enforced encapsulation they improve modularity.
It was disappointing to me that 80% or more of the comments responded only to the title and clearly didn’t actually read the summary. It appeared as if almost no one actually watched the video. Most people just clung to their beliefs, clearly ignoring any evidence to the contrary.
I’ll happily read, but I’m not going to watch a video. If the message is worthwhile you can express it in writing.
A large portion of the talk centers on data from this paper by Lutz Prechelt (written in 2000), which compares C, C++, Java, Perl, Python, Rexx, and Tcl.
I don’t think that “When compared, C++ and Java are more difficult to write than Perl, Python, Rexx, and Tcl. Oh, and also C is harder to write” is a worthwhile comparison for anybody except maybe a web developer from the year 2000 thinking of switching away from Java.
I also don’t believe the author has heard of type inference, formal verification, or dependent types.
Wood is better than steel! It doesn’t require slow and expensive welding, and when you survey the problems people have with wood houses, few actually relate to the wood beams buckling. Some folks want to make things with steel (skyscrapers, whatever); apparently they just haven’t seen the data.
(There are, I should add, now some relatively tall wood-ish buildings, made possible by clever engineered wood products. I’m really impressed by composites in general, how some fiber + plastic can make something decently strong and light and cheap by combining different kinds of strengths. None of that is meant metaphorically, it’s just neato, but I guess you can make a metaphor from it if you want.)
I have some particular feelings about strengths and weaknesses of specific ecosystems I’ve coded in, but talking about them would just distract from the point. Taking this kind of scalar comparison at face value doesn’t help anyone get the real nuanced understanding they should have for the complex decisions they might have to make.
This came up in chat. The speaker (Smallshire) and the creator of the slides in this post (Hanenberg, I think) both got mentioned in this literature roundup, which is far less persuasive in favor of one side over the other.
I’m not knowledgeable enough on the subject to have a strong opinion categorically one way or another on dynamic vs static typing, so this post is a question rather than any sort of opinion or statement. The main reason to have static typing is to catch a certain class of errors in compile-time, right? There are some other ways to do this; I know, for example, that Erlang doesn’t have static typing, but its static analysis tool, Dialyzer, can infer and detect some type errors.
What other types of errors can be caught in compile-time, and which languages/tools handle those?
The next step up from static typing is probably formal verification. I have only the most passing familiarity with this field, but as far as I know the idea is to express your specification as a mathematical theorem and prove that your program satisfies it. You can accomplish this using tools like Coq, which was used to build this C compiler.
I know, for example, that Erlang doesn’t have static typing, but its static analysis tool, Dialyzer, can infer and detect some type errors.
Another way of saying this is that Erlang has both dynamic typing and static typing.
Sorry for the pedantry, but I dislike that characterization. Dialyzer rightly points out that types are merely a form of analysis, so saying a language “has” them is a bit weird. Languages “have” types when they’re designed or defined such that programs with invalid types cannot compile at all. Languages without this property don’t “have” dynamic types either, and they can at any point gain static types through outside analysis. If you then reject all programs without good typing, you’ve highlighted a subset of the previous language which “has” types.
I dunno. The language here just always seems biased toward some particular implementations and therefore misleading.
It’s trivial to construct a build which will reject Erlang code which does not pass a certain level of type safety.
You can quibble and say that this is “not true Erlang” but that seems very No True Scotsman. Dialyzer ships with Erlang. Does that mean that Erlang is secretly two languages?
An anecdote: I have some experience with Clojure, and some with Scala. Although I liked using Scala’s case classes a lot, I feel things got hairy as types got more complex. I would occasionally get into a situation where the compiler spat out a completely incomprehensible type error that I didn’t know how to resolve. (In these situations I usually resorted to begging a smarter colleague for help.) Compare that to a dynamic language, where the code might compile but result in a run-time error. This code can be run in a debugger, and I can step through to find exactly where my assumption failed.
I have no experience with Haskell or OCaml. Maybe their type systems yield more comprehensible errors than Scala’s, but IMO running code that I can debug is much better than an incomprehensible type error.
IMO running code that I can debug is much better than an incomprehensible type error.
I think that’s entirely a question of tooling: with a poor debugger, running code is incomprehensible, and with a good compiler, type errors are comprehensible. Scala’s errors are bad partly because it was never designed to be used this way. AIUI, Scala implicits were originally intended for simple conversions and the typeclass pattern caught on later, a bit like template metaprogramming in C++. (FWIW your example was obvious enough to me, but maybe I’m just used to spray/scala.)
lol, no. Never. Dynamic Typing is great when you are writing a small script and need to be able to slam it out, but the speed comes at a severe cost. I love dynamic typing, but I would never give up static typing.