1. 15
  1.  

  2. 10

    This is the same tired argument in favor of static typing that you see in every blog. The problem is that while the arguments sound convincing on paper, there appears to be a serious lack of empirical evidence to support many of the benefits ascribed to the approach. Empiricism is a critical aspect of the scientific method because it’s the only way to separate ideas that work from those that don’t.

    An empirical approach would be to start by studying real world open source projects written in different languages. Studying many projects helps average out differences in factors such as developer skill, so if particular languages have a measurable impact it should be visible statistically. If we see empirical evidence that projects written in certain types of languages consistently perform better in a particular area, such as reduction in defects, we can then make a hypothesis as to why that is.

    For example, if there was statistical evidence to indicate that using Haskell reduces defects, a hypothesis could be made that the Haskell type system plays a role here. That hypothesis could then be further tested, and that would tell us whether it’s correct or not. This is pretty much the opposite of what happens in discussions about static typing, however, and it’s a case of putting the cart before the horse in my opinion.

    The common rebuttal is that it’s just too hard to make such studies, but I’ve never found that to be convincing myself. If showing the benefits is truly that difficult, that implies that static typing is not a dominant factor. One large scale study of GitHub projects fails to show a significant impact overall, and shows no impact for functional languages. At the end of the day it’s entirely possible that the choice of language in general is eclipsed by factors such as skill of the programmers, development practices, and so on.

    I think it’s important to explore different approaches until such time when we have concrete evidence that one approach is strictly superior to others. Otherwise, we risk repeating the OOP hype when the whole industry jumped on it as the one true way to write software.

    1. 6

      One large scale study of GitHub projects fails to show a significant impact overall, and shows no impact for functional languages.

      That is not the language used by the authors of the paper:

      The data indicates functional languages are better than procedural languages; it suggests that strong typing is better than weak typing; that static typing is better than dynamic; and that managed memory usage is better than un-managed.

      1. 2

        Look at the actual results in the paper as opposed to the language.

      2. 3

        Anecdote, but TypeScript exists purely to make an existing language use static types. It has near-universal appeal among those who have tried it, and in my experience an entire class of errors disappeared overnight while having almost no cost at all apart from the one-time transition cost.

        Meanwhile “taking all projects will average things out” is unlikely to work well. Language differences are rarely just about types, and different languages have different open source communities with different skill levels and expectations

        1. 3

          As much as I like empiricism and the “there’s not actually that much difference” hypothesis, that article has flaws. In particular, it has sloppy categorization, e.g. classifying bitcoin as “TypeScript”. Also, some of its conclusions set off my “wait what” meter, such as Ruby being much safer than Python and TypeScript being the safest language of all.

          1. 3

            The study has many flaws, and by no means does it provide any definitive answers. I linked it as an example of people trying to approach this problem empirically. My main point is that this work needs to be done before we can meaningfully discuss the impacts of different languages and programming styles. Absent empirical evidence we’re stuck relying on our own anecdotal experiences, and we have to be intellectually honest in that regard.

          2. 2

            That link doesn’t seem to be working. Is this the same study?: http://web.cs.ucdavis.edu/~filkov/papers/lang_github.pdf

            I think you make very good points (even though I currently have a preference for static types). I’d love to see more empirical evidence.

            1. 1

              Thanks, and that is the same study. It’s far from perfect, but I do think the general idea behind it is on the right track.

              1. 2

                I only skimmed the study, but doesn’t it actually show a small positive effect for functional languages? From the study:

                Result 2: There is a small but significant relationship between language class and defects. Functional languages have a smaller relationship to defects than either procedural or scripting languages.

                I realise that overall language had a small effect on defect rate, and they noted that it could be due to factors like the kind of people attracted to a particular language, rather than language itself.

                1. 4

                  The results listed show a small positive effect for imperative languages, and no effect among functional ones. In fact, Clojure and Erlang appear to do better than Haskell and Scala pretty much across the board:

                  lang/bug fixes/lines of code changed
                  Clojure  6,022 163
                  Erlang  8,129 1,970
                  Haskell  10,362 508
                  Scala  12,950 836
                  
                  defective commits model
                  Clojure −0.29 (0.05)∗∗∗
                  Erlang −0.00 (0.05)
                  Haskell −0.23 (0.06)∗∗∗
                  Scala −0.28 (0.05)∗∗∗
                  
                  memory related errors
                  Scala −0.41 (0.18)∗ 0.73 (0.25)∗∗ −0.16 (0.22) −0.91 (0.19)∗∗∗
                  Clojure −1.16 (0.27)∗∗∗ 0.10 (0.30) −0.69 (0.26)∗∗ −0.53 (0.19)∗∗
                  Erlang −0.53 (0.23)∗ 0.76 (0.29)∗∗ 0.73 (0.22)∗∗∗ 0.65 (0.17)∗∗∗
                  Haskell −0.22 (0.20) −0.17 (0.32) −0.31 (0.26) −0.38 (0.19)
                  

                  The study further goes to caution against overestimating the impact of the language:

                  One should take care not to overestimate the impact of language on defects. While these relationships are statistically significant, the effects are quite small. In the analysis of deviance table above we see that activity in a project accounts for the majority of explained deviance. Note that all variables are significant, that is, all of the factors above account for some of the variance in the number of defective commits. The next closest predictor, which accounts for less than one percent of the total deviance, is language.

                  This goes back to the original point that it’s premature to single out static typing as the one defining feature of a language.

          3. 7

            I find their example weird for a couple of reasons:

            They are complaining about a type error in a dynamic language, which is then caught and reported as such at compile time by SBCL, the most popular FLOSS implementation.

            scratch-2.lisp:10:7:
              warning: 
                Derived type of (LAMBDA (X) :IN ADD-TEXT-PADDING) is
                  (FUNCTION (T) (VALUES NULL &OPTIONAL)),
                conflicting with its asserted type
                  (FUNCTION * (VALUES CHARACTER &REST T)).
            ...
            compilation failed
            

            The important bit is (values null ...) vs (values character ..) for those unfamiliar with type declarations in Lisp.

            The mistake is the result of them not reading the documentation for map; even the message from eldoc would hint at their mistake: the first argument is the result-type. Btw, although most libraries are poorly documented, the language itself is very well documented, and it is easy to open the documentation for the function at point from SLIME/SLY with C-c C-d C-h.

            (map ’string …) is used to loop through characters in a string. Note that here we use map function as a helper for a rather imperative procedure

            What is worse is that they seem to double down by mistakenly stating that (map 'string ...) is used to loop through the characters in a string. It is (map ...) that does the looping; 'string names the result type, not the type of the argument. Why would I need to declare the type of an argument in Lisp? And if I did, why wouldn’t it be with the (declare ...) form? Seems that they were ‘in a Haskell mind-set’ while writing the code.
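
            A minimal sketch of how MAP treats its first argument, assuming only standard Common Lisp. The helper names shout, char-list and print-chars are made up for illustration and do not come from the article:

            ```lisp
            ;; The first argument to MAP is the type of the RESULT sequence,
            ;; not the type of the sequence being traversed.
            (defun shout (text)
              "Return a fresh string with every character of TEXT upcased."
              (map 'string #'char-upcase text))

            (defun char-list (text)
              "Return the characters of TEXT as a list."
              (map 'list #'identity text))

            (defun print-chars (text &optional (out *standard-output*))
              "Print each character of TEXT to OUT for side effect only; returns NIL."
              ;; A result type of NIL tells MAP not to build any result sequence.
              (map nil (lambda (c) (princ c out)) text))
            ```

            With 'string as the result type, every value the function returns has to be a character, which is why a lambda that returns NIL conflicts with the asserted type in the SBCL warning above.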

            Additionally, it is not idiomatic Common Lisp to use map to traverse a string when you only care about the side effect of the function. MAP takes the result-type precisely because you care about what the function returns! It would be idiomatic to use the LOOP construct:

            (loop :for char :across string
                  :do (princ char out)
                  :when (char= char #\Newline)
                    :do (dotimes (i padding)
                          (princ #\Space out)))
            

            The code earlier in the function hints that they dislike non-functional aspects of Lisp, like LOOP and FORMAT. Worse, they claim there is no syntax for ‘\n’ in Lisp while pointing to said syntax, ~%. Instead of concatenate, idiomatic Common Lisp would use (format nil "~A~%" str), or they could do away with the entire if and move it to the format control string, (format nil "~:[~;~%~]~A" newline str). In fact, there is no need for (dotimes (i padding) ...); they can use FORMAT for that.

            (format nil "~VT~A" padding line)
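
            A small sketch of the two FORMAT control strings mentioned above, using only standard FORMAT directives. The function names are hypothetical and only wrap the calls so they can be tried at the REPL:

            ```lisp
            (defun line-with-newline (str)
              "Return STR followed by a newline; the newline comes from the tilde-percent directive."
              (format nil "~A~%" str))

            (defun maybe-leading-newline (newline str)
              "Return STR, preceded by a newline only when NEWLINE is true."
              ;; ~:[false-clause~;true-clause~] selects a clause from a boolean argument;
              ;; here the false clause is empty and the true clause emits a newline.
              (format nil "~:[~;~%~]~A" newline str))
            ```

            Calling (maybe-leading-newline nil "x") returns just the string, while passing a true value prepends a single newline, so the surrounding if is no longer needed.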
            
            1. 4

              re standard libraries

              One idea I had when looking at transpilers and C-like LISPs was to make a Python-like LISP. I also thought of Nim, since they look similar. Anyway, the idea is to embed a Python-compatible language in a well-tooled LISP or Scheme where Python code could be easily ported. Start by porting its standard library. Automate this process with Python semantics matched to the LISP. Eventually, one can do most things in the LISP using all those Pythonic libraries, with the extra benefit of a native-code compiler. Maybe also do it in reverse, where one can extract idiomatic Python for distribution to those that don’t know the LISP version. The last part of that brainstorm was possibly doing it in Racket, so people who started with or know Python can go to How to Design Programs to learn Scheme, then to the Pythonic Scheme to get its power while reusing their existing knowledge and libraries.

              Just a brainstorm folks might find interesting. Python has gotten pervasive and critical enough that I do keep thinking back on automated methods to optimize it, secure it, etc.

              1. 8

                Not quite like what you’re describing, but have you seen Hy?

                1. 4

                  I haven’t. Excerpting this:

                  “This is pretty cool because it means Hy is several things:

                  1. A Lisp that feels very Pythonic
                  2. For Lispers, a great way to use Lisp’s crazy powers but in the wide world of Python’s libraries (why yes, you now can write a Django application in Lisp!)
                  3. For Pythonistas, a great way to start exploring Lisp, from the comfort of Python!”

                  That looks like it’s at least half of what I was aiming for. Cool stuff. Bookmarking it. Thanks for the link!

                2. 5

                  There is a Python implementation in Common Lisp: https://github.com/metawilm/cl-python. There is also a library to run a CPython interpreter inside a Common Lisp image and interface with it: https://github.com/mmontone/burgled-batteries

                  Also Marijn Haverbeke, of code-mirror fame, had a similar idea but with JavaScript instead of Python so cl-javascript was born.

                  1. 1

                    That’s more like it! Even runs on multiple implementations of CL. Bookmarked. :)