1. 63

  2. 33

    I think size of the program, and the team maintaining it, is an important factor in the static vs dynamic discussion.

    I’m in the “I don’t make type errors, and if I do, I can shake them out with a few tests” camp for as long as I can comprehend what’s going on in the codebase.

    But when the code grows to the point I can no longer remember where everything is, e.g. I can’t find all callers of a function, dynamism starts to become a burden. When I need to change a field of a struct, I have to find all uses of the field, because every missed one is a bug that’s going to show up later. At some point that becomes hard to grep for, and even impossible to account for in reflection-like code. It can degrade to a bug whack-a-mole, promote more and more defensive programming patterns, and eventually fear of change.

    I’ve had good experiences with gradually-typed languages. They stretch this “size limit” of a program a lot, while still allowing use of duck typing where it helps, without bringing complexity of generic type systems.
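
    A minimal sketch of what this looks like in Python with optional type hints (the function and class names are invented for illustration): the annotated parts can be checked by a tool like mypy, while duck typing stays available where it helps.

```python
from typing import Any

def total_cents(prices: list[int]) -> int:
    # annotated: a checker like mypy can verify every caller passes ints
    return sum(prices)

def describe(thing: Any) -> str:
    # duck-typed escape hatch: anything with a .name attribute works
    return f"item: {thing.name}"

class Product:
    def __init__(self, name: str) -> None:
        self.name = name

print(total_cents([100, 250]))      # checked by mypy and still works at runtime
print(describe(Product("widget")))  # duck typing, no class hierarchy needed
```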

    1. 11

      “Dynamic typing falls apart with large team/large codebase” is one of those cliché arguments that doesn’t really contribute usefully, though.

      Also your presentation of it has multiple issues:

      • Large team/large codebase projects fail all the time regardless of typing discipline. Static typing doesn’t appear to have a better track record of success there.
      • Tooling for dynamically-typed languages has come a long way in the decades since this argument was first raised. You can just get an IDE and tell it to track down and rename references for you. And if your complaint is that it’s harder/impossible to do through “reflection-like code”, well, people can write metaprogram-y reflection stuff in statically-typed languages too.
      • Ultimately, if your codebase has lots of functions or methods that are called from huge numbers of disparate places, to such a degree that you can’t safely work with it without an IDE doing full static analysis to track them all for you, that’s a code smell in any language, in any typing discipline.
      1. 22

        Static languages can verify all metaprogramming is type correct. IDE heuristics can not. In Rust you can write a macro and the compiler will expand and type check it. That kind of stuff is impossible in dynamic languages.

        1. 10

          Static languages can verify all metaprogramming is type correct.

          This is probably going to get off-topic into arguing about the exact definition of “statically-typed”, but: I think that if you venture outside of languages like Rust (which seem to deliberately limit metaprogramming features precisely to be able to provide guarantees about the subset they expose), you’ll find that several languages’ guarantees about ahead-of-time correctness checks start being relaxed when using metaprogramming, runtime code loading, and other “dynamic-style” features. Java, for example, cannot actually make guarantees as strong as you seem to want, and for this among other reasons the JVM itself is sometimes referred to as the world’s most advanced dynamically-typed language runtime.

          There also are plenty of things that seem simple but that you basically can’t do correctly in statically-typed languages without completely giving up on the type system. Truly generic JSON parsers, for example. Sure, you can parse JSON in a statically-typed language, but you either have to tightly couple your program to the specific structures you’ve planned in advance to handle (and throw runtime errors if you receive anything else), or parse into values of such ultra-generic “JSON object” types that the compiler and type system no longer are any help to you, and you’re effectively writing dynamically-typed code.
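
          The two options described can be sketched with Python’s stdlib json module (the User shape here is invented): either everything stays ultra-generic, or you commit to a planned structure and accept runtime errors on anything else.

```python
import json
from dataclasses import dataclass

raw = '{"name": "ada", "age": 36, "extra": true}'

# Option 1: ultra-generic -- everything is dicts/lists/strs/ints;
# no checker can tell you which keys exist or what their types are.
generic = json.loads(raw)
print(generic["name"])

# Option 2: tightly coupled -- parse into a planned structure and
# fail at runtime on any shape you didn't anticipate.
@dataclass
class User:
    name: str
    age: int

def parse_user(text: str) -> User:
    obj = json.loads(text)
    return User(name=obj["name"], age=obj["age"])  # KeyError if the shape differs

print(parse_user(raw))
```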

          1. 5

            for this among other reasons the JVM itself is sometimes referred to as the world’s most advanced dynamically-typed language runtime

            Aren’t runtimes always “dynamically typed”? What does it mean for a runtime to be “statically typed”?

            or parse into values of such ultra-generic “JSON object” types that the compiler and type system no longer are any help to you, and you’re effectively writing dynamically-typed code.

            It sounds like you’re arguing that the worst case for static type systems is equivalent to the best case for dynamic type systems, which doesn’t seem like a ringing endorsement for dynamic type systems. That said, I don’t even think this is true for this JSON-parsing example, because you could conceive of a generic JSON parser that has different unmarshaling strategies (strict, permissive, etc). Further, as static type systems are adopted more widely, this sort of poorly-structured data becomes rarer.

            1. 7

              Aren’t runtimes always “dynamically typed”?

              Some more so than others. Rust, for all its complexity as a language, is mostly shoving that complexity onto the compiler in hopes of keeping the runtime relatively simple and fast, because the runtime doesn’t have to do quite as much work when it trusts that there are classes of things the compiler simply prevents in advance (the runtime still does some work, of course, just not as much, which is the point).

              But a language like Java, with runtime code loading and code injection, runtime reflection and introspection, runtime creation of a wide variety of things, etc. etc. does not get to trust the compiler as much and has to spend some runtime cycles on type-checking to ensure no rules are being broken (and it’s not terribly hard to deliberately write Java programs that will crash with runtime type errors, if you want to).

              That said, I don’t even think this is true for this JSON-parsing example, because you could conceive of a generic JSON parser that has different unmarshaling strategies (strict, permissive, etc).

              If you want truly generic parsing, you’re stuck doing things that the compiler can’t really help you with. I’ve seen even people who are quite adept at Haskell give up and effectively build a little subset of the program where everything is of a single JSON type, which is close enough to being dynamically typed as makes no difference.

              Further, as static type systems are adopted more widely, this sort of poorly-structured data becomes rarer.

              My experience of having done backend web development across multiple decades is that poorly-structured data isn’t going away anytime soon, and any strategy which relies on wishing poorly-structured data out of existence is going to fail.

              1. 2

                Aren’t runtimes always “dynamically typed”?

                If you eschew these needlessly binary categories of static vs dynamic and see everything on a scale of dynamism then I think you’ll agree that runtimes are scattered across that spectrum. Many even shift around on that spectrum over time. For example, if you look at the history of JSR 292 for adding invokedynamic to the JVM you’ll find a lot of cases where the JVM used to be a lot less dynamically typed than it is today.

              2. 4

                There’s no reason you can’t parse a set of known JSON fields into static members and throw the rest into an ultra-generic JSON object.
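
                A rough sketch of this hybrid in Python (Event and its fields are invented): known keys become typed members, and everything else lands in a generic bag.

```python
import json
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Event:
    kind: str                                             # known, statically visible field
    extras: dict[str, Any] = field(default_factory=dict)  # everything else, generically

def parse_event(text: str) -> Event:
    obj = json.loads(text)
    kind = obj.pop("kind")         # pull out the planned field
    return Event(kind=kind, extras=obj)  # keep the leftovers untyped

e = parse_event('{"kind": "click", "x": 10, "y": 20}')
print(e.kind, e.extras)
```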

                1. 4

                  Those are the options I said are available, yes.

                  1. 3

                    I mean, you can do both at once for the same value, getting the benefits of both.

                2. 3

                  Dynlangs are definitely better for data that isn’t structured as well.

                  C#’s dynamic keyword feels like a perfect fit for this situation without having to give up static typing everywhere else. Hejlsberg is ahead of the curve, per usual.

              3. 7

                Fail is too harsh. Unless you’re writing some rocket navigation system, a project is not going to outright fail because of software defects. Run-time type errors merely add to other bugs that you will need to fix, and I argue that bugs caused by runtime type errors are less of a problem in small programs.

                I don’t know of any robust tooling for refactoring large JavaScript projects. Of course most languages have some type-system escape hatches, but I expect languages like JS to use hard-to-analyze type-erasing constructs much more often.

                I disagree that having callers beyond your comprehension is automatically a code smell. It’s a natural state of things for libraries, for example. Ideally libraries should have a stable API and never change it, but it’s not always that easy, especially for internal libraries and reusable core pieces of large projects that may need to evolve with the project.

                It’s not just about IDEs. Compilation will also track down all type errors for you, regardless of where and when these errors happen. When working with teams, it may be someone else working on some other component. In this case the types are a way to communicate and coordinate with others.

                You can make a mess in any language, but how easy is to make a mess varies between languages. Languages that prevent more errors will resist the mess for longer.

                1. 2

                  I expect languages like JS to use hard-to-analyze type-erasing constructs much more often.

                  Why do you expect this?

                  I disagree that having callers beyond your comprehension is automatically a code smell.

                  Even if it’s an internal library, why don’t other internal codebases have a single clear integration point with it? And why does everything else need to have lots of knowledge of the library’s structure? This definitely is a code smell to me – the Law of Demeter, at least, is being violated somewhere, and probably other design principles too.

                  Languages that prevent more errors will resist the mess for longer.

                  This is veering off into another clichéd and well-trod argument (“static typing catches/prevents more bugs”). I’ll just point out that while proponents of static typing often seem to take it as a self-evident truth, actually demonstrating its truth empirically has turned out to be, at the very least, extremely difficult. Which is to say: nobody’s managed it, despite it being such an “obvious” fact, and everybody who’s tried has run into methodological problems, or failed to prove any sort of meaningful effect size, or both.

                  1. 2

                    Why do you expect this?

                    Because the flexibility is a benefit of dynamic languages. If you try to write code as-if it was strongly statically typed, you’re missing out on the convenience of writing these things “informally”, and you’re not getting compiler help to consistently stick to the rigid form.

                    why don’t other internal codebases have a single clear integration point with it?

                    The comprehension problems I’m talking about that appear in large programs also have a curse of being hard to explain succinctly in a comment like this. This is very context-dependent, and for every small example it’s easy to say the problem is obvious, and a fix is easy. But in larger programs these problems are harder to spot, and changes required may be bigger. Maybe the code is a mess, maybe the tech debt was justified or maybe not. Maybe there are backwards-compat constraints, interoperability with something that you can’t change, legacy codebase nobody has time to refactor. Maybe a domain-specific problem that really needs to be handled in lots of places. Maybe code is weirdly-shaped for performance reasons.

                    The closest analogy I can think of is the “Where’s Waldo?” game. If I show you a small Waldo picture, you’ll say the game is super easy, and obviously he’s right here. But the same problem in a large poster format is hard.

                    1. 4

                      Because the flexibility is a benefit of dynamic languages. If you try to write code as-if it was strongly statically typed, you’re missing out on the convenience of writing these things “informally”, and you’re not getting compiler help to consistently stick to the rigid form.

                      You are once again assuming that statically-typed languages catch/prevent more errors, which I’ve already pointed out is a perilous assumption that nobody’s actually managed to prove rigorously (and not for lack of trying).

                      Also, the explanation you give still doesn’t really make sense. Go look at some typical Python code, for example – Python’s metaprogramming features are rarely used and their use tends to be discouraged, and easily >99% of all real-world Python code is just straightforward with no fancy dynamic tricks. People don’t choose dynamic typing because they intend to do those dynamic tricks all the time. They choose dynamic typing (in part) because having that tool in the toolbox, for the cases when you need it or it’s the quickest/most straightforward way to accomplish a task, is incredibly useful.
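
                      As one example of the “tool in the toolbox” point, here is the kind of dynamic trick that is occasionally the most direct solution but rare in day-to-day Python (all names here are invented):

```python
class Commands:
    def cmd_greet(self, name: str) -> str:
        return f"hello, {name}"

    def cmd_shout(self, name: str) -> str:
        return name.upper()

    def run(self, command: str, arg: str) -> str:
        # dynamic dispatch by string -- hard for a static checker to
        # follow, but a compact way to wire up a command table
        handler = getattr(self, f"cmd_{command}", None)
        if handler is None:
            raise ValueError(f"unknown command: {command}")
        return handler(arg)

print(Commands().run("greet", "ada"))
```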

                      The comprehension problems I’m talking about that appear in large programs also have a curse of being hard to explain succinctly in a comment like this

                      Please assume that I’ve worked on large codebases maintained by many programmers, because I have.

                      And I’ve seen how they tend to grow into balls of spaghetti with strands of coupling running everywhere. Static typing certainly doesn’t prevent that, and I stand by my assertion that it’s a code smell when something is being called from so many disparate places that you struggle to keep track of them, because it is a code smell. And there are plenty of patterns for preventing it, none of which have to do with typing discipline, and which are well-known and well-understood (most commonly, wrapping an internal interface around a library and requiring all other consumers in the codebase to go through the wrapper, so that the consuming codebase controls the interface it sees and has only a single point to update if the library changes).

                      1. 3

                        I’ve worked on large codebases maintained by many programmers, because I have. And I’ve seen how they tend to grow into balls of spaghetti with strands of coupling running everywhere. Static typing certainly doesn’t prevent that . . .

                        No, definitely not, agreed. But static typing definitely improves many/most dimensions of project maintainability, compared to dynamic typing. This isn’t really a controversial claim! Static typing simply moves a class of assertions out of the domain of unit tests and into the domain of the compiler. The question is only if the cost of those changes is greater or lesser than the benefits they provide. There’s an argument to be made for projects maintained by individuals, or projects with lifetimes of O(weeks) to O(months). But once you get to code that’s maintained by more than 1 person, over timespans of months or longer? The cost/benefit calculus just doesn’t leave any room for debate.

                        1. 3

                          But static typing definitely improves many/most dimensions of project maintainability, compared to dynamic typing. This isn’t really a controversial claim!

                          On the contrary, it’s a very controversial claim.

                          Proponents of static typing like to just assert things like this without proof. But proof you must have, and thus far nobody has managed it – every attempt at a rigorous study to show the “obvious” benefits of static typing has failed. Typically, the ones that find the effect they wanted have methodological issues which invalidate their results, and the ones that have better methodology fail to find a significant effect.

                          The cost/benefit calculus just doesn’t leave any room for debate.

                          Again: prove it. With more than anecdata, because we both have anecdotes and that won’t settle anything.

                      2. 3

                        Because the flexibility is a benefit of dynamic languages. If you try to write code as-if it was strongly statically typed, you’re missing out on the convenience of writing these things “informally”, and you’re not getting compiler help to consistently stick to the rigid form.

                        I see most typing errors as self-inflicted wounds at this point. Don’t have time or patience for things that can be prevented by the compiler happening at runtime.

                        Dynlangs + webdev together is my kryptonite. If I had to do that all day I’d probably start looking for a new career. Just can’t deal with it.

                    2. 1

                      I don’t know of any robust tooling for refactoring large JavaScript projects

                      Following up on this specifically: I’m an Emacs guy, not really an IDE person, so I don’t know the landscape that well. But everybody I know who goes the IDE route in Python uses PyCharm, so I looked up the rest of the JetBrains product line, and sure enough they have an IDE for JavaScript/TypeScript which claims to support refactoring.

                      I assume it’s not the only such product out there.

                    3. 4

                      Large team/large codebase projects fail all the time regardless of typing discipline. Static typing doesn’t appear to have a better track record of success there.

                      Yes, projects can fail for lots of reasons; no one is claiming that static typing will make a shitty idea commercially successful, for example :) But I do think static types help a lot within their narrow scope–keeping code maintainable, reducing bugs, preserving development velocity, etc. Of course, there’s no good empirical research on this, so we’re just going off of our collective experiences. 🤷‍♂️

                      1. 3

                        Large team/large codebase projects fail all the time regardless of typing discipline. Static typing doesn’t appear to have a better track record of success there.

                        I think it pretty much does, actually. Static typing moves an enormous class of invariants from opt-in runtime checks to mandatory compile-time checks. Statically typed languages in effect define and enforce a set of assertions that can be approximated by dynamically typed languages but never totally and equivalently guaranteed. There is a cost associated with this benefit, for sure, but that cost is basically line noise the moment your project spans more than a single developer, or extends beyond a non-trivial period of time.

                        1. 2

                          I think it pretty much does, actually.

                          As I said to your other comment along these lines: prove it. The literature is infamously full of people absolutely failing to find effects from static typing that would justify the kinds of claims you’re making.

                      2. 13

                        I always laugh when I see ruby code where the start of the method is a bunch of “raise unless foo.is_a? String”. The poor man’s type checking all over the place really highlights how unsuitable these dynamic languages are for real world use.

                        1. 7

                          To be fair, any use of is_a? in ruby is a code smell

                          1. 12

                            Sure, it’s also a pattern I’ve seen in every Ruby codebase I’ve ever worked with, because knowing what types you are actually working with is somewhat important for code that works correctly.

                            1. 5

                              Yeah, the need for ruby devs is much larger than the supply of good ones or even ones good enough to train the others. I’ve seen whole large ruby codebases obviously written by Java and C++ devs who never got ruby mentoring. I expect this is an industry wide problem in many stacks

                          2. 6

                            You seem to just be trolling, but I’ll play along, I guess.

                            I’ve seen a bit of Ruby, and a lot of Python and JavaScript, and I’ve never seen this except for code written by people who were coming from statically-typed languages and thought that was how everyone does dynamic typing. They usually get straightened out pretty quickly.

                            Can you point to some examples of popular Ruby codebases which are written this way? Or any verifiable evidence for your claim that dynamic languages are “unsuitable… for real world use”?

                            1. 6

                              I’m not trolling at all. I’ve been a Rails dev for the last 7 years and seen the same thing at every company. I don’t work on any open source code so I can’t point you at anything.

                              I quite like Rails but I’m of the opinion that the lack of static type checking is a serious deficiency. Updating Rails itself is an absolute nightmare task where even the official upgrade guide admits the only way to proceed is to have unit tests on every single part of the codebase because there is no way you can properly verify you have seen everything that needs to change. I’ve spent a large chunk of time spanning this whole year working towards updating from Rails 5.1 to 5.2. No one else dared attempt it before I joined because it’s so extremely risky.

                              I love a lot of things about Rails and the everything included design but I don’t see a single benefit to lacking types. Personally I see TypeScript as taking over this space once the frameworks become a little more mature.

                              1. 3

                                You made a very specific assertion about how people write Ruby (lots of manual type-checking assertions). You should be able to back up that assertion with pointers to the public repositories of popular projects written in that style.

                                1. 8

                                  I remembered hearing from my then-partner that Rails itself uses a lot of is_a?, and that seems true.

                                    if status.is_a?(Hash)
                                      raise ArgumentError, etc...

                                  1. 3

                                    This is pretty misleading – a quick glance at some of the examples seems like many of them aren’t really checking argument types, and when they are, they’re often cases where a method accepts any of multiple types, and there’s branching logic to handle the different options.

                                    Which is something you’d also see in a statically-typed language with sum types.

                                    The proposition that this is a common idiom used solely as a replacement for static checking is thus still unproved.

                                    1. 1

                                      Well yeah, and then there are those that raise errors, or return some failure-signaling value.

                                      I don’t know what projects to look at since I don’t use Ruby, but I found some more in ruby/ruby.

                                  2. 6

                                    I’ll concur with GP: this is a fairly common pattern to see in ruby codebases.

                                    however, to be fair, it’s a pattern most often introduced after attending a talk by a static typing weenie…

                              2. 5

                                Do you also laugh when you see “assert(x > 0);” in typed languages?

                                1. 7

                                  I would, but it would be a sad laugh because I’m using a type system that can’t express a non-zero integer.

                                  1. 3

                                    I would love to see broader adaptation of refinement types that let you statically guarantee properties like integer values being bound between specific values.
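
                                      Most mainstream languages can only approximate a refinement type with a runtime-validated wrapper. A Python sketch (the Percentage type is invented) shows the shape of the idea, with the caveat that a real refinement type system would reject bad values at compile time rather than at construction:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Percentage:
    """Runtime stand-in for a refinement type: an int constrained to 0..100.

    A refinement type checker would prove this bound statically; here the
    invariant is only enforced when a value is constructed.
    """
    value: int

    def __post_init__(self) -> None:
        if not 0 <= self.value <= 100:
            raise ValueError(f"{self.value} is not in 0..100")

print(Percentage(42))
# Percentage(150) would raise ValueError -- at runtime, not compile time
```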

                                2. 4

                                  I’m in the “I don’t make type errors, and if I do, I can shake them out with a few tests” camp for as long as I can comprehend what’s going on in the codebase.

                                    This is generally true for me, but writing tests or debugging stack traces makes for a slow iteration loop. A type error from a compiler usually contains better, more direct information, so resolving these errors is a lot faster; so much so that I (a 15-year Pythonista) eventually began to prototype in Go.

                                  That said, the biggest advantage for me for a static type checker is that it penalizes a lot of the crappy dynamic code (even the stuff that is technically correct but impossible to maintain/extend over time). A static type system serves as “rails” for less scrupulous team members. Of course, a lot of these less-scrupulous developers perceive this friction as a problem with static type systems rather than a problem with the way they hacked together their code, but I think Mypy and TypeScript have persuaded many of these developers over time to the extent that static types are much less controversial in most dynamic language communities.

                                    Another big advantage is that your type documentation is always correct and precise (whereas docs in a dynamically typed language often go stale or simply describe something as “a file-like object” [does that mean it just has a read() method, or does it also need write(), close(), seek(), truncate(), etc?]). Further, because the type docs are precise, you can have things like https://pkg.go.dev complete with links to related types, even if those types are declared in another package, and you get all of this for free.
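
                                    For what it’s worth, even Python can pin down the “file-like object” ambiguity with a structural Protocol, which makes the required methods explicit and checkable (Readable and first_line are invented names):

```python
import io
from typing import Protocol

class Readable(Protocol):
    """Documents the duck type precisely: only read() is required."""
    def read(self, size: int = -1) -> str: ...

def first_line(f: Readable) -> str:
    return f.read().splitlines()[0]

# any object with a matching read() is accepted, no inheritance needed
print(first_line(io.StringIO("alpha\nbeta")))
```

A checker like mypy would flag passing an object without read(), while still accepting anything that structurally matches.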

                                  1. 4

                                    I’m in the Type everything if it’s even kinda big camp now. There are too many things I need to think about during the day to remember the state and usage of every variable of every program I’ve ever written, used or inspected. Typings are rails for my logic. Typings are documentation. Types help my IDE help me. I will take every single shortcut I can when the timespan I or anyone else could be interacting with the code is longer than 10 minutes.

                                    Retracing steps is just so tedious and frustrating when you had it all in your head before. It just sucks. I just wanna build stuff, not fill my head with crap my computer can do.

                                    /rant

                                  2. 13

                                    I’ve been disappointed by dynamic typing for basically the entirety of the time I’ve been a professional software developer using dynamically-typed languages. Like the author, one of my first programming jobs involved writing Ruby - but that just made me acutely aware of runtime bugs caused by type errors (including other people’s type errors), because I had to fix them.

                                    The lesson of the runtime type error in the author’s code is that even if you are doing some kind of metaprogramming, you still want access to a type system that will statically guarantee that you’re not trying to do something like call a method that expects only hashable types with an unhashable type.

                                    1. 12

                                      On an episode of the On The Metal podcast, Jonathan Blow said something to the effect of “we’re trading about 10x performance by using Python, but are we really getting 10x the developer experience back on the trade?” It does seem like if being dynamic is so slow (which it is), it also ought to be an even better experience using it, but it’s generally not, at least, not yet.

                                      1. 6

                                        I think the problem with Blow’s argument (in my mind) is that:

                                          a) Time to value is often way more important than performance, and the language + ecosystem have this in spades.
                                          b) Software requirements are dynamic.
                                          c) No one is choosing Python for very performance-critical applications.

                                          Also, Python isn’t slow because it’s dynamic. Python is slow because it’s interpreted. I bet heavily optimized, ahead-of-time compiled Python could be within a factor of 2x of many equivalent C programs, and beat the arbitrary 10x number generally. The difficulty of getting there can’t be overstated, however.

                                        1. 12

                                          Python is slow because it’s interpreted

                                          This used to be a pet peeve of mine because like … Python is obviously compiled; all Python becomes bytecode before it’s interpreted. That’s what .pyc files are.

                                            Then someone told me that Python bytecode is basically just a weird serialization format of the AST, meaning like … it’s compiled, but in a way that hardly benefits from the compilation at all, as far as I can tell? All it does is save you having to parse? Anyway, the takeaway I got from that is that judging dynamic languages in general by how well Python does just isn’t fair, and Python should be taken as an extreme example of a design that can’t be well optimized. It’s slow because it’s Python, not because it’s dynamic. Basically every other dynamic language gets this right where Python gets it wrong.
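
                                            The “compiled, but only barely helped by it” point is easy to see with the stdlib dis module: the addition below compiles to one generic opcode that must handle ints, strings, lists, and everything else.

```python
import dis

def add(a, b):
    return a + b

# CPython compiles this to bytecode, but the add itself is a single
# generic instruction (BINARY_OP on 3.11+, BINARY_ADD on older
# versions) that dispatches on operand types at runtime.
dis.dis(add)

print(add(2, 3))
print(add("a", "b"))  # the very same bytecode handles strings
```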

                                          1. 6

                                            It’s slow because it’s Python, not because it’s dynamic. Basically every other dynamic language gets this right where Python gets it wrong.

                                            Eeeeh. I’m not sure about this generalization? What are you arguing that Python gets wrong but other languages get it right?

                                              It’s easy to prove that interpreted languages are slower than their compiled counterparts. But Julia is dynamic, compiled, and fast. As is compiled Common Lisp. Dynamic languages do not imply slow. Dynamic interpreted languages, however, tend to be slow without some sort of native compilation (including JITs; see PyPy, LuaJIT, Ruby’s JIT, etc.), and even then…

                                            1. 6

                                                It’s not that Python is “wrong” exactly, but rather that Python made design choices that make a fast interpreter hard to write.

                                                IIRC, in particular the dynamic typing & introspection capabilities combine to make it hard to nail down the types of objects at compile time, so the bytecode is forced to be very high level - you’re effectively calling functions on objects all the time, unable to specialise down to optimised routines that can assume that a variable is a given type. The problem (for a python compiler) is that at any point the variable in question might change type, so unless the compiler can prove that this isn’t the case it has to generate generic code for every operation & can’t specialise or inline anything.

                                              With more advanced program analysis you can deduce the types of some variables but interpreted languages that don’t allow Python’s level of dynamic flexibility are always going to have a much easier time creating fast low-level bytecode or using a JIT to generate fast machine code.

                                              1. 4

                                                I think it’s also a matter of how much engineering time is thrown at the problem. JavaScript shares many of the same issues with Python, but JS engines have become quite fast due to the massive number of optimization tricks they pull. I think CPython also has a goal of being easy to understand and hack on, which might be an additional barrier to such massive improvements.

                                                1. 1

                                                  It’s not that Python is “wrong” exactly, but rather that Python made design choices that make fast interpreter hard to write.

                                                  I guess I don’t understand what choices Python made that other popular dynamic + bytecode interpreted languages didn’t? That’s my question here.

                                                  In the Languages Benchmarks Game (where > implies faster): PHP > Perl > Python > Ruby > Lua.

                                                  1. 3

                                                    The biggest problem with Ruby (and I haven’t done enough Python to say whether this is true there as well, but I suspect it is, maybe to a slightly lesser degree?) is that idiomatic code simply invokes dynamic dispatch everywhere. (In Ruby’s case it’s through message passing; in Python’s it would be a hash-table lookup on the class’s method table instead, I think?) While it’s possible to write code using lambdas, etc. which never invokes message passing, such code would be considered very unusual. Ruby and Python usually get bucketed in the same class of “just about as slow as it gets”, with one or the other pulling ahead in some specific area.

                                                    Meanwhile, Lua’s pervasive use of closures and locals for nearly everything results in far fewer lookups for calls. Plus, in cases where you are calling a function from a table in a tight loop and a bottleneck is identified, it’s usually trivial to optimize it (in “userspace”) to a local once you’ve identified the issue, if you’re not on LuaJIT where the runtime does it for you.

                                                    Of course, it’s easy to write Lua code that has dispatch overhead by shoving everything in a table, but at the same time the cost of such code is pretty clear, vs Ruby where the slow way is the default for everything.
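
For what it’s worth, the same hoist-it-into-a-local trick applies in CPython too, since a global lookup otherwise happens on every iteration. A hypothetical micro-optimization sketch, not something you’d litter everyday code with:

```python
def slow_sum(n):
    total = 0
    for i in range(n):
        total += abs(i)        # LOAD_GLOBAL lookup on every iteration
    return total

def fast_sum(n):
    local_abs = abs            # cache the builtin in a local once
    total = 0
    for i in range(n):
        total += local_abs(i)  # LOAD_FAST instead of LOAD_GLOBAL
    return total
```

Both compute the same result; the second just pays the name-resolution cost once instead of `n` times.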

                                                    1. 2

                                                      I’d truly like to see a benchmark of a real world script in Ruby, Python, Lua to do this comparison. Python is flexible enough that the use of classes isn’t necessarily standard, but used when it makes sense. And, from what I’ve seen with Lua the same is true. If it makes sense to use a OOP, then methods will be devised somehow. Both Lua and Python have heavily optimized core datastructures (tables, or dicts + lists), and my guess is that performance differences between them are negligible.

                                                      Modules in both languages are a wrapper over their table / dict implementations, and local variables might go to Lua because it’s likely an array index, though it’s likely a stack index operation in Python.

                                                      >>> def foo(x, y):
                                                      ...     a = x + y
                                                      ...     print(a)
                                                      ... 
                                                      >>> import dis
                                                      >>> dis.dis(foo)
                                                        2           0 LOAD_FAST                0 (x)
                                                                    2 LOAD_FAST                1 (y)
                                                                    4 BINARY_ADD
                                                                    6 STORE_FAST               2 (a)
                                                      
                                                        3           8 LOAD_GLOBAL              0 (print)
                                                                   10 LOAD_FAST                2 (a)
                                                                   12 CALL_FUNCTION            1
                                                                   14 POP_TOP
                                                                   16 LOAD_CONST               0 (None)
                                                                   18 RETURN_VALUE
                                                      

                                                      I don’t know – these look like index operations, not dict lookups. We might be splitting hairs. Gotta go LOAD_FAST

                                                      1. 2

                                                        They’re not index operations though. Look at the implementation of BINARY_OP_ADD_INT in the Python interpreter in https://github.com/python/cpython/blob/main/Python/generated_cases.c.h

                                                        TARGET(BINARY_OP_ADD_INT) {
                                                                PyObject *right = PEEK(1);
                                                                PyObject *left = PEEK(2);
                                                                PyObject *sum;
                                                                assert(cframe.use_tracing == 0);
                                                                DEOPT_IF(!PyLong_CheckExact(left), BINARY_OP);
                                                                DEOPT_IF(Py_TYPE(right) != Py_TYPE(left), BINARY_OP);
                                                                STAT_INC(BINARY_OP, hit);
                                                                sum = _PyLong_Add((PyLongObject *)left, (PyLongObject *)right);
                                                                _Py_DECREF_SPECIALIZED(right, (destructor)PyObject_Free);
                                                                _Py_DECREF_SPECIALIZED(left, (destructor)PyObject_Free);
                                                                if (sum == NULL) goto pop_2_error;
                                                                STACK_SHRINK(1);
                                                                POKE(1, sum);
                                                                next_instr += 1;
                                                                DISPATCH();
                                                            }
                                                        

                                                        and then look at the definition of _PyLong_Add() in https://github.com/python/cpython/blob/main/Objects/longobject.c, which is even more complex. Then consider that this is the specialised code for ints; it’s worse for more complex types.

                                                        Python is slow because every time it does anything at all it has to check the types & do a bunch of other unavoidable busywork because of the semantics of the language. Even LOAD_FAST is relatively slow, because the interpreter is a stack machine & variables have to be pushed onto the stack in order to do anything at all with them, which means tweaking reference counts & copying object pointers.

                                                        As I understand things it has to do all this work because the python interpreter doesn’t know that variables haven’t changed type since the last time it looked at them. In principle you could write a JIT that did proof-tool like work to satisfy itself that a given variable was always an Int, or whatever (this is the kind of thing that V8 does I believe) but the python runtime doesn’t do anything like that. This interpreter style (a bytecode on a stack machine) probably also keeps the code relatively straightforward, at the expense of a lot of extra memory pressure shuffling data around at runtime.

                                                        Python has always been written for ease of programming itself & ease of programming for people writing it. Performance has never been a priority.

                                                        1. 1

                                                          Most of what you are describing as slow parts of the interpreter are just implementation choices that could be changed. Yes, the types constantly need to be checked because they can change. That’s a property of JavaScript, too. Jython doesn’t have to “constantly tweak reference counts” because it runs on the JVM where tracing GC is used instead.

                                                          Yes, there are parts of Python’s semantics that make it challenging to make performant. My overarching point, in this whole thread, is that this fact might limit the overall expected performance, but it doesn’t mean that Python can’t be sped up significantly despite it.

                                                          1. 1

                                                            Sure, if you put the same level of engineering as went into V8 into a Python interpreter, you could probably get something as fast, at the expense of some loss of compatibility.

                                                            That’s a big ask though & also doesn’t seem to be a direction the existing Python devs want to go in.

                                                            1. 1

                                                              at the expense of some loss of compatibility.

                                                              What compatibility? C extensions? Sure. Python is a language first, and a community second. If your goal is “fast Python,” the hurdle is “Python’s obvious implementation is interpreted due to its semantics.” If you want community compatibility for all that’s on PyPI, then it’s going to be a much greater undertaking.

                                                              My point continues to be that Python could be made much faster, and that the dynamic nature of it does not completely discount all performance improvement opportunities like other folks in this thread keep suggesting.

                                                              1. 1

                                                                No one is saying it’s impossible, just that it’s as hard as or harder than making JS fast, and that took Google-level expenditure over many years.

                                                2. 2

                                                  Eeeeh. I’m not sure about this generalization? What are you arguing that Python gets wrong but other languages get right?

                                                  I’m saying (and again this is based on what I’ve been told) that if Python’s bytecode is just “let’s take that AST that we just parsed and dump that out to disk just so we don’t have to parse it again” instead of actually being designed for performance, then it’s doing much worse than every other non-toy dynamic language, because the compilation process is hardly making any meaningful transformation at all.

                                                  Even without JIT, a bytecode compiler that does meaningful compilation in the bytecode will absolutely beat the pants off Python.

                                                  It’s easy to prove that interpreted languages are slower than their compiled counterparts

                                                  That’s the thing; there really aren’t any interpreted languages in widespread use, other than Python (which is compiled, but in a way that’s so trivial that it might as well be interpreted.) Every other dynamic language in widespread use either compiles to bytecode, or compiles to native code, or both. I believe the last holdout was R, which added a bytecode compiler in 2011, but of course this depends on how you define “widespread use”.

                                                  I guess it’s possible that there are some other bytecode formats used by other languages that make the same mistake as Python and just serve as a serialization format for their AST; if that’s what you mean then I’m interested to hear more. I don’t know of any such languages myself.

                                                  1. 3

                                                    I’m saying (and again this is based on what I’ve been told) that if Python’s bytecode is just “let’s take that AST that we just parsed and dump that out to disk just so we don’t have to parse it again” instead of actually being designed for performance

                                                    CPython compiles the AST to bytecode for a stack-based interpreter, which you might recall is similar to the JVM. The instructions are here. It does a number of peephole optimizations as well. It’s certainly the case that there’s always room for more compiler optimizations, and probably also room for superinstructions (combinations of common instructions to reduce dispatch overhead). Python is probably no less sophisticated than Ruby, Lua, or Perl in the optimizations it does in its bytecode interpreter. I’m pretty sure they all use a switch statement, for instance, rather than a more sophisticated threading model. Jython, an implementation equivalent to Python 2.7, targets the JVM and JVM bytecode, fwiw, if we’re looking for arguments on the sophistication of the interpreters.
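
One of those peephole/AST optimizations is easy to observe: constant expressions are folded at compile time, so no generic binary ops survive into the bytecode. A small sketch (exact opcodes vary by CPython version):

```python
import dis

def f():
    return 2 * 3 + 4  # folded to the single constant 10 at compile time

dis.dis(f)                   # no BINARY ops; the constant is loaded directly
print(f.__code__.co_consts)  # the folded constant appears here
```

So the compiler does make real transformations; it just stops well short of the kind of type specialisation discussed elsewhere in the thread.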

                                                    then it’s doing much worse than every other non-toy dynamic language, because the compilation process is hardly making any meaningful transformation at all.

                                                    I’m not sure which dynamic, interpreted languages are doing loads of optimizations when compiling to their internal bytecode for execution. Optimization takes time, which increases execution latency. I briefly looked at Ruby’s compiler (which I think is now based on YARV) and they seem to do some optimizations, but it’s not obviously clear to me what passes (if any) they make; most likely peephole, and “let’s avoid the obvious stupid thing” in different cases. Looking at the Lua code generator, it does some peephole optimizations as fixups and tries to be smart in specific cases to avoid unnecessary work. It does not appear to be doing sophisticated control-flow analysis or making multiple passes to generate the tightest possible bytecode.

                                                    I’ve not dug into Perl, or Raku, but both of those languages also have bytecode interpreters internally, and I’m sure are doing some optimizations, but not significant compilation passes.

                                                    And, of course the reason for this is latency.

                                                    That’s the thing; there really aren’t any interpreted languages in widespread use, other than Python (which is compiled, but in a way that’s so trivial that it might as well be interpreted.)

                                                    The 4/5 languages mentioned above have nearly the same execution design and are in widespread use. I suppose I should look at PHP, which I’m pretty sure is also now a similar design of “compile to bytecode with some small number of optimizations.” Lua is a “register machine.” I know Python and Perl use stack machines, and I’m not entirely sure about YARV. Point is, I’m not sure you’re making a coherent argument?

                                                    1. 2

                                                      I’m not sure you’re making a coherent argument?

                                                      Actually I think it sounds more like the argument I’m making might have been based on something incorrect I read somewhere, and that the Python bytecode process is doing more than that?

                                                      1. 2

                                                        Perhaps!

                                                    2. 1

                                                      Even without JIT, a bytecode compiler that does meaningful compilation in the bytecode will absolutely beat the pants off Python.

                                                      See my comment here

                                                  2. 6

                                                    This was one of the hardest lessons to learn when writing a derivative of Python: bytecode is not necessarily fast, just packed. I’m glad that I learned it, though.

                                                    1. 1

                                                      it’s compiled, but in a way that doesn’t hardly benefit from the compilation at all, as far as I can tell?

                                                      To expand on / confirm what follows: the stronger statement here is that it not only “doesn’t” benefit from compilation, it also “can’t” benefit from compilation, because executing bytecode isn’t the bottleneck. The bottleneck is the C runtime implementing Python’s very dynamic semantics:

                                                      https://blog.kevmod.com/2016/07/02/why-is-python-slow/

                                                    2. 4

                                                      a) Well, Blow is a game developer, so of course performance is crucial to him. He’d also dismiss most GC’d languages because of unpredictable pause times (and I know it’s less true with some modern GCs). He’d probably dismiss any non-natively-compiled language.

                                                      b) Does this imply that the typing has to be dynamic? You can change types when you program, it’s fine.

                                                      c) Sure, it’s a self-fulfilling prophecy. Python is so slow that no one picks it up for performance-critical software.

                                                      Really, if you restrict the argument to gamedev, I think Blow is right. In general, if you include web development, sysadmin, etc., then there are good reasons to use it.

                                                      Now, about the dynamic part: dynamism is a big part of why Python is slow. To make it fast, you need some form of JIT, because you have to speculate about values and undo your guesses if you’re wrong. Even the bytecode, like /u/technomancy says, is slow. There are hash-table lookups everywhere. Python’s design is just… not very amenable to performance, and that’s partly because it’s incredibly dynamic and relies on so many table lookups semantically.
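
A tiny illustration of those table lookups: ordinary instance attributes really are entries in a per-object dict (a sketch, ignoring `__slots__` and other special cases):

```python
class Point:
    def __init__(self):
        self.x = 1

p = Point()
print(p.__dict__)              # instance attributes live in a dict: {'x': 1}
assert p.x == p.__dict__["x"]  # attribute access is (semantically) a dict lookup
p.__dict__["y"] = 2            # writing to the dict creates a new attribute
print(p.y)
```

Every `p.x` in a hot loop is, at the language level, a hash-table probe (plus a walk of the class hierarchy on a miss), which is exactly what a JIT has to speculate away.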

                                                      1. 2

                                                        a) Well, Blow is a game developer, so of course performance is crucial to him. He’d also dismiss most GC’d languages because of unpredictable pause times (and I know it’s less true with some modern GCs). He’d probably dismiss any non-natively-compiled language.

                                                        Right! If performance is your goal, look elsewhere.

                                                        b) Does this imply that the typing has to be dynamic? You can change types when you program, it’s fine.

                                                        No. I did not mean to imply dynamic requirements mean dynamically typed language. There are certainly lots of refactoring arguments on both sides of the coin. :)

                                                        c) Sure, it’s a self-fulfilling prophecy. Python is so slow that no one picks it up for performance-critical software.

                                                        ding, ding, ding!!! It’d be hard, it might require lots more memory, but you could make Python loads faster than it is today.

                                                        Really, if you restrict the argument to gamedev, I think Blow is right. In general, if you include web development, sysadmin, etc., then there are good reasons to use it.

                                                        Agreed.

                                                        Now, about the dynamic part: dynamism is a big part of why Python is slow. To make it fast, you need some form of JIT, because you have to speculate about values and undo your guesses if you’re wrong. Even the bytecode, like /u/technomancy says, is slow. There are hash-table lookups everywhere. Python’s design is just… not very amenable to performance, and that’s partly because it’s incredibly dynamic and relies on so many table lookups semantically.

                                                        There are lots of dynamic languages that are “fast.” Common Lisp, Julia, Chez Scheme are all pretty fast. Dynamic doesn’t mean “slow.” I agree that Python’s semantics and level of dynamism are not obviously amenable to performance, but the bulk of Python code in existence today isn’t taking advantage of all this dynamism. And even if you did, it’s still not impossible to improve upon.

                                                        1. 1

                                                          There are lots of dynamic languages that are “fast.” Common Lisp, Julia, Chez Scheme are all pretty fast. Dynamic doesn’t mean “slow.” I agree that Python’s semantics and level of dynamism are not obviously amenable to performance, but the bulk of Python code in existence today isn’t taking advantage of all this dynamism. And even if you did, it’s still not impossible to improve upon.

                                                          Yeah I think we basically agree. “Dynamic” is also a spectrum: for example, scheme has lexical scoping, and you can potentially pass variables in registers, perform optimization passes, compile to native code, etc. Julia is clearly designed to be JIT-compilable as well (types are runtime but are used to resolve overloading, if I understand correctly; you thus get monomorphic implementations of functions, heavily optimized, and which implementation is used is decided by runtime types).

                                                          At the same time, Python allows you to change and observe pretty much anything at any time, so you can’t do anything behind the user’s back. My impression is that people have tried a lot to accelerate python (since unladden swallow, etc.) but it always breaks some of the semantics or some of the C API, all of which is part of the language in practice. Maybe it’s even that “python” the language could be optimized reasonably well, but “cpython” is basically impossible to optimize, and it’s what people mean when they talk about “python”.

                                                      2. 3

                                                        Also, Python isn’t slow because it’s dynamic. Python is slow because it’s interpreted.

                                                        You have it backwards, strangely enough. It’s interpreted because it’s dynamic.

                                                        Python is so dynamic that even if you try to compile it ahead of time, all you do is generate a pile of machine code that is also slow because it’s still doing all of the dynamic dispatch and hash table lookups that make Python slow. Many people have tried to make ahead-of-time compilers for dynamic languages and they rarely get anywhere near the performance of compiled static languages.

                                                        Because you can’t compile Python’s insane level of dynamism to efficient native code anyway, you may as well just do a bytecode interpreter which is way simpler, more hackable, and makes it easier to do runtime introspection, debugging, etc.
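
A small sketch of the dynamism in question: because any method can be rebound at runtime, an ahead-of-time compiler can’t devirtualize even a trivial call site:

```python
class Greeter:
    def greet(self):
        return "hello"

g = Greeter()
print(g.greet())  # "hello"

# Rebind the method at runtime: the same call site now does something
# different, so compiled code can never assume what g.greet() resolves to.
Greeter.greet = lambda self: "hijacked"
print(g.greet())  # "hijacked"
```

This is the core reason AOT output for Python ends up as machine code that still performs all the dispatch lookups at runtime.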

                                                        1. 4

                                                          You have it backwards, strangely enough. It’s interpreted because it’s dynamic.

                                                          I don’t think it’s strange at all, nor do I think it’s backwards. The easiest way to build a highly dynamic language is certainly to interpret it. We agree here. But that’s not the only way to do so. You know that Knuth quote about premature optimization? 97% of the time it doesn’t matter, but that 3%, oh boy, make the 3% more efficient.

                                                          But, in a language like Python, you always pay for dynamism that you’re using “3% of the time.” That doesn’t make much sense at all. Consider the common uses of Python today:

                                                          1. Straight up numerical computing, via IPython notebooks, etc. Half of this is calling out to Python C extensions. It’s also generally interactive, and not likely to utilize dynamic features outside of a few basic things. Here’s the thing: the slow parts (the large numerical operations) are slow enough that paying extra for dynamism isn’t a big deal. Interactive computing can deal with some latency.
                                                          2. Scripts for system utilities, package managers, text processing, etc.: These are typically written in Python to avoid committing to static types, but often aren’t introspecting / disassembling classes, or doing anything crazy. You’re parsing text, generating new text, calling system utilities, executing syscalls, etc. You’re likely not using a REPL, and you’re almost certainly not doing things that require insane amounts of introspectability of the interpreter, or the bytecode.
                                                          3. Web servers: In the web server case, or any of the other uses of “run this Python code on a server somewhere and interact with it via a Layer 7 network protocol,” you’re not exposing the dynamic capabilities of the interpreter – that’d be a security nightmare, or a performance hit.
                                                          4. Everything else: You’re almost certainly not doing crazy magic things like introspecting the runtime to hack to rewrite bytecode in odd ways, or any other magic thing you might consider.

                                                          So what are we paying for? “97% of the time” we’re writing code that could be sped up by compiling it directly to its CPython API equivalent in C. We know this because C extensions, PyPy, and previous experiments like Cython, and the really old Psyco actually significantly increase performance. For the 3% of the time where this doesn’t work, we can afford to take the hit of throwing out the compiled stuff, and interpreting it.

                                                          (And, as I’ve said many times in this thread: Yes, this is hard work that might take years)

                                                          Because you can’t compile Python’s insane level of dynamism to efficient native code anyway, you may as well just do a bytecode interpreter which is way simpler, more hackable, and makes it easier to do runtime introspection, debugging, etc.

                                                          I guess it depends on your goals. There’s obviously a lot of time and money invested in Python (specifically), and only some of that time seems to be devoted to raw performance improvements. It doesn’t seem, for instance, that the Python community has fully embraced PyPy (it’s not been mainlined, for instance), so maybe there’s a general consensus that performance matters less than maintenance, hackability, and understandability. Not sure those are the same principles I’d work towards, but then again, I don’t really program in Python anymore. But there are clearly members of the community who feel otherwise, e.g. Pyston.

                                                          1. 2

                                                            Mypyc claims about a 5x speed-up for AOT compilation. That seems pretty respectable.

                                                        2. 3

                                                          It’s a strange comment. Why should we care if the ratio is 1:1? Most of the time performance is not an issue.

                                                          1. 1

                                                            The trade-off seems to pay dividends in the top ranks of Advent of Code, at least. I appreciate how static typing and advanced algebraic types slow me down and force me to think of correctness first and foremost while on the job, but in a throwaway program at the beginning of the day in December, I feel I progress faster using a dynamic language.

                                                          2. 11

                                                            I just want to call out, I think the author isn’t talking about dynamic/static typing from the perspective of developing a regular application, but for program analysis.

                                                            I think we’re going through a wave of static analysis, but dynamic analysis like the author is talking about will surely come back as we start to hit the limits of static analysis.

                                                            For me, it’s not as important whether the best tooling exists out there or not, but whether I can write some tooling that’s good enough for me. I spent an afternoon a few weeks ago writing a Svelte store inspector using Proxy; honestly, I’m sure an extension exists that does it better, but I could write something so that from the web console I could do stores.user = newUser or stores.user to update/see the existing value at runtime.
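
The same trick works in any dynamic runtime. A rough Python analog of that Proxy-based inspector idea (hypothetical names, just a sketch) might look like:

```python
class InspectableStore:
    """Logs every attribute write, like a JS Proxy wrapping a store."""

    def __init__(self):
        # Bypass our own __setattr__ while setting up internal state.
        object.__setattr__(self, "_data", {})

    def __setattr__(self, name, value):
        print(f"set {name} = {value!r}")  # observe writes at runtime
        self._data[name] = value

    def __getattr__(self, name):          # only called on lookup misses
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

stores = InspectableStore()
stores.user = "alice"  # logged write
print(stores.user)     # read back the current value: alice
```

From a REPL you can then poke `stores.user = new_user` or read `stores.user` exactly as described above for the web console.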

                                                            Also something I’ve realized recently in the whole static/dynamic debate, I think we’re usually talking about the wrong thing (just if you have a type checker or not). TypeScript is still a dynamic language in the way that I care because the runtime is dynamic.

                                                            1. 4

                                                              dynamic analysis like the author is talking about will surely come back as we start to hit the limits of static analysis

                                                              This sounds odd to me, as I would expect static analysis to have a higher upper bound than dynamic analysis. What are the limits of static analysis, and what approaches could we take to push dynamic analysis past them? It seems unbelievable that tools for Python could catch as many issues as fbinfer for example. And as we increase the abstraction power of our static types, such as with (liquid) refinement types, dependent types, effect types, etc., it looks like our ability to analyze statically typed languages is only increasing.

                                                              TypeScript is still a dynamic language in the way that I care because the runtime is dynamic.

                                                              Why do you care that the runtime is dynamic? What benefits do you gain from having your data boxed this way? If you want to extend types with new fields, or allow different kinds of structures that all share the same interface without requiring specification, we already have this in statically typed land with row polymorphism & structural typing / duck typing.
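
Python itself offers this in statically checked form: `typing.Protocol` gives structural (“duck”) typing that mypy and friends verify with no inheritance relationship required. A minimal sketch:

```python
from typing import Protocol

class Closeable(Protocol):
    def close(self) -> str: ...

class File:                        # note: does NOT inherit from Closeable
    def close(self) -> str:
        return "closed"

def shutdown(resource: Closeable) -> str:
    return resource.close()

print(shutdown(File()))  # type-checks: File is structurally Closeable
```

Any object with a compatible `close()` satisfies the interface, which is exactly the duck-typing flexibility, but checkable before runtime.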

                                                              1. 4

                                                                I care that the runtime is dynamic so I can programmatically interact with/inspect the program at runtime, like I was describing with my svelte stores example.

                                                                1. 3

                                                                  I’ve gone the dynamic route pretty far for this – having implemented a few runtimes that use that approach – and in my latest stint I ultimately found static types to actually be helpful for the inspection scenario since it gives you places to attach metadata about the shape of your data. For the run-time aspect I found that serializing the app state and then restarting it was way more resilient to code changes than running code in a current VM state (often the VM state becomes something that’s hard to reproduce). This definitely makes more sense if serialization is a core part of your app (which I’ve often found to be the case in my applications – games or tools).

                                                                  Bit of an old video but here’s a workflow demonstrating some of this: https://youtu.be/8He97Sl9iy0?t=132 – directly links to a timestamp where I modify a data structure / metadata on some fields and you can see the results in the UI (while the state of the application remains, including undo / redo history). Earlier in the video at 0:47 you can see the app actually have a hard error due to array out of bounds, but after I fix that the app actually continues from the same point.

                                                                  But yeah I get all of this while also getting the static compile benefits of performance etc. in this case (it compiles to C++ and then to wasm. I enable more clang / llvm optimizations when I do a release build (takes longer)).

                                                            2. 9

                                                              I just wanna see more weird programming stuff, people

                                                              1. 6

                                                                John Shutt discusses some similar ideas. He delineates a class of ‘interpreted programming languages’, in which this philosophy may be applied to ordinary application runtime (not just analysis), and classifies even seemingly dynamic standouts like Common Lisp as ‘compiled’ (contra Kernel, of course). I have never been quite sure what to make of Kernel, but I think that there is obviously something there, and that we would be fools to discard it without considering it.

                                                                1. 1

                                                                  Thanks for the link.

                                                                2. 4

                                                                  Odd, over time I start enjoying dynamic typing more. At the risk of a competing article, here are my small praises:

                                                                  • There exists the “Good, Fast, Cheap” triangle. I tend to heavily optimize for “Cheap” so that the developer can explore the problem space quickly. O(n) doesn’t matter if you only run it twice before deciding the problem has more complexity and needs a different approach. Once the code stabilizes, start using more developer time to move the trade-offs for “more good, less cheap”.
                                                                  • Tricks that dynamically wrap code work better than decorators. For example, instead of calling place_window(w, point, level), I might, even in the debugger, call trace_place_window(w, point, level). The function did not exist before the call; it is just the original function wrapped in a tracing function.
                                                                  • I wish my software discovered types better. For example, if I used a tuple of (int, string, string) more than once, it should recognize this is really a type and show it as such.
                                                                  • I truly hate that adapters fall out of date with what they wrap and are just piles of boilerplate.
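
                                                                  The trace_ trick above can be sketched in JavaScript with a Proxy that fabricates wrappers on demand (place_window and the namespace are made-up stand-ins for the example):

                                                                  ```javascript
                                                                  // A namespace of ordinary functions (place_window is the made-up example).
                                                                  const fns = {
                                                                    place_window: (w, point, level) => `${w}@${point}/${level}`,
                                                                  };

                                                                  // A Proxy that synthesizes trace_<name> wrappers on demand: the traced
                                                                  // variant does not exist until the moment you ask for it, e.g. in a debugger.
                                                                  const ns = new Proxy(fns, {
                                                                    get(target, key) {
                                                                      if (typeof key === 'string' && key.startsWith('trace_')) {
                                                                        const name = key.slice('trace_'.length);
                                                                        const inner = target[name];
                                                                        return (...args) => {
                                                                          console.log(`call ${name}(${args.join(', ')})`);
                                                                          const result = inner(...args);
                                                                          console.log(`  -> ${result}`);
                                                                          return result;
                                                                        };
                                                                      }
                                                                      return target[key];
                                                                    },
                                                                  });

                                                                  ns.trace_place_window('main', '10,20', 3); // traced, same return value as place_window
                                                                  ```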

                                                                  At the end of the day, we are here to solve problems. We ship code to solve problems. Test cases, template specifications, git logs, and even documentation are just costs that are worthwhile only if they reduce the net cost of shipping the code.

                                                                  1. 4

                                                                    Even the toy example for the blog post is inscrutable magic. I can’t imagine the horrors of working on an entire codebase constructed in this manner.

                                                                    1. 4

                                                                      I loved everything about this: the personal introspection and interrogation of the author’s own biases and preferences, and the fairly well balanced look at both sides.

                                                                      Personally, I prefer working in dynamic languages. It’s a personal preference. And that preference is reinforced every time one of the type supremacists tells me “static language X will ensure you never encounter problem Y” where problem Y is something I’ve either never experienced, or on the rare occasions I’d seen it, it was better blamed on some other problem that wouldn’t have been solved by switching languages.

                                                                      That said, I’m not likely to reject a static language because it’s a static language. It’s just not that high on my priority list. I can’t think of a situation where I’d ever reach for a static language myself, but I also can’t think of a situation where I would try to convince someone to use a dynamic language over a static language for the same job on that characteristic alone. And in some cases, I’d argue in favour of a static language.

                                                                      iOS is a perfect example of this, in my experience. I strongly prefer Objective-C to Swift. But I wouldn’t try to tell someone they should use Objective-C instead because it’s better. They’re different, and it depends on what your goals are. (If your goal is “has static types” then you’ve misunderstood the question entirely.) If I were solving the same problem in Objective-C and Swift, my designs would look wildly different, but they’d still have test suites of the same size, and the problem would be solved in both cases.

                                                                      I think the problems that people talk about come from people trying to do “dynamic design in a static language” or “static design in a dynamic language.” That’s a recipe for pain.

                                                                      I am really glad to have read this, and to be able to share it with others in the future.

                                                                      1. 3

                                                                        I find it odd that there’s even a “debate” between static vs dynamic typing. They both have cool uses.

                                                                        1. 16

                                                                          I mean, have you met programmers? Over-identifying with their tools and making a big deal about in-group/out-group dynamics is kind of a whole thing.

                                                                          1. 3

                                                                            Sorry but my experience is different from yours so you’re wrong /s

                                                                        2. 3

                                                                          I’m not sure the example he gives here is necessarily dynamically typed - in a language where you have interfaces and classes, you could simply have a subclass of “callable function” that keeps track of argument counts like the example. I admit it’s more tricky because functions tend not to be “first-class classes” (for lack of a better term).
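
                                                                          As a rough illustration of that “callable object” idea, a JavaScript Proxy with an apply trap keeps a function callable while recording its argument counts (a sketch, not the post’s actual example):

                                                                          ```javascript
                                                                          // Wrap a function so it stays callable but records how many arguments each
                                                                          // call received -- a callable object with an inspection side channel.
                                                                          function countingCallable(fn) {
                                                                            const argCounts = []; // one entry per call
                                                                            const wrapped = new Proxy(fn, {
                                                                              apply(target, thisArg, args) {
                                                                                argCounts.push(args.length);
                                                                                return Reflect.apply(target, thisArg, args);
                                                                              },
                                                                            });
                                                                            wrapped.argCounts = argCounts; // stored on the underlying function object
                                                                            return wrapped;
                                                                          }

                                                                          const add = countingCallable((a, b) => a + b);
                                                                          add(1, 2);
                                                                          add(1, 2, 3); // the extra argument is ignored by the arrow function
                                                                          console.log(add.argCounts); // [2, 3]
                                                                          ```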

                                                                          But, the example is there - the author shows that it’s possible and probably easier to do such things in a dynamic language. Why the disappointment?

                                                                          I would hazard a guess that such things aren’t exactly common because typically such run-time reflection-based magic is inscrutable and difficult to debug, especially for people who didn’t write the magic themselves. There’s the folk wisdom surrounding metaprogramming that “you should not use it unless you absolutely have to” in language communities that support it in some form or another (eg, macros in Lisps, monkey patching in Ruby, even reflection in Java).

                                                                          1. 3

                                                                            in a language where you have interfaces and classes, you could simply have a subclass of “callable function” that keeps track of argument counts like the example.

                                                                            I thought the same thing. The more expressive your static typing system is, especially in the direction of things such as structural typing, the more you gain the ability to substitute an object for another, as long as it fulfills the same contract. You generally cannot substitute an object for a function because statically typed languages prefer to separate the two in the type system, but there’s nothing that would theoretically stop a statically typed language from enabling this feature if it designed its fundamental “type tree” differently. But, implicitly, most people don’t seem to think it is worth it.

                                                                            In C#, you can use IL weaving with something like Fody to insert code before or after any arbitrary method, solving the use case of the Replacer in a somewhat different manner.

                                                                            I would hazard a guess that such things aren’t exactly common because typically such run-time reflection-based magic is inscrutable and difficult to debug, especially for people who didn’t write the magic themselves.

                                                                            Whenever you write Reflection code in C#, it’s partially like you turned off the type system. You are treating the typing as data and analysis of that would require a “higher order type checker”. It’s a bit surprising that meta-programming tends to be avoided even in dynamic languages, because you would think the downside is lesser.

                                                                            1. 1

                                                                              Whenever you write Reflection code in C#, it’s partially like you turned off the type system. You are treating the typing as data and analysis of that would require a “higher order type checker”. It’s a bit surprising that meta-programming tends to be avoided even in dynamic languages, because you would think the downside is lesser.

                                                                              The downsides to the readability are the same in dynamic and static languages. Except in typed languages, “turning off the type checker” is considered an additional downside because it is a sign you’re working against the grain of the language.

                                                                          2. 4
                                                                            1. 2

                                                                              Coming from Ruby and watching all the metaprogramming stuff go down (even doing it a bit myself!), I can say that the biggest issue there was the fact that one needed to manipulate global state in order to perform these tasks. If Ruby had a better way of isolating runtime environments such that RSpec, Rails, and others could put a “boundary” up to protect themselves from polluting the global scope, I think we’d have seen a lot more metaprogramming going on. In fact, I feel like Ruby’s metaprogramming capabilities are the main reason for its usefulness outside the Rails sphere. Products like Vagrant, Chef, and others have all taken advantage of this power in the language, but they’ve also had to deal with global scope pollution and other weird side effects as a result.

                                                                              Ruby has come up with a few ways to solve this over the years. One early attempt was refinements, but there were rules in place with those kinds of objects that prevented this feature from being very useful to most people, especially Rails developers. I forget what the last proposal was called, but it was very similar to the ShadowRealms proposal from JavaScript. At a high level, a “realm” would be a copy of the original environment that you can monkey-patch to your heart’s content. Your code would run in the “realm”, while everyone else’s code would run in their own realms. This would give you essentially a containerized environment to work within, and you would never have to worry about messing up anyone else’s libraries.

                                                                              I do think that if this problem of needing to pollute the global state were solved, it might make maintaining software built with such metaprogramming features a bit easier.

                                                                              1. 1

                                                                                Working on a team convinced me the problem is way deeper, and also simpler. Most Ruby developers, even professional ones, do not know even the basic language deeply. So when you start adding new ways of doing things via metaprogramming and DSLs, you aren’t simplifying a constrained domain with a bespoke language (the claim of RSpec, e.g.), you are just making a person who doesn’t even know the core language well needlessly learn yet another thing.

                                                                                And the benefit is usually minimal compared with just using vanilla objects and methods to solve your problem. This is certainly the case with Rspec.

                                                                              2. 2

                                                                                Well, yeah, as mentioned in the post… the author is not considering Erlang/Elixir’s dynamic types. They are not as powerful as TypeScript’s static types, yet they are quite helpful (based on my experience).

                                                                                1. 2

                                                                                  My intuition on this issue is that if you do not go “far enough” on the dynamic typing and bundle it with something such as code-as-data, ala Clojure, then you end up in some awkward space where you do not get much benefit out of being dynamic. It’s very hard to say anything on this topic that has a high degree of confidence or that won’t piss off someone, but I think the fact that more and more Python developers rely on things such as mypy is a strong indicator that Python is in this awkward space I mentioned.

                                                                                  1. 2

                                                                                    why not both? dynamic typing for ui, static typing for systems. it’s the best of both worlds[1].

                                                                                    1. https://github.com/nathants/aws-gocljs
                                                                                    1. 3

                                                                                      I also tend to like this approach because the frontend code has lower expected lifetime and causes less damage when it fails. It’s hard to make a frontend which isn’t relying on “stringly typed” components at some level. For example, if you add a search component to a page, and it hits the /search endpoint, that’s an untyped association by the name of some strings, but in practice, it’s fine. You don’t actually need a strongly typed <form action="{{ endpoint }}"> tag most of the time.

                                                                                      1. 1

                                                                                        cljs with reagent and shadow-cljs is such a fantastically great experience it’s hard to imagine using anything else.

                                                                                        if i could bolt on go’s type system i probably would, but it’s fine.

                                                                                        using a lot of unique keywords helps with cljs structure.

                                                                                    2. 1

                                                                                      This reminds me of the most common Prolog meta-interpreter tasks, such as adding additional debug information or storing the chain of reasoning used to reach the answer. That is very easy to do in Prolog, since with 3 lines you can already execute Prolog in Prolog and add more behavior to the program.

                                                                                      1. 1

                                                                                        Profile the stacks that at some point call function f.

                                                                                        I do not know what profiling the stack is, help?

                                                                                        When f is called with x, return f(x) but also log what the results would have been for f(x+1) and f(x-1).

                                                                                        I do not understand the purpose.

                                                                                        We can replace a function with an object that can be called for the exact same behavior, but also gives us an information sidechannel.

                                                                                        With program reflection, you can achieve most of what you want: given a function (sum function) [0] that exposes at runtime the source of a function with its lexical scope (also known as static scope), and another function that I call (sink ...) that does the opposite operation.

                                                                                        [0] it comes from the latin expression: dubito, ergo cogito, ergo sum. That I map as follows: dubito ~ information gathering; cogito ~ optimization; and sum ~ reflection.

                                                                                        one step further than homoiconicity: instead of code as data we have programs as data.

                                                                                        Exactly.

                                                                                        Mind the fact that what you describe is already possible in a limited way using manual JIT: changing the behavior of a program based on knowledge that is gathered, and only available, at runtime. This requires the program to include a compiler and expose it to the user, where homoiconicity helps a lot.

                                                                                        Manual JIT is in the control of the user, but requires a priori and precise knowledge of what the algorithm does. For example, a hash table with keys only known at runtime may be rewritten into a switch whose clauses are ordered by frequency.

                                                                                        That is achieved with something along the line:

                                                                                        (eval `(lambda (argument) ,(hash-table->switch ht)))
                                                                                        

                                                                                        That particular hash-table to switch transform can be constructed, and compiled at runtime in user code, and is generic enough to be shipped as a library.
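
                                                                                        For comparison, the same transform can be sketched in JavaScript, with new Function playing the role of eval (a plain object stands in for the hash table; the names are made up):

                                                                                        ```javascript
                                                                                        // Compile a lookup table into a chain of === tests at runtime.
                                                                                        // Keys are only known at runtime; the hottest keys can be tested first.
                                                                                        function tableToSwitch(table, keysByFrequency) {
                                                                                          const branches = keysByFrequency
                                                                                            .map((k) => `if (key === ${JSON.stringify(k)}) return ${JSON.stringify(table[k])};`)
                                                                                            .join('\n  ');
                                                                                          // new Function is the runtime compiler here, like eval in the Scheme sketch.
                                                                                          return new Function('key', `${branches}\n  return undefined;`);
                                                                                        }

                                                                                        const ht = { red: '#f00', green: '#0f0', blue: '#00f' };
                                                                                        const lookup = tableToSwitch(ht, ['green', 'red', 'blue']); // green is hottest
                                                                                        console.log(lookup('green')); // '#0f0'
                                                                                        ```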

                                                                                        Quoting OP:

                                                                                        Track the arguments for every call to an expensive function. If it’s called with the same parameters ten times, alert someone that they should be calling it once and storing the result.

                                                                                        That is doable, without manual JIT.
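
                                                                                        Indeed, a plain wrapper and a counter are enough (the threshold of ten comes from the quote; everything else here is a made-up sketch):

                                                                                        ```javascript
                                                                                        // Wrap an expensive function; when any argument list repeats `threshold`
                                                                                        // times, invoke an alert callback suggesting the caller cache the result.
                                                                                        function detectRepeats(fn, threshold, alert) {
                                                                                          const counts = new Map();
                                                                                          return (...args) => {
                                                                                            const key = JSON.stringify(args);
                                                                                            const n = (counts.get(key) || 0) + 1;
                                                                                            counts.set(key, n);
                                                                                            if (n === threshold) alert(`called ${n}x with ${key}; consider caching`);
                                                                                            return fn(...args);
                                                                                          };
                                                                                        }

                                                                                        let warning = null;
                                                                                        const slow = detectRepeats((x) => x * 2, 10, (msg) => { warning = msg; });
                                                                                        for (let i = 0; i < 10; i++) slow(7);
                                                                                        console.log(warning); // "called 10x with [7]; consider caching"
                                                                                        ```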

                                                                                        people raving about it never talk about “augmenting” programs in the way I’m looking for.

                                                                                        If you have read until here, what do you think about:

                                                                                        (sink eval (optimize (sum eval)))
                                                                                        

                                                                                        I don’t think there’s very much work done in this kind of dynamic analysis.

                                                                                        Look into 3-lisp, John N. Shutt’s Kernel, and Nada Amin.

                                                                                        TIL: test amplification. It is unclear how it works. Is it some kind of fuzzer that produces test cases based on types?

                                                                                        1. 3

                                                                                          I don’t know if any of these are impossible without runtime metaprogramming, but they all seem like they’d be more difficult. At the very least you couldn’t do them from the REPL.

                                                                                          Yeah, that was a weird section! 3/6 things listed are trivial to do in Lua (just picked out of a hat as a language which I know well but isn’t typically considered cutting edge or advanced), a language without any metaprogramming whatsoever, and I can’t figure out how metaprogramming would make the others any simpler. They’re just simple applications of having first-class functions, and easy to do in the repl.

                                                                                          I don’t think there’s very much work done in this kind of dynamic analysis. If there was then I’d see a lot more dynamic typing advocates talking about it or showcasing

                                                                                          Maybe people just don’t think it’s that big of a deal?

                                                                                          1. 4

                                                                                            Maybe people just don’t think it’s that big of a deal?

                                                                                            PLT is hard enough to get grants for, so my guess is that it’s far easier to fund work on static types than it is for dynamic stuff.

                                                                                          2. 2

                                                                                            I do not understand the purpose.

                                                                                            I should probably write a separate newsletter on why exactly I want this, because I’ve struggled to explain it to other people before. The idea is to first-class the idea of a “function driver”, which I often do for GAN art and research coding anyway, but as a separate system.

                                                                                            1. 2
                                                                                              Profile the stacks that at some point call function f.
                                                                                              

                                                                                              I do not know what profiling the stack is, help?

                                                                                              I imagined the goal to be understanding control flow / the call graph in order to understand potential optimizations / hoisting of expressions… I guess dynamically? Not clear to me why you wouldn’t use static analysis techniques ahead of time though. Dynamic languages still work with (at least some) static analysis algorithms…

                                                                                              I’m making this guess based on the thing above that talks about “alerting someone” that there’s a hot function. Which… I’m not sure why you wouldn’t do profiling / benchmarking to obtain that, especially since it talks of alerting someone … e.g. it’s not dynamically rewriting the code.

                                                                                              I do not understand the purpose.

                                                                                              I’ll admit that I’m stumped by this too. One might imagine that a general logger for function calls might be useful in certain situations (and there’s no real reason it can’t also be done in statically typed languages, mind you), but “speculative” execution of x+1, x-1 seems not very general-purpose.
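
                                                                                              Mechanically, at least, the quoted wrapper is a few lines in any language with first-class functions; here is a sketch assuming a unary numeric f:

                                                                                              ```javascript
                                                                                              // Wrap a unary numeric function: calls behave exactly like f(x), but the
                                                                                              // wrapper also records what f would have returned at x-1 and x+1.
                                                                                              function withNeighborLog(f) {
                                                                                                const log = [];
                                                                                                const wrapped = (x) => {
                                                                                                  log.push({ x, below: f(x - 1), above: f(x + 1) });
                                                                                                  return f(x);
                                                                                                };
                                                                                                wrapped.log = log; // the information side channel
                                                                                                return wrapped;
                                                                                              }

                                                                                              const square = withNeighborLog((x) => x * x);
                                                                                              console.log(square(3));     // 9, same as the unwrapped function
                                                                                              console.log(square.log[0]); // { x: 3, below: 4, above: 16 }
                                                                                              ```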


                                                                                              To be quite honest, I’m having trouble understanding why this all needs to be done at runtime, and not with the existing tooling we have today: profilers, debuggers, benchmarks, etc.. Sure, I guess it might be nice to have an interpreter “trace” mode that can automatically glean information, sift through it, and provide value. But:

                                                                                              1. Some JITs are already advanced enough to do hot-path optimizations
                                                                                              2. You wouldn’t run this tracing mode in production (and so, you still need to find some way to exercise the system to generate meaningful traces before production)
                                                                                              3. None of this seems impossible to do today, either directly in code with first class functions or with external tools.

                                                                                              Even the “drop into a debugger on uncaught exception” could be done in many languages with a shell script (provided tests can be run in the debugger).

                                                                                              But if all these things are possible, why aren’t they done? Why is Rails moving away from metaprogramming magic? Why isn’t Language Oriented Programming (e.g. DSLs, Macros) more prevalent? It’s because people “sell simplicity” as a silver bullet, in lieu of more understanding.

                                                                                              The complaints about Rails metaprogramming have always been: “When it breaks, I have no idea how to fix it.” Which is sort of like saying: “I’m a status quo engineer who isn’t willing to look past the fact that I have no clue how my tools work.” But this is valid, because most people are 9-5ers who just want to mine the quarry mindlessly. (And even the people curious enough to understand some of their tools can’t understand all of them.)

                                                                                              The complaints about DSLs / Macros have always been: “Now I have to learn multiple languages. Why are you doing this to me?” But this is also valid, because most people do not understand language fundamentals enough to not need a 400 page O’Reilly book to pick up a new language. It’s also valid because most people don’t actually know how to write good languages (or regular APIs for that matter).

                                                                                              E.g. we’re collectively not very good at any of this, and everything is far too complex for even the best engineers to fully understand.

                                                                                              Those that are, those that attempt to pave new roads to programmer enlightenment, don’t also have the skills to lead people, en masse, to it, so it ends up being in a paper / blog post / talk that a subset of people nod their heads to, and then move on.

                                                                                              1. 2

                                                                                                To be quite honest, I’m having trouble understanding why this all needs to be done at runtime, and not with the existing tooling we have today: profilers, debuggers, benchmarks, etc.

                                                                                                We could even say that these tools are “post-run-time”. I like that. We need to improve post-runtime tooling.