Threads for carlgay

    1. 5

      Steele suggests that the trick to balancing size against utility is that a language must empower users to extend (if not change) the language by adding words, and maybe by adding new rules of meaning.

      I don’t buy this. Authors generally don’t extend e.g. English with new words or grammar in order to write a novel. Programmers generally don’t need to extend a programming language with new words or rules in order to write a program.

      A programming language, like a spoken/written language, establishes a shared lexicon in which ideas can be expressed by authors and widely understood by consumers. It’s an abstraction boundary. If you allow authors to mutate the rules of a language as they use it, then you break that abstraction. The language itself is no longer a shared lexicon, it’s just a set of rules for arbitrarily many possible lexicons. That kind of defeats the purpose of the thing! It’s IMO very rare that a given work, a given program, benefits from the value this provides to its authors, compared to the costs it imposes on its consumers.

      1. 19

        Programmers generally don’t need to extend a programming language with new words or rules in order to write a program.

        Maybe you would disagree, but I would argue that functions are essentially new words that are added to your program. New rules do seem to be a bit less common though.

        1. 4

          This would be my interpretation as well. Steele defines a language to be approximately “a vocabulary and rules of meaning”, and clearly treats defining types and functions as adding to that vocabulary throughout his talk. My (broad) interpretation of a generalized language is based on the idea that the “language” itself is actually just the rules of meaning, and all “vocabulary” or libraries are equal, whether they be “standard” or “prelude” or defined by the user.

        2. 3

          I understand the perspective that functions (or classes, or APIs, or etc.) define new words, or collectively a new grammar, or perhaps a DSL. But functions (or classes, or APIs, or etc.) are obliged to follow the rules of the underlying language(s). So I don’t think they’re new words, I think they’re more like sentences, or paragraphs.

      2. 15

        Authors generally don’t extend e.g. English with new words or grammar in order to

        FWIW a thing that shitposters on Tumblr have in common with William Shakespeare is the act of coining new vocabulary and novel sentence structure all the time. :)

        1. 5

          See: 1984, full of this

          1. 1

            Also, Ulysses no?

      3. 5

        The author is saying that simple systems need this extensibility to be useful. English is far from small. Even most conventional languages are larger than the kernel languages under discussion, but the chief argument for kernel languages is their smallness and simplicity. And those languages are usually extended in various ways to make them useful.

        I would say, and I think that the author might go there too, that certain libraries that rely heavily on “magic” (typically ORMs) also count to some degree as language extensions. ActiveRecord and Hibernate, for instance, rely on functionality that even seasoned practitioners of their respective languages rarely use.
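
        To make the “magic” concrete, here is a toy Python sketch (illustrative only, and not how ActiveRecord or Hibernate actually work) of a declarative model built on descriptors and __init_subclass__, machinery most application programmers never write directly; the Column/Model names are made up for the example:

        ```python
        # Toy illustration of ORM-style "magic": descriptors plus
        # __init_subclass__ turn plain class bodies into metadata.

        class Column:
            """Descriptor storing one value per instance."""
            def __set_name__(self, owner, name):
                self.name = name

            def __get__(self, obj, objtype=None):
                if obj is None:
                    return self
                return obj.__dict__.get(self.name)

            def __set__(self, obj, value):
                obj.__dict__[self.name] = value

        class Model:
            def __init_subclass__(cls, **kwargs):
                super().__init_subclass__(**kwargs)
                # Collect declared columns, roughly how an ORM builds its
                # table metadata behind the user's back.
                cls._columns = [k for k, v in vars(cls).items()
                                if isinstance(v, Column)]

        class User(Model):
            name = Column()
            email = Column()

        u = User()
        u.name = "Ada"
        print(User._columns)  # ['name', 'email']
        print(u.name)         # Ada
        ```

        The point isn’t that this is hard; it’s that reading User tells you nothing about where _columns comes from unless you already know the idiom.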

        1. 2

          The article is about languages, right? Languages are a subset of systems. The abstraction established by a language is by definition a shared context. A language that needs to be extended in order to provide value doesn’t establish a shared context and so isn’t really a language, it’s a, like, meta-language.

          1. 4

            Not really.

            It’s about using languages as a case study in how systems can be of adequate or inadequate complexity for the tasks they enable their users to address, and how, if a tool is “simple” or perhaps inadequately complex, the net result is not that the resulting system is “simple” but, as @dkl speculates above, that the complexity the tool or system failed to address has to live somewhere else, becoming user friction or, if unaddressed, lurking unsuitability.

            This creates a nice concept of “leverage”: how well a language or system allows users to address an adequate level of complexity (or fails to do so). It also raises the question of how you can measure and compare complexity in terms more meaningful and practical than aesthetic assessments, both of which I want to say more about later.

            1. 2

              Right.

              I suppose I see languages as systems that need to have well-defined and immutable “expressibility” in order to satisfy their fundamental purpose, which isn’t measured in terms of expressive power for authors, but rather in terms of general comprehension by consumers.

              And, consequently, that if a language doesn’t provide enough expressivity for you to express your higher-order system effectively, the solution should be to use a different language.

              Reasonable people may disagree.

              1. 3

                Each project (or team) has slightly different needs, but there aren’t that many slightly different language dialects out there that add just the one or two little features (thankfully!) that these projects happen to need. Sometimes people build preprocessors to work around one particular lack of expressibility in a language (for one well-known example that’s not an in-house-only development, see yacc/bison). That’s a lot of effort and produces its own headaches.

                Isn’t it better to take a good language that doesn’t need all that many extensions, but provides enough metaprogramming to support what your team needs for their particular project? This lets one re-use the 99% of existing knowledge about the language, while the 1% that’s different can be taught in a good onboarding process.

                Nobody in their right mind is suggesting that projects would be 50% stock language, 50% custom extensions. That way lies madness. And indeed, teams with mostly juniors working in extensible languages like CL will almost inevitably veer towards metaprogramming abuse. But an experienced team with a good grasp of architecture that’s eloquent in the language certainly benefits from metaprogrammability, even if it’s just to reduce the amount of boilerplate code. In some inherently complex projects it might even be the difference between succeeding and failing.

                1. 5

                  I do think the industry tends to be dominated by junior-friendly programming languages, as if the main concern was more about increasing headcount than it was about expressing computation succinctly and clearly.

                2. 4

                  Nobody in their right mind is suggesting that projects would be 50% stock language, 50% custom extensions. That way lies madness.

                  Paul Graham made roughly that claim when he wrote about why Viaweb could only have been written in Common Lisp, but I think his numbers were more like 70/30. But you did qualify it with only people in their right mind, so maybe that doesn’t count.

                3. 1

                  Isn’t it better to take a good language that doesn’t need all that many extensions, but allows enough metaprogramming to allow what your team needs for their particular project?

                  YMMV, but I don’t think so. My experience with metaprogramming in industry has been negative. I find it usually adds more complexity than it removes. And I don’t really agree with the framing that a language should be extensible, at the *language* level, by its users. I see that as an abstraction leak.

                  1. 3

                    My experience with metaprogramming in industry has been negative. I find it usually adds more complexity than it removes.

                    I’d put that in the “hell is other people” basket - indeed, very often metaprogramming gets abused in ways that make code more complicated than it has to be. So you definitely have a point there. But then I’ve seen horror shows of many other kinds in “industrial” codebases in languages without metaprogramming. In the wrong hands, metaprogramming can certainly wreak more havoc, though!

                    Still, I wouldn’t want to go back to languages without metaprogramming features, as I feel they allow me to express myself more eloquently; I prefer to say things more precisely than in a roundabout way.

                    And I don’t really agree with the framing that a language should be extensible, at the *language* level, by its users. I see that as an abstraction leak.

                    I never considered that perspective, but it makes sense to view it that way. In my neck of the woods (I’m a Schemer), having an extensible language is typically considered empowering - it’s considered elitist to assume that the developer of the language knows everything perfectly and is the only one who should be allowed to dictate in which direction the language grows. After all, the language developer is just another fallible programmer, just like the language’s users are.

                    As a counterpoint, look at the featuritis that languages like Python and even Java have spawned in recent years. They grew lots of features that don’t even really fit the language’s style. Just because something’s gotten popular IMHO doesn’t necessarily mean it ought to be put in a language. It burdens everyone with these extra features. Think about C++ (or even Common Lisp), where everyone programs in a different subset of the language because the full language is just too large to comprehend. But when you want to leverage a useful library, it might make use of features that you’ve “outlawed” in your codebase, forcing you to use them too.

                    There are also several examples of the Ruby and Python standard libraries having features which have been surpassed by community libraries, making them effectively deprecated. Sometimes they are indeed removed from the standard library, which generates extra churn for projects that were using them.

                    I’d rather have a modestly-sized language which can be extended with libraries. But I readily admit that this has drawbacks too: let’s say you choose one of many object/class libraries in an extensible language which doesn’t supply its own. Then if you want to use another library which depends on a different class library, you end up with a strange mismatch where you have entirely different class types, depending on what part of the system you’re talking to.

                    1. 1

                      I’ve seen horror shows of many other kinds in “industrial” codebases in languages without metaprogramming. In the wrong hands, metaprogramming can certainly wreak more havoc, though!

                      Sure! In pure quantitative terms, I agree: I’ve seen way more awful Java (or whatever) than I have awful metaprogramming stuff. But the bad Java was a small subset of the overall Java I’ve witnessed, whereas the bad metaprogramming was practically 100% of the overall metaprogramming I’ve witnessed.

                      I fully acknowledge that I’m biased by my experience, but with that caveat, I just haven’t seen metaprogramming used effectively in industry.

      4. 5

        If you allow authors…

        This is an important point. You are allowing authors to add new syntax, not requiring it. You can do most of the same tricks in Common Lisp or Dylan that you can do in JavaScript or Kotlin, by passing funargs etc.; it’s just that you also have the ability to introduce new syntax, within certain clearly defined bounds that the language sets.

        Just as you have to learn the order of arguments or the available keywords to a function, Lisp/Dylan programmers are aware that they have to learn the order of arguments and which arguments are evaluated for any given macro call. Many of them look exactly like functions, so there’s no difference. (I like that in Julia, as opposed to Common Lisp and Dylan, macro calls must start with the special character “@”, since it makes a clear distinction between function calls and macro calls. But I don’t know much about Julia macros.)

        A programming language, like a spoken/written language, establishes a shared lexicon…

        Yes, and I don’t believe macros change this significantly, although this depends to a certain extent on the macro author having a modicum of good taste. The bulk of Common Lisp and Dylan macros come in one of two flavors:

        1. “defining macros” – In Common Lisp these are usually named “defsomething” and in Dylan “define [adjectives] something”. When you see define frame <chess-board> (<frame>) ... end you know you will encounter special syntax because define frame isn’t part of the core language. So you go look it up just as you would look up a function for which you don’t know the arguments.

        2. “with…” or “…ing” macros like “with-open-file(…)” or “timing(…)”

        These don’t change the complexity of the language appreciably in my experience and they do make it much more expressive. (Think 1/10th to 1/15th the LOC of Java here.)
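
        For comparison, here’s a sketch of how a macro-less language covers those same two flavors (Python used for illustration; the file path is a placeholder): with-style macros become context managers, and body-wrapping macros like timing become higher-order functions, i.e. the funarg route mentioned above:

        ```python
        # Sketch: covering the two macro flavors without macros.
        import time
        from contextlib import contextmanager

        @contextmanager
        def timing(label):
            # Plays the role of a with-style macro: setup, body, teardown.
            start = time.perf_counter()
            try:
                yield
            finally:
                print(f"{label}: {time.perf_counter() - start:.3f}s")

        with timing("read config"):            # the timing(...) shape
            with open("config.txt") as f:      # the with-open-file shape
                data = f.read()
        ```

        The difference from a true macro is that the body must be a syntactic block (or an explicitly passed function); the callee can never rewrite the code it receives.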

        Where I believe there is an issue is with tooling. Macros create problems for tooling, such as including the right line number in error messages (since macros can expand to arbitrarily many lines of code), stepping or tracing through macro calls, recording cross references correctly for code browsing tools, etc.

        1. 1

          Do you not see code that defines new syntax as categorically different than code which uses defined syntax?

          What is a language if not a well-defined grammar and syntax?

          I don’t see how something which permits user modification of its grammar/syntax can be called a language. It’s a language construction set, maybe?

          1. 9

            This seems founded in the idea that if you just know the language, you can look at code and understand what it does. But this is always convention. For instance, consider the classic C issue of “who owns the pointer passed to a function?” Even with an incredibly simple language, there’s ambiguity about what conventions surround a library call - do you need to call free()? Will the function call free()? Can you pass a stack pointer, or does it have to be heap allocated? And so on. More powerful type systems can move more information into the type itself, but more powerful types tend to be included in more powerful languages; for instance, in D, if you pass an expression to a function and the parameter is marked lazy, the expression may actually be evaluated any number of times, though it’s usually zero or one, and you have no idea when the evaluation takes place. So just from looking at a function call, foo(bar), it may be that bar is evaluated before foo, or during foo, or multiple times during foo, or never.
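
            To make the lazy-parameter point concrete in a macro-less setting, here’s a rough Python rendition of the same ambiguity (my sketch, not D): explicit thunks give the callee the same freedom, though the lambda at the call site at least hints that something unusual is going on:

            ```python
            # Sketch: emulating D's `lazy` parameters with explicit thunks.

            def some_condition():
                return True

            def foo(bar):
                # `bar` is a zero-argument callable; nothing here promises
                # whether it runs zero, one, or many times.
                if some_condition():
                    return bar() + bar()   # evaluated twice
                return 0                   # or never

            def expensive():
                print("evaluating!")
                return 21

            print(foo(lambda: expensive()))  # "evaluating!" twice, then 42
            # With D's `lazy`, the call would look like plain foo(expensive()).
            ```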

            Now a macro could do worse, sure, but to me it’s a difference of degree, not kind. There’s always a spectrum, and there’s always ambiguity, and you always need to know the conventions. Every library is a grammar.

            1. 3

              This seems founded in the idea that if you just know the language, you can look at code and understand what it does. But this is always convention. For instance, consider the classic C issue of “who owns the pointer passed to a function?” Even with an incredibly simple language, there’s ambiguity about what conventions surround a library call - do you need to call free()? Will the function call free()? Can you pass a stack pointer, or does it have to be heap allocated? And so on.

              The devil’s in the details. But when I’m answering questions like “who calls free()?” I don’t need to re-evaluate my understanding of language keywords or sigils or operator semantics. I see this as a categorical difference. I suppose other people may not.

          2. 5

            I think it’s worth addressing the elephant in the room: there’s language as in formal language (mathematics) and then there’s language as reasoned by linguistics, which also acknowledges and studies the sociopolitical and anthropological aspects of language. It may be useful to analyze PLs through the lens of the former, but programming languages also do an awful lot of things that human languages do: give rise to dialects, have governing bodies, have speakers that defy governing bodies, borrow from each other, develop conventions, nurture communities, play host to ecosystems, and so forth.

            We now arrive at the core of your assertion, which is that there is a hard line between syntax that comes for free and syntax that is defined by the userland programmer. This is part of the very same line that the author is asking us to imagine blurring:

            Building on Steele’s deliberate blurring of the line between a “language” and a “library”, I suggest this train of thought applies to libraries as well. Libraries – language extensions really – help us say and do more but can fail to help us manage complexity or impose costs on their users in exactly the same ways as language features.

            It is furthermore dangerous to equate the malleability of syntax with the extensibility of a language. Macros are a form of extensibility, yet conversely, extensibility does not require support for macros. Take Python, a language that does not support macros, for example. Python allows you to engage in so much operator overloading that you could read the expression (a + b) << c and still have no clue what it actually does without looking up how the types of a, b, and c define such operations. Ultimately, it is conventions — be they implicit or documented, be they cultural or tool-enforced — that dictate the readability of a language, just as @FeepingCreature has demonstrated with multiple examples.
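
            To illustrate with a toy of my own (not from the article), two dunder methods are enough to make (a + b) << c mean something no reader could guess from the expression alone:

            ```python
            # Toy example: overloaded operators hiding arbitrary semantics.

            class Pipeline:
                def __init__(self, steps):
                    self.steps = steps

                def __add__(self, other):
                    # "+" concatenates two pipelines
                    return Pipeline(self.steps + other.steps)

                def __lshift__(self, value):
                    # "<<" feeds a value through every step
                    for step in self.steps:
                        value = step(value)
                    return value

            a = Pipeline([str.strip])
            b = Pipeline([str.upper])
            c = "  hello  "
            print((a + b) << c)  # HELLO - nothing at the call site says so
            ```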

            The author covers a great deal of the triumphs and tragedies of language extensibility in the “Convenience? Or simplicity through design?” section. The monstrosity of the Common Lisp LOOP macro, as well as userland async / await engines, are brought up because they brutally defeat static analysis, whereas the .use method in Kotlin is shown as an example that manages to accomplish all of the needs of WITH-FOO macros with none of the ill effects of macros. In fact, there is another commonly used language that manages to achieve this the same way Kotlin does: Ruby. In Ruby, any method call can take an additional block of code, which gets materialized into a Proc object, which is a first-class function value. This means in Ruby it is routine to write code that looks like File.open("~/scratch.txt") { |f| f.write("hello, world") }. This is an example straight out of the Ruby standard library, and it is the convention for resource management across the entire Ruby ecosystem.

            Now, even though Ruby — like Python — does not support macros, and even though it got resource management “right”, just like Kotlin did, it has nevertheless managed to dole out some of the most debilitating effects of metaprogramming. Ask anyone who’s worked on a Rails project, myself included, and they will recount how they were victimized by runtime-defined methods whose names were generated by concatenating strings and therefore remain nigh unsearchable and ungreppable. A great deal of research and literature has been dedicated to making Ruby more statically analyzable, and it all converges towards the uncomfortable conclusion that its capability of evaluating strings as code at runtime and its Smalltalk-like object system both make it incredibly difficult for machines to reason about Ruby code without actually running it.
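
            For what it’s worth, the ungreppable-method problem is easy to reproduce outside Ruby too. A minimal Python rendition (a sketch, not actual Rails machinery):

            ```python
            # Sketch of the ungreppable-method problem: the method name
            # never appears as a literal anywhere in the codebase.

            class Record:
                pass

            for field in ["name", "email"]:
                def finder(cls, value, _field=field):
                    print(f"pretend we query WHERE {_field} = {value!r}")

                # Builds names like "find_by_name" by concatenation; grepping
                # for "find_by_name" finds callers but never this definition.
                setattr(Record, "find_by_" + field, classmethod(finder))

            Record.find_by_name("Ada")
            Record.find_by_email("ada@example.com")
            ```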

            1. 1

              there’s language as in formal language (mathematics) and then there’s language as reasoned by linguistics, which also acknowledges and studies the sociopolitical and anthropological aspects of language.

              Programming languages can be analyzed in the formal/mathematical sense, but when they are used, they are used by humans, and humans necessarily don’t understand languages through this lens. I’m not saying a programming language is strictly e.g. linguistic, but I am saying that it cannot be effectively described/modeled purely via formal or mathematical means. A language isn’t an equation, it’s an interface between humanity and logic.

          3. 3

            I would accept that Common Lisp can be called a language construction set but I don’t see how it’s useful or accurate to say it’s not a language.

      5. 5

        Authors generally don’t extend e.g. English with new words or grammar in order to write a novel.

        Well, no, but it helps that the English language already has innumerable – legion, one might say – words; an English dictionary is a bountiful, abundant, plentiful universe teeming with sundry words which others have already invented, not always specifically for writing a whole novel, often just in order to get around IRL. It’s less obvious with English, because it’s not very agglutinative, but it has been considerably extended over time.

        Language changes not only through the addition of new words and rules but also through other words and rules falling out of use (e.g. in English, the plural present form of verbs lost its inflection by the 17th century or so), so it’s not quite fair to say that modern English is “bigger” than some of its older forms. But extension is a process that happens with spoken languages as well.

        It’s also super frequent in languages that aren’t used for novels and everyday communication, too. At some point I got to work on some EM problems with a colleague from a Physics department and found that a major obstacle was that us engineers and them physicists use a lot of different notations, conventions, and sometimes even words, for exactly the same phenomenon. Authors of more theoretical works regularly developed their own (very APL-like…) languages that required quite some translation effort in order to render them into a comprehensible (and extremely verbose) math that us simpletons could speak.

        (Edit: this, BTW, is in addition to all that stuff below, which @agent281 is pointing out, I think the context of the original talk is relevant here).

      6. 1

        If I had to guess, I would think that this was a nudge toward the Scheme programming language from Steele’s point of view (or Common Lisp, but that one is quite large in comparison). It’s quite easy to extend Scheme that way, and he does have a history with Lisp-family languages. But that’s just a guess.

    2. 1

      This would make a nice addition to the Dylan documentation as a quick overview for people new to the language. Thanks for writing it!

      1. 1

        Oddly, none of the reviewers pointed out that I missed subclass types! I guess I’ll have to find enough interesting material about that to make a blog post …

    3. 5

      For $work I just started working on the meaty part of a re-implementation of an airline flight schedule combiner in Go. This thing has to handle messages in ancient and ill-specified protocols, so…fun. For Open Dylan I’m on my second round of trying to add multi-line strings to the compiler, after refactoring the lexer a bit to enable some testing. I normally don’t hack the compiler so this is a bit different for me. Planning to use Python-like triple-double-quote syntax.

      1. 1

        What would be an alternative to the triple-double-quote? It looks good, but it’s a nightmare to syntax-highlight and provide auto-complete for, I’ve heard. Admittedly I haven’t looked into it myself, but anecdotally IntelliJ’s handling of Scala’s """ hasn’t been great.

        1. 2

          Good point about editor support. It seems like there should be some general support for it out there due to Python though.

          Other alternatives I thought about are #s and #r, since Dylan has #t, #f, and #"symbol" syntax already. That is, #s"…" and #s'…' for strings with interpreted escape characters like \n, and #r"…" and #r'…' for "raw" strings with no interpretation. The latter is particularly useful for regular expressions.

          The bottom line, I think, is that if you want to put a big blob of text in a constant there’s a good chance it has both a double-quote and a single-quote in it somewhere, and that’s why """ is nice.
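
          For reference, the Python behavior being borrowed looks like this (real Python, not a sketch):

          ```python
          import re

          # Triple-double-quotes happily contain both kinds of quote:
          blob = """She said "don't stop" and left.
          A second line, no escaping needed."""

          # A raw string disables escape interpretation, keeping regexes readable:
          pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

          print(blob)
          print(bool(pattern.search("released 2024-06-01")))  # True
          ```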

          There’s some discussion here… https://lists.opendylan.org/pipermail/hackers/2013-February/006709.html