1. 24
  1.  

  2. 20

    A bit unrelated to the linked content, but “Low Hanging Fruit of Programming Language Design” really makes me wish for a place/collection/book that documents the lessons (both good and bad) learned in the last few decades of language design.

    Language design currently feels like it has plateaued and isn’t really moving forward anymore except in a few specific attributes the language author really cares about.

    Having some exhaustive documentation that holds potential design approaches for individual features, possibly with a verdict based on past experience, would help prevent language designers from repeating the same mistakes over and over.

    Entries could include:

    • The various, bad approaches to operators and operator “overloading”.
    • How not to mess up equality and identity.
    • Things that don’t belong on a language’s top type.
    • Implicit numeric conversions will never work in a satisfying fashion.
    • Picking <> for Generics is wrong and broken, as evidenced by all languages that tried it.
    • Why ident: Type is way better than Type ident.
    • Typeclass coherency is fundamentally anti-modular and doesn’t scale.
    • Upper and lower bounds are a – rather uninteresting – subset of context bounds/typeclasses.
    • There is no reason to require “builtin/primitive” types to be written differently from user-defined types.
    • How to name things in a predictable way.
    1. 13

      Another good source of lessons would be what Gilad Bracha calls shadow languages. Essentially, any time you find a simple mechanism getting extended with more and more features, until it essentially becomes a (crappy) programming language in its own right. The obvious thing to try in these situations is to throw out that standalone system and instead provide some mechanism in the ‘real’ language for programs to calculate these things in a “first class” way.

      Bracha gives examples of ML modules, where e.g. functors are just (crappy) functions, Polymer (which I don’t know anything about) and imports (e.g. conditional imports, renaming, etc.).

      Some more examples I can think of:

      • Haskell’s typeclass resolution mechanism is basically a crappy logic language, which can be replaced by normal function arguments. Maybe a better generalisation would be implicit arguments (as found in Agda, for example)? One way to programmatically search for appropriate arguments is to use “proof tactics”.
      • Haskell types have gained datakinds, type families, etc. which are basically just a (crappy) functional programming language. When types are first-class values (like in Coq, Agda, Idris, etc.) we can use normal datatypes, functions, etc. instead.
      • Build/packaging systems. Things like Cabal, NPM, PyPI, RPM, Deb, etc. These are usually declarative, with elaborate dependency solvers. As well as being rather limited in what we can express, some systems require all participants to maintain a certain level of vigilance, to prevent things like ‘malicious solutions’ (e.g. malicious packages claiming to implement veryCommonDependency version 99999999999). I’ve seen a couple of nice approaches to this problem: one is that of Nix/Guix, which provide full functional programming languages for calculating packages and their dependencies (I’ve even written a Nix function which calls out to Tinc and Cabal for solving Haskell package dependencies!); the other is Racket’s old PLaneT packaging system, where programs write their dependencies in-line, and they’re fetched as needed. Unfortunately this latter system is now deprecated in favour of raco, which is just another NPM-alike :(
      • Multi-stage programming seems like a language-level concept that could subsume a bunch of existing pain points, like optimisation, configuration, or even packaging and deployment. Why bother with a mountain of flaky bash scripts to orchestrate compilers, build tools, test suites, etc. when we can selectively compile or interpret sub-expressions from the same language? The recent talk “programming should eat itself” looks like the start of some really exciting possibilities in this area!
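To make the first bullet concrete, here is a minimal Rust sketch of what replacing typeclass resolution with normal function arguments can look like. All names (`Show`, `decimal`, `hex`, `show_all`) are hypothetical and made up for illustration. The "instance" is just a value the caller passes in, so picking a different one needs no resolution mechanism.

```rust
// Hypothetical names throughout; nothing here comes from a library.

/// An explicit "instance": a plain record holding a function.
struct Show<T> {
    show: fn(&T) -> String,
}

fn show_decimal(n: &i32) -> String {
    n.to_string()
}

fn show_hex(n: &i32) -> String {
    format!("{:#x}", *n)
}

/// Two competing "instances" for the same type, both ordinary values.
fn decimal() -> Show<i32> {
    Show { show: show_decimal }
}

fn hex() -> Show<i32> {
    Show { show: show_hex }
}

/// The instance is a normal argument; the caller decides which one to use.
fn show_all<T>(inst: &Show<T>, items: &[T]) -> String {
    items
        .iter()
        .map(|x| (inst.show)(x))
        .collect::<Vec<_>>()
        .join(", ")
}
```

Calling `show_all(&hex(), &[10, 255])` and `show_all(&decimal(), &[10, 255])` uses the same function with two different "instances". The price is that every call site has to pass the instance by hand, which is the gap implicit arguments would presumably fill.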
      1. 4

        I tried to write something like that, but this is so subjective.

        This is probably not a big problem in practice. If you design and publish a language, many helpful trolls will come and tell you all mistakes you made. 😉

        1. 6

          FWIW, I compiled a list of articles on some of the topics I mentioned above:

          Maybe it is interesting for you.

          The articles’ conclusions are based on an exhaustive survey of more than a dozen popular languages as well as many minor, but influential ones.

          1. 2

            Thanks, a few of my writings:

            Is there a way to get a feed of your articles? https://soc.github.io/feed/ is empty.

            1. 1

              Looks like we agree on pretty much everything in the first two articles. :-)

              I think there are only bad options for dealing with operators (while operator overloading is pretty much broken altogether), but some approaches are less bad than others.

              My opinion is to pretty much use the simplest thing to document, specify and implement, and emphasize the lack of importance of operators to users to stop them from going overboard.

              I believe they get way too much attention given how unimportant they are in the grand scheme of things.

              Is there a way to get a feed of your articles? https://soc.github.io/feed/ is empty.

              I’ll look into that, wasn’t even aware that I had a feed. :-)

            2. 2

              For equality, I’m not sure if there should be an equals method in Object (or Any or AnyRef).

              Equality is not a member of a type. (Unfortunately, in Java everything must be inside a class.) Equality often depends on context. For example, when are two URLs equal? Sometimes you want to compare the strings. Sometimes the strings without the fragment identifier. Sometimes you want to make a request to see what gets returned for the URLs.

              Sometimes, we might prefer to not provide equals at all. For example, does it make sense if locks can be equal?

              The argument pro Object.equals is convenience. For many types there is a reasonable default. Manually specifying equality for every hash map instantiation is tedious.
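A small Rust sketch of the URL example, under the assumption that URLs are just strings and with made-up helper names (`eq_exact`, `eq_ignoring_fragment`, `dedup_by_eq`). The notion of equality is handed to the algorithm by the caller instead of living on the type:

```rust
// Hypothetical helpers; URLs are treated as plain strings for brevity.

/// Strict comparison: the two strings must match exactly.
fn eq_exact(a: &str, b: &str) -> bool {
    a == b
}

/// Comparison that ignores everything from the first '#' onwards.
fn eq_ignoring_fragment(a: &str, b: &str) -> bool {
    fn strip(u: &str) -> &str {
        u.split('#').next().unwrap_or(u)
    }
    strip(a) == strip(b)
}

/// Deduplicate using whichever notion of equality the caller supplies.
/// Keeps the first occurrence of each equivalence class.
fn dedup_by_eq<'a>(urls: &[&'a str], eq: fn(&str, &str) -> bool) -> Vec<&'a str> {
    let mut out: Vec<&'a str> = Vec::new();
    for &u in urls {
        if !out.iter().any(|&seen| eq(seen, u)) {
            out.push(u);
        }
    }
    out
}
```

With the same input, `dedup_by_eq(&urls, eq_exact)` and `dedup_by_eq(&urls, eq_ignoring_fragment)` can give different answers, and neither is "the" equality of a URL. This also shows the tedium mentioned above: every call has to name its equality.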

              1. 1

                For equality, I’m not sure if there should be an equals method in Object (or Any or AnyRef).

                I agree. The considerations don’t rely on it, except as a tool for demonstration. If a language isn’t forced (e.g. by the runtime/host language/library ecosystem) to have it on its top type, it makes sense to leave it out.

              2. 2

                Why are [] better than <> for generics

                I feel like this should be a two-part argument:

                • why re-using <> for generics is bad
                • why [] are better used for generics than for indexing

                Your article is pretty convincing on the first part, but curiously silent on the second. What does Scala use for indexing into ordered collections? Or does it avoid them altogether?

                1. 4

                  Scala uses () for indexing into ordered collections, a la VBScript.

                  As a language developer, I’ve not implemented generics, so I’ve yet to develop strong feelings about <> in that sense.
                  As a language user, <> for generics has never tripped me up. That has mostly been in C# and Java, however, and I think both languages keep the places where <> vs < or > shows up mostly distinct. I’d hardly call it a disastrous choice on this side of things, even if it took some extra work on the part of the language teams for them.

                  1. 2

                    I do know that <> ends up looking painfully ugly, at least in Rust. It also makes it harder to find a nice syntax for constant generics, and is responsible for the ugly turbofish operator.
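For anyone who hasn’t run into it: the turbofish is the `::<>` form Rust requires when type arguments are written out in expression position, since a bare `<` there would read as a comparison. A minimal sketch (the function name `parse_all` is made up for illustration):

```rust
/// Parses whitespace-separated integers, substituting 0 for anything
/// that fails to parse. Both explicit type arguments need `::<...>`.
fn parse_all(input: &str) -> Vec<i32> {
    input
        .split_whitespace()
        .map(|s| s.parse::<i32>().unwrap_or(0))
        .collect::<Vec<i32>>()
}
```

In declarations such as `Vec<i32>` in the return type, no `::` is needed; only expression position requires the extra prefix.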

                    1. 1

                      I would suppose that is a bit more of a matter of taste, but I’m unsure that [] would be any better on that front, unless ::<> would be replaced by something other than ::[]. Which might be possible if Rust didn’t use [] for indexing. Given the tiny bit of Rust I’ve written, giving up [] for indexing would almost make sense, since you’re often using methods for collection access as it is. I’d have to sit down with the Rust grammar to be sure.

                      1. 3

                        The important thing to keep in mind is that <> weren’t chosen for any kind of reason except “these are literally the only symbols on the keyboard which kind of look like braces that we can retrofit into an existing language.”

                        If you start from a blank slate and ask “what is the best we can do, making the rules of the language simple and easy to understand” the result will be very different.

                        Consider two approaches:

                        Approach A

                        • () brackets: used for method calls, except where they are not: Array construction, array indexing, etc.
                        • [] brackets: used for array construction and array indexing
                        • {} brackets: used for array construction, method bodies, initializers, …
                        • <> “brackets”: used as an operator for comparisons, used for generics

                        Approach B

                        • () brackets: used for terms, grouping, marks a single expression
                        • [] brackets: used for types
                        • {} brackets: used for refining a term/type, marks a sequence of statements
                        • <> “brackets”: not used, because they are not brackets

                        I think no one would say “let’s mix things up and assign brackets to various use-cases randomly” and pick approach A over approach B.

                        And yes, Rust would be simpler and easier to read if they kept the use of [] for generics, instead of migrating to <>.

                        1. 2

                          unless ::<> would be replaced by something other than ::[].

                          That’s exactly what I’m thinking. It’s subjective, but I also find that <> makes polymorphic types look claustrophobic, whereas [] feels more ‘airy’ and open, due to their shapes.

                          Here’s an example from today:

                          fn struct_parser<N>(fields: &[Field<N, Rc<binary::Type<N>>>]) -> Rc<ParseExpr<N>> {
                          

                          vs.

                          fn struct_parser[N](fields: &Slice[Field[N, Rc[binary::Type[N]]]]) -> Rc[ParseExpr[N]] {
                          

                          Ideally I would prefer that the same syntax be used for both value level and type level abstraction and application, but I’ll save that debate for another time…

                2. 3

                  Even a list of how different language designs approach the same problem, with respect and without comparing them, would be a huge improvement over what we have now. Should be easier to compile than a “here are the lessons” document since it’s less subjective.

                  1. 2

                    To compare languages, I’ve used:

                    And although I haven’t used it very much, Rosetta code does what you want:

                    Of course these are superficial, but surprisingly useful. I draw mostly on the languages I really know, but it’s nice to have an awareness of others. I know about 5 languages really well (written thousands of lines of code in each), 5 more languages a little bit (Lua, Go, etc.), and then there are maybe 10 languages that I don’t know which are “interesting” (Swift, Clojure, etc.)

                    I think that posting to lobste.rs or /r/ProgrammingLanguages can help get feedback on those. Here is one thread I posted, summarizing a thread from HN:

                    https://www.reddit.com/r/ProgrammingLanguages/comments/7e32j8/span_slices_string_view_stringpiece_etc/

                    I don’t think there is much hope of getting all the information you want in one place, because there is so much information out there, and some languages like Swift are new and rapidly developing.

                    Personally I maintain a Wiki that is my personal “delicious” (bookmark manager), although I have started to move some of it to the Oil wiki [1]

                    [1] https://github.com/oilshell/oil/wiki

                    1. 2

                      FWIW, I compiled a list of articles on some of the topics I mentioned above:

                      Maybe it is interesting for you.

                      The articles’ conclusions are based on an exhaustive survey of more than a dozen popular languages as well as many minor, but influential ones.

                      (Sorry for the double post.)

                      1. 1

                        I can also recommend /r/Compilers. At least, I had a nice discussion there recently.

                  2. 2

                    The best things I’ve found on this are interviews with language designers. But it is scattered.

                    1. 2

                      That would be nice, but I see several problems:

                      • Language design depends on the domain. There’s no right answer for every domain. For any language that someone claims is “general purpose”, I will name a domain where virtually no programs in it are written (for good reasons).
                      • Almost all language features interact, so what is right for one language is wrong for another.
                      • Some things are subjective, like the two syntax rules you propose. They’re also dependent on the language.
                      1. 3

                        Kind of agree with your points, but I believe there is a reasonable subset of topics, where one can provide a conclusive verdict based on decades of languages trying various approaches, like for instance abusing <> for generics or ident: Type being better.

                    2. 6

                      Interestingly, my argument for why I don’t like macros goes somewhere along those lines: if you have something repetitive that people use macros to work around, it’s probably a flaw in the host language.

                      I’m not saying they are bad, just that I don’t like them.

                      Excluded are obviously languages that are fundamentally based on them.

                      1. 4

                        Ok, so it’s a flaw in the host language. Isn’t it nice to have macros to work around them?

                        1. 7

                          Possible outcome: every project works around the problem in their own slightly incompatible way, and no-one bothers fixing the problem in the host language because it’s easy enough to work around.

                          I like macros as a way to cheaply prototype proposed language changes. I don’t want to see them in production code; debugging from the output of a (nonstandardised) code generator is awful but still easier than debugging from the input, which is effectively what the choice between code generation and macros boils down to.

                          1. 8

                            I like macros as a way to cheaply prototype proposed language changes. I don’t want to see them in production code; debugging from the output of a (nonstandardised) code generator is awful but still easier than debugging from the input, which is effectively what the choice between code generation and macros boils down to.

                            This has, by the way, happened with Rust’s “try!()” (which, after some modifications, became the “?” operator).
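A rough sketch of that progression. `try_like!` is a hypothetical stand-in written for this example, approximating what the old macro did (early return on `Err`, with the error converted via `From`); it is not the standard library’s definition.

```rust
use std::num::ParseIntError;

// Hypothetical stand-in for the old `try!` macro: unwrap `Ok`,
// or return early with the (converted) error.
macro_rules! try_like {
    ($e:expr) => {
        match $e {
            Ok(value) => value,
            Err(err) => return Err(::std::convert::From::from(err)),
        }
    };
}

/// The macro-based prototype of the feature.
fn sum_with_macro(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = try_like!(a.parse::<i32>());
    let y = try_like!(b.parse::<i32>());
    Ok(x + y)
}

/// The same logic once the idea became the `?` operator.
fn sum_with_operator(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.parse::<i32>()?;
    let y = b.parse::<i32>()?;
    Ok(x + y)
}
```

Both functions behave the same for these inputs; the operator version is what the macro prototype turned into once the pattern had proven itself.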

                            1. 2

                              Reminds me of Rust’s primary use of macros: emulating the varargs that the language lacks.

                              My cardinal rule about macros is that if I have to know that something is a macro, then the macro is broken and the author is to blame.

                              Rust also messed up in that regard by giving macro invocations special syntax, which acted as an encouragement to macro authors to go overboard with them because “the user immediately sees that it is a macro” – violating the cardinal rule about macros.

                            2. 2

                              Yup, the alternatives are duplication/boilerplate or external codegen until the language catches up. Macros are a decent way to make problems more tractable in the short term (unless you’re in a wondrous language like Racket), or even to prototype features before they are implemented in the full language. I’d love to see more metaprogramming with a basis in dependent types, but alas there’s still lots of work to be done before that has been made workable for the average programmer.

                              1. 1

                                Sure, that’s why I said they aren’t bad, I just don’t like them.

                                On the other hand, I also don’t have any problem with codegen over macros; it’s basically the same thing at another phase.

                              2. 1

                                Say that this flaw becomes obvious a couple of years after the language’s release. In that case the fix may have the consequence of breaking some subset of existing code, which is arguably worse than including macros in the language. I don’t know where I want to go with this strawman-like argument other than to say that language design is hard and macros let the users of the language make up for the designers’ deficiencies.

                                1. 1

                                  I totally appreciate that. I just don’t see “does the language have macros” as the issue people make it. For example, languages with very expressive metaprogramming systems like Ruby have purposefully not included macros and are doing fine.

                                  Macros are often an incredibly complex and problematic fix for this, though. Just the patterns list of the Rust macros book is huge and advanced: https://danielkeep.github.io/tlborm/book/pat-README.html

                                  (Other languages have nicer systems, I know, but the issue persists: textual replacement is a mess)

                                  I totally see their place, for example, we couldn’t define a type checked println!("{:?}", myvalue) in Rust proper without adding a lot of additional things to the language.

                              3. 6

                                I agree that “there are a lot of languages waiting to be invented”, and that you could use code generation as a clue to find them. I have been keeping a list of these “missing languages” in general purpose programming languages, many of which are mentioned here:

                                https://lobste.rs/s/aqdixr/gentle_introduction_compile_time#c_0cuoc9

                                One funny thing is that I am doing “double code generation” right now. I have a prototype of the Oil lexer in Python, which uses Python regular expressions. I’m writing my own code generator to generate re2c syntax, which then will be run through the re2c compiler to produce a bunch of switch/goto statements embedded in C code.

                                Original: https://github.com/oilshell/oil/blob/master/osh/lex.py

                                First step will look like this (example from Ninja):

                                https://github.com/ninja-build/ninja/blob/master/src/lexer.in.cc#L123

                                Second step will look like this:

                                https://github.com/ninja-build/ninja/blob/master/src/lexer.cc#L128

                                This feels like a smell for sure. But I have looked into the algorithm for generating NFAs, DFAs, and then C code, and it’s not super simple or straightforward. re2c is quite a big project now. Hopefully one day I will replace this with my own thing.

                                1. 2

                                  One thing that is curiously absent from the discussion is dependency management. This may seem trivial but is full of interesting edge cases, e.g. multiple versions of a dependency, for which the current state of the art does not yet appear to have found an optimal solution.

                                  1. 1

                                    Those are discussed in this comment:

                                    https://lobste.rs/s/cd5lk4/low_hanging_fruit_programming_language#c_8y9bpu - 3rd bullet point.

                                    Or am I misunderstanding you?