1. 18

  2. 8

    torture the user on behalf of the implementor

    Do they really think this makes the language or resulting code easier to understand for end-users? Being complicated hasn’t done C++ usability any favors either.


    1. 9

      That’s not an apt comparison: the languages may both be complicated, but with entirely different aims. C++ tortures everyone, on behalf of the machine.

      [Also: you got the quote backwards.]

      1. 2

        [Also: you got the quote backwards.]

        I connected Perl to C++. Wording was intentional.

        1. 2

          Do they really think this makes the language or resulting code easier to understand for end-users?

          Perl and C++ have been used in production-grade software.
          Their complexity must have helped.

          Can richness in syntax help make programs more readable?
          I can’t make a definite statement on it, but it does bring programming closer to mathematics.

      2. 2
      3. 7

        Larry Wall has a very interesting theory of computer language development based on a language evolving idioms which compactly express common tasks and allow the program to (roughly) do what the programmer means.

        I think the rise and (relative) fall of Perl shows the strengths and weaknesses of this approach. Essentially, a language that seems to capture the essence of what you’re saying by letting you specify a behavior without unneeded pieces can seem like a dream and in the short run can make someone more productive. In Perl’s case, I think it’s been effectively shown this approach doesn’t scale in time and space (the failure to get the Perl 6 compiler going might be included as just one part of this failure).

        The problem I’d see is that in a programming language, these informal little cues become fixed whereas in human natural language, these same things are variable and context dependent. At the same time, I think the model of allowing this kind of language use is cool. If you could pursue it so it could capture context and scale to larger usage, the results could be a breakthrough. Unfortunately, that’s probably one of many unwalked paths in computer science.


        1. 4

          Actually, yes, that’s exactly my own inclination in programming language design, as well - the formalization of contextual knowledge, although I’m further along than that and I think there are actually a lot of mechanisms, not just context, which are used in human communication and not used in human-computer communication. Relevant: http://xkcd.com/203/ (“Hallucinations”).

          And yes, by the way, I don’t think that shifting the grammar and syntax of the language in such hideously informal scope-creepy ways is actually filling any of these needs. And it’s deeply troubling because it seriously complicates the development of source-code-handling tools, such as indenters, intelligent searches (not just grep), automated refactoring, static analysis, … So if you want a sound bite, I’m anti-Perl-6.

          You may stop reading now unless you are a person who wants to use these ideas to design a language, in which case please continue. I post them because I don’t want them to be lost. :)

          We leave things out which are contextually understood, which requires a very robust notion of what the context is. It would be wonderful to have a programming language which understood that, say… a particular module is talking about the rendering of text, which is a problem domain that involves mostly-affine transformations between various two-dimensional coordinate systems set in metric spaces. Not only that, but most of the transformations are only orthonormal translations (i.e. shift it around), which means that directions and distances remain meaningful, so that it is not usually necessary to be precise about which system they refer to, unless the statement being made involves systems between which the transformation is not known to be orthonormal.

          This context idea has the obvious benefit of feeding into formal verification to show that the program is correct, and that’s one of the big points of it.

          But it has another benefit, too: It means that if I use a “y” variable from some helper function which was only ever passed an “x”, the compiler can be smart enough to look where that function was called from, figure out its broad purpose, and distinguish between the case where I forgot to pass that parameter through, and the case where there’s a deep conceptual problem because I’m trying to compute something that isn’t knowable even in principle at that point. Then it can say either “would you like to add this parameter”, or “hey, wait, what”. :)

          One can even envision a system, and this might or might not turn out to be a good idea, where an understanding of the contexts in which a function operates replaces or augments an explicit parameter list. Yes, if that had to be specified for every function, it would actually be a parameter list, even if perhaps items that didn’t get used got erased, but I don’t envision that happening; it would be specified at a per-file or per-wrapper-block level, akin to a C++ namespace.

          So contexts are one big thing. And then, well, there are metaphors which make explanations simpler. These can be introduced by using them, and then clarified through back-and-forth if needed. They can be elaborated on, nested, transposed into other problem domains, used to express relationships between problem domains, … The main utility of this would be for describing all these contexts and their relationships, without being much more explicit than I was in that last paragraph. I am not aware of any extant language where that knowledge would be fewer than about a thousand lines of code, so… think of metaphors as templates for contexts.

          To get more concrete - there’s the spatial metaphor, which gives meaning to “up, down, left, right”, and based on that there’s the vector-space metaphor which gives meaning to distances, and there’s the metric-space metaphor which does that and also gives meaning to coordinates. But what those meanings are is up to the context that instantiates the metaphors. And then, some of those are actually parametrized on dimensionality! (Yay - two words added to this particular computer’s spellcheck dictionary in the same sentence. Speaking of Sisyphean tasks which computers make us do…)

          And then, as yet another important feature, which I suggested briefly already - in human communication, there’s back-and-forth. No, the edit->compile->curse cycle at present does not count as back-and-forth. It should be possible for the compiler to notice a high-level concern. The missing-parameter example would be a good instance of that, and it could just as well notice the use of the word “left” without a context that explains it, and inquire as to contexts that might have been meant - including checking any libraries being used for them if they aren’t found somewhere closer, such as an outer scope.

          And then, saving a big one for last. Among humans, there’s a great deal of meta-communication about which portions of the communique are meant to express things that are novel and/or subtle. If a program notices that values which are semantically typed as “row” and “column” have been provided in the wrong order (a frequent error because the conventional way to write those in English is the opposite of the conventional way to write “width” and “height”), this MIGHT be a clever coordinate-space transformation (flipping it around the y = x diagonal), and extant systems would simply assume that the programmer intended as much. Which would almost always be a bug.

          So, the bug should be an error by default (warnings are useless, existing only to cause contention between programmers who want warning-free code and programmers who don’t), and there should be a way to call attention to things that are intentionally tricky, either via the back-and-forth mechanism or via a “hey, pay attention” denotation. Both of those workflows should be possible, and, just to clarify, the outcome of the former if used should be to insert the latter, since it is highly desirable that the source should always be the entirety of what’s needed to build the program. This probably requires adding all sorts of error-quelling annotations to the source - and that’s a good thing! If the errors are sufficiently high-level in meaning, the annotations will actually aid a human reader, as well.

          So that’s where I’m at with it so far. I’d love to discuss all of this, but I frankly expect nobody else cares quite as much. :) If you read this far, thank you, and it’s my hope that these memes will someday make it to someone who has time to play with them.
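
For anyone who wants to poke at the row/column point, here is a minimal sketch of the "error by default, explicit marker for the tricky case" idea. It is only an approximation: Python checks this at run time, whereas the proposal is about the compiler catching it. The names `Row`, `Col`, `Grid` and `transposed` are made up for illustration and come from no real library.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Row:
    """A row index; deliberately not interchangeable with Col."""
    value: int


@dataclass(frozen=True)
class Col:
    """A column index; deliberately not interchangeable with Row."""
    value: int


class Grid:
    """Row-major 2D grid whose accessors refuse swapped coordinates."""

    def __init__(self, rows: int, cols: int, fill=0):
        self.rows, self.cols = rows, cols
        self._cells = [[fill] * cols for _ in range(rows)]

    def _check(self, row, col):
        # Swapped or untyped arguments are an error by default, not a warning.
        if not isinstance(row, Row) or not isinstance(col, Col):
            raise TypeError(
                f"expected (Row, Col), got "
                f"({type(row).__name__}, {type(col).__name__})"
            )

    def get(self, row, col):
        self._check(row, col)
        return self._cells[row.value][col.value]

    def set(self, row, col, value):
        self._check(row, col)
        self._cells[row.value][col.value] = value


def transposed(row: Row, col: Col) -> Tuple[Row, Col]:
    """The explicit "hey, pay attention" marker: flip around the diagonal on purpose."""
    return Row(col.value), Col(row.value)
```

With this, `grid.get(Col(2), Row(1))` raises a `TypeError`, while `grid.get(*transposed(Row(1), Col(2)))` goes through and tells the reader the flip was intended.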

          1. 2

            Oh - I forgot to write above, but part of the “metaphors” idea is also that humans are pretty good at working on the same underlying conceptual model from multiple perspectives at once. So the idea is that if what you’re modelling is a set of discourse topics (perhaps you’re writing an NLP thing), you could have a spatial metaphor for talking about how close two topics are by some metric that you have to define as part of introducing that, and then you could have a variety of other metaphors, also in scope, and more suited to the fact that the objects you’re discussing are communications. Perhaps these metaphors would talk about the intent of the topics in terms of speech acts, or about who is likely to use them, or about groupings of them based on interest categories.

            The point there is that these are all talking about the same data structure, but are different interfaces to it. Ideally, this would all be resolved at compile-time, rather than having a zillion methods to cause runtime overhead. We can do that today with multiple inheritance or with typeclasses, but it’s almost never worthwhile. Current practice is to bend all of our code into using the same “metaphor” (i.e. the same API) for a given data structure, because implementing more APIs would be a lot of work to express. It needs to be much, much simpler.

            I’d also like to mention mathematical model theory and its similarity to typeclasses, because I think it’s related to how contexts and metaphors as I’ve described them can be made less work to implement than APIs are today. But I don’t know entirely whether that’s true; it’s the subject on my mind right now, is all.
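
A rough sketch of what "several metaphors over one data structure" looks like with today's tools, using Python's structural `Protocol` types as a stand-in for typeclasses. The topic data, the `Spatial` and `Grouped` views, and the helper functions are all invented for illustration. Dispatch here still happens at run time; only a separate static checker would verify conformance ahead of time, so this falls short of the compile-time resolution wished for above.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Tuple


@dataclass(frozen=True)
class Topic:
    name: str
    position: Tuple[float, float]  # hypothetical embedding coordinates
    category: str                  # hypothetical interest category


class Spatial(Protocol):
    """Spatial metaphor: how close are two topics?"""
    def distance(self, a: str, b: str) -> float: ...


class Grouped(Protocol):
    """Grouping metaphor: which topics belong together?"""
    def groups(self) -> Dict[str, List[str]]: ...


class TopicSet:
    """One data structure that satisfies both views structurally."""

    def __init__(self, topics: List[Topic]):
        self._by_name = {t.name: t for t in topics}

    def distance(self, a: str, b: str) -> float:
        (ax, ay) = self._by_name[a].position
        (bx, by) = self._by_name[b].position
        return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5

    def groups(self) -> Dict[str, List[str]]:
        out: Dict[str, List[str]] = {}
        for t in self._by_name.values():
            out.setdefault(t.category, []).append(t.name)
        return out


def nearest(space: Spatial, origin: str, candidates: List[str]) -> str:
    """Written purely against the spatial view."""
    return min(candidates, key=lambda c: space.distance(origin, c))


def group_sizes(view: Grouped) -> Dict[str, int]:
    """Written purely against the grouping view."""
    return {k: len(v) for k, v in view.groups().items()}
```

Even in this toy, every extra view costs a protocol plus methods on the underlying class, which is the "lot of work to express" complaint in miniature.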

            1. 1

              the problem with warnings-as-bugs-always is that there is yet another layer of context that the programmer knows and the compiler cannot - the intended scope of the program’s usage. for example, it is great that a compiler can warn me that i am converting a string to an int without checking the failure case, but it cannot know that the program will only ever be run in a pipeline downstream of something that is emitting those strings from ints in the first place.

              1. 1

                I mean, that example holds but I really feel that programs should be robust against malicious input. Have you also secured the network connecting your tool to its upstream source, for example? How good is that security? What protection do you have against an ops person changing the pipeline because they didn’t realize your tool relied on this invariant, which, after all, it hasn’t described anywhere?

                But yes, there needs to be an escape hatch; for performance, sometimes you really do need to omit checks. I think the escape should be in the form of signing off on the root cause of each warning, separately. Again, I see source annotations as a good place to put the sign-off. For this to be sensible, the compiler needs to reproducibly find the same root cause each time it runs, even in the face of changes to nearby code; that in turn means that the set of annotations needs to be carefully chosen to be scoped neither too narrowly nor too broadly, and to mean something specific, not just “silence a warning around here”.

                Yes, there’s a great deal of design work to be done to make that happen. I didn’t say this was easy. :)
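
To make the sign-off idea slightly more concrete, here is a toy model, assuming the compiler can name each diagnostic by a root cause and by the symbol it concerns rather than by line number. The types `Diagnostic` and `SignOff`, the `review` function, and the root-cause strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Diagnostic:
    root_cause: str  # e.g. "unchecked-parse"; identifiers here are made up
    symbol: str      # the named thing it concerns, not a line number


@dataclass(frozen=True)
class SignOff:
    root_cause: str
    symbol: str
    reason: str      # doubles as documentation for the next reader


def review(
    diagnostics: List[Diagnostic], signoffs: List[SignOff]
) -> Tuple[List[Diagnostic], List[SignOff]]:
    """Return (still-open diagnostics, sign-offs that matched nothing)."""
    accepted = {(s.root_cause, s.symbol) for s in signoffs}
    raised = {(d.root_cause, d.symbol) for d in diagnostics}
    # A sign-off silences exactly one root cause on one symbol, nothing nearby.
    still_open = [d for d in diagnostics if (d.root_cause, d.symbol) not in accepted]
    # A sign-off that no longer matches anything is itself worth reporting.
    stale = [s for s in signoffs if (s.root_cause, s.symbol) not in raised]
    return still_open, stale
```

Keying on `(root_cause, symbol)` is one guess at a scope that is neither too narrow nor too broad. It survives edits to nearby lines, and a second, unrelated problem on the same symbol stays visible.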

                1. 1

                  right, and i am a huge fan of the compiler helping you as much as it can, but there are times i want to say “look, i know better than you that fixing this would be needless overhead in terms of time, effort and code cluttering”. the last is a pretty big one - i am all for future-proof, robust code, but not if it means i have to add a lot of boilerplate in places where i know i will never need it, even if i cannot prove that to the compiler’s satisfaction.

                  maybe a nice, lightweight way to do it would be to add explicit “trusted” and “untrusted” annotations to the source code, which would override any inference the compiler performs. you could even have a sufficiently smart tool generate coverage data for the portion of the code that would break if something you marked trusted wasn’t reliable after all.

                  1. 2

                    An all-purpose “stop complaining” annotation is lightweight, but it’s not the problem I’m interested in solving - I’m looking at much more thorough re-imaginings. The fundamental limit created by the annotation being vague as to what it does is that it will someday mask an unrelated problem that’s nearby, and if people are in the habit of using that, they aren’t getting any benefit from the system’s capabilities - so they would be better off, honestly, using a language that doesn’t have them.

                    So high-precision annotations: Don’t warn me about implicit floating-point promotion of this particular variable. Or, say, of any variable in this portion of the file. Perhaps a selector language would help reduce the number of such annotations. I agree that it will always be verbose compared to just letting things break, but what’s boilerplate for you is documentation for people wanting to use your code, to see whether it meets their robustness expectations, or how much work it would be to fix if not.

                    I also have to say that I feel like the specific example of string input being in a known format is something that should be handled by a library, anyway. This isn’t 1990 and there are plenty of versatile, easy-to-use parse paradigms to choose from. It’s no longer sensible to be rolling that yourself. Presumably you can just set a “safe” or “crash please” flag when you call the library. Unless it’s a really, really tight inner loop, you’re going to have to convince me pretty thoroughly that you need the latter, when the issue of how much work it is at your end is removed, but…
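
A minimal sketch of the "safe" versus "crash please" flag, written as if it were a small parsing helper. `OnBadInput`, `parse_int` and `parse_all` are hypothetical names, not part of any real library. In Python the validity check cannot actually be skipped, so the flag only chooses what happens when it fails.

```python
from enum import Enum
from typing import Iterable, List, Optional


class OnBadInput(Enum):
    SAFE = "safe"    # malformed fields come back as None
    CRASH = "crash"  # caller vouches for the input; malformed data raises


def parse_int(text: str, mode: OnBadInput = OnBadInput.SAFE) -> Optional[int]:
    """Parse one decimal integer field, e.g. a line from an upstream pipeline stage."""
    try:
        return int(text.strip())
    except ValueError:
        if mode is OnBadInput.CRASH:
            raise
        return None


def parse_all(lines: Iterable[str], mode: OnBadInput = OnBadInput.SAFE) -> List[int]:
    """Parse many fields; in SAFE mode malformed ones are dropped."""
    parsed = (parse_int(line, mode) for line in lines)
    return [n for n in parsed if n is not None]
```

The call site then reads `parse_all(lines, OnBadInput.CRASH)`, which records the decision in the source where the next person can see it.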

          2. 6

            I kinda wanted a “compiler” tag here.

            1. 8

              or the wait a year and target the first Perl 6 language release

              uh huh.

              1. 3

                If you missed it, Larry announced at FOSDEM that beta will be out this year, for Christmas.