Threads for spookylukey

  1. 4

    Credential Stuffing - Does your password matter - No - attacker has exact password.

    This is precisely a situation where your choice of password does matter - if you don’t re-use, then the attack fails.

    Password spray - No, unless it is in the handful of top passwords attackers are trying.

    In other words - yes, it matters a lot. If you choose a good password, you won’t be vulnerable.

    Brute force - No, unless you are using an unusable password (and therefore, a password manager) or a really creative passphrase. See below.

    In other words - choosing a good strong password is exactly the thing that will protect you.

    In these 3 cases it’s the opposite of what the article says - your choice of password makes a large difference.

    The article appears to be mixing up two different questions:

    • Does the way an individual chooses passwords make a difference to the chance they will be hacked? (Answer: yes, it makes a big difference, especially in situations where other ways of obtaining passwords are likely to fail.)

    • Should companies focus on passwords and password policies as a way of ensuring user security?

    1. 3

      One of the nice things about Django’s model layer is that you can often get both db-level and application-level validation for the price of one. If you can arrange something similar then that is a winner.

      1. 3

        I think we might get further with this if we did more of a “5 Whys” approach to responding to the excuses. e.g. why are people not allowed to touch that code, or why do they think that?

        1. 11

          So, AI has now graduated to the point where it mercilessly parodies and mocks its creators - the entire software industry - and we’re still not worried yet?

          1. 7

            honestly, if you forced anyone to look at gigabytes of code I’m sure they’d also start mocking the software industry!

            1. 4

              If it showed any signs of understanding the patterns that it’s synthesizing from a giant corpus, I would find it interesting.

              What’s worrying is that people still mistake this output for having meaning, but I suppose pareidolia is a pervasive condition.

              1. 3

                Yeah, I was joking :-)

            1. 6

              I think the idea that software doesn’t wear out is important, but after that, the perspective that software doesn’t fail doesn’t really help you:

              You never have anything close to a full specification, so it’s impossible to know if it’s correct or not.

              Your software will have to run in a large, changing range of software and hardware environments that are all vastly under-specified. So it might well work here and now, but not there and then. Bitrot is real.

              For example, you have to make a bunch of assumptions about what the OS or the CPU will do. But then Spectre comes along, or a bug in a dependency or platform. It does not help to argue “my software is correct” - it still does the wrong thing and still has to be fixed.

              1. 4

                An excellent example of how adhering to rigid definitions removes all utility from language. We can’t say software fails, for the reasons the article describes, which are technically correct. We can only say that software does not work, and never did. But in reality, according to the definitions set out, there is no non-trivial software that does anything useful and also works. So we cannot say that a given piece of software works. So we are left with no language to describe the difference between two pieces of software, one of which works most of the time and one of which is plagued by constant failures and bugs.

                I am normally the one advocating for rigid definitions, but they have to be useful ones. The main hallmark of a bad definition is that it encompasses almost everything, or almost nothing within the set that it divides. In this case the word ‘working’ when applied to software encompasses almost nothing. This means that it is a bad (non-useful) definition.

              1. 1

                Coed looks interesting. The Elm architecture, but in TypeScript.

                1. 2

                  eeue56 did a lot in the Elm community. I will venture to guess one of two things happened here: 1) a new project or its lead decided on TypeScript and TEA is still one decent way to construct UI architectures and it was ported to TS to use a familiar, tested pattern or 2) a project hit the limitations of Elm and was forced to use something else with types (which currently usually involves migrating to TypeScript, ReScript, or PureScript). Seeing the dates of these projects leads me to believe the author misses the ergonomics of an ML language but needs to be inside the TypeScript ecosystem.

                  1. 1

                    Somewhat accurate - I find myself using TS a bunch for work, but find working with React frustrating and find the syntax of TS to be awkward for type-safe code.

                    1. 2

                      Did you look at Rescript (bucklescript) at all? What puts me off is the emphasis on React, it’s not clear there is a decent TEA implementation for it.

                      1. 3

                        Yep, I did. Not a fan of the OCaml styling, plus my experience with the toolchain has been pretty rough - though I know they’ve improved stuff a bunch since I last used it. In my eyes, Elm is near perfect syntax wise for my personal preferences. I just want a better way of working with TypeScript projects with it.

                      2. 1

                        Thanks for reaffirming my understanding of the state of these technologies. I can definitely understand the niche you’re filling. I can empathize being unhappy with TypeScript ergonomics as well specifically moving from a PureScript project to a new, different TypeScript project.

                  1. 2

                    Anyone see any benchmarks for Hotwire? Mention of performance is conspicuously absent from DHH’s article and the Hotwire home page. This other article idealizes performance by implying updates take only 18ms, but that’s not under realistic traffic conditions and doesn’t include the DOM update itself, only the HTTP overhead.

                    Generally speaking, SPAs will perform better for anything but mostly static content, especially under heavy traffic. Hotwire sends a request to the server for every DOM update and waits for the server to render HTML. SPAs send requests to the server only as needed and perform DOM updates in the browser itself.

                    LiveView takes a similar approach. I hear it performs OK except for memory scaling issues. But being built on Erlang gives it the benefit of being designed from the ground up to manage stateful connections concurrently. I suspect Hotwire performs more like Blazor (i.e. not very well). It seems like it might actually perform worse under heavy traffic since Hotwire doesn’t compile application logic to wasm so it can be run client-side like Blazor does.

                    1. 2

                      Looks like you skimmed the excellent post mortem there; LiveView is already used at scale elsewhere. This is more of a pubsub design issue than an Erlang VM or LiveView issue per se. If you’re streaming updates faster than they can be consumed then you have problems anyway in any system. You need to find an alternative approach, which they did.

                      1. 1

                        Used at scale where? I’d genuinely like to see some data.

                        The article I linked to was the first really in-depth one I’ve seen. I didn’t skim it, but it’s a fair point: If you want to provide live updates without refreshing, you’re going to have scaling challenges regardless.

                        1. 1

                          https://smartlogic.io/podcast/elixir-wizards/s7e2-jose/ (from 12:00 on) - but you can’t take this as a cargo-cult example of success. They are using LiveView “at scale”, but with what problems? At ~19:00 - I don’t know the exact details of where the friction happened.

                          A stress test already showed millions of connections (2015) on a single server: https://fly.io/blog/how-we-got-to-liveview/ But I think this is micro/in-theory. I tweeted at Angel, I’m curious too.

                      2. 1

                        I think a big part of the answer is that although in theory a carefully written SPA could out-perform HTML over the wire, other constraints take things very far away from optimal. Compare, for example, Jira vs Github issues (or a blast from the past like Trac, which is still around, for example Django’s bug tracker https://code.djangoproject.com/query). The latter two both feel much lighter and you spend far less time waiting, despite both being server rendered HTML.

                        Another example would be my current client’s custom admin SPA (Django backend). Some pages are very slow for a whole bunch of reasons. I reimplemented a significant fraction using the Django admin (server-rendered HTML, almost no JavaScript) in a tiny fraction of the development time, and the result felt 10 times lighter. Unfortunately this project is too far gone to change tack now.

                        Some of the reasons are:

                        • app structure means you do far more HTTP requests, and then effectively do a client side join of data that could have been processed on the server.
                        • js libraries or components encourage loading all data so you can sort tables client side, for example, slowing down page load
                        • browsers are really good at rendering HTML fast, and SPAs end up downgrading this performance.
                        • visibility and optimisation of server side rendering performance is massively simpler.
                        • a single dev is typically responsible for a page’s loading speed, as opposed to split frontend/backend teams, which makes a massive difference.
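
                        The first bullet can be sketched concretely (fetch_json and the endpoint URLs here are made-up stand-ins for an HTTP client; a real SPA would pay network latency on each call):

```python
# Hypothetical "client-side join": 1 + N requests and a join in the browser,
# doing work that one joined query / server-rendered page would have done.
db = {
    "/api/orders": [{"id": 1, "customer_id": 10}, {"id": 2, "customer_id": 11}],
    "/api/customers/10": {"id": 10, "name": "Ada"},
    "/api/customers/11": {"id": 11, "name": "Grace"},
}

def fetch_json(url):
    return db[url]  # stand-in for an HTTP round trip

orders = fetch_json("/api/orders")  # request 1
rows = [
    {"order": o["id"],
     "customer": fetch_json(f"/api/customers/{o['customer_id']}")["name"]}  # +N requests
    for o in orders
]
print(rows)
```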
                        1. 1

                          Interesting. I didn’t realize Github used HTML over the wire. What’s their implementation? Hotwire? Something custom? I’m digging through their blog, but the only article I’ve found that’s remotely related is their transition from jQuery to the Web Components API, which only relates to relatively small interactions in widgets.

                          I’m working on a .NET project now that uses async partials in a similar manner, but user interactions are noticeably slower than a comparable app I’d previously written in Vue. The more dynamic content in the partial, the longer it takes the server to render it. There may be some performance optimizations I’m missing and I admit to being a relative novice to C#. But, in general, SPA bashing is rarely supported by evidence.

                          Let’s take your Jira example. I’m looking at a Lighthouse report and a performance flamegraph of the main issue page. To chalk their performance problems up to “client side join” doesn’t tell the whole story. It takes half a second just for the host page to load, never mind the content in it. They also made the unfortunate choice of implementing a 2.2 MB Markdown WYSIWYG editor in addition to another editor for their proprietary LWML. Github sensibly has only one LWML (GFM) and you have to click on a preview tab to see how it will be rendered. I think it’s fair to say that if you rewrote all of Jira’s features, it’d be a pig no matter how you did it.

                          1. 2

                            Meant to reply to this earlier, then the weekend happened!

                            GitHub is basically a classic Ruby on Rails app - see https://github.blog/2019-09-09-running-github-on-rails-6-0/ and https://github.blog/2020-12-15-encapsulating-ruby-on-rails-views/ - using Web Components where they need them for enhanced UI. Open up dev tools and you’ll see that on most pages the bulk of the page arrives as HTML from the server, and a few parts then load afterwards, also as HTML chunks. I’m guessing they have a custom JS implementation of this; it’s not that hard to do.

                            I completely agree that the comparison I made is far from the whole story, but part of my point is that other decisions and factors often dominate. Also, once you’ve gone down the SPA route, slapping in another chunk of JS for extra functionality is the path of least resistance, and justified on the basis of “they’ll only need to download this once”. While if you have HTML over the wire, where every page load has to stand on its own, I think you are more cautious about what you add.

                            I can’t comment on .NET technologies. I also agree that there are times when you simply must have the low latency of JavaScript in the browser with async HTTP. I have an app exactly like this - one page is very demanding on the UI front, and complex too. It’s currently about 7000 lines of Elm, but I imagine that React would probably have worked OK too. But it would be terrible, both in terms of complexity and performance, with server-rendered HTML. In my experience, though, quite a lot of apps really don’t need the SPA. For that app, I just have the one page that is SPA-style (and it’s a critical page - users will spend 80% of their time there), but the rest of the site, which contains a long tail of important functionality, is server-side HTML with some smatterings of JS and HTMX.

                        2. 1

                          I read that LiveView postmortem, and found it odd that none of the “lessons learned” included load testing (which AFAICT would have completely prevented the outage). Also, unbounded queues with no natural backpressure (aka Erlang mailboxes) are land mines waiting to explode.

                        1. 14

                          Part of the argument here is over the transitive and intransitive meanings of propagate - see https://en.m.wiktionary.org/wiki/propagate

                          The transitive form (“I propagated a message”) implies a push mechanism. The intransitive form (“the message propagated”) is consistent with any flow of information, whether a push or pull mechanism, especially if viewed at a higher level of abstraction.

                          1. 2

                            One question I have: what does signing a file actually mean? For example, does it mean “I created this file”, “I confirm this file does not have a backdoor”, or “I confirm this is an official release”? Without defined semantics and additional metadata (or standardised formats, like software packages), isn’t this open to cross-protocol attacks? It feels like the “file” namespace is too generic to be useful.

                            1. 7

                              It means a given private key was used to sign the file. There isn’t a fundamental meaning above that.
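
                              A sketch of that point using the third-party cryptography package: verification proves only that the matching private key signed these exact bytes; any richer meaning (“official release”, “no backdoor”) has to come from a protocol layered on top.

```python
# Assumes the third-party `cryptography` package is installed.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

data = b"contents of release-1.0.tar.gz"  # made-up file contents
signature = private_key.sign(data)

public_key.verify(signature, data)  # no exception: this key signed these bytes

try:
    public_key.verify(signature, b"tampered contents")
except InvalidSignature:
    print("signature does not cover these bytes")
```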

                              1. 1

                                Yeah, the file namespace strikes me as a mistake, though one folks have been making with pgp since forever…

                              1. 3

                                It must be so easy to use that a full professor can use it

                                Is “professor” Fred Brooks’ equivalent of the way we talk about a user as “grandma”? That’s pretty interesting!

                                1. 18

                                  Pattern matching has been available in functional programming languages for decades now, it was introduced in the 70s. (Logic programming languages expose even more expressive forms, at higher runtime cost.) It obviously improves readability of code manipulating symbolic expressions/trees, and there is a lot of code like this. I find it surprising that in the 2020s there are still people wondering whether “the feature provides enough value to justify its complexity”.

                                  (The fact that Python did without it for so long was rather a sign of the closed-mindedness of its designer subgroup. The same applies, in my opinion, to languages (including Python, Go, etc.) that still don’t have proper support for disjoint union types / variants / sums / sealed case classes.)

                                  1. 45

                                    Pretty much every feature that has ever been added to every language ever is useful in some way. You can leave a comment like this on almost any feature that a language may not want to implement for one reason or the other.

                                    1. 14

                                      I think it makes more sense in statically typed languages, especially functional ones. That said, languages make different choices. For me, Python has always been about simplicity and readability, and as I’ve tried to show in the article, at least in Python, structural pattern matching is only useful in a relatively few cases. But it’s also a question of taste: I really value the simplicity of the Go language (and C before it), and don’t mind a little bit of verbosity if it makes things clearer and simpler. I did some Scala for a while, and I can see how people like the “power” of it, but the learning curve of its type system was very steep, and there were so many different ways to do things (not to mention the compiler was very slow, partly because of the very complex type system).

                                      1. 22

                                        For the record, pattern-matching was developed mostly in dynamically-typed languages before being adopted in statically-typed languages, and it works just as well in a dynamically-typed world. (In the ML-family world, sum types and pattern-matching were introduced by Hope, an experimental dynamically-typed language; in the logic world, they are basic constructs of Prolog, which is also dynamically-typed – although some more-typed dialects exist.)

                                        as I’ve tried to show in the article, at least in Python, structural pattern matching is only useful in a relatively few cases

                                        Out of the 4 cases you describe in the tutorial, I believe your description of two of them is overly advantageous to if..elif:

                                        • In the match event.get() case, the example you show is a variation of the original example (the longer of the three such examples in the tutorial), and the change you made makes it easier to write an equivalent if..elif version, because you integrated a case (from another version) that ignores all other Click() events. Without this case (as in the original tutorial example), rewriting with if..elif is harder, you need to duplicate the failure case.
                                        • In the eval_expr example, you consider the two versions as readable, but the pattern-version is much easier to maintain. Consider, for example, supporting operations with 4 or 5 parameters, or adding an extra parameter to an existing operator; it’s an easy change with the pattern-matching version, and requires boilerplate-y, non-local transformations with if..elif. These may be uncommon needs for standard mathematical operations, but they are very common when working with other domain-specific languages.
                                        1. 1

                                          the change you made makes it easier to write an equivalent if..elif version

                                          Sorry if it appeared that way – that was certainly not my intention. I’m not quite sure what you mean, though. The first/original event example in the tutorial handles all click events with no filtering using the same code path, so it’s even simpler to convert. I added the Button.LEFT filtering from a subsequent example to give it a bit more interest so it wasn’t quite so simple. I might be missing something, though.

                                          In the eval_expr example, you consider the two versions as readable, but the pattern-version is much easier to maintain. Consider, for example, supporting operations with 4 or 5 parameters, or adding an extra parameter to an existing operator;

                                          I think those examples are very hypothetical – as you indicate, binary and unary operators aren’t suddenly going to support 4 or 5 parameters. A new operation might, but that’s okay. The only line that’s slightly repetitive is the “attribute unpacking”: w, x, y, z = expr.w, expr.x, expr.y, expr.z.

                                          These may be uncommon needs for standard mathematical operations, but they are very common when working with other domain-specific languages.

                                          You’re right, and that’s part of my point. Python isn’t used for implementing compilers or interpreters all that often. That’s where I’m coming from when I ask, “does the feature provide enough value to justify the complexity?” If 90% of Python developers will only rarely use this complex feature, does it make sense to add it to the language?

                                          1. 3

                                            that was certainly not my intention.

                                            To be clear, I’m not suggesting that the change was intentional or sneaky, I’m just pointing out that the translation would be more subtle.

                                            The first/original event example does not ignore “all other Click events” (there is no Click() case), and therefore an accurate if..elif translation would have to do things differently if there is no position field or if it’s not a pair, namely it would have to fall back to the ValueError case.

                                            You’re right, and that’s part of my point. Python isn’t used for implementing compilers or interpreters all that often.

                                            You don’t need to implement a compiler for C or Java, or anything people recognize as a programming language (or HTML or CSS, etc.), to be dealing with a domain-specific languages. Many problem domains contain pieces of data that are effectively expressions in some DSL, and recognizing this can very helpful to write programs in those domains – if the language supports the right features to make this convenient. For example:

                                            • to start with the obvious, many programs start by interpreting some configuration file to influence their behavior; many programs have simple needs well served by linear formats, but many (e.g. cron jobs, etc.) require more elaborate configurations that are DSL-like. Even if the configuration is written in some standard format (INI, YAML, etc.) – so parsing can be delegated to a library – the programmer will still write code to interpret or analyze the configuration data.
                                            • more generally, “structured data formats” are often DSL-shaped; ingesting structured data is something we do super-often in programs
                                            • programs that offer a “query” capability typically provide a small language to express those queries
                                            • events in an event loop typically form a small language
                                        2. 14

                                          I think it makes more sense in statically typed languages, especially functional ones.

                                          In addition to the earlier ones gasche mentioned (it’s important to remember this history), it’s used pervasively in Erlang, and later Elixir. Clojure has core.match, Racket has match, as does Guile. It’s now in Ruby as well!

                                          1. 3

                                            Thanks! I didn’t know that. I have used pattern matching in statically typed language (mostly Scala), and had seen it in the likes of Haskell and OCaml, so I’d incorrectly assumed it was mainly a statically-typed language thing.

                                            1. 1

                                              It is an important feature of OCaml.

                                              1. 3

                                                I am aware - was focusing on dynamically typed languages.

                                            2. 7

                                              For me, it is the combination of algebraic data types + pattern matching + compile time exhaustiveness checking that is the real game changer. With just 1 out of 3, pattern matching in Python is much less compelling.

                                              1. 1

                                                I agree. I wonder if they plan to add exhaustiveness checking to mypy. The way the PEP is so no-holds-barred makes it seem like the goal was featurefulness, not an attempt to support exhaustiveness checking.

                                                1. 2

                                                  I wonder if they plan to add exhaustiveness checking to mypy.

                                                  I don’t think that’s possible in the general case. If I understand the PEP correctly, __match_args__ may be a @property getter method, which could read the contents of a file, or perform a network request, etc.

                                            3. 11

                                              I find it surprising that in the 2020s there are still people wondering whether “the feature provides enough value to justify its complexity”.

                                              I find it surprising that people find this surprising.

                                              Adding features like pattern matching isn’t trivial, and adding it too hastily can backfire in the long term; especially for an established language like Python. As such I would prefer a language take their time, rather than slapping things on because somebody on the internet said it was a good idea.

                                              1. 3

                                                That’s always been the Scheme philosophy:

                                                Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.

                                                And indeed, this pays off: in the Scheme world, there’s been a match package floating around for a long time, implemented simply as a macro. No changes to the core language needed.

                                                1. 4

                                                  No changes to the core language needed.

                                                  I’m sure you recognize that this situation does not translate to other languages, like in this case Python. Implementing it as a macro is just not feasible. And even in Scheme the usage of match macros is rather low. This can be because it is not that useful, but it might also be that the hurdle of adding dependencies is not worth the payoff. Once a feature is integrated in a language, its usage “costs” nothing, thus the value proposition when writing code can be quite different.

                                                  1. 7

                                                    This is rather unrelated to the overall discussion, but as a user of the match macros in Scheme, I must say that I find the lack of integration into the base forms slightly annoying. You cannot pattern-match on a let or lambda, you have to use match-let and match-lambda, define/match (the latter only in Racket I think), etc. This makes reaching for pattern-matching feel heavier, and it may be a partial cause of their comparatively lower usage. ML-family languages generalize all binding positions to accept patterns, which is very nice to decompose records for example (or other single-case data structures). I wish Scheme dialects would embrace this generalization, but they haven’t so far – at least not Racket or Clojure.

                                                    1. 2

                                                      In the case of Clojure, while it doesn’t have pattern matching built in, it does have quite comprehensive destructuring forms (like nested matching in maps, with rather elaborate mechanisms) that work in all binding positions.

                                                      1. 2

                                                        Nice! I suppose (from your post above) that pattern-matching is somehow “integrated” in the Clojure implementation, rather than just being part of the base macro layer that all users see.

                                                        1. 2

                                                          I think the case is that Clojure’s core special forms support it (I suppose the implementation itself is here and called “binding-forms”, which is then used by let, fn and loop, which user-defined macros often end up expanding to). Thus it is somewhat under the base layer that people use.

                                                          But bear in mind this is destructuring, in a more general manner than what Python 2.x already supported, not pattern matching. It also tends to get messy with deep destructuring, but the same can be said of deep pattern matches through multiple layers of constructors.

                                              2. 8

                                                I agree about pattern matching and Python in general. It’s depressing how many features have died in python-ideas because it takes more than a few seconds for an established programmer to grok them. Function composition comes to mind.

                                                But I think Python might be too complicated for pattern matching. The mechanism they’ve settled on is pretty gnarly. I wrote a thing for pattern matching regexps to see how it’d turn out (admittedly against an early version of the PEP; I haven’t checked it against the current state) and I think the results speak for themselves.

                                                1. 6

                                                  But I think Python might be too complicated for pattern matching. The mechanism they’ve settled on is pretty gnarly.

                                                  I mostly agree. I generally like pattern matching and have been excited about this feature, but am still feeling out exactly when I’ll use it and how it lines up with my intuition.

                                                  The part that does feel very Pythonic is that destructuring/unpacking is already pretty pervasive in Python. Not only for basic assignments, but also integrated into control flow constructs. For example, it’s idiomatic to do something like:

                                                  for key, val in some_dictionary.items():
                                                      # ...
                                                  

                                                  Rather than:

                                                  for item in some_dictionary.items():
                                                      key, val = item
                                                      # ...
                                                  

                                                  Or something even worse, like explicit item[0] and item[1]. So the lack of a conditional-with-destructuring, the way we already have foreach-with-destructuring, did seem like a real gap to me, making you have to write the moral equivalent of code that looks more like the 2nd case than the 1st. That hole is now filled by pattern matching. But I agree there are pitfalls around how all these features interact.

                                                2. 2
                                                  for i, (k, v) in enumerate(d.items(), 1): pass
                                                  

                                                  looks like pattern matching to me

                                                  1. 2

                                                    Go aims for simplicity of maintenance and deployment. It doesn’t “still don’t have those features”. The Go authors avoided them on purpose. If you want endless abstractions in Go, embedding Lisp is a possibility: https://github.com/glycerine/zygomys

                                                    1. 5

                                                      Disjoint sums are a basic programming feature (they model data whose shape is “either this or that or that other thing”, which is ubiquitous in the wild, just like pairs/records/structs). They are not an “endless abstraction”, and they are perfectly compatible with maintenance and deployment. Go is a nice language in some respects, the runtime is excellent, the tooling is impressive, etc etc. But this is no rational excuse for the lack of some basic language features.

                                                      We are in the 2020s, there is no excuse for lacking support for sum types and/or pattern matching. Those features have been available for 30 years, their implementation is well-understood, they require no specific runtime support, and they are useful in basically all problem domains.

                                                      I’m not trying to bash a language and attract defensive reactions, but rather to discuss (with concrete examples) the fact that language designers’ mindsets can be influenced by some design cultures more than others, and as a result the design is sometimes held back by a lack of interest in things they are unfamiliar with. Not everyone is fortunate enough to be working with a deeply knowledgeable and curious language designer, such as Graydon Hoare; we need more such people in our language design teams. The default is for people to keep working on what they know; this sort of closed-ecosystem evolution can lead to beautiful ideas (some bits of Perl 6, for example, are very nice!), but it can also hold a language back.

                                                      1. 3

                                                        But this is no rational excuse for the lack of some basic language features.

                                                        Yes there is. Everyone has a favorite feature, and if all of those are implemented, there would easily be feature bloat, long build times and projects with too many dependencies that depend on too many dependencies, like in C++.

                                                        In my opinion, the question is not if a language lacks a feature that someone wants or not, but if it’s usable for goals that people wish to achieve, and Go is clearly suitable for many goals.

                                                    2. 3

                                                      Ah yes, Python is famously closed-minded and hateful toward useful features. For example, they’d never adopt something like, say, list comprehensions. The language’s leaders are far too closed-minded, and dogmatically unwilling to ever consider superior ideas, to pick up something like that. Same for any sort of ability to work with lazy iterables, or do useful combinatoric work with them. That’s something that definitely will never be adopted into Python due to the closed-mindedness of its leaders. And don’t get me started on basic FP building blocks like map and folds. It’s well known that Guido hates them so much that they’re permanently forbidden from ever being in the language!

                                                      (the fact that Python is not Lisp was always unforgivable to many people; the fact that it is not Haskell has now apparently overtaken that on the list of irredeemable sins; yet somehow we Python programmers continue to get useful work done and shrug off the sneers and insults of our self-proclaimed betters much as we always have)

                                                      1. 25

                                                        It is well-documented that Guido van Rossum planned to remove lambda from Python 3. (For the record, I agree that map and filter on lists are much less useful in the presence of list comprehensions.) It is also well-documented that recursion is severely limited in Python, making many elegant definitions impractical.

                                                        Sure, Python adopted (in 2000 I believe?) list comprehensions from ABC (due to Guido working with the language in the 1980s), and a couple of library-definable iterators. I don’t think this contradicts my claim. New ideas came to the language since (generators, decorators), but it remains notable that the language seems to have resisted incorporating strong ideas from other languages. (More so than, say, Ruby, C#, Kotlin, etc.)

                                                        Meta: One aspect of your post that I find unpleasant is the tone. You speak of “sneers and insults”, but it is your post that is highly sarcastic and full of stray exaggerations aimed at this or that language community. I’m not interested in escalating in this direction.

                                                        1. 7

                                                          less useful in presence of list comprehension

                                                          I’m certainly biased, but I find Python’s list comprehensions an abomination for readability in comparison to higher-order pipelines or recursion. I’ve not personally coded Python in 8-9 years, but when I see examples, I feel like I need to turn my head upside down to understand them.

                                                          1. 6

                                                            It is also well-documented that recursion is severely limited in Python, making many elegant definitions impractical.

                                                            For a subjective definition of “elegant”. But this basically is just “Python is not Lisp” (or more specifically, “Python is not Scheme”). And that’s OK. Not every language has to have Scheme’s approach to programming, and Scheme’s history has shown that maybe it’s a good thing for other languages not to be Scheme, since Scheme has been badly held back by its community’s insistence that tail-recursive implementations of algorithms should be the only implementations of those algorithms.

                                                            You speak of “sneers and insults”, but it is your post that is highly sarcastic and full of stray exagerations at this or that language community.

                                                            Your original comment started from a place of assuming – and there really is no other way to read it! – that the programming patterns you care about are objectively superior to other patterns, that languages which do not adopt those patterns are inherently inferior, and that the only reason why a language would not adopt them is due to “closed-mindedness”. Nowhere in your comment is there room for the (ironically) open-minded possibility that someone else might look at patterns you personally subjectively love, evaluate them rationally, and come to a different conclusion than you did – rather, you assume that people who disagree with your stance must be doing so because of personal faults on their part.

                                                            And, well, like I said we’ve got decades of experience of people looking down their noses at Python and/or its core team + community for not becoming a copy of their preferred languages. Your comment really is just another instance of that.

                                                            1. 8

                                                              I’m not specifically pointing out the lack of tail-call optimization (TCO) in Python (which I think is unfortunate indeed; the main argument against is that the call stack matters, but it’s technically fully possible to preserve call stacks on the side in TC-optimizing implementations). Ignoring TCO for a minute, the main problem is the fact that the CPython interpreter severely limits the call depth (iirc it’s 1K calls by default; compare that to the 8 MB stack default on most Unix systems), making recursion mostly unusable in practice, except for logarithmic-space algorithms (balanced trees, etc.).
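That limit is easy to demonstrate (a toy sketch; the default limit is configurable via sys.setrecursionlimit, so it is pinned here to CPython's shipped default of 1000 to make the behavior reproducible):

```python
import sys

# Pin the recursion limit to CPython's shipped default so the
# demonstration is deterministic regardless of environment overrides.
sys.setrecursionlimit(1000)

# A naive linear-recursive length function: one Python stack frame
# per element, so it exhausts the interpreter's call limit quickly.
def length(xs):
    return 0 if not xs else 1 + length(xs[1:])

print(length(list(range(100))))  # 100: fine, well under the limit

try:
    length(list(range(5000)))
except RecursionError:
    print("RecursionError: blew the interpreter's call limit")
```

So a definition that would be unremarkable on an 8 MB native stack fails here at a few hundred elements, which is the practical complaint above.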

                                                              Scheme has been badly held back by its community’s insistence that tail-recursive implementations of algorithms should be the only implementations of those algorithms.

                                                              I’m not sure what you mean – that does not make any sense to me.

                                                              [you assume] that the programming patterns you care about are objectively superior to other patterns [..]

                                                              Well, I claimed

                                                              [pattern matching] obviously improves readability of code manipulating symbolic expressions/trees

                                                              and I stand by this rather modest claim, which I believe is an objective statement. In fact it is supported quite well by the blog post that this comment thread is about. (Pattern-matching combines very well with static typing, and it will be interesting to see what Python typers make of it; but its benefits are already evident in a dynamically-typed context.)

                                                              1. 4

                                                                and I stand by this rather modest claim, which I believe is an objective statement.

                                                                Nit: I don’t think you can have an objective statement of value.

                                                                1. 4

                                                                  Again: your original comment admits of no other interpretation than that you do not believe anyone could rationally look at the feature you like and come to a different conclusion about it. Thus you had to resort to trying to find personal fault in anyone who did.

                                                                  This does not indicate “closed-mindedness” on the part of others. They may prioritize things differently than you do. They may take different views of complexity and tradeoffs (which are the core of any new language-feature proposal) than you do. Or perhaps they simply do not like the feature as much as you do. But you were unwilling to allow for this — if someone didn’t agree with your stance it must be due to personal fault. You allowed for no other explanation.

                                                                  That is a problem. And from someone who’s used to seeing that sort of attitude it will get you a dismissive “here we go again”. Which is exactly what you got.

                                                              2. 4

                                                                This is perhaps more of a feeling, but saying that Python isn’t adopting features as quickly as Ruby seems a bit off. Static type adoption in the Python community has been quicker. async/await has been painful, but is being attempted. Stuff like generalized unpacking (and this!) is also shipping out!

                                                                Maybe it can be faster, but honestly Python probably has one of the lowest “funding amount relative to impact” of the modern languages which makes the whole project not be able to just get things done as quickly IMO.

                                                                Python is truly in a funny place, where many people loudly complain about it not adopting enough features, and many others loudly complain about it adopting too many! It’s of course “different people have different opinions”, but still funny to see both on the same page.

                                                                1. 3

                                                                  It is well-documented that Guido Van Rossum planned to remove lambda from Python 3

                                                                  Thank you for sharing that document. I think Guido was right: it’s not pythonic to map, nor to use lambdas in most cases.

                                                                  Every feature is useful, but some ecosystems work better without certain features. I’m not sure where Go’s generics fall on this spectrum, but I’m sure most proposed features for Python would move it away from its core competency, rather than augmenting a strong core.

                                                                  1. 1

                                                                    We have previously discussed their tone problem. It comes from their political position within the Python ecosystem and they’re relatively blind to it. Just try to stay cool, I suppose?

                                                                    1. 6

                                                                      I really do recommend clicking through to that link, and seeing just what an unbelievably awful thing I said that the user above called out as “emblematic” of the “contempt” I display to Python users. Or the horrific ulterior motive I was found to have further down.

                                                                      Please, though, before clicking through, shield the eyes of children and anyone else who might be affected by seeing such content.

                                                                  2. 5

                                                                    To pick one of my favorite examples, I talked to the author of PEP 498 after a presentation that they gave on f-strings, and asked why they did not add destructuring for f-strings, as well as whether they knew about customizeable template literals in ECMAScript, which trace their lineage through quasiliterals in E all the way back to quasiquotation in formal logic. The author knew of all of this history too, but told me that they were unable to convince CPython’s core developers to adopt any of the more advanced language features because they were not seen as useful.

                                                                    I think that this perspective is the one which might help you understand. Where you see one new feature in PEP 498, I see three missing subfeatures. Where you see itertools as a successful borrowing of many different ideas from many different languages, I see a failure to embrace the arrays and tacit programming of APL and K, and a lack of pattern-matching and custom operators compared to Haskell and SML.

                                                                  3. 1

                                                                    I think the issue is more about pattern matching being a late addition to Python, which means there will be lots of code floating around that isn’t using match expressions. Since it’s not realistic to expect this code to be ported, the old style if … elif will continue to live on. All of this adds up to a larger language surface area, which makes tool support, learning and consistency more difficult.

                                                                    I’m not really a big fan of this “pile of features” style of language design - if you add something, I’d prefer it if something got taken away as well. Otherwise you’ll end up with something like Perl 5.

                                                                  1. 1

                                                                    Is it just me, or is unveil a terrible choice of name? It normally means “remove a veil”, “disclose” or “reveal”. Its function is almost exactly the opposite - it removes access to things! As the author says:

                                                                    Let’s start with unveil. Initially a process has access to the whole file system with the usual restrictions. On the first call to unveil it’s immediately restricted to some subset of the tree.

                                                                    Reading the first line of the man page I can see how it might make sense in some original context, but this is the opposite of the kind of naming you want for security functions…

                                                                    1. 3

                                                                      Is it just me, or is unveil a terrible choice of name? It normally means “remove a veil”, “disclose” or “reveal”. Its function is almost exactly the opposite - it removes access to things!

                                                                      It explicitly grants access to a list of things, starting from the empty set. If it’s not called, everything is unveiled by default.

                                                                      1. 3

                                                                        I am not a native speaker, so I cannot comment if the verb itself is a good choice or not :)

                                                                        As a programmer who uses unveil() in his own programs, the name makes total sense. You basically unveil selected path to the program. If you then change your code to work with other files, you also have to unveil these files to your program.

                                                                        1. 2

                                                                          OK, I understand - only the first call actually restricts (while also unveiling the given path); after that, each call just unveils more.

                                                                        2. 2

                                                                          “Veiling” is not a standard idea in capability theory, but borrowed from legal practice. A veiled fact or object is ambient, but access to it is still explicit and tamed. Ideally, filesystems would be veiled by default, and programs would have to statically register which paths they intend to access without further permission. (Dynamic access would be delegated by the user as usual.)

                                                                          I think that the main problem is that pledges and unveiling are performed as syscalls after a process has started, but there is no corresponding phase before the process starts where pledges are loaded from the process’s binary and the filesystem is veiled.

                                                                          1. 1

                                                                            Doing it as part of normal execution implements separate phases of pledge/unveil boundaries in a flexible way. The article gives the example of opening a log file, and then pledging away your ability to open files, and it’s easy to imagine a similar process for, say, a file server unveiling only the public root directory in between loading its configuration and opening a listen socket.

                                                                            1. 1

                                                                              I think that the main problem is that pledges and unveiling are performed as syscalls after a process has started, but there is no corresponding phase before the process starts where pledges are loaded from the process’s binary and the filesystem is veiled.

                                                                              Well the process comes from somewhere. Having a chain-loader process/executable that sanitises the inherited environment and sets up for the next fits well with the established execution model. It’s explicitly prepared for this in pledge(, execpromises).

                                                                              1. 2

                                                                                You could put it in e.g. an elf header, or fs-level metadata (like suid). Which also fits well with the existing execution model.

                                                                                Suid is a good comparison, despite being such an abomination, because under that model the same mechanism can double as a sandbox.

                                                                                Chainloader approach is good, but complexity becomes harder to wrangle with explicit pledges if you want to do djb-style many communicating processes. On the other hand, file permissions are distant from the code, and do not have an answer for ‘I need to wait until runtime to figure out what permissions I need’.

                                                                                1. 1

                                                                                  Not going too far into the static/dynamic swamp shenanigans (say setting a different PT_INTERP and dlsym:ing out a __constructor pledge/unveil) - there’s two immediate reasons why I’d prefer not to see it as a file-meta property.

                                                                                  1. Filesystem legacy is not pretty, and accidental stripping of metadata on a move to an incompatible filesystem would fail silently and dangerously (stripping suid is not dangerous, whereas stripping a pledge setup is).
                                                                                  2. Pledge violations go kaboom; then you need to know that this is what happened (dmesg etc.) and you land in core_pattern-like setups. The chain-loader, meanwhile, takes on the responsibility of attribution/communication, so X11 gets its dialog or whatever, isatty() an fprintf, and others a syslog, and so on.
                                                                            2. 1

                                                                              Like Linux’s unshare

                                                                            1. 6

                                                                              Drawback: some languages/projects are very unergonomic without an IDE. This is almost always the result of bad language design and/or bad library design.

                                                                              I’m no particular fan of IDEs, but if IDEs are the environment designers expect developers to be in, it’s obvious that language/library designers are going to consciously or unconsciously make the most of that environment. Not doing so would be silly, in fact.

                                                                              Now, you might not like languages/libraries designed for IDEs, but you cannot simply claim it is bad without reasons in an article with this title - it is the very thing you are supposed to be discussing.

                                                                              1. 6

                                                                                For example, Apple’s API design guidelines encourage longer names because they’re more readable, especially with the parameter keywords used in Obj-C and Swift. So you get “indexOfSubstring:inRange:” instead of, say, “strstr”. This was of course done with autocomplete in mind, because that means you don’t have to type the long names by hand.

                                                                                Likewise, I’m sure LISP coding would be a lot less bearable in an editor that didn’t highlight matching parentheses for you ;-)

                                                                              1. 22

                                                                                I’m honestly appalled that such an ignorant article has been written by a former EU MEP. This article completely ignores the fact that the creation of Copilot’s model is itself a copyright infringement. You give Github a license to store and distribute your code from public repositories. You do not give Github permission to use it or to create derivative works. And as Copilot’s model is created from various public code, it is a derivative of that code. Some may try to argue that training machine learning models is ‘fair use’, yet I doubt that something that can regurgitate the entire meaningful portion of a file (an example taken from Github’s own public dataset of exact generated-code collisions) can be called anything but a derivative work.

                                                                                1. 13

                                                                                  In many jurisdictions, as noted in the article, the “right to read is the right to mine” - that is the point. There is already an automatic exemption from copyright law for the purposes of computational analysis, and GitHub don’t need to get that permission from you, as long as they have the legal right to read the code (i.e. they didn’t obtain it illegally).

                                                                                  This appears to be the case in the EU and Britain - https://www.gov.uk/guidance/exceptions-to-copyright - I’m not sure about the US.

                                                                                  Something is not a derivative work in copyright law simply due to having a work as an “input” - you cannot simply argue “it is derived from” therefore “it is a derivative work”, because copyright law, not English language, defines what a “derivative work” is.

                                                                                  For example, Markov chain analysis done on SICP is not infringing.
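To make the analogy concrete, the kind of analysis meant here can be sketched as a toy bigram Markov model (an illustrative sketch, not anyone's actual pipeline): the model keeps only word-to-word transition statistics, not the text itself.

```python
import random
from collections import defaultdict

# Toy bigram Markov model: "mining" a text down to transition counts.
def build_model(text):
    model = defaultdict(list)
    words = text.split()
    for a, b in zip(words, words[1:]):
        model[a].append(b)  # record each observed successor of word a
    return model

# Walk the transition table to emit new (possibly regurgitated) text.
def generate(model, start, n, seed=0):
    random.seed(seed)  # seeded for reproducibility
    out = [start]
    for _ in range(n):
        successors = model.get(out[-1])
        if not successors:
            break  # dead end: the start word never preceded anything
        out.append(random.choice(successors))
    return " ".join(out)

model = build_model("the cat sat on the mat and the cat ran")
print(generate(model, "the", 4))
```

Whether the output infringes plausibly depends on how much of the source survives the statistics, which is exactly the line-drawing problem described below for Copilot.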

                                                                                  Obviously, there are limits to this argument. If Copilot regurgitates a significant portion verbatim, e.g. 200 LOC, is that a derivative? If it is 1,000 lines where not one line matches, but it is essentially the same with just variables renamed, is that a derivative work? etc. I think the problem is that existing law doesn’t properly anticipate the kind of machine learning we are talking about here.

                                                                                  1. 3

                                                                                    Dunno how it is in other countries, but in Lithuania I cannot find any exception that fits what Github has done and would allow my works to be used without my agreement. The closest would be citation, but they do not comply with the requirement of mentioning my name and the work from which the citation is taken.

                                                                                    I gave them the license to reproduce, not to use or modify - these are two entirely different things. If they weren’t, then Github has the ability to use all AGPL’d code hosted on it without any problems, and that’s obviously wrong.

                                                                                    There is no separate “mining” clause. That is not a term in copyright. Notice how research is quite explicitly “non-commercial” - and I very much doubt that what Github is doing with Copilot is non-commercial in nature.

                                                                                    The fact that similar works were done previously doesn’t mean that they were legal. They might have been ignored by the copyright owners, but this one quite obviously isn’t.

                                                                                    1. 8

                                                                                      There is no separate “mining” clause. That is not a term in copyright. Notice how research is quite explicitly “non-commercial” - and I very much doubt that what Github is doing with Copilot is non-commercial in nature.

                                                                                      Ms. Reda is referring to a copyright reform adopted at the EU level in 2019. This reform entailed the DSM directive 2019/790, which is more commonly known for its regulations regarding upload filters. This directive contains a text and data mining copyright limitation in Art. 3 ff. The reason why you don’t see this limitation in Lithuanian law (yet) is probably that Lithuania has not yet transposed the DSM directive into its national law. This should follow soon, since Art. 29 mandates transposition into national law by June 29th, 2021. Germany has not yet completed the transposition either.

                                                                                      That is, “text and data mining” is now a term in copyright law. It is even legally defined at the EU level, in Art. 2 Nr. 2 of the DSM directive.

                                                                                      That being said, the text and data mining exception in Art. 3 ff. DSM directive does not – at first glance, I have only taken a cursory look – allow commercial use of the technique, but only permits research.

                                                                                      1. 1

                                                                                        Oh, huh, here it’s called an education and research exception and has been in law for way longer than that directive, and it doesn’t mention anything remotely translatable as mining. It didn’t even cross my mind that she could have been referring to that. I see that she pushed for that exception to be available for everyone, not only research and cultural heritage, but it is careless of her to mix up what she wants the law to be, and what the law is.

                                                                                        Just as a preventative answer, no, Art 4. of DSM directive does not allow Github to do what it does either, as it applies to work that “has not been expressly reserved by their rightholders in an appropriate manner, such as machine-readable means in the case of content made publicly available online.”, and Github was free to get the content in an appropriate manner for machine learning. It is using the content for machine learning that infringes the code owners copyright.

                                                                                      2. 5

                                                                                        I gave them the license to reproduce, not to use or modify - these are two entirely different things. If they weren’t, then Github has the ability to use all AGPL’d code hosted on it without any problems, and that’s obviously wrong.

                                                                                        An important point is also that the copyright owner is often a different person than the one who signed a contract with GitHub and uploaded the code there (git commit vs. git push). The uploader might agree to whatever terms and conditions, but the copyright owner’s rights must not be affected in any way.

                                                                                        1. 3

                                                                                          Nobody is required to accept terms of a software license. If they don’t agree to the license terms, then they don’t get additional rights granted in the license, but it doesn’t take away rights granted by the copyright law by default.

                                                                                          Even if you licensed your code under “I forbid you from even looking at this!!!”, I can still look at it, and copy portions of it, parody it, create transformative works, use it for educational purposes, etc., as permitted by copyright law exceptions (details vary from country to country, but the gist is the same).

                                                                                      3. 10

                                                                                        Ms. Reda is a member of the Pirate Party, which is primarily focused on the intersection of tech and copyright. She has a lot of experience working on copyright-related legislation, including proposals specifically about text mining. She’s been a voice of reason when the link tax and upload filters were proposed. She’s probably the copyright expert in the EU parliament.

                                                                                        So be careful when you call her ignorant and mistaken about the basics of copyright. She may have drafted the laws you’re trying to explain to her.

                                                                                        1. 16

                                                                                          It is precisely because of her credentials that I am so appalled. I cannot, in good conscience, find this statement anything but ignorant.

                                                                                          The directive about text mining very explicitly specifies that the exception applies only to “research institutions” and “for the purposes of scientific research”. Github and its Copilot don’t fall into that classification at all.

                                                                                          1. 3

                                                                                            Indeed.

                                                                                            Even though my opinion of Copilot is near-instant revulsion, the basic idea is that information and code is being used to train a machine learning system.

                                                                                            This is analogous to a human reviewing and reading code, and learning how to do so from lots of examples. And someone going through higher ed school isn’t “owned” by the copyright owners of the books and code they read and review.

                                                                                            If Copilot is violating, so are humans who read. And that… that’s a very disturbing and disgusting precedent that I hope we don’t set.

                                                                                            1. 6

                                                                                              Copilot doesn’t infringe, but GitHub does, when they distribute Copilot’s output. Analogously to humans, humans who read do not infringe, but they do when they distribute.

                                                                                              1. 1

                                                                                                Why is it not the human that distributes Copilot’s output?

                                                                                                1. 1

                                                                                                  Because Copilot first had to deliver the code to the human. Across the Internet.

                                                                                              2. 4

                                                                                                I don’t think that’s right. A human who learns doesn’t just parrot out pre-memorized code, and if they do they’re infringing on the copyright in that code.

                                                                                                1. 2

                                                                                                  The real question, which I think people are missing, is whether learning itself creates a derivative work?

                                                                                                  How that learning happens can either be with a human, or with a machine learning algorithm. And with the squishiness and lack of insight with human brains, a human can claim they insightfully invented it, even if it was derived. The ML we’re seeing here is doing a rudimentary version of what a human would do.

                                                                                                  If Copilot is ‘violating’, then humans can also be ‘violating’. And I believe that is a dangerous path, laying IP based claims on humans because they read something.

                                                                                                  And as I said upthread, as much as I have a kneejerk that Copilot is bad, I don’t see how it could be infringing without also doing the same to humans.

                                                                                                  And as an underlying idea: copyright itself is a busted concept. It worked in the time before mechanical and electrical duplication drove the cost of copying to nearly zero. Now? Not so much.

                                                                                                  1. 3

                                                                                                    I don’t agree with you that humans and Copilot are learning somewhat the same.

                                                                                                    The human may learn by rote memorization, but more likely, they are learning patterns and the why behind those patterns. Copilot also learns patterns, but there is no why in its “brain.” It is completely rote memorization of patterns.

                                                                                                    The fact that humans learn the why is what makes us different and not infringing, while Copilot infringes.

                                                                                                    1. 2

                                                                                                      Computers learn syntax, humans learn syntax and semantics.

                                                                                                      1. 1

                                                                                                        Perfect way of putting it. Thank you.

                                                                                                    2. 3

                                                                                                      No, I don’t think that’s the real question. Copying is treated as an objective question (and I’m willing to be corrected by experts in copyright law), i.e. similarity or its lack determines copying regardless of intent to copy, unless the creation was independent.

                                                                                                      But even if we address ourselves to that question, I don’t think machine learning is qualitatively similar to human learning. Shoving a bunch of data together into a numerical model to perform sequence prediction doesn’t equate to human invention, it’s a stochastic copying tool.

                                                                                                  2. 3

                                                                                                    It seems like it could be used to shirk the effort required for a clean room implementation. What if I trained the model on one and only one piece of code I didn’t like the license of, and then used the model to regurgitate it, can I then just stick my own license on it and claim it’s not derivative?

                                                                                                  3. 2

                                                                                                    Ms. Reda is a member of the Pirate Party

                                                                                                    She left the Pirate Party years ago, after having installed a potential MEP “successor” who was unknown to almost everyone in the party; she subsequently published a video urging people not to vote Pirate because of him, as he was allegedly a sex offender (which was proven untrue months later).

                                                                                                    1. 0

                                                                                                      Why exactly do you think someone from the ‘pirate party’ would respect any sort of copyright? That sounds like they might be pretty biased against copyright…

                                                                                                      1. 3

                                                                                                        Despite a cheeky name, it’s a serious party. Check out their programme. Even if the party is biased against copyright monopolies, DRM, frivolous patents, etc. they still need expertise in how things work currently in order to effectively oppose them.

                                                                                                    2. 4

                                                                                                      Have you read the article?

                                                                                                      She addresses these concerns directly. You might not agree with her answers, but you can hardly claim she “ignores” this.

                                                                                                      1. 1

                                                                                                        And as Copilot’s model is created from various public code, it is a derivative of that code.

                                                                                                        Depends on the legal system. I don’t know what happens if I am based in Europe but the guys doing this are in the USA. It probably just means that they can do whatever they want. The article makes a ton of claims about various legal aspects of all of this, but as far as I know Julia is not actually a lawyer, so I think we can ignore this article.

                                                                                                        In Poland maybe this could be considered a “derivative work”, but then a work which was merely “inspired” by the original is not covered (so maybe the output of the network is “inspired”?), and then there is a separate section about databases, so maybe this is a database in some weird sense of the word? If you are not a lawyer, I doubt you can properly analyse this. The article tries to analyse the legal aspect and the moral aspect at the same time, while those are completely different things.

                                                                                                      1. 3

                                                                                                        I recently implemented a system for GDPR compliance (especially data retention and right to erasure), using Django’s ORM, which enforces the kind of thing mentioned in this article, and it turned out really nicely. Being able to iterate over all your tables and fields, and require that a policy is defined for everything, is really not hard with Django’s _meta API.

                                                                                                        I also made the configuration of the data retention policy into a human-and-machine readable document - data_retention.yaml, which worked well for a system this size (small charity). The implementation (both of applying the policy and checking that the policy is exhaustive) is in data_retention.py for anyone interested. This way also required essentially zero changes to main application code.

                                                                                                        Of course, there are still holes - I can’t enforce that all future code will only store data in the main database. But mostly people will do, because it is the easiest path.

                                                                                                        Another nice feature was that, despite the fact that the details are specified in YAML not Python or SQL, running the data retention purge commands generates between 0 and 1 database queries per table (nothing, a DELETE or an UPDATE) - thanks to ORM goodness in being able to build up complex queries dynamically.
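
                                                                                                        The exhaustiveness check described above can be sketched without any Django machinery. Here plain dicts stand in for the `_meta` API (`apps.get_models()` / `model._meta.get_fields()` would supply them in the real thing), and all the table, field, and policy names are hypothetical:

                                                                                                        ```python
                                                                                                        # Plain-dict sketch of "require that a policy is defined for everything".
                                                                                                        # In Django this data would come from the _meta API, not hardcoded dicts.

                                                                                                        schema = {
                                                                                                            "Booking": ["id", "guest_name", "guest_email", "created_at"],
                                                                                                            "Payment": ["id", "amount", "card_last4", "created_at"],
                                                                                                        }

                                                                                                        # Policy loaded from something like data_retention.yaml: every field
                                                                                                        # must have an explicit decision ("keep", "delete", or "anonymise").
                                                                                                        policy = {
                                                                                                            "Booking": {"id": "keep", "guest_name": "anonymise",
                                                                                                                        "guest_email": "anonymise", "created_at": "keep"},
                                                                                                            "Payment": {"id": "keep", "amount": "keep",
                                                                                                                        "card_last4": "delete", "created_at": "keep"},
                                                                                                        }

                                                                                                        def check_policy_exhaustive(schema, policy):
                                                                                                            """Return the fields that have no retention decision, per table."""
                                                                                                            missing = {}
                                                                                                            for table, fields in schema.items():
                                                                                                                covered = policy.get(table, {})
                                                                                                                gaps = [f for f in fields if f not in covered]
                                                                                                                if gaps:
                                                                                                                    missing[table] = gaps
                                                                                                            return missing

                                                                                                        # An empty result means the policy covers every field:
                                                                                                        assert check_policy_exhaustive(schema, policy) == {}
                                                                                                        ```

                                                                                                        Running this check in CI is what catches the "new field added, no retention decision made" failure mode.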

                                                                                                        1. 3

                                                                                                          I have a suspicion that a large part of the big discrepancy in people’s reactions to using Docker, especially in development, is whether your dev machine is Linux or not, which people often forget to mention.

                                                                                                          As a long time Linux user, I already know how to do things like install multiple versions of Postgres on my dev machine (which isn’t hard), and isolate projects as needed (virtualenvs etc), in ways that are far more lightweight and convenient than Docker, and I very, very rarely hit “it works on my machine but not in prod”. I imagine that if I were a Mac user Docker might be a lifesaver.

                                                                                                          1. 1

                                                                                                            Bookmarks/links: https://pinboard.in/

                                                                                                            Random bits of text, occasional file: “Note to self” in Signal Messenger can be quite useful, you can attach files.

                                                                                                            More substantial files/directories that need to be shared, other random files: Seafile - https://www.seafile.com - self-hosted. This took a bit of work to set up, but after Dropbox completely messed up with some symlinks and deleted a bunch of my files (thankfully recovered), I needed a low-scale solution that I trusted.

                                                                                                            1. 1

                                                                                                              Pinboard is great. However I’ve traditionally used it as a “links I want to save or reference later” tool, and most of the time the link I want to move between devices doesn’t fit that criterion.

                                                                                                              Thanks for the seafile mention! I’ll check it out.

                                                                                                            1. 2

                                                                                                              Every increase in expressiveness brings an increased burden on all who care to understand the message.

                                                                                                              I beg to differ. For example, the word “history” is supposedly an “increased burden”, since you have to learn it. But it adds to your expressiveness, and it would be much more of a burden trying to have a conversation about history without it.

                                                                                                              I agree that giving the users ultimate freedom may lead to creating a bad code culture, and might even lead to limitations on the language as it tries to grow. But I would argue that these problems usually happen not because there is an abundance of power, but because the design doesn’t help users harness it properly. For example, in Python there is exec, which lets you do pretty much anything you might want. Python also has support for direct memory access. But these features are almost never used, because 1) they are not part of the language’s core design, and so are uncomfortable to use, and 2) there’s almost always an easier way to do whatever it is that you want. I’m not saying that Python is perfect, but just that it is possible to influence users by giving them all the power, but putting some of it on a higher shelf, on purpose, so not everyone will try to reach.

                                                                                                              The only thing worse than a language with too much power, is a language with not enough of it.

                                                                                                              1. 6

                                                                                                                Every increase in expressiveness brings an increased burden on all who care to understand the message.

                                                                                                                I beg to differ. For example, the word “history” is a supposedly an “increased burden”, since you have to learn it. But it adds to your expressiveness, and it would be much more of a burden trying to have a conversation about history without it.

                                                                                                                In the context of Paul Philips’ talk where the quote is from, a word like that would not be “an increase in expressiveness”. The kind of expressiveness he was talking about was “grammar” more than “vocab”, so to speak. In almost any language, human or programming, you can define helpful new “vocab” (libraries for programming) without usually changing the language itself, and without putting “geometric” burdens on the listener, as Paul described it.

                                                                                                                To use Python as example, if I define a new class, or function, or new member of a class, and you are trying to “understand” my code, I have added a small burden on you, the human, and a small burden on any tools - a bit of extra memory will be required to store information about that addition. But suppose I add the descriptor protocol - https://docs.python.org/3/howto/descriptor.html . I’ve now added a whole new feature that you and all your tools will have to learn every time we want to work out “what does attribute x on object y refer to?”. In addition to understanding object __dict__, and class __dict__, and __getattr__, and __getattribute__, and probably others, you now have to be aware of __get__ as well, which works quite differently, and you have to (potentially) think about it every single time you do attribute lookup. For example, a static analyzer and a Python implementation etc. will most likely need new code, not just a little bit more memory to run in.
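
                                                                                                                As a minimal illustration of that extra lookup rule (not the full attribute-lookup algorithm), here is a toy data descriptor; all names are made up:

                                                                                                                ```python
                                                                                                                # A data descriptor: because Doubled defines __set__ as well as
                                                                                                                # __get__, it takes precedence over the instance __dict__, so
                                                                                                                # every read of p.x runs code rather than a plain dict lookup.

                                                                                                                class Doubled:
                                                                                                                    def __set_name__(self, owner, name):
                                                                                                                        self.name = name

                                                                                                                    def __get__(self, obj, objtype=None):
                                                                                                                        if obj is None:
                                                                                                                            return self
                                                                                                                        return obj.__dict__[self.name] * 2

                                                                                                                    def __set__(self, obj, value):
                                                                                                                        obj.__dict__[self.name] = value

                                                                                                                class Point:
                                                                                                                    x = Doubled()

                                                                                                                p = Point()
                                                                                                                p.x = 21           # goes through Doubled.__set__
                                                                                                                print(p.x)         # goes through Doubled.__get__, prints 42
                                                                                                                ```

                                                                                                                A reader (or static analyzer) looking at `p.x` can no longer assume it is a plain stored value - which is exactly the burden being described.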

                                                                                                                Now, this may be worth it, and depending on the specifics it might not be too bad, but it is a burden.

                                                                                                                1. 0

                                                                                                                  You are talking about language complexity, which is different from power. For example, lisp is one of the simplest languages there is, but it’s more powerful than most languages.

                                                                                                                  __get__ doesn’t actually give you that much more power. It’s just a way to tidy up syntactic issues that shouldn’t have been there in the first place. (in a perfect design, there’s no reason you’ll need both __get__ and __getattr__)

                                                                                                                  But regarding the broader point, I agree that all features come at a cost. But it’s much more often that I want a missing feature, rather than lament the abuse of an existing one.

                                                                                                                  I can also see the point that beginners need a simpler language, like Go, that forces them into good habits. But I’ve also seen Go code that would have been much simpler if a little more abstraction was allowed.

                                                                                                                  1. 2

                                                                                                                    The point is that removing power removes needing to ask “What if?”

                                                                                                                    If Python didn’t have __getattr__, you wouldn’t have to ask ‘what if someone is doing a network call behind this assignment?’ when debugging. If your language maps to primitive recursion, you don’t have to ask ‘what if this is an infinite loop’?
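
                                                                                                                    That `__getattr__` point can be made concrete with a toy example (everything here is hypothetical; the print stands in for a network call):

                                                                                                                    ```python
                                                                                                                    # Attribute access that looks like a plain read but runs code.
                                                                                                                    # The caller of `record.owner` cannot tell, from the call site
                                                                                                                    # alone, that a lookup with side effects happens behind it.

                                                                                                                    class LazyRecord:
                                                                                                                        def __getattr__(self, name):
                                                                                                                            # __getattr__ only runs when normal lookup fails.
                                                                                                                            # This could just as easily be a network call.
                                                                                                                            print(f"expensive lookup for {name!r}")
                                                                                                                            value = f"<fetched {name}>"
                                                                                                                            setattr(self, name, value)  # cache in __dict__
                                                                                                                            return value

                                                                                                                    record = LazyRecord()
                                                                                                                    owner = record.owner    # triggers the "expensive lookup"
                                                                                                                    owner2 = record.owner   # cached now: plain dict lookup
                                                                                                                    ```

                                                                                                                    Without `__getattr__` in the language, the "what if this read is really a call?" question never needs asking.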

                                                                                                                    Like static type systems, a restrictive language eliminates entire classes of errors by construction.

                                                                                                                    1. 1

                                                                                                                      Seems like you are still missing the point, though exhibiting exactly the way of thinking that is criticized in the talk. :-/

                                                                                                                      1. 0

                                                                                                                        Perfect, here’s your chance to educate me.

                                                                                                                        1. 1

                                                                                                                          Not with that attitude.

                                                                                                                1. 2

                                                                                                                  To me, the problem is that you now have to use two syntaxes. Anyone who wants to contribute to your docs has to be told - “Markdown, but don’t forget all the Restructuredtext syntax and directives as well.” Isn’t this admitting that restructuredtext is better for your needs? And isn’t there a logical course of action in that case?

                                                                                                                  Sphinx is very widely used and readthedocs is brilliant. I don’t think it is going away anytime soon, we don’t have to worry about Markdown having “won”, since in some domains it clearly hasn’t. There is room for more than one markup language just like there is room for more than one programming language.

                                                                                                                  1. 1

                                                                                                                    For me it’s like… I think that Markdown syntax is “better” than ReST, but it lacks the extensibility. So “ReST-flavored Markdown” is more interesting to me than “ReST”.

                                                                                                    This is all a bit relative though, and if you are working on (for example) a general documentation thing with collaborators, just using ReST seems better.

                                                                                                                  1. 22

                                                                                                                    I’ve come to the same conclusions. For me, the magic combination of features needed in a static type system is almost the same:

                                                                                                                    • type inference everywhere (or almost everywhere)
                                                                                                                    • algebraic data types (i.e. Sum/tagged union types, must be parameterizable i.e. generics, and have pattern matching)
                                                                                                                    • soundness
                                                                                                                    • immutability by default

                                                                                                                    These are not always enough by themselves. You need libraries and frameworks that build on these things in the right way. But when they are present, it seems the ecosystem naturally gets it right.

                                                                                                                    I think the key thing is this set of features enables you to “make illegal states unrepresentable” - and makes it less painful than the alternative ways of coding.
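
                                                                                                    As a rough sketch of the pattern in Python terms (Python lacks the soundness and compiler-enforced exhaustiveness of Elm/Haskell/OCaml, so this only approximates the idea; a checker like mypy can verify the match):

                                                                                                    ```python
                                                                                                    # Tagged union instead of (is_loading, items, error) flag fields
                                                                                                    # that would allow nonsense combinations of state.

                                                                                                    from dataclasses import dataclass
                                                                                                    from typing import Union

                                                                                                    @dataclass(frozen=True)
                                                                                                    class Loading:
                                                                                                        pass

                                                                                                    @dataclass(frozen=True)
                                                                                                    class Loaded:
                                                                                                        items: tuple[str, ...]

                                                                                                    @dataclass(frozen=True)
                                                                                                    class Failed:
                                                                                                        error: str

                                                                                                    # The state is exactly one of these three - "loaded with an
                                                                                                    # error" simply cannot be constructed.
                                                                                                    State = Union[Loading, Loaded, Failed]

                                                                                                    def describe(state: State) -> str:
                                                                                                        match state:
                                                                                                            case Loading():
                                                                                                                return "loading..."
                                                                                                            case Loaded(items):
                                                                                                                return f"{len(items)} items"
                                                                                                            case Failed(error):
                                                                                                                return f"error: {error}"

                                                                                                    print(describe(Loaded(("a", "b"))))  # 2 items
                                                                                                    ```

                                                                                                    In the languages listed above, the compiler additionally guarantees every constructor is handled; that guarantee is the part Python cannot fully replicate.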

                                                                                                                    This doesn’t mean I only use languages like this. I actually use Python mainly, and it also has a completely different set of features that has its own set of advantages. In some cases these features produce ecosystems that are hard for statically typed languages to rival (e.g. the Django ecosystem).

                                                                                                                    But I do think if you only have experience of C/C++/C#/Java/TypeScript, you don’t know enough about type systems to dismiss the advantages of things like Elm/Haskell/OCaml/ReScript/Rust, which really do put you into a different world.

                                                                                                                    1. 5

                                                                                                                      I think the key thing is this set of features enables you to “make illegal states unrepresentable”…

                                                                                                                      For me, this was how the value of static typing really “clicked” in my brain. I had understood very generally why people liked it for years, but it never seemed like that big a deal until I learned about sum types (although I guess enums started me down that line of thought).

                                                                                                                      1. 3

                                                                                                                        Fwiw, TypeScript does actually have just about rich enough types to support for nice things like sum types - not directly but there’s an okay way to encode it.

                                                                                                                        1. 6

                                                                                                                          With TypeScript, it’s the lack of soundness that lets you down, combined with the number of type annotations needed to get you to the point where you are confident that the compiler will always have your back. You start in a very different place to something like Elm. YMMV etc.

                                                                                                                          1. 5

                                                                                                                            Yes. I think the richness of TS’ types still make it a meaningful win over JS, especially where they’ve pushed it over the last few years, but I also consistently miss the soundness of Rust, Elm, etc. when working with it.

                                                                                                                            1. 1

                                                                                                                              I don’t feel that on a day to day basis the unsoundness is causing me big problems. It just means the type system has loopholes: you don’t poke them on purpose and it doesn’t come up. null safety is a much bigger deal and tsc gets that right.

                                                                                                                              It’s like, even in Haskell I can accidentally write an infinite loop and it’ll almost certainly type check since let x = x in x has type forall a. a. IME that’s about 50% “just don’t do that, then” plus 50% “explicit recursion is slightly perilous so write everything as map+foldr invocations”.

                                                                                                                          2. 2

                                                                                                                            I went through a similar process, from being a massive Smalltalk fan (and implementing an AoT-compiled Smalltalk) to working on a language with a static type system. The things that make me hate static types in languages were largely coming from the Pascal view of the world, where the types aren’t sufficiently expressive to check the invariants that I do get wrong or to be able to understand things that I knew were correct. In languages like Pascal (and, often, C++), I need type-system escapes because I can’t express exactly what I want in the type system and so it raises false positives (this is safe, but you need a reinterpret_cast) and false negatives (this is unsafe, but the type system still allows it).

                                                                                                                            My list (i.e. what we’ve converged on for Verona, which is not what I originally wanted but what other people have convinced me that I now want) is similar to yours:

                                                                                                                            • Complete type inference within a function, but fully specified types on function boundaries. Global type inference is both complex for the compiler and for the programmer, because I have to run the type inference algorithm in my head over the whole program to understand the types at a specific point.
                                                                                                                            • No magic types: anything that has a type that can be inferred should have a type that I can write down using the language syntax. If it’s hard for the reader to figure out a type, then I should be able to write it explicitly.
                                                                                                                            • Structural types. Concrete types that implement a set of methods implement any interface that matches that set of methods.
                                                                                                                            • Algebraic types. Union and intersection types are both incredibly useful. Union types let me pass around something that must be one of a small set of things. Intersection types on interfaces let me match on things that implement multiple behaviours easily. That gives me most of what I enjoy from dynamic languages in the Smalltalk family but in a way that is much easier to optimise.
                                                                                                                            • Flow sensitivity. If I write if (x is T) then I shouldn’t need to cast x to T in the body of that if statement. Any code in blocks that are dominated by that check should assume that the type of x is T. The same applies to pattern matching on types. If I nest pattern matching on interface types, the type in the nested block should be an intersection type of both interfaces.
                                                                                                                            • Immutability and ownership as first-class components of the type system. If I want an immutable string, I can write imm & String (intersection type that must be both immutable and String). In Verona, ownership is tied to regions. Any pointer is either imm (can’t be modified by anyone), iso (is the only pointer from outside a region to the sentinel object that dominates all objects in a region), mut (is an interior pointer to an object in a region, can be stored only on the stack or in a field of another object in the region), or readonly (may be any of the above, but I cannot modify it via this pointer).
                                                                                                                            • Generics.
                                                                                                                            • Rich compile-time reflection in the language, which I can use to implement run-time reflection on the types and properties that I want (ideally using standard-library types).
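Several of the items on this list already exist together in TypeScript, which makes it a convenient place to sketch the structural, algebraic, and flow-sensitive pieces (names below are illustrative, not from any real codebase):

```typescript
// Structural types: ConsoleLogger never declares that it implements
// Logger; having a matching method set is enough.
interface Logger { log(msg: string): string; }
class ConsoleLogger { log(msg: string) { return `[log] ${msg}`; } }
const logger: Logger = new ConsoleLogger(); // OK structurally

// Union types: a value that must be one of a small set of things.
type Shape =
  | { kind: "circle"; radius: number }
  | { kind: "square"; side: number };

// Flow sensitivity: checking the tag narrows the type in each branch,
// so no cast is needed to reach radius or side.
function area(s: Shape): number {
  if (s.kind === "circle") {
    return Math.PI * s.radius ** 2; // s is the circle variant here
  }
  return s.side ** 2;               // and the square variant here
}

// Intersection types: something that must implement both behaviours.
interface Named { name: string; }
type NamedLogger = Named & Logger;
const nl: NamedLogger = { name: "main", log: (m) => m };
```

This is much of what feels pleasant in Smalltalk-family dynamic dispatch, but checked statically.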

                                                                                                                            I’m not a fan of immutability by default because it then requires a lot of careful optimisation in the compiler to figure out when it’s safe to do in-place mutation, and that leads to performance that’s very hard to reason about. The Verona model for immutability is that every object is created mutable (either as the sentinel of a new region or in the same region as some other object), but if you have an iso pointer to the sentinel of a region, you can freeze that region. You then get an immutable object graph, which may contain cycles. Immutability is a property distinct from the object’s type.
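A rough analogue of the create-mutable-then-freeze model, sketched in TypeScript rather than Verona syntax (so no region or iso machinery, just the shape of the lifecycle):

```typescript
// The object graph is built fully mutable, may contain cycles, and is
// only then frozen as a whole.
interface GraphNode { value: number; next: GraphNode | null; }

const a: GraphNode = { value: 1, next: null };
const b: GraphNode = { value: 2, next: a };
a.next = b; // mutation is fine while the graph is being built: a -> b -> a

// "Freezing the region": after this, no object in the graph can change.
[a, b].forEach(Object.freeze);
// a.value = 99; // would now throw a TypeError in strict mode
```

The key difference from immutability-by-default is that no compiler analysis is needed to allow the in-place mutations during construction; the freeze point is explicit.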

                                                                                                                            This system makes it very easy to express the invariant that only immutable objects can be shared between threads (which is the number-one property I want my type system to enforce) and makes it cheap to pass complex (possibly cyclic) object graphs between threads without needing any type-system escapes. In Rust this is possible only by reaching for unsafe code: even passing a DAG between two threads requires something like Arc, which is implemented using unsafe internally.