1. 9

    I like the concept, but do you really think they will let you circumvent their paywall forever? This seems like a project with a built-in expiration date.

    1. 18

      Thanks! And yeah, this could stop working any moment 😂 I still got a lot of value out of making it and launching it though. It’s also possible this doesn’t make a big enough dent in their profit for them to care. Nitter, Invidious, and the like all continue to work well.

      1. 4

        Don’t let that discourage you! I love these types of front ends. I currently use the Privacy Redirect extension in Firefox for Invidious mirrors, Nitter, libredd.it, and others. They’re all amazing, and contributors have kept many of them functional as well as providing several alternative hosts!

        Most of them are also way better than the actual sites as far as usability goes. Thank you for your effort. These types of frontends are really important.

      2. 18

        You get the first 2 or 3 articles “free”; if you don’t store cookies in your browser then you always get “free” articles. If the scribe.rip service gets banned then you can host your own version – it’ll be hard to block that as long as they keep the “first n articles free” scheme.

        But even if you get all of it for free this is great, because the entire thing is just so incredibly annoying with all their stupid JavaScript that prevents you from copying text (instead, you can tweet about it because, apparently, tweeting something is more important than copying).

        1. 1

          In practice, it’s more than 2 or 3 because whether or not articles go behind the paywall is a decision made by individual authors (though it’s now enabled by default, rather than disabled by default as it was up until about a year ago). This proxy seems to remove the intrusive pop-overs asking non-logged-in users to register, which people repeatedly mistake for the very visually similar paywall pop-overs.

          I’m not sure what you mean about “javascript that prevents you from copying text” – maybe some extension is removing it for me, or maybe some extension on your end is causing the social-highlighting mechanism to interfere with copy & paste? I copy text from medium all the time.

      1. 8

        This is a little bit too harsh on hedgehogs. The thing is, when “big ideas” are right, the impact is enormous – and the people arguing for those big ideas often end up forgotten because the underlying idea becomes part of ‘common sense’. When big ideas are even mostly right, it’s enough to shift the landscape so that instances when they don’t apply are dedicated niches, identified and explored much more quickly because all you need to do to find the pathological cases is to show when the norm falls apart (which requires establishing the norm in the first place). So, lots of people want to be the ones who are publicly associated with big ideas in case the ones they support are right or mostly right.

        In software, experienced devs tend to become foxes – because part of experience is working on a variety of very different codebases and toolchains. Experienced devs who become hedgehogs drift away from tech and toward PR work. But ‘big ideas’ are also situated & contextual, and because software is socially constructed, the popularity of a big idea can by itself change the environment in such a way that the idea becomes ‘true’ (for instance, TDD is a great fit with an agile model, and an agile model fits the economic model of a small company with more engineers than experienced managers working on new products in a field that doesn’t have established norms, and for a while the most profitable companies in software were like that so agile & TDD gained a following – and now it’s baked into policy a lot of places & you can’t circumvent agile & TDD without a bureaucratic nightmare, even when working with big globs of inherited or purchased legacy code).

        The other thing is that thought leaders aren’t, typically speaking, hedgehogs in Tetlock’s model. A hedgehog is somebody like Karl Marx, who had a deep and broad knowledge of economics, philosophy, and history and brought it all to bear on a specific model of economics wherein capitalism is born out of a particular set of tensions that feudalism inevitably produced, expecting to project out possible future economic systems from capitalism’s tensions, or Charles Darwin, whose knowledge of natural history (both first-hand and theoretical) was beyond his peers due to unique experiences, and whose experiences led him to create a grand unifying theory of evolution based on the twin forces of variation and selection. In other words, hedgehogs are experts in at least one domain. Most thought leaders in software are not really experts in software, and the kind of guys the OP describes dealing with are a different type entirely – folks who have attached themselves to a grand narrative because they heard some emotionally compelling defense of it, and then have never let go. Even hedgehogs are willing to recognize the borders around where their Big Idea applies. If your answer to everything is TDD, you’re not just single-minded but actually lack experience!

        1. 3

          The distinction he’s trying to draw is a bit too fuzzy to draw directly, and is at best a spectrum, because every piece of code is an interface between the language it’s written in and the problem domain. Nearly all of his “insightful” examples are actually cases where specific language features (often not understood by outsiders) happen to match relatively well-understood problem-domain features. This is probably at least partially a side effect of writing for a general audience of programmers (who are more likely to know enough python or haskell to get by than to understand specific quirks of an obscure domain), but it’s also a very real problem for maintainability concerns, since maintenance in big companies is often put on the shoulders of junior engineers who aren’t experts in either the domain or the tooling. It muddies the water because duff’s device, in its original incarnation, was exactly this kind of problem: written in an environment where everybody knew C quite well, it was a necessary optimization for a particular piece of code that ran too slow otherwise, & while everyone recognized it was clever at the time, it only gained its low reputation when C optimizers got good enough (and clock speeds got high enough) that optimization tricks no longer needed to be in the toolbox of your average developer.

          In other words, rather than two classifications of ‘cleverness’ there are two systems you can be clever in interacting with – the problem domain and the language domain – and to the degree that one is clever toward one of these domains, the resulting code is fragile with respect to changes in that domain and obscure with respect to a lack of knowledge about that domain.

          To illustrate, take an example from python. Expecting an expression with division in it to be an integer division expression if all of its components are integers used to be as normal in python as using list(set(x)) to remove duplicates from some list x – in other words, not clever at all, and not even particularly python-specific (since many languages, including java and c, work this way). With python 3.x, / became true division even when both operands are integers, with a dedicated floor-division operator (//) taking over the old integer-division role – so the behavior of division with respect to integers changed drastically, in a way that broke a lot of third party code. So expecting division to work the same way in python as it does in c was, essentially, “too clever” with respect to the massive changes to the language being undertaken, simply because those changes happened to make this behavior fragile.
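
          A rough sketch of the behavior change in question (runnable under Python 3; the Python 2 results are noted in comments):

            xs = [3, 1, 3, 2, 1]
            print(list(set(xs)))   # the ordinary "remove duplicates" idiom mentioned above
            print(7 / 2)           # Python 2: 3 (integer division)   Python 3: 3.5 (true division)
            print(7 // 2)          # Python 3's dedicated floor-division operator: 3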

          This is almost exactly the same as a problem being redefined & requiring a code rewrite. The difference is that, in today’s professional software environment, there’s an expectation that third party interfaces (including but not limited to language features) will change slowly or not at all while the problem definition you’re working from will probably change multiple times per sprint, being contingent & based on faulty data in the first place.

          Python changing the definition of division is not structurally different from your PM requiring that a parameter should be able to come as a list or an object; you just don’t expect the former as often, so there’s an expectation that developers can and will over time learn all of the quirks of their language.

          This part of maintainability has a lot of intersection with other parts (like compatibility). Like, the difference between “language feature” and “implementation quirk” comes down to whether or not it’s standardized (and whether or not ostensibly standard-compliant implementations obey that standard). The more independent implementations there are, the longer it takes for core features to mutate, but at the same time, if you have a single dominant implementation, you can expect obscure quirks to be more widely applicable. Sometimes, the value of using an implementation quirk is enormous – for instance, GNU Awk has a non-standard feature for streaming to an open pipe, and so if you decide that you’re only ever going to use GNU’s version of Awk, you can build things like spawning HTTP requests in the background into your data-processing awk script, preventing you from needing to involve a heavier language or framework.

          1. 32

            I love the idea!

            If you’ll forgive some bikeshedding, would calling it post-mortem be better? When I first saw the title of this post, I thought it was going to be about some sort of argument-inciting event (which would normally be off-topic for lobsters, right?)

            Unless the tag is also to be used for posts regarding notable current outages, etc, in which case incident makes more sense.

            1. 21

              I think I like “incident” better than “post-mortem”, if only because it’s a bit broader (i.e. I think technical post-mortem posts would appropriately fall under the hypothetical tag of “incident”). I think being able to distinguish between regular news and interesting technical work done as a consequence of that news is valuable, though, so +1

              1. 12

                I tend to prefer retrospectives rather than post-mortem. The death analogy really isn’t necessary here.

                1. 3

                  Even though I think that post-mortem is widespread and accepted terminology (especially in, say, gamedev circles), I like the slightly broader tag of retrospective as well–it feels like a tag we could use to denote a more general lessons-learned sort of submission.

                  1. 5

                    I think the spirit of what we’re trying to categorize are posts that cover specialized, targeted, technical responses to events (security breaches, production application failures caused by programming errors, etc). This is very different from, say, the retrospectives that a software team using the Agile methodology may hold to address problems that may have come up during their last sprint (don’t bikeshed this, I know Agile doesn’t have a patent on retrospectives or whatever, it’s just an example)

                    So, for the sake of clarity, I think “incident” would be a better tag than “retrospective” (or “post-mortem” for that matter), because it sends a clear message to Lobsters users that an event occurred that warranted a response, and subsequently produced the technical material being posted.

                    1. 2

                      Sure!

                      As long as @pushcx gives us some tag, I’m not going to bikeshed.

                    2. 2

                      “Retrospective” more naturally applies to what we use the “historical” tag for now, so this might be confusing.

                      For this very narrow proposed case – where we’re looking for overviews of security incidents – we should probably pick a specific term. The justification for this tag (that it’s historically important & needs more exposure) goes away if we broaden it to include other kinds of postmortems (let alone other kinds of retrospectives), or if we invite users to misinterpret the purpose of the tag with a too-vague description of its purpose (which functionally results in the same thing, modulo pushcx going and deleting submissions).

                  2. 8

                    +1 for the idea but -1 for calling it post-mortem; incident or PIR/incident-review would be better

                    1. 3

                      I wish “after action report” was in general circulation. Not seriously proposing it, it would only be more confusing. I just think there are good reasons to dislike “postmortem”.

                      1. 3

                        Post-mortems are interesting in general, & I’d personally be more interested in reading the content of a dedicated project post-mortem tag. But I agree that incident reports themselves are more historically important and have lower visibility. Post-mortems should probably be spun out into a different request tbh.

                      1. 2

                        In response to some of the misunderstandings I’ve seen in this comments section, I’ve written a different essay making the same points in a way that may be more clear.

                        1. 3

                          On the other hand, no RSS client or RSS-generating application does any of this work until somebody complains, because webtech culture loves postel’s law — which, put into practical terms, really means “just wing it, try to support whatever trash other applications emit, and if somebody can’t handle the trash you emit then it’s their problem”. No point in actually specifying the date format for RSS — everybody should just guess, and if some client can’t handle every possible date format then fuck ’em (unless your valuable user happens to prefer that client).

                          Postel’s law isn’t a choice; it’s a force of nature. Any widely used system or protocol has to interoperate with misbehaving implementations like your RSS feed. RSS is poorly specified but the same issues are always present.

                          Quoting from IPv4, IPv6, and a sudden change in attitude

                          Postel’s Law is the principle the Internet is based on. Not because Jon Postel was such a great salesperson and talked everyone into it, but because that is the only winning evolutionary strategy when internets are competing. Nature doesn’t care what you think about Postel’s Law, because the only Internet that happens will be the one that follows Postel’s Law. Every other internet will, without exception, eventually be joined to The Internet by some goofball who does it wrong, but just well enough that it adds value, so that eventually nobody will be willing to break the connection. And then to maintain that connection will require further application of Postel’s Law.

                          RSS, TSV, CSV

                          I mean, RSS is a product of its time. I understand why RSS didn’t use JSON — because JSON didn’t exist yet. But RSS could have used… TSV. Or CSV. Or a line-based key-value format with section separators. Using XML for anything should have been immediately and obviously a bad idea to any professional developer, even in the early 90s, considering just how many problems one immediately encounters in trying to work with it.

                          TSV and CSV are riddled with interoperability issues; CSV has an RFC but noncompliant files and readers are a regular occurrence. The timing is also off; XML 1.0 wasn’t standardized until 1998.

                          I spend a depressing amount of time coping with CSV import issues; here are some issues I would expect in a CSV- or TSV-based RSS (the sketch after this list shows two readers disagreeing about the same line).

                          • Are line endings permitted within a field?
                          • Are quotes, tabs, or commas escaped?
                          • How are nested quotes handled?
                          • Are quote marks mandatory?
                          • Are trailing spaces after a comma permitted?
                          • Are empty lines between records permitted?
                          • How are missing fields addressed?
                          • Which line endings are used – \r\n, \n, or \r?
                          • What about encoding – e.g. receiving a localized character set instead of UTF-8?
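
                          To make the ambiguity concrete, here’s a rough Python sketch (hypothetical record) of two readers disagreeing about the very same TSV line – one honoring quoting, one assuming quoting never happens:

                            import csv, io

                            # One hypothetical feed record: URL, a quoted title containing a tab, a date.
                            line = 'https://example.com/post\t"a title\twith a tab"\t2021-03-29\n'

                            quoted = next(csv.reader(io.StringIO(line), delimiter="\t", quotechar='"'))
                            naive = line.rstrip("\n").split("\t")   # a reader that assumes quoting never happens

                            print(len(quoted), quoted)  # 3 fields; the tab inside the quotes stays in the title
                            print(len(naive), naive)    # 4 fields; the quoted title is split in two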

                          New web browsers cannot be written, nor can web browsers be maintained except by the three largest tech companies in the world, because postel’s law (along with the IETF policy of “loose consensus and running code”) has doomed all web standards to being enormous baroque messes of corner cases that can only be navigated by the chinese-army technique of throwing waves of cheap contractors at it. Since no single person can completely understand any W3C standard, no person can be sure that they are generating W3C-compliant code or markup, so they test using an existing dominant browser (or maybe two or three of them). Any new browser, even in the event that it happens to adhere to the W3C standard, is unlikely to behave exactly the same way as these dominant browsers, and so actually-existing code will look “broken” (even if it’s actually being interpreted correctly according to the standard). This is a moral failing: it leads inevitably to centralization.

                          How is interoperability a moral failing? It isn’t realistic to expect that people will read the standards or fix their code. XHTML tried ignoring Postel’s law - that’s why we’re using HTML5. There are only two kinds of protocols: the ones people complain about and the ones nobody uses.

                          1. 4

                            I’m going to try to repeat my main point here, with more emphasis:

                            Belief in postel’s law takes the onus off of protocol designers to design simple, clean protocols that can be easily implemented and audited and instead puts the onus on individual developers to deal with whatever nonsense happens to exist. This, in turn, means that common kinds of failures to properly implement unnecessarily complicated specs are rolled into the folk understanding of what it means to “properly implement” those specs (and sometimes into the specs themselves), until (quite quickly) interoperability becomes impossibly difficult.

                            You can stop this process by:

                            1. making protocols that are easy to implement correctly
                            2. making it easy to identify implementation errors

                            If you do this, then nobody has a good excuse for an implementation error, and you can feel free to reject invalid input. Postel’s law becomes unnecessary.

                            If you don’t do this, every popular protocol eventually becomes like HTML5: something that nobody other than Google and Microsoft can fully implement.

                            1. 3

                              Do you have any examples of protocols with multiple implementations and at least 1k users that meet your criteria?
                              The IETF protocols are widely implemented and used; it’s a given that there are implementation and specification errors.

                              If you do this, then nobody has a good excuse for an implementation error, and you can feel free to reject invalid input. Postel’s law becomes unnecessary.

                              I have yet to work at a place where rejecting invalid input was an option; you can’t choose which systems you interoperate with. Has your experience been different?

                          1. 12

                            Using XML at all leads inevitably to centralization.

                            This is a logical leap of epic proportions. I fail to see how the argument put forth backs it up.

                            HTTP is also bloated and over-complex in the same way. And don’t get me started on certificates and certificate chains. Or fucking DNS.

                            It doesn’t take a genius to come up with better formats than these for the same applications. These aren’t inherently-complex problems.

                            Hint: the formats are not the complicated (or even interesting) part. They could all be XML right now and it wouldn’t change much of anything.

                            1. 3

                              If you use XML, you are either writing your own XML parser or using someone else’s.

                              Writing your own is hard – much harder than the project you’re planning to use XML for, probably.

                              Using someone else’s is centralization: there aren’t many good XML parsers available, so you’re placing more power and trust in the hands of Jackson or libxml (which already underpin practically everything).

                              A non-centralizing format is one where rolling your own implementation is not an obviously terrible idea.

                              A format where any developer on the team can implement it more or less correctly in an afternoon is also a format that isn’t going to become a problem later - one where developers can reason about the behavior of parsers and generators, where different implementations can be expected to generally agree, and where corner cases that might be handled differently between implementations can be identified and avoided. It’s a format where auditing third party implementations is also potentially straightforward (and you can reject difficult-to-audit ones out of hand). So there are more reasons to prefer them than just avoiding power consolidation.
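
                              To give a sense of scale, here’s a minimal sketch of such a format – a hypothetical line-based key/value layout with a tab separator and no escaping – where every corner case is visible at a glance:

                                def parse(text):
                                    """Hypothetical 'key<TAB>value' format: one pair per line, no escaping, no nesting."""
                                    records = {}
                                    for lineno, line in enumerate(text.splitlines(), 1):
                                        if not line.strip():
                                            continue                      # blank lines carry no data
                                        key, sep, value = line.partition("\t")
                                        if not sep:
                                            raise ValueError(f"line {lineno}: missing tab separator")
                                        records[key] = value              # duplicate keys: last one wins (a visible choice)
                                    return records

                                print(parse("title\tMy feed\nupdated\t2021-03-29"))   # {'title': 'My feed', 'updated': '2021-03-29'}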

                              (Maybe I should rewrite this essay, seeing as how I need to rephrase its core thesis in response to practically every response.)

                              1. 6

                                Using someone else’s is centralization:…

                                A non-centralizing format is one where rolling your own implementation is not an obviously terrible idea.

                                By this logic, code reuse is “centralization”. Is this actually what you’re arguing? Because it sounds pretty ridiculous.

                                I, for one, would prefer to use a well-tested, auditable implementation than roll my own in an afternoon.

                                1. 3

                                  As a practical example of the difference between XML and JSON when using popular libraries isn’t an option, let’s look at Monte’s JSON support. Monte doesn’t have any XML parsers. It does have a JSON parser, which started out as a handwritten lexer and parser. Later, we were able to write basic parser combinators, and that led to the current combinator-driven parser, which is smaller and faster. JSON is quicker than XML to support, even considering the effort required to write a parser library.

                                  Parsing even a comfortable and reasonable subset of XML is hard enough that I’ve simply not parsed XML in-process; instead, I’ll look for external tools like relpipe (XML example) which will do it for me, handling escaping and CDATA and entities and all of that fun stuff. Calling subprocesses is safer than XML to support, even considering the security risks.

                                  I could imagine JSON-specific arguments against this, so as a tiebreaker, we could consider Monte’s support for Capn Proto. We implement the recommended convention, and maintain a tool which compiles Capn Proto schemata into Monte ASTs. This requires a lot of binary parsing, recursive algorithms for nested data structures, deferred/lazy parsing, and other hallmarks of XML parsers. Capn Proto is much easier than XML to support, even considering the code generation that is required.

                                  To exemplify a point from the author’s sibling response:

                                  If correct implementation is such a problem that you would rather have a third party do it for you, then you are probably not qualified to independently audit or test a third party implementation! Sometimes (as with encryption) this is unavoidable: the problem domain is so complex that you need to trust expert opinion, and rolling your own is never going to be a good idea.

                                  Monte is a language where we generally don’t have FFI; we require binding libsodium and are prepared to pay the price for vulnerabilities in that particular C library, but we’re not willing to make that trade for a JSON or XML parser, and we won’t let users try to make that choice for themselves, either.

                                  1. 3

                                    This matches my experience.

                                    I’m not coming to this from a place of ignorance of XML (though I had a fair amount of ignorance of RSS while implementing it); I’m coming from a place of having spent a whole lot of time dealing with Jackson, libxml, and various implementations of xpath and xslt, in multiple languages, for work and often discovering that because the other end actually uses a tiny fraction of the expressive domain of XML, the easiest and most reliable way to get the data I needed was to use sed. At the same time, hand-processing subsets of json (or even all of msgpack) was a lot easier than doing fairly simple things with existing xml parsers.

                                  2. 2

                                    If you’re reusing your own code? I wouldn’t call that centralization of power (although sometimes poor factoring will lead to some of the same kinds of concerns – ex., if you have a module intended to be general purpose but the needs of the functions that use it cause it to become too tightly coupled, the behavior that must be implemented in order to make this ‘general purpose module’ work for one function may break the other in subtle ways). So, shared code is a shared point of failure.

                                    But dependencies, on top of being a shared point of failure, also need to be trusted or audited. Core complexity is both a major factor in implementation difficulty and a major factor in auditing & testing difficulty. Doing a complete audit (where you have absolute confidence that the behavior is correct) can sometimes be more difficult than a full reimplementation, since you need to have a complete model of the correct behavior (and if you are using the right tools, having a complete and correct model of the problem is the most difficult part of implementation); testing can help, at the cost of adding complexity and corner cases (ex., where is it safe and appropriate to add mocks – and must you modify the natural structure of the code to do so?).

                                    If correct implementation is such a problem that you would rather have a third party do it for you, then you are probably not qualified to independently audit or test a third party implementation! Sometimes (as with encryption) this is unavoidable: the problem domain is so complex that you need to trust expert opinion, and rolling your own is never going to be a good idea. But if you are, for instance, looking to serialize a key-value table or some kind of nested tree structure, there are no ‘essential’ gotchas (anybody with a CS degree knows everything you need to watch out for in both these cases to guarantee correctness) – so why deal with complexity introduced by the format? And on the other hand, why put trust into unknown third parties when the only thing you need to do to make checking correctness easy is to roll your own (simpler) format?

                                    Depending on your needs, it is often going to be easier to roll your own format than to use an existing parser implementation for a more complex one (especially with XML, where the format is awkward and all of the parsers are awkward too, simply due to the way the spec is defined); if you are using XML, odds are there’s an existing simpler format (with easier-to-audit implementations) that cleanly handles whatever you need to handle, and furthermore, a subset of that format that supports all your data without ugly corner cases.

                                    The correct format to use is the simplest format that is rich enough to express everything you need to express. Don’t use XML when you can use JSON and don’t use JSON when you can use TSV, and don’t use TSV with quoting if you can use TSV without quoting – and if you can use TSV without quoting, implement it with string split.
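
                                    As a sketch of what “TSV without quoting, implemented with string split” looks like in practice (hypothetical two-column layout; fields are simply forbidden from containing tabs or newlines):

                                      def read_unquoted_tsv(text):
                                          # No quoting rules at all: a record is a line, a field is whatever sits between tabs.
                                          return [line.split("\t") for line in text.splitlines() if line]

                                      rows = read_unquoted_tsv("https://example.com/a\t2021-03-29\nhttps://example.com/b\t2021-03-28")
                                      print(rows)   # [['https://example.com/a', '2021-03-29'], ['https://example.com/b', '2021-03-28']]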

                                    If you use a third party dependency, every bug in that dependency is a bug you are responsible for but can’t promise to resolve – something you’re not in the right position to identify, and don’t have permission to fix. That’s a really ethically dicey proposition. Your users have no choice but to trust you to some degree (the best they can do is trust you or the most trustworthy of your peers – and often, that is no choice at all), and you have chosen to expose them to a risk that you know that you may be unable to protect them against. Most of the time, you’ve done this without it being truly necessary: you could take full responsibility for that functionality, but it’s boring, or you’re on a deadline that makes it impossible (in which case your project manager is morally culpable).

                                    Often, all that it takes to move from “I can’t possibly take responsibility for a hundred thousand lines of complex logic in a third party library” to “I can keep an accurate model of every operation this code performs in my head” is replacing large third party dependencies whose features you don’t need with a simpler hand-rolled implementation that has none of those features.

                                    1. 4

                                      I find it amusing that you claim that RSS is “webtech mentality” and yet unironically advocate the use of JSON, which is just as much, if not more, “webtech mentality” than RSS. And JSON probably has more corner cases than RSS.

                                      I’m not sure what you mean by “TSV with quoting”—as long as the values don’t contain the tab character, there’s no quoting required, unlike with CSV. I do wish the ASCII separator characters were used, but as Loup-Vaillant said:

                                      it was mostly short term convenience: since basically forever, we had this thing we call a text editor, that displays ASCII text. So if your format is based on that, it’s easy to debug input and output by just displaying the text in an editor, or even modifying it manually. It is then very tempting to optimise the text for human readability over a standard text editor… next thing you know, you’re using text for everything.

                                      1. 3

                                        I find it amusing that you claim that RSS is “webtech mentality” and yet unironically advocate the use of JSON, which is just as much, if not more, “webtech mentality” than RSS. And JSON probably has more corner cases than RSS.

                                        I use the term “webtech mentality” to mean a really specific thing here: the idea (produced in part by postel’s law) that interoperability can be guaranteed without simple and consistent design, because the onus of properly interpreting even impossibly-complex or internally-inconsistent specs is upon the individual implementor. This is like the tech equivalent of neoliberalism, and has very similar results.

                                        I advocate JSON over XML as the container language for something like RSS because it has fewer practically-meaningful corner cases for this purpose. I’d advocate for MSGPACK over JSON in basically any case where JSON is appropriate. In many cases where JSON is used, it would make more sense to have a line-based or tabular format (though if you are working inside a web browser or are interacting with code running in a web browser, your choices are seriously limited and it’s not really possible to do anything in a completely sensible way).

                                        I’m not sure what you mean by “TSV with quoting”—as long as the values don’t contain the tab character, there’s not quoting required

                                        You just answered your own question :)

                              1. 8

                                Winer chose XML for RSS because he used XML for OPML, and Winer hasn’t met any problem that cannot be solved by outlining.

                                RSS does suck, which was why a team developed Atom instead (which fixed the encoding issues and date formats). Of course Winer saw this as an attack and the Syndication Wars were on! Good times.

                                The decision to use XML for RSS led inevitably to both Google Reader and Google’s decision to kill Google Reader, and that has been a huge setback for the “open web” (which, while it was never really open — basically for exactly these reasons — has never been as close to open since).

                                I miss Google Reader too, but c’mon.

                                This essay originally appeared on secure scuttlebutt at %3k6qAo85Q/1hjMW6xc3S0MNt+PsBCM00S354HeXOUco=.sha256

                                I’m sure it did, but at least I can’t read it there. Looks like the big bad evil centralized web still has a leg up.

                                1. 4

                                  This essay originally appeared on secure scuttlebutt at %3k6qAo85Q/1hjMW6xc3S0MNt+PsBCM00S354HeXOUco=.sha256

                                  I’m sure it did, but at least I can’t read it there. Looks like the big bad evil centralized web still has a leg up.

                                  It gets more ironic. SSB’s core protocol is tied to the formatting rules of JSON.stringify.

                                  Talk about the webtech mentality…
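
                                    To make the problem concrete with a rough sketch (hypothetical message values, shown in Python for brevity): the same logical object serializes to different byte strings under different formatting choices, so a hash or signature computed over one library’s exact output can’t be verified without reproducing that library’s formatting byte for byte.

                                      import hashlib, json

                                      msg = {"author": "@example", "text": "hello"}          # hypothetical message content

                                      compact = json.dumps(msg, separators=(",", ":"))
                                      pretty = json.dumps(msg, indent=2)

                                      print(hashlib.sha256(compact.encode()).hexdigest())
                                      print(hashlib.sha256(pretty.encode()).hexdigest())     # different digest for the same data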

                                  1. 3

                                    That’s my number 1 issue with SSB (which I generally think is really cool!) - essentially all clients are stuck on the same library to handle the protocol stuff because it was badly designed and nobody wants to implement a compatible client (yes, I know someone did; they tied the reimplementation to the same obscure database format as the canonical one).

                                    1. 2

                                      I posted it to SSB as a criticism of SSB’s own webtech mentality – which I believe was understood there :)

                                      1. 1

                                        Fair enough. It was a bit too subtle for me.

                                        1. 1

                                          It’s a shame that you’re not able to see the full thread over on SSB. It’s full of good discussion that really can’t be easily summarized. (We get into JSON vs BSON vs MSGPACK vs protobuf vs s-expressions, and gopher vs gemini). I may rework some of the points there into a new essay – SSB is not for public posts, after all.

                                          1. 1

                                            Very cool!

                                  1. 3

                                    The author’s core point seems to be:

                                    Technology decisions have a genuine moral valence. Every temporary hack, as soon as more than one person uses it, becomes effectively permanent. This means that if you are creating something that multiple people will use, you are in relationship of power over those people & part of your responsibility is to make sure you don’t harm them in the long run.

                                    I agree.

                                    But the argument that complex formats lead to centralisation? The evidence that XML/RSS/HTTP/DNS/blahblahblah are simultaneously too easy to get wrong and also too hard to get right, so that only large, well-funded organisations can manage them well? No mate. The existence counter-examples are literally countless.

                                    Ironically, webtech people have written some of the most cogent criticisms of Postel’s Law: RFC 3117 and The Harmful Consequences of the Robustness Principle.

                                    The author again:

                                    It doesn’t take a genius to come up with better formats than these for the same applications. These aren’t inherently-complex problems. These are relatively simple problems with obvious and straightforward solutions that the industry collectively ignored in favor of obviously-bad solutions that a couple famous people promoted.

                                    Astounding arrogance and hindsight bias.

                                    I strongly recommend reading about the history of some of these formats.

                                    1. 2

                                      The existence counter-examples are literally countless.

                                      Would you care to list some? (Just enough to demonstrate that there is no trend in the opposite direction)

                                      Also: I think you misunderstood how DNS & HTTP centralize. It’s not, in that case, so much that implementation is hard (though HTTP is more complicated to implement than a properly designed protocol that performs the same tasks would be), but that having a host name in a data address means that the data is inherently tied to the host – third party mirrors cannot help ease the load of lots of requests, and so it is the responsibility of the original host to build up more capacity than he would ever use, lest the data become unavailable. The host (or somewhere on the path to the host, since the usual solutions to this problem involve creating layers of load balancing) is a single point of failure. This is implicit in the address format.

                                      I strongly recommend reading about the history of some of these formats.

                                      If there’s any reason why Dave Winer could not have possibly imagined a format like < URL > < tab > < ISO date > < newline > for a reverse-chronological list of URLs in 1996, let me know :)

                                      1. 3

                                        Would you care to list some? (Just enough to demonstrate that there is no trend in the opposite direction)

                                        You re-explained your argument about centralisation, so the following is me answering your question, though I realise it isn’t germane to your point:

                                        For XML, RSS, HTTP, and DNS specifically? There are countless serialisers, deserialisers, tools, clients, servers, applications, books, courses, and institutions ranging from country neighborhood house parties to international standards organisations, etc., that every single day and night are rehearsing and rebuilding these technologies from scratch. SV could be nuked, every country’s top three tech companies could have their charters revoked, and these technologies would live on.

                                        For complex formats in general? *waves hand around in all directions* Complex things are easier to make than simple things. That Antoine de Saint-Exupéry quote. That Rich Hickey talk. Because of that, the world is chock full of complex things. Every natural human language. LIFE

                                        1. 2

                                          If there’s any reason why Dave Winer could not have possibly imagined a format like < URL > < tab > < ISO date > < newline > for a reverse-chronological list of URLs in 1996, let me know :)

                                          What if the content contains newlines (or tabs, such as might be present if the content is source code)?

                                          What does the ISO date refer to? Created date? Modified date? What if you want both?

                                          RSS is more than a list of URLs. You generally have the full text of an entry, author name (important in the case of multiple authors on a source), modification times, other media (podcasts are built on RSS, and were Winer’s “next big thing”). You want stuff like when the feed was last modified, and so on.

                                          Atom tightened up RSS a bit, but had to create an entirely separate format, for reasons that are now lost in the mists of time but can generally be summarized as Winer being a dick.

                                          Edit: this article about the rise and fall of RSS is fascinating and a lot more nuanced than my crude summary. Winer is still a dick though.

                                          https://twobithistory.org/2018/12/18/rss.html

                                          1. 2

                                            What if the content contains newlines (or tabs, such as might be present if the content is source code)?

                                            URLs and ISO dates are both specified to never contain tabs or newlines.

                                            What does the ISO date refer to? Created date? Modified date? What if you want both?

                                            The date the owner of the RSS feed added that link to his feed.

                                            RSS is more than a list of URLs.

                                            Winer’s first mistake with the format, and probably one of the reasons he thought XML might be a good idea for the container structure. He’s packed in a whole lot of stuff that people might want and in the process, made the core function a lot messier.

                                            I might be convinced that it is justifiable to have a short “title” field – one specified to contain no tabs, newlines, or formatting. This lets somebody decide whether or not they want to open the link (which may be, as you noted, a link to a media file and therefore possibly quite large). But, once you’re allowing people to put arbitrarily large formatted commentary into a syndication feed, you are essentially delivering duplicates of the websites you are linking to, and so you have screwed up the basic value proposition of RSS (to be able to fetch the stuff you haven’t seen and not fetch the stuff you have).

                                            With a line-based format that has the date of a post in each line, a reader can fetch each line and then stop if the date in the line is not newer than the newest date in its previous fetch cycle. That’s a whole lot better than a “feed modified date”.
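
                                            A rough sketch of that reader loop, assuming the hypothetical one-entry-per-line format from upthread (URL, tab, ISO date; newest entries first – ISO 8601 dates in the same form sort as plain strings):

                                              def new_entries(lines, last_seen_date):
                                                  for line in lines:
                                                      url, _, date = line.rstrip("\n").partition("\t")
                                                      if date <= last_seen_date:
                                                          break                              # everything below this is already known
                                                      yield url, date

                                              feed = ["https://example.com/new\t2021-03-30",
                                                      "https://example.com/old\t2021-03-28"]
                                              print(list(new_entries(feed, "2021-03-29")))   # only the entry newer than the last fetch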

                                            1. 2

                                              It sounds like Winer’s goals for RSS and your goals for RSS are different.

                                              You might prefer OPML. 😉

                                          2. 1

                                            Also: I think you misunderstood how DNS & HTTP centralize. It’s not, in that case, so much that implementation is hard (though HTTP is more complicated to implement than a properly designed protocol that performs the same tasks would be), but that having a host name in a data address means that the data is inherently tied to the host – third party mirrors cannot help ease the load of lots of requests, and so it is the responsibility of the original host to build up more capacity than he would ever use, lest the data become unavailable. The host (or somewhere on the path to the host, since the usual solutions to this problem involve creating layers of load balancing) is a single point of failure. This is implicit in the address format.

                                            You just got me in the feels. I spent 2010 giving presentations on this very topic. One was half jokingly entitled “SSL is racist” because it explained how the middlebox caching then common in the developing world was being killed.

                                            I think we both love content-centric networking technologies; but, for the most part, they didn’t really exist until recently!

                                            If there’s any reason why Dave Winer could not have possibly imagined a format like < URL > < tab > < ISO date > < newline > for a reverse-chronological list of URLs in 1996, let me know :)

                                            I guess no reason except we were busy litigating things like:

                                            • How to encode URLs into URIs into US-ASCII (percent encoding?)
                                            • Which ISO date time to use
                                            • What even is a newline? (IIRC UserLand was Windows, so I guess it’s \r\n)

                                            Also, UTF-8 hadn’t won yet. (See all the corner cases of encoding UTF-8 in URLs and vice versa.) So your format would have a heap of fun there too; it doesn’t even have the advantage that documents get in providing context for auto-detection.

                                            1. 1

                                              Which ISO date time to use

                                              RSS specifies RFC 822, so the date is in the format “Mon, 29 Mar 2021 15:26:45 +0200”.

                                              What even is a newline? (IIRC UserLand was Windows, so I guess it’s \r\n)

                                              I’m pretty sure it was originally developed for the classic Mac, so \r would have been more appropriate.

                                              1. 1

                                                RFC-822 uses two-digit years. You might be thinking of RFC-2822, published in 2001.

                                                1. 1

                                                  Why not both?

                                                  All date-times in RSS conform to the Date and Time Specification of RFC 822, with the exception that the year may be expressed with two characters or four characters (four preferred).

                                                  https://validator.w3.org/feed/docs/rss2.html

                                                  RSS 2.0 is of course the rebranding of RSS 0.92 and is basically an end-run around RSS 1.0, which uses real ISO dates, being derived from RDF.

                                                  1. 1

                                                    I addressed this in another thread.

                                                    But you’ve remade my point: there was no brain-dead, obviously correct date and time format in 1996.

                                                    1. 1

                                                      Maybe not in the US. I’ve been using ISO 8601 since I learned to write.

                                                      1. 1

                                                        🙄 Have you read ISO-8601:1988?

                                                        1. 1

                                                          Nope, can’t afford to.

                                                          But Sweden has used (YY)YY-MM-DD as a date format for as long as I’ve been aware there’s been a date format… The Swedish standard SIS 10211 was replaced by ISO-8601, but it dates from 1972.

                                        1. 5

                                          Yes, the RSS specification has problems, but why wouldn’t one use Atom instead? Atom support is widespread and Atom is very well specified. The article does not mention Atom at all.

                                          1. 3

                                            I’m surprised anyone would seriously consider generating RSS now that every remaining feed reader supports Atom and has supported it for decades.

                                            JSONFeed looks somewhat appealing, but it’s not really free of ambiguities either. For example, I tried to submit a patch to fix one ambiguity that arises from having both a “publication date” and a “modification date” in the protocol, and it still lingers there.

                                            1. 1

                                              TBH, I tried generating RSS because somebody asked for it & within a few minutes I had it working, in about ten lines, in the reader I was testing with. “Serious consideration” is not on the table - this whole project (including fetching titles, generating HTML, posting digests to SSB & twitter) is maybe 150 lines of shell and should never be substantially more complicated than it is.

                                              At least a year later, several people complained that my RSS feed didn’t work in their reader, after which I took a look at the spec again and started testing against several readers.

                                              I didn’t know much about either RSS or Atom at the time, since my only prior experience was writing a “stupidrss” reader that converted both formats into a newline-separated list of URLs with sed.

                                          1. 7

                                              On the other hand, no RSS client or RSS-generating application does any of this work until somebody complains, because webtech culture loves postel’s law — which, put into practical terms, really means “just wing it, try to support whatever trash other applications emit, and if somebody can’t handle the trash you emit then it’s their problem”.

                                            The author complains that the world does exactly what they did.

                                            The author didn’t read the RSS specification, which states with attendant matching samples, “all date-times in RSS conform to the Date and Time Specification of RFC 822.”

                                            The author didn’t run a validator against their RSS!

                                            RSS has its problems, e.g. which RSS is RSS? But replacing XML with JSON? Or TSV? Or CSV?

                                            The author doesn’t realize the world is full of JSON. And making a production JSON, TSV, or CSV parser is not easy.

                                            […] there are basically no barriers to using existing JSON parsers and generators either.

                                            The same is true of XML. 🤦🏾‍♂️

                                            1. 6

                                                The RSS specification is such a gem. Why omit the best part?

                                              All date-times in RSS conform to the Date and Time Specification of RFC 822, with the exception that the year may be expressed with two characters or four characters (four preferred).

                                                This really is a case of the webtech mentality the article is talking about. I really don’t understand what they were thinking.

                                              1. 3

                                                It is such a gem.

                                                  As you note in another thread, Atom is well specified. But there’s no connection to the “webtech mentality.” The author has an issue with applying Postel’s Law and flails about ahistorically, blaming it for things they don’t like.

                                                1. 1

                                                  This really is a case of the webtech mentality the article is talking about. I really don’t understand what they were thinking.

                                                  What they were thinking is that RFC-822 (dd mm yy) was written in 1982 and isn’t Y2K compliant. Its update (RFC-2822) wasn’t finalised until 2001.

                                                  HTTP/1.1 supported three different datetime formats, but preferred RFC-1123 … which is … *drumroll* RFC-822 where “the syntax for the date is hereby changed to: date = 1*2DIGIT month 2*4DIGIT”.

                                                  And lest I be accused of cutting out the juicy hilarity of the time:

                                                  All mail software SHOULD use 4-digit years in dates, to ease the transition to the next century.

                                                  In short, they didn’t want to add yet another format to the mix.

                                                2. 4

                                                  the author complains that the world did exactly what they did

                                                  No, I complained that the world is incentivized into doing what I did by poor design practices.

                                                  I took one look at the RSS spec and said, “it isn’t worth fully reading and understanding this document for the sake of a ten line shell script”. This is a totally reasonable thing to do. Delivering a list of URLs should not require a standard dozens of pages long. So I read examples & tested against the least bloated RSS reader application I could find.

                                                  The same is true of XML.

                                                  Not really?

                                                  JSON, CSV, and TSV have complicated edge cases that matter for “production” but can be avoided if you choose an appropriate format for your data (ex., if your data never includes newlines, tabs, or nulls, and it is tabular, TSV can be used without quoting & can be processed with tools like cut & awk).

                                                  XML also has complicated edge cases, but it has no simple cases: you must use an XML parser, and all existing XML parsers are awkward. So maybe you buy in totally & try to use some of the other tools intended to make serializing in and out of XML easier – and then you’re stuck with specifying schemas (so that your XML data, no smaller and no less complex, is now less flexible, and you have also had to learn a second language for defining those schemas). Pretty soon, you are stuck juggling the differences between XSLT 1 and XSLT 2, or worse, you are using some off-the-shelf framework that generates completely alien-looking schemas out of an attempt to cover over the differences between XML and any sane way of organizing data.

                                                  A sensible standard lets you write reliable, compatible code without many dependencies.

                                                  The author didn’t read the RSS specification, which states with attendant matching samples, “all date-times in RSS conform to the Date and Time Specification of RFC 822.”

                                                  I skimmed the RSS specification, read RFC 822, implemented RFC 822 date formats, and discovered that it would not work because (as sanxiyn notes) RSS doesn’t actually support RFC 822 date formats but instead a nonexistent variant of RFC 822 with four-digit years. This was years after the original implementation – which, based on existing examples of RSS feeds, used the default human-readable locale date format provided by date & worked on most clients.
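
                                                  For what it’s worth, the variant the spec prefers – RFC 822 syntax with a four-digit year, i.e. what RFC 2822 later codified – happens to be what Python’s standard library emits; a small sketch:

                                                    import email.utils, datetime

                                                    now = datetime.datetime.now(datetime.timezone.utc)
                                                    print(email.utils.format_datetime(now))       # e.g. 'Mon, 29 Mar 2021 13:26:45 +0000'
                                                    print(email.utils.formatdate(usegmt=True))    # same shape from a Unix timestamp, with a 'GMT' suffix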

                                                  If somebody was paying me to generate an RSS feed – sure, of course, I would read the entire RSS spec and use a big bloated XML library and spend weeks testing it against dozens of feed readers. If some stranger on the internet asks me to add RSS generation to a short shell script, spending more than four hours on it is a failure, because working with XML and RSS is miserable.

                                                  1. 2

                                                    No, I complained that the world is incentivized into doing what I did by poor design practices.

                                                    That same process that you (and the world) follow is the very process that results in those “poor” designs.

                                                    JSON, CSV, and TSV have complicated edge cases that matter for “production” but can be avoided if you choose an appropriate format for your data […] XML also has complicated edge cases, but it has no simple cases: you must use an XML parser, and all existing XML parsers are awkward. […]

                                                    The first category has complicated edge cases, but you get to ignore them by changing the requirements. The second category has complicated edge cases… but you’ve arbitrarily argued that you can’t change the requirements to ignore them?

                                                    For example, if quoting is hard, then don’t do it? Item descriptions are optional! And your proposed alternate-history RSS formats don’t include one.

                                                1. 1

                                                  I have been spending weeks on an admin interface for a tiny proxy that we only need because the deadline for a project requiring a feature from an internal third-party service falls a few months before the earliest point at which the developers of that service say they might be able to start working on that feature. So the proxy implements the feature – about 50 lines of code, 40 of which would be unnecessary if the feature were implemented where it was intended – and we’ve got something like 400 lines of internal admin UI in case, during the few months this code will be running, somebody wants to manually edit our database without installing an SQL client. Even so, 90% of the work on this project has been chasing devops people to remind them that they have promised to spin up machines and create security groups.

                                                  1. 6

                                                    It’s true that semantic versioning doesn’t reliably indicate compatibility. However, the idea that newer versions are always better (or that anybody wants to be up to date with upstream at all times) is pretty naive.

                                                    A newer version of a piece of third-party software often means that its own minimum dependency set has shifted, for one thing – so if you require compatibility with version 3.1 of package X, and upgrading your other dependency Y from 4.5 to 4.6 now requires X version 3.2, which has some major known bug or some weird compatibility problem, you’re fucked. Projects also tend to grow – so maybe you don’t personally know of a bug in 3.2, but you know that the source code of 3.2 is three times the size of 3.1 and adds six hundred features you don’t need or want, all of which represent new vulnerability surface.

                                                    In my experience, this cautious attitude is common in industry. We still run python 1.6 (not 2.6, that’s not a typo – 1.6) and we still run code on it, because somebody on our team exhaustively audited the entire python 1.6 codebase decades ago and determined it was safe to install on that machine, and he refuses to do the same for later versions and so we don’t use them in that flow. We have a lot of dependencies that are like that – where we had to exhaustively determine some third party thing was safe under certain circumstances, and when that package got harder to audit we froze it and stopped supporting newer versions. New releases mean new, often unknown bugs. If a piece of software has been in heavy use for 15 years, you may well know all of the easily-exploited bugs in it and be able to determine that none of them matter.

                                                    The function of versioning is not to say which version is newer, but to allow you to specify which version(s) work – to give you a language for talking about dependency webs.

                                                    Semantic versioning lets a package maintainer hint to the developer of a new project (and the creator of that project’s dependency web) about which version ranges are likely to be compatible – something that this developer nevertheless needs to test themselves. I think software engineers are, generally speaking, well aware of the fact that documentation (let alone hints made by numbering) can’t be blindly trusted – we’ve all run into situations where a piece of software doesn’t work the way it’s supposed to, and isn’t compatible with the things it claims compatibility with, or even claims to implement standards or specifications that it does not.
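
                                                    As a concrete (and hedged) illustration of what “hinting a range” looks like in practice – this sketch assumes the third-party packaging library, but any semver tooling reads about the same:

                                                    # A version specifier is a hint about which releases *should* be compatible;
                                                    # it is not a guarantee, and you still have to test what you actually pin.
                                                    from packaging.specifiers import SpecifierSet
                                                    from packaging.version import Version

                                                    compatible = SpecifierSet(">=3.1,<4")   # "the 3.x line, from 3.1 onward"
                                                    print(Version("3.2") in compatible)     # True  -- hinted as compatible
                                                    print(Version("4.0") in compatible)     # False -- hinted as breaking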

                                                    There are ways to hint compatibility with all sorts of different granularities. For instance, you can claim compatibility with a particular standard (in which case, theoretically, your software should interoperate not just with previous and future versions of the same software but similar software written by other people). Or, you can expose interfaces and document their behavior (promising that this version of the software will work with other software that calls these functions in these particular ways). This is because maintaining compatibility is a constant problem that no automated system can currently solve, and hints are useful for helping humans solve it. (Sometimes, the appropriate response to some insane malformed input is to crash! Sometimes, a piece of software doesn’t need to be secure against fuzzers because it sits on an airgapped machine and gets all its input through a hex keypad!)

                                                    Eschewing hints about compatibility in favor of always being compatible with only the newest version of whatever trash some incompetent third party puts out means you’re continuously rewriting the least interesting parts of your own software in order to track somebody else’s software – which, in the normal case, is literally only getting worse over time.

                                                    1. 1

                                                      Could you share more of your organization’s backstory around this:

                                                      We still run python 1.6 (not 2.6, that’s not a typo – 1.6) and we still run code on it, because somebody on our team exhaustively audited the entire python 1.6 codebase decades ago and determined it was safe to install on that machine, and he refuses to do the same for later versions and so we don’t use them in that flow.

                                                      1. 2

                                                        There isn’t much to say.

                                                        We put an internal wiki (which had python as a dependency) on a machine that was also running critical business logic, back before python 2.x came out, and the guy who audited the implementation of python to make sure that it was safe to do this was so disgusted by the state of the 1.6 code that, although he ultimately decided it was safe to run python internally on non-internet-facing hardware, he refused to audit any future implementations & refused to allow any non-audited code on that machine.

                                                        Up until a couple of years ago, it was pretty normal in our organization to have 10+ year old versions of third party software anyway, basically because we audited most third party code & we needed everything to interact. So we were limited not so much by the last version of python we audited (since we had very little python code around) as by the last versions of GCC and GLIBC we audited. We were stuck with this until we migrated to LLVM+MUSL (which was easier to audit).

                                                        Python 1.6 is not actually the most interesting case here.

                                                        A few years ago, we had a project to upgrade f77 to support compiling more modern fortran to GCC 2.95-compliant C, in order to replace a largeish third party graph-drawing package written in C with a smaller third party graph-drawing package written in fortran – although we abandoned that project because the graph-drawing package had a bunch of inefficient read code that couldn’t trivially be made to read quickly from a pipe. All of this was basically because nobody wanted to audit gfortran & nobody wanted to audit the original C graph-drawing package.

                                                        We also had this big campaign to replace the GNU core tools with the NetBSD versions of the same, which was partially justified by the average size of each tool in LOC (and thus ease of auditing), although it was probably also partially motivated by licensing concerns. It turns out, though, that nawk, on top of missing a bunch of very useful features from gawk, was slower and had a number of really ugly bugs.

                                                      2. 1

                                                        Are you backporting fixes to your pinned version of Python 1.6? Why or why not?

                                                        How do you weigh the pros and cons of one person doing a one-time extensive audit of Python 1.6 versus many* people using, testing, and improving Python every year?

                                                        By “many” I would estimate:

                                                        • at least hundreds of eyes looking deeply
                                                        • at least thousands developing libraries
                                                        • at least tens of thousands finding bugs
                                                        1. 2

                                                          This particular guy, in part because of his time spent auditing Python 1.6, had a poor opinion of the ability of the hundreds of core python developers to write secure, clean, and efficient C code. (Unsurprisingly, he had an even lower opinion of the competence of the developers of third party libraries for python – who he figured would not be writing python if they were competent. His attitude has softened basically because he’s worked with me & another reasonably competent guy, and we both like python while not ignoring or dismissing its flaws. Nevertheless, we do not have a lot of third party python libraries in use here.)

                                                          I don’t think it’s controversial to say that until nearly 3.x, python was designed and implemented in an ad-hoc manner & that auditing the implementation is difficult for the same reason that weird tricks like pypy work – the ad-hoc-ness of the codebase leads to unexpected behavior and performance characteristics highly dependent on historical accident. His more controversial opinion is that while 3.x is better, it is not better enough to justify the risk of allowing new code to be written in it in our org.

                                                          He allowed 1.6 on one particular machine to support one particular already-existing piece of internal-only code, but forbade us from running any python code in the parts of production he controlled in any other circumstance.

                                                          We now have some python in production, but he doesn’t touch it & it’s completely divorced from the core package set & isolated from our core functionalities. (This is nice for me, because I like developing in python, and because we have other language implementations in production I personally trust a lot less, like perl.)

                                                      1. 1

                                                        I’m @enkiv2@eldritch.cafe. Most of my posts these days are automated or semi-automated (from my massive and ever-growing archive of interesting links).

                                                        I’d recommend following:

                                                        1. 2

                                                          Yes, UX is important, but when UX conflicts with functionality, functionality usually wins. I will give an example.

                                                          https://github.com/rust-lang/rust/pull/21406 is a six-year-old Rust issue. It proposes using Unicode characters in compiler output – not even input. It failed because Unicode output support isn’t universal, and the proposal to detect Unicode output support using the locale wasn’t convincing. This is the reality. And the author of this article wants us to consider Unicode input…
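
                                                          (For concreteness, the kind of locale/encoding sniffing that was proposed looks roughly like this Python sketch – my own illustration of the idea, not code from that PR, and the fallback arrow is made up:)

                                                          # Guess whether stdout can render non-ASCII before emitting fancy output,
                                                          # and degrade to plain ASCII otherwise.
                                                          import sys

                                                          def supports_unicode_output(stream=sys.stdout) -> bool:
                                                              enc = (getattr(stream, "encoding", None) or "").lower().replace("-", "")
                                                              return enc in {"utf8", "utf16", "utf32"}

                                                          arrow = "→" if supports_unicode_output() else "->"
                                                          print(f"help: consider borrowing here {arrow} `&x`")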

                                                          1. 4

                                                            Unicode input is slick in Julia. You just type \cup and get ∪ in any editor with auto completion.

                                                            1. 4

                                                              Julia is a great all-around example of how modern features can be integrated smoothly into a very “conventional” language. It integrates a JIT, good unicode support out of the box, a sane package & module system, an extremely usable REPL (including online documentation and color/highlighting)… It has replaced python as my go-to example of a language that cares about user experience, since the language designers also make an effort to use consistent naming and expose consistent interfaces.

                                                              Since being exposed to Julia, I’ve set a new bar for myself. When I develop new programming languages (which I do, from time to time, as experiments – they are always very fringe, and often there isn’t even enough interest to justify fixing major bugs), I won’t tell anybody about it unless I have a REPL that looks as polished as Julia’s.

                                                              At the same time – Julia is extremely conventional in its syntax. I think this accounts for much of its popularity, but I also think this conventional ALGOL-68-style syntax will hold it back from the full potential of a more flexible language. As far as I can tell, you can’t dump a couple of macro rules into Julia to turn it into a reverse Polish notation language or an APL-like language (the latter being potentially very nice).

                                                              1. 2

                                                                If you want to write Julia but in a different syntax you can use string macros, which sounds a bit shit, but there’s good support for just taking your string macro and making a REPL out of it, which is very nice.

                                                                The base syntax is more flexible than it might first appear, too, if you overload e.g. broadcasting or use juxtaposition multiplication.

                                                                Edit: for practical examples, check out APL.jl and LispSyntax.jl

                                                                1. 1

                                                                  Fantastic!

                                                                  I will need to get back into Julia, once work calms down a bit.

                                                                2. 1

                                                                  I feel like the conventional syntax is a necessary evil for adoption, to bring Fortran/MATLAB/Python/R developers under the same roof as type theorists/PL nerds/lisp hackers. I really wish I could do pattern matching/destructuring without macros, though.

                                                                  1. 1

                                                                    Agreed. I wish we, as an industry, weren’t subject to so much time pressure that an interesting new syntax is a negative though. I dunno about y’all but back before college, learning new programming languages with bizarre syntaxes was literally what I did for fun.

                                                            1. 16

                                                              Perhaps I’m in a foul mood [1] but just once, I would like to see someone not rant, but sit down and actually implement their ideas. Stop complaining, and just do it! Yes, I’ve seen this rant before. I’ve been reading rants like this since 1990 and you know what? NOTHING HAS CHANGED! Why? Probably because it’s easier to complain about it than do anything about it.

                                                              [1] UPS died today, and it took out my main computer.

                                                              1. 8

                                                                Rust tries. Have a look at Shape of errors to come.

                                                                1. 8

                                                                  People have been ranting about it well before 1990, as well.

                                                                  However, it is worth noting that plenty of people have sat down and implemented their ideas. There are programming languages out there that attempt to throw off the points mentioned in this article. (Fortress, for example, uses traditional mathematical notation, with a tool to automatically translate source into a LaTeX file.) But the fact is that none of these experiments have gained much traction – or when they do gain traction (e.g. Perl), they are often widely reviled for those exact traits.

                                                                  Proponents of these ideas will argue that people are too hidebound for new ideas to reach a critical mass of adoption … and while that is certainly a factor, after so many decades of watching this pattern repeat, I have to wonder. I question the initial premise. Would programming languages actually be better if they were written in a proportional font, or required punctuation marks not printed on a standard keyboard? It’s not clear to me that this assumption is true.

                                                                  1. 1

                                                                    Well, “would be better if ” omits the most important point – which is “better for whom”. APL is very popular among fans of APL, who have already done the groundwork of learning new tooling and getting access to specialized characters, which is different from but not fundamentally harder than becoming familiar with our usual UNIX build toolchain.

                                                                    So long as the unstated “whom” factor in this question is “people who are already professional programmers”, radically new ideas will have a hard time gaining popularity. If we expand it to “people who are planning to become professional programmers”, a handful of other technologies start to make the cut based on the ease with which people can pick them up. Our current software ecosystem is optimized by some mix of two main utility functions: “is it convenient for people who have been administering UNIX machines since 1985” and “is it convenient for people who don’t know what a for loop is”.

                                                                    I don’t personally think proportional fonts are a plus. Typography people are big on proportional fonts because they remove ‘rivers’, which can be distracting when you’re looking at a page of text at a distance because part of human nature is to project meaning onto shapes (such as the shape of whitespace), and in densely-packed prose, patterns in whitespace between lines are almost always noise. In source code, patterns in whitespace between lines are basically always intentional and semantically meaningful, and monospace fonts are the easiest way to make such patterns controllable.

                                                                    But, unicode can be fantastic for readability, since it aids in chunking. Where the density of ASCII-based operators in obfuscated perl and in APL-derived languages like J makes code seem “write-only”, turning multi-character operators into single non-ASCII operators makes for code that is functionally read-only. Still, lots of modern languages support unicode operators & have unicode aliases for multi-character operators, and perhaps widescale adoption of these aliases will produce pressure to create specialized extended keyboards that can type them and lower the price of such keyboards. In the meantime, there are (monospace) fonts that render multi-character operators as single characters, simulating composition, and if they didn’t screw up code alignment this would be a fantastic solution.

                                                                    There are a lot of tiny but thriving tech microcultures – forth, smalltalk, APL, prolog, red, and tcl all have them. There are bigger tech microcultures too – the ones around haskell, rust, and erlang. Very occasionally, they seed ideas that are adopted by “the mainstream” but it’s only through decades of experimentation on their own.

                                                                    1. 2

                                                                      Our current software ecosystem is optimized by some mix of two main utility functions: “is it convenient for people who have been administering UNIX machines since 1985” and “is it convenient for people who don’t know what a for loop is”.

                                                                      Any proof of this assertion?

                                                                      1. 1

                                                                        Only anecdotal evidence from 10 years in the industry and another 10 developing software in public, but I imagine other folks on lobste.rs can back me up.

                                                                        I can exhaustively list examples of mediocre tech that has been adopted and popularized for one of these two reasons, but that’s not exactly a proof – it’s merely evidence.

                                                                        1. 2

                                                                          My own anecdotal experience of a similar amount of time (add 6 years of developing software/doing research in academia, subtract accordingly from the others, and sprinkle in some other years to make the math work) is not the same. There are complicated incentives behind software development, maintenance, and knowledge transmission that affect all of these things, and these incentives are not captured in a dichotomy like this. I see it most trotted out when used to justify alternative opinions, the “You just don’t understand me” defense.

                                                                          1. 1

                                                                            I would naturally expect the constraints to be quite different in academic research, & of course, I’ve simplified the two cases to their most recognizable representatives. At the same time, highly competent people who have a lot of social capital in organizations (or alternately, people in very small organizations where there’s little friction to doing wild experiments, or people who have been institutionally isolated & are working on projects totally alone) have more flexibility, and I’ve been blessed to spend much of my time in such situations and have limited my exposure to the large-institution norm.

                                                                            We could instead say that the two big utility functions in industry are inertia & ease of onboarding – in other words, what tools are people already familiar with & what tools can they become familiar enough with quickly. They interact in interesting ways. For instance, the C precedence rules are not terribly straightforward, but languages that violate the C precedence rules are rarely adopted by folks who have already internalized them (which is to say, nearly everybody who does a lot of coding in any language that has them). How easy/intuitive something is to learn depends on what you’ve learned beforehand, and with professional developers, it is often much easier to lean on a large hard-learned set of rules (even if they are not very good rules) than learn a totally new structure from scratch (even if that structure is both complete and reasonably simple). It’s a lot easier to learn forth from scratch than it is to learn Java, but everybody learns Java in college and nobody learns forth there, so you have a lot less friction if you propose that the five recent graduates working on your project implement some kind of 10,000 line java monstrosity than that they learn forth well enough to write the 50 line forth equivalent.

                                                                            As long as there’s a lot of time pressure, we should expect industry to follow inertia, and follow onboarding when it coincides with inertia.

                                                                            Onboarding has its own problems. Take PHP, for instance. A lot of languages make the easy things very easy and the difficult things almost impossible, & undirected beginners lack the experience to recognize which practices don’t scale or become maintainable. I spent a couple years, as a kid, in the qbasic developer ghetto – rejecting procedure calls and loop structures in favor of jumps because qbasic’s versions of these structures are underpowered and because I had never written something that benefitted much from the modularity that structured programming brings. Many people never escape from these beginning-programmer ghettos because they don’t get exposed to better tooling in ways that make them want to adopt it. I might not have escaped from it had I not been susceptible to peer pressure & surrounded by people who made fun of me for not knowing C.

                                                                            And onboarding interacts with inertia, too. PHP and ruby were common beginner’s languages during the beginning of the “web 2.0” era because it was relatively easy to hook into existing web frameworks and make vaguely professional-looking web applications using them on the server side. These applications rarely scaled, and were full of gotchas and vulnerabilities that were as often inherited from the language or frameworks as they were from the inexperience of the average developer. But Facebook was written in PHP and Twitter in Ruby, so when those applications became big and suddenly needed to scale, rather than immediately rewriting them in a sensible way, Twitter spent a lot of time and money on new Ruby frameworks and Facebook literally forked PHP.

                                                                            Folks who are comfortable with all of PHP’s warts move in different circles than folks who are comfortable with all of C’s warts, or UNIX’s, or QBasic’s, but they are unified in that they gained that comfort through hard experience &, given a tight deadline, would rather make use of their extensive knowledge of those warts than learn a new tool that would more closely match the nuances of the problem. (Even I do this most of the time these days. Learning a totally new language in and out can’t be put on a gantt chart – you can’t estimate the unknown unknowns reliably – so I can only do it when not on the clock. And, when I’m not on the clock, learning new languages is not my highest priority. I often would prefer to eat or sleep. I am part of the problem here.)

                                                                            Obviously, most wild experiments in programming language design will be shitty. Even the non-shitty ones won’t immediately get traction, and the ones that do get traction will probably take decades to become popular enough in hobby communities for them to begin to appear in industry. I think it’s worth creating these wild experiments, taking them as far as they’ll go, and trying other people’s wild experiment languages too. The alternative is that we incrementally add new templates to STL (while keeping all the existing ones backward compatible) forever, and do comparable work in every other stack too.

                                                                            1. 3

                                                                              so when those applications became big and suddenly needed to scale, rather than immediately rewriting them in a sensible way, Twitter spent a lot of time and money on new Ruby frameworks and Facebook literally forked PHP.

                                                                              Familiarity isn’t the reason why Twitter and Facebook spent time trying new frameworks and forking PHP. The reason is that these companies believed, from a cost-benefit perspective, that it was cheaper to preserve all of their existing code in the existing language and try to improve the runtime rather than rewrite everything in a different language, and risk all the breakages that come with language transitions. I have friends who were at both companies around that time (including at FB on the teams working on Hack, which was a decent destination for PLT folks back then), and these considerations were well known. Having worked on language migrations myself, I can say that they are very expensive.

                                                                              1. 4

                                                                                Familiarity isn’t the reason why Twitter and Facebook spent time trying new frameworks and forking PHP. The reason is that these companies believed, from a cost-benefit perspective, that it was cheaper to preserve all of their existing code in the existing language and try to improve the runtime rather than rewrite everything in a different language, and risk all the breakages that come with language transitions.

                                                                                There’s no contradiction there.

                                                                                Switching to another language would not, ideally, be a language migration so much as a from-scratch rewrite that perhaps reused some database schemas – all of the infrastructure that you created to insulate you from the language (and to insulate the way you want to do & think about problems from the way the language designers would like you to do and think about problems) would be of no particular use, unless you switched to another language so similar to the original one that there wasn’t much point in migration at all. This is a big project, but so is maintenance.

                                                                                I don’t have first-hand access to these codebases, but I do know that PHP is insecure and Ruby on Rails doesn’t scale – and that solving those problems without abandoning the existing frameworks requires a lot of code that’s very hard to get right. If you knew you were likely to produce something very popular, you wouldn’t generally choose PHP or Ruby out of the gate because of those problems, and conceptually nothing about Facebook’s feature set is uniquely well-suited to PHP (and nothing about Twitter’s is uniquely suited to Ruby).

                                                                                I hear that Twitter eventually did migrate away from Ruby for exactly this reason.

                                                                                The inertia of old code & the inertia of existing experience are similar, and they can also overlap. A technical person on the ground level can have a better idea about whether a total rewrite is feasible than a manager with enough power to approve such a rewrite. And techies tend to be hot on rewrites even when essential complexity makes them practically prohibitive. Facebook may have made the right move, when they finally did move, simply because they have this endless accumulated cruft of features introduced in 2007 that hardly anybody has used since 2008 but that still needs to be supported (Facebook Memories recently reminded me of a note I wrote 12 years ago – around the time I last saw the Facebook note feature used); had they put a potential rewrite on the table in 2006, when the limitations of PHP were already widely known, the cost-benefit ratio might have been very different.

                                                                                I’ve got some first-hand experience with this. Where I work, we had a large C and perl codebase that grew, about 5 years before I joined, when we bought a competitor and integrated their existing large Java codebase. Most of the people who had touched the Java codebase before we inherited it either left the company or moved into management.

                                                                                When I was brought on as an intern, one of my tasks was to look at optimizing the work that one of the large (millions of lines) java projects was handling. It turned out that this project had extremely wasteful retry logic designed to deal with some kind of known problem with a database we hadn’t supported in a decade, and that the very structure of our process was extremely wasteful. I worked alone for months on one component & got this process’ run time down from 8 days to 5 (though my changes were never adopted). Later, I worked with a couple of people to get the process down from 8 days to 1 day by circumventing some of the retry logic & doing things in parallel that could easily be done in parallel. Last year, we moved from our local datacenter to the cloud, and we had to radically restructure how we handled this process, so we rewrote it completely (using a process that was based on something I had developed for mirroring our data in backup data centers) and – the 8 day time period turned into about 20 minutes and the multi-million-line java codebase turned into an approximately one hundred line shell script.

                                                                                I am under the impression that practically every million-line java codebase in my organization can be turned into a one hundred line shell script with equivalent functionality and substantial performance improvements, and that the biggest time and effort sink involved is to reverse engineer what is actually being done (in the absence of the original authors) and whether or not it needs to be done at all. I don’t want to underestimate or understate the detective work involved in figuring out whether or not legacy code paths should exist, because it is substantial and it requires someone with both depth and breadth of experience.

                                                                                This is a bit of a hot take, and it’s not remotely viable in a commercial environment, but I think our software would benefit quite a bit if we were less afraid of rewrites and less afraid of reinventing the wheel. The first working version of any piece of software ought to be treated as an exploration of the problem domain (or at most, a prototype) and should be thrown away and rewritten from scratch using the knowledge gained – now that you know how to make it work, make it right from the ground up, using the tools, techniques, and structure that are suited to the task. This requires knowing a whole range of very dissimilar tools, and ideally would involve being willing to invent new tools.

                                                                                1. 1

                                                                                  I am under the impression that practically every million-line java codebase in my organization can be turned into a one hundred line shell script with equivalent functionality and substantial performance improvements, and that the biggest time and effort sink involved is to reverse engineer what is actually being done (in the absence of the original authors) and whether or not it needs to be done at all.

                                                                                  Ouch. I’ve never had quite this bad an experience, but I’ve had similar experiences in academia. Must have been cathartic to reduce all that extraneous complexity though.

                                                                                  This is a bit of a hot take, and it’s not remotely viable in a commercial environment, but I think our software would benefit quite a bit if we were less afraid of rewrites and less afraid of reinventing the wheel. The first working version of any piece of software ought to be treated as an exploration of the problem domain (or at most, a prototype) and should be thrown away and rewritten from scratch using the knowledge gained – now that you know how to make it work, make it right from the ground up, using the tools, techniques, and structure that are suited to the task. This requires knowing a whole range of very dissimilar tools, and ideally would involve being willing to invent new tools.

                                                                                  I think this boils down to what the purpose of software is. For me, the “prime imperative” of writing software is to achieve a desired effect within bounds. In commercial contexts, that’s to achieve a specific goal while keeping costs low. For personal projects, it can be many things, with the bounds often being based around my own time/enthusiasm. Reducing complexity, increasing correctness, and bringing out clarity are only means to an end. With that in mind, I would not be in favor of this constant exploratory PoC work (also because I know several engineers who hate writing throwaway code and become seriously demoralized when their code is just tossed, even if it’s them writing the replacement). I’m also not optimistic about the state space of tools, techniques, and structure. I don’t actually think there are local maxima much higher than the maximum we find ourselves in now from redoing everything from the ground up, and the work in reaching the other maxima is probably a lot more than the magnitude of difference between the maxima. I do think as a field of study we need to balance execution and exploration more so that we don’t make silly decisions based on what’s available to us, but I’m not optimistic that the state space really has a region of maxima much higher than our own at all, let alone within easy reach.

                                                                                  1. 2

                                                                                    I’m not optimistic that the state space really has a region of maxima much higher than our own at all, let alone within easy reach.

                                                                                    I can’t bring myself to be so pessimistic. For one thing, I’ve seen & used environments that were really pleasant but are doomed to never become popular enough to support development after the current crop of devs are gone, and some of these environments have been around for decades. For another thing, if I allowed myself to believe that computers and computing couldn’t get much better than they are right now, I’d get quite depressed. The current state of computing is, to me, like being at the bottom of a well with a broken leg; to imagine that it could never be improved is like realizing that there is no rescue.

                                                                                    Then again, in terms of the profit motive (deploying code in such a way that it makes money, whether or not anybody actually likes or benefits from using it), I don’t prioritize that at all. It is probably a mistake to ignore or underestimate that factor, but I think most big shifts have their origin in domains that are shielded from it, and so going forward I’d like to create more domains that are shielded from the pressures of commerce.

                                                                                    1. 2

                                                                                      The current state of computing is, to me, like being at the bottom of a well with a broken leg; to imagine that it could never be improved is like realizing that there is no rescue.

                                                                                      This is the problem with these discussions. They really just boil down to internal assumptions of the state of the world. I don’t think computing is broken. I think computing, like anything else, is under a complex set of forces, and if anything, I’m excited by all the new things coming out in computing. If you don’t think computing is broken, then the prospect of an unknown state space with already discovered maxima isn’t a bad thing. If you do, then it is. And so folks who disagree with the state of computing think we need to change directions and explore, folks who agree think that everything is mostly okay.

                                                                                      I don’t prioritize that at all. It is probably a mistake to ignore or underestimate that factor, but I think most big shifts have their origin in domains that are shielded from it, and so going forward I’d like to create more domains that are shielded from the pressures of commerce.

                                                                                      Do you mean specifically pressures of commerce, or pressure in general? There are a lot of people for whom programming is just a means to an end, and so there are still pressures, just maybe not monetary ones. They just slap together some WordPress blog for their club, or write some uninspired CRUD to help manage inventory at their local library. These folks aren’t sitting there trying to understand whether a graph database fits their needs better than bog-standard MySQL; they just care more about what technology does for them than about the technology itself. I don’t think that’s unrealistic. Technology should be an enabler for humans, not an ideal to aspire to.

                                                                                      1. 2

                                                                                        Do you mean specifically pressures of commerce, or pressure in general?

                                                                                        Commerce in particular. I think it’s wonderful when people make elaborate hacks that work for them. Markets rapidly generate complex multipolar traps that incentivize the creation and adoption of elaborate hacks that work for no one.

                                                                                        1. 1

                                                                                          Full agreement about this, personally.

                                                                  2. 3

                                                                    Wholeheartedly agree. This was a very annoying thing to read and moreover I think most of the ideas presented here are actually pretty terrible. I’d love to see the author implement their ideas and prove me either wrong or right.

                                                                    1. 3

                                                                      I’ve been reading rants like this since 1990 and you know what? NOTHING HAS CHANGED

                                                                      Well, that’s not quite the case, because this:

                                                                      If I write int f(X x), where X is an undeclared type, the compiler should not do what GCC does, which is to write the following:

                                                                      error: expected ‘)’ before ‘x’
                                                                      

                                                                      [..]

                                                                      It should say either something both specific and helpful, such as:

                                                                      error: use of undeclared type ‘X’ in parameter list of function ‘f’
                                                                      

                                                                      is now (ten years after this article was written):

                                                                      $ cat a.c
                                                                      int f(X x) { }
                                                                      
                                                                      $ gcc a.c
                                                                      a.c:1:7: error: unknown type name 'X'
                                                                          1 | int f(X x) { }
                                                                            |       ^
                                                                      
                                                                      $ clang a.c
                                                                      a.c:1:7: error: unknown type name 'X'
                                                                      int f(X x) { }
                                                                            ^
                                                                      

                                                                      So clearly there is progress, and printing the actual code is even better than what was suggested in this article IMO.

                                                                      gcc circa 2011 was kind of a cherry-picked example anyway, as it was widely known to be especially horrible in all sorts of ways, including error messages, which were notorious and widely disliked.


                                                                      As for the rest of the article: many would disagree that Perl is “human friendly”, many people find the sigils needlessly hard/confusing, and “all of the ugly features of natural languages evolved for specific reasons” is not really the case; well, I suppose there are specific reasons, but that doesn’t mean they’re useful or intentional. I mean, modern English is just old Saxon English badly spoken by Vikings with many mispronunciations and grammar errors, and then the Normans came and further morphed the language with their French (and the French still can’t speak English!). Just as natural selection/evolution doesn’t always pick brilliantly great designs, neither does the evolution of languages.

                                                                      Why don’t we use hyphens for hyphenation, minus signs for subtraction, and dashes for ranges, instead of the hyphen-minus for all three?

                                                                      As if distinguishing -, −, and – is human friendly… “Oh oops, you used the wrong codepoint to render a - character: yes, we know it looks nearly identical, and yes, we already know what you intended from the context, but please enter the correct codepoint for this character that is not on your keyboard.”
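
                                                                      (For the curious, here are the three look-alikes as the standard library reports them – a quick illustrative Python snippet, nothing more:)

                                                                      import unicodedata

                                                                      for ch in "-\u2212\u2013":   # hyphen-minus, minus sign, en dash
                                                                          print(f"U+{ord(ch):04X}  {ch}  {unicodedata.name(ch)}")

                                                                      # U+002D  -  HYPHEN-MINUS
                                                                      # U+2212  −  MINUS SIGN
                                                                      # U+2013  –  EN DASH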

                                                                      Even if − and – were easy to type, visually distinguishing between the two is hard. I had to zoom to 240% just now to make sure I entered them correctly. And I’m 35 with fairly decent eyesight: in 20 years I’ll probably have to zoom to 350%.

                                                                      1. 2

                                                                        The author is clearly some kind of typography nerd, but there are good ideas embedded in this. French-style quote marks (which look like << and >>) are a much more visually distinguishable version of “smart quotes”, and the idea of having nestable quotes is good enough that many languages implement them (even if they do not implement them in the same way). Lua’s use of [[ and ]] for nestable quotes seems closest to the ideal here. Of course, you’ll still need either a straight quote system or escaping to display a lone quote :)
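
                                                                        A hedged sketch of why distinct open/close marks are attractive (my own toy scanner in Python, not how Lua or any real language does it): with paired delimiters you can track nesting depth instead of reaching for escapes.

                                                                        def read_nested(text: str, open_ch: str = "«", close_ch: str = "»") -> str:
                                                                            # Returns the contents of the first balanced «...» group, outer marks stripped.
                                                                            assert text and text[0] == open_ch
                                                                            depth = 0
                                                                            for i, ch in enumerate(text):
                                                                                if ch == open_ch:
                                                                                    depth += 1
                                                                                elif ch == close_ch:
                                                                                    depth -= 1
                                                                                    if depth == 0:
                                                                                        return text[1:i]
                                                                            raise ValueError("unbalanced quotes")

                                                                        print(read_nested("«outer «inner» text» and more"))   # outer «inner» text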

                                                                        1. 1

                                                                          Yeah, using guillemets might be a slight improvement, but to be honest I think it’s a really minor one. And you can’t just handwave away the input issue: I use the Compose key on X, and typing « means pressing Alt < <; even with this, three keystrokes for such a common character is quite a lot IMO.

                                                                          “Code is read more often than it is written”, yes yes, I know, but that doesn’t mean we can just sacrifice all ease of writing in favour of readability. Personally I’d consider it to be a bad trade-off.

                                                                          1. 1

                                                                            It’s a chicken-and-egg problem between hardware and software, and hardware is a little more centralized so it’s easier for a single powerful figure to solve this problem. If tomorrow Jony Ive decided that every macintosh keyboard would have extra keys that supported the entire APL character set from now on, we’d probably see these characters in common use in small projects within a year, and bigger projects within a couple years, and within ten years (assuming other keyboard manufacturers started to follow suit, which they might) we’d probably see them in common use in programming in any language that supported them (including as part of standard libraries, probably with awkward multi-character aliases for folks who still can’t type them).

                                                                            The same cannot be said for software, as the failure of coordinated extreme efforts by the python core devs to exterminate 2.x over the course of nearly a decade has shown.

                                                                            1. 1

                                                                              Many languages never use guillemets; I never use them when writing Dutch or English, for example. They are used in some other European languages, but it’s nowhere near universal. Adding these characters to these keyboards only for programmers seems a bit too narrowly focused, IMO: most people typing on keyboards are not programmers. Hell, even as a programmer I probably write more English and Dutch than code.

                                                                              This is why the French, German, and English keyboard layouts are all different: to suit the local language.

                                                                              1. 2

                                                                                Folks writing English prose also rarely use the pound, at, caret, asterisk, square bracket, angle bracket, underscore, vertical bar, and backslash characters (and for half a century or more, they have been discouraged by style guides from even learning what the semicolon is for), but every US keyboard has them – mostly because programmers use them regularly. Because they are available, users (or developers of user-facing software) have invented justifications for using at, pound, and semicolon in new ways specific to computing. Any new addition to a keyboard that gets widespread adoption will get used because of its availability.

                                                                                Even emoji are now being used in software (even though they are not only difficult to type on basically all platforms, but also don’t display consistently across platforms and don’t display at all on many).

                                                                                1. 1

                                                                                  That’s true, but for one reason or another they are on the keyboard and (much) easier to input, and people are familiar with them because of that.

                                                                                  Perhaps the keyboard layout should be changed; I wouldn’t be opposed. But I also don’t think this is something that can really be driven by the software community alone – though maybe I’m wrong.

                                                                    1. 2

                                                                      I want to reply to a decade-younger self who wrote a comment to the author on this post. It seems that I sketched three criticisms of the article’s argument.

                                                                      I claimed that programming is mathematics, and thus that UX concerns must bend before the needs of mathematical theorems. However, Past Corbin did not understand that Turing categories have arbitrary syntax; the Church-Turing Thesis is actually a family of proven theorems, not a hypothesis about maths. As a result, while programming is mathematical, the choice of language syntax is quite arbitrary and we have great latitude to choose to make our languages friendlier to humans.

                                                                      I claimed that the dialect of English used in legal writing has poor UX. Combined with the author’s argument that English has been well-worn and had its UX developed by generations of iterative usage, I think that I was trying to make a point about how formality and usage interact, but this is a red herring and not at all a dilemma.

                                                                      I am still very interested in my final argument, though. I pointed out that, unlike here and now at Lobsters, the comment box of past Blogger was configured to show text in a fixed-width font. The post contains a screed against fixed-width fonts in programming, and so I found it not just ironic, but a UX failure on Blogger’s part. Ignoring the irony, the author’s reply is important: They couldn’t change it. UX is thus intimately connected to Free Software and user freedoms.

                                                                      1. 2

                                                                        I claimed that the dialect of English used in legal writing has poor UX.

                                                                        One can argue (and this is almost definitely at least partially correct) that English legalese is better adapted to the purposes of a particular type of person (the professional lawyer whose fluency in legalese, familiarity with best practices around litigation, awareness of common pitfalls and loopholes, ability to navigate the complex formalized social environment of a court, access to databases of precedent and command over teams of paralegals to navigate it on their behalf, and specialized skills in rhetoric allow him to command high pay and special legal privileges) than it is to the normal use case (an untrained layman trying to understand a contract). Law is an interesting field, especially English-style law, since you’ve got an impossibly large pool of technical debt (precedent, which often conflicts and which often must be bent in order to claim that it’s applicable to a case, giving lawyers considerable flexibility in using it) that is sometimes weaponized against opponents but much of the time simply gets in the way.

                                                                        Computing is similar, except that there’s no bar exam for programmers (nor is there disbarment, nor is there programmer-client privilege). We too inhabit a position of power over “regular people” that we got by spending a few decades studying arcane highly-formalized languages (like C++) and the strange formalized rules of conduct that go with them (like “create a branch, make your edits, commit them, push them, and then file a pull request upstream, but search Jira for a possibly overlapping bug report before submitting a patch”). As professional developers, this works out really well for us. And we’re not liable for malpractice either. It kind of sucks for people who didn’t start programming at age nine, though, or who started at nine, stopped at twelve, and tried to pick it up again at eighteen. Those folks will literally never catch up, unless we change the tooling to be more accessible.

                                                                      1. 6
                                                                        1. 3

                                                                          I’ll believe it when I see it. Besides, wasn’t MIT doing AR stuff back in the late 80s already?

                                                                          1. 1

                                                                            I tried a hololens system before I wrote this essay. I wasn’t very impressed. The resolution appeared to be about the same as other head mounted display systems available on the consumer market at the time.

                                                                            I understand that there are a lot of fiddly technical problems that need to be solved in order to get depth of field working properly for virtual objects and so on, and I wish my other head-mounted displays were translucent, but it’s easy to make the case that hololens is a very fancy version of the ceiling-mounted computer-controlled display armatures that Ivan Sutherland was working with in the late 60s.

                                                                            I haven’t seen any indication that the hololens tech is introducing interesting new interface metaphors (the way that, admittedly, Jaron Lanier did for his VR rigs in the 80s). And, many of the technical challenges hololens has been working with were known and addressed by Steve Mann in the 80s and 90s.

                                                                            Certainly, post-1970s advances in the manufacture of integrated circuits gave us access to technologies like micromirror arrays, which I imagine would be useful for translucent displays based around projection.

                                                                            But, there’s no major conceptual jump. Hololens developers are not thinking brand new thoughts that they weren’t capable of thinking before putting on the glasses – while early CS and early computer hardware design was full of these genuinely groundbreaking ideas. By the end of the 60s, mice, pen computing, trackballs, interactive command line interfaces, joysticks, touch interfaces, gesture-based control, handwriting recognition, and selection of icons were all established (with promising prototypes that sometimes worked). A couple of years later, so were piping and other forms of inter-process communication, regular expressions, networking, fat and thin terminals, the client-server model…

                                                                            1. 3

                                                                              I’m slightly biased, because a lot of the Hololens work was done by folks just down the corridor from me, but I also worked with a startup building an AR platform around a decade ago, and Hololens is a massive jump from what they were doing (I’ve only played with Hololens 2, so I don’t know how much better it is than V1). A lot of the novel interface work is still ongoing, but there are some very neat things done to enable it. One of the biggest problems with the older stuff I tried was that you needed to hold your finger vertically to point to things, because the camera obviously couldn’t see your finger in a natural pointing direction when it was occluded by your fist. The Hololens builds a machine-learning model of the shape of your hand and so it can accurately track your finger position from how your knuckles move, even when it can’t see your finger. It’s the first AR display system that I’ve found even vaguely intuitive. I think it needs a few more generations of refinement to get the weight (and price!) down, but I’m pretty enthusiastic longer term.
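
                                                                              For anyone curious what “track the finger from the knuckles” might look like in the abstract, here is a toy sketch (my own invention, just the general shape of the idea rather than the actual Hololens implementation): fit a regression from the visible knuckle landmarks to the fingertip position on frames where the fingertip is visible, then query it when the fist hides the fingertip.

                                                                              ```python
                                                                              # Toy sketch (invented, not the real Hololens tracker): predict an occluded
                                                                              # fingertip position from visible knuckle landmarks via linear regression.
                                                                              import numpy as np

                                                                              rng = np.random.default_rng(0)

                                                                              # Synthetic training data: 3 knuckle landmarks (x, y, z each) -> fingertip (x, y, z).
                                                                              knuckles = rng.normal(size=(500, 9))               # flattened landmark coordinates
                                                                              hand_geometry = rng.normal(size=(9, 3))            # stand-in for the real hand model
                                                                              fingertips = knuckles @ hand_geometry + rng.normal(scale=0.01, size=(500, 3))

                                                                              # Fit on frames where the fingertip was actually visible to the cameras.
                                                                              weights, *_ = np.linalg.lstsq(knuckles, fingertips, rcond=None)

                                                                              # At runtime the fingertip is hidden by the fist: infer it from the knuckles.
                                                                              current_knuckles = rng.normal(size=(1, 9))
                                                                              predicted_fingertip = current_knuckles @ weights
                                                                              print(predicted_fingertip)
                                                                              ```

                                                                              A real tracker would use a full kinematic hand model and a much richer learned prior, but the shape of the problem (infer hidden joints from visible ones) is the same.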

                                                                              This is still incremental improvement, but so is everything since the first stored-program computer.

                                                                              1. 1

                                                                                This is still incremental improvement, but so is everything since the first stored-program computer.

                                                                                This is exactly what I reject.

                                                                                I’ve given a pretty specific idea of what non-incremental means: it’s an expansion of the adjacent possible. I don’t think it’s controversial to say that there are things created after ENIAC that expanded what people could imagine (even as science fiction) – after all, even in science fiction, up until the early 70s all the computers were big hulking centralized things.

                                                                                There was a particular time when we started seeing non-technical people imagine what we’d now call AR – some time between the publication of PKD’s A Scanner Darkly in the 70s and Gibson’s Neuromancer in 1984. VR appeared earlier (maybe in Simulacron-3 and maybe as much as ten years earlier). While home automation was a mainstay of 30s science fiction magazines, what we’d call ubiquitous computing (wherein the devices in your home have some kind of rudimentary autonomy and communicate with each other as individual entities) shows up – alongside a now-familiar freemium model of monetization – in PKD’s Ubik, the very story from which PARC researchers took their inspiration for the name “ubiquitous computing”. Something recognizable as social media showed up in E. M. Forster’s The Machine Stops in 1909, and most of the familiar criticisms of social media (aside from those revolving around monetized surveillance) showed up there; all our current concerns about mass surveillance were present in 20s radio serial episodes about “television”, back before the reality of centralized TV networks and big broadcast towers became part of the collective understanding of what television meant. As soon as landline telephones coexisted in the same homes as radio receivers, we started seeing newspaper comics accurately predicting the design of cellular phones and the social ramifications of their use.

                                                                                Outside of computing – well, we don’t have human cloning, but folks knew animal cloning was theoretically possible for years before it was done to zebra fish in 1957, and we had a good run of science fiction exploring all the possible ramifications of that during the 60s and 70s until finally we emptied the well (after which it was basically all reruns with bigger budgets: Overdrawn at the Memory Bank and The Boys from Brazil became templates for everything clone-related afterward). There was a time before the concept of a human clone, and science fiction that was clone-adjacent before that was very strange – Alraune, the German silent blockbuster horror film about what we’d now call a test-tube baby, used alchemical metaphors. There was a time before the concept of robots, too, and the concept as we know it arrived sometime after R.U.R. (because the robots in R.U.R., like the replicants in Blade Runner, are biological). Modern readers of Brave New World may be confused by Huxley’s description of the process of manufacturing the different classes of humans – but it’s because there was no concept of genetic engineering (although as soon as we knew about DNA encoding – the result of a trendy adoption of cybernetic ideas among the interdepartmental jet-set in the late 50s and early 60s – we started seeing the idea floating around, first in academic discussions and then, quite quickly, in science fiction).

                                                                                Similarly, as soon as Alan Kay saw PLATO’s plasma display, he imagined Dynabook. Whether or not Dynabook was ever created is beside the point. The point is that before he saw PLATO’s plasma display, he could not imagine Dynabook.

                                                                                When’s the last time you saw a piece of technology that made you imagine possibilities you could never have imagined before?

                                                                                It seems like between 1950 and 1980, it happened regularly.

                                                                                1. 1

                                                                                  Another example from outside of computing, since I came across it the other day:

                                                                                  J. D. Bernal’s 1929 book The World, the Flesh, and the Devil describes the concepts and future challenges behind spaceflight, solar sails, the creation of human environments from hollowed-out asteroids, and the production of cyborg bodies for humans – basically, the fundamentals of what we might call modern transhumanist thought (sans mind uploading, which he was unable to predict since he was writing before the first stored-program computer, and genetic engineering, which he was unable to predict because he was writing before the discovery of DNA’s structure). Bernal was a pioneer of X-ray crystallography and brought this expertise to bear on the materials-science side of the work. If I hadn’t read the date of publication, I would have thought that this was a work from the 60s, since that’s when most of these ideas really hit the mainstream.

                                                                                  Just as the 1929 publication date puts Bernal in a position to imagine space rockets and solar sails (and solar power, since the photoelectric effect was already known) but doesn’t give him access to the idea of mind uploading or genetic engineering (which he brushes up against with regard to a discussion of embryonic surgery – the same tech as used in The Island of Doctor Moreau), there was a time when people imagined spaceflight but could not imagine rockets doing it – Jules Verne wanted to use a cannon, while two centuries earlier Wilkins and Hooke (and, drawing from them, de Bergerac) wanted to use sails. Once the rocket formula was developed, we almost immediately saw people changing their imagined form of space flight to rocketry – suddenly we could very clearly see that rockets could achieve escape velocity, given a certain amount of tinkering, and during the first half of the 20th century people like Wernher von Braun and Jack Parsons invented the technologies necessary to fulfill the prophecy that Tsiolkovsky’s rocket equation made.

                                                                                  The situation with rocketry is similar to the one with steam engines, but much more intense. The ancient Greeks had steam engines of a sort; they were toys, because if you tried to get enough force out of them, they’d explode. It took the invention of calculus, well over a millennium later, for somebody to quantify the pressure a boiler needed to withstand in order to get a certain amount of torque, and some advances in steelmaking to get steam engines capable of moving trains. Once we had calculus, it took only decades for the first useful steam engine to show up, and only a century after that for steam locomotives to become commonplace.
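
                                                                                  To put “quantify the pressure a boiler needed to withstand” in modern terms (a back-of-the-envelope formulation of mine, not anything the early engineers wrote down): the thin-walled pressure vessel approximation ties the internal steam pressure P, the shell radius r, and the wall thickness t to the stress the shell material must survive,

                                                                                  \[
                                                                                    \sigma_{\text{hoop}} = \frac{P\,r}{t},
                                                                                    \qquad
                                                                                    \sigma_{\text{axial}} = \frac{P\,r}{2t},
                                                                                  \]

                                                                                  so the boiler holds only as long as the hoop stress stays below the material’s allowable stress, which is exactly the kind of quantitative relationship you need before you can build an engine that doesn’t explode.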

                                                                                  Basically, while some technical advances are incremental because they allow you to advance toward the goal you have in mind, others create a general model that allows you to predict what sorts of things will be possible in the future. You need to have both in order to progress: the latter type, which forms the outlines of the world-picture, and the former, incremental type wherein you color in the shapes that have already been drawn.

                                                                                  We have structured our society so that only coloring within the lines is rewarded, and software has a worse case of this than most other fields of similar age. What we’re doing with computing is the equivalent of saying that rockets are only good for fireworks.

                                                                            2. 1

                                                                              Redmond

                                                                            1. 2

                                                                              Nope, no innovation at all.

                                                                              https://www.microsoft.com/en-us/hololens (AR Headset)

                                                                              https://www.oculus.com/ (VR Headset)

                                                                              https://en.wikipedia.org/wiki/AlphaGo (NN which beat Lee Sedol at Go)

                                                                              https://en.wikipedia.org/wiki/BERT_(Language_model) (SOTA language model)

                                                                              https://en.wikipedia.org/wiki/GPT-3 (Text generation NN)

                                                                              https://waymo.com/ (Self-Driving Cars)

                                                                              https://nest.com/ (Voice controlled home assistants, thermostats, and more)

                                                                              But, using 70s tech to make 60s tech bigger (ex., deep neural networks) isn’t innovation — it’s doing the absolute most obvious thing under the circumstances, which is all that can be defended in a context of short-term profitability.

                                                                              Much easier said than done. Deep Learning was dead in the water for a long time, and it wasn’t until the widespread adoption of ReLU (https://en.wikipedia.org/wiki/Rectifier_(neural_networks)) as an activation function that we were really able to train a deep NN quickly enough to achieve useful results, whether we had access to GPUs for training or not.
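
                                                                              A quick sketch of why the activation function mattered so much (illustrative numbers only, and ignoring the weights): a sigmoid’s derivative never exceeds 0.25, so the gradient shrinks by at least a factor of four at every layer, while ReLU passes it through unchanged wherever the unit is active.

                                                                              ```python
                                                                              # Illustrative only: how a gradient survives a 20-layer chain of activations
                                                                              # (weights ignored; every layer assumed to sit at the same pre-activation).
                                                                              import numpy as np

                                                                              def sigmoid_grad(x):
                                                                                  s = 1.0 / (1.0 + np.exp(-x))
                                                                                  return s * (1.0 - s)              # never larger than 0.25

                                                                              def relu_grad(x):
                                                                                  return np.where(x > 0, 1.0, 0.0)  # exactly 1 wherever the unit fires

                                                                              depth = 20
                                                                              x = np.array(0.5)                     # a typical pre-activation value
                                                                              print("sigmoid chain:", float(sigmoid_grad(x)) ** depth)  # ~2e-13: vanished
                                                                              print("relu chain:   ", float(relu_grad(x)) ** depth)     # 1.0: intact
                                                                              ```

                                                                              That vanishing factor, compounded over depth, is a big part of why pre-ReLU deep nets were “dead in the water”.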

                                                                              1. 1

                                                                                You’re exhaustively listing (impressive) incremental advances, which even laymen in the 60s could have predicted were possible. There are no paradigm shifts in this list.

                                                                              1. 2

                                                                                Major innovations you use every day off the top of my head:

                                                                                • cloud computing, in the sense of massive-scale data centers built out of heterogeneous, disposable commodity hardware
                                                                                • the scroll wheel (invented in 1989 at Apple, re-invented a couple times, popularized with the MS Intellimouse)
                                                                                • the @-mention, which grew somewhat organically out of Twitter social practices but is now a ubiquitous feature of communications.
                                                                                1. 3

                                                                                  I wouldn’t call any of those “new innovations”. Cloud computing is the centralized-data-center model. There were cranks for scrolling through code as gag hardware in the 60s. The @-mention is not fundamentally different from notification practices on nick mentions on IRC. More importantly, none of these made a formerly unimaginable world trivially imaginable.