Threads for fanf

    1.  

      RISC-V is like quantum…always just a few years away

      1. 7

        I think this is completely untrue. RISC-V is real, exists, works, there are hardware products being built, embedded into mass produced devices, etc.

        It’s just that in the space most of us are mostly interested in - modern high performance CPUs - the instruction set is maybe 1% of the story. Modern CPUs are among the most elaborate artifacts of human engineering, the result of decades of research and improvement, and part of a huge industrial and economic base built around them. That’s not something that can be significantly altered rapidly.

        Just look at how long it took Arm to get into high performance desktop CPUs. And there was a big and important business behind it, with strategy and everything.

        1. 6

          They’re not asking for high-performance desktop CPUs here though. Judging by the following statement on IRC:

          <q66> there isn't one that is as fast as rpi5
          <q66> it'd be enough if it was
          

          it sounds like anything that even approaches a current day midrange phone SoC would be enough. RPi5 is 4x Cortex-A76, which is about a midrange phone SoC in 2021.

          1.  

            Last I checked, the most commonly recommended RISC-V boards were slower than a Raspberry Pi 3, and the latest and greatest and most expensive boards were somewhere between an RPi 3 and an RPi 4. So yeah, pretty slow.

          2.  

            Beyond microcontrollers, I really haven’t seen anything remotely usable. I’d love to be wrong though.

            I tried to find a Pi Zero replacement, but the few boards available all had terrible performance. The one I bought in the end turned out to have an unusable draft implementation of the vector instructions, and it’s significantly slower than any other Linux board I’ve ever used, from IO to plain CPU performance. Not to mention the poor software support (I truly despise device trees at this point).

            1.  

              Just for a data point, the RP2350 chip in the Raspberry Pi Pico 2 includes a pair of RV32IMACZb* cores as alternates for the main Cortex-M33 cores: you can choose which architecture to boot into. The Pico 2 costs $5 in quantities of 1.

          3.  

            This does seem interesting – however, this isn’t anything new as a concept.

            It’s been around on X11 for years as virtual desktop support. What makes this “interesting” is how applications are arranged.

            Now, under Wayland, this approach is easy as there are fewer restrictions around how applications are sized. I hope this changes…

            1.  

              Since when did X11 virtual desktops extend infinitely? As far as I remember, every virtual desktop implementation in X11 WMs was just a set of separate workspaces you could switch between.

              1.  

                The old twm window manager has a big canvas virtual desktop instead of little boxes. Fvwm can be configured either way https://www.fvwm.org/Man/fvwm3/#_the_virtual_desktop

            2.  

              Either I have Stockholm syndrome or this is not really too bad. The Java codebase I work on is 25 years old and also has large areas without test coverage (smaller, recently!), was only updated from Java 8 to 11 last year, and had portions generated with a JavaCC version from the 90s that could no longer be found anywhere. Tens of thousands of IDE warnings are not a big deal. At a certain point, once all the original developers leave, you start to be concerned about minimizing diffs, to preserve the ability to bisect and find what changes introduced a bug - thus large changes such as reformatting or fixing IDE warnings become untenable. This is a sort of local maximum of development efficiency which can really only be escaped by investing in writing truly staggering quantities of unit tests, so that newly-discovered bugs can be analyzed by writing another test instead of bisecting through changes. And we are doing that, but it just takes a while.

              Me writing all that was just an indicator of getting old, though. I remember coming out of college and hating warnings. They are simple and easy to fix without understanding the codebase, which seems overwhelming by comparison. Fixing all -Wall compiler warnings in a large C++ project was my first task upon joining Azure out of undergrad. Some day all the code you write will, unchanged, become riddled with IDE warnings as language capabilities shift over time. And a new grad will ask how you can tolerate it.

              1.  

                I agree that the codebase isn’t bad for like… a standard Java codebase. However I would have expected that a high-stakes nuclear research facility would have had a better codebase? I’m just surprised that bad code is this pervasive.

                1.  

                  One thing they don’t really teach you in undergrad is that old code is usually good code, even if it doesn’t have tests and comes with loads of compiler warnings. Time, ultimately, is the greatest judge of quality and code that has been battle-tested and found/refined to withstand the rigor of real-world use is good code in almost every way that matters. For empirical support of this sensibility you can look at Google’s RIIR experience reports where they talk about targeting new C++ code for Rust rewrites, because the bug density in old code is much lower.

                  1.  

                    Google’s experience is based on their old code being maintained to a high standard. Google’s old code is not like the kind of old code that developers are scared to change.

                    Old code in C or C++ that has lots of warnings is probably also riddled with undefined behaviour that a compiler upgrade is likely to change from latent bugs into actual bugs.

                    I would expect old Java code to be much less problematic, but on the other hand there must be some reason why upgrading from Java 8 is so difficult. If old code can’t run on new systems it must be approaching the other end of the bathtub curve.

                    1.  

                      battle-tested and found/refined to withstand the rigor of real-world use

                      You are not describing research code. Scientists are generally concerned with publishing their next paper, not with maintaining software which often has no users at all outside the lab. Code is a means to an end, and usually seen as somewhat incidental to the business at hand. Even computer scientists tend to write sloppy, throwaway research prototype code, in my experience.

                    2.  

                      having grown up near such a facility, and knowing how fast and loose they tend to play with radiological materials, let alone perl scripts, i’m utterly unsurprised, but i guess for people without prior exposure it would be a shock, yeah

                      ime most processes in most organizations, code or otherwise, are held together with duct tape and prayer to varying degrees of literalism

                  2.  

                    I think part of the answer to the mystery is,

                    • Netscape’s JS engine recognises <!-- as a comment

                    • as mentioned in the link in the post about wrapping JS in comments, the convention was to write the HTML comment terminator inside a JS comment like //-->, not bare as in the article itself

                    • so Netscape’s JS engine needed a special case for the HTML comment starter so that it was possible to hide scripts from non-JS browsers, but it didn’t need a special case for the HTML comment terminator since it could already be hidden from the JS

                    • the remaining mystery is why bare --> was added to JS nearly two decades later

                    1.  

                      I suppose if you’re not going to parse the script tag contents with an HTML parser anymore, then both the start and end tokens need to be valid tokens in the javascript parser. Otherwise you’d need to parse the script contents with the HTML parser, and then reparse that parsed HTML with the javascript parser.

                      And I expect that is to avoid having to write a spec that requires switching parsers mid-stream.

                      The interesting aside for me here is that there are probably still old pages that don’t necessarily have the HTML comment end token with all those requirements met.

                    2. 5

                      In this context it’s useful to distinguish between an AST and a concrete syntax tree: a CST includes all the details of spelling, spacing, and commentary that are omitted from an AST.

                      1.  

                        The article does mention the term “concrete syntax tree”:

                        An AST that retains this level of information is sometimes called a “concrete syntax tree”. I do not consider this a useful distinction

                        But I actually disagree with your usage, and the article’s usage! I would say it’s more like:

                        1. Parse tree aka concrete syntax tree
                          • this tree is homogeneously typed, and has the derivation of the node from the grammar.
                          • heuristic: it contains parens like 1 * (2 + 3), but does NOT contain comments.
                          • a CST is created by many parser generators: CPython prior to 3.8, ANTLR, … - namely the ones without semantic actions
                        2. Abstract syntax tree
                          • heterogeneously typed, used to interpret or compile
                          • heuristic: contains NEITHER parens nor comments - neither is relevant at runtime
                        3. “lossless syntax tree”
                          • used for physical transformations of the code, like formatting and translation
                          • heuristic: contains BOTH parens and comments
                          • used by Clang, and Go

                        I made up the “lossless syntax tree” term based on what I knew about Clang and Go: From AST to Lossless Syntax Tree (2017, wow 8 years ago now)

                        Go was the language that popularized auto-formatting, and Clang came quickly after.

                        So you can quibble with my terminology, but if you Google it, it’s reasonably popular, though by no means standard.

                        Wiki page - https://github.com/oils-for-unix/oils/wiki/Lossless-Syntax-Tree-Pattern


                        All these sources agree that a CST is a synonym for “parse tree”, and contains the derivation:

                        https://en.wikipedia.org/wiki/Parse_tree

                        A parse tree or parsing tree[1] (also known as a derivation tree or concrete syntax tree) is an ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar.

                        Abstract vs. Concrete Syntax Trees - https://eli.thegreenplace.net/2009/02/16/abstract-vs-concrete-syntax-trees/

                        What’s the difference between parse trees and abstract syntax trees (ASTs)? - great diagram, and cites a book “Compilers and Compiler Generators”


                        I checked the terminology of the Dragon Book specifically for this comment:

                        • “parse tree” appears in the index, but “concrete syntax tree” does not appear in the index
                        • “Syntax tree” refers to “abstract syntax tree”

                        So the Dragon Book is making what I call the CST vs AST distinction, with slightly different terms.

                        Additionally, I make a “lossless syntax tree” distinction.

                        The key difference is that the Dragon book only talks about compiling, not formatting or translating. I think “lossless syntax tree” is useful for talking about what Clang and Go do – they are neither CSTs nor ASTs.

                        Their trees don’t “recapitulate” the grammar – quite the contrary, their parsers are hand-written. But they contain all the physical details of the code – not just parens, but also whitespace and comments.
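
                        If you want to poke at the Go side of this, here’s a tiny sketch (my own illustration, not taken from any of the links above) – Go’s parser only keeps comments in the tree when you ask it to:

                          package main

                          import (
                          	"fmt"
                          	"go/parser"
                          	"go/token"
                          )

                          func main() {
                          	src := "package p\n\n// keep me\nfunc f() int { return 1 * (2 + 3) }\n"

                          	fset := token.NewFileSet()
                          	// parser.ParseComments tells the parser to keep comments in the tree;
                          	// positions live in the FileSet. Together, that's what gofmt-style
                          	// tools use to print the file back out.
                          	f, err := parser.ParseFile(fset, "p.go", src, parser.ParseComments)
                          	if err != nil {
                          		panic(err)
                          	}
                          	for _, cg := range f.Comments {
                          		fmt.Printf("%s: %q\n", fset.Position(cg.Pos()), cg.Text())
                          	}
                          }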

                      2.  

                        This looks like a solution looking for a problem. I’ve heard lots of things about grpc being absurdly complex. I think it’s cool, but I think I’ll stick with SSH. Thank you very much.

                        1.  

                          They said it pretty clearly

                          This is a demo program to show how to use the gokrazy/rsync module over a gRPC transport.

                          1.  

                            Indeed :) The point is not to get anyone to use rsync over gRPC, but to demonstrate that if you are already working in a corporate (or similar) environment with a landscape of RPC services using gRPC, Thrift, etc., you can now also speak rsync protocol over those channels, if it helps your project.

                          2.  

                            Upstream SSH with the static buffers or hpn-ssh?

                            I just use upstream and accept the inefficiency these days. But if I was doing more performance critical things, it would be an issue.

                            1.  

                              When scp is too slow, use qcp which is designed to do well with high-bandwidth high-latency lossy links.

                          3.  

                            Is it widely known that golang html/template has context-sensitive escaping?

                            i.e. it knows what kind of escaping to perform depending on where in the template the substitution occurs?

                            https://pkg.go.dev/html/template

                            This package understands HTML, CSS, JavaScript, and URIs. It adds sanitizing functions to each simple action pipeline … At parse time each {{.}} is overwritten to add escaping functions as necessary.
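
                            A minimal sketch of what that looks like in practice (my own example; I’m going from memory on the exact output, so treat the comments as approximate):

                              package main

                              import (
                              	"html/template"
                              	"os"
                              )

                              func main() {
                              	// The same value is substituted in two different contexts: a URL
                              	// attribute and element text. html/template picks the escaping for
                              	// each substitution when the template is parsed.
                              	t := template.Must(template.New("demo").Parse(`<a href="{{.}}">{{.}}</a>` + "\n"))

                              	// In the href (URL) context a javascript: URL gets filtered out
                              	// (replaced with "#ZgotmplZ", if I remember right), while in the
                              	// text context the same string is merely HTML-escaped.
                              	if err := t.Execute(os.Stdout, "javascript:alert('hi')"); err != nil {
                              		panic(err)
                              	}
                              }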

                            When I learned this, I was quite surprised/pleased/amazed.

                            Is this feature present in other templating libraries?

                            1.  

                              This language-aware and context-aware safe interpolation is a thing that Google’s templating stuff has been doing for a long time. My link log has this caja document from 2008 and this Google Security blog post from 2009.

                              I remember at the time being surprised and impressed at the huge amount of engineering that must have gone into this work. When the “langsec” slogan emerged a few years later, I thought, oh yes, a principled and generalized approach to that Google templating stuff. It’s still a lot more work than smooshing strings together which is why it never really took off.

                              1.  

                                Thank you! I didn’t realise it was a “google thing” as opposed to a “go thing”.

                                But that still leaves the question open - is this a rare feature in other language’s favourite templating systems?

                                (I briefly looked at jinja and it suggests it will auto-escape >, <, & and ", but not in a context-aware way? This would leave it open to some of the threats described in your link (e.g. a url containing a javascript: scheme). Oh…it looks like react does a (more limited?) version of it browser-side (https://pragmaticwebsecurity.com/articles/spasecurity/react-xss-part1.html), so perhaps it is only rare/unusual on the backend?)

                                which is why it never really took off.

                                But in some sense it has, since it is baked into go stdlib? Or have I misunderstood?

                                1.  

                                  It didn’t take off in the sense that context-oblivious templating is very common and hasn’t been displaced except in contexts where someone has done a lot of engineering work to build a better foundation that’s easier to use.

                                  There are some areas where langsec is doing well, such as binary data. Generic formats like json and cbor help — but the yaml devops world is a disaster. JSX is good.

                                  Langsec approaches to stringy data are very application-specific because they require parsers and serializers for complicated human-oriented languages. So the application has to be popular enough to justify the work — which is true for the web.

                              2.  

                                I think it is (at least for people who’ve taken any look at the documentation). Though it’s also not something that I’d depend on (it’s trivial and not very error-prone to mark it manually, if in the template).

                                1.  

                                  Though it’s also not something that I’d depend on (it’s trivial and not very error-prone to mark it manually, if in the template)

                                  Sorry, I’m not sure I understand.

                                  Do you mean doing the escaping manually (and using a non-escaping template lib - say text/template - to avoid double-escaping instead)? And doing so to avoid relying on the correctness of the context-sensitivity in html/template?

                                  1.  

                                    I meant that I like context-sensitivity, but I could do without by e.g. writing {{ escape_param(blah) }}
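
                                    (escape_param is made up, but in Go you’d wire something like it up via text/template’s FuncMap. A quick sketch, which also shows the mistake the context-aware version would have caught:)

                                      package main

                                      import (
                                      	"os"
                                      	"text/template"
                                      )

                                      func main() {
                                      	// text/template does no automatic escaping, so the template author
                                      	// has to call an escaping helper explicitly. "escape_param" is the
                                      	// made-up name from the comment above, wired to the stdlib URL escaper.
                                      	funcs := template.FuncMap{"escape_param": template.URLQueryEscaper}

                                      	t := template.Must(template.New("t").Funcs(funcs).Parse(
                                      		`<a href="/search?q={{escape_param .}}">{{.}}</a>` + "\n"))

                                      	// Prints <a href="/search?q=tom+%26+jerry">tom & jerry</a>
                                      	// ...note the second {{.}} was left unescaped by mistake, which is
                                      	// exactly what html/template's context awareness protects against.
                                      	if err := t.Execute(os.Stdout, "tom & jerry"); err != nil {
                                      		panic(err)
                                      	}
                                      }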

                                    1.  

                                      OK, yes. But I think across the industry as a whole it is a significant issue, so many people don’t find it that easy?

                                      https://www.computer.org/publications/tech-news/trends/understand-cross-site-scripting-threats

                              3.  

                                Is this the first big Go project at Microsoft? I had no idea they could choose Go; I would have assumed it just wouldn’t be acceptable, because they wouldn’t want to unnecessarily legitimize a Google project.

                                1. 12

                                  Their web browser runs on Chromium, I think it’s a little late to avoid legitimizing Google projects

                                  1.  

                                    They had already lost the web platform by the time they switched to Chromium.

                                    When I think of MS and ‘developers developers developers develop’, I was led to believe that they’d have more pride in their own development platforms.

                                    I wonder if the switch to Chromium, the embrace of Linux in Azure, the move from VS to VSCode, any of these, would have happened under Ballmer or Gates. I suspect Gates’ new book is like Merkel’s: retrospective but avoiding the most controversial questions (Russian gas for her, giving up dogfooding for him). Am I foolish to expect a more critical introspection?

                                  2. 11

                                    I think corporate open source is now “proven”, and companies don’t worry so much about this kind of thing.

                                    Google has used TypeScript for quite a while now (even though they had their own Closure compiler, which was not staffed consistently, internally)

                                    All of these companies are best thought of as “frenemies” :) They cooperate in some ways, and compete in others.

                                    They have many common interests, and collaborate on say lobbying the US govt


                                    When Google was small, and Microsoft was big, the story was different. They even had different employee bases. But now they are basically peers, and employees go freely back and forth

                                    1. 6

                                      Microsoft has actually maintained a security-hardened build of Go for a while now: https://devblogs.microsoft.com/go/

                                      1. 5

                                        I can see a blog post saying they made changes for FIPS compliance, which is usually understood to be for satisfying government requirements and not a security improvement. I can’t see a summary of any other changes.

                                      2.  

                                        MS is a pretty big company, so I wouldn’t be surprised if they have a ton of internal services in Go. Plus with Azure being a big part of their business, I doubt they care what language people use as long as you deploy it on their platforms.

                                        1.  

                                          Can confirm that some internal services were written in go as of a couple years ago. (I worked for Microsoft at the time.) I didn’t have a wide enough view to know if it was “a ton,” though.

                                          1.  

                                            I’m too lazy to look up a source but some of the error messages I’ve seen in azure make it clear Microsoft uses go to build their cloud.

                                            Message like this one:

                                            dial tcp 192.168.1.100:3000: connect: timeout
                                            

                                            Keep in mind Microsoft also maintains their own go fork for fips compliance. I don’t know exactly their use case but I can tell you at Google we did the same thing for the same reason. I think the main fips compliant go binaries being shipped were probably gke/k8s related.

                                            (Edit removed off topic ramble, sorry I have ADHD)

                                            1.  

                                              dial tcp 192.168.1.100:3000: connect: timeout

                                              Could this message just be coming from Kubernetes?

                                          2. 83

                                            The TypeScript dev lead posted this response about the language choice on Reddit, for anyone who’s curious:

                                            (dev lead of TypeScript here, hi!)

                                            We definitely knew when choosing Go that there were going to be people questioning why we didn’t choose Rust. It’s a good question because Rust is an excellent language, and barring other constraints, is a strong first choice when writing new native code.

                                            Portability (i.e. the ability to make a new codebase that is algorithmically similar to the current one) was always a key constraint here as we thought about how to do this. We tried tons of approaches to get to a representation that would have made that port approach tractable in Rust, but all of them either had unacceptable trade-offs (perf, ergonomics, etc.) or devolved in to “write your own GC”-style strategies. Some of them came close, but often required dropping into lots of unsafe code, and there just didn’t seem to be many combinations of primitives in Rust that allow for an ergonomic port of JavaScript code (which is pretty unsurprising when phrased that way - most languages don’t prioritize making it easy to port from JavaScript/TypeScript!).

                                            In the end we had two options - do a complete from-scratch rewrite in Rust, which could take years and yield an incompatible version of TypeScript that no one could actually use, or just do a port in Go and get something usable in a year or so and have something that’s extremely compatible in terms of semantics and extremely competitive in terms of performance.

                                            And it’s not even super clear what the upside of doing that would be (apart from not having to deal with so many “Why didn’t you choose Rust?” questions). We still want a highly-separated API surface to keep our implementation options open, so Go’s interop shortcomings aren’t particularly relevant. Go has excellent code generation and excellent data representation, just like Rust. Go has excellent concurrency primitives, just like Rust. Single-core performance is within the margin of error. And while there might be a few performance wins to be had by using unsafe code in Go, we have gotten excellent performance and memory usage without using any unsafe primitives.

                                            In our opinion, Rust succeeds wildly at its design goals, but “is straightforward to port to Rust from this particular JavaScript codebase” is very rationally not one of its design goals. It’s not one of Go’s either, but in our case given the way we’ve written the code so far, it does turn out to be pretty good at it.

                                            Source: https://www.reddit.com/r/typescript/comments/1j8s467/comment/mh7ni9g

                                            1. 78

                                              And it’s not even super clear what the upside of doing that would be (apart from not having to deal with so many “Why didn’t you choose Rust?” questions)

                                              People really miss the forest for the trees.

                                              I looked at the repo and the story seems clear to me: 12 people rewrote the TypeScript compiler in 5 months, getting a 10x speed improvement, with immediate portability to many different platforms, while not having written much Go before in their lives (although they are excellent programmers).

                                              This is precisely the reason why Go was invented in the first place. “Why not Rust?” should not be the first thing that comes to mind.

                                              1. 9

                                                I honestly do think the “Why not Rust?” question is a valid question to pop into someone’s head before reading the explanation for their choice.

                                                First of all, if you’re the kind of nerd who happens to follow the JavaScript/TypeScript dev ecosystem, you will have seen a fair number of projects either written, or rewritten, in Rust recently. Granted, some tools are also being written/rewritten in other languages like Go and Zig. But, the point is that there’s enough mindshare around Rust in the JS/TS world that it’s fair to be curious why they didn’t choose Rust while other projects did. I don’t think we should assume the question is always antagonistic or from the “Rust Evangelism Strike Force”.

                                                Also, it’s a popular opinion that languages with algebraic data types (among other things) are good candidates for parsers and compilers, so languages like OCaml and Rust might naturally rank highly in languages for consideration.

                                                So, I honestly had the same question, initially. However, upon reading Anders’ explanation, I can absolutely see why Go was a good choice. And your analysis of the development metrics is also very relevant and solid support for their choice!

                                                I guess I’m just saying, the Rust fanboys (myself, included) can be obnoxious, but I hope we don’t swing the pendulum too far the other way and assume that it’s never appropriate to bring Rust into a dev conversation (e.g., there really may be projects that should be rewritten in Rust, even if people might start cringing whenever they hear that now).

                                                1.  

                                                  While tweaking a parser / interpreter written in Go a few years ago, I specifically replaced a struct with an ‘interface {}’ in order to exercise its pseudo-tagged-union mechanisms, together with using the type-switch form.

                                                  https://github.com/danos/yang/commit/c98b220f6a1da7eaffbefe464fd9e734da553af0

                                                  These days I’d actually make it a closed interface such that it is more akin to a tagged union - which I did for another project that was passing around instances of variant structs (i.e. a tagged union), rather than building an AST.

                                                  So it is quite possible to use that pattern in Go as a form of sum-type, if for some reason one is inclined to use Go as the implementation language.
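
                                                  A minimal sketch of the closed-interface flavour (made-up node types here, not the code from the linked commit):

                                                    package main

                                                    import "fmt"

                                                    // The unexported method makes this a "closed" interface: only types in
                                                    // this package can implement it, so the set of variants is fixed, much
                                                    // like a tagged union.
                                                    type Node interface{ isNode() }

                                                    type IntLit struct{ Value int }
                                                    type Add struct{ Left, Right Node }

                                                    func (IntLit) isNode() {}
                                                    func (Add) isNode()    {}

                                                    func eval(n Node) int {
                                                    	// The type switch plays the role of matching on the tag.
                                                    	switch n := n.(type) {
                                                    	case IntLit:
                                                    		return n.Value
                                                    	case Add:
                                                    		return eval(n.Left) + eval(n.Right)
                                                    	default:
                                                    		panic("unhandled variant")
                                                    	}
                                                    }

                                                    func main() {
                                                    	fmt.Println(eval(Add{IntLit{1}, IntLit{2}})) // 3
                                                    }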

                                              2. 34

                                                That is a great explanation of “Why Go and not Rust?”

                                                If you’re looking for “Why Go and not AOT-compiled C#?” see here: https://youtu.be/10qowKUW82U?t=1154s

                                                A relevant quote is that C# has “some ahead-of-time compilation options available, but they’re not on all platforms and don’t really have a decade or more of hardening.”

                                                1. 8

                                                  That interview is really interesting, worth watching the whole thing.

                                                  1. 9

                                                    Yeah Hejlsberg also talks about value types being necessary, or at least useful, in making language implementations fast

                                                    If you want value types and automatically managed memory, I think your only choices are Go, D, Swift, and C# (and very recently OCaml, though I’m not sure if that is fully done).

                                                    I guess Hejlsberg is conceding that value types are a bit “second class” in C#? I think I was surprised by the “class” and “struct” split, which seemed limiting, but I’ve never used it. [1]

                                                    And that is a lesson learned from the Oils Python -> C++ translation. We don’t have value types, because statically typed Python doesn’t, and that puts a cap on speed. (But we’re faster than bash in many cases, though slower in some too)


                                                    Related comment about GC and systems languages (e.g. once you have a million lines of C++, you probably want GC): https://lobste.rs/s/gpb0qh/garbage_collection_for_systems#c_rrypks

                                                    Now that I’ve worked on a garbage collector, I see a sweet spot in languages like Go and C# – they have both value types deallocated on the stack and GC. Both Java and Python lack this semantic, so the GCs have to do more work, and the programmer has less control.

                                                    There was also a talk that hinted at some GC-like patterns in Zig, and I proposed that TinyGo get “compressed pointers” like Hotspot and v8, and then you would basically have that:

                                                    https://lobste.rs/s/2ah6bi/programming_without_pointers#c_5g2nat


                                                    [1] BTW Guy Steele’s famous 1998 “growing a language” actually advocated value types in Java. AFAIK as of 2025, “Project Valhalla” has not landed yet

                                                    1. 5

                                                      and very recently OCaml, though I’m not sure if that is fully done

                                                      Compilers written in OCaml are famous for being super-fast. See eg OCaml itself, Flow, Haxe, BuckleScript (now ReScript).

                                                      1.  

                                                        If you want value types and automatically managed memory, I think your only choices are Go, D, Swift, and C#

                                                        Also Nim.

                                                        1.  

                                                          Also Julia.

                                                          There surely are others.

                                                          1.  

                                                            Yes good points, I left out Nim and Julia. And apparently Crystal - https://colinsblog.net/2023-03-09-values-and-references/

                                                            Although thinking about it a bit more, I think Nim, Julia, (and maybe Crystal) are like C#, in that they are not as general as Go / D / Swift.

                                                            You don’t have both a Foo type and a Foo* type, where the layout is orthogonal to whether it’s a value or a reference. Instead, Nim apparently has value objects and reference objects. I believe C# has “structs” for values and classes for references.

                                                            I think Hejlsberg was hinting at this category when saying Go wins a bit on expressiveness, and it’s also “as close to native as you can get with GC”.


                                                            I think the reason Go’s model is uncommon is that it forces the GC to support interior pointers, which is a significant complication (e.g. it is not supported by WASM GC). Go basically has the C memory model, with garbage collection.
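
                                                            For example, a tiny sketch of what an interior pointer looks like from the Go side:

                                                              package main

                                                              import "fmt"

                                                              type Point struct{ X, Y int }

                                                              func main() {
                                                              	ps := make([]Point, 4)

                                                              	// py is an interior pointer: it points into the middle of the slice's
                                                              	// backing array rather than at a separately allocated object. The GC
                                                              	// has to keep the whole backing array alive while py is reachable,
                                                              	// which is the complication mentioned above.
                                                              	py := &ps[2].Y
                                                              	*py = 42

                                                              	fmt.Println(ps[2]) // {0 42}
                                                              }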

                                                            I think C#, Julia, and maybe Nim/Crystal do not support interior pointers (interested in corrections)


                                                            Someone should write a survey of how GC tracing works with each language :) (Nim’s default is reference counting without cycle collection.)

                                                            1.  

                                                              Yeah that’s interesting. Julia has a distinction between struct (value) and mutable struct (reference). You can use raw pointers but safe interior references (to an element of an array for example) include a normal reference to the (start of the) backing array, and the index.

                                                              I can understand how in Rust you can safely have an interior pointer as the borrow checker ensures a reference to an array element is valid for its lifetime (the array can’t be dropped or resized before the reference is dropped). I’m very curious - I would like to understand how Go’s tracing GC works with interior pointers now! (I would read such a survey).

                                                              1.  

                                                                Ok - Go’s GC seems to track a memory span for each object (struct or array), stored in a kind of span tree (interval tree) for easy lookup given some pointer to chase. Makes sense. I wonder if it is smart enough to deallocate anything dangling from non-referenced elements of an array / fields of a struct, or whether it just chooses to be conservative (and if so, do users end up accidentally creating memory leaks very often)? What’s the performance impact of all of this compared to runtimes requiring non-interior references? The interior pointers themselves will be a performance win, at the expense of using an interval tree during the mark phase.

                                                                https://forum.golangbridge.org/t/how-gc-handles-interior-pointer/36195/5

                                                        2.  

                                                          It’s been a few years since I’ve written any Go, but I have a vague recollection that the difference between something being heap or stack allocated was (sometimes? always?) implicit, based on compiler analysis of how you use the value. Is that right? How easy is it, generally, to accidentally make something heap-allocated and GC’d?

                                                          That’s the only thing that makes me nervous about that as a selling point for performance. I feel like if I’m worried about stack vs heap or scoped vs memory-managed or whatever, I’d probably prefer something like Swift, Rust, or C# (I’m not familiar at all with how D’s optional GC stuff works).

                                                          1.  

                                                            Yes, that is a bit of control you give up with Go. Searching for “golang escape analysis”, this article is helpful:

                                                            https://medium.com/@trinad536/escape-analysis-in-golang-fc81b78f3550

                                                            $ go build -gcflags "-m" main.go
                                                            
                                                            .\main.go:8:14: *y escapes to heap
                                                            .\main.go:11:13: x does not escape
                                                            

                                                            So the toolchain is pretty transparent. This is actually something I would like for the Oils Python->C++ compiler, since we have many things that are “obviously” locals that end up being heap allocated. And some not so obvious cases. But I think having some simple escape analysis would be great.
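
                                                            For reference, a hypothetical main.go in that spirit – not the file from the article, so the exact messages and line numbers will differ:

                                                              package main

                                                              import "fmt"

                                                              func escapes() *int {
                                                              	y := 7
                                                              	return &y // y outlives the call, so the compiler moves it to the heap
                                                              }

                                                              func stays() int {
                                                              	x := [3]int{1, 2, 3} // never leaves the function, so it can stay on the stack
                                                              	return x[0] + x[1] + x[2]
                                                              }

                                                              func main() {
                                                              	fmt.Println(*escapes(), stays())
                                                              }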

                                                            1.  

                                                              Yes, the stack/heap distinction is made by the compiler, not the programmer, in Go.

                                                            2.  

                                                              Why did you leave JS/TS off the list? They seem to have left it off too and that confuses me deeply because it also has everything they need

                                                              1. 5

                                                                Hejlsberg said they got about 3x performance from native compilation and value types, which also halved the memory usage of the compiler. They got a further 3x from shared-memory multithreading. He talked a lot about how neither of those are possible with the JavaScript runtime, which is why it wasn’t possible to make tsc 10x faster while keeping it written in TypeScript.

                                                                1.  

                                                                  Yeah but I can get bigger memory wins while staying inside JS by sharing the data structures between many tools that currently hold copies of the same data: the linter, the pretty-formatter, the syntax highlighter, and the type checker

                                                                  I can do this because I make my syntax tree nodes immutable! TS cannot make their syntax tree nodes immutable (even in JS where it’s possible) because they rely on the node.parent reference. Because their nodes are mutable-but-typed-as-immutable, these nodes can never safely be passed as arguments outside the bounds of the TS ecosystem, a limitation that precludes the kind of cross-tool syntax tree reuse that I see as being the way forward

                                                                  1.  

                                                                    Hejlsberg said that the TypeScript syntax tree nodes are, in fact, immutable. This was crucial for parallelizing tsgo: it parses all the source files in parallel in the first phase, then typechecks in parallel in the second phase. The parse trees from the first phase are shared by all threads in the second phase. The two phases spread the work across threads differently. He talks about that kind of sharing and threading being impractical in JavaScript.

                                                                    In fact he talks about tsc being designed around immutable and incrementally updatable data structures right from the start. It was one of the early non-batch compilers, hot on the heels of Roslyn, both being designed to support IDEs.

                                                                    Really, you should watch the interview https://youtu.be/10qowKUW82U

                                                                    AIUI a typical LSP implementation integrates all the tools you listed so they are sharing a syntax tree already.

                                                                    1.  

                                                                      It’s true that I haven’t watched the interview yet, but I have confirmed with the team that the nodes are not immutable. My context is different than Hejlsberg’s context. For Hejlsberg if something is immutable within the boundaries of TS, it’s immutable. Since I work on JS APIs if something isn’t actually locked down with Object.freeze it isn’t immutable and can’t safely be treated as such. They can’t actually lock their objects down because they don’t actually completely follow the rules of immutability, and the biggest thing they do that you just can’t do with (real, proper) immutable structures is have a node.parent reference.

                                                                      So they have this kinda-immutable tech, but those guarantees only hold if all the code that ever holds a reference to the node is TS code. That is why all this other infrastructure that could stand to benefit from a shared standard format for frozen nodes can’t: it’s outside the walls of the TS fiefdom, so the nodes are meant to be used as immutable but any JS code (or any-typed code) the trees are ever exposed to would have the potential to ruin them by mutating the supposedly-immutable data

                                                                      1.  

                                                                        To be more specific about the node.parent reference: if your tree is really truly immutable, then when you need to replace a leaf node you must replace all the nodes on the direct path from the root to that leaf. TS does this, which is good.

                                                                        The bad part is that then all the nodes you didn’t replace have chains of node.parent references that lead to the old root instead of the new one. Fixing this with immutable nodes would mean replacing every node in the tree, so the only alternative is to mutate node.parent, which means that 1) you can’t actually Object.freeze(node) and 2) you don’t get all the wins of immutability since the old data structure is corrupted by the creation of the new one.
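
                                                                        A toy sketch of that path copying (nothing TS-specific, just to make the sharing concrete):

                                                                          package main

                                                                          import "fmt"

                                                                          // A tiny immutable binary tree, purely for illustration.
                                                                          type Tree struct {
                                                                          	Label       string
                                                                          	Left, Right *Tree
                                                                          }

                                                                          // withRight builds a new root that reuses the untouched Left subtree.
                                                                          func (t *Tree) withRight(r *Tree) *Tree {
                                                                          	return &Tree{Label: t.Label, Left: t.Left, Right: r}
                                                                          }

                                                                          func main() {
                                                                          	old := &Tree{Label: "root", Left: &Tree{Label: "a"}, Right: &Tree{Label: "b"}}

                                                                          	// Replacing the right leaf rebuilds only the root; the "a" subtree is
                                                                          	// shared as-is. If "a" carried a parent pointer it would still point
                                                                          	// at the old root, which is exactly the bind described above: rebuild
                                                                          	// every node, or mutate parent in place.
                                                                          	newRoot := old.withRight(&Tree{Label: "b2"})

                                                                          	fmt.Println(old.Right.Label, newRoot.Right.Label, old.Left == newRoot.Left) // b b2 true
                                                                          }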

                                                                        1.  

                                                                          See https://ericlippert.com/2012/06/08/red-green-trees/ for why Roslyn’s key innovation in incremental syntax trees was actually breaking the node.parent reference by splitting into the red and green trees, or as I call them paths and nodes. Nodes are deeply immutable trees and have no parents. Paths are like an address in a particular tree, tracking a node and its parents.

                                                            3. 7

                                                              You are not joking, just the hack to make type checking itself parallel is well worth an entire hour!

                                                              1. 10

                                                                Hm yeah it was a very good talk. My summary of the type checking part is

                                                                1. The input to the type checker is immutable ASTs
                                                                  • That is, parsing is “embarrassingly parallel”, and done per file
                                                                2. They currently divide the program into 4 parts (e.g. 100 files turns into 4 groups of 25 files), and they do what I’d call “soft sharding”.

                                                                That is, the translation units aren’t completely independent. Type checking isn’t embarrassingly parallel. But you can still parallelize it and still get enough speedup – he says ~3x from parallelism, and ~3x from Go’s better single core perf, which gives you ~10x overall.

                                                                What wasn’t said:

                                                                • I guess you have to de-duplicate the type errors? Because some type errors might come twice, since you are duplicating some work
                                                                • Why the sharding is in 4 parts, and not # CPUs. Even dev machines have 8-16 cores these days, and servers can have 64-128 cores.

                                                                I guess this is just because, empirically, you don’t get more than 3x speedup.

                                                                That is interesting, but now I think it shows that TypeScript is not designed for parallel type checking. I’m not sure if other compilers do better though, like Rust (?) Apparently rustc uses the Rayon threading library. Though it’s hard to compare, since it also has to generate code


                                                                A separate thing I found kinda disappointing from the talk is that TypeScript is literally whatever the JavaScript code does. There was never a spec and will never be one. They have to do a line-for-line port.

                                                                There was somebody who made a lot of noise on the Github issue tracker about this, and it was basically closed “Won’t Fix” because “nobody who understands TypeScript well enough has enough time to work on a spec”. (Don’t have a link right now, but I saw it a few months ago)

                                                                1.  

                                                                  Why the sharding is in 4 parts, and not # CPUs. Even dev machines have 8-16 cores these days, and servers can have 64-128 cores.

                                                                  Pretty sure he said it was an arbitrary choice and they’d explore changing it. The ~10x optimization they’ve gotten so far is enough by itself to keep the project moving. Further optimization is bound to happen later.

                                                                  1.  

                                                                    I’m not sure if other compilers do better though, like Rust (?) Apparently rustc uses the Rayon threading library.

                                                                    Work has been going on for years to parallelize rust’s frontend, but it apparently still has some issues, and so isn’t quite ready for prime time just yet, though it’s expected to be ready in the near term.

                                                                    Under 8 cores and 8 threads, the parallel front end can reduce the clean build (cargo build with -Z threads=8 option) time by about 30% on average. (These crates are from compiler-benchmarks of rustc-perf)

                                                                    1.  

                                                                      I guess this is just because, empirically, you don’t get more than 3x speedup.

                                                                      In my experience, once you start to do things “per core” and want to actually get performance out of it, you end up having to pay attention to caches, and get a bit into the weeds. Given just arbitrarily splitting up the work as part of the port has given a 10x speed increase, it’s likely they just didn’t feel like putting in the effort.

                                                                    2.  

                                                                      Can you share the timestamp to the discussion of this hack, for those who don’t have one hour?

                                                                      1.  

                                                                        I think this one: https://www.youtube.com/watch?v=10qowKUW82U&t=2522s

                                                                        But check the chapters, they’re really split into good details. The video is interesting anyway, technically focused, no marketing spam. I can also highly recommend watching it.

                                                                  2. 5

                                                                    Another point on “why Go and not C#” is that, he said, their current (typescript) compiler is highly functional, they use no classes at all. And Go is “just functions and data structures”, where C# has “a lot of classes”. Paraphrasing a little, but that’s roughly what he said.

                                                                  3. 8

                                                                    They also posted a (slightly?) different response on GitHub: https://github.com/microsoft/typescript-go/discussions/411

                                                                    1.  

                                                                      Acknowledging some weak spots, Go’s in-proc JS interop story is not as good as some of its alternatives. We have upcoming plans to mitigate this, and are committed to offering a performant and ergonomic JS API.

                                                                      Yes please!

                                                                  4. 6

                                                                    Hang on… by “native” they mean “in Go”? Interesting choice!

                                                                    1. 19

                                                                      I imagine it’s easier to port their codebase when they don’t have to deal with adding memory management. Concerns like ownership and reference cycles can affect the way you design the data structures, and make it difficult to preserve an architecture that wasn’t designed that way.

                                                                      I’m kind of hoping that as a result of this project they come up with a TS-to-Go transpiler. That would make TS close to being my dream language.

                                                                      1. 8

                                                                        Hejlsberg addresses the question of a native TypeScript runtime towards the end of this interview https://youtu.be/10qowKUW82U?t=3280s

                                                                        He talks a bit about the difficulties of JavaScript’s object model. If you implement that model according to the JavaScript spec in a simple manner in Golang, the performance will be terrible. There are a lot of JS performance tricks that depend on being able to JIT.

                                                                        What might be amusing is an extra-fancy-types frontend for Golang, that adds TypeScript features to Golang that the TypeScript developers want to use when writing tsgo.

                                                                        1.  

                                                                          he also mentions the syntax-level ts-to-go transpiler they wrote in there; I don’t know the timestamp though

                                                                      2. 7

                                                                        I’m surprised that a group inside Microsoft that’s presumably led by the creator of C# (author of the post) chose Go. Not because I think C# would have been better for a TypeScript compiler, but because I would have guessed C# AOT would have been the default (even if just for intellectual property reasons) and they had good reasons to use something else.

                                                                        Did they prefer Go’s tooling? Was it Go’s (presumably) smaller runtime? Maybe just Go is more mature for AOT (since it was always AOT)?

                                                                        1. 5

                                                                          Good guesses! Kerrick’s comment links to a video interview which addresses this.

                                                                        2.  

                                                                          I was slightly surprised as well. They may have been influenced by esbuild (see Evan Wallace’s summary on why he went with Go over Rust). They may even be reusing some code from or integrating with esbuild in some way, though it doesn’t seem likely to me. My personal preference would lean toward Rust for something like this but I can see why they’d use a native GC’d language.

                                                                          1.  

                                                                            I think if they had the experience and time, they might have chosen Rust, but they wanted to deliver something in under a year, without having to learn much new, for performance gains now.

                                                                        3. 2

                                                                          Great job!

                                                                          I hope this isn’t too off topic but it’s on queues for rails.

                                                          Did anyone switch from Sidekiq to SolidQueue and can share the experience?

                                                          I am curious because, while Sidekiq is nice, I do think that with the limited scope such systems have, having to buy the for-pay features might not be feasible for small or non-profit projects when they rely heavily on queues because of the nature of the application.

                                                                          1. 3

                                                                            In general running your queue on the same database as your app is gonna be a bad time, performance wise. Now, solid queue can go on a different database on the same machine (SQLite) but usually the database is the first thing that needs to be scaled and competing with it for resources is also not a great idea. If you run it, I would run it on a different machine and different than your main database. At that point you have to pay for another box/service for your queue datastore anyway.

                                                                            My preference has been to stick with Sidekiq.

                                                                            1.  

                                                                              In general running your queue on the same database as your app is gonna be a bad time, performance wise.

                                                                              Could you go into that a bit more?

                                                                              Sounds a bit like something breaking down quickly. We are talking about relatively simple tables, with relatively simple queries. Stuff where I’d expect that more complex applications would potentially make dozens of requests (or alternatively somewhat complex joins, conditions and queries), so it feels like a query that just locks the next thing shouldn’t really carry much weight overall.

                                                                              usually the database is the first thing that needs to be scaled

                                                                              That also really depends on the application. Both on doing silly things in application code and doing silly things regarding database query. Doing a DB server with let’s say 1 TB NVMe and 128 GiB and an identical secondary for around ~150 USD can get you really far, even if you do a bit more complex stuff (eg. on-demand GIS with measuring, manipulating and relating polygons) even if you don’t invest a lot of effort into making everything super efficient. And having a couple of hundred thousand entries for your queue doesn’t sound like it should be that big of a deal. And if it is then just use a separate DB for your queue?

                                                                              So in other words: Yes, you are technically competing for resources, but we are talking about a pretty easy task for any DB?

                                                                              1.  

Yeah, it shouldn’t be a problem, but the database needs to be operated properly, e.g. having the right indexes and using SKIP LOCKED for queue ops.

                                                                                https://web.archive.org/web/20160416164956/https://blog.2ndquadrant.com/what-is-select-skip-locked-for-in-postgresql-9-5/

                                                                                (Sadly EnterpriseDB have fucked the 2nd Quadrant blog.)
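
For anyone unfamiliar with the pattern, here is a minimal sketch of a SKIP LOCKED job fetch. It uses Python and psycopg2 purely for illustration; the jobs table and its columns are hypothetical, and this is not how Sidekiq or Solid Queue actually implement their fetch.

    # Illustrative only: claim one pending job without blocking other workers.
    # The "jobs" table (id, payload, done) is a made-up example schema.
    import psycopg2

    conn = psycopg2.connect("dbname=app")

    def claim_next_job():
        with conn:  # commits on success, rolls back on exception
            with conn.cursor() as cur:
                cur.execute(
                    """
                    SELECT id, payload
                    FROM jobs
                    WHERE NOT done
                    ORDER BY id
                    LIMIT 1
                    FOR UPDATE SKIP LOCKED
                    """
                )
                row = cur.fetchone()
                if row is None:
                    return None  # nothing claimable right now
                job_id, payload = row
                # ... process the payload here, inside the transaction ...
                cur.execute("UPDATE jobs SET done = TRUE WHERE id = %s", (job_id,))
                return job_id

The claimed row stays locked until the transaction ends, and SKIP LOCKED means concurrent workers simply grab the next unlocked row instead of queueing up behind each other.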

                                                                                1.  

                                                                                  Yes, but that’s Sidekiq’s or SolidQueue’s job.

Ages ago I built my own queues, both in Redis and PG.

                                                                                2.  

I work for Heroku, where I get performance tickets, and I’ve seen hundreds, if not thousands, of Rails apps in a performance context. I am also on the Nate Berkopec performance Slack. And database load is pretty much the primary bottleneck for most Rails apps.

                                                                                  It’s not about “complex” or “easy” or not, it’s about load and locking. Congestion.

                                                                                  And if it is then just use a separate DB for your queue?

I’m not sure why there’s a question mark. That was a recommendation in my comment. The next logical step though was…if you’re going to use a different data store anyway, why not just use Redis/keyval, which has persistent queue data structures.

                                                                                  One weird thing though: it’s surprisingly hard to migrate off of a queue backend (or can be). It’s easier than migrating from MySQL to PostgreSQL, but perhaps harder than people realize.

                                                                                  We use delayedjob for a legacy but important app, and without realizing it we’re relying on database transaction semantics for some critical behavior. The effort and cost to identify and replicate all that behavior that would allow us to move to a different queue backend isn’t justifiable, so we are kinda just…stuck with it. At least for now, on that one app.

Even migrating from a queue in the same DB to a different DB could yield slightly different behavior. Not saying it can’t be done or that it will be prohibitively difficult or expensive, but rather: make a plan now for how you might scale your database and queue systems in the future.

                                                                                  For me: I choose Sidekiq and keyval/redis. In addition to what I’ve already mentioned, I know I can get support if required and I know it won’t be community abandonware like resque or webpacker. I also have a personal relationship with Mike and hang with him at confs and he’s generally accessible online and active in the communities. Mentioning both for disclosure and to contrast with Dave, who isn’t accessible unless you’re in an inner circle.

                                                                                  1.  

I work for Heroku, where I get performance tickets, and I’ve seen hundreds, if not thousands, of Rails apps in a performance context. I am also on the Nate Berkopec performance Slack. And database load is pretty much the primary bottleneck for most Rails apps.

                                                                                    Are you able to provide more insight there?

In my experience, say you take a DB setup like the above for ~150 USD (or 76 without a replica; in other words, we will only use the replica for failover, no reads from it), and say you have 10k active users a day, giving you an average of 1k DB queries/second with peaks of, let’s say, 10k DB queries/second. In my experience, for relatively basic CRUD apps with little optimization and somewhat idiomatic Rails code throwing out JSON for web and mobile apps to ingest, that will be barely noticeable in terms of DB load. Even if you don’t optimize it. Even if you turn logging up for every statement for monitoring. Even if you use something like ZFS so you can make point-in-time snapshots. Yet your Rails application will start to struggle in peak times, eating up CPU.

                                                                                    But I have to say while I work on a Rails app right now that is way bigger than the above I am certainly not an expert when it comes to Rails or even Ruby.

                                                                                    1.  

Here’s an article that shows load average on basically the smallest CRUD app you can think of, and it’s also open source. In this case the article walks through finding and eliminating a bug/problem, but it shows the correlation between usage and load to help build up intuition.

                                                                                      https://schneems.com/2017/07/18/how-i-reduced-my-db-server-load-by-80/

                                                                                      1.  

                                                                                        That’s 6 queries per minute per user on average, which seems like A LOT. How many queries per page load is that?

                                                                                        1.  

                                                                                          Let’s see

                                                                                          • You do one query for authentication.
• You get something related to the resource, to authorize access to it (if you use any kind of ACL mechanism like cancancan); this might be two queries if, for example, authorization is based on relationships to other users
                                                                                          • You get the thing that the endpoint is about
                                                                                          • It might be something more involved where you end up doing additional queries

                                                                                          That’s already at least 3-4 queries. And that’s the most basic one request can be if you separate authentication and authorization into their own things.

Now think about the fact that a page load often does more than one AJAX request, so you have a multitude of them. E.g. you might have whatever the page is about, then in addition check something like notifications or some inbox. You might have a general endpoint for the current user profile, settings, etc. You might in some situations do non-user-triggered updates, for example to mark stuff as read or similar. In addition, if it’s something social you might have something to give suggestions, etc. So the first page load has multiple of these 3-4-query requests.

Sometimes you also want to do some more generic calls, for example a geoip call that might still be authenticated, or simply loading some news that might still be authenticated or vary based on the user, so again you have to do the authentication part. In addition, sometimes stuff ends up in the queue, to come back to the topic, so that ID has to be queried again. Sometimes you interact with the outside world, so you query, then communicate, then query again. Sometimes stuff is callbacks. Sometimes requests fail and are retried (again, mobile app, so network quality varies; it’s an outdoorsy app). Then you have some caching queries, and since it’s Rails there is the classic touch: true association.

If you then go further, like having activities, you might do stuff where the user uploads something and then uploads pictures individually, so that it’s not one big request but, for a mobile app, several smaller ones, which makes retrying easier when only one of them changes (think wifi-to-cell switch).

Now, if your site is more complex, you multiply those 3-4 queries even further. It’s very different, of course, if you just use Rails directly for rendering. But if you have endpoints for a lot of features, that stuff adds up when they are queried independently so the client can compose them.

So overall you end up with a larger number of initial queries on the first page load (because of the many AJAX calls), and it drops off after that. Same for when the app is opened: a couple of initial queries that mostly do the authentication part and then their main thing.

                                                                                          Just checked. The initial page load for a logged in user is 12 AJAX requests. All with at least Authentication -> ACL -> main request(s). Could be optimized for sure.

But then again, as mentioned, the 75 USD/month primary DB is pretty bored, as it should be. So the need to improve there hasn’t yet arisen. The additional Rails instances are also mostly there for failover. Compared to so many other things, the infrastructure cost is pretty much irrelevant; every few minutes of unnecessary meetings covers the DB costs for months.

All of that is without any requests not initiated by users directly: think of accepting callbacks (payment, etc.), reporting, monitoring, alerting, and similar queries regarding the current state of the DB, or status endpoints that trigger requests polled at regular intervals 24/7. Not all of these go through Rails, of course; a lot of them go either directly to the DB or through something else. And it feels like the queue would be nothing more than one of these small side things.

                                                                                          I hope that clarifies things a bit. :)

                                                                              2. 4

                                                                                This is nice and old-school.

My static site is buckling under the strain of AI scrapers. I wonder when that problem will need addressing at this entry level.

                                                                                but you don’t want to permaban the Google search bot!

                                                                                I’m sorely tempted sometimes. I’m not convinced Google remain relevant in search anymore.

                                                                                1.  

                                                                                  I’ve banned Googlebot on my personal site, though only through robots.txt and not the user-agent sniffing I use for other bots. I don’t really care about it being searchable anymore.

                                                                                  1. 1

If Google isn’t relevant then what is?

                                                                                    1. 7

                                                                                      Decentralised social media, rss feeds, and stuff built around those things to aid in discovery. Things like Lobsters! But it depends on what you want.

                                                                                      1. 4

                                                                                        Many (most?) of the non-Google search engines use Bing as their index.

                                                                                    2. 7

                                                                                      This reminded me about a throwaway paragraph in the Signal crypto review (previously):

                                                                                      It would be much simpler if Signal adopted something like RFC 9420 (Messaging Layer Security), but MLS doesn’t provide the metadata resistance that Signal prioritized.

                                                                                      I wonder what metadata resistance Signal offers that Wire, through its use of MLS, doesn’t?

                                                                                      1. 5

The metadata resistance of Signal is largely mythical anyway, since they necessarily have the metadata via other channels and just pinky-promise not to look at or store it.

                                                                                        1. 2

                                                                                          A source for this claim would be appreciated.

                                                                                          1. 7

You can derive it from necessity if you like. The Signal server sees the message come in over a network connection from an app. The server must be able to deliver it to a target user. This is the metadata. That the message data on the wire doesn’t contain this metadata doesn’t prevent the server from knowing it; it must know it in order to function at all. Signal has never claimed otherwise; they only claim that the server forgets right away. But of course that must be taken on trust.

                                                                                            1. 2

At best, that associates two IP addresses… notwithstanding CGNAT, VPNs, MASQUE, and friends.

                                                                                              But it doesn’t associate them with accounts / contacts. That’s a stronger guarantee than Matrix or XMPP. It may also be a stronger guarantee than Wire?

                                                                                              1. 5

                                                                                                But it doesn’t associate them with accounts / contacts.

                                                                                                That isn’t true. Signal messages need to be routed by account identifier, an IP address is not sufficient. And unless you have the “sealed sender” feature turned on, messages identify their senders.

                                                                                                There’s no mechanism for the Signal server to know the IP addresses of iOS clients because an iOS device only maintains one persistent connection to Apple for notifications. There’s no way a Signal client can keep track of the IP addresses of its contacts, because it isn’t a mesh network, it’s a star. Even for non-iOS devices, an IP address isn’t sufficient to identify a client because (for example) there are multiple clients in our house and our house has only one IP address.

                                                                                                1. 3

                                                                                                  Sealed sender is enabled by default, no?

                                                                                                  1. 2

                                                                                                    So it is. As far as I can tell the official documentation for the feature is still this blog post https://signal.org/blog/sealed-sender/ which makes it sound like the feature is incomplete, but the last few paragraphs say they were (in 2018) rolling it out to everyone so I guess the preview was actually the main event.

                                                                                                    1.  

                                                                                                      I just checked in settings. There’s only “show when it’s used” and “allow for even unknown senders” preferences for me, which makes me conclude that it’s already enabled by default and can not be disabled.

                                                                                                  2.  

                                                                                                    Sealed sender is also not a good protection if Signal was to actually start keeping logs. There are two sources of metadata leakage with sealed sender:

                                                                                                    1. You need to acquire a sender certificate before you can use sealed sender. If you do this from the same IP as you later use when sending a message, your IP and your identity can be linked.

2. When you send a message, the receiver sends a delivery notice back to you. This is a simple correlation: a sealed message to Person A on IP address X from IP address Y is immediately followed by a sealed message from IP address X to Person B on IP address Y.

                                                                                                    1. 0

                                                                                                      Yes, and if you do have Sealed Sender turned on, the only metadata left on the server that’s needed for message delivery is a 96-bit “delivery token” derived from a “profile key” that conveniently rotates whenever you block an account.

                                                                                                      1.  

My reading of the description of sealed sender is that the delivery token is used to check that the sender is allowed to send to the recipient – it’s an anti-abuse mechanism. It is used when the server is deciding whether to accept a message, it isn’t used to decide where to deliver the message.

                                                                                                        1.  

                                                                                                          I was going off the above-linked blog post that dives into the Signal internals.

                                                                                                          1.  

                                                                                                            That is not my reading of the server code for either single or multi-recipient messages. And Signal iOS at least seems to use sealed sender by default, though it falls back to unsealed send if there’s an auth failure, which seems bad. (so the server can force the client to identify itself? … but I also can’t find anywhere that throws RequestMakerUDAuthError.udAuthFailure, so maybe it’s dead code…)

                                                                                                            But I admit it’s a very casual reading of the code!

                                                                                                            edit: found it!

                                                                                                      2. 3

To say what the sibling comment says in a different way: the connection the message is delivered to the server over must be authenticated. If it weren’t, the server would not accept the message (for spam reasons etc.), so the server knows the account of the sender. And it needs to know the account of the receiver for delivery to be possible.

                                                                                                        1.  

                                                                                                          I strongly suspect you’ve misunderstood how Signal works. What do you think about https://soatok.blog/signal-crypto-review-2025-part-8/, specifically the addendum section?

                                                                                                          1.  

That article specifically admits this is true. Signal chooses not to write it down (assuming the published code is what they run), which means it cannot be recovered after the fact (if you trust the server not to have recorded it). Of course, any other operator could also choose not to write this down, and one could choose to trust that operator. It’s not specific to Signal, really.

                                                                                                            1.  

                                                                                                              I believe we agree that the server must know the recipient of a message. I believe we disagree about whether the server needs to know the sender of a message.

                                                                                                              Erm, so what do you mean by authenticated?

                                                                                                              That article notes the sender’s metadata is (e2e) encrypted. The server accepts and routes messages whose envelope includes a delivery token. And, similarly, that delivery token is shared via e2e encrypted sessions to all a recipient’s contacts.

                                                                                                              It’s unclear to me how unknown senders / randos are handled, however. I haven’t read that deep into the code.

                                                                                                      3. 2

Sure, that’s fair.
But I was hoping your claim was more substantial than just this, since, as the child comment below says, almost all Signal competitors suffer from this.

                                                                                                        1. 2

                                                                                                          Not just almost all. It is fundamentally impossible for a communications system to operate if whoever does the routing doesn’t know sender and receiver identity at some point (and send/receive time, which is also metadata)

                                                                                                          If you do onion routing you could make it so only one part knows sender and one part knows receiver, which is how the remailer network worked but that’s the only instance I’m aware of doing that. Everyone else has the metadata and it’s just various shades of promising not to write it down.

                                                                                                          1. 2

                                                                                                            Aren’t there protocols for deniable drop offs on servers and similar? Those wouldn’t scale well, but AFAIK they work. So they are possible (just not practical).

                                                                                                            1.  

                                                                                                              There is SecureDrop, but as far as the technology is concerned it’s a web app accessed via Tor. The rest of the anonymity guarantees come from server-side opsec performed by the recipient org https://docs.securedrop.org/en/stable/what_is_securedrop.html

                                                                                                            2.  

                                                                                                              SimpleX is a chat system that does onion routing. Only two hops, and I am not vouching for anything about the app or its servers; just noting this feature.

                                                                                                              1.  

                                                                                                                They were also recently audited by Trail of Bits, so SimpleX is probably not clownshoes.

                                                                                                          2. 1

                                                                                                            This level of metadata leakage (IP addresses) is also true of nearly every so-called Signal competitor too.

                                                                                                            1. 3

                                                                                                              No one claimed otherwise. The context is the claim expressed above that you get worse metadata resistance than Signal, which seems irrelevant given that Signal doesn’t really have it either.

                                                                                                              1. 1

                                                                                                                Sorry. I hear this line of argument on Hacker News and Reddit a lot, only for the person to turn around and recommend XMPP or Matrix instead. I wanted to cut it off at the pass.

                                                                                                      4. 1

                                                                                                        Look at zkgroup for a deep dive into that question.

                                                                                                      5. 5

                                                                                                        This blog post has far more plot twists than I expected!

                                                                                                        1. 10

                                                                                                          I intended to write it as a single-pass compiler directly from the syntax tree and targeting C via TCC to enable it to run quickly and on any platform. Unfortunately I also designed a language with top-level execution and nested functions - neither of which I could come up with a good compilation model for if I wanted to preserve the single-pass, no AST, no IR design of the compiler. It’s certainly possible there is a model, but my musings couldn’t figure it out.

                                                                                                          I wrote a language with a single pass compiler (compiling to bytecode) and somewhat conventional imperative syntax. The implementation initially went smoothly, and of course the compiler was super fast, with memory allocation only needed to extend the bytecode array. As I extended the language, it became increasingly obvious why compilers for languages with conventional recursive syntax use an AST. I still use the language, because some of my tools are written in it, but I’ve abandoned trying to extend it.

                                                                                                          No AST works well for some syntaxes, like Forth and Brainfuck. In retrospect, that class of syntaxes is not for me.

                                                                                                          My current language uses an AST. It also has “top-level execution and nested functions” and other nice syntax I learned to like from Haskell and other powerful languages. I didn’t experience any of the pain from the previous language implementation (caused by not having an AST). I even figured out how to compile it into C++ and GLSL. Everything went smoothly, until I tried to get ambitious about doing optimization and partial evaluation. I created a kind of IR, but I botched the design. Next language will have a well designed IR.

                                                                                                          I learned that if you want to extend your language to have all the nice things, then you should plan for a multi-pass compiler, it just makes the job so much easier.

                                                                                                          1. 5

                                                                                                            Lua is a nice example of a language with nested functions and a single-pass compiler.

                                                                                                            1. 4

                                                                                                              It depends on your definition of “single-pass”.

                                                                                                              In the early language that I implemented (called Gen), I tried to generate byte code directly from the parser, without first converting the code into an intermediate representation. That’s what I meant by “single pass”. This worked fine for statements, but there were tricky cases involving expressions that caused a lot of pain and code complexity.

                                                                                                              In Lua 5, expressions are converted into an intermediate representation, of type expdesc. Optimizations can be applied to an expdesc before it is converted into bytecode. That would have solved the problem I encountered in the Gen compiler.

                                                                                                              So, by “single-pass” I meant “no intermediate representation”.

                                                                                                              1. 4

                                                                                                                It’s simpler than you make it sound!

                                                                                                                In a language like Lua (or Pascal) which has a single-pass recursive-descent parser, the functions that handle each partially-complete syntactic phrase (a block or an expression etc.) keep state representing the translation-in-progress of that phrase. I wouldn’t call this state an IR because it grows with the nesting depth of the program, not with its linear size. It doesn’t represent complete phrases, only work-in-progress.

                                                                                                                In Lua’s expression parser you can see that expdesc objects are allocated on the stack, so they can’t represent complete expressions, only nesting. An expdesc contains enough information to represent an intermediate result but not an entire expression. The luaK_ calls emit the code to evaluate the expression on the fly while the expression is parsed.
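
To make the “work-in-progress state, not an IR” idea concrete, here is a toy Python sketch (my own illustration, not Lua’s actual code): a recursive-descent parser that emits stack-machine bytecode as it parses, keeping only per-nesting state in local variables.

    # Toy single-pass expression compiler: bytecode is appended on the fly,
    # so nothing resembling a full AST is ever built.
    import re

    def tokenize(src):
        """Split an arithmetic expression into numbers and operator symbols."""
        return re.findall(r"\d+|[()+*]", src)

    def compile_expr(src):
        toks = tokenize(src)
        pos = 0
        code = []  # the growing bytecode array

        def peek():
            return toks[pos] if pos < len(toks) else None

        def take():
            nonlocal pos
            tok = toks[pos]
            pos += 1
            return tok

        def primary():
            tok = take()
            if tok == "(":
                sum_()
                take()  # consume the closing ")"
            else:
                code.append(("PUSH", int(tok)))

        def product():
            primary()
            while peek() == "*":
                take()
                primary()
                code.append(("MUL",))  # emitted immediately, like the luaK_ calls

        def sum_():
            product()
            while peek() == "+":
                take()
                product()
                code.append(("ADD",))

        sum_()
        return code

    print(compile_expr("1+2*(3+4)"))
    # [('PUSH', 1), ('PUSH', 2), ('PUSH', 3), ('PUSH', 4), ('ADD',), ('MUL',), ('ADD',)]

The only state that outlives a call is the bytecode array itself; everything else lives on the call stack and grows with nesting depth, not with program size.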

                                                                                                            2. 2

                                                                                                              Yes, I’ve thought about your previous comments on here about the importance of an AST for any compiler as I’ve run into all the many problems with generating code directly from the parse tree. But I’d started inspired by some of the Brian Callahan “Let’s Build a Compiler” blogs. I’ve been trying to decide how much effort I want to put into building an AST for this. One of my goals in using this as a side project was to write everything in such a way that I can always, or almost always, make progress in 15 minutes or an hour of development. So everything is incredibly incremental.

                                                                                                              I might have figured out a way around my current nested function generation issue, but I’m sure not having the AST will continue to cause more issues down the road. It’ll influence the type checking in ways I probably don’t want.

                                                                                                            3. 1

                                                                                                              From https://github.com/radarroark/xit/blob/master/docs/db.md :

                                                                                                              But for me, the bigger problem is that git’s data structures are just not that great. The core data structure it maintains is the tree of commits starting at a given ref. In simple cases it is essentially a linked list, and much like the linked lists you may have used, it can’t efficiently look up an item by index. Want to view the first commit? Keep following the parent commits until you find one with no parent. Want to find the descendent(s) of a commit? Uhhh…well, you can’t.

Ha! This is a very good selling point for xit!
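
As a toy illustration of the quoted point (this is my own Python sketch of the shape of a parent-pointer commit graph, not git’s or xit’s actual storage):

    # Toy model: commits only know their parents, never their children.
    from dataclasses import dataclass, field

    @dataclass
    class Commit:
        id: str
        parents: list = field(default_factory=list)  # parent Commit objects

    def find_first_commits(head):
        """Finding the root commit(s) means walking every reachable ancestor."""
        seen, stack, roots = set(), [head], []
        while stack:
            c = stack.pop()
            if c.id in seen:
                continue
            seen.add(c.id)
            if not c.parents:
                roots.append(c)
            stack.extend(c.parents)
        return roots

    def children_of(head, target):
        """Descendants aren't stored, so finding them is a full scan too."""
        seen, stack, kids = set(), [head], []
        while stack:
            c = stack.pop()
            if c.id in seen:
                continue
            seen.add(c.id)
            if any(p.id == target.id for p in c.parents):
                kids.append(c)
            stack.extend(c.parents)
        return kids

Both operations are linear in the amount of reachable history; that is the kind of traversal cost that auxiliary structures (like the commit-graph file mentioned in the reply below) try to reduce.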

                                                                                                              1. 2

                                                                                                                Uhhh… well, you use the commit-graph file

                                                                                                                (one of the results of Derrick Stolee’s performance improvement work)

                                                                                                              2. 4

                                                                                                                I was confused about what kind of QUIC API this blog post is about until I got near the end. This is an API between a TLS implementation and a QUIC implementation. At the moment curl supports multiple OpenSSL forks as TLS backends for ngtcp2 QUIC, but OpenSSL itself only provided a monolithic QUIC+TLS stack.

                                                                                                                1. 3

                                                                                                                  Why would OpenSSL implement QUIC? It’s a TLS implementation, which is at heart just a stream plugin; it shouldn’t be involved in the underlying data stream. Is this just feature creep?

                                                                                                                  1. 3

                                                                                                                    It was one of the worst OpenSSL decisions I’ve seen. OpenSSL was (is?) in the business of providing security and cryptography services. Now they want to be in the business of networking protocols and, perhaps one day, HTTP. I suspect that if they keep going along this path, one day OpenSSL will ship its own TCP/IP stack as well, perhaps as a replacement for lwIP.

                                                                                                                2. 1

What if you access the chip via WLAN and then use the undocumented commands to alter the device? Does Bluetooth have something like a loopback device or link-local address?

                                                                                                                  1. 8

                                                                                                                    The commands aren’t for a network interface, they are just internal controller commands. They are not packets that are routable or could come from the outside world.

                                                                                                                    It’s like asking “what if you could access a computer via WiFi and then communicate with a USB device to flash its firmware”. You can’t do that unless there is some software on the computer exposing the USB device to the network (and then that software would be the security problem, not USB itself).

                                                                                                                    If some ESP32 user firmware in the wild is exposing the Bluetooth HCI commands via WiFi, then that would already be a security problem even without these undocumented commands.

                                                                                                                    1. 2

                                                                                                                      See the other comments in this thread.

                                                                                                                      A device like this has two kinds of interface: one that connects to the CPU (aka the host) for controlling the device and others that connect to the outside world. This discovery is about undocumented host control commands.

                                                                                                                    2. 1

I like this idea! Do you think it’s extreme to try to implement dark/light mode using static HTML? I can’t seem to find a good JavaScript-less solution that gives people the option to deviate from their system preference.

                                                                                                                      But it sure feels like overkill to generate a copy of each page just to avoid making someone enable JS to change the colors on their screen… which I don’t even do because I prefer everything in dark mode anyway.

                                                                                                                      1. 9

                                                                                                                        There’s a CSS-only way (using a heavily restyled checkbox) to toggle other CSS attributes:

                                                                                                                        <!DOCTYPE html>
                                                                                                                        <html>
                                                                                                                        <head>
                                                                                                                        <style type="text/css">
                                                                                                                        .colors input:where([type="checkbox"][role="switch"]) {
                                                                                                                          appearance: none;
                                                                                                                          font-size: inherit;
                                                                                                                          margin: auto;
                                                                                                                          color: inherit;
                                                                                                                        }
                                                                                                                        .colors input:where([type="checkbox"][role="switch"])::before {
                                                                                                                          content: "dark";
                                                                                                                        }
                                                                                                                        .colors:has(input:where([type="checkbox"][role="switch"]):not(:checked)) {
                                                                                                                          color-scheme: dark;
                                                                                                                        }
                                                                                                                        .colors input:where([type="checkbox"][role="switch"]):checked::before {
                                                                                                                          content: "light";
                                                                                                                        }
                                                                                                                        .colors:has(input:where([type="checkbox"][role="switch"]):checked) {
                                                                                                                          color-scheme: light;
                                                                                                                        }
                                                                                                                        
                                                                                                                        :root {
                                                                                                                          color-scheme: light dark;
                                                                                                                        }
                                                                                                                        
                                                                                                                        body {
                                                                                                                          background-color: light-dark(ghostwhite, darkslategray);
                                                                                                                          color: light-dark(darkslategray, ghostwhite);
                                                                                                                        }
                                                                                                                        </style>
                                                                                                                        </head>
                                                                                                                        <body class="colors">
                                                                                                                        <input type="checkbox" role="switch"/>
                                                                                                                        <h1>Colorful!</h1>
                                                                                                                        </body>
                                                                                                                        </html>
                                                                                                                        
                                                                                                                        1. 4

                                                                                                                          Today I learned that light-dark() is a thing! Thanks!

                                                                                                                          1. 1

                                                                                                                            I’m using a similar idea for my own dark mode checkbox: https://isuffix.com (website is still being built).

                                                                                                                            GP comment might enjoy more examples of CSS :has() in this blog post: https://www.joshwcomeau.com/css/has/

                                                                                                                          2. 7

                                                                                                                            I don’t understand why so many web sites implement a dark mode toggle anyway. If your page uses CSS conditionally on prefers-color-scheme to apply a light theme or dark theme depending on the user’s system preference, why isn’t that enough?

                                                                                                                            For example, if the user is looking at your page in light theme and suddenly they think their bright screen is hurting their eyes, wouldn’t they change their system preference or their browser’s preference to dark? (If they don’t solve the problem by just lowering their screen brightness.) After they do so, not only your page but all their other apps would look dark, fixing their problem more thoroughly.

                                                                                                                            For apps (native or web) the user hangs around in for a long time, I can see some reasons to allow customizing the app’s theme to differ from the system’s. A user of an image editing app might want a light or dark mode depending on the brightness of the images they edit, or a user might want to theme an app’s windows so it’s easily recognizable in their window switcher. But for the average blog website, these reasons don’t apply.

                                                                                                                            1. 8

                                                                                                                              I am curious about how many people use it as well. But it certainly is easier to change by clicking a button in your window than going into your system or browser settings, which makes me think that it would be nice to add. Again, for the imagined person who decides to deviate from their system preference.

                                                                                                                              Although you’ve made me realize that even thinking about this without putting work into other, known-to-be-used accessibility features is kind of ridiculous. There is lower hanging fruit.

                                                                                                                              1. 4

                                                                                                                                Here’s a concrete example. I generally keep my browser set to dark mode. However, when using dark mode, the online training portal at work switches from black text on a white background to white text on a white background. If I wanted to read the training material, I would need to go into my browser settings and switch to light mode, which then ruins any other tab I would switch to.

                                                                                                                                If there was a toggle button at the training portal, I could switch off dark mode for that specific site, making the text readable but not breaking my other tabs. Or, if the training portal at work won’t add the button, I could at least re-enable dark mode in every tab whose site had added such a toggle.

                                                                                                                                1. 5

Or, hear me out, instead of adding JavaScript to allow users to work around its broken CSS, the training portal developers could fix its CSS?

(Browsers should have an easy per-site dark mode toggle like the reader mode toggle.)

                                                                                                                                  1. 3

                                                                                                                                    I feel like this is something to fix with stylus or a user script, maybe?

                                                                                                                                    1.  
                                                                                                                                      1.  

                                                                                                                                        Sure, but only on sites that provide a button. It seems a little silly that one bad site should mean that you change your settings on every other site / don’t have your preferred theme on those sites.

                                                                                                                                      2.  

                                                                                                                                        Or the DarkReader extension or similar.

                                                                                                                                    2. 2

Given how widely colour schemes can vary, even just within the broad realms of “light” and “dark”, I can imagine some users would prefer to see certain sites in light mode, even if they want to see everything else in dark mode. It’s the same reason I’ve set my browser to increase the font size for certain websites, despite mostly liking the defaults.

It would be nicer if this could be done at the browser level, rather than individually for each site (i.e. if there were a toggle somewhere in the browser UI to switch between light/dark mode, and if the browser could remember this preference). As it is, a lot of sites that do have this toggle need to either handle the preference server-side (not possible with static sites, unnecessary cookies), handle the preference client-side (FOUC, also unnecessary cookies), or not save the preference at all and have the user manually toggle it on every visit. None of these options are really ideal.

                                                                                                                                      That said, I still have a theme switcher on my own site, mostly because I wanted to show off that I made two different colour schemes for my website, and that I’m proud of both of them… ;)

                                                                                                                                    3. 6

                                                                                                                                      I remember the days when you could do <link rel="alternate stylesheet" title="thing" href="..."> and the browser would provide its own nice little ui for switching. Actually, Firefox still does if you look down its menu (view -> page style), but it doesn’t remember your preference across loads or refreshes, so meh, not a good user experience. But hey, page transitions are an IE6 feature coming back again, so maybe alternate stylesheets will too someday.

The CSS prefers-dark-mode thing really ought to be a trivial button in the browser UI too. I’m pretty sure it is somewhere in the F12 tools, but I can’t even find it, so woe on the users lol.

But on the topic in general: I think static HTML is overrated. Remember you can always generate HTML on demand with a trivial program on the server with these changes and still use all the same browser features…

                                                                                                                                      1. 2

I’ve been preparing something like this. You can do it with CSS pseudo-selectors and a checkbox: :root:has(#checkbox-id:checked) or so; then you use this to either ‘respect’ the system theme, or invert it.

                                                                                                                                        The problems I’m having with this approach:

                                                                                                                                        • navigating away resets the checkbox state
                                                                                                                                        • svg and picture elements have support for dark/light system theme, but not for this solution
                                                                                                                                        1. 2

Yeah, I think I saw the checkbox trick before, but the problems you outline make the site/page/dark and site/page/light solution seem more enticing, since it avoids the state-reset issue in particular. I like the idea of respecting/inverting the system theme as a way of preserving a good default, though!

                                                                                                                                          1. 2

Yeah, as an alternative for the state issue, I was thinking of using a cookie and choosing the styles based on it, but that brings a whole host of other “issues”.

                                                                                                                                      2. 8

Long post, many thoughts, but I don’t feel like doing a lot of editing, so apologies up front for unfiltered feedback! I don’t mean anything by the tone I will use here :-)

The start of the article is 🤌, but it sort of doesn’t live up to my expectations. I feel this is mostly about extensible, forward-compatible enums, which is quite neat (I didn’t realize that “I want to add a field to all these enum variants” admits such an elegant solution), but I don’t think it solves my problems with error handling facilities in languages.

Basically, this post feels like it attacks the “make complicated things possible” part of the problem, and, sure, if you add across and along non-exhaustiveness, !__ for warnings, auto-delegation to turn an enum into a struct, and a capability system to track panics, you can solve everything.

                                                                                                                                        But the problem I see with error handling is that we don’t know how to make simple things easy. That’s especially true in Rust, of course, but it seems that in every language a simple way to go about error handling leads to some pretty significant drawbacks, and the money question is not how can we add extra knobs to handle all of the requirements, but whether there’s some simple idea that kinda solves stuff?

Like, for example, the sled problem — we want every function to be precise about its specific error conditions, but, in practice, the stable equilibrium is one error type per library. Everr (fabulous name by the way, great job) suggests:

                                                                                                                                        The Everr language server has a code action for defining error types for a given function based on the error types of the functions that are called. It also can intelligently suggest modifications to error cases as the function body evolves over time, taking contextual rules such as access control into consideration.

But this sucks! With a nominal type system, having to name every function, and every function’s error type, is very much not easy, and even if you add a bunch of tooling support, the result would still not be easy.

Another simple-things-are-hard problem in error handling is exposing details. If you write a library A, and it uses a library B, and B is an implementation detail, then a common API design pitfall is to leak B through your error types (either directly, by including B variants, or indirectly, by allowing downcasting to B). The problem here isn’t that it’s impossible to either expose or hide B properly. There’s a bunch of techniques available for that (and I believe that Everr makes them nicer and more powerful). The problem is that you need to decide what to do, and that is hard. You need a pretty high level of discipline and experience to even notice that this is a problem.
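
(To make the trade-off concrete, here is a minimal sketch of the “hide B” option in today’s Rust; b_crate, BError, AError and do_thing_for_caller are all made-up names. The mechanism is easy — the hard part, as I said, is deciding to do this at all.)

    // Sketch only: `b_crate` stands in for the internal dependency B.
    mod b_crate {
        #[derive(Debug)]
        pub struct BError(pub String);

        pub fn do_thing() -> Result<(), BError> {
            Err(BError("disk on fire".into()))
        }
    }

    /// Library A's public error. B's error lives in a *private* field, so
    /// callers cannot name it, match on it, or downcast to it.
    #[derive(Debug)]
    pub struct AError {
        message: String,
        _cause: Option<b_crate::BError>, // implementation detail, not exposed
    }

    pub fn do_thing_for_caller() -> Result<(), AError> {
        b_crate::do_thing().map_err(|e| AError {
            message: "operation failed".to_string(),
            _cause: Some(e),
        })
    }

Making the wrapped error reachable again (a public variant, or a source() chain you can downcast through) is the opposite decision, and nothing in the language pushes you towards either one.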

Or another common pitfall of type-based error handling, where

                                                                                                                                        enum MyError {
                                                                                                                                            Io(IoError)
                                                                                                                                        }
                                                                                                                                        

                                                                                                                                        is often a bug, because the actual picture is

                                                                                                                                        enum MyError {
                                                                                                                                            FailedToReadConfigFile(IoError),
                                                                                                                                            FailedToReadFromTCPSocket(IoError),
                                                                                                                                        }
                                                                                                                                        

That is, the fact that you can aggregate errors based on types doesn’t mean that you should.
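
(A tiny made-up sketch of why the second shape matters: the caller wants to react differently to the two contexts even though both variants wrap the same IoError. Action and handle are invented names.)

    use std::io::Error as IoError;

    enum MyError {
        FailedToReadConfigFile(IoError),
        FailedToReadFromTCPSocket(IoError),
    }

    enum Action { UseDefaultConfig, RetryConnection }

    // The context-specific variants let the caller choose a recovery strategy;
    // a single Io(IoError) variant would erase exactly this information.
    fn handle(err: MyError) -> Action {
        match err {
            MyError::FailedToReadConfigFile(_) => Action::UseDefaultConfig,
            MyError::FailedToReadFromTCPSocket(_) => Action::RetryConnection,
        }
    }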

I have no idea how to handle errors in general! I just don’t have bulletproof recipes; every time it is “well, let’s look at your specific situation, shall we?”. Super annoying!

I don’t think that a lack of language mechanisms is my problem. What I lack is a gang-of-four book for patterns of error management (I mean, I have such a book in my head obviously, and I consult it often when writing code, but I can’t condense it to a single paragraph to put into a project’s style guide and call it a day).


                                                                                                                                        Assorted smaller thoughts:

For an article about a systems programming language, it is surprising that no space is dedicated to the ABI. How exactly you raise and catch errors, in terms of which bytes go into which register, I feel is an unsolved problem. Returning values is allegedly slow. Unwinding is, counterintuitively, faster (see Duffy’s post & the recent talk on C++ exceptions in embedded (has anyone reproduced that result in particular?)). To avoid potential ambiguity: Rust-style error handling and Java-style exceptions differ on two orthogonal axes:

• whether you syntactically annotate expressions that throw (type system & syntax stuff)
• whether throwing happens by stack unwinding (by the way, what is the best one-page explainer of how unwinding actually works? I am embarrassed to admit that unwinding is magic pixie dust for me, and I have no idea how landing pads work), or by a “normal” return.

                                                                                                                                        I am strictly speaking about the second one.

And then there’s Zig, and then there’s this paper by Sutter from several years ago which says that, actually, you do want to return an integer to be fast.

                                                                                                                                        heap exhaustion

Heap exhaustion is not the central example of an OutOfMemory error. The central example is someone passing you a malformed gif image whose declared size is 67PiB. That’s the sort of thing that you need to be robust to: a single rogue allocation due to a bug/malformed input.
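
(Roughly the kind of robustness I mean, as a Rust sketch with a made-up cap — MAX_DECODED_BYTES and read_payload are invented names: treat the attacker-controlled declared length as untrusted and reject it before it ever reaches the allocator.)

    use std::io::{self, Read};

    /// Made-up limit; a real decoder would derive it from the format/context.
    const MAX_DECODED_BYTES: u64 = 256 * 1024 * 1024;

    /// Reads a length-prefixed payload without trusting the declared length.
    fn read_payload(mut r: impl Read) -> io::Result<Vec<u8>> {
        let mut len_buf = [0u8; 8];
        r.read_exact(&mut len_buf)?;
        let declared = u64::from_le_bytes(len_buf);

        // The malformed-input case: a 67 PiB "size" is rejected here instead of
        // becoming a single rogue allocation.
        if declared > MAX_DECODED_BYTES {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                format!("declared size {declared} exceeds limit"),
            ));
        }

        let mut buf = vec![0u8; declared as usize];
        r.read_exact(&mut buf)?;
        Ok(buf)
    }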

                                                                                                                                        It would also be interesting to know what sub-fraction of that group has tests for the out-of-memory error handling code path, and how good that test coverage is.

No data on this here, but, anecdotally, eyeballing Zig code that has an allocator parameter, try, and defer/errdefer all together usually tends to reveal errors.

                                                                                                                                        Zig and Odin are different from other languages here; allocators are passed down ~everywhere as parameters

                                                                                                                                        Such discipline is possible in C++, Rust and other languages to varying extents, but is less common. Rust has an unstable allocator_api feature, where the discussion originally started in

                                                                                                                                        1. Rust also has a competing storage API proposal.

                                                                                                                                        The Rust allocator API is very much not what Zig is doing. https://ziglang.org/download/0.14.0/release-notes.html#Embracing-Unmanaged-Style-Containers is not at all that.

                                                                                                                                        A lint that prevents error values from being discarded using standard shorthands (e.g. _ = ), without an explicit annotation, such as a comment or a call to an earmarked function (to allow for ‘Find references’) etc.

I used to think that what Rust does, with must_use, is the right thing, and was hesitant about Swift’s approach of requiring everything to be used. After using Zig, I am sold though; no need to overthink this: a non-void function whose result is unused and is not explicitly discarded with _ = should be a compilation error. The amount of false positives is vanishingly small.
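
(For contrast, a small Rust snippet of where the line sits today; checksum and important_side_effect are made-up names. Only the #[must_use] result warns, the plain unused result sails through silently, and let _ = hides everything — under the Zig/Swift-style rule the middle case would be rejected too.)

    #[must_use]
    fn checksum(data: &[u8]) -> u32 {
        data.iter().map(|&b| b as u32).sum()
    }

    fn important_side_effect() -> bool {
        true // pretend this did something and reports success
    }

    fn main() {
        checksum(b"hello");         // warning: unused return value (#[must_use])
        important_side_effect();    // no warning at all in Rust today
        let _ = checksum(b"hello"); // explicitly discarded: silent, but greppable
    }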

                                                                                                                                        1. 3

                                                                                                                                          “actually, you do want to return an integer” to be fast.

https://mcyoung.xyz/2024/04/17/calling-convention/ had an interesting idea. Change the ABI so that the error/success payloads of Result are passed as out parameters, and then just return the Ok/Err tag. That seems like it allows the best of both worlds - effectively automating the common pattern used in Zig and making it type-safe.
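
(If I understand the idea correctly, a hand-written sketch of that lowering might look roughly like this - parse, Config, ParseError, Tag and parse_lowered are all invented, and no compiler emits exactly this today.)

    struct Config { verbose: bool }
    struct ParseError { message: String }

    fn parse(s: &str) -> Result<Config, ParseError> {
        if s == "-v" {
            Ok(Config { verbose: true })
        } else {
            Err(ParseError { message: format!("unknown flag: {s}") })
        }
    }

    #[repr(u8)]
    enum Tag { Ok = 0, Err = 1 }

    /// Roughly what `parse` could look like after the transformation: payloads
    /// travel through caller-provided out-pointers, only the tag is returned.
    /// Exactly one of `ok_out`/`err_out` is written, matching the tag.
    unsafe fn parse_lowered(s: &str, ok_out: *mut Config, err_out: *mut ParseError) -> Tag {
        match parse(s) {
            Ok(v) => { unsafe { ok_out.write(v) }; Tag::Ok }
            Err(e) => { unsafe { err_out.write(e) }; Tag::Err }
        }
    }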

                                                                                                                                          1. 2

                                                                                                                                            Returning values is allegedly slow. Unwinding is, counterintuitively, faster

                                                                                                                                            I think this is true in the common case where an error did not occur. Returning error information adds overhead to both the caller and callee, whereas catch/throw has the famous “zero overhead.” On the other hand, when an error does occur, unwinding the stack is significantly slower because a bunch of compiler-generated metadata has to be looked up and processed for each active stack frame.

                                                                                                                                            the recent talk on C++ exceptions in embedded

The talk I watched (I don’t remember who gave it) was primarily about code size, not performance. The common wisdom is that using C++ exceptions bloats your code with all those compiler-generated tables and extra code for running destructors during unwinding.

                                                                                                                                            1. 2

                                                                                                                                              Hey Alex, thanks for taking the time to read and share your thoughts. I always appreciate reading your blog posts, so thank you for the specific feedback on this post.

                                                                                                                                              That’s especially true in Rust, of course, but it seems that in every language a simple way to go about error handling leads to some pretty significant drawbacks, and the money question is not how can we add extra knobs to handle all of the requirements, but whether there’s some simple idea that kinda solves stuff?

                                                                                                                                              It would be helpful to have an operational definition of “simple” here with one or two examples before I attempt to answer this. 😅

                                                                                                                                              For example, if there is a guideline that by default, an error should not expose structure, and just expose an interface like:

trait RichDebug: Debug {
  type Kind: Debug;
  fn kind(&self) -> Self::Kind;
  // similar to fmt::Formatter, but creates something like a JSON object instead of a string
  fn metadata(&self, metadata: &mut debug::Metadata<'_>);
}
                                                                                                                                              

                                                                                                                                              and the implementation for this were to be derived using a macro (or comptime machinery in Zig), would that be considered “simple”?

                                                                                                                                              Like, for example, the sled problem

                                                                                                                                              Thanks for linking that blog post, I hadn’t read it earlier. This point stands out to me in particular:

                                                                                                                                              inside the sled codebase, internal systems were [..] relying on the same Error enum to signal success, expected failure, or fatal failure. It made the codebase a nightmare to work with. Dozens and dozens of bugs happened over years of development where the underlying issue boiled down to either accidentally using the try ? operator somewhere that a local error should have been handled, or by performing a partial pattern match that included an over-optimistic wildcard match.

This goes directly against the Rust conventions RFC, which recommends using panics for “catastrophic errors”. I’ve seen a similar tendency in Go codebases, where people will put every kind of error under error, even if it’s technically a serious invariant violation (like a bounds check failure, which does trigger a panic!).

                                                                                                                                              Based on Duffy’s writing on Midori, it feels like a Midori programmer would probably be more likely to use “abandonment” (panic) than a Rust/Go programmer in this kind of situation, given the built-in Erlang-style fault-tolerant architecture.

                                                                                                                                              we want every function to be precise about its specific error conditions, but, in practice, the stable equilibrium is one error type for library

                                                                                                                                              Right, so with Everr’s type system, you could write your code as returning only MyLibraryError, and then the language server can refactor the functions which need specific error conditions to instead return MyLibraryError:union+[.Case1 | .Case2].

The central example is someone passing you a malformed gif image whose declared size is 67PiB. That’s the sort of thing that you need to be robust to: a single rogue allocation due to a bug/malformed input.

This is a fair criticism. In that section, I originally intended to describe a system for regions/different sub-heaps based on some of the research on Verona (and in that context, “heap exhaustion” would mean “this sub-heap is exhausted”, not “the process’s heap usage is too high for the system it is running on”), but then I punted on that because I didn’t feel confident in Verona’s system, so I moved that description to the appendix.

                                                                                                                                              I will update this.

and was hesitant about Swift’s approach of requiring everything to be used. After using Zig, I am sold though

I personally prefer Swift’s approach of warning instead of a hard error, given that iterating on code becomes more fiddly if you need to keep adding/removing _ (speaking from first-hand experience with Go).

However, the point you’ve quoted here is talking about something slightly different. It’s saying that using the same shorthand for discarding ordinary values and for discarding errors is itself error-prone. Discarding errors should require noisier syntax (enforced as a lint), because a discarded error is likely to carry higher risk than a discarded success value.
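
(Something like this hypothetical helper is what I have in mind — the name deliberately_ignore_error is made up: deliberate error discards become both noisier and findable via ‘Find references’.)

    /// Hypothetical earmarked sink for errors we have decided to ignore.
    /// 'Find references' on this function lists every such decision.
    fn deliberately_ignore_error<E: std::fmt::Debug>(result: Result<(), E>) {
        if let Err(e) = result {
            eprintln!("ignoring error by design: {e:?}");
        }
    }

    fn main() {
        let flush: Result<(), std::io::Error> = Ok(());
        // Instead of `let _ = flush;`, route it through the named helper:
        deliberately_ignore_error(flush);
    }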

                                                                                                                                              I have such a book at my head obviously [..]

Perhaps a good idea for an IronBeetle episode? I’m slowly working my way through the list; maybe you’ve already covered this in one of them. 😄


For an article about a systems programming language, it is surprising that no space is dedicated to the ABI. How exactly you raise and catch errors, in terms of which bytes go into which register, I feel is an unsolved problem

                                                                                                                                              I omitted this because:

                                                                                                                                              1. Since this point is primarily about performance, it doesn’t make sense for me to speculate about designs without having concrete measurements. Performing any kind of realistic measurement would likely be a fair amount of work.

2. I didn’t realize it was an “unsolved problem”; rather, my working assumption was that “for ABI, the small number of people working on it pretty much know what all their options are, so it doesn’t make sense for me to tell them that”. For example, if you only care about 64-bit machines, perhaps you’re fine with burning a register on errors specifically (like Swift). For larger errors, you could reuse the same register as an out-parameter (as described in mcyoung’s post linked by Jamie).

                                                                                                                                              1. 1

Perhaps a good idea for an IronBeetle episode?

                                                                                                                                                Good idea! I’ll do an error management episode this Thursday then!