1. 3

    Incidentally, here is a piece of advice: if you are ever agonising over some design detail that is not core to what makes your language special, and all options seem equally reasonable, just go with whatever Rust does. Odds are that whatever Rust decided was preceded by significant debate, and that whatever they picked has few gotchas. Lexical syntax is one area where this is particularly good (and easy) advice to follow.

    Questionable.

    1. 10

      I think it’s a good idea if you want to avoid having your language called weird or academic. Rust is very conscious about its “weirdness budget”, and carefully threads the needle between looking familiar and fixing old problems. It managed to get traction as a C-family language despite being influenced by ML-family and other “exotic” languages.

      There are many similarities between Rust, Swift and TypeScript. All of them are here to stay, so it’s likely that they are influencing what the evolution of preferred “C-like” syntax and set of features is going to be.

      1. 1

        fixing old problems

        Which specifically?

        1. 12

          If we’re talking about syntax and comparing with C-family languages as a baseline:

          • Making variable declarations unambiguous
          • Making types unambiguous – AFAIK, you can always look at an arbitrary identifier and from a minimum amount of context know whether it is a type or a variable
          • Relatedly, getting rid of (foo) bar as a way to cast expression bar to type foo, which in my experience is a great way to make parsing hard
          • Making generics use <> (yeah yeah I don’t like it either) but making it unambiguous with regards to the greater-than and less-than operators
          • Ifs are expressions, removing the need for a ternary operator
          • Almost everything is an expression, actually
          • Curly braces after if/while/etc are mandatory, removing a source of errors
          • Parentheses after if/while/etc are absent, removing a source of noise
          • Basically everything has a literal syntax of some kind or another
          • Optional trailing commas everywhere you could want them, making code generation easier

          I could go on. There are a few warts that are widely acknowledged as such I think, and you can do some outright painful things with pattern matching if you try hard enough (I recently had occasion to write if let Ok((&mut mut foo,)) = bar {...} and it made me rethink my life), but all in all if you want something that looks kinda like C++, Rust’s syntax is pretty good. If you don’t want something that looks kinda like C++, you can make some fairly superficial alterations and get something that looks more like a typed Lua.
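
          To make a few of the bullet points above concrete, here is a minimal Rust sketch (names are made up for illustration) showing if-as-expression, mandatory braces without condition parentheses, literal syntax, and trailing commas:

          fn classify(n: i32) -> &'static str {
              // `if` is an expression, so no ternary operator is needed.
              let sign = if n < 0 { "negative" } else { "non-negative" };

              // Braces are mandatory even for one-liners; parentheses around the
              // condition are not.
              if n == 0 {
                  return "zero";
              }

              sign
          }

          fn main() {
              // Most things have a literal syntax, and trailing commas are allowed.
              let samples = vec![
                  -3,
                  0,
                  42,
              ];
              for s in samples {
                  println!("{} is {}", s, classify(s));
              }
          }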

          1. 1

            I don’t think there is a lot in that list in terms of “fixing old problems”.

            It reads more like “don’t be stupid, don’t be C” … well, d’oh!

            1. 1

              Well, C and C++ are still the most popular languages for low-level development, so just being “a better and safer but more powerful C” is actually a great pitch.

              1. 1

                Yeah, just not one that supports the original claim.

            2. 1

              Parentheses after if/while/etc are absent,

              One of the saddest choices…

              1. 3

                I think this one contributes to Rust looking “ugly” to some people, but as a syntax it works fine. Optional braces were proven to be dangerous (goto fail bug), and once you fix that by making braces required, the parens are unnecessary.

                 if (foo) bar();
                 if foo {bar();}
                

                is the same number of characters and the same number of shift presses (at least on a US-layout keyboard).

                1. 1

                  That’s disingenuous; the usual style uses whitespace around {}, so if foo { bar(); }. That’s two characters longer, and I believe I am not alone in saying it’s still uglier even with those two extra characters.

                2. 2

                  Why? They are not functions

                  1. 1

                    gratuitous incompatibility with parsers already installed in many, many human brains for no benefit

                    1. 4

                      Eh, I thought it was weird at first, but it very quickly goes from weird to normal. I actually think it reads much more clearly than with parens.

                      1. 1

                        Wait until you discover that it isn’t Java you’re parsing. 🤯

                        It’s not gratuitous at all, unlike languages which keep all of C’s defects for sentimental reasons.

                        1. 1

                          since this one isn’t a defect, but merely an arbitrary choice, going with the existing popular choice seems good. There are dozens of languages sharing the parenthesized if/for/while/etc syntax

                    2. 1

                      Gets rid of single-line ifs, which is a great way to cut down bugs. You can still put the whole thing with braces on one line, too.

                    3. 1

                      Regarding garnet research, I’m curious whether you had a look at Nim too? I didn’t notice it mentioned in your notes there, at least at a glance. One of the things I find interesting about it is that, from what I read somewhere, its author’s original intention was to try and build it all on macros, with as minimal a core as possible. He claimed he kinda failed at that (sorry, I don’t have a link), but I think it leaves Nim with a rather powerful macro system, which (together with Rebol, of which I see Nim as a kind of typed variation) made me start to understand how powerful macros can be in Lisp. Other than that, I like how it manages to look nearly like Python while being statically typed, and how the type declarations tend to nearly disappear. Though, similarly to OCaml, I find it a language more focused on being practical than on being “pretty from a CompSci perspective”, with quite a number of seemingly sketchy, weird and randomly slapped-on features here and there that surprisingly tend to work together quite well. Sorry for the ode, I’ve just liked this language for some time now :)

                      edit: uh, and one more thing, related to modules, where I believe Go struck gold and Nim does rather poorly (though that’s understandable given it has “universal function call syntax” and operator overloading): naming and “avoiding stutter” – in that module names give mental context to their contents, and can be assumed as part of the name, and can thus be omitted in quite many places (esp. inside a module). So you don’t need “newRegex” but just “regex.New” (IIRC Lua indeed also enables a similar approach; in fact, when I was trying to write a Lua interpreter for Go a long time ago, I got super challenged by the “uncanny valley” of many “parallel evolution” similarities between them, such that I found it extremely difficult to write code in both languages in a single file).

                    4. 2

                      Declarations always start with a keyword (particularly fn). This neatly avoids the Most Vexing Parse from C++, and it allows function declarations to be nested inside other function declarations.

                      But more importantly, it will make writing a Rust REPL in the future so much easier, because you’ll be able to type items into a prompt that expects an expression, and it’ll just work. Joe Armstrong complained bitterly about how Erlang can’t do that, because the item grammar and the expression grammar in Erlang aren’t compatible. He flat-out said “Erlang has a bug” because of this, and he’s right.
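
                      A minimal sketch of what that buys you in practice (hypothetical example, not from any real codebase): items can nest inside functions, and items and expressions mix cleanly, which is exactly what a REPL prompt needs:

                      fn main() {
                          // A function item nested inside another function; only visible here.
                          fn double(x: i32) -> i32 {
                              x * 2
                          }

                          // Blocks are expressions, so an item and an expression can share a
                          // scope without any grammar ambiguity.
                          let answer = {
                              fn add_one(x: i32) -> i32 { x + 1 }
                              add_one(double(20))
                          };

                          println!("{}", answer);
                      }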

                  2. 4

                    I think Rust made a big mistake in choosing <> for generics instead of [], but if you are to follow X by default, Rust is probably not a bad choice of X (compared to, say, Scala). Futhark uses ML-style modules for generics, so it does not need a generics bracket anyway.
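
                    For context, a minimal sketch of the price Rust pays for <> in expression position: the “turbofish” ::<> disambiguates type arguments from comparisons (example names are arbitrary):

                    fn main() {
                        // In expression position a bare `<` would be ambiguous with less-than,
                        // so type arguments need the turbofish form…
                        let parsed = "42".parse::<i32>().unwrap();

                        // …or a type annotation on the binding instead.
                        let collected: Vec<i32> = (0..3).collect();

                        println!("{} {:?}", parsed, collected);
                    }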

                    1. 3

                      I’d say that Scala 2 would be an excellent choice to follow.

                      The language is way more regular than Rust in the parts that are likely to be of relevance/copied/shared¹.

                      ¹ i. e. not the “emulating compile-time Prolog with implicits” part

                    2. 1

                      This section got my attention as well.

                      I’m curious if folks who do compiler / plt here work feel similarly.

                      1. 2

                        Rust did a good job with its string literal syntax, so I chose to copy it for a byte string interchange format for Oil called QSN:

                        http://www.oilshell.org/release/latest/doc/qsn.html

                        It’s very similar to JSON except it can express arbitrary byte strings, and doesn’t need surrogate pairs.

                        They got rid of the legacy from C like \v and octal, and the unnecessary ambiguity of \u1234 and \U00001234, in favor of \u{1234}.
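
                        For reference, a minimal sketch of the Rust escape style being copied here (this is plain Rust, not QSN itself):

                        fn main() {
                            let snowman = "\u{2603}";         // one escape form for any code point
                            let bytes: &[u8] = b"\x00\xff";   // byte strings use \x hex, no octal
                            let raw = r"C:\no\escapes\here";  // raw strings skip escape processing
                            println!("{} {:?} {}", snowman, bytes, raw);
                        }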


                        But I think the advice is a little too “extreme” overall. I would say if you are designing a language that looks roughly like C or JavaScript, then Rust and Go are good places to look. Swift has made some good choices too. I think Go designers put just as much thought into it, and came up with similar if not identical conclusions (even though I don’t really use Rust or Go myself).

                        Though in the case of string literals, Go seems to inherit the legacy from C, which makes it slightly more annoying to implement.

                        It does feel like that kind of syntax is “winning”, i.e. ALGOL style with curly braces. Most popular languages now seem to converge on something that looks familiar to C or JavaScript users (e.g. Kotlin). There are definitely limitations to that but on the whole I don’t think it’s bad.

                    1. 5

                      Debian has to figure out how to co-operate with other package managers. By painstakingly repackaging every npm dependency they’re not adding any value. From the perspective of an npm user it is very weird. The Debian versions are installed in places that npm doesn’t recognize, so I can’t use them even if I wanted to.

                      1.  

                        On the bright side, the places where they are packaged are out of your way so they won’t unpredictably break things for you.

                        1. 0

                          Debian has no problems co-operating with, say, pip. Python packages installed using the Debian package manager are visible to pip. Perhaps it’s npm that should figure out how to co-operate with system package managers, not Debian.

                          1. 5

                            Debian has no problems co-operating with, say, pip.

                            Yes it does.

                            I’ve seen tons of Python projects that basically start their setup instructions with “create a virtualenv and use pip to install the dependencies.” The whole point of virtualenv is to avoid the distribution-packaged pip libraries in favour of whatever the latest one is. Projects that do get redistributed by Debian, like Mercurial, have to go out of their way to avoid pulling in pip dependencies.

                            https://gregoryszorc.com/blog/2020/01/13/mercurial's-journey-to-and-reflections-on-python-3/

                            In April 2016, the mercurial.pycompat module was introduced to export aliases or wrappers around standard library functionality to abstract the differences between Python versions. This file grew over time and eventually became Mercurial’s version of six. To be honest, I’m not sure if we should have used six from the beginning. six probably would have saved some work. But we had to eventually write a lot of shims for converting between str and bytes and would have needed to invent a pycompat layer in some form anyway. So I’m not sure six would have saved enough effort to justify the baggage of integrating a 3rd party package into Mercurial. (When Mercurial accepts a 3rd party package, downstream packagers like Debian get all hot and bothered and end up making questionable patches to our source code. So we prefer to minimize the surface area for problems by minimizing dependencies on 3rd party packages.)

                            1. 1

                              I mean, yes. kornel doesn’t like Debian because Debian packages are not visible to npm. notriddle doesn’t like Debian because upstreams don’t want Debian packages to be visible to pip. The two are in contradiction. This is not a solvable problem.

                              1. 3

                                It’s not a contradiction. Debian is just failing both types of users by doing a half-assed thing. It’s neither leaving packages alone, nor providing a set large and compatible enough to be useful.

                        1. 2

                          I could see this as wanting to keep the implementation as simple as possible, so the question becomes: would we actually want this safety built in or is it enough to put the whole thing into a “secure box”?

                          1. 4

                            A core design principle of WebAssembly was that it be able to provide a target for more or less any language. That meant the language’s object model would be opaque to the VM, and so would the lifetime of an allocation. The result of that is that the WASM VM’s basic memory model has to be a single blob of addressable memory. Some of this is also because of the Mozilla “JS subset” wasm implementation that operated on typed arrays.

                            This brings with it other constraints - no builtin standard library, no standard types, no object introspection - and because it’s intended to be used in the browser, validation and launch must be fast, as caching of generated code is much less feasible than in a native app - hence the control flow restrictions.

                            The result of all of this is that you can compile Haskell to wasm without a problem, or .NET, or C++ and they all run with the same VM, and none of them incur unreasonable language specific perf penalties (for example you cannot compile Haskell to the CLR or JVM without a significant performance penalty), but compiling to WASM works fine. C/C++ can treat pointers and integers as interchangeably and unsafely as they like, without compromising the browser. And .NET and JVM code can apparently (based on other comments so could be totally wrong here) run in that WASM VM as well.
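
                            As a rough illustration of that “single blob of addressable memory” model (a hedged sketch, assuming a wasm32-unknown-unknown build; the function is made up): the guest just hands around integer offsets into its one linear memory, and the VM only bounds-checks against that memory.

                            // Compile with e.g. `cargo build --target wasm32-unknown-unknown`.
                            #[no_mangle]
                            pub extern "C" fn sum(ptr: *const i32, len: usize) -> i32 {
                                // `ptr` is effectively an offset into the module's single linear
                                // memory; the caller is assumed to pass a valid offset and length.
                                let values = unsafe { core::slice::from_raw_parts(ptr, len) };
                                values.iter().sum()
                            }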

                            1. 1

                              We of course also want inside-box safety. The question is cost and tradeoff.

                              1. 1

                                If Java and .NET do it just fine (and they do), there’s no perf cost excuse there.

                                1. 1

                                  No, they don’t. C++ code compiled with /clr:safe does slow down. (It doesn’t slow down without the option, but it doesn’t provide inside-box safety either.)

                                  1. 1

                                    Compared to /clr:pure, yes, due to some optimisations missed on earlier .NET CLR versions (the move to Core alleviated most of that overhead… but initially came with a removal of C++/CLI outright before it was added back). And of course, all C#/F# code runs with those checks enabled all the time.

                                    Having the option is always better than not having it for things like this though.

                                  2. 1

                                    There’s a significant penalty to languages with significantly different type systems when running under .NET and the JVM. That’s why you tend to get similar, but slightly different, versions of languages - Scala, F#, etc instead of Haskell/*ML - basically the slight differences are changes to avoid the expensive impedance mismatch from incompatible type systems. The real Haskell type system cannot be translated to either .NET or the JVM - even with .NET’s VM level awareness of generic types - and as such takes a bunch of performance hits. Trust me on this.

                                    Similarly compiling C and C++ to .NET requires sacrificing some pointer shenanigans that wasm allows (for better or for worse).

                                1. 1

                                  Whoops. If mods see this, just merge it.

                                1. 1

                                  Which parts of GNAT aren’t in Ada?

                                  1. 2

                                    Doesn’t GNAT use GCC code generation backend?

                                    1. 1

                                      GNAT has multiple backends, GCC being the most used one, but it also has a couple that are written in Ada (e.g. JGNAT, which targets the JVM).

                                  1. 1

                                    This is a culmination of two worlds that a niche group of researchers have been working towards, combining text transformers with image convolutional nets, and it’s really great to see this first mainstream outcome.

                                    This direction is much bigger than “oh cool I can make images from text”. Because a big problem in NLP is that it has been trapped in just using text for learning. In reality we learn language from text, speech, images, and other stimuli.

                                    When I hear people talk about NLU many folks lack the awareness that even if a model has perfectly represented a text label as an embedding, the embedding itself still lacks meaning. The label “dog” might be captured by a 768 dimension vector, but aside from co-occurrence with other words and contexts, it still doesn’t know what a dog is.

                                    So here’s hoping that the marriage of different vector spaces will get us closer to this realization.

                                    Shameless plug: I’m writing more about these concepts as a contributing author for the book AI-Powered Search. https://www.manning.com/books/ai-powered-search

                                    1. 3

                                      As far as I can tell, this does not use CNN for images at all. Do you have any evidence to the contrary?

                                      1. 1

                                        Yes you are right! They are using Variational Auto Encoders. I mixed it up with some other research I saw using outputs from Resnet and variants. Thanks for calling this out - I’m much more familiar with the text side of things

                                    1. 23

                                      A few observations:

                                      But if SIMD is so awesome, why are the RISC-V ditching it and going for Vector processing instead?

                                      RISC-V is not. There is an official SIMD extension to RISC-V. That said, RISC-V began life as the control-plane processor for Hwacha, a Cray-style vector processor, so it’s not surprising that it would also gain a Hwacha-style extension.

                                      Thus SIMD as it has developed is untenable. There are new instructions every few years

                                      The new instructions are not added for fun; they are added to introduce more complex operations that provide a real-world speedup for at least one workload. Whether you use SIMD or Cray-style vectors, you’ll still have this problem. Vector add as the example hides this. There are a lot of complex vector operations that exist in x86 and Arm because profiling on existing code showed that they would be useful.

                                      There’s a lot in this article that’s confused or misleading. For example, it conflates the vector register width and ALU width in SIMD systems. Intel’s Atom processors, for example, are a counterexample of this: they have 128-bit SSE registers that feed 64-bit ALUs. These still provide a speedup because each vector operation dispatches over two cycles and doesn’t need any extra decode or register renaming.

                                      The digression about ML is also weird. ML processors are custom because they use custom data types. This is one of the big reasons for bloat in SIMD instruction sets: even just for add, you need variants to add two vectors containing 8-, 16-, 32-, and 64-bit integers, 16-, 32-, and 64-bit IEEE floating point values. ML adds a variety of 1-, 2-, 4-, 8-, and 16-bit data types, so requires a load more instructions (at least 5 more instructions for each of your existing instructions, so even with the basic set of C operators you’re looking at a pretty large number of instructions).

                                      The article also completely ignored Arm’s Scalable Vector Extensions. These are a very interesting half-way step that provide a lot of the benefits of both. With a Cray-style vector, the CPU is responsible for defining the loop on each instruction. That makes pipelining quite difficult. Imagine a simple vector ISA with no fused multiply add. You’re doing an operation of r3 = r2 + (r1 * r0). With a cray-style vector, this will look roughly like:

                                      vload r1, ...
                                      vload r2, ...
                                      vload r3, ...
                                      vmul r3, r1, r0
                                      vadd, r3, r3, r2
                                      

                                      On a SIMD system, it will look more like:

                                      vload r1, {r1 base}, {loop induction register}
                                      vload r2, {r2 base}, {loop induction register}
                                      vload r3, {r3 base}, {loop induction register}
                                      vmul r3, r1, r0
                                      vadd, r3, r3, r2
                                      add {loop induction register}, {vector size}
                                      compare-and-branch {if we've finished}
                                      

                                      Now assume you do this on a 2048-element vector on a processor with a single 64-bit vector add unit and a single 64-bit multiply unit. You’d like to do this, per clock cycle:

                                      1. Multiply the first 64 bits.
                                      2. Add the first 64 bits to the result of 1, multiply the second 64 bits.
                                      3. Add the second 64 bits to the result of the multiply in cycle 2, multiply the third 64 bits.

                                      And so on. With a Cray-style vector unit, you now have two many-cycle instructions that are partially completed. Now what happens if you take an interrupt? The processor either needs to save all of the state associated with those partial instructions or it needs to discard a potentially large amount of work and redo it. This gets even more fun if the sequence includes a vstore that may alias with the vload.

                                      To make this even more fun in the case of the RISC-V V extension, there’s a limit to the maximum vector size, so if your data type is not fixed at compile time (e.g. if you’re multiplying two arbitrary-sized matrixes) then you need to handle the case where your vector width is larger than the maximum vector width supported by the processor.

                                      This is much easier in the SIMD version. Every operation is on an architectural register and a trap just needs to preserve the architectural register state. SIMD units typically have a lot of vector registers so that they don’t need much hidden state for good performance (the number of rename registers is only slightly larger than the number of architectural registers). Because each loop iteration has a single multiply and add, it’s trivial for the CPU to pipeline these and forward the result of the multiply into the add. The fact that you’ve only multiplied and added half of a 2048-element vector is all architectural state.

                                      Amusingly, the Arm alternative is far more in keeping with the RISC philosophy. With SVE, the CPU has a load of vector registers that are 128-2048 bits (implementation defined) and the size can be queried. The compiler then generates a SIMD-style loop that can operate on any of these vector sizes, querying a MSR to find out what the size is (this is used as the stripe size for the loop induction variable in the simplest cases). As with traditional SIMD, all state for part of a source-language vector is architectural and so the compiler can reason about it and ensure aliasing is not a problem and the OS can easily store it in trap frames.

                                      The article also ignores the elephant in the room: vector units on CPUs are not designed for hand-written vector code anymore. They’re designed for auto-vectorisation. This is why scatter-gather and predication are so important: they allow the compiler to use vector instructions for any loop that has no loop-carried dependencies (or which can be rewritten to eliminate loop-carried dependencies), irrespective of whether it operates on regular data or has a power-of-two number of iterations. Again, these instructions contribute a lot to the SIMD instruction bloat that the author complains about and are there because they make a huge difference to the amount of code that is amenable to autovectorisation.
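
                                      For a rough idea of the kind of loop this targets (a sketch in Rust, not tied to any particular ISA): no loop-carried dependency, an arbitrary trip count, and a per-element conditional that maps onto masking/predication rather than a branch per element.

                                      pub fn clamp_add(dst: &mut [f32], src: &[f32], limit: f32) {
                                          // Each iteration is independent, so an autovectoriser can process a
                                          // whole stripe of elements per instruction, using predication for the
                                          // conditional and for the ragged tail.
                                          for (d, s) in dst.iter_mut().zip(src) {
                                              let v = *d + *s;
                                              *d = if v > limit { limit } else { v };
                                          }
                                      }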

                                      Looking at the RISC-V V extension, the vsetvl instruction controls the vector length for all vector registers and so I expect code that handles mixed-length vectors (i.e. most autovectorised code) will need a lot of those. It’s really unclear to me that this has any benefits relative to SVE and it has several disadvantages.

                                      1. 5

                                        With a Cray-style vector unit, you now have two many-cycle instructions that are partially completed. Now what happens if you take an interrupt?

                                        This is rhetorical, right? You know what happens. It is specified in the RISC-V V specification, section 18.1, precise vector traps. The only state is the vector index; there is no redo. It’s not a big deal.

                                        1. 4

                                          Sure, which means that if you’re doing any forwarding you have to roll back everything after the first instruction in the sequence that you’re forwarding values between. That’s a lot of microarchitectural state that you have to keep during speculative execution (a lot more than in an equivalent SIMD execution) to roll back. That doesn’t matter in a DSP or HPC accelerator, but it does matter a lot in a CPU core.

                                          There’s a reason that the extension to AArch64 co-developed by the company responsible for most of the mobile market and the company that routinely designs the world’s top supercomputer was not a pure Cray vector architecture. You’ll note that it is, in fact, at the top of the TOP500 list now, yet also scales nicely down to mobile phone cores. Looking at the top 10 in the TOP500 list, I don’t actually see any Cray vector processors. It’s possible that everyone is missing a trick, but given that Cray used to dominate that list 20-30 years ago, I’m somewhat skeptical.

                                        2. 4

                                          I don’t know where you got your information about Cray 1 and RISC-V but it’s obviously some kind of misconception.

                                          The major difference between SVE and the others is that SVE processes data that is less than the vector register in length using predication while RISC-V and Cray have an explicit vector length register.

                                          On Cray 1 you write a loop to do C = A + B for arbitrary length vectors like this:

                                          while (n > 0){
                                              int len = n > 64 ? 64 : n;
                                              setvlen(len);
                                              vec A = vload(a_ptr);
                                              vec B = vload(b_ptr);
                                              vec C = vadd(A, B);
                                              vstore(c_ptr, C);
                                              a_ptr += len;
                                              b_ptr += len;
                                              c_ptr += len;
                                              n -= len;
                                          }
                                          

                                          If the vector is shorter than 64 elements then the last elements of the register will not be processed. If the vector is larger than the vector registers then the loop will execute multiple times. If the vector length is not a multiple of 64 then the final shorter vector will automatically be processed the same as a short vector would.

                                          You don’t need the ugly tail cleanup code (and often initial code to make the vector aligned) that SIMD needs.

                                          The major difference between Cray and RISC-V is that Cray 1 came with only one size of vector register (64 floating point values) and the programmer had to know the length (as shown above). On RISC-V the vector register length can be anything from 1 element (of the maximum element size supported) to 2^31 elements (or bits – I don’t remember right now and I think maybe it hasn’t been definitely decided yet). Certainly much more than SVE’s 128 bit to 2048 bit architectural limit.

                                          A RISC-V vector add looks like this:

                                          while (n > 0){
                                              int len = vsetvli(n, vec32i_t);
                                              vec A = vload(a_ptr);
                                              vec B = vload(b_ptr);
                                              vec C = vadd_32i(A, B);
                                              vstore(c_ptr, C);
                                              a_ptr += len;
                                              b_ptr += len;
                                              c_ptr += len;
                                              n -= len;
                                          }
                                          

                                          The RISC-V program doesn’t know or care how long the vector registers are. The hardware tells you, on each loop iteration.

                                          It seems you know that the Cray 1 processed vectors one element at a time, taking 64 cycles to process the entire vector register. You also know that due to “chaining” the hardware could load the first element of A and B in cycle 1, add those elements in cycle 2, and store them in cycle 3.

                                          RISC-V imposes no such limitation on the hardware implementation. Some implementations might process one element at a time like the Cray 1, but it’s much more likely that they would process 2 or 4 or 8 elements at a time, with or without chaining. The most common would probably be to have a quarter as many ALUs as the vector length so that it takes 4 clock cycles to process the entire vector. Other implementations might process the entire vector register in parallel in one clock cycle.

                                          The chip designer can size the vector registers based on the expected workload, how many ALUs they want to build, and the latency and bandwidth of the memory system (whether some cache level or RAM).

                                          The programmer doesn’t have to know anything about this. The code is identical for every machine, and runs optimally given the choices the CPU designer made.

                                          If the implementation executes RISC-V vector instructions in a single cycle then there is no implication for trap handling. Even if it is 2 or 4 cycles that may not be a problem – just complete the instruction. But the RISC-V Vector extension has a vstart CSR that can be used by the hardware to save the point in the instruction that it was at when an interrupt occurred, and when the interrupt returns the hardware can re-run the instruction starting from that point.

                                          It’s true that an implementation using chaining might have a bit of fun trying to save the machine state on an interrupt. If you want to make such an implementation then you could choose to take that pain, or you could say that the cores (often fairly simple minion cores) with the vector unit either don’t take interrupts at all, or have potentially quite long interrupt response, and direct most of the interrupts in the system to some other core.

                                          You say “The new instructions are not added for fun, they are added to introduce more complex operations that provide a real-world speedup for at least one workload.”

                                          The author of the article (who I admit was somewhat confused – it’s best to read the original Patterson and Waterman article, or the draft reference manual and code examples) was not talking about adding useful new instructions. He was talking about making a complete set of instructions for MMX and then a few years later throwing those away and making a complete set of SSE instructions. And then a few years later making a complete duplicate set of AVX instructions. And then AVX512.

                                          It’s little different in the ARM world with DSP instructions, SIMD extensions for Multimedia, NEON, SVE, and MVE.

                                          The RISC-V Vector extension has a single set of instructions which work the same, running the same binary, on machines with vector registers from 4 bytes up to gigabytes (if someone ever wants to build such a thing).

                                          The initial version of the RISC-V V extension has all the useful instructions that previous vector or SIMD machines had. The Working Group has members from many different companies, with experience ranging back to the CDC6600 and Cray, to more modern supercomputers and DSPs and everything in between.

                                          That doesn’t mean that more won’t be added in future, but version 1.0 draws on a long history. What is certain is that new instructions won’t be needed in future simply because the register length got doubled (again).

                                        1. 7

                                          My personal opinion is that support for ARMv6+VFPv2 should be maintained in distributions like Debian and Fedora.

                                          My personal opinion is exactly the opposite. Raspberry Pi users are imposing unreasonable burden on ARM distribution maintainers. For better or worse, the entire ecosystem standardized on ARMv7 except Raspberry Pi. The correct answer is to stop buying Raspberry Pi.

                                          1. 2

                                            I also agree with you that the correct answer is to stop buying Raspberry Pi, especially their ARMv6 products. But for most beginners in electronics, it seems like “Raspberry Pi” equals “single board computer”. They aren’t going to stop buying them.

                                            I don’t love MIPS64 or i686 either, but the reality is that the hardware exists and continues to be used. Maintainers should deal with that, IMHO.

                                            1. 3

                                              I am just tired of getting issues like https://github.com/rust-embedded/cross/issues/426. This is just a tiny corner. What a horror that this is being replicated 100x for every toolchain out there.

                                            2. 2

                                              I faced this issue directly, being as I was a distribution developer working on ARM at the time. I feel your pain.

                                              However, they made the choices they made for cost reasons, and the market has spoken. I can’t argue with that.

                                              1. 2

                                                It could be worse. At least the Pi is ARMv7TDMI. Most AArch32 software defaults to Thumb-2 now and the Pi is just new enough to support it. I maintain some Arm assembly code that has two variations, ARMv6T2 and newer, everything else. I can probably throw away the older ones now, they were added because an ultra low-budget handset maker shipped ARMv5 Android devices and got a huge market share in India or China about 10 years ago and a user of my library really, really cared about those users.

                                                1. 1

                                                  shipped ARMv5 Android devices and got a huge market share in India or China

                                                  Interesting, do you know which phone models? The oldest Android phones I could find are ARMv6.

                                                  1. 1

                                                    No idea, sorry. I never saw them, I just got the bug reports. Apparently they’re all gone (broken / unsupported) now. It was always a configuration that Google said was unsupported, but one handset manufacturer had a custom AOSP port that broke the rules (I think they also had their own app store).

                                              1. 6

                                                This is a part of why I really love Common Lisp. Many of its libraries (like bordeaux-threads, the de facto threading library, and usocket, the BSD sockets thing) were last updated something like 10-20 years ago, and they still just work.

                                                This kind of ecosystem stability is refreshing when compared to a language like Rust, where all my code became unidiomatic / dependent on now-stale libraries within half a year… (!)

                                                1. 3

                                                But all your Rust code still works, doesn’t it? So it must be that you love Common Lisp for some other reason: not that bordeaux-threads still works, but that bordeaux-threads is still idiomatic.

                                                1. 4

                                                  For an update, see On the Information Bottleneck Theory of Deep Learning.

                                                  The article itself asks:

                                                  It remains to be seen whether the information bottleneck governs all deep-learning regimes, or whether there are other routes to generalization besides compression.

                                                  The paper I linked answers:

                                                  Moreover, we find that there is no evident causal connection between compression and generalization: networks that do not compress are still capable of generalization, and vice versa.

                                                  I think it was a great idea. But the evidence suggests that it’s just not true.

                                                  1. 2

                                                    My intuition has been that discovering sparse representations is usually necessary for any kind of generalization – a model learning speech-to-text, for instance, will necessarily have somewhere inside it an understanding of the individual vowel/consonant sounds and utterances which are then building blocks for generating text.

                                                    “Compression” ~= “sparse representation”, right? So the paper refutes that idea?

                                                    1. 1

                                                      thank you kindly for the link ! having cursorily looked at it and the arguments raised by tishby et al, it seems that information bottleneck might still be relevant…

                                                      1. 1

                                                        Why do you think information bottleneck might still be relevant? I am curious. (I consider the theory mostly failed at this point.)

                                                        1. 2

                                                          In that link @sanxiyn posts, there seems to be a very vigorous back and forth between Tishby et al. (the IB theory crew) and the article criticizing IB (EDIT: with neither side conceding defeat). The program committee accepting the paper to the conference may only mean they thought it worthy of a much broader discussion in the community than their review process.

                                                          Since that was 2 years ago, perhaps other papers or discussion have taken place in the understanding of IB or its critique. I think the link itself and publication is non-conclusive, even of community opinion, never mind the fact of the matter.

                                                          One kind of “obvious” point about “compression” and “generalization” is that they are almost semantically the same. To find a well-generalizing representation means to have some representation that has been “properly de-noised”. Representing noise takes up space (probably a lot, usually, but that is problem-specific). This applies to all fitting, from multi-linear regression on up, and maybe to all “induction”. (The transduction case, more analogous to “interpolation”, is different.)

                                                          That is just one piece of the puzzle, of course, and there may be controversy over how to define/assess “compression” (e.g. a neural net with a bunch of near-zero weights may take up computer memory, but be the same as one without those weights at all), and also controversy over specific quantitative relationships between compression, however assessed, and out-of-sample error rates.

                                                          TL;DR - I think @sanxiyn has more work to do in order to establish “mostly failed” or “debunked” status.

                                                          1. 2

                                                            @cblake, i don’t think i could have said it better than you did. thank you !

                                                            1. 2

                                                              You’re welcome. The Wikipedia entry on the Information Bottleneck Method covers some of this controversy in the “Information theory of deep learning” section (today’s version; future folk may someday have to go back to that in the wiki history). They also have more references.

                                                      1. 1

                                                        How do you use a rust crate from a zig program?

                                                        1. 5

                                                        In this case, the Rust crate in question (wgpu) provides a C API, and Zig consumes that. The C API header is autogenerated from the Rust source.
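
                                                        Roughly, the pattern on the Rust side looks like the sketch below (names are hypothetical, not wgpu’s actual API); a tool can then generate the matching C header from the Rust source, and Zig imports that header like any other C library.

                                                        // Expose a C ABI that any language with C interop can call.
                                                        #[repr(C)]
                                                        pub struct Extent {
                                                            pub width: u32,
                                                            pub height: u32,
                                                        }

                                                        #[no_mangle]
                                                        pub extern "C" fn extent_area(e: Extent) -> u64 {
                                                            u64::from(e.width) * u64::from(e.height)
                                                        }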

                                                        1. 2

                                                          Why use WebGPU and not WSL?

                                                          1. 3

                                                            Well, GLSL is here. WSL (now WGSL) is work in progress.

                                                          1. 2

                                                            Ok but why this and not - say - the various ML-alike-for-web languages?

                                                            1. 2

                                                              Which one do you have in mind? If you specify one, I will compare.

                                                              1. 2
                                                            1. 34

                                                              If there are any questions or remarks, I am right here!

                                                              1. 15

                                                                I wish I could invite this story multiple times. The perfect combination of being approachable, while still being packed with (to me) new information. Readable without ever being condescending.

                                                                One thing I learned was that DNA printers are a thing nowadays. I had no idea. Are these likely to be used in any way by amateur hackers, in the sense that home fusion kits are fun and educational, while never being useful as an actual energy source?

                                                                1. 14

                                                                    So you can actually paste a bit of DNA on a website and they’ll print it for you. They ship it out by mail in a vial. Where it breaks down is that before you inject anything into a human being, you need to be super duper extra totally careful. And that doesn’t come from the home printer. It needs labs with skilled technicians.

                                                                  1. 7

                                                                    Could any regular person make themselves completely fluorescent using this method? Asking for a friend.

                                                                  2. 5

                                                                    You may be interested in this video: https://www.youtube.com/watch?v=2hf9yN-oBV4 Someone modified the DNA of some yeast to produce spider silk. The whole thing is super interesting (if slightly nightmarish at times if you’re not a fan of spiders).

                                                                    1. 1

                                                                      So that’s going to be the next bioapocalypse then. Autofermentation but where as well as getting drunk, you also poop spider silk.

                                                                  3. 8

                                                                    Love the article. Well done.

                                                                    1. 5

                                                                      Thanks for the awesome article! Are there any specific textbooks or courses you’d recommend to build context on this?

                                                                      1. 12

                                                                        Not really - I own a small stack of biology books that all cover DNA, but they cover it as part of molecular biology, which is a huge field. At first I was frustrated about this, but DNA is not a standalone thing. You do have to get the biology as well. If you want to get one book, it would have to be the epic Molecular Biology of the Cell. It is pure awesome.

                                                                        1. 2

                                                                          You can start with molecular biology and then a quick study of bio-informatics should be enough to get you started.

                                                                          If you need a book, I propose this one, it is very well written IMO and covers all this stuff.

                                                                        2. 2

                                                                          Great article! I just have one question. I am curious why this current mRNA vaccine requires two “payloads” ? Is this because it’s so new and we haven’t perfected a single shot or some other reason?

                                                                          1. 2

                                                                            It’s just the way two current mRNA vaccines were formulated, but trials showed that a single shot also works. We now know that two shots are not required.

                                                                            1. 2

                                                                              The creators of the vaccine say it differently here: https://overcast.fm/+m_rp4MLQ0 If I remember correctly, they claim that one shot protects you but doesn’t prevent you from being infectious, while the second makes sure that you don’t infect others.

                                                                            2. 2

                                                                              As I understand it[1] a shot of mRNA is like a blast of UDP messages from the Ethernet port — they’re ephemeral and at-most-once delivery. The messages themselves don’t get replicated, but the learnt immune response does permeate the rest of the body. The second blast of messages (1) ensures that the messages weren’t missed and (2) acts as a “second training seminar”, refreshing the immune system’s memory.

                                                                              [1] I’m just going off @ahu’s other blogs that I’ve read in the last 24 hours and other tidbits I’ve picked up over the last 2 weeks, so this explanation is probably wrong.

                                                                              1. 1

                                                                                Not an expert either, but I think this is linked to the immune system response: like with some other vaccines, the system starts to forget, so you need to remind it what the threat was.

                                                                              2. 1

                                                                                I enjoyed the article, reminded me of my days at the university :-)

                                                                                So here are some quick questions in case you have an answer:

                                                                                • Where does the body store info about which proteins are acceptable vs not?
                                                                                • How many records can we store there?
                                                                                • Are records indexed?
                                                                            • How does every cell in the body get this info?
                                                                                1. 12

                                                                                  It is called negative selection. It works like this:

                                                                                  1. Body creates lots of white blood cells by random combination. Each cell has random binding sites binding to specific proteins and will attack them.
                                                                              2. Newly created white blood cells are set loose in a staging area, which is presumed to be free of threats. All cells triggering an alarm in the staging area kill themselves.
                                                                              3. White blood cells, negatively selected not to react to self, mature and are released to production.
                                                                                  1. 1

                                                                                    Interesting, thanks for sharing!

                                                                                  2. 5

                                                                                    How does info spread through the body

                                                                                    I came across this page relatively recently and it really blew my mind.

                                                                                    glucose is cruising around a cell at about 250 miles per hour

                                                                                    The reason that binding sites touch one another so frequently is that everything is moving extremely quickly.

                                                                                    Rather than bringing things together by design, the body can rely on high-speed stochastic events to find solutions.

                                                                                    This seems related, to me, to sanxiyn’s post pointing out ‘random combination’ - the body:

                                                                                    • Produces immune cells which each attack a different, random shape.
                                                                                    • Destroys those which attack bodily tissues.
                                                                                    • Later, makes copies of any which turn out to attack something that was present.

                                                                                    This constant, high-speed process can still take a day or two to come up with a shape that’ll attack whatever cold you’ve caught this week - but once it does, that shape will be copied all over the place.

                                                                                    1. 2

                                                                                      I did some projects in grad school with simulating the immune system to model disease. Honestly we never got great results because a lot of the key parameters are basically unknown or poorly characterized, so you can get any answer you want by tweaking them. Overall it’s less well understood than genetics, because you can’t study the immune system in a petri dish. It’s completely fascinating stuff though: evolution built a far better antivirus system for organisms than we could ever build for computers.

                                                                                    2. 1

                                                                              Is there any information on pseudouridine and tests on viruses incorporating it in their DNA?

                                                                                      The one reference in your post said that there is no machinery in cells to produce it, but the wiki page on it says that it is used extensively in the cell outside of the nucleus.

                                                                                      It seems incredibly foolhardy to send out billions of doses of the vaccine without running extensive tests since naively any virus that mutated to use it would make any disease we have encountered so far seem benign.

                                                                                      1. 1

                                                                                        From https://en.wikipedia.org/wiki/Pseudouridine#Pseudouridine_synthase_proteins:

                                                                                        Pseudouridine are RNA modifications that are done post-transcription, so after the RNA is formed.

                                                                                        That seems to mean (to me, who is not a biologist) that a virus would have to grow the ability to do/induce such a post-processing step. Merely adding Ψ to sequences doesn’t provide a virus with a template to accelerate such a mutation.

                                                                                        1. 1

                                                                                          And were this merely a nuclear reactor or adding cyanide to drinking water, I’d agree. But ‘I’m sure it will be fine, bro’ is how we started a few hundred environmental disasters that make Chernobyl look mild by comparison.

                                                                                          ‘We don’t have any evidence because it’s obvious so we didn’t look’ does not fill me with confidence given our track record with biology to date.

                                                                                          Something like pumping rats full of pseudouridine up to their gills, then infecting them with rat HIV for a few dozen generations and measuring whether any of the virus starts incorporating pseudouridine into its RNA, would be the minimum study I’d start considering as proof that this is not something that can happen in the wild.

                                                                                          1. 2

                                                                                            As I mentioned, I’m not a biologist. For all I know they did that experiment years ago already. Since multiple laymen on this forum came up with that concern within a few minutes of reading the article, I fully expect biologists to be aware of the issue, too.

                                                                                            That said, in a way we have that experiment already going on continuously: quickly evolving viruses (such as influenza) have been messing with the human body for generations. Apparently they encountered pseudouridine regularly (and were probably at times exposed to PUS1-5 and friends that might have accidentally swapped out a U for a Ψ in a virus), but still didn’t incorporate it into their structure despite the presumed improvement to their fitness (while eventually leading our immune system to develop a response to it).

                                                                                            Which leads me to the conclusion that

                                                                                            1. I’d have to dig much deeper to figure out a comprehensive answer, or
                                                                                            2. I’ll assume that there’s something in RNA processing that makes it practically impossible for viruses to adopt that “how to evade the immune system” hack on a large scale.

                                                                                            Due to lack of time (and a list of things I want to do that already spans 2 or 3 lifetimes) I’ll stick to 2.

                                                                                      1. 4

                                                                                        Native : … Both the user and the developer should feel at home on each platform. The look and feel and experience should match the users’ expectations of a native application.

                                                                                        … and yet, although the screenshots have macOS traffic light buttons, the UI look and feel is nothing like a native Mac app. Am I misunderstanding this goal?

                                                                                        1. 6

                                                                                          No, you aren’t misunderstanding. It’s a work in progress, that’s why.

                                                                                          1. 3

                                                                                            Thanks for being honest about that. It’s a pretty cool concept, however!

                                                                                        1. 5

                                                                                          GPLv3 GUI toolkit from people who built Qt. They have a demo compiled to WebAssembly here.

                                                                                          1. 2

                                                                                              It will all be fun and games until they have market share; then the dual licence becomes an issue because they don’t make any money…

                                                                                            1. 3

                                                                                                I assume you’re mostly referring to Qt’s problems pushing their licences. Only time will tell how SixtyFPS fares, but there are some differences. There’s no LGPL offering, only GPL. This presumably means that commercial users who don’t want to share their code will have to buy a commercial licence. This may avoid the Qt situation, where they amass a large number of non-paying corporate LGPL users and then go around threatening them in the hope that they’ll buy a licence they don’t need, because they’re already LGPL-compliant. I also believe that in many ways Qt themselves are responsible for their failure to sell licences, rather than the dual-licence model itself.

                                                                                              When I first saw the headline about a new GUI toolkit, I thought, “Another one!?”. I think it’s clear that the current state of GUI toolkits is not ideal, so we have recently been seeing a number of attempts to do it better. However, it’s equally clear that Qt is a massive beast, almost unassailable if you want to compete on most of its features rather than offering a limited subset of platforms or use cases.

                                                                                              However, now that I’ve read a little bit about this one and seen who’s behind it, I’m cautiously optimistic about it. It looks like they are trying to keep some of Qt’s strengths, such as the declarative language for describing GUIs, while avoiding some of its downsides, such as the use of dynamically-typed, interpreted, javascript in the frontend code and the garbage collector.

                                                                                                The choice of Rust as the main implementation language, with the option to use Rust, C++, or javascript to write applications, seems good. Qt has had a long and not always easy relationship with C++. Early versions of Qt tried to fill in some of the gaps in C++, but at times got carried away, with all manner of qThis, qThat and qTheOther types implementing Qt’s take on C++. While this was eventually reined in, and current Qt is largely compatible with std C++ containers and algorithms, the Qt API still uses raw pointers and implicitly-shared containers extensively, meaning that despite C++ ostensibly being Qt’s ‘native’ language, an existing user of ‘modern C++’ may not find Qt to their taste.

                                                                                              I’ll definitely be giving this one a try later…

                                                                                              1. 2

                                                                                                the declarative language for describing GUIs, while avoiding some of its downsides, such as the use of dynamically-typed, interpreted, javascript in the frontend code and the garbage collector.

                                                                                                Fantastic observation. I didn’t think of this similarity or this difference.

                                                                                          1. 4

                                                                                            A good explanation of how Git works, but that’s a design bug in Git. Commits should be diffs; the fact that they aren’t is exactly why this is confusing.

                                                                                            1. 4

                                                                                              Why?

                                                                                              1. 3

                                                                                                It makes cherry picking and rebasing easy, and you don’t need to write articles like “commits are snapshots, not diffs”.

                                                                                                1. 11

                                                                                                  I don’t see how it makes those easy. Calculating a diff from snapshots (as we do now) is trivial; the hard part of cherry-picking and rebasing is resolving conflicts when the context for the resulting diff isn’t the same anymore. Storing diffs instead of snapshots doesn’t make that problem go away.

                                                                                                  Furthermore, the patch format is lousy at storing some changes like file moves or deletions or mode changes. A more efficient patch format might manage that better.
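
                                                                                                  To make the snapshot model concrete, here is a rough illustration using plumbing commands (any repository with at least two commits will do):

                                                                                                  git cat-file -p HEAD  # a commit object: a tree id (the snapshot), parent(s), author, message
                                                                                                  git cat-file -p 'HEAD^{tree}'  # the directory listing that commit points at
                                                                                                  git diff HEAD~1 HEAD  # the diff is computed on demand from two snapshots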

                                                                                                  1. 2

                                                                                                    It does make cherry picking easy, because commits modeled as diffs keep their identity. In Git, the exact same commit gets different identifiers when you cherry pick it from branch A to branch B. This is the cause of a large number of problems.

                                                                                                    https://pijul.org/faq explains the issue in more detail.
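
                                                                                                    A minimal illustration of the identity problem (<sha-from-A> is a placeholder for whichever commit you pick, and a branch named B is assumed to exist):

                                                                                                    git checkout B  # switch to the target branch
                                                                                                    git cherry-pick <sha-from-A>  # replays the change as a brand-new commit object
                                                                                                    git rev-parse HEAD  # prints a hash different from <sha-from-A>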

                                                                                                    1. 1

                                                                                                      I could not extract a problem from that web page that explains what you mean. Can you try summarising here? Can you give a practical example of the problem?

                                                                                                      1. 4
                                                                                                        • 3-way merge doesn’t always behave “as expected”: there is an example on https://pijul.org/manual/why_pijul.html where the result (as in, the contents of the files) of a Git merge depends on whether you merge two commits one at a time, or just the last one (i.e. both diffs together).
                                                                                                        • Conflicts are not modeled at all in Git, and Git has git rerere to work around that (a minimal sketch below). Patches, by contrast, make handling conflicts intuitive.
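
                                                                                                        For the curious, git rerere is opt-in and only replays conflict resolutions it has already recorded once; roughly (branch names are placeholders):

                                                                                                        git config rerere.enabled true  # start recording how conflicts get resolved
                                                                                                        git merge topic  # hit a conflict and resolve it by hand once
                                                                                                        git rebase main  # when the same conflict recurs, the recorded resolution is reused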
                                                                                                2. 3

                                                                                                  Because it just makes sense given the operations we perform. Commits are patches we use to take the code from version x to version y; that’s why we can rebase and mail patches around. They’re patches, which is like a diff seen from the input/active side, while a diff can also be the passive/output/observation view.
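
                                                                                                  The “mail patches around” workflow makes that active view very literal, roughly:

                                                                                                  git format-patch -1 HEAD  # write the last commit out as an e-mailable patch file
                                                                                                  git am 0001-*.patch  # the recipient applies it as a new commit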

                                                                                                3. 4

                                                                                                  If you want diffs, you should look into old VCSes. SVN and CVS both stored diffs. (troll intended)

                                                                                                1. 2

                                                                                                  This sounds reasonable, but it is surprising to me that they make this kind of change in a minor release.

                                                                                                  1. 4

                                                                                                    It is surprising, but the toolchain is explicitly excluded from the Go compatibility promise. See https://golang.org/doc/go1compat.

                                                                                                    1. 5

                                                                                                      The culture around Go’s tooling confuses me. On the one hand, the project obviously realizes that tooling is important for the language. On the other hand, any complaint about the conventions the tooling forces on you, or about the tooling itself, is met with an “it’s not part of the language, it’s just tooling, you don’t need to use it”. And evidently, the project doesn’t see tooling as something people should in any way rely on, considering it changes in major, backwards-incompatible ways between minor releases.

                                                                                                      I don’t get it.

                                                                                                      1. 1

                                                                                                        You could just use GCC for compiling Go:

                                                                                                        gccgo -O2 main.go -o main
                                                                                                        

                                                                                                        It makes sense that the language and tooling are separate, to avoid the “one true compiler” issue.
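
                                                                                                        For comparison, the equivalent build with the standard toolchain would be something like:

                                                                                                        go build -o main main.go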