Threads for lemon

    1. 4

      Lots of dunking on Nintendo in this thread, when it’s pretty clear that Steam was the entity refusing to host the Dolphin emulator unless the developers got a go-ahead from Nintendo.

      1. 12
        • Dolphin: “Hey Valve, can we publish this on Steam?”
        • Valve: “Grey area, let us check. Hey Nintendo, can they publish this on Steam?”
        • Nintendo: “If they do we will do everything we can to make your life a living hell.”
        • Valve: “Gotcha. Hey Dolphin, sorry, this is what Nintendo said and we don’t want to deal with it.”

        Right, and Valve is the one being unreasonable.

      2. 7

        Valve (Steam) inquired whether the completely legal emulator could be distributed on the Steam store-front, aware of Nintendo’s nature. Nintendo informed Valve that they consider it a violation of the DMCA. The DMCA recently caused a lot of trouble for Valve when a very popular game, Dark & Darker, was taken down by Korean game publisher NEXON for copyright infringement, which earned Valve a ton of negative PR and community backlash. I figure Valve’s legal team contacting Nintendo is defensive behaviour with respect to another copyright-induced backlash from the community. They have a lot of goodwill with gamers and take steps like this to protect their reputation.

        1. 5

          I’d argue that partnering with Dolphin to defend against Nintendo in court would bring Valve an enormous amount of good PR.

          I’m disappointed that they chickened out. I understand why, and don’t really blame them for it - they are a business after all - but I am disappointed and my respect/goodwill for them has gone down a bit.

      3. 5

        Valve’s position is not that unreasonable when you consider that they would not want to jeopardize their business relationship with Nintendo (

    2. 1

      The immediate operand encoding is especially terrible in the compressed instruction set, where there are at least 10 different bit-shuffling formats used by different instructions.

    3. 5

      I would advise against using any of the is* functions from ctype.h. Aside from the locale issues, the standard says that the int argument shall be either EOF or a nonnegative integer representable as unsigned char, and anything else means undefined behavior (it’s like they’re designed to work with the return value of getchar and co.). This means that if you’re trying to classify ASCII input from a string (a char *), doing the obvious e.g. isalnum(str[i]) can invoke UB on platforms where char is signed. Sane implementations should handle this safely, and while glibc does handle the negative range for compatibility with old software, it will segfault (due to an unchecked lookup in a table) if the input is greater than 255 (for example if you’re passing Unicode codepoints as input)… I think these functions are too error-prone to rely on, and for ASCII I always roll my own inline (they’re all one-liners) or copy them from musl.

    4. 6

      If I understand correctly, this is about avoiding common pitfalls with dynamic types in Python, and making code more robust, by adding types everywhere. Where it differs from Rust is that even with all these types, the Python interpreter will still happily run whatever it is given (while the Rust compiler will complain loudly). So one still needs to rely heavily on mypy and pyright.

      1. 10

        It’s not only about type annotations. Even in a fully dynamically typed unannotated language, code might be more or less prone to runtime type errors. A good example here is null-safety.

        Python is not null (None) safe — it occasionally happens that you try to call a method on something, and something turns out to be None.

        Python is more null-safe than Java. Java APIs tend to happily return null, while Python in general prefers to throw. For example, looking up a non-existing element in a dictionary returns null in Java, and throws KeyError in Python. Python behavior prevents silent propagation of nulls, and makes the bugs jump out.

        Erlang is null-safe. It is dynamically typed, but, e.g., map lookup functions return an {ok, Value} tuple on success and the error atom on failure. This forces the call-site to unpack an optional even in the happy case, signaling the possibility of nulls.

        1. 9

          Python and Java are safe in a type-safety sense with respect to None/null. The behavior is well defined (raise/throw an exception). Your complaint is about API design, not language design.

          For contrast, in C/C++ it is not defined what happens if you dereference a NULL pointer, and that makes it unsafe. The NULL constant might be something other than 0x0. Dereferencing it could return a value, kill the process with a segmentation fault, or do something else entirely.

          1. 6

            Yes, this is mostly a question of API design, rather than a question of language design (though, to make Erlang API convenient and robust to use, you need to have symbols and pattern-matching in the language).

            Whether null-unsafety leads to UB is an orthogonal question.

          2. 1

            The NULL constant might be something other than 0x0.


            The macro NULL is an implementation-defined null pointer constant, which may be

            • an integer constant expression with the value 0
            • an integer constant expression with the value 0 cast to the type void*
            • predefined constant nullptr (since C23)


            nullptr_t has only one valid value, i.e., nullptr. The object representation of nullptr is the same as that of (void*)0.

            1. 3

              The integer literal 0 when used as a pointer is a valid way to specify the null pointer constant, but it doesn’t mean the representation of null pointers is all-zero bits.


              A zero-initialized pointer is the null pointer value of its type, even if the value of the null pointer is not integral zero.


              The calloc function

              #include <stdlib.h>
              void *calloc(size_t nmemb, size_t size);

              The calloc function allocates space for an array of nmemb objects, each of whose size is size. The space is initialized to all bits zero.[261]

              The calloc function returns either a null pointer or a pointer to the allocated space.

              [261] Note that this need not be the same as the representation of floating-point zero or a null pointer constant.
              1. 2

                Everything that you say is true with respect to the de jure standard. The de facto standard is somewhat different and a platform where null is not 0 will break a lot of assumptions. In particular, C’s default zero initialisation of globals makes it deeply uncomfortable for null to not be a zero bit pattern.

                For CHERI C, we considered making the canonical representation a tagged zero value (i.e., a pointer that carries no rights). This has some nice properties, specifically the ability to differentiate null from zero in intptr_t and a cleaner separation of the integer and pointer spaces. We found that this was a far too invasive change to make in a C/C++ implementation that wanted to be able to handle other values.

                Similarly, AMD GPUs have the problem that their stack starts at address 0, so 0 is a valid non-null pointer. I proposed that they lower integer to pointer and pointer to integer casts as adding and subtracting one. This would let their null pointer representation be -1. This broke a lot of assumptions even in a fairly constrained accelerator system.

                In particular, C has functions like memset that write bytes. C programmers expect to be able to fill arrays with nulls by using memset. They expect to be able to read addresses by aliasing integers and pointers. They expect to be able to stick objects in BSS and find that they are full of nulls. They expect the return from calloc to be full of nulls, even if the standard does not require it.

                My favourite bit of the standard’s description of null is that a constant expression 0 cast to a pointer may compare not equal to a dynamic value of zero converted to null. If a and b are integers and a is constant, a can compare equal to b, but a cast to a pointer would compare not equal to b cast to the same pointer type. That’s such a bizarre condition that I doubt anyone writing C has ever considered whether their code is correct if it can occur.

                In general, I am happy that WG14 and WG21 try hard to avoid leaking implementation details into the language, but there are two places where I think that ship has sailed. Bytes (char units) are octets, and null is represented by zero. So much C/C++ code assumes these two things that any implementation that changes them may, technically, be C or C++, but it won’t run more than a tiny fraction of code written in either language.

                1. 1

                  Oh, I agree that de facto C code almost universally assumes the null pointer representation to be zero when initializing data with memset or calloc, for better or worse (my C code does too). And in practice assuming nulls are zero is rather convenient. Though machines with non-zero null pointers do (or did) exist (

                  In particular, C’s default zero initialisation of globals makes it deeply uncomfortable for null to not be a zero bit pattern.

                  Pointers are initialized to the null pattern in the default zero initialization, so I’m not sure what you mean. Are you saying that it is inconvenient for the null representation to be non-zero because then the compiler couldn’t put the zero-initialized global in .bss? I agree that’s a good point in favor of having a zero null in practice, but it wouldn’t be so dire: an implementation could put the global data in .bss to be zero-initialized by the loader and then manually initialize the pointer fields, for example with a special pseudo-relocation. Though I don’t know if there’s any implementation that does anything like that, and it doesn’t solve the problem for calloc/memset.

                  1. 1

                    Pointers are initialized to the null pattern in the default zero initialization, so I’m not sure what you mean. Are you saying that it is inconvenient for null representation to be non-zero because then the compiler couldn’t put the zero-initialized global in .bss?

                    Yes, and the mid-level optimisers in most compilers really, really like that assumption. If they need to fill in another bit pattern, that causes problems. More importantly, it’s surprisingly common to put huge data structures in BSS. Every platform except Windows was very happy for snmalloc to put a 256 GiB structure full of pointers in BSS and then have the OS lazily allocate pages as needed. If null were non-zero, then suddenly that would require a 256 GiB binary. We are quite an extreme example of this but I’ve seen variants of it elsewhere.

                    implementation could put the global data in .bss to be zero-initialized by the loader and then manually initialize the pointer fields, for example with a special pseudo-relocation

                    That’s basically what I did for early versions of CHERI/Clang and it has some really bad pathological cases in some common codebases.

                    1. 1

                      This reminded me of MM_BAD_POINTER. I wonder whether its marginal utility is practically zero, or even negative due to the .bss considerations, now that NTVDM is rather long in the tooth. That’s the main reason I’m aware of for having the null page be dereferencable (the VM86 IVT was situated there). Even then, who knows, with all the speculative execution mitigations, I’ve lost track of whether a 32-bit Windows driver running code on behalf of an NTVDM process would actually be null-dereference-exploitable.

    5. 3

      How do people deal with comments when combining lexing and parsing? An upfront pass to strip comments?

      1. 4

        While the other replies cover the standard approach (treating it as whitespace), that’s not always what you want, especially if you’re building a “blessed” parser for a language since a lot of use cases want to understand your comments (for example formatters which need to preserve comments and LSPs which might use documentation comments).

        You can check what rust-analyzer and black do for references.

      2. 2

        In my parser, every lexeme stores its position in the input text. (This can be done as a separate array.) Then a comment can just be a single COMMENT token; to get the content of the comment (if you need it for some reason, like doc generation), just grab the position and look up the original text.

        (Though most parse rules treat it as equivalent to whitespace, and ignore it.)

      3. 2

        Recognize them as a kind of whitespace, and let them get handled by the rule (which might be explicit or implicit) that ignores whitespace between significant tokens.

      4. 1

        You treat it as whitespace; for example given functions like this:

        static void eatspaces(void) { // consume whitespace and comments from input stream
            for (;;) {
                int c = peekchr();
                if (c == '#') { // comment: skip to end of line
                    while ((c = peekchr()) != '\n' && c != EOF) nextchr();
                    // you could match multiline comments, even nested ones, too
                } else if (!aisspace(c)) {
                    return; // next significant character reached
                }
                nextchr();
            }
        }
        static bool match_nospc(int c) {
            eatspaces();
            return match(c);
        }
        static void expect_nospc(int c) {
            eatspaces();
            expect(c); // assumes an expect() helper alongside match()
        }

        and the parser:

        static Expr primary() {
            int c = nextchr();
            if (aisdigit(c)) { // number literal
                ...
            } else if (c == '(') { // '(' EXPR ')'
                Expr ex = expression();
                expect_nospc(')');
                return ex;
            } else ...
        }
        static Expr unary() {
            if (match_nospc('-')) { // '-' UNARY
                return makeunaryexpr('-', unary());
            } else if ... {
                ...
            } else {
                return primary();
            }
        }

        I wouldn’t recommend doing things this way (unified lexer and parser), it can get pretty messy.

    6. 9

      Note that this TAS was later improved by 6 frames on the Japanese version: Since the game only starts polling inputs after 10 frames, that leaves only 3 actual frames of input!

      1. 1

        I bet if they wiggled the cart a certain way they could do 0 frame game completion :^) (Such that the initial instructions are modified to instantly load the ending sequence)

    7. 2

      I don’t know LLVM’s register allocator, nor do I know GCC’s all that well, but this kind of register allocation is very specific. Straight line code for an entire function with a lot of computation isn’t all that common. Specializing the register allocator to deal with this is probably not worth the time.

      Also, it wasn’t clear to me that the comparison with Clang/LLVM and GCC is all that fair, because the SSRA doesn’t appear to be taking the calling convention into account. In the example:

      r12 = INPUT0()
      r2 = MUL_IMM(r12)

      if MUL_IMM is a function, then the argument has to be passed in r0. I didn’t see anything about that in the description of the algorithm or the code. Maybe I missed it.

      If MUL_IMM is not a function, then the comparison is not a fair one for the same reason: LLVM and GCC are bound by the calling convention of the ABI.

      1. 3

        Since the author mentions LuaJIT uses reverse linear scan too, I guess the implication is that the same concept can be extended to support more complicated control flow as well. The linked article does not go into a lot of detail, but it says this:

        In LuaJIT, the IR maintains its SSA form - there is only a single definition of a value. This means that when iterating in reverse order, computing the live range becomes trivial. When we first encounter a use of a value, then by definition that is the last use. And when we encounter a definition, that is the only and single definition, and we can release the register. So there’s no need to compute the live range in advance of allocation.

        Furthermore, rather than merging the values of multiple branches into the same live range, each value on either side becomes an individual live range. As a result, the live range of a value never has a hole, further simplifying code.

        1. 4

          LuaJIT is a tracing JIT, so its compiled traces don’t have control flow either.

      2. 2

        My understanding is that there are no calls here. It is for code without any jumps or branches. Calls are a kind of jump.

        Yes, I agree it is a quite specific case.

        1. 3

          My understanding is that there are no calls here. It is for code without any jumps or branches. Calls are a kind of jump.

          That was my understanding as well, in which case the comparison to LLVM and GCC is invalid since that is nothing but jumps.