1. 3

    I support standardizing -fno-strict-overflow and -fno-strict-aliasing in ISO C. This is basically the status quo, and standardizing existing practice is usually a good idea. On the other hand, I am pretty skeptical about proposals along the lines of platform-specific UB. I have yet to see any good proposal, and most can be described as a compiler users’ wishlist with no connection to compiler-implementer reality.

    1.  

      I disagree with standardizing the no-overflow/no-strict-aliasing flags. Using these options is not standard practice (it may be reasonably common practice, but that’s not the same thing). Supporting these options (or equivalents) is pretty standard in compilers, but the standard already allows for that (it doesn’t mandate it).

      The point of standardising the language is so that it is clear what the semantics are, and what is and what is not allowed, so you can have some assurance about the behaviour of code even when compiled using different compilers. That assurance is significantly reduced if you now need to know the specific variant of the language the code is written in. I can foresee problems arising where “no-strict-overflow, no-strict-aliasing” code would be unwittingly used in the wrong (strict-overflow, strict-aliasing) context and be broken as a result. It would arguably be better to not have these options at all, since their presence leads to their use, and their use allows what is fundamentally incorrect code to be written and used. I would much rather see standardised consistent solutions that would be embodied within the source: special “no strict overflow” integer types (or type attributes), “may alias all” pointer types, and so on. And we sorely need simple, consistent and standard ways to safely check for overflow before it happens (such as what GCC provides, but which of course is not standard: https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html).
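
      For reference, a check with those builtins looks roughly like this (GCC/Clang-specific, as noted; a minimal sketch):

      #include <limits.h>
      #include <stdio.h>
      
      int main(void) {
          int a = INT_MAX, b = 1, sum;
          /* Evaluates a + b, stores the (possibly wrapped) result in sum,
             and returns true if the mathematical result overflowed. */
          if (__builtin_add_overflow(a, b, &sum))
              puts("overflow detected before it could bite");
          else
              printf("sum = %d\n", sum);
      }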

      1.  

        For implementation strategy, I think a standardized pragma would be best because, as you pointed out, a flag risks use in the wrong context.
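
        To illustrate (entirely hypothetical syntax, loosely modeled on the existing #pragma STDC FENV_ACCESS directive; no such pragma exists today):

        /* Hypothetical: the dialect choice lives in the source file
           instead of the build system. */
        #pragma STDC WRAP_ON_OVERFLOW ON
        
        int next(int x) {
            return x + 1;   /* would wrap at INT_MAX instead of being UB */
        }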

        what is fundamentally incorrect code to be written and used

        This is… a non sequitur? Incorrect according to whom? It is only incorrect according to the current standard, which is immaterial since the proposal is precisely to update the standard. This is like replying to “software should not be patentable” with “infringing a software patent is illegal”: both true and useless. This is not about being correct or incorrect. -fno-strict-overflow and -fno-strict-aliasing are useful. The Linux kernel uses them!

        “no strict overflow” integer types, “no strict aliasing” pointer types, overflow checking builtins

        All good ideas, but this is also a perfect embodiment of “perfect is the enemy of good”. These are not existing practices (okay, except the GCC overflow-checking builtins; I support standardizing them yesterday), unlike the -fno-strict-overflow and -fno-strict-aliasing flags. Considerable design work needs to be done, and a prototype needs to be field-tested. We should start that today, but standardization is far off.

        1.  

          This is… a non sequitur?

          No, I don’t think so.

          Incorrect according to whom? It is only incorrect according to the current standard

          Exactly.

          which is immaterial since the proposal is precisely to update the standard

          That’s the proposal you are making, but not one I agree with, for the specific reason that I don’t want “standard C” to actually be multiple different languages. The existence of these options (outside the standard) and the fact that they lead to multiple variants of the language (standardised or not) is very relevant, not immaterial at all.

          This is like replying to “software should not be patentable” with “infringing a software patent is illegal”

          The argument is “these options should not be standardised, because they cause problems”.

          1.  

            I understand concerns about fragmenting the language, but my view is that this battle is already lost. The Linux kernel exists, and C is already fragmented in practice.

            I am not proposing this, but one way to solve fragmentation is to standardize the -fno-strict-overflow and -fno-strict-aliasing behavior, without any option. If you want the lost optimizations back, you can add flags yourself, exactly as you can add -ffast-math now.

            1.  

              I understand concerns about fragmenting the language, but my view is that this battle is already lost. The Linux kernel exists, and C is already fragmented in practice.

              I agree it’s already a problem. I think the underlying causes for this should be addressed, but not by standardising the language variants that have already emerged (nor by removing strict aliasing / strict overflow altogether).

          2.  

            All good ideas, but this is also a perfect embodiment of “perfect is the enemy of good”. These are not existing practices

            I also disagree with this characterisation, regardless of whether they are existing practices.

            But in fact: https://gcc.gnu.org/onlinedocs/gcc-4.0.2/gcc/Type-Attributes.html

            • “may_alias” attribute (applies to pointee types, not pointers themselves, so not exactly what I suggested, but close enough).

            There’s no similar attribute for wrap-on-overflow, unfortunately. But I don’t think “it hasn’t been done, therefore it should not be done” is an argument that really holds water. And characterising it as “the perfect” because it hasn’t been done seems a stretch. (edit: and characterising your own proposal as “the good” is begging the question).
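
            For concreteness, the attribute in use (a small sketch; assumes a 32-bit IEEE-754 float):

            #include <stdint.h>
            
            /* GCC's may_alias: lvalues of this typedef'd type may alias
               objects of any other type (the way char already can), so the
               access below is defined even under strict aliasing. */
            typedef uint32_t __attribute__((may_alias)) u32_alias;
            
            float flip_sign(float f) {
                u32_alias *p = (u32_alias *)&f;
                *p ^= UINT32_C(0x80000000);   /* flip the sign bit through the alias */
                return f;
            }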

            1.  

              it hasn’t been done, therefore it should not be done

              What the fuck. I said “We should start that today”. I am just pointing out that “put it on types” has had less field testing than “put it on flags”.

              1.  

                Sorry, I missed your “we should start that today” comment and I didn’t intend to anger you.

                this is also a perfect embodiment of “perfect is the enemy of good”

                I interpreted the above as “we shouldn’t do what you have proposed (the perfect) because there is a solution that I have proposed (the good) that is easier because it has already been done”. Now, I think what you meant was actually “we should do what I have proposed now rather than delaying indefinitely until we can do something better”. I’m afraid I still disagree; I don’t want to see these language fragments standardised. On a practical level, I also think it’s unlikely either change would be standardised in any short time frame. The ISO C committee is not known for, err, actually correcting significant problems in the language and its specification.

          3.  

            not standard

            It has been proposed.

        1.  

          There are so many issues that come from the need to optimise that I wonder if C could solve a few problems by introducing “don’t touch this” blocks. Basically “volatile”, but for lines of code where no optimisation takes place, no dereference is skipped, no overflows are analysed, etc. So you’d write:

          volatile { *foo = bar[a+b]; }
          

          and whatever else is happening around that block, you’d do the addition, deref bar, load the foo address and write there - no changes allowed.

          Given how much analysis and how many workarounds we’re already stacking here, wouldn’t handing control back to the dev be simpler at this point? (This would probably need to disable LTO, though.)

          1.  

            The root problem is that people want C to be two things:

            • A portable assembler.
            • A language with compiler optimisations.

            You can’t have both in a single language. If you want a trivial-to-understand lowering from your source syntax to the sequence of executed instructions then you can’t do any non-trivial optimisations. You can do constant propagation. You might be able to do common-subexpression elimination (though not if you have shared-memory parallelism). That’s about it.

            If you want optimisation then those optimisations will be correct according to some abstract machine. You need to understand what that abstract machine is and accept that your code describes a space of possible behaviours and that any of the behaviours allowed by the abstract machine may occur. The more flexibility the abstract machine allows, the larger the set of possible optimisations. If you want things like autovectorisation of loops then you need to have a memory model that allows the compiler to make assumptions about non-aliasing and happens-before relationships: if partial results to four loop iterations are visible in memory then this would violate the semantics of a very close-to-the-ISA abstract machine, but is fine in C/C++ because the memory model tells you that no atomic has happened that established a happens-before relationship and so the exact order that these things appear in memory is undefined.
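
            A small C illustration of the trade-off above: restrict is the standard way to hand the compiler exactly the kind of no-aliasing promise a close-to-the-ISA abstract machine would deny it.

            /* With restrict, the compiler may assume dst and src never
               overlap and can vectorise this loop freely; without it,
               partial results of one iteration could be visible through
               the other pointer, so the compiler must be conservative. */
            void scale(float *restrict dst, const float *restrict src, int n) {
                for (int i = 0; i < n; i++)
                    dst[i] = 2.0f * src[i];
            }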

            Personally, I’d love to work on a good language for writing operating systems and language runtimes. Something that had a memory model that let you reason about behaviour of lockless data structures and that had a mechanism for me to define my own type-punning rules in the language (such that I could implement a memory allocator and expose the explicit point at which the type associated with underlying memory changed). There are probably a dozen or so projects that would adopt such a language, so it’s hard to justify spending time on it.

            1.  

              This would be a bitch to specify, but yes, I would like to see a serious attempt at this. Can a and b be in registers, or should the compiler be required to load them from the stack, for example? You basically need to specify the compilation algorithm, which amounts to re-implementing a C compiler. By the way, the HTML5 parsing specification does work that way, so such a standard can be valuable. It’s just a lot of work and a very different style of standardization.

              1.  

                I’m not super familiar with the C standard - why do you think the whole compiler would have to be redefined rather than adding qualifiers like “this transformation may be done here - unless it’s a volatile block”, “this is undefined - unless it’s a volatile block where …”, etc. ?

                1.  

                  The C standard doesn’t directly specify transforms that can be applied, at all (maybe one or two very minor exceptions). The extent to which permissible optimisations are specified is mainly via two concepts:

                  The “as-if” rule, which says (more or less) that as long as the observable behavior of a program is correct then the compiler has done its job (i.e. it doesn’t matter what code is generated, as long as it produces the output that it should have, according to the semantics of the “abstract machine”). Quote from the standard:

                  The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant.

                  Then, there’s the “undefined behaviour” concept, which says (again - roughly) that if a program violates certain constraints, the semantics of that abstract machine are not specified at all. This notion is particularly useful for compilers to exploit in order to enable optimisations. But the standard doesn’t generally talk about actual transformations to the program.
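
                  A well-known illustration of the sort of optimisation this enables (my example, not the standard’s):

                  /* Because signed overflow is undefined, the compiler may
                     assume x + 1 never wraps and fold this whole function
                     to "return 1;". Compile with -fwrapv or
                     -fno-strict-overflow and the wrapping case must instead
                     return 0 for x == INT_MAX. */
                  int always_greater(int x) {
                      return x + 1 > x;
                  }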

                  That leaves your second point:

                  “this is undefined - unless it’s a volatile block where …”

                  That could be done, to some extent; but then the behaviour (inside such a block) would have to be specified. It’s hard to explain why this is difficult without going into a lot of detail, but suffice to say, the standard is already sufficiently vague in enough areas that it’s already difficult to tell in some cases whether certain things have defined behaviour (and if they do, what it is). Getting the details right for such a change would be very finicky. However, ultimately, what you suggest could probably be done - it would just need a lot of work. I don’t think it would in fact require specifying the whole compilation algorithm.

              2.  

                How would it work with, let’s say, function call boundaries? In particular inline functions.

                inline void write_byte(uint8_t *p, uint8_t v) { *p = v; *p = v; }
                
                volatile {
                    write_byte(p, 42);
                    write_byte(p, 64);
                }
                

                Should the above write to *p once, twice, or four times? I think twice seems the most reasonable, but I think there are arguments to be made for four writes as well, depending on whether write_byte is static inline or not.

                1.  

                  work with, let’s say, function call boundaries?

                  They don’t have to be allowed inside. I imagine using the volatile block for just a few lines, like the inside of write_byte, plus preventing reordering around that block. Basically a high-level asm block.

              1. 6

                We discussed a reply to this paper here.

                1. 1

                  Well, reading this makes me very thankful that I have nobody forcing me to use a proxy of any sort. It looks like a lot of effort to maintain this, and it’s actually somewhat surprising that it is even possible with a lot of things (I don’t think much software gets developed behind a mandatory man-in-the-middle these days?)

                  1. 1

                    I don’t think much software gets developed behind a mandatory man-in-the-middle these days?

                    You’d hope, but no. I have worked behind a mandatory MITM TLS-breaking proxy in the past. A lot of software still gets developed in such environments, and supporting them is basically a requirement once you get popular enough to have such users.

                    1. 1

                      Well, it’s part of reality, even in non-mega-corps. I guess every tool (especially package-manager-type things) needs to grow up at a certain point and accept this reality; sometimes the people who have to live with the fallout (behind the proxies) need to do the work :P

                    1. 10

                      So one answer to the question as initially posed is that no, WebAssembly is not the next Kubernetes; that next thing is waiting to be built, though I know of a few organizations that have started already.

                      I agree there is something of an “inner platform effect” going on here…

                      https://en.wikipedia.org/wiki/Inner-platform_effect

                      That is, WebAssembly is useful, but we don’t need it to encompass all of server side computing … That’s just rebuilding an operating system on top of an operating system, which basically doubles the number of problems your system has.

                      I imagine it’s going to be a similarly useful technology as the JVM … but the JVM is by and large “just another Unix process” in a typical server workload. It didn’t end up being as general or universal as was once thought.

                      1. 2

                        I haven’t started using WebAssembly in practice yet, but isn’t one benefit of WASI et al. that they enable more advanced capability systems than (most) traditional server operating systems? That could be a practical benefit, especially for multi-tenant services, I think.

                        1. 7

                          Having default-deny of a wasm sandbox vs. default-allow of Unix is definitely better in principle. However, it does make it harder to port existing software, which was one benefit of WASM.

                          I think what @eyberg listed is basically what I’m getting at. WASM is a small part of a server side platform; WASI adds more, but you will end up rebuilding most of an OS. That is exactly what the JVM tried to do.

                          And you will have downsides like lack of heap protection, since WASM is a flat address space:

                          https://lobste.rs/s/pzr5ip/everything_old_is_new_again_binary

                          And performance issues like lack of SIMD. I think they are working on adding SIMD, but again it’s another layer / point of indirection, i.e. an “inner platform”. An ISA is already an abstraction layer over physical hardware :) This may be a cost you want to pay in some circumstances, but not all.

                          Here I question if WebAssembly is really “polyglot”: https://news.ycombinator.com/item?id=28581634

                          i.e. programmers underestimate the degree to which languages and VMs are coupled. Any VM design decision creates some “losers”, and the losers are the languages that will experience a 2x-10x slowdown over their native environment.

                          1.  

                            Having default-deny of a wasm sandbox vs. default-allow of Unix is definitely better in principle.

                            Snap’s --classic flag comes to mind here.

                            1. 1

                              And you will have downsides like lack of heap protection, since WASM is a flat address space

                              There is https://github.com/WebAssembly/multi-memory, which is basically memory segmentation for WebAssembly.

                            2. 1

                              The capability system is incredible, almost comparable to that of Lua - with the advantage that nearly all WASM implementations are coded defensively and expect hostile bytecode.

                          1. 10

                            You know, Wine runs on FreeBSD too. The Linux tag is inappropriate.

                            1. 16

                              It also runs on macOS and is used by a lot of companies to port Windows games to Mac. A lot of the development funds come from this: game companies write a Windows game and then pay one of the companies that backs WINE to implement the APIs that their game needs that are currently missing.

                              1. 2

                                What about Unix then?

                                1. 1

                                  How about a not-windows tag?

                                  1. 6

                                    Surely it’s about time Wine was ported to Windows.

                                    1. 8

                                      WSLW: Windows Subsystem for Linux, Wine edition.

                                      Edit: beyond parody https://lobste.rs/s/a697jw

                                      1. 3

                                        That might improve how some of the Win95-XP era video games run on the newest releases from Redmond.

                                  1. 2

                                    Parts of this work were supported by the European Union’s Seventh Framework Programme

                                          I commend the EU for the support. The EU seems to be especially good at things like this.

                                    1. 6

                                      This seems bad? C should copy D’s scope, not Go’s defer.

                                      1. 1

                                        Why? Because you like it better?

                                              edit: I happen to like Go’s defer, but I have no technical arguments as to why it would or would not be superior to D’s scope feature, so I’d be interested in any such arguments.

                                        1. 4

                                          Go’s defer is function-scoped. You can’t use it for block-scoped cleanup.

                                          1. 1

                                                  True. However, the proposal here does not rule out block-scoped defers (see 5.1), but leaves it implementation-defined whether to allow them.

                                            1. 4
                                              void f() {
                                                  for ... {
                                                      r = get_resource();
                                                      defer cleanup(r);
                                                      ...
                                              

                                              This code would compile and run successfully with or without block-scoped defers, but it’s good and safe code in one case, and an enormous bug in the other. Is it a good idea to leave that (quite important!) semantic distinction up to the implementation’s discretion?

                                              edit: This isn’t aimed at you, birkelund, but rather a more general observation.

                                              1. 1

                                                My reading of the spec (see 3.1.4) is that the code would either compile and do the right thing (i.e. cleanup on each loop iteration), or not compile (with an error along the lines of “defer only allowed in top-level function bodies”).

                                                1. 1

                                                  Oh! Interesting, thanks for the clarification.

                                          2. 3

                                            If I’m reading this proposal correctly, there’s no way to do defer without dynamically allocating memory.

                                            Zig’s defer is better - no dynamic allocations necessary.

                                            1. 2

                                              Honestly - and I say this as a heavy Go user - Go’s defer mechanism for ‘catching panics’ sucks. D’s scope mechanism provides a much cleaner way to specify in which situation (success/failure) the deferred function should run, as well as providing more granular scopes as someone else pointed out.

                                              With all that said, I think I like Python’s context managers the best, as they provide simple rules to define the scope and allow hiding of state management inside of a library.

                                            2. 1

                                              On top of that, I never understood why defer needed a function – just give it an expression.

                                              defer close(fd);
                                              defer free(p);
                                              
                                            1. 2

                                              What they benchmarked was only the space efficiency of various encodings. Disappointing that they didn’t look at performance of encoding the data or of accessing all or portions of it.

                                              1. 3

                                                        Agreed; on the other hand, that would be a benchmark of implementations, not specifications.

                                              1. 2

                                                          Apparently gifting a wafer to everyone who contributed to chip development is a tradition. I also saw one on the wall when I visited the home of a hardware engineer friend.

                                                1. 2

                                                  Is it just me or does the test suite seem woefully inadequate?

                                                  Also, using C++ to implement a core C library feels a bit like the snake that eats its own tail.

                                                  1. 1

                                                    Also, using C++ to implement a core C library feels a bit like the snake that eats its own tail.

                                                    LLVM’s libc also uses C++. The requirements for a language implementing libc are:

                                                    • Must not depend on anything that libc provides (at least, in the parts that depend on that).
                                                    • Must be able to export C symbols.

                                                    C++ meets these requirements, as does Rust with no_std. A lot of the things in libc end up being macros that provide error-prone equivalents of C++ templates. For example, qsort, qsort_r and qsort_b are all exactly the same algorithm, with minor tweaks to how they invoke the callbacks. Some things, such as bits of locale state, need atomic reference counting. You can implement these in C with explicit calls to incref and decref functions, but using C++ smart pointers makes it almost impossible to get wrong.

                                                    I’ve worked a reasonable amount on things in libc implementations and a significant fraction of that time has involved thinking ‘this would be much less code and easier to audit if I wrote it in C++’.

                                                    We’ve implemented malloc in C++ and that’s one of the lowest-level parts of libc. It’s about half the size of jemalloc (which is written in C) and performs better.

                                                    1. 1

                                                      I suppose I say this because I work in the embedded world, where C++ for C libs doesn’t fly.

                                                      1. 1

                                                        I don’t really buy that argument. C++ can generate code at least as small as C’s (there was a great talk someone linked to here using modern C++17 features to compile for a 6502 and generating code as good as hand-written assembly). The only embedded programming that I’ve done has been on things like the Cortex-M0, and the SDKs supported C++ out of the box and let me write high-level generic code that compiled down to a few hundred bytes of object code when instantiated for my specific SoC. Mind you, they were freestanding environments and so didn’t actually have a libc.

                                                        There are only two reasons that I wouldn’t use C++ in an embedded context. The first is the lack of a C++ compiler. That still happens for some of the more esoteric ISAs but it isn’t a problem for any M-profile Arm cores, and 16 KiB of RAM is plenty for C++. The other is if I’m right at the constrained end of the spectrum (things with on the order of hundreds of bytes of instruction memory, 1 KiB of data memory) and there I wouldn’t use C either, I’d use assembly, because any language that assumes a stack would be a problem (though the 6502 talk I mentioned above relied on inlining to completely eliminate the stack, so C++ might even be feasible there).

                                                        You do generally need to disable exceptions and RTTI for embedded C++ work (but you often do that even on large systems). You also need to think about your use of templates, to ensure that you’re not bloating your code, but you need to do the same with C macros and C++ gives you tools like non-templated base classes for common logic that make this kind of thing easier. C++ also makes it easy to force certain bits of code to be evaluated at compile time (giving a compile failure if they aren’t), which means that you’re less at the whim of optimisers for code size than with C. Most of the C codebases I’ve seen that need this end up with something horrible like Python code that generates a C data structure or even C code, whereas in C++ everything can be in the same language.

                                                        C++ is far from a perfect language but it’s a much better language than C for anything that C is used for.

                                                    2. 1

                                                      First, C++ is overall a better language than C. Second, libc is in no way a “core” library: it is a compatibility layer. For example, consider the position of the CRT in Windows. Third, the Managarm kernel is written in C++, so it is natural to use the same language in userland too.

                                                      1. 1

                                                        Second, libc is in no way a “core” library: it is a compatibility layer.

                                                        This assumes you have an operating system.

                                                    1. 2

                                                      What’s the motivation for using this over another implementation, like musl? Not trying to crap on the project, just genuinely curious.

                                                      1. 3

                                                        musl is a libc for Linux. It won’t work for Managarm.

                                                        1. 4

                                                          musl has been ported to quite a few environments and is fairly common in bare-metal toolchains. That said, it doesn’t have a clean OS abstraction and the ports generally work by providing a stub layer that looks like Linux, which isn’t ideal. Musl is also a C codebase and a C standard library can be a lot cleaner and simpler if implemented in a higher-level language. For example, there are a bunch of things that, on modern systems, end up being reference counted (e.g. bits of locale state) and having std::shared_ptr available is much nicer than remembering to manually do the refcounting for these things.

                                                          1. 2

                                                            The Fuchsia libc is based on musl but a lot of hacking and slashing was required to free it from the assumptions of a posixy unix world.

                                                            1. 1

                                                              There seem to be a lot of Fuchsia people involved with LLVM libc, so I’m guessing that this experience wasn’t particularly pleasant for them?

                                                              1. 1

                                                                My own involvement with Fuchsia’s libc has been pretty small so I’m not the best to answer authoritatively, but my sense is that it was great to have a relatively small libc on which to build our platform-specific libc, but they saw the value in having a portable libc that could be shared with other platforms. To get musl working on Fuchsia it had to be chopped up in ways that make what we have a fork rather than a port - it would be nice to be able to take a cleaner approach.

                                                      1. 2

                                                        This is a portable libc from Managarm OS, initially with backends for Linux and Managarm, and later ported to other hobby OSes.

                                                        mlibc is written in C++. For example, printf is a thin wrapper around a StreamPrinter class implemented with C++ templates.

                                                        1. 3

                                                          It looks interesting but I find their code structure completely opaque. I wish they’d put something in their README to give me some hints about where to look for the implementations of their code. Most libcs have a similar structure to the original UNIX libc. LLVM’s one is a bit different but is well documented. This one is completely different.

                                                          1. 1

                                                            Eh, it’s there in the README?

                                                            1. 5

                                                              Okay, where is the printf code? The README tells me it’s probably somewhere in options/ (which contains mostly platform-independent code). That directory contains something called ansi; I guess that’s where I’d look, since printf is an ISO C API? That directory contains generic, include and musl-generic-math. It’s definitely not either of the last two, so I guess it’s in the first one? There are some files in there that have stdio in their name, but they also have stubs in their name, so I guess that’s not where I’d find the implementation.

                                                              In contrast, in most libc implementations, I’d find it in src/stdio/printf.c. I’ve worked on a few libc implementations and this is the only one where five minutes of browsing the source didn’t take me to the printf implementation. I’m sure I could use code search to find it but that doesn’t help me find the next thing unless I understand the underlying logic of the layout. I presume it has some and so I’d like the README to explain it to me.

                                                          2. 1

                                                            So it’s the C standard library, written in C++? What?

                                                            1. 1

                                                              As long as the functions are exported correctly, why would the internals matter? There’s also Rust (https://github.com/anp/rusl), Zig (https://github.com/tiehuis/zligc), and probably many others.

                                                          1. 1

                                                            Why would “fully asynchronous I/O” be a good idea?

                                                            (Assuming the usual meaning of async = “programming w/out control flow”.)

                                                            1. 6

                                                              In general, it’s easy to implement a synchronous API on top of an asynchronous API, but not vice versa. Managarm implements the POSIX synchronous API on top of its asynchronous API, for example.

                                                              1. 1

                                                                It is impossible to implement a synchronous API on top of an asynchronous API in the most widely used programming language, JavaScript.

                                                                If you have threads then yes, it might be possible, but why not use threads to begin with?

                                                                1. 4

                                                                  The difference is that asynchronous I/O in JavaScript works only via callbacks. For an OS kernel it is trivial to provide a single synchronous completion-wait syscall, and thus all asynchronous I/O can be made synchronous by turning it into two steps: schedule the asynchronous I/O, then wait for that I/O to complete. This doesn’t require the application to be multi-threaded.
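
                                                                  A sketch of those two steps in terms of the POSIX AIO API (illustrative only, not Managarm’s interface; error handling elided):

                                                                  #include <aio.h>
                                                                  #include <errno.h>
                                                                  #include <string.h>
                                                                  #include <sys/types.h>
                                                                  
                                                                  /* "Synchronous = asynchronous + wait": schedule the read,
                                                                     then block until it completes. */
                                                                  ssize_t sync_read(int fd, void *buf, size_t n, off_t off) {
                                                                      struct aiocb cb;
                                                                      memset(&cb, 0, sizeof cb);
                                                                      cb.aio_fildes = fd;
                                                                      cb.aio_buf    = buf;
                                                                      cb.aio_nbytes = n;
                                                                      cb.aio_offset = off;
                                                                  
                                                                      if (aio_read(&cb) != 0)                /* step 1: schedule */
                                                                          return -1;
                                                                  
                                                                      const struct aiocb *list[1] = { &cb };
                                                                      while (aio_error(&cb) == EINPROGRESS)
                                                                          aio_suspend(list, 1, NULL);        /* step 2: wait */
                                                                  
                                                                      return aio_return(&cb);
                                                                  }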

                                                                  1. 2

                                                                    It is impossible to implement a synchronous API on top of an asynchronous API in the most widely used programming language, JavaScript.

                                                                    I’m not sure I entirely understand what you mean. If you want to block on a fetch in JavaScript, you can simply await it. That makes it synchronous, does it not?

                                                                    There’s of course an event loop / scheduler that decides when to schedule your function’s execution, but the same is true of processes/threads on Linux.

                                                                    1. 1

                                                                      await is only possible within special contexts (at the top level or within async functions). Now say, for example, you want to use an API that requires a non-async function as a parameter. You can’t use await in there.

                                                                      1. 1

                                                                        But isn’t that like saying “Now say for example you want to use an API that doesn’t do any context switches. You can’t make blocking IO calls in there.”?

                                                                        1. 0

                                                                          I am just saying that you can’t - in general - program async as if it was sync. Not in JS.

                                                                          You can do it in a language with threads (because a thread can be blocked anywhere, whereas async/await can only block in particular contexts).

                                                                          P.S. I don’t think my example is frivolous. Let’s say the API in question does some sophisticated compute work and you can’t replace or modify it easily. But your requirements also force you to make an async IO call from the callback. Well, you can’t with async/await.

                                                                          P.P.S. Context-switching behavior is usually not under the control of app programmers so I don’t really get your comparison.

                                                                          1. 1

                                                                            I’m just thinking out loud, essentially. I’m still on the fence about the whole function colors debate.

                                                                            I think it’s interesting, though, that while the syntax of async/await is different, the semantics are essentially the same as traditional processes/threads and context switching. Until you introduce parallel execution primitives such as Promise.all, at which point async/await becomes strictly more expressive.

                                                                            From this perspective, it seems like async IO is indeed a better foundation on which to build an OS.

                                                                    2. 1

                                                                        How are threads implemented? Microkernels are just on top of hardware. I don’t know anything about this, but from reading a bit on the Hurd website, the issue is that the synchronous microkernels block a lot whereas the async ones can get more done >.> idk

                                                                  2. 6

                                                                    You seem to be thinking in terms of language-level abstractions, not OS abstractions. Your definition is definitely not ‘the usual meaning of async’ in the context of systems programming. When you do synchronous I/O in an OS, the following sequence happens:

                                                                    1. The OS deschedules the calling thread.
                                                                    2. The OS notifies the relevant subsystem (e.g. storage, network) to begin processing the I/O.
                                                                    3. The relevant subsystem may return immediately if it has some cached value (e.g. disk I/O in the buffer cache, incoming network packets) but typically it issues some DMA commands to tell the hardware to asynchronously deliver the result.
                                                                    4. The scheduler runs some other threads.
                                                                    5. The I/O completes.
                                                                    6. The kernel wakes up the calling thread.

                                                                    The flow with asynchronous I/O is very similar:

                                                                    1. The OS allows the calling thread to remain scheduled after processing the request.
                                                                    2. The OS notifies the relevant subsystem (e.g. storage, network) to begin processing the I/O.
                                                                    3. The relevant subsystem may return immediately if it has some cached value (e.g. disk I/O in the buffer cache, incoming network packets) but typically it issues some DMA commands to tell the hardware to asynchronously deliver the result.
                                                                    4. The scheduler runs some other threads, including the calling thread.
                                                                    5. The I/O completes.
                                                                    6. The kernel either asynchronously notifies the calling thread (e.g. via a signal or writing an I/O-completed bit into a userspace data structure) or waits for an explicit (blocking or non-blocking) call to query completion state.

                                                                    Given the latter and a blocking wait-for-completion call, you can trivially simulate the former by implementing a synchronous I/O call as an asynchronous request followed by a blocking wait-for-completion. The converse is not true and requires userspace to maintain a pool of threads that exist solely for the purpose of blocking on I/O and waiting for completion.

                                                                    If your program wants to take advantage of the asynchronous nature of I/O then it can perform other work while waiting for the I/O.

                                                                    Most OS interfaces are synchronous for two reasons:

                                                                    • They were designed before DMA was mainstream.
                                                                    • They originated on single-core systems.

                                                                    On DOS or early ‘80s UNIX, for example, if you wanted to read a file then you’d do a read system call. The kernel would synchronously call through the FS stack to find the right block to read, then would write the block request to the device’s I/O control registers and then sit doing a spinning read of the control registers to read each word that the device returned. There was no point making it async because there was no way of doing anything on the CPU other than polling the device. Even back then, this model didn’t work particularly well for things like networks and keyboards, where you may have no input for a while.

                                                                    With vaguely modern (late ‘90s onwards) hardware neither of these is really true. The kernel may synchronously call through the FS stack to get a block, but then it writes a DMA request to the device. The device eventually writes the result directly into memory and notifies the kernel (either via an interrupt or via a control register that the kernel periodically polls). The kernel can schedule other work in the middle. On a multicore system, all of the kernel’s work can happen on a different core to the userspace thread and so all of the FS stack work can happen in parallel with the userspace application’s work.

                                                                    There’s one additional dimension, which is the motivation for POSIX APIs such as lio_listio and Linux APIs such as io_uring: system calls can be expensive. In the simple async model outlined above, you potentially double the number of system calls because each call becomes a dispatch + block (or, worse, dispatch + poll multiple times) sequence. You can amortise this if you allow the dispatch to start many I/O operations (you generally don’t want to do this with sync I/O because if you had to, for example, wait until a network packet was received before seeing the result of a disk read then you’d introduce a lot of latency. APIs such as readv and writev do this for the case where it is useful: multiple I/Os to the same descriptor). You can make the poll fast by making the kernel just write a completion flag into userspace memory, rather than keeping state in the kernel that you need to query.
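
                                                                    The amortisation point in concrete terms, using lio_listio as mentioned above (a sketch; error handling elided):

                                                                    #include <aio.h>
                                                                    #include <string.h>
                                                                    #include <sys/types.h>
                                                                    
                                                                    /* One system call submits a whole batch of reads; LIO_NOWAIT
                                                                       returns immediately, LIO_WAIT would block until all complete. */
                                                                    void submit_reads(int fd, struct aiocb cbs[], void *bufs[],
                                                                                      size_t each, int n) {
                                                                        struct aiocb *list[n];                  /* VLA, for brevity */
                                                                        for (int i = 0; i < n; i++) {
                                                                            memset(&cbs[i], 0, sizeof cbs[i]);
                                                                            cbs[i].aio_fildes = fd;
                                                                            cbs[i].aio_buf    = bufs[i];
                                                                            cbs[i].aio_nbytes = each;
                                                                            cbs[i].aio_offset = (off_t)i * each;
                                                                            list[i] = &cbs[i];
                                                                        }
                                                                        lio_listio(LIO_NOWAIT, list, n, NULL);  /* one syscall, n requests */
                                                                    }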

                                                                    Don’t conflate this with a language’s async keyword, especially not JavaScript’s. JavaScript has a run loop and event model tied into the language. It handles a single event to completion and then processes the next one. This is already asynchronous because if you had synchronous event polling then you’d block handling of any other event (you can already mess this up quite easily by spending too long servicing one event). The JavaScript async keyword does CPS construction to generate a handler for an event that captures all of the state of the things that happen after an await.

                                                                  1. 2

                                                                    You would close your project off to contributions from anyone outside the collective.

                                                                    One of the joys of open source is finding a bug or missing feature that I care about, being able to fix/implement it myself, and sending a patch upstream. If I contribute under the GPL, and you accept my contributions, you lose the right to relicense. Well, unless you make me sign a CLA (no thanks).

                                                                    1. 1

                                                                      I have a lot of open-source projects, and the amount of contributions is minimal compared to the number of users. And even contributors are often additional work, because their changes have to be guided, reviewed and then supported.

                                                                      I’ve signed CLAs before and it’s annoying, but I don’t mind otherwise. It’s a small price to pay for otherwise free and open software provided to me. Especially if I know it allows the developers to support themselves without begging for donations while doing unpaid work for me, everyone else, and especially companies making a lot of money using their software.

                                                                      1. 2

                                                                        If that’s the case for you, then I guess it’s no loss.

                                                                        For me at least, the point of the GPL is protection against proprietary software, which signing a CLA, and this type of arrangement in general, undermines.

                                                                        Given that your sentiments, and those of many others, have little to do with software freedom or copyleft, I wonder if you wouldn’t be better served by a non-commercial license rather than a copyleft one.

                                                                        1. 2

                                                                          For me at least, the point of the GPL is protection against proprietary software, which signing a CLA, and this type of arrangement in general, undermines.

                                                                          The Free Software Foundation used to require people to sign over their copyright for contributions to all official GNU projects. If I recall correctly, they did this so they had the freedom to retroactively update code to newer GPL versions. It seems like they’ve relaxed this a bit now.

                                                                          1. 2

                                                                            I wouldn’t sign a CLA for the FSF either.

                                                                            That said, the FSF isn’t using their CLA to sell unlicensed versions of the code.

                                                                            1. 4

                                                                              That said, the FSF isn’t using their CLA to sell unlicensed [proprietary] versions of the code.

                                                                              Technically, yes they are; that is precisely why they want copyright assignment (not a CLA). This used to be where most FSF revenue came from (more than donations). If a company is caught violating the GPL, the FSF will offer them a time-limited retroactive proprietary license to the code. They then have the duration of the proprietary license to comply with the GPL. If they’re not in compliance at the end, then the FSF can take them to court for copyright infringement.

                                                                              Note that granting this license is possible for the FSF only because they own all of the copyright. Red Hat, for example, couldn’t do the same for Linux because hundreds of other people would still have standing to sue for copyright infringement on their contributions to the Linux kernel.

                                                                          2. 0

                                                                            For me at least, the point of the GPL is protection against proprietary software, which signing a CLA, and this type of arrangement in general, undermines.

                                                                            Signing a CLA by no means undermines copyleft licenses or their spirit. There can be an AGPL project; you contribute and sign the CLA, and everybody can still use it as AGPL-licensed free software.

                                                                            1. 2

                                                                              The point of AGPL isn’t to have a badge saying “AGPL”, it’s that people who make derivatives of one’s work have to share their source. This arrangement undermines that, because the entire point of the collective is that you sell the right to strip the license.

                                                                              1. 2

                                                                                The root problem is that dual-licensing isn’t an open source business model. It is a proprietary business model. Your money comes from selling proprietary software. This is predicated on the idea that proprietary software is more valuable than F/OSS, which is one that any proponent of Free Software and most advocates of Open Source would reject.

                                                                                You don’t choose a specific F/OSS license in a dual-licensed open/proprietary project because you believe that it’s the best way of ensuring freedom for users or because you think it’s the best way of growing a community; you choose it to create the most friction for any of your downstream consumers who have money. You intentionally make using the open version difficult to encourage people not to use it.

                                                                                This is fine if you want to be running a proprietary software company but still get some marketing points from a try-before-you-buy version that is technically open source but claiming that it’s an open source (or Free Software) business model is misleading.

                                                                                1. 2

                                                                                  Yes, but selling exceptions to the AGPL is no worse than using MIT in terms of software freedom. In fact it is better! In terms of software freedom, MIT < selling exceptions < AGPL. So if you are okay with contributing to MIT projects, you should be okay, software-freedom-wise, with contributing to AGPL projects with a CLA for selling exceptions.

                                                                                  https://www.gnu.org/philosophy/selling-exceptions.html is of the same opinion, by the way.

                                                                                  1. 1

                                                                                    Exactly. Unless someone is an AGPL maximalist (which I can understand and respect), I don’t know what their qualms with dual-licensing are. It’s an ideological compromise from AGPL purism, but waaaay more pure than liberal OSS licenses, which are like dual-licensing but with an unconditional, perpetual, reciprocity-free license for all commercial applications.

                                                                        1. 3

                                                                          First, this is great work.

                                                                          For this kind of work, you necessarily need to stop at some point. But that also means it misses RLBox, which I think is a significant advance. I hope this can be updated periodically.

                                                                          1. 15

                                                                            Note that Rust has done this automatically for you since Rust 1.18, released in 2017. By coincidence, the example case used, (u8, u16, u8), is exactly the same as in the Rust release notes.
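
                                                                            In C terms, where declaration order must be preserved, the effect looks like this (a sketch; sizes assume a typical ABI with 2-byte alignment for uint16_t):

                                                                            #include <stdint.h>
                                                                            #include <stdio.h>
                                                                            
                                                                            /* C keeps declaration order, so the u8/u16/u8 layout needs
                                                                               padding; reordering by hand (what Rust now does automatically)
                                                                               shrinks it. */
                                                                            struct as_declared { uint8_t a; uint16_t b; uint8_t c; }; /* 1 + 1 pad + 2 + 1 + 1 pad = 6 */
                                                                            struct reordered   { uint16_t b; uint8_t a; uint8_t c; }; /* 2 + 1 + 1 = 4 */
                                                                            
                                                                            int main(void) {
                                                                                printf("%zu %zu\n", sizeof(struct as_declared),
                                                                                                    sizeof(struct reordered));   /* prints: 6 4 */
                                                                            }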

                                                                            1. 2

                                                                              Does anyone know if Go could do this? And was it ever considered?

                                                                              1. 4

                                                                                As far as I know, there is nothing in the spec that guarantees struct layout order, and the only way to really observe it is to use the unsafe package, which is not covered by the Go 1 compatibility promise. So technically it could.

                                                                                That said, changing it now would break too much code that uses structs with cgo/assembly/syscalls, so I doubt it will happen any time soon. If it ever does, I’d expect it will come with a similar annotation to Rust’s #[repr(C)], at the very least.

                                                                                Here’s an issue where these things have been considered: https://github.com/golang/go/issues/36606

                                                                              2. 1

                                                                                That seems rather brutal for binary compatibility or when interacting with C. I assume it can be turned off on a struct-by-struct basis?

                                                                                1. 12

                                                                                  Yes, #[repr(C)], it’s right there in the linked release notes :-)

                                                                                  1. 3

                                                                                    And all the details (and other representations) are documented: https://doc.rust-lang.org/reference/type-layout.html#representations

                                                                              1. 2

                                                                                Slightly off topic, I noticed recently that HeidiSQL (a semi-prominent SQL GUI) is written in Delphi (but ironically their code base says it’s too old to be compiled by FreePascal).

                                                                                Are there any other big open-source apps written in Delphi or Pascal? My impression is that Delphi/Pascal are most common in proprietary apps, so I was just curious if you know of other common open-source apps like HeidiSQL.

                                                                                1. 2

                                                                                  There is PyScripter which is a Python IDE.

                                                                                  1. 1

                                                                                    Neat, thanks! Interesting that it also does not use FreePascal; it builds with Delphi Community Edition.

                                                                                1. 2

                                                                                  If you include a language with a single implementation within your definition of portability, then this definitely is more portable than C.

                                                                                  Still, it goes without saying that we should simplify build systems in general. A simple Makefile and a config.mk for optional tweaks that aren’t needed on 99% of systems suffice in most cases.

                                                                                  1. 25

                                                                                    Have you noticed that there is no portable way to print “Hello World” in C?

                                                                                    There is a solid standard for source code that passively promises it could do it when built, but you can’t run this source code in a standard, portable way. I see people treat gcc on POSIX as this sort of standard, but it’s not. There are other compilers, other platforms, and even within gcc-flag-copying compilers on close-enough-to-POSIX systems it’s endless papercuts.

                                                                                    I had a simple Makefile with config.mk, and ended up in a situation where I couldn’t build my own project on my own machine. After days of debugging I think the simplest solution would be to recompile GCC from scratch with -mno-outline-atomics… but I just don’t want to deal with these things. In C everything is theoretically possible, but nothing is easy, nothing is reliable.

                                                                                    I’m completely unmoved by the theoretical possibility of porting my project to DEC Alpha or a DSP with 12-bit bytes, when I can’t even run it on macOS and Windows.

                                                                                    1. 1

                                                                                      I know about C’s inconsistencies and quirks, but it’s reasonable to just target a POSIX-compliant system, which is a standard and clearly defines the interfaces you can use to print “Hello World”.

                                                                                      I do not depend on the mercy of the Rust developers to support a certain platform, and when I check the officially supported Rust platforms, I don’t see that many. Even Microsoft itself acknowledges that by offering WSL.

                                                                                      1. 15

                                                                                        Your position is odd. On one hand you seem to value support for all platforms, even ones that are too obscure to be in LLVM. But at the same time you say it’s reasonable to just drop the biggest desktop platform.

                                                                                        Porting Rust needs an LLVM back-end and a libc, which is pretty much the same as porting Clang. You don’t need permission from an ISO committee to do this. LLVM and Rust are open. In practice the list includes every modern platform people care to support. There’s also a GCC back-end in the works, so a few dead platforms will get support too.

                                                                                        1. 2

                                                                                          I think FRIGN is arguing that targeting POSIX does not mean dropping Windows, because WSL exists. I don’t agree, but it is an arguable position.

                                                                                          1. 7

                                                                                            WSL is Linux. I would not call that targeting the Windows platform.

                                                                                            1. 1

                                                                                              Why not? WSL is part of Windows. What do you gain by targeting non-WSL Windows? (I think there is a gain, I am just curious what you think it is.) Is it that WSL is an additional install? (JVM is an additional install too, and frankly WSL is easier to install than JVM.) Is it support for versions older than Windows 10? Is it overhead, which I think is tiny?

                                                                                              Would you stop targeting Windows if all of the following happens: 1) WSL becomes part of default install 2) all versions of Windows not supporting WSL become irrelevant 3) performance overhead improves enough to be immaterial? What else would you want?

                                                                                              1. 13

                                                                                                Windows can also run Web apps and Android applications, or even emulate an old Mac, but in a discussion of portability I think it’s important to make a distinction between getting software to run somehow vs being able to directly use the native vendor-recommended APIs of the platform.

                                                                                                Is Visual Basic 6 portable, then, and can it target Linux? There is WINE after all, and it’s probably as easy to install as WSL.

                                                                                                The WSL tangent feels like I’m saying “I can’t eat soup with a fork”, and you “You can! If you freeze it solid or if you spin the fork really really fast!”

                                                                                                1. 2

                                                                                                  I also think depending on Wine instead of porting to Linux is a defensible position.

                                                                                                2. 4

                                                                                                  Why not? WSL is part of Windows. What do you gain by targeting non-WSL Windows?

                                                                                                  I worked at a place that wouldn’t allow WSL because the security team couldn’t install the network monitoring tool they used. Ultimately, it is an optional component and can be disabled.

                                                                                                  1. 3

                                                                                                    Why not? WSL is part of Windows. What do you gain by targeting non-WSL Windows?

                                                                                                    WSL is a brand name covering two things:

                                                                                                    • WSL1 uses picoprocesses in the NT kernel with a Linux system call compatibility layer (similar in concept to the *BSD Linux compat layers)
                                                                                                    • WSL2 just runs a Linux VM with Hyper-V.

                                                                                                    There is some integration with the host system: a WSL application in either version can access the host filesystem (with WSL1, there are filter drivers over NTFS that present POSIX semantics; with WSL2, it uses 9p over VMBus; both are slow). They can create pipes with Windows processes. But they can’t use any of the Win32 API surface. This means no GUI interaction and no linking with any Windows DLLs. For something like an image library (as in the article), providing a Linux .so is of no help to someone writing a Windows application. The most they could do with it is write a Linux command-line tool that decodes/encodes an image over a pipe, and then run that in WSL, on the subset of Windows machines that have WSL enabled (unless it’s changed recently, neither WSL1 nor WSL2 is enabled by default).

                                                                                            2. 3

                                                                                              I think it does boil down to the question whether targeting POSIX is reasonable or not. Many people, including myself and the author of this article, find it unreasonable. But I admit it is a defensible position.

                                                                                          2. 7

                                                                                            That works unless it involves one of threading, Windows, or cross-compilation. All three work out of the box with Rust. C is more capable, but Rust is more convenient.
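
                                                                                            For example, a minimal sketch of the threading case (my own example): this compiles and runs unchanged on Linux, macOS, and Windows, with no -pthread flags or platform #ifdefs.

                                                                                            ```rust
                                                                                            use std::thread;

                                                                                            fn main() {
                                                                                                // std::thread wraps the native threading API of each supported
                                                                                                // platform; no build-system or linker setup is needed.
                                                                                                let worker = thread::spawn(|| (1u64..=1_000_000).sum::<u64>());
                                                                                                println!("sum = {}", worker.join().unwrap());
                                                                                            }
                                                                                            ```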

                                                                                            1. 1

                                                                                              That’s a fair point, but it only works because there is a single Rust implementation that gets ported, and that’s what you depend on.

                                                                                              I’m not arguing about the convenience.

                                                                                            2. 4

                                                                                              A simple Makefile and a config.mk file for optional tweaks that don’t need to be done on 99% of systems suffices in most cases.

                                                                                              I’d argue the opposite, that cargo new and cargo build suffice in most cases, and you don’t need the capabilities of C most of the time unless you’re doing something weird, or something with hardware.

                                                                                              1. 1

                                                                                                But think about the massive complexity behind cargo. Why does anyone think it’s reasonable to pull in dozens of online (!) dependencies for trivial code (node.js all over again)? And with Rust you can’t get around this bloat.

                                                                                                And I’m not even saying that this was about C vs. Rust. I am aware of the advantages Rust offers.

                                                                                                But the bloat is unnecessary. Consider Ada for example, which is way more elegant and provides even more security than Rust.

                                                                                                1. 14

                                                                                                  Cargo is simple technically. Probably simpler than cmake and dpkg.

                                                                                                  Rust uses many dependencies, but that isn’t necessarily bloat. They’re small focused libraries. Slicing a pizza into 100 slices instead of 6 doesn’t make it any larger. I have 45 transitive dependencies in libimagequant, and together they are smaller than 1 OpenMP dependency I’ve had before.

                                                                                                  Cargo uses many small deps because it’s so easy and reliable that even trivial deps are worth using. I don’t think pain should be the driving factor in technical decisions — deps in C are a pain whether they’re used justifiably or not. Even despite the dependency friction C has, applications still use many deps. Here’s a recent amusing example of dependency bloat in vim.

                                                                                                  I’ve considered Ada, but it offers no safety in the presence of dynamic memory allocation without a GC. The usual recommendation is “then just don’t allocate memory dynamically”. That’s what I’m aiming for in C and Rust too, but obviously sometimes I do need to allocate, and then Rust offers safety where Ada doesn’t.
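
                                                                                                  A minimal sketch of what that safety looks like in practice (my own example):

                                                                                                  ```rust
                                                                                                  fn main() {
                                                                                                      let first = Box::new([0u8; 64]); // heap allocation, no GC involved
                                                                                                      let second = first;              // ownership moves to `second`
                                                                                                      // println!("{}", first[0]);     // rejected at compile time: error[E0382]
                                                                                                      println!("{}", second[0]);       // fine; the buffer is freed here, exactly once
                                                                                                  }
                                                                                                  ```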

                                                                                                  1. 6

                                                                                                    The usual recommendation is “then just don’t allocate memory dynamically”

                                                                                                    Ada allows you to declare pretty much anything in any declaration block, including run-time sized arrays, functions, variables, and tasks, so a lot of the time you’re logically “allocating” for work, but it’s usually stack-allocated (though IIRC the compiler can implicitly allocate/free from the heap). Being able to return VLAs on the secondary stack also simplifies things a bit, like returning strings. The standard library is pretty extensive too, so you usually just use stuff from there, which will be RAII-controlled. Between RAII via “Controlled types” and limits on where access types (pointers) can be used, I don’t think I’ve ever actually seen raw allocations being passed around. Implicit pass-by-reference where needed, and the ability to declare different pointer types to the same thing in different places, seem to really cut down on pointer abuse.

                                                                                                    One of my Ada projects is ~7000 LoC (actual, not comments or blanks) and I have one explicit allocation where I needed polymorphism for something, wrapped in a RAII smart pointer which also allocates a control block.

                                                                                                    Looking at Alire, ~28600 LoC (actual, not comments or blanks), shows only 1 allocation that doesn’t live for the whole program lifetime, and it is inside an RAII type. If you include all 1554 files of it and its dependencies, Unchecked_Deallocation only appears in 60 of them.

                                                                                                    I understand the disbelief. I get that it’s weird.

                                                                                                    (shrug) I dunno, it’s sort of hard to explain, but the language and standard library mean you just often don’t need to allocate explicitly.

                                                                                                  2. 9

                                                                                                    This is a strawman argument, because Cargo’s package management features aren’t what we were talking about. But you can use Rust without needing to depend on any other packages; in fact, you can use it without Cargo at all if you want: just create a Makefile that calls rustc. Saying that you “can’t get around” this bloat is not really factual.

                                                                                                    1. 1

                                                                                                      How far is this going to get you when the standard library doesn’t even include the most basic of things?

                                                                                                      Packagers (rightfully) won’t touch Rust packaging, as the majority uses cargo and everything is so scattered. And when you are basically forced to pull in external dependencies, it is, on the other hand, reckless not to rely on cargo.

                                                                                                      Ada does it much better and actually reflects these things, and C benefits from very good integration into package management and systems overall.

                                                                                                      Rust could have become easily packageable, but it isn’t. So while the “move fast” philosophy definitely helped shape the language, it will lead to long-term instability.

                                                                                                      1. 12

                                                                                                        As someone who occasionally packages Rust things (for Guix), I don’t really know why everyone thinks it’s so hard. Cargo metadata for every project means you can often automate much of the packaging process that I do by hand for C.

                                                                                                        1. 2

                                                                                                          Given you seem to have experience with this, how simple is it to create a self-contained tarball of a program source with all dependencies (as source or binary) that does not require an internet connection to install?

                                                                                                            1. 14

                                                                                                              cargo vendor should do it, if I remember right: it copies the source of every dependency into a local vendor/ directory and prints the config snippet that points Cargo at it, so the build no longer needs network access: https://doc.rust-lang.org/cargo/commands/cargo-vendor.html

                                                                                                            1. 2

                                                                                                              Nice, I didn’t know about that one. Thanks!

                                                                                                            2. 6

                                                                                                              Note that this isn’t a requirement for all packaging systems. The FreeBSD ports tree, for example, requires only that you be able to download the sources as a separate step (so that the package builders don’t need network access). There’s infrastructure in the ports tree for getting the source tarballs for cargo packages from crates.io. These are all backed up on the distfiles mirror and the package build machines have access to that cache. All of the FreeBSD packages for Rust things use this infrastructure and are built on machines without an Internet connection.

                                                                                                              1. 1

                                                                                                                Very interesting, thanks for pointing that out!

                                                                                                          2. 5

                                                                                                            Just because you like C and Ada doesn’t mean every other language is terrible.

                                                                                                            1. 1

                                                                                                              I totally agree. For instance I love Julia for numerics.