1. 3

    This is fun. The syntax is actually very similar to what lilypond uses.

    Who recognizes the melody I transcribed here?

      1. 11

        Better than nothing perhaps, but the least secure of all 2fa methods (even in your link), as well as being cloneable/hijackable and vulnerable to “vendor social engineering”. Not to mention it requires handing your phone number off to a company, to increase your targeting profile, to be added to txt spam lists, and/or sold to other companies so they can advertise to (spam) you.

        Hardware tokens, push-message-based, even totp, all are superior. Why even spend the dev cycles implementing something marginal like SMS-2fa, paying for txt messaging (and/or integrating with an sms vendor), when you can just do something better instead (and arguably more easily)?

        1. 5

          Not to mention it requires handing your phone number off to a company, to increase your targeting profile, to be added to txt spam lists, and/or sold to other companies so they can advertise to (spam) you.

          It’s also a pain in areas with poor or intermittent mobile coverage.

          1. 1

            The criticism in the article seems to be mostly around phishing attacks. Are these other approaches more resilient to phishing? With the suggestion of randomized passwords as the best alternative, the author seems to be against any kind of 2FA.

            1. 5

              Are these other approaches more resilient to phishing? With the suggestion of randomized passwords as the best alternative, the author seems to be against any kind of 2FA.

              U2F and WebAuthn categorically prevent phishing by binding the domain into the hardware device’s challenge response.
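To sketch what that binding means on the relying-party side (a simplified and hypothetical Python sketch, not a real implementation; actual WebAuthn verification also checks the signature, challenge, flags, and signature counter):

```python
import hashlib
import json

def check_webauthn_binding(client_data_json: bytes, authenticator_data: bytes,
                           expected_origin: str, expected_rp_id: str) -> bool:
    # The browser, not the page, reports the origin in clientDataJSON,
    # so a phishing site on another domain cannot forge it.
    client_data = json.loads(client_data_json)
    if client_data.get("origin") != expected_origin:
        return False
    # The first 32 bytes of authenticatorData are SHA-256(rpId); the
    # token only produces assertions for the rpId it registered under.
    return authenticator_data[:32] == hashlib.sha256(expected_rp_id.encode()).digest()
```

Unlike a one-time code, there is nothing here the user can be tricked into retyping on the wrong site.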

              1. 5

                The author also states:

                If you also want to eliminate phishing, you have two excellent options. You can either educate your users on how to use a password manager, or deploy U2F, FIDO2, WebAuthn, etc. This can be done with hardware tokens or a smartphone.

                So I don’t think the author is against 2FA in general, just specifically SMS-2FA.

                Also note the first suggestion of using a password manager is, in my opinion, a bit nuanced, because “how to use a password manager” includes having the manager fill in credentials for you, with the password manager restricting this to only the correct domain defined for the password.

                Are these other approaches more resilient to phishing?

                I would say U2F, FIDO2, WebAuthn is far more resilient to phishing, yes.

                “A good password manager”? As I mentioned above, I feel this one is more tenuous. I personally feel users could easily be tricked into copy/pasting credentials out of a password manager, since users have the expectation that software in general is kind of clunky and broken, so “it must not be working right, so I’ll do it manually”. As such, I’m not sure I agree that just using a good password manager is sufficient to prevent phishing. It would be interesting to see stats on it, though, as my hunch is just that and has no scientific basis or real evidence behind it.

                TOTP as a 2nd factor is presumably just as vulnerable to phishing as a password alone, but being an extra step relatively out of band from the normal credential flow, it seems useful for preventing automated (non-phishing) attacks. In my opinion better than SMS-2FA, but nowhere near as good as U2F, FIDO2, WebAuthn.
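For what it’s worth, the TOTP computation itself is tiny. A rough Python sketch of RFC 6238 (SHA-1, 30-second steps, 6 digits; in practice you’d use a vetted library):

```python
import hashlib
import hmac
import struct

def totp(secret: bytes, for_time: int, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HOTP over the time-step counter."""
    counter = struct.pack(">Q", for_time // step)
    digest = hmac.new(secret, counter, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

With the RFC 6238 test secret `12345678901234567890` at T=59 this yields `287082`. Note there is no notion of the site’s identity anywhere in the computation, which is exactly why a phishing page can relay the code.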

                Push-message-based tokens (like Okta uses, for example) are presumably (caveat: I’m not a security professional) as secure as the weakest link of the vendors involved: the push vendor (e.g. Google, Apple) and the token vendor (e.g. Okta). They generally require server-side integration/credentials to get the vendor to invoke the push, and are typically device-locked.

                1. 2

                  “A good password manager”? As I mentioned above, I feel this one is more tenuous. I personally feel users could easily be tricked into copy/pasting credentials out of a password manager, since users have the expectation that software in general is kind of clunky and broken, so “it must not be working right, so I’ll do it manually”.

                  I can’t count the number of times I have copy/pasted a password because the Firefox password manager saved the credentials for one login form on the site, but then didn’t autofill them on a different form. Maybe that means that it doesn’t count as a “good password manager” though? I guess I should be filing bugs on these cases anyway.

                  1. 2

                    Same. I also have a few sites that don’t even work well with 1password (generally considered pretty decent). Some sites also seem to go out of their way to make password managers not work. Why?! ;_;

            2. 3

              Good link!

              I posted this because I think it’s interesting to see articulated arguments for a position I’m surprised by.

              1. 6

                Google wants to know our phone numbers. From that research, we can see that a phone number is effective in deterring some attacks. The question I would ask is: can we achieve similar security through other means? For example, even Google’s own data shows that on-device prompts or security tokens are better than SMS.

                So please, if you think you must, offer SMS. But also offer other 2FA options and especially don’t force collect phone numbers if you can avoid it.

            1. 5

              Another confusion easily solved by using proper units and measuring mass instead of volume, as commonly done in cooking instructions outside the US.

              1. 6

                For what it’s worth, even here in the EU my bag of quinoa has the same instructions in volume.

              1. 5

                With all the enthusiasm for zettelkasten/second-brain-like systems (roam, org-roam, now this), I’m surprised that I haven’t heard of an external format/tool that various UIs can interface with. VSCode, at least that’s my impression, is the kind of editor that gets displaced from its throne every few years by the next new thing, as has happened to Sublime and Atom before, so I certainly wouldn’t be too confident in making my “second brain” depend on it, except maybe if it’s used as a brainstorming tool for projects, but then it would have to be distributable too, and from skimming the article that doesn’t seem to be the case.

                Edit: Fixed the first sentence, sorry for my ignorance. Also I missed that this is markdown based, so I guess the rest of the comment isn’t quite right either, but I guess/hope my general point is still legitimate.

                1. 6

                  I’m surprised that nobody has been working on an external format/tool that various UI’s can interface

                  Check out neuron, which is editor-independent and has native editor extensions, but can also (in future) interface with editors through LSP.

                  Some examples of neuron published sites:

                  Easiest way to get started (if you don’t want to install yet): https://github.com/srid/neuron-template

                  1. 3

                    That sounds cool, but I don’t really get why LSP would help? I (personally) would much prefer a native client, in my case for Emacs, than something that forces itself into a protocol for program analysis.

                    1. 2

                      Well, neuron does have native extensions for emacs and vim (see neuron-mode and neuron.vim), but LSP support just makes supporting multiple editors easier by shifting the common responsibility to a server in neuron.

                      EDIT: I’ve modified the parent comment to clarify this.

                    2. 1

                      Is there any easier way to install (i.e., without Nix)? I’m on a laptop, and installing new toolchains is prohibitive with the little storage I have.

                      1. 1

                        Nix is the only way to install neuron (takes ~2GB space including nix and deps), until someone contributes support for building static binaries.

                        But I’d encourage you to give Nix a try anyway, as it is beneficial even outside of neuron (you can use Nix to install other software, as well as manage your development environments).

                        1. 2

                          I got a working binary with nix-bundle; that might be a simpler option. It’s a bit slow though, especially on first run when it extracts the archive. nix-bundle also seems to break relative paths on the command line.

                          1. 1

                            Interesting. Last time I tried nix-bundle, it had all sorts of problems. I’ll play with it again (opened an issue). Thanks!

                    3. 3

                      Isn’t the markdown that this thing runs on exactly that external format, and one that has been getting adoption across a wide range of platforms and usecases at that?

                      1. 3

                        There is tiddlywiki and the tiddler format.

                        1. 2

                          I wish the extension used the org format instead of markdown (so if something happens to vscode, I can use it with emacs), but otherwise I totally agree with your comment!

                          1. 2

                            You can use markdown files with org-roam in emacs by using md-roam. I prefer writing in Markdown most of the time, so most of my org-roam files are markdown files.

                        1. 18

                          Worth reading to the end just for the totally evil code snippet.

                          It was kind of foreshadowed to be evil when the author named it “skynet.c” I guess.

                          1. 4

                            Reminds me of the Java-code we used to see around 2000.

                            With a RuntimeException try-catch at the top and then just print it and continue like nothing happened.

                            How many bad bugs, how much data corruption, and how much weirdness did that practice cause?

                            1. 1

                              How is that any different from kubernetes and “just restart it”? It’s mostly the same practice ultimately, though with a bit more cleanup between failures.

                              1. 2

                                I guess it depends on whether you keep any app state in memory. If you’re just funnelling data to a database maybe not much difference.

                            2. 2

                              Now I start to wonder what the correct code should look like (as opposed to jumping 10 bytes ahead).

                              Read DWARF to figure out next instruction?

                              Embed a disassembler to decode the faulty opcode’s length?

                              1. 4

                                Increment the instruction pointer until you end up at a valid instruction (i.e., you don’t get SIGILL), of course ;)

                                1. 6

                                  I have code that does this by catching SIGILL too and bumping the instruction pointer along in response to that. https://github.com/RichardBarrell/snippets/blob/master/no_crash_kthxbai.c

                                  1. 2

                                    Brilliant. I’m simultaneously horrified and amused.

                                  2. 1

                                    SIGILL

                                    That’d be a pretty great nerdcore MC name.

                                  3. 1

                                    If you want to skip the offending instruction, à la Visual Basic’s “on error resume next”, you determine the instruction length by looking at the code and then increment by that.

                                    Figuring out the length requires understanding all the valid instruction formats for your CPU architecture. For some it’s almost trivial, say AVR has 16 bit instructions with very few exceptions for stuff like absolute call. For others, like x86, you need to have a fair bit of logic.

                                    I am aware that the “just increment by 1” below are intended as a joke. However I still think it’s instructive to say that incrementing blindly might lead you to start decoding at some point in the middle of an instruction. This might still be a valid instruction, especially for dense instruction set encodings. In fact, jumping into the middle of operands was sometimes used on early microcomputers to achieve compact code.

                                  4. 2

                                    Here’s a more correct approach: https://git.saucisseroyale.cc/emersion/c-safe

                                    1. 1

                                      Just don’t compile it with -pg :)

                                    1. 2

                                      Ironically, I installed BIOS and Intel ME updates from Lenovo this morning using fwupdmgr update, something I’ve done many times before on my T480s.

                                      Except this time around, it wiped everything except the preinstalled ‘Windows Boot Manager’ entry from my UEFI Boot Order List, which stopped me rebooting after the firmware update completed until I fished out a USB drive with an Arch ISO so I could re-run grub-install and restore the entry.

                                      To me, this means they simply didn’t test the update with Linux/UEFI systems. I’ll give them the benefit of the doubt and assume they did check BIOS boot, given it’s still more common.

                                      I hope they sort out this sort of issue as a part of this ‘certification’ process!

                                      1. 2

                                        I did the same thing on my T480s (also running Linux/UEFI) yesterday without issues, so it’s most likely a more complicated problem than “only Windows is supported”.

                                      1. 5

                                        Does hermes require $HERMES_STORE be consistent across machines to take advantage of caching?

                                        Nix lets you change where the store is located, but nobody ever does it because you lose the enormous community cache at cache.nixos.org. Tons of those binaries have hard-coded paths to their dependencies with the /nix/store/... prefix, which affects their hashes.

                                        1. 5

                                          This is one limitation that is shared with Nixos. For now I am building all software myself as there is not that much of it.

                                          1. 4

                                            I was wondering about this as well. Both Nix and Hermes advertise installation alongside a system package manager. This especially comes in handy if you’re on a system where you don’t have root access, but then you can’t create a store at the standard location and thus have to build everything from source. This often takes more time than just building the stuff you need manually and linking against system libraries.

                                            I suppose absolute paths (usually into /usr/lib, /usr/share and so on) are very common. I believe AppImages enforce binary-relative paths, which might work here as well, but would mean lots of extra work with packaging. Detecting absolute paths is easy, but patching them out is not.
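The easy half really can be a few lines. A toy Python sketch (the `/nix/store` prefix and 32-character hash here follow the Nix convention; patching is the hard part because replacements must not change string lengths inside the binary):

```python
import re

# Hard-coded store paths embedded in a binary: /nix/store/<32-char hash>-<name>
STORE_PATH = re.compile(rb"/nix/store/[0-9a-z]{32}-[\x21-\x7e]+")

def find_store_paths(blob: bytes) -> list:
    """Scan raw binary contents for embedded absolute store paths."""
    return [m.group().decode() for m in STORE_PATH.finditer(blob)]
```

The same idea works for `/usr/lib`-style paths by swapping the prefix.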

                                            1. 3

                                              This often takes more time than just building the stuff you need manually and linking to system libraries.

                                              In the medium term I want to make it easy for someone to get access to a remote build on an extremely powerful build machine; currently Google is offering 96 cores cheaply at spot prices. These could potentially help in such situations.

                                              For me the most expensive hermes package (gcc) builds in about 4 minutes on my desktop. It is definitely an annoyance at times I want to solve.

                                              I also want to setup a way to export hermes packages as appimages that can work at any path.

                                          1. 1

                                            Is this actually an issue people run into? All the home routers I’ve come across so far had firewalls blocking incoming connections (for both IPv4 and v6). Most of them (especially the ISP-issued ones) don’t even allow configuring that firewall. Company and university networks will always have a firewall as well. On university networks, there’s a high chance of getting a public IPv4 address anyways.

                                            And a comment regarding the (pretty neat) tool itself: with IPv6, you’ll probably use different addresses for incoming and outgoing connections. For firewall configuration, you usually need a static address (e.g., EUI-64), but for privacy reasons, the preferred address for outgoing connections should be randomly generated. As your tool (as far as I can see) can only check the address the user connects from, it would miss the address services would usually listen on.
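For reference, the stable EUI-64-style interface identifier is derived mechanically from the interface’s MAC address. A quick Python sketch (modern stacks often prefer RFC 7217 stable-privacy identifiers instead, but the EUI-64 form is the one you’d typically pin in a firewall rule):

```python
def eui64_interface_id(mac: str) -> str:
    """Map a 48-bit MAC to the EUI-64 interface identifier (RFC 4291)."""
    b = bytes(int(x, 16) for x in mac.split(":"))
    # Insert ff:fe in the middle and flip the universal/local bit.
    eui = bytes([b[0] ^ 0x02]) + b[1:3] + b"\xff\xfe" + b[3:6]
    return ":".join(f"{(eui[i] << 8) | eui[i + 1]:x}" for i in range(0, 8, 2))
```

For example, the MAC `00:11:22:33:44:55` becomes the host part `211:22ff:fe33:4455`, as in `2001:db8::211:22ff:fe33:4455`.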

                                            1. 4

                                              There’s another, more recent paper from this year’s EuroSys where the authors try to achieve something Unikernel-like using Linux configuration options. At some point you really have to wonder whether it’s still a Unikernel.

                                              1. 8

                                                sorry folks but can someone explain to me why SMR is bad, please? I’m not arguing, I know nothing about this and I am curious.

                                                1. 5

                                                  SMR is not bad per se, it’s actually pretty cool technology (higher density, cheaper drives, …) - if the drive allows the operating system to manage the SMR data. For example, it’s generally not an issue if a drive in a ZFS pool is not available temporarily for some planned maintenance operation.

                                                  However, the WD drives here pretend to be normal CMR drives, so there’s no way to manage SMR regions from the OS and you end up with very surprising performance (slow writes and long pauses).

                                                  1. 4

                                                    blocksandfiles.com is one of the websites that AFAICT looked into the issue, they have an article explaining it in detail [0]. TL;DR from memory: While the drives are busy reorganizing the data internally, the performance will obviously drop and they might not report back for more than a minute which will cause them to be dropped from RAIDs.

                                                    [0] https://blocksandfiles.com/2020/04/15/shingled-drives-have-non-shingled-zones-for-caching-writes/

                                                    1. 2

                                                      Thanks a ton.

                                                  1. 3

                                                    “CIDR calculation” seems to be something that is completely obsolete with IPv6 - just assign public addresses to every system (which will never overlap) or alternatively generate random prefixes for each subnet for use in ULAs (which are long enough so that collisions are very unlikely).
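To make “generate random prefixes” concrete, here is a rough Python sketch of the RFC 4193 scheme: `fd00::/8` plus a random 40-bit Global ID, which is why a collision between two independently numbered sites is vanishingly unlikely (one in 2^40 per pair):

```python
import secrets

def random_ula_prefix() -> str:
    """Return a random RFC 4193 unique local /48 prefix."""
    h = b"\xfd" + secrets.token_bytes(5)  # fd00::/8 + 40-bit random Global ID
    groups = ((h[i] << 8) | h[i + 1] for i in range(0, 6, 2))
    return ":".join(f"{g:x}" for g in groups) + "::/48"
```

(RFC 4193 actually derives the Global ID from a time-stamped hash rather than raw random bytes, but the effect is the same: 40 pseudo-random bits.)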

                                                    1. 3

                                                      I wonder if this is something a city community center or local beer hall can solve, both often have sizable rooms for events.

                                                      1. 3

                                                        Not really, as this would not be our space. The co.up model works well, but rent needs to be paid.

                                                        Investing the money into our space has never been the problem and I’m glad it worked. Investing more is now the problem.

                                                        1. 2

                                                          Any place that serves alcohol would preclude attendance by people of less than legal drinking age, and would be inappropriate for people from cultures where alcohol is frowned upon.

                                                          1. 1

                                                            I think this is mostly an American issue? There are mostly no such restrictions in Germany, and people under 16 are not very likely to attend anyways, I guess. Not that I think that meetups that require attendees to buy (expensive) drinks would be a great idea.

                                                            1. 2

                                                              It is. Under a certain age, you need to have a guardian in such places, though that can even be an older minor.

                                                              But yeah, spaces that need consumption are not good.

                                                        1. 7

                                                          How can you claim with a straight face that Go is better at concurrency than Java and C# when Go only has green threads and no user-level control whatsoever over the execution model? That is particularly important for server-side applications where you might need to separate IO-bound tasks from CPU-bound tasks inside the same process.

                                                          1. 22

                                                            Go is excellent at writing server-side applications where you need to separate IO-bound and CPU-bound tasks in the same process. The runtime does it all for you, without requiring you to complect your application code with irrelevant details like thread execution models.

                                                            1. 0

                                                              complect your application code with irrelevant details like thread execution models.

                                                              It’s very disingenuous to dismiss threading control as “irrelevant”. If that were the case, what’s this?

                                                              In a web server that simultaneously does some non-blocking and blocking IO (files and the like) and then also some CPU bound stuff, how can the Go scheduler guarantee the web server can function independently and not be interrupted by the scheduler trying to find a thread that isn’t blocked? This is not a terribly complex thing to solve with user-level control on threads and thread pools, but it becomes quite daunting with only green threads and pre-emption.

                                                              I’m not saying this can’t be done using green threads, but it is difficult. Given user control over threading, you can implement your own runtime for fibers and continuations, but you can’t do that if you only have access to green threads!

                                                              1. 1

                                                                I don’t see how the linked issue is relevant. It’s about how Linux does not support non-blocking file I/O, so Go needs a kernel-level thread for each goroutine with a blocking file I/O operation. It’s exactly the same thing in Java and C#: If you want to run tons of file I/O in parallel, you will need tons of kernel-level threads.

                                                          1. 4

                                                            What does BC mean here?

                                                            There are two big, substantial schools of thought in the PHP world. The first likes PHP roughly the way it is - dynamic, with strong BC bias and emphasis on simplicity

                                                            1. 5

                                                              “Backwards compatibility” would be my guess.

                                                            1. 3

                                                              Ah, from the days when we believed in sufficiently smart compilers. :)

                                                              1. 4

                                                                Putting our trust in sufficiently smart processors hasn’t exactly gone well either to be fair.

                                                                1. 2

                                                                  I think the bigger issue here is that software is usually compiled once per ISA and not per processor, so the compiler never gets the chance to be very smart.

                                                                1. 1

                                                                  Within Google, we have a growing range of needs…

                                                                  Something smells fishy. And Fuchsia.

                                                                  1. 6

                                                                    There’s a reply by someone on the Fuchsia team in the email thread - doesn’t look like it’s created with Fuchsia in mind so far.

                                                                    1. 2

                                                                      No kidding, since they aren’t even planning to support aarch64 in the initial implementation.

                                                                  1. 4

                                                                    It warms the heart to know that some people push back against adding syscalls just to be convenient for one set of programs. Progress needs to have reasons and be reasoned about.

                                                                    Are minimal syscall OSes akin to RISC?

                                                                    1. 5

                                                                      Are minimal syscall OSes akin to RISC?

                                                                      Microkernels have minimal functionality and thus also very few system calls. In fact, some microkernels only have a single system call for inter-process communication.

                                                                      I’m not sure whether it’s useful to reduce the number of system calls in a big monolithic kernel. I think it might lead to a complex system call interface with system calls that perform multiple (possibly unrelated) functions. This is already reality, for example with the ioctl system call in Linux that is used for lots of very different tasks.

                                                                      1. 3

                                                                        RISC no longer has anything to do with a reduced instruction count, but instead reduced instruction complexity.

                                                                        1. 1

                                                                          Somewhat relevant to the orthogonality of instruction count/complexity: https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf

                                                                      1. 1

                                                                        The built-in PEG parser is neat, but for such simple parsing tasks, a regular expression seems to be easier to write to me.

                                                                        I’m also wondering: Why is (import sh) necessary? Isn’t it implied that you want the shell functions by running janetsh?
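For comparison, here is the regex version of that sort of extraction (Python just for illustration, since Janet doesn’t ship regex; I’m assuming `git ls-remote`-style `<sha>\t<ref>` lines as the input, which the post appears to parse):

```python
import re

LINE = re.compile(r"^(?P<sha>[0-9a-f]{40})\t(?P<ref>\S+)$", re.MULTILINE)

def parse_ls_remote(output: str) -> list:
    """Extract (sha, ref) pairs from ls-remote-style output."""
    return [(m["sha"], m["ref"]) for m in LINE.finditer(output)]
```

For a flat line format like this, the regex is arguably shorter; the PEG’s advantage shows up once the grammar nests.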

                                                                        1. 1

                                                                          The built-in PEG parser is neat, but for such simple parsing tasks, a regular expression seems to be easier to write to me.

                                                                          When the parsing task gets more complicated, I think the PEG module will scale better. This post is just an educational demonstration intended to be understandable.

                                                                          I’m also wondering: Why is (import sh) necessary? Isn’t it implied that you want the shell functions by running janetsh?

                                                                          I agree, I may fix this in the future. It is just an implementation detail/limitation currently.

                                                                          1. 1

                                                                            When the parsing task gets more complicated, I think the PEG module will scale better. This post is just an educational demonstration intended to be understandable.

                                                                            Yeah, alright. I was mostly writing that because Janet doesn’t appear to support regex (yet?) and I’m wondering whether that’s an intentional omission to make people use PEG.

                                                                            1. 1

                                                                              We were discussing adding regex via a non-core but possibly official library.

                                                                              Actually in janetsh you could also pipe the output of ls-remote to grep too.

                                                                        1. 8

                                                                          This paper is in the recent trend of lightweight academic articles based on sneering at things that work. The dismissal of Linux clone is particularly bad:

                                                                          This syscall underlies all process and thread creation on Linux. Like Plan 9’s rfork() which preceded it, it takes separate flags controlling the child’s kernel state: address space, file descriptor table, namespaces, etc. This avoids one problem of fork: that its behaviour is implicit or undefined for many abstractions. However, for each resource there are two options: either share the resource between parent and child, or else copy it. As a result, clone suffers most of the same problems as fork (§4–5).

                                                                          No it doesn’t - precisely because you can select not to duplicate. I also like this

                                                                          The performance advantage of the shared initial state created by fork is less relevant when most concurrency is handled by threads, and modern operating systems deduplicate memory. Finally, with fork, all processes share the same address-space layout and are vulnerable to Blind ROP attacks

                                                                          For sure, let’s use threads in the exact same address space to avoid ROP attacks.

                                                                          The most interesting thing in the paper was that fork takes 1/2 millisecond on the machines they measured. Too slow by 10x.

                                                                          1. 6

                                                                            I agree that there was a very negative tone in the paper, and I found it quite off-putting and unprofessional. Among the scathingly negative phrases that caught my attention:

                                                                            • “fork is a terrible abstraction”
                                                                            • “fork is an anachronism”
                                                                            • “fork is hostile… breaking everything…”
                                                                            • “fork began as a hack”
                                                                            • “we illustrate the havoc fork wreaks”
                                                                            • “this is a deceptive myth”
                                                                            • “multi-threaded programs that fork are plagued with bugs”
                                                                            • “it is hard to imagine a new proposed syscall with these properties being accepted by any sane kernel maintainer”
                                                                            • “caused fork semantics to subvert the OS design”
                                                                            • “get the fork out of my OS”
                                                                            1. 2

                                                                              I hate to say it, but all those statements are either introductions to claims or much milder if you don’t cut off the whole sentence. I won’t go through all, but just some examples:

                                                                              • “We catalog the ways in which fork is a terrible abstraction for the modern programmer to use, describe how it compromises OS implementations, and propose alternatives.”

                                                                                • If you believe something is terrible, it’s fine to say it like that.
                                                                              • “fork began as a hack”

                                                                                • And the chapter with that title describes exactly that: how fork was implemented ad hoc, using quotes from Ritchie himself. It’s a hack, not a structured innovation implemented with foresight. That’s okay, but that’s what we call a hack.
                                                                              • “fork is hostile to user-mode implementation of OS functionality, breaking everything from buffered IO to kernel-bypass networking. Perhaps most problematically, fork doesn’t compose—every layer of a system from the kernel to the smallest user-mode library must support it.”

                                                                                • Describing a technology or API as “hostile” to certain uses is standard terminology. And they go on to explain why they believe it to be hostile.
                                                                              • “At first glance, fork still seems simple. We argue that this is a deceptive myth, and that fork’s effects cause modern applications more harm than good.”

                                                                                • It is indeed the case that fork has a lot of rules to maintain around it, and they are also right that fork is still taught as simple.
                                                                              • “We illustrate the havoc fork wreaks on OS implementations using our experiences with prior research systems (§5). Fork limits the ability of OS researchers and developers to innovate because any new abstraction must be special-cased for it.”

                                                                                • They rightfully claim that every abstraction needs to take into account that at any point in time, the process could be forked and the whole machinery is doubled. If something impacts almost all research in a field, strong wording is probably okay. “wreak havoc” is also not unheard of in professional settings.

                                                                              Like, I’m fine with you not agreeing with their reading, but unprofessional, I would not agree with.

                                                                              1. 2

                                                                                Like, I’m fine with you not agreeing with their reading, but unprofessional, I would not agree with.

                                                                                You omitted the most salacious phrase, “get the fork out of my OS”. This sort of wording has no place in a professional paper.

                                                                                The remaining phrases I would categorize as more scathing than salacious. This wording is appropriate for a newsgroup or mailing list, but in an academic paper, these statements are hyperbolic, and could be replaced by more objective phrasing, without sacrificing meaning or impact.

                                                                                1. 2

                                                                                  I took those phrases as tongue-in-cheek, but can agree that if you’re writing an article which may inspire strong opinions perhaps you should be mindful of how others might read it. I wouldn’t go as far as saying it’s unprofessional, however.

                                                                                  You omitted the most salacious phrase, “get the fork out of my OS”. This sort of wording has no place in a professional paper.

                                                                                  This is a good pun and I will defend to the death the authors’ right to use it in any context :)

                                                                            2. 2

                                                                              As a result, clone suffers most of the same problems as fork (§4–5).

                                                                              No it doesn’t - precisely because you can select not to duplicate.

                                                                              The only issue I see that clone clearly solves is the security problem from share-everything. How does it help with multithreading, performance (or ease-of-use in vfork mode), picoprocesses, heterogeneous hardware?

                                                                              For sure, let’s use threads in the exact same address space to avoid ROP attacks.

                                                                              I think the argument here is more like this:

                                                                              • If you need it to be fast, use threads (most people already do this)
                                                                              • If you need it to be secure, use a proper new process instead of forking from a prototype

                                                                              The authors also argue that there might be better Copy-on-Write primitives than fork.

                                                                              1. 1

                                                                                How does it help with multithreading, performance

                                                                                It’s pretty fast - especially compared to actually existing alternatives.

                                                                                picoprocesses, heterogeneous hardware

                                                                                It’s not intended to address those problems.

                                                                                1. 1

                                                                                  It’s pretty fast - especially compared to actually existing alternatives.

                                                                                  In the paper, they show that posix_spawn is consistently faster at creating processes than fork, so I’m not sure what you mean?

                                                                                  1. 1

                                                                                    Posix_spawn is vfork/exec.
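                                                                                    As a point of reference, a minimal posix_spawn sketch looks like this; it launches an external program without ever duplicating the parent’s address space. The command (`echo`) and its argument are illustrative.

                                                                                    ```c
                                                                                    #include <spawn.h>
                                                                                    #include <stdio.h>
                                                                                    #include <sys/wait.h>
                                                                                    #include <unistd.h>

                                                                                    extern char **environ;

                                                                                    int main(void) {
                                                                                        pid_t pid;
                                                                                        char *argv[] = { "echo", "spawned", NULL };

                                                                                        /* NULL file actions and attributes: inherit everything. */
                                                                                        int rc = posix_spawnp(&pid, "echo", NULL, NULL, argv, environ);
                                                                                        if (rc != 0) { fprintf(stderr, "posix_spawnp: %d\n", rc); return 1; }

                                                                                        int status;
                                                                                        waitpid(pid, &status, 0);
                                                                                        return WEXITSTATUS(status);
                                                                                    }
                                                                                    ```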

                                                                            1. 12

                                                                              This looks like something that was written just so Microsoft can have something to reference when they for the umpteenth time have to explain why CreateProcessEx is slow as balls on their systems.

                                                                              1. 4

                                                                                I doubt MS commissioned a research paper to defend against one argument made about their OS.

                                                                                  Besides, the very view that process spawning should be very cheap is just a trade-off. Windows prioritizes apps designed around threads rather than processes. You can argue one approach is superior to the other, but it really reflects different priorities and ideas about how apps are structured.

                                                                                1. 7

                                                                                  Besides, the very view that process spawning should be very cheap is just a trade-off. Windows prioritizes apps designed around threads rather than processes.

                                                                                    Windows was a VMS clone and iteration designed by the same guy, Dave Cutler. The VMS spawn command was a heavyweight one with a lot of control over what the resulting process would do. Parallelism was done by threading, maybe clustering. Contrary to sebcat’s conspiracy theory, it’s most likely that the VMS lead who became the Windows lead just reused his VMS approach for the same reasons. That the remaining UNIXes are adopting non-forking, monolithic designs in their highest-performance apps corroborates that he made the right choice in the long term.

                                                                                  That is not to say we can’t improve on his choice carefully rewriting CreateProcessEx or using some new method in new apps. I’ve been off Windows for a while, though. I don’t know what they’re currently doing for highest-performance stuff. Some lean programmers are still using Win32 API based on blog articles and such.

                                                                                  1. 3

                                                                                    That the remaining UNIX’s are doing non-forking, monolithic designs in highest-performance apps corroborates that he made the right choice in long term.

                                                                                    Assuming that you believe the highest performing apps should be the benchmark for how everyone writes everything – which I don’t agree with. Simplicity and clarity are more important for all but a small number of programs.

                                                                                    1. 3

                                                                                        I think that’s important, too. Forking fails those criteria, since it’s not the simplest or clearest method for parallelism or concurrency. It would lose to things like Active Oberon, Eiffel’s SCOOP, Cilk for data-parallel work, and Erlang. Especially if you add safety or error handling, which makes the code more complex. For heterogeneous environments (multicore to clusters), Chapel is way more readable than C, forks, or MPI.

                                                                                    2. 4

                                                                                        Despite all the bloat in Linux, process creation in Linux is a lot faster than process creation in Windows. Rob Pike was right: using threads is generally an indication that process creation and IPC are too slow and inconvenient, and the answer is to fix the OS. [ edited to maintain “no more than one ‘and yet’ per day” rule]

                                                                                      1. 3

                                                                                          Which led to designs faster than UNIX, both before and right after UNIX, by starting with a better, concurrency-enabled language that used threads and function calls instead of processes and IPC. Funny you quote Rob Pike, given his approach was more like Hansen’s (1960s-1970s) and Wirth’s (1980s). On a language level, he also said a major inspiration for Go was his experience doing rapid, safe coding on an Oberon-2 workstation. You including Rob Pike just corroborates my opinion of UNIX/C being an inferior and/or outdated design, from a different direction.

                                                                                        Back to process creation, its speed doesn’t matter outside high availability setups. The processes usually get created once. All that matters is speed of handling the computations or I/O from that point on. Processes don’t seem to have much to do with that on modern systems. They prefer close to the metal with lots of hardware offloading. The designs try to bypass kernels whether Windows or UNIX-like. Looking back, both mainframe architectures and Amiga’s used hardware/software approach. So, their models proved better than Windows or UNIX in long term with mainframes surviving with continued updates. Last press release on System Z I saw claimed over 10 billion encrypted transactions a day or something. Cost a fortune, too.

                                                                                        1. 1

                                                                                            The threads/function call design is good for some things and bad for others. The memory protection of fork/exec is very important in preventing and detecting a wide range of bugs and forcing design simplification. Oberon was, like much of Wirth’s work, cool but too limited. I think Parnas’ “Software Jewels” was inspired by Oberon.

                                                                                          As for performance, I think you are 100% wrong and have a funny idea of “better”. System Z is clearly optimized for certain tasks, but as you say, it costs a fortune. You should have to write JCL code for a month as penance.

                                                                                          1. 2

                                                                                            “The memory protection of fork/exec is very important in preventing and detecting a wide range of bugs and forcing design simplication.”

                                                                                              Those programs had all kinds of security bugs. Fork also didn’t let you control privileges or resource usage. If security is the goal, you’d do a design like VMS’s spawn, which let you control those things, or maybe a capability-oriented design like the AS/400’s. Unless you were on a PDP-11 aiming to maximize performance at the cost of everything else. Then you might get C, fork, and the rest of UNIX.

                                                                                            “As for performance, I think you are 100% wrong and have a funny idea of “better”. System Z is clearly optimized for certain tasks”

                                                                                            The utilization numbers say I’m 100% right. It comes from I/O architecture, not just workloads. Mainframe designers did I/O differently than PC knowing mixing computation and I/O led to bad utilization. Even at CPU level, the two have different requirements. So, they used designs like Channel I/O that let computation run on compute-optimized CPU’s with I/O run by I/O programs on dedicated, lower-energy, cheaper CPU’s. Non-mainframes usually ditched that since cost was main driver of market. Even SMP took forever to reach commodity PC’s. The shared architecture had Windows, Mac, and UNIX systems getting piles of low-level interrupts, having one app’s I/O drag down other apps, and so on. The mainframe apps responded to higher-level events with high utilization and reliability while I/O coprocessors handled low-level details.

                                                                                            Fast forward to today. Since that model was best, we’ve seen it ported to x86 servers where more stuff bypasses kernels and/or is offloaded to dedicated hardware. Before that, it was used in HPC with the API’s splitting things between CPU’s and hardware/firmware (esp high-performance networking). We’ve seen the software side show up with stuff like Octeon processors offering a mix of RISC cores and hardware accelerators for dedicated, networking apps. Inline-Media Encryptors, RAID, and Netezza did it for storage. Ganssle also told me this design also shows up in some embedded products where the control logic runs on one core but another cheaper, lower-energy core handles I/O.

                                                                                              Knockoffs of mainframe I/O architecture have become the dominant model for high-performance, consistent I/O. That confirms my hypothesis. What we don’t see is more use of kernel calls per operation on simple hardware, like UNIX’s original design. Folks are ditching that en masse in modern deployments since it’s a bottleneck. Whereas mainframes just keep using and improving on their winning design by adding more accelerators. They’re expensive, but their architecture isn’t when it shows up in servers or embedded. Adding a J2 core for I/O on an ancient node (180nm) costs about 3 cents a CPU. Intel added a backdoor, err, management CPU to all their CPUs without any change in price. lowRISC has minion cores. I/O-focused coprocessors can be as cheap as the market is willing to sell them to you. That’s not a technical design problem. :)

                                                                                            1. 2

                                                                                              Since cores are so cheap now (we’re using 44-core machines, and I expect the count to keep going up in the future), why are we still using system calls to do IO? Put the kernel on its own core(s) and do IO with fast disruptor-style queues. That’s the design we seem to be converging toward, albeit in userspace.

                                                                                              1. 1

                                                                                                  Absolutely true that the hugely expensive hardware I/O architecture in IBM mainframes works well for some loads if cost is not an issue. A Komatsu D575A is not better or worse than a D3K2 - just different and designed for different jobs.

                                                                                                1. 2

                                                                                                    I just quoted you something costing 3 cents and something that’s $200 on eBay for Gbps accelerators. You only focused on the hugely expensive mainframes. You must either agree the cheaper ones would work on desktops/servers or just have nothing else to counter with. Hell, Intel’s nodes could probably squeeze in a little core for each major subsystem: networking, disk, USB, and so on. Probably cost nothing in silicon for them.

                                                                                                  1. 1

                                                                                                    call intel.

                                                                                                    1. 2

                                                                                                      Nah. Enjoying watching them get what they deserve recently after their years of scheming bullshit. ;) Also, ISA vendors are likely to patent whatever I tell them about. Might talk to SiFive about it instead so we get open version.

                                                                                        2. 1

                                                                                          Windows NT was VMS-inspired, but I didn’t think Dave Cutler had any influence over Windows prior to that. Wasn’t CreateProcess available in the Win16 API?

                                                                                          I suspect the lack of fork has more to do with Windows’s DOS roots, but NT probably would have gained fork if Microsoft had hired Unix developers instead of Cutler’s team.

                                                                                          1. 1

                                                                                            Windows NT was VMS-inspired, but I didn’t think Dave Cutler had any influence over Windows prior to that. Wasn’t CreateProcess available in the Win16 API?

                                                                                            You could have me on its prior availability. I don’t know. The Windows API article on Wikipedia says it was introduced in Windows NT. That’s Wikipedia, though. I do know that Windows NT specifically cloned a lot from VMS via Russinovich’s article.

                                                                                            1. 2

                                                                                                I think I was mistaken; looking at the Windows 95 SDK (https://winworldpc.com/product/windows-sdk-ddk/windows-95-ddk), CreateProcess was at the time in Win32 but not Win16. I guess that makes sense – what would CreateProcess do in a cooperatively multitasked environment?

                                                                                              Most of what I know about NT development comes from the book Showstoppers.

                                                                                      2. 2

                                                                                        Any source on performance issues with CreateProcessEx? A quick search didn’t yield anything interesting. Isn’t CreateProcessEx very similar to the posix_spawn API which the authors describe as the fast alternative to fork/exec in the paper?

                                                                                        1. 2

                                                                                          Alternatively they could just implement fork(). It’s nowhere near as hard as they’re making it out to be.

                                                                                          1. 3

                                                                                            fork is a bad API that significantly constrains OS implementation, so it is very understandable why Microsoft is reluctant to implement it.

                                                                                            1. 3

                                                                                              Meh, says you but if you already have a process abstraction it’s not really that much harder to clone a memory map and set the IP at the point after the syscall than it is to set up a new memory map and set the IP at the start address. I don’t buy that it “significantly” constrains the implementation.

                                                                                              1. 8

                                                                                                Effort isn’t the argument here, but the semantics of it. fork is a really blunt hammer. For example, Rust has no bindings to fork in stdlib for various reasons, one being that many types (e.g. file handles) have state attached to their memory representation not known to fork and that state becomes problematic in the face of fork. This is a problem also present in programs written in other programming languages, also C, but generally glossed over. It’s not a problem for “plain old data” memory, but once we’re talking about copying resource handles, stuff gets messy.

                                                                                                1. 1

                                                                                                  Can you elaborate? FILE* responds to being forked just fine. What Rust file metadata needs to be cleaned up after fork()?

                                                                                                  1. 2

                                                                                                    Files are the simplest case. They are just racy in their raw form and you need to make sure everyone closes them properly. You can work against that by using RAII and unique pointers (or Ownership in Rust), but all those guarantees break on fork(). But even files with fork have a whole section in POSIX and improper handling may be undefined.

                                                                                                    It gets funnier if your resource is a lock and your memory allocator has locks.

                                                                                                    Sure, all this can be mitigated again, but that adds ton of complexity. My point is: fork may seem clean and simple, but in practice is extremely messy in that it does not set up good boundaries.
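                                                                                                      To make the allocator-lock hazard concrete, here is a sketch of the classic (partial) mitigation: pthread_atfork() handlers acquire the lock before fork() and release it on both sides, so the child never inherits a lock held by a thread that doesn’t exist in its copy. The `resource_lock` here is a hypothetical stand-in for an allocator’s internal lock. Compile with -pthread.

                                                                                                      ```c
                                                                                                      #include <pthread.h>
                                                                                                      #include <stdio.h>
                                                                                                      #include <sys/wait.h>
                                                                                                      #include <unistd.h>

                                                                                                      static pthread_mutex_t resource_lock = PTHREAD_MUTEX_INITIALIZER;

                                                                                                      /* Hold the lock across fork() so neither side inherits it mid-update,
                                                                                                       * then release it in both parent and child. */
                                                                                                      static void before_fork(void) { pthread_mutex_lock(&resource_lock); }
                                                                                                      static void after_fork(void)  { pthread_mutex_unlock(&resource_lock); }

                                                                                                      int main(void) {
                                                                                                          pthread_atfork(before_fork, after_fork, after_fork);

                                                                                                          pid_t pid = fork();
                                                                                                          if (pid == 0) {
                                                                                                              /* Safe: the child's copy of the lock was released by the handler. */
                                                                                                              pthread_mutex_lock(&resource_lock);
                                                                                                              puts("child acquired lock");
                                                                                                              pthread_mutex_unlock(&resource_lock);
                                                                                                              fflush(stdout);
                                                                                                              _exit(0);
                                                                                                          }
                                                                                                          waitpid(pid, NULL, 0);
                                                                                                          pthread_mutex_lock(&resource_lock);
                                                                                                          puts("parent acquired lock");
                                                                                                          pthread_mutex_unlock(&resource_lock);
                                                                                                          return 0;
                                                                                                      }
                                                                                                      ```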

                                                                                                    1. 1

                                                                                                      Files are the simplest case. They are just racy in their raw form and you need to make sure everyone closes them properly. You can work against that by using RAII and unique pointers (or Ownership in Rust), but all those guarantees break on fork().

                                                                                                      There are no close() races on a cloned file descriptor across forked processes, nor are there file descriptor leaks if one forked process does not call close() on a cloned file descriptor.

                                                                                                      It gets funnier if your resource is a lock and your memory allocator has locks.

                                                                                                      How so? malloc() and fork() work fine together.

                                                                                            2. 1

                                                                                              But one of the main issues with fork that the authors describe is that it gets really slow for processes that use a large address space because it has to copy all the page tables. So I don’t see how implementing fork would help with performance?

                                                                                              1. 0

                                                                                                Nobody calls fork() in a loop. An efficiency argument isn’t relevant.

                                                                                                1. 2

                                                                                                  In shell scripts usually most of the work is done by external programs. So shells use fork/exec a lot.
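                                                                                                      The fork/exec pattern shells rely on, sketched minimally (`echo` stands in for an arbitrary external program):

                                                                                                      ```c
                                                                                                      #include <stdio.h>
                                                                                                      #include <sys/wait.h>
                                                                                                      #include <unistd.h>

                                                                                                      /* Run one external command and return its exit status, shell-style. */
                                                                                                      static int run(char *const argv[]) {
                                                                                                          pid_t pid = fork();
                                                                                                          if (pid < 0) { perror("fork"); return -1; }
                                                                                                          if (pid == 0) {                 /* child: become the command */
                                                                                                              execvp(argv[0], argv);
                                                                                                              perror("execvp");           /* only reached if exec fails */
                                                                                                              _exit(127);
                                                                                                          }
                                                                                                          int status;                     /* parent: wait for the command */
                                                                                                          waitpid(pid, &status, 0);
                                                                                                          return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
                                                                                                      }

                                                                                                      int main(void) {
                                                                                                          char *cmd[] = { "echo", "hello from child", NULL };
                                                                                                          return run(cmd);
                                                                                                      }
                                                                                                      ```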

                                                                                                  1. 0

                                                                                                    But not “a lot” to the point where progress is stalled on fork/exec.

                                                                                                  2. 1

                                                                                                    I mean, the parent comment by @sebcat complains about process creation performance, and you suggest that implementing fork would help with that, so you do argue that it is efficient. Or am I reading your comment wrong?

                                                                                                    1. 1

                                                                                                      Ah okay, I see. I was under the impression that he was referring to the case where people complain about fork() performance on Windows because it is emulated using CreateProcessEx() (which he may not have been doing). My point was that if they implemented fork() in the kernel, they wouldn’t have to deal with those complaints (which are also misguided since CreateProcessEx / fork() performance should never be relevant).

                                                                                                    2. 1

                                                                                                      A loop isn’t necessary for efficiency to become relevant. Consider: most people abandoned CGI because (among other reasons) fork+exec for every HTTP request doesn’t scale well (and this was after most implementations of fork were already COW).

                                                                                                      1. 1

                                                                                                        I can’t blame you, but that’s an excessively literal interpretation of my statement. By “in a loop,” I mean that the program is fork+exec bound, which happens on loaded web servers, and by “nobody” I mean “nobody competent.” It isn’t competent to run a highly trafficked web server using the CGI model and expect it to perform well since process creation per request obviously won’t scale. CGI was originally intended for small-scale sites.

                                                                                                  3. 1

                                                                                                    The bureaucracy at large, slow and old corporations partly explains this. This paper maybe took 6 months - 1 year. Adding fork() (with all the formalities + technical details + teams involved) would take 5-10 years IMHO.

                                                                                                    Much easier to just include it as “yet another app”, e.g. WSL:

                                                                                                    When a fork syscall is made on WSL, lxss.sys does some of the initial work to prepare for copying the process. It then calls internal NT APIs to create the process with the correct semantics and create a thread in the process with an identical register context. Finally, it does some additional work to complete copying the process and resumes the new process so it can begin executing.

                                                                                                    https://blogs.msdn.microsoft.com/wsl/2016/06/08/wsl-system-calls/