Threads for acln

  1. 10

    Minimal Version Selection itself is fine, in my experience neither better nor worse than what I guess is the de facto standard of… Maximal {Patch, Major} Version Selection? It biases for churn reduction during upgrades, maybe, which at least for me doesn’t have much net impact on anything.

    But a lot of the ancillary behaviors and decisions that come with MVS as expressed in Go modules are enormously frustrating — I would go so far as to say fundamentally incompatible with the vast majority of dependency management workflows used by real human beings. As an example, in modules, it is effectively not possible to constrain the minor or patch versions of a dependency. You express module v1.2.3 but if another dependency in your dep graph wants module v1.4.4 you’ll get v1.4.4. There is no way to express v1.2.x or v1.2.3+. You can abuse the require directive to lock to a single specific version, but that constraint isn’t transitive. And module authors can retract specific versions, but that power is not granted to consumers.
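
    To make that concrete, here is a minimal go.mod sketch (module paths hypothetical): the require directive only ever states a minimum, and there is no syntax for an upper bound.

    module example.com/app

    go 1.14

    // This states a *minimum*, not a pin: if anything else in the build graph
    // requires example.com/dep at v1.4.4, MVS selects v1.4.4 here too. There is
    // no way to write "v1.2.x only" or "at least v1.2.3 but below v1.3".
    require example.com/dep v1.2.3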

    edit: for more of my thoughts on this exciting topic see Semantic Import Versioning is unsound

    1. 5

      I would go so far as to say fundamentally incompatible with the vast majority of dependency management workflows used by real human beings.

      I hope that this post with the experience reports of many real human beings claiming that it works well for them will help you reconsider this opinion. Perhaps it’s not as much of a vast majority as you think it is.

      1. 3

        None of the experience reports really speak to the points I raise. Maybe that’s your point.

      2. 2

        The position of the Go team is (was?) that packages with the same import path must be backwards compatible. I guess their point is that v1.4.4 should be compatible with v1.2.3, but that’s not how the rest of the software world has worked in the past decade. It’s a big if, but if all the Go programmers agree, it works.

        1. 6

          That’s not (only?) the position of the Go team, it’s a requirement of semantic versioning. People fuck it up all the time but that’s the definition. One way to look at the problem with modules is that they assume nobody will fuck it up. Modules assumes all software authors treat major version API compatibility as a sacrosanct and inviolable property, and therefore provides no affordances to help software consumers deal with the fuckups that inevitably occur. If one of your dependencies broke an API in a patch version bump, well, use a better different dependency, obviously!

          Ivory Tower stuff, more or less. And in many ways Go is absolutely an Ivory Tower language! It tells you the right way to do things and if you don’t agree then too bad. But the difference is that if you don’t like the “I know better than you” decisions that Go the language made, you can pick another language. But if you don’t like the “I know better than you” decisions that modules made, you can’t pick another package management tool. Modules has a monopoly on the space. That means it doesn’t have the right to make the same kind of normative assumptions and assertions that the language itself makes. It has to meet users where they are. But the authors don’t understand this.

          1. 3

            Semantic versioning is, unfortunately, incompatible with graceful deprecation. Consider the following example:

            • A is the version you’re using.
            • B introduces a new API and deprecates an old one
            • C removes the deprecated API.

            In SemVer, these things would be numbered something like 1.0, 1.1, 2.0. The jump from 1.1 to 2.0 is a breaking change, because the old API goes away at that point. But if you paid attention to the deprecation warnings when you upgraded to 1.1 and fixed them, then the 1.1 -> 2.0 transition is not a breaking change for you. SemVer has no way of expressing this with a single exported version, and this leads to complex constraints on the imported version (in this three-version case, the requirement is >=1.1, <3.0). A lot of these things would be simpler if APIs, not packages, were versioned, and the package advertised the range of APIs that it supported. Then you’d see:

            • A is the 1.0 API
            • B supports the 1.0 API and the 2.0 API
            • C supports the 2.0 API

            As a consumer, I’d just specify that I need the 1.0 API until I’ve migrated my code and then that I need the 2.0 API.
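
            To illustrate, here is a minimal sketch in Go (names and version numbers hypothetical) of how resolution against advertised API ranges could work:

            // Hypothetical model: releases advertise a range of supported API versions,
            // and consumers ask for an API version rather than a release version.
            package main

            import "fmt"

            type release struct {
            	name           string
            	apiMin, apiMax int // inclusive range of supported API versions
            }

            func supports(r release, api int) bool {
            	return r.apiMin <= api && api <= r.apiMax
            }

            func main() {
            	releases := []release{
            		{"A", 1, 1}, // supports only the 1.0 API
            		{"B", 1, 2}, // supports the 1.0 and 2.0 APIs
            		{"C", 2, 2}, // supports only the 2.0 API
            	}
            	need := 1 // a consumer still on the 1.0 API
            	for _, r := range releases {
            		fmt.Printf("release %s usable: %v\n", r.name, supports(r, need))
            	}
            }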

            1. 1

              One way to look at the problem with modules is that they assume nobody will fuck it up.

              So does everyone else in the world who builds with ^version and no lock file; except that for them, the breakage happens when the author of their dependency publishes a new version, rather than when they themselves perform a conscious action to alter dependencies in some way.

              1. 3

                Yeah but in most package management systems you can pin specific versions.

              2. 1

                […] and therefore provides no affordances to help software consumers […]

                If one of your dependencies broke an API in a patch version bump, well, use a better different dependency, obviously!

                You can use an exclude directive to remove the breaking version from consideration. If upstream delays fixing the breaking change, you can fork the last good version and use a replace directive to substitute your own patched module in place of the old module.
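
                Concretely, something like this in your go.mod (module paths hypothetical):

                module example.com/app

                go 1.14

                require example.com/broken v1.2.2

                // drop the known-bad release from version selection entirely
                exclude example.com/broken v1.2.3

                // or substitute a patched fork everywhere the module is used
                replace example.com/broken => example.com/me/broken-fork v1.2.4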

                It’s hard to imagine this hypothetical happening, however (and please correct me if I’m wrong). MVS selects the maximum of the minimum required versions. If some module called “broken” publishes a patch and the new version breaks your own software, there is no way for that update to propagate to your software unless both 1) a different dependency of your module, called “dep”, decides to start requiring the new version of “broken”, and 2) you update your dependencies to require the updated version of “dep”. (1) implies that “dep” truly requires the patch (that broke your code), and (2) implies that you truly require the new features in module “dep”. By transitivity… there is no conceivable way to fix the problem without patching it yourself and replacing.

                There’s actually a whole paragraph on this topic of “high fidelity” in Russ Cox’s original blog post about the version selection algorithm.

                1. 2

                  You can use an exclude directive to remove the breaking version from consideration.

                  I meant “broke an API” in the “API compatibility” sense, not in the “wrote a bug” sense. That kind of broken carries forward.

            2. 1

              So your article states: “It’s a contract with its consumers, understood by default to be supported and maintained indefinitely.”

              I don’t think this follows from anything you have written or anything I have read about SIV. The way SIV works sounds to me like this: if you want to deprecate features from your library, you should provide a reasonable deprecation policy, one that includes a time period for which you will provide bug fixes for the old major version and a time period for which you will backport security fixes to it, at which point you stop supporting that version, since you’ve done the best you could to get old users moved to a new version. This seems to me like how a lot of major software (not written in the past 5 years) basically works.

              “At Google, package consumers expect their dependencies to be automatically updated with e.g. security fixes, or updated to work with new e.g. infrastructural requirements, without their active intervention.”

              I expect this to happen on my linux desktop too. I don’t see a difference in expectations there.

              “Stability is so highly valued, in fact, that package authors are expected to proactively audit and PR their consumers’ code to prevent any observable change in behavior.”

              I think if you feel like writing a library/module/dependency then this is the kind of mindset you are obliged to take. Anything short of this kind of approach to writing a library/module/dependency is irresponsible and makes you unqualified to write libraries, modules or dependencies. This, to me, seems to have been the mindset for a long time in software until languages came along with language package managers and version pinning in the last few years. And I don’t think that this has been a positive change for anyone involved.

              “As I understand it, even issues caused by a dependency upgrade are considered the fault of the dependency, for inadequate risk analysis, rather than the fault of the consumer, for inadequate testing before upgrading to a new version.”

              And I agree with this wholeheartedly, in fact this is the mindset used by users of linux distributions and distribution maintainers.

              “Modules’ extraordinary bias toward consumer stability may be ideal for the software ecosystem within Google, but it’s inappropriate for software ecosystems in general.”

              I think it’s not inappropriate, it’s totally appropriate. I just think that modern software ecosystems have gotten lazy because it’s easier than doing it right (which is what google seems to be advocating for a return to).

              I should say, I don’t disagree with the point you make about intrinsically linking the major version to the package name: Go should definitely NOT do that, for the reasons you outlined. Though it would, admittedly, be an easy indicator for me when picking a project to use in my codebase: Is the codebase on major version 156? Yes? Then I probably don’t want to touch it, because the developers are not taking the responsibility of maintaining a dependency very seriously.

              People who want to play in the sandpit of version pinning and ridiculously high major version numbers because they think software development is an area where no thought or effort should be put into backwards compatibility should be welcome to use whatever language they want to without being artificially limited.

              Now, conversely, I would say there seems to be an obvious solution to this problem too. If you want to use semver while keeping to the golang rules, why not just encode the real semver version within the go version: “0.015600310001”. Sure, it’s not exactly human readable, but it seems to encode the right information, and you just need to pretty-print it.
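
              A toy sketch of one possible reading of that scheme (the packing format, four digits per component, is made up):

              // Toy sketch: pack a real semver triple into a single Go-visible patch
              // number, four digits per component, then pretty-print it back out.
              package main

              import "fmt"

              func encode(major, minor, patch int) string {
              	return fmt.Sprintf("v0.0.%04d%04d%04d", major, minor, patch)
              }

              func decode(v string) (major, minor, patch int, err error) {
              	_, err = fmt.Sscanf(v, "v0.0.%4d%4d%4d", &major, &minor, &patch)
              	return
              }

              func main() {
              	fmt.Println(encode(156, 31, 1)) // v0.0.015600310001
              	fmt.Println(decode("v0.0.015600310001"))
              }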

              “Additionally, policies that bias for consumer stability rely on a set of structural assumptions that may exist in a closed system like Google, but simply don’t exist in an open software ecosystem in general.”

              I will take things back to the world of linux distributions where these policies actually do seem to exist.

              “A bias towards consumers necessarily implies some kind of bias against authors.”

              Yes, and this is a good thing. Being the author of a dependency is a very big responsibility, and a lot of modern build systems and language package managers fail to make that very clear.

              “But API compatibility isn’t and can’t be precisely defined, nor can it even be discovered, in the P=NP sense.”

              This is true, but in reality there’s a pretty big gulf between best effort approaches to API compatibility (see: linux kernel) and zero effort approaches to API compatibility (see: a lot of modern projects in modern languages).

              “Consequently, SIV’s model of versioning is precisely backwards.”

              Actually, that would surely be SemVer’s fault, not SIV’s.

              “Finally, this bias simply doesn’t reflect the reality of software development in the large. Package authors increment major versions as necessary, consumers update their version pins accordingly, and everyone has an intuitive understanding of the implications, their risk, and how to manage that risk. The notion that substantial version upgrades should be trivial or even automated by tooling is unheard of.”

              Maybe today this is the case, but I am pretty sure this is only a recent development. Google isn’t asking you to do something new, Google is asking you to do something old.

              “Modules and SIV represent a normative argument: that, at least to some degree, we’re all doing it wrong, and that we should change our behavior.”

              You’re all doing it wrong and you should change your behavior.

              “The only explicit benefit to users is that they can have different versions of the “same” module in their compilation unit.”

              You can achieve this without SIV; to me, SIV actually seems like just a neat hack for achieving it.

              In any case, I think I’ve made my point mostly and at this point I would be repeating myself.

              I wonder what you think.

              1. 1

                People who want to play in the sandpit of version pinning and ridiculously high major version numbers because they think software development…

                …is a means and not an end are the norm, not the exception. And the fact that people work this way is absolutely not because they’re lazy, it’s because it’s the rational choice given their conditions, the things they’re (correctly!) trying to optimize for, and the (correct!) risk analysis they’ve done on all the variables at play.

                I appreciate your stance but it reflects an Ivory Tower approach to software development workflows (forgive the term) which is generally both infeasible and actually incorrect in the world of market-driven organizations. That’s the context I speak from, and the unambiguous position I’ve come to after close to 20 years’ experience in the space, working myself in a wide spectrum of companies and consulting on exactly these topics for ~100 orgs at this point.

                Google has to work this way because their codebase is so pathological they have no other choice. Many small orgs, or orgs decoupled from typical market dynamics, can work this way because they have the wiggle room, so to speak. They are the exceptions.

                1. 1

                  the things they’re (correctly!) trying to optimize for, and the (correct!) risk analysis they’ve done on all the variables at play

                  Disagree.

                  At least don’t call the majority of software developers “engineers” if you’re going to go this way.

                  The fact that this is considered an engineering discipline with such low standards is really an insult to actual engineering disciplines. I can totally see how certain things don’t need to be that rigorous, but really, seriously, what is happening is not par for the course.

                  The fact that everyone including end users has become used to the pathological complacency of modern software development is really seriously ridiculous and not an excuse to continue down this path. I would go so far as to say that it’s basically unethical to keep pretending like nothing matters more than making something which only just barely works, within some not-even-that-optimal constraints, for the least amount of money. It’s a race to the bottom, and it won’t end well. It’s certainly not sustainable.

                  I appreciate your stance but it reflects an Ivory Tower approach to software development workflows (forgive the term) which is generally both infeasible and actually incorrect in the world of market-driven organizations. That’s the context I speak from, and the unambiguous position I’ve come to after close to 20 years’ experience in the space, working myself in a wide spectrum of companies and consulting on exactly these topics for ~100 orgs at this point.

                  It’s incorrect in the world of market-driven organizations only because there’s a massive gap between the technical ability of the consumers of these technologies and that of the producers, so much so that it’s infeasible to expect a consumer of these technologies to be able to see them for the trash they are. But I think that this is not “correct”, it’s just “exploitative”. Exploitative of the lack of technical skill and understanding of the average consumer of these technologies.

                  I don’t think the “correct” response is “do it because everyone else is”. It certainly seems unethical to me.

                  That being said, you are talking about this from a business point of view not an open source point of view. At least until open source got hijacked by big companies, it used to be about small scale development by dedicated developers interested in making things good for the sake of being good and not for the sake of a bottom line. This is why for the most part my linux system can “just update” and continue working, because dedicated volunteers ensure it works.

                  Certainly I don’t expect companies to care about this kind of thing. But if you’re talking about solely the core open source world, “it’s infeasible in this market” isn’t really an argument.

                  Google has to work this way because their codebase is so pathological they have no other choice. Many small orgs, or orgs decoupled from typical market dynamics, can work this way because they have the wiggle room, so to speak. They are the exceptions.

                  I honestly don’t like or care about google or know how they work internally. I also don’t like go’s absolutely inane idea that it’s sensible to “depend” on a github hosted module and download it during a build. There’s lots of things wrong with google and go but I think that this versioning approach has been a breath of fresh air which suggests that maybe just maybe things may be changing for the better. I would never have imagined google (one of the worst companies for racing to the bottom) to be the company to propose this idea but it’s not a bad idea.

                  1. 2

                    The fact that everyone including end users has become used to the pathological complacency of modern software development is…

                    …reality. I appreciate your stance, but it’s unrealistic.

            1. 11

              Indeed, errors as regular values enable some elegant patterns that are cumbersome when errors (exceptions) are tied to control flow.

              However, Go’s particular implementation of errors-as-values leaves a lot to be desired. Being based on multiple return rather than sum types (enums with data), it can’t capture a result as a whole, as a single value, so it’s missing out on even more of the elegance of expressing fallible function results as values. And of course it’s famously verbose. Other languages have shown that with a bit of syntax sugar it’s possible to have the best of both worlds: code almost as noise-free as exception-based error handling, prevention of accidentally-unhandled errors, and still the flexibility of errors as values.
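
              For example (a minimal sketch; the result wrapper is a hypothetical stand-in for a real sum type): because the (value, error) pair only ever exists as two separate variables, capturing fallible results as data means hand-rolling the type yourself.

              // Sketch: Go's (T, error) pair is not a first-class value, so collecting
              // the outcomes of fallible calls requires a hand-written wrapper struct.
              package main

              import (
              	"fmt"
              	"strconv"
              )

              // result is what a built-in sum type would give us for free.
              type result struct {
              	val int
              	err error
              }

              func main() {
              	inputs := []string{"1", "two", "3"}
              	var results []result
              	for _, s := range inputs {
              		v, err := strconv.Atoi(s) // the pair exists only as two variables here
              		results = append(results, result{v, err})
              	}
              	for _, r := range results {
              		if r.err != nil {
              			fmt.Println("error:", r.err)
              			continue
              		}
              		fmt.Println("ok:", r.val)
              	}
              }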

              1. 8

                Rust and Zig are two languages that do Go-style error handling far better than Go does. Go feels so cumbersome in comparison.

                1. 3

                  I am going to be facetious, but there is a point to this post, I promise.


                  it’s missing out on even more of the elegance of expressing fallible function results as values. And of course it’s famously verbose.

                  On the contrary, Go exposes the reality of what it means to simply send an error up the stack without decorating it. Go forces the programmer to think about precise error reporting!

                  it’s possible to have the best of both worlds: code almost as noise-free as exception-based error handling,

                  Someone’s noise is someone else’s precision! Go sits in a wonderful middle ground between error codes and exceptions, where the frequency of if err != nil blocks encourages programmers to decorate errors more precisely than they would otherwise do when (ab)using Rust’s ? (which throws away context!), or when using Java exceptions (which tend to involve noisy stack traces!).

                  preventing accidentally-unhandled errors

                  Linters exist! It’s $CURRENT_YEAR, you should have a linter in your CI script anyway, why not give it an extra task?

                  while still having flexibility of errors as values.

                  Not nearly as elegantly and as simply as Go allows it, through Go 1.13 error (un)wrapping and inspection!


                  I am exaggerating, somewhat. Perhaps my post sounds like a joke, especially when you consider how similar the statements “Go forces you to think about precise error reporting!” and “Rust forces you to think about precise ownership of memory!” sound. I suppose it is somewhat of a joke, but I am trying to point something out:

                  I am reminded of Bryan Cantrill’s talk about platforms and values. I think that if one is to enjoy Go error handling, one must value very precise error reporting to begin with (or eventually be molded into this attitude by using the language). It seems to me that this position is not unheard of in conversations on the internet about the language, but most people are on the side of “Go is famously verbose”, from what I can tell. Perhaps this hints at a more general state of affairs in the industry with respect to how much people value precise error reporting.

                  Earnestly, Rust’s ? does indeed seem to be the best of both worlds, because it’s very easy to go from ? to error decoration, but Go doesn’t have a mechanism to compact if err != nil. Not that I’d (ever?) use such a mechanism personally, but people really seem to want it, so there is that.
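
                  For reference, the Go 1.13 (un)wrapping I mentioned above looks like this (a small sketch; the config-loading function is made up):

                  // Sketch of Go 1.13 error (un)wrapping: decorate with %w at each level,
                  // then inspect the wrapped chain with errors.Is / errors.As.
                  package main

                  import (
                  	"errors"
                  	"fmt"
                  	"os"
                  )

                  func loadConfig(path string) error {
                  	f, err := os.Open(path)
                  	if err != nil {
                  		return fmt.Errorf("loading config %q: %w", path, err)
                  	}
                  	defer f.Close()
                  	// ... parse the file ...
                  	return nil
                  }

                  func main() {
                  	err := loadConfig("/no/such/file")
                  	fmt.Println(err) // context added by the caller, cause preserved
                  	if errors.Is(err, os.ErrNotExist) {
                  		fmt.Println("the wrapped cause is still inspectable")
                  	}
                  }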

                  1. 2

                    I am reminded of Bryan Cantrill’s talk about platforms and values. I think that if one is to enjoy Go error handling, one must value very precise error reporting to begin with (or eventually be molded into this attitude by using the language)

                    My only personal annoyance with Go’s syntax has been the behavior of := with respect to desugaring the tuple/product that is returned by a function that can error out, where I’ve inadvertently lost track of an error. But yes, linters have caught this for me several times. In long code blocks with lots of IO, I also find the error handling to distract from logic, but this also doesn’t occur frequently enough in my code to bother me overly.
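
                    The shape of that bug, for anyone unfamiliar (a minimal sketch):

                    // Sketch of the := shadowing footgun: inside the if block, := declares
                    // *new* f and err variables, so the outer err is never assigned and the
                    // error is silently lost. This is what linters have caught for me.
                    package main

                    import (
                    	"fmt"
                    	"os"
                    )

                    func main() {
                    	var f *os.File
                    	var err error
                    	if len(os.Args) > 1 {
                    		f, err := os.Open(os.Args[1]) // BUG: shadows the outer f and err
                    		_ = f
                    		_ = err
                    	}
                    	if err != nil { // always nil here; the real error went to the shadow
                    		fmt.Println("open failed:", err)
                    	}
                    	if f == nil {
                    		fmt.Println("outer f was never assigned")
                    	}
                    }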

                    Earnestly, Rust’s ? does indeed seem to be the best of both worlds, because it’s very easy to go from ? to error decoration

                    While I find this compaction very powerful, I think it’s also a bit of a footgun. I’ve frequently seen ? used to bubble up errors and ignore them. One of my favorites that plagues the Rust ecosystem is when reading from stdin. If stdin is closed early (e.g. when using head to read the top of a file), then a read will return an Err(...), and most programs will just bail because they used a ?. I appreciate the ? tool, but its ability to stuff an error out of a programmer’s view can be troubling. I like the pattern of using custom error types everywhere but the very top level of the program, but that level of required discipline is exactly what makes ? a footgun in my opinion.

                1. 9

                  Very, very good. Much of this resonates with my desires for modern computing, and my disappointment with “desktops” as they are today.

                  Where I do find a little solace is inside Acme, which has been my daily driver for programming work for ~8 years now. Acme deals only in text, but it has essential properties that very few other environments have. I can identify many of these properties in the principles you have enunciated.

                  • In reference to #1 - Professionals First: Acme is a programmer’s environment, through and through. Most UNIX tools (not the ones with the insufferable terminal spinners, though, like npm!) find themselves right at home inside Acme, and Acme integrates them in a lovely way. Acme is an Integrating Development Environment, as such.
                  • In reference to #2 - Diversify Experience: Acme has “client side links” in the sense that the user writes so-called plumbing rules, and if it matches a plumbing rule, every bit of text in a window can become a “link” when you right click swipe it. There are a few default rules, such as rules for file paths, but you can extend them to your liking.
                  • In reference to #11 - Defer Composition: “Popups” that tools create are real windows inside acme. If you execute a command from inside Acme, its output is directed to a new window, which you can manipulate like any other window. This enables the construction of context-aware “menus” that are, after all, only regular Acme windows.
                  • In reference to #12 - Simplicity is Systemic: Acme exposes all of its internal state as a file system in userspace (see the sketch after this list). As such, acme can be extended using programs written in any language, not just the tired old “scripting language” of choice, such as VimScript, elisp, Lua, JavaScript, etc. I have extensions to acme written in shell script, C, Python, and Go.
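
                  As promised above, a quick taste of that file-system interface, using plan9port’s 9p client (a sketch; assumes acme is running, and the file name is made up):

                  # list acme's file interface
                  9p ls acme

                  # reading new/ctl creates a fresh window; the first field is its id
                  winid=$(9p read acme/new/ctl | awk '{print $1}')

                  # drive the window through its ctl file: name it, then load the file
                  echo "name /tmp/notes.txt" | 9p write acme/$winid/ctl
                  echo "get" | 9p write acme/$winid/ctl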

                  Much more can be said, but I will stop here, since these things are better experienced than described. In case you are unaware of Plan9 and/or Acme, I hope this post contains information that might interest you. If you are already aware of Plan9 and/or Acme, then I hope this post contains encouragement and validation (if you need it at all, at this point). Arcan seems like a very ambitious project. Unlike Acme, which restricts itself only to text (and gains tremendous power in the process), it seems like it is aiming to invent a whole new multimedia userland. I hope it succeeds.

                  1. 3

                    I would be surprised if the author had no prior knowledge about plan9.

                    I find acme both liberating and infuriating: while the features you mention are indeed great, I find it difficult to live with the keyboard navigation limitations. That and the weird compose key behaviour :) Acme is pretty easy to patch and extend however, so I added what I missed the most, but still use vim as my daily driver.

                    I have had a look at different interesting projects inspired by acme.

                    1. 1

                      I’ve looked a fair bit at 8½/plan9/9p/rio/acme/plumber and most of the stuff Rob Pike has published on the matter - it is something of a recurring theme in the IRC channel. Going back a few years, there is a blog post about an experiment preceding this article, “One night in Rio: Vacation photos from Plan9” (2017), looking at the window management scheme specifically.

                      There is something in the pipeline, nearly ready to show off, that to me at least serves as the missing link between wm/cli/ide and has some acmeisms. Though I think that much of acme can be reproduced in durden, should the editor part (neovim ui driver using our curses-like api) improve a wee bit.

                      What I am curious about, should you have the time and it’s not too much of an inconvenience: video documentation of acme workflows is quite few and far between (livestream, silent video, heck, animated gif). I think there are quite a few people who would be interested in seeing some patterns of established / daily-driver use.

                      1. 2

                        Russ Cox’s tour of Acme is pretty good video documentation of what daily use of acme looks like. I’m currently on vacation, and I may record a session or a stream of my own when I return. Unlike that video, in my case, there are three columns. The right column always contains a shell session and the guide file open for the project I’m working on. The guide file is a list of common commands for a project. The left column contains mostly read-only code and documentation, while the middle column is read-write. Russ’ video doesn’t show this, but I also make extensive use of the Dump command to save state and switch between projects.

                      2. 1

                        How are you running Acme these days?

                        1. 1

                          My work is Linux-based, so I use Acme from plan9port on Debian.

                      1. 21

                        There is just such a huge ergonomics hit. On top of that, any time I’ve compared benchmarks between async and threaded code with the cpu frequency stabilized for repeatable results, async has usually resulted in lower throughput and only in unrealistically low-cpu workloads have I sometimes measured latency improvements. Far worse ergonomics, more error prone code, worse compiler inference, tons of dependencies, etc… for approximately equal or worse throughput and latency.

                        Have other people run responsibly controlled benchmarks that show significant throughput improvements on modern server operating systems when using async? It’s kind of weird to me that people will go through all of this pain because some random person on the internet told them it was better, but it doesn’t seem like many people have seriously evaluated the costs and benefits.

                        If you want, try out this echo example after disabling turbo boost and see for yourself:

                        # build with turbo boost
                        cargo build --release --bins
                        
                        # disable turbo boost for repeatable results
                        echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
                        
                        # start threaded and async servers
                        cargo run --release --bin req_res_threads & # starts on port 7000
                        cargo run --release --bin req_res_async & # starts on port 7001
                        
                        # see how long it takes to receive 100k echo round trips of 4k buffers from 10 concurrent clients
                        time cargo run --release --bin req_res_sender -- 7000 10 # "bad" thread per client
                        time cargo run --release --bin req_res_sender -- 7001 10 # async
                        
                        # re-enable turbo boost for more enjoyable computing
                        echo 0 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
                        

                        On my linux servers, threads tend to beat async throughput by 5-20%. The threaded version is doing a new thread per client. The async version is using a multi-threaded work-stealing executor.

                        It’s interesting to strace the async version to see how many more syscalls can be generated when hammering on epoll.

                        Anyway, can we please start measuring more? So much effort is being spent for negative gains :/

                        1. 16

                          It’s kind of weird to me that people will go through all of this pain because some random person on the internet told them it was better, but it doesn’t seem like many people have seriously evaluated the costs and benefits.

                          It is fascinating how much damage early 2000’s “threads = slow, async = fast” FUD can do. Almost all of it doesn’t apply anymore, anyway. It seems like this meme has somehow been imparted into the collective programmer (un)conscious, and it seems impossible to root out at this point. Remember Node.js being marketed as “everything is async, therefore very fast”?

                          Even today, somehow, being async is revered as a great feature. The very reason for “being async” seems to have long been forgotten by the general public. I have first hand experience with this phenomenon from teaching people Go. When they start learning, at some point, they ask the spooky question: “Does this call block a thread? I heard blocking is slow.”, and a wise Go sage promptly answers: “Fear not! The Go runtime actually uses async under the hood!”, and just like that, the pupil’s worries about performance are gone! Poof! The async magic sauce makes everything go fast, so if it’s under the hood, we need not worry about performance at all.

                          In fact, I believe “being async” and exposing such primitives is an enormous disadvantage, in every single department except perhaps performance, and, even in that case, there exist multiple facets, as the benchmark linked in the parent post shows. Look at the absolutely remarkable amount of code (and the complexity thereof) the author of the blog post has to write towards the end of the article, in order to do something that is, conceptually, very simple[1]. And what have we gained from this, in the real world? A few less MBs of memory sitting around unused per connection in a server?

                          I’m not buying what the async people are selling. The async programming model is strictly worse. Async code does not compose well with “regular” code, and can be difficult to reason about. Writing and debugging the reactors is difficult, especially if you want a cross-platform one, since each platform (epoll, kqueue, event ports, IOCP) has its own quirks. Having visibility into these runtimes requires a tremendous amount of extra work, because none of the existing tools understand them. Meanwhile, a thread / process, and calling read(2) is just about the same everywhere. But that would be too easy, wouldn’t it?

                          When I look at much of what modern software development is like, I can’t help but feel like

                          Anyway, can we please start measuring more? So much effort is being spent for negative gains :/

                          alludes to a much deeper problem. You see it in many places, and the async cargo culting I tried to point out with this post is just one of them.

                          [1] Here it is, in 8 lines of Go, and another ~20 for a complete working demo https://play.golang.org/p/B-ZmhxNIYPb

                          1. 4

                            In general, as someone who has worked a lot with servers in a language that doesn’t have async, Python[1], I will say that async programming is worse in many cases. I would not use it for anything outside the web domain. Unfortunately, at least in my career, web servers have eaten the world, and it turns out that async helps a lot there. With Python servers, it turns out that frequently they are only able to effectively make use of around 40% of a typical cloud VM[2]. If you start to get into the 60% CPU range performance quickly degrades. This also comports with my experience of Ruby at a previous job. Note that I’m also ignoring the absolutely astounding growth rate of memory usage resulting from a typical Python or Ruby code base that effectively limits the number of worker threads that you can actually run.

                            Now, most of this stuff where someone writes an ETL or some other trivial thing? Yeah, just use threads and traditional concurrency primitives. I’ll note though that the article here is literally a toy example to demonstrate the trait implementations and compiler errors, not an example of best practices.

                            • [1]: I do know about Python async “stuff,” but for all intents and purposes Python does not have async.
                            • [2]: Say, a c5.large on AWS.
                            1. 2

                              You do not need to expose an async interface to avoid the high memory consumption that a Python/Ruby system has though. Go and Erlang/Elixir are two fine examples; there are plenty of others.

                              A multiprocess single-threaded synchronous dynamically-typed interpreted GC’d language is worst case for memory in a high-concurrency environment; it’s just one overhead after another. Python and Ruby are both technological dead ends in web dev for the reasons you pointed out.

                              I suspect I’m drifting off from the topic at hand, however…

                          2. 8

                            I want to upvote you even more than I can. I think it’s sad that async took so much mindshare. There are few if any good HTTP servers, database clients, and other libs that don’t depend on tokio or some other runtime.

                          1. 5

                            Let’s build a strawman just like the one the author built, while remembering to adhere to the theme of ignoring all nuance and real-world usage of said API.

                            <strawman>

                            We endeavor to find out what happens if we call next() again on a Rust Iterator that has just returned None. Let’s read the documentation:

                            Returns None when iteration is finished. Individual iterator implementations may choose to resume iteration, and so calling next() again may or may not eventually start returning Some(Item) again at some point.

                            This seems a little difficult to program against, no? ”…at some point”? And when might that be? Should I go read the particular documentation for the iterator I’m using? And if I’m writing generic code, then, certainly, the only thing I can do after I’ve seen None is stop immediately. But if that is the case, and I have the strong Rust type system on my side, why does reaching the end of iteration still leave me with an Iterator value in an unspecified state? Shouldn’t it somehow arrange to be destroyed / consumed if it cannot be used further? In the words of the author:

                            Oh no. That’s one situation where “there’s options” is definitely a bad thing.

                            What ever shall we do with this weakly specified Iterator interface? It seems so terrible to program against!

                            </strawman>

                            Thankfully, in reality, very little Rust code needs to worry about this, because most Rust code never calls next() on iterators directly, since the convenient for ... in construction exists. Much like most Go code never calls Read on an io.Reader directly, since the convenient io.Copy, ioutil.ReadAll, io.ReadFull, etc. constructions exist. Furthermore, I’d venture to say that for ... in on an iterator in Rust and consuming an io.Reader in Go are comparable operations in terms of their ubiquity.
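
                            To illustrate with the Go half of that comparison (a small sketch): the helpers own the Read loop, including the tricky n > 0 with non-nil err case that direct callers tend to get wrong.

                            // Sketch: idiomatic Go consumes an io.Reader through helpers such as
                            // io.Copy and ioutil.ReadAll rather than calling Read directly.
                            package main

                            import (
                            	"fmt"
                            	"io"
                            	"io/ioutil"
                            	"os"
                            	"strings"
                            )

                            func main() {
                            	data, err := ioutil.ReadAll(strings.NewReader("hello, reader\n"))
                            	if err != nil {
                            		fmt.Fprintln(os.Stderr, err)
                            		return
                            	}
                            	fmt.Print(string(data))

                            	// io.Copy likewise hides the raw Read/Write loop entirely.
                            	if _, err := io.Copy(os.Stdout, strings.NewReader("streamed\n")); err != nil {
                            		fmt.Fprintln(os.Stderr, err)
                            	}
                            }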

                            But that’s no excuse for having such a poorly specified interface!

                            I very much agree, but such is the abstraction level of the language that it cannot reasonably specify it any more precisely than that. The Go language has many flaws (have you seen context.Context???). But it seems to me that the design of io.Reader is, in general practice, such an irrelevant case of “this interface is not specified as strongly as it could be”, that I cannot understand the author’s desire to devote thousands of words to it.

                            Making APIs resistant to misuse is certainly a good objective, but if the author has a bone to pick with Go for promoting such APIs due to its weak type system, there are many other easier and more practical targets. As “poorly” specified as it is, I doubt many Go users ever misuse io.Reader by calling its Read method directly.

                            1. 7

                              I don’t think this is a strawman and I think it is a valid criticism of Rust’s Iterator interface. Rust in fact recognizes this and there is a FusedIterator trait which guarantees None to be final.

                              You are right that Rust’s for loop saves most users from this complexity, but whether to fuse or not IS a significant design concern whenever you are on the side of implementing an iterator. In my experience, Rust people who have implemented an iterator are aware of this.

                            1. 1

                              I’ve thought a lot about what the best way to start programming would be. It’s important to note that I am talking about practical programming more so than CS education, which is why this message is slightly off-topic, but hopefully interesting nevertheless.

                              Based on my personal experience, when it comes to learning how to program, I think of languages on a sort of horseshoe spectrum. The horseshoe describes how much the abstraction provided by the language leaks through to the user.

                              On one side of the horseshoe, there is assembly language, where very few abstraction leaks can ever happen, because assembly language is just about the lowest level of abstraction any programmer will ever work at. Assembly language is also, in practical terms, the closest you are ever going to get to the truth of “how the computer actually works”. Being exposed to this from the get-go, I posit, is a Good Thing.

                              On the other side of the horseshoe, there are symbolic languages, for example LISPs. When programming in a LISP, it feels like I really only care about the abstract symbols, and the machine underneath completely disappears. This is also a Good Thing.

                              From this perspective, every other language feels like it is stuck in an uncanny valley of abstraction level. Not low level enough to show you what is really going on, and not high level enough that you can just think about it without having to bother with irrelevant minutia. That is why I think learning how to program should happen by traversing this horseshoe from the extremes of “purity” down to the muddy, uncanny valley of “practical languages”. Preferably, students should advance on both fronts (from assembly language up, and from symbolic language down) simultaneously.

                              1. 4

                                This is a false dichotomy which ignores all the advances that have happened in type theory. I’m kinda bored being sick with a high fever at home, so I will provide you with a new (fever-delirium-induced) model:

                                The landscape has a great big plain with all of the dynamic languages (python, lisps, perl, php, etc.); let’s think of them as cities, and we can order them by how much the runtime has to work to keep the leaky abstraction from collapsing (e.g. lua > ruby).

                                Then there are some lakes and oceans where the merfolk live, sadly some of the merfolk have genetic instability causing some of their offspring to not be able to breathe underwater, these civilizations are the GC-free languages and the less safe / more UB the greater the genetic instability. Some of these mutated children manage to get to land before they drown and proceed to journey the land.

                                Next we have the mountain ranges where the hermits live, these are the statically typed languages, at the roots of the mountains you will find languages like Java, then OCaml but the higher you climb the more pure the type system becomes.

                                Finally we have the sky people / aliens, they have some contact with the mountain folks but there’s so few of them that they don’t really have much of a hierarchy, these are the logic programmers, here we have languages like mercury and prolog.

                                The plains dwellers want to go to space and meet the aliens, just because why not? The plains people aren’t too fond of the hermits so they don’t know anything about the aliens, but they built this airplane called SQL which they use to catch glimpses of the sky people and travel between the cities.

                                Now how does this landscape account for the sans-GC haskell-y languages that are on the horizon, like Neut (and others..)? Mountains in the sea? Portal on the top of the mountain? idk, these kinds of stories are often more harmful than not in actually providing any sort of explanation of things because all sorts of accidental structure shows up in the story for the sake of narrative.

                                To be fair symbolic vs assembly has been a good lens for a while, but it breaks down when we have abstractions that are not leaky which describe computation adequately. This is active mathematics research and probably will be rather fringe for few more decades.

                              1. 8

                                The Go programming language is a direct descendant of Plan 9, along with Linux and BSD’s /proc filesystems, user namespaces, and more.

                                What does it mean that Go descends from Plan 9? I can see how /proc and user namespaces would be inspired by features from Plan 9, but I don’t see the connection with Go (unless the author means they were both created by the same people).

                                1. 30

                                  What does it mean that Go descends from Plan 9

                                  The first go compilers were copy-pastes of the Plan 9 C compiler, with a new frontend. You can compare, for example, https://github.com/golang/go/tree/release-branch.go1.4/src/cmd/cc and https://code.9front.org/hg/plan9front/file/965e0f59464d/sys/src/cmd/cc

                                  The concurrency model is based off Rob Pike’s experiments with Squeak, Newsqueak, Alef, and Limbo (One of which, by the way, did have generics).

                                  1. 12

                                    There are several pieces of Go that are explained by Plan 9. Originally, the C compiler used to compile Go was the Plan 9 compiler (6c, iirc). Struct embedding is borrowed from Plan 9 C. http://doc.cat-v.org/plan_9/programming/c_programming_in_plan_9

                                    The library interfaces (like net.Dial) are borrowed from plan9 https://9fans.github.io/plan9port/man/man3/dial.html

                                    Channels and a M:N threading model: http://man.cat-v.org/plan_9/2/thread
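
                                    A tiny sketch of both inspirations at once (host name arbitrary): a dial-style connection handed between goroutines over a channel.

                                    // net.Dial's "network, address" style echoes Plan 9's dial("tcp!host!port"),
                                    // and the channel + goroutine pairing mirrors the thread(2) model.
                                    package main

                                    import (
                                    	"fmt"
                                    	"net"
                                    )

                                    func main() {
                                    	conns := make(chan net.Conn)
                                    	go func() {
                                    		c, err := net.Dial("tcp", "example.com:80")
                                    		if err != nil {
                                    			fmt.Println("dial:", err)
                                    			close(conns)
                                    			return
                                    		}
                                    		conns <- c
                                    	}()
                                    	if c, ok := <-conns; ok {
                                    		fmt.Println("connected to", c.RemoteAddr())
                                    		c.Close()
                                    	}
                                    }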

                                    1. 9

                                      Plan 9 -> Inferno -> Limbo -> Go

                                      The lineage is directly traceable, and a lot of Go features (channels and goroutines in particular) are directly taken from Plan 9.

                                      1. 7

                                        There is a philosophical connection too, if I may call it that.

                                        On Plan9, most things of importance are implemented as file servers. This makes Plan9 pretty much the ultimate “existing tools work with new objects” system. (aside: I highly recommend the “Unix, Plan 9 and the Lurking Smalltalk” paper on this topic).

                                        Programming in Go also has this feeling of “existing tools work with new objects”, due to the ubiquitous use of interfaces: error, io.{Reader,Writer}, net.Conn, and so forth. Almost all the Go code you will ever write uses these. Everything works well together. This in itself is nothing revolutionary: other languages have I/O streams as well (or iostreams even, ha!). But Go is designed for these pretty much from the bottom up, and due to the fact that the runtime abstracts away asynchronous I/O where possible, performance is commendable too.
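
                                        A small illustration of that “existing tools work with new objects” feeling (a sketch): one function written against io.Writer works with files, buffers, and network connections alike.

                                        // Sketch: greet knows nothing about its destination; any io.Writer fits,
                                        // including os.Stdout, an in-memory buffer, or a net.Conn.
                                        package main

                                        import (
                                        	"bytes"
                                        	"fmt"
                                        	"io"
                                        	"os"
                                        )

                                        func greet(w io.Writer) {
                                        	fmt.Fprintln(w, "hello from a humble io.Writer")
                                        }

                                        func main() {
                                        	greet(os.Stdout) // a file

                                        	var buf bytes.Buffer // an in-memory buffer
                                        	greet(&buf)
                                        	fmt.Print(buf.String())

                                        	// greet(conn) would work identically for any net.Conn
                                        }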

                                        1. 3

                                          The build system and compiler are built very similarly to Plan9 counterparts.

                                        1. 6

                                          You know someone is over-hyping Rust (or is just misinformed) when you see statements like

                                          Which means there’s no risk of concurrency errors no matter what data sharing mechanism you chose to use

                                          The borrow checker prevents data races which are involved in only a subset of concurrency errors. Race conditions are still very possible, and not going away any time soon. This blog post does a good job explaining the difference.

                                          Additionally, I have my worries about async/await in a language that is also intended to be used in places that need control over low-level details. One library that decides to use raw I/O syscalls on some unlikely task (like error logging) and, whoops, there goes your event loop. Bounded thread pools don’t solve this (what happens if you hit the max? It’s equivalent to a limited semaphore), virtual dispatch becomes more of a hazard (are you sure every implementation knows about the event loop? How can you be sure as a library author?), what if you have competing runtime environments (see twisted/gevent/asyncio/etc. in the Python community; this may arguably be more of a problem in Rust given its focus on programmer control), and the list goes on. In Go, you literally never have to worry about this, and it’s the greatest feature of the language.

                                          1. 1

                                            You know someone is over-hyping Rust (or is just misinformed) when you see statements like

                                            It doesn’t help that they state (or did state until recently) on their website that Rust was basically immune to any kind of concurrency error.

                                            1. -1

                                              That definition of “race condition - data race” essentially refers to an operational logic error on the programmer’s side. As in, there’s no way to catch race conditions that aren’t data races via a compiler, unless you have a magical business-logic-aware compiler, at which point you wouldn’t need a programmer.

                                              As far as the issues with async I/O go… well, yes. Asyncio wouldn’t solve everything. But asyncio also wouldn’t necessarily have to be single-threaded. It could just mean that a multi-threaded networking application will now spend fewer resources on context-switching between threads. But the parallelism of threads > cpu_count still comes in useful for various blocking operations which may appear here and there.

                                              As far as Go’s solution goes, their solution to the performance issue isn’t that good, since goroutines have significant overhead. Much less than a native thread, but still, considerably more overhead than something like MIO.

                                              The issue you mentioned as an example, a hidden sync I/O syscall in some library, can happen in a goroutine just as well; the end result of that will essentially be an OS native thread being blocked, much like in Rust. At least, as far as my understanding of goroutines goes, that seems to be the case.

                                              Granted, working with a “pool” of event loops representing multiple threads might be harder than just using goroutines, but I don’t see it as being that difficult.

                                              1. 5

                                                That definition is the accurate, correct definition. It’s important to state that Rust helps with data races, and not race conditions in general. Even the rustonomicon makes this distinction clear.

                                                The discussion around multiple threads seems like a non-sequitur to me. I’m fully aware that async/await works fine with multiple threads. I also don’t understand why the performance considerations of goroutines were brought into the picture. I’m not making any claims about performance, just ease of use and programmer model. (Though, I do think it’s important to respond that goroutines are very much low enough overhead for many common tasks. It also makes no sense to talk about performance and overhead outside of the context of a problem. Maybe a few nanoseconds per operation is important, and maybe it isn’t.)

                                                The issue I mentioned does not happen in Go: all of the syscalls/locks/potentially blocking operations go through the runtime, and so it’s able to deschedule the goroutine and let others run. This article is another great article about this topic.

                                                It’s great that you’re optimistic about the future direction Rust is taking with its async story. I’m optimistic too, but that’s because I have great faith in the leadership and technical design skills of the Rust community to solve these problems. I’m just pointing out that they ARE problems that need to be solved, and the solution is not going to be better than Go’s solution in every dimension.

                                                1. 0

                                                  The issue I mentioned does not happen in Go: all of the syscalls/locks/potentially blocking operations go through the runtime, and so it’s able to deschedule the goroutine and let others run.

                                                  Ok, maybe I’m mistaken here but:

                                                  “Descheduling a goroutine”: when a function call is blocking, descheduling a goroutine has the exact same cost as descheduling a thread, which is huge.

                                                  Secondly, Go is only using non-blocking syscalls under the hood for networking I/O calls at the moment. So if I want to wait for an operation on any random file, or wait for an asynchronous prefetch call, I will be unable to do so; I have to actually block the underlying thread that the goroutine is using.

                                                  I haven’t seen any mention of “all blocking syscall operations” being treated in an async manner. They go through the runtime, yes, but the runtime may just decide that it can do nothing about it other than let the thread be descheduled as usual. And, as far as I know, the runtime is only “smart” about networking I/O syscalls atm; the rest are treated like blocking operations.

                                                  Please correct me if this is wrong.

                                                  1. 2

                                                    descheduling a goroutine has the exact same cost as descheduling a thread, which is huge.

                                                    A goroutine being descheduled means it yields the processor and calls into the runtime scheduler, nothing more. What happens to the underlying OS threads is another matter entirely. This can happen at various points where things could block (e.g. chan send / recv, entering mutexes, network I/O, even regular function calls), but not at every such site.

                                                    the runtime is only “smart” about networking I/O syscalls atm

                                                    Yes, sockets and pipes are handled by the poller, but what else could it be smarter about? The situation may well be different on other operating systems, but at least on Linux, files on disk are always ready as far as epoll is concerned, so there is no need to go through the scheduler and poller for those. In that case, I/O blocks both the goroutine and the thread, which is fine for Go. For reference, in this situation, node.js uses a thread pool that it runs file I/O operations on, to avoid blocking the event loop. Go doesn’t really need to do this under the covers, though, because it doesn’t have the concept of a central event loop that must never be blocked waiting for I/O.

                                                    1. 2

                                                      Descheduling a goroutine is much cheaper than descheduling a thread. Goroutines are cooperative with the runtime, so they ensure that there is minimal state to save when descheduling (no registers, for example). It’s on the order of nanoseconds vs microseconds. Preemptive scheduling helps in a number of ways, but typically causes context switching to be more expensive: you have to be able to stop/start at any moment.

                                                      Go has an async I/O loop, yes, but it runs in a separate managed thread by the runtime. When a goroutine would wait for async I/O, it parks itself with the runtime, and the thread the goroutine was running on can be used for other goroutines.

                                                      While the other syscalls do in fact take up a thread, critically, the runtime is aware when a goroutine is going to enter a syscall, and so it can know that the thread will be blocked, and allow other goroutines to run. Without that information, you would block up a thread and waste that extra capacity.

                                                      The runtime manages a threadpool and ensures that GOMAXPROCS threads are always running your code, no matter what syscalls or I/O operations you’re doing. This is only possible if the runtime is aware of every syscall or I/O operation, which is not possible if your language and standard library are not designed to provide it. Rust’s aren’t, for good reasons. It has tradeoffs with respect to FFI speed, control, zero overhead, etc. They are different languages with different goals, and one isn’t objectively better than the other.

                                                      1. 2

                                                        And, as far as I know, the runtime is only “smart” about networking I/O syscalls atm, the rest are treated like a blocking operation/

                                                        Pretty much everything that could block goes through sockets and pipes though. The only real exception is file I/O, and file I/O being unable to be epolled in a reasonable way is a kernel problem not a Go problem.