Threads for predrag

    1. 2

      Furiously trying to put together a great conference talk before going on a two-week trip. I’m moderately perfectionistic, so things tend to turn out well in the end but it’s often stressful along the way. Work expands to fill available time, etc.

    2. 1

      Bravo, thanks for the study and the data!

      I do want to poke at a false dichotomy in the post: That semver violations are either human error or a tooling problem.

      It’s great that the Rust community and ecosystem have aspirations here, and even greater that tooling can make assumptions about what most software on crates.io will adhere to. That said… Some projects may tactically violate semver if they know a change is valuable and also has no/low probability of breaking consumers. Some projects may choose to follow different conventions that look like semver but are not actually semver. (See: https://calver.org) Some projects may choose to just not do semver at all (See: http://sentimentalversioning.org and http://unconventions.org)

      The Rust community has had more than one “burn the heretic” moment… Please consider Semver as a worthy goal to aspire to, but not as a religious or moral duty. As tooling improves, and I believe it will, I just hope people keep in mind that a project that violates semver anyway may have good reasons for doing it, just like people who use unsafe might have a reason for it.

      1. 3

        Bravo, thanks for the study and the data!

        Thank you 😁

        Some projects may tactically violate semver if they know a change is valuable and also has no/low probability of breaking consumers.

        Agreed! This is why cargo-semver-checks aims to inform, not enforce. We don’t want maintainers to violate semver by accident and without knowing it’s happening; that’s all. There are definitely “tree falls in the forest” situations where tactically breaking semver is the right thing to do, and we leave it to maintainers to decide when that is the case. (As I’m sure you already saw in the post.)

        Please consider Semver as a worthy goal to aspire to, but not as a religious or moral duty.

        Unfortunately, between the compiler and the cargo build tool, Rust already assumes that all crates follow semver. cargo update by default upgrades all dependencies to their largest non-major-bump versions, and the compiler only allows multiple major versions of the same crate to live side-by-side, not minor ones. While binaries may have more freedom, libraries that don’t follow semver can be quite difficult to use in Rust given that core assumption.
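
        To make that concrete, here’s a minimal Cargo.toml sketch (the crate name is a stand-in) of the assumption baked into cargo’s default version requirements:

        [dependencies]
        # A bare requirement is shorthand for a caret requirement: "1.2.3"
        # means ">=1.2.3, <2.0.0", so `cargo update` may freely move this
        # dependency to any newer 1.x release on the assumption that it is
        # semver-compatible.
        some-library = "1.2.3"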

        I don’t think it’s a religious or moral duty. But I also wouldn’t use a Rust library that doesn’t at least attempt to adhere to semver, simply because it would be quite difficult to use it given the predispositions of the language tooling.

        I just hope people keep in mind that a project that violates semver anyway may have good reasons for doing it.

        100% agreed! This is precisely why we didn’t publish a list of the specific semver violations we found, nor name which crates or versions they are in. We don’t want any abuse aimed at maintainers on the basis of our data, because that would be misguided in addition to being wrong. If crate maintainers reach out directly to us, we’re of course happy to share the results with them.

        1. 3

          Unfortunately, between the compiler and the cargo build tool, Rust already assumes that all crates follow semver.

          Given the fact that A) reasonable people can and often do disagree about what it means to “follow semver”, and B) even with agreement on what it means, reasonable and well-meaning people nonetheless still often fail to “follow semver”, this feels like a choice that merits the “unfortunately” tag.

          1. 2

            A) reasonable people can and often do disagree about what it means to “follow semver”

            I’m not saying it was reasonable, but much disagreement also has been had over issues such as how to indent code and where to put braces. These formatting squabbles are largely absent from modern Rust: one can run rustfmt, and whatever it produces is the “right” formatting, even if some resent it. Maybe we (in Rust) can similarly largely end disagreement over what it means to follow SemVer by encoding an acceptable-to-most-people definition into a tool (which should also help with part B, like the title says).

            1. 4

              +1 to the parent comment. Rust’s rules about what is and is not acceptable in which kind of release are much better defined than in the vast majority of other programming languages. The definitions aren’t necessarily obvious or identical to SemVer-as-defined-on-semver-dot-org, but they are well-thought-out and handle the vast majority of commonly-encountered cases — I’ve been using them to develop cargo-semver-checks and even contributed to them in small parts, so this is first-hand experience.

              So while not everyone in every programming language might agree on what goes in a minor version vs in a patch, in Rust this discussion is largely settled and the rules are written down. That makes it relatively easy for a tool like cargo-semver-checks to scan for violations of those rules, and then cite the rule while displaying the evidence of it being violated.

              1. 2

                See my other comment, but a good summary is that Go has already tried this, and while they’ll proclaim it a success in that technically they haven’t violated their policy, that’s no consolation to people whose code has nonetheless been broken by “well, technically that was defined as non-breaking” changes. And that’s kind of the whole problem here: no amount of explanation of the precise definition of “breaking” will un-break someone’s code or make them happy about having their code broken.

            2. 1

              I have much less faith in “just write it down and make a tool” than you do. In part because I expect people will just work around it (say, by bumping major on every release). In part because I think defining what is and isn’t “breaking” is at least as contentious as code formatting, and arguably more so.

              But mostly because people largely aren’t interested in the actual hair-splitting discussion about what does and doesn’t count as “breaking”. They’re interested in not having their code break, and their code is still going to break. I got yelled at a lot for pointing out something similar in a recent thread about Go’s backwards-compatibility approach, but: when someone’s code gets broken by a new release, and they’re told that it’s not actually a breaking change because the definition of “breaking” technically says so, it’s no consolation to the person whose code is now broken, and they’re likely to feel betrayed by the language/ecosystem.

              1. 2

                Rust has similar issues. In particular, there are breaking changes that are not considered semver-major. Two things give me hope for a better outcome:

                • Those breaking changes are possible to resolve automatically with a tool using only the information currently available. It’s just that we as a community haven’t yet invested in making that tool.
                • Rust allows using more than one major version of the same library at once in the same project (see the Cargo.toml sketch below). So bumping major every time isn’t as painful as it might seem at first, and unlike in, say, Python, major version upgrades don’t have to be ecosystem-wide all-or-nothing.
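
                As a minimal sketch of that last point (with a made-up crate name), cargo’s dependency renaming lets both major versions coexist in one project:

                [dependencies]
                # Both major versions are compiled in side-by-side; the compiler
                # treats their items as entirely distinct.
                mylib = "2"
                mylib_v1 = { package = "mylib", version = "1" }
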
                1. 2

                  Well, Hyrum’s Law demonstrates that literally every change is potentially a breaking change, for someone. Semantic versioning is well-defined as a specification (at semver.org), and that spec doesn’t precisely define a breaking change (because it can’t). That means semver, generally speaking, isn’t an expression of an objective and verifiable property of software — it’s an expression of a subjective and best-effort intent from authors.

                  It’s cool that Rust has defined a set of things that constitute verifiable breaking changes, and built tools to detect and flag those things for authors. But I guess that makes Rust-semver a specific subclass of semver in general, and I suspect that unless that distinction is made painfully clear, many (most) people will understand “semver” as the general kind, not the Rust-specific kind.

                  1. 2

                    I think most languages have a set of things that are verifiable breaking changes. If I remove a function from the public API, that’s a breaking change in Python or Java or JS just as much as it is in Rust. These are the kinds of issues cargo-semver-checks detects and reports.
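
                    As a minimal hypothetical example of the kind of change meant here:

                    // mylib v1.0.0
                    pub fn parse(input: &str) -> i64 {
                        input.trim().parse().unwrap_or(0)
                    }

                    // If mylib v1.1.0 deletes `parse` without a major version bump,
                    // every downstream call to `mylib::parse(...)` stops compiling,
                    // with an error like: cannot find function `parse` in crate `mylib`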

                    We still can’t catch and report general-case breaking changes, a la Hyrum’s Law. But as the blog post shows, even the small subset we can report today is useful, in that over a large real-world sample of code, it catches a lot of things that previously went unnoticed and were problems waiting to happen.

                    Neither cargo-semver-checks nor Rust claims to have “solved” semver. I’ve even refused to build a “suggest what version bump I need” feature for cargo-semver-checks because of that, even though it’s probably a top 5 most-requested feature for the tool. All we claim is that tools like this are useful in real life, even though they won’t (and can’t possibly) catch everything.

                    1. 3

                      Absolutely! No doubt this tool provides real value.

              2. 1

                I have much less faith in “just write it down and make a tool” than you do.

                I no longer have such optimism, and I don’t know why I did. Though I still think the tool itself sounds worthwhile for what it does, I’ve both thought about it more and realized that (being based on a JSON description of an API) this tool fundamentally can’t see changes in the implementation of functions; either of those would have tamped down my optimism. I agree with your second paragraph.

                1. 1

                  My 2 cents:

                  As disruptive as breaking implementation changes are, they simply aren’t that common compared to accidental breaking changes in APIs. Forget our semver survey. For every GitHub issue in a Rust project you can find where the reporter complains about a breaking implementation change, I bet I can find 5 (maybe even 10) GitHub issues opened by someone other than me where the complaint is a semver-violating API change.

                  Breaking implementation changes of any significance aren’t anywhere close to happening in 3%+ of all releases, the rate our semver study finds for currently-detectable accidental breaking changes in APIs. If they were, I bet we’d always treat any version bump as a major version, and we’d also have much smaller dependency graphs than we do now.

                  In other words, I think revealed preference shows that breaking implementation changes sound more scary than they are, while breaking API changes sound less problematic than they really are.

                  1. 1

                    What distinguishes API breaking changes from implementation breaking changes? Is it whether that breaking change is detected at compile time vs. at run time? In either case, does semver’s definition of breaking change make any such distinction? (AFAIK it doesn’t?)

                    1. 1

                      Semver itself does not make any such distinction.

                      Everyone, including library authors, is much more worried about implementation breaking changes. So your question comes up a lot. But because everyone is worried about them, folks also spend a lot more time and effort on protecting their code from such issues, for example using test suites.

                      Much less thought and effort is given to API breaking changes. Compared to how often this actually seems to happen in practice, very few people think “someone might mistakenly delete a public method and we might not catch it in code review.”

                      Given that data shows this happens a lot, and considering that API breaking changes are much easier to catch with static analysis, it’s one of those “we really should be doing this also, why weren’t we already?” kinds of situations.

                      1. 1

                        OK, but then even in your framing, what differentiates API breaking changes from implementation breaking changes? What’s the definition?

                        1. 1

                          If rustc can catch it, cargo-semver-checks should be able to as well — we’re a long way from that sort of completeness, but that’s the end goal. If a breaking change cannot cause a compile error, then our current approach has no hope of catching it either and maintainers should use other tools at their disposal to catch and prevent those changes.

                          There might be small edge cases where this definition is imperfect in one way or another, but it’s directionally correct.
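
                          A hypothetical before/after pair that illustrates the rule of thumb:

                          use std::time::Duration;

                          pub struct Connection;

                          // API breaking change: rustc can catch it, because downstream code
                          // written against the old signature no longer compiles.
                          // v1 was: pub fn connect(addr: &str) -> Connection
                          pub fn connect(_addr: &str, _timeout: Duration) -> Connection {
                              Connection
                          }

                          // Implementation breaking change: every caller still compiles, but
                          // the behavior changed; only tests or runtime checks can flag it.
                          pub fn default_retries() -> u32 {
                              0 // v1 returned 3
                          }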

                          1. 2

                            Gotcha, so API breaking changes are compile errors, and implementation breaking changes are runtime (or maybe testing) errors, I guess? 👍

                            1. 2

                              As a rule of thumb, yes 👍

      2. 2

        I do want to poke at a false dichotomy in the post: That semver violations are either human error or a tooling problem.

        Another example: sometimes the semver violation is a deliberate response to a human error: when someone accidentally makes a release that exposes some functions that were meant to be internal-only, and then immediately makes another release that removes them.
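
        A sketch of what that looks like, with made-up names:

        // v1.4.0 accidentally ships this, exposing internal-only items:
        pub mod internal_helpers {
            pub fn fixup() {}
        }

        // v1.4.1 hides them again (e.g. `pub(crate) mod internal_helpers`),
        // which is technically a breaking change inside a patch release.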

        1. 1

          The first release is both human error and a SemVer violation, but I don’t think the second release is either. SemVer considers the possibility of accidental violations, says “Use your best judgement”, and says that the correcting release may be either a major release or a patch release.

          1. 2

            Another example: sometimes the semver violation is a deliberate response to a human error: when someone accidentally makes a release that exposes some functions that were meant to be internal-only, and then immediately makes another release that removes them.

            In fact, we found at least one case of exactly that — it’s mentioned in the maintainer interviews portion of the post.

            Use your best judgement

            This is why cargo-semver-checks seeks to inform, not enforce. We don’t want to prevent maintainers from publishing something. We merely want them to be fully informed about the contents they are publishing, since often (but not always!) publishing a breaking change outside of a major version can be unintentional.

          2. 2

            I think the first release isn’t a semver violation because it doesn’t break anything?

      3. 1

        If I were to take a moralizing position on this, it would be “If one wants not to use SemVer, fine, but then one mustn’t use Cargo, because, if one uses Cargo, one’s users are justified in expecting one to adhere to SemVer, as Cargo stipulates.”

        1. 1

          Perhaps, more precisely, one mustn’t use crates.io. Even if one doesn’t use cargo for one’s own project, if the code is uploaded to crates.io, other people can use cargo to depend on it. But the point stands — semver is about communication and managing expectations. The section on #[non_exhaustive] in the blog post covers another example of this, where something that was not a semver-major change until 2019 has since become one, due to changing norms and expectations.

        2. 1

          I don’t know if you’re intentionally creating a very stupid strawman position to bolster my argument, but as a trivial counterpoint, both Rust and Cargo do not (cannot) strictly follow Semver and have a different (valuable) idea of what a major version should communicate.

    3. 3

      I don’t want to sound dismissive, but this sounds like a luxury problem to have. I’m thinking of the time crate, which seems to have been basically rewritten from 0.1 to 0.2 (or 0.3). Then I think regex broke all my use cases at one point; I want to say it was the 1.0 release, but I’m not so sure.

      What I mean is that my default stance is more like “According to semver it shouldn’t break so I only think it will break half of the time”.

      That said, I’m mostly happy with Rust here, but I don’t trust semver at all.

      Also, great article and interesting initiative. If only projects had enough tests to simply notice breakage before a new release! I always liked those “we tried this on every crate” posts by the Rust team, though that doesn’t seem feasible on a recurring basis for simple crates.

      1. 4

        The escape hatch in Semver is that if the major version is 0, then no Semver rules apply to the minor or patch version.

        1. 6

          This is true for standard semver, but Rust / cargo’s implementation is slightly different: it ignores leading zeroes, so 0.1 -> 0.2 is considered major, and 0.1.0 -> 0.1.1 is minor, etc. This is technically not compliant with semver, but … that’s above my pay grade :)
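
          Concretely (crate names made up), cargo treats the leftmost non-zero version component as the effective major version:

          [dependencies]
          # allows >=0.1.3, <0.2.0: cargo treats 0.1 -> 0.2 as a major bump
          foo = "0.1.3"
          # allows >=1.2.3, <2.0.0: the standard semver interpretation
          bar = "1.2.3"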

          cargo-semver-checks implements the same thing that Rust / cargo do, because its role is to prevent you from getting in trouble in Rust, under Rust’s rules.

        2. 3

          That is what Google is doing with the popular Guava Java library, and it has been extremely painful. In my last Java job I actively replaced Guava with apache-commons-* or wrote things myself to get out of this madness.

          These days I am doing mostly Go, and things are a lot more predictable over there.

      2. 4

        I understand the sentiment. An equally valid standpoint (common in large projects with many dependencies) is that it’s a luxury to not have this problem. If one’s work depends on 50 crates, it’s fine if running cargo update breaks half the time. If one’s work depends on 500 crates and crates break half the time, one ends up doing no work other than fixing breakage from dependencies. At some point, projects just hit a “level cap” of sorts and cannot take on any more dependencies due to this effect.

        In a real sense cargo-semver-checks is a replacement for those “enough tests” that would allow maintainers to notice accidental breakage. Even if we can’t eliminate 100% of breakage, the data in this post shows we can eliminate a large portion of it. That can make all the difference in empowering maintainers and raising that level cap.

    4. 30

      Post co-author here, AMA.

      What we did:

      1. Scan Rust’s most popular 1000 crates with cargo-semver-checks
      2. Triage & verify 3000+ semver violations
      3. Build better tooling instead of blaming human error

      Around 1 in 31 releases had at least one semver violation.

      More than 1 in 6 crates violated semver in at least one release.

      These numbers aren’t just “sum up everything cargo-semver-checks reported.” We did a ton of validation through a combination of automated and manual means, and a big chunk of the blog post is dedicated to talking about that.

      Here’s just one of those validation steps. For each breaking change, we constructed a “witness,” a program that gets broken by it. We then verified that it:

      • fails to compile on the release with the semver-violating change
      • compiles fine on the previous version
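
      As a simplified sketch (the crate and function are made up), a witness for a removed public function can be tiny:

      // Witness for a hypothetical removal of `somelib::parse`: this
      // compiles against the baseline release, and fails to compile
      // against the release containing the semver-violating change.
      fn main() {
          let _ = somelib::parse("42");
      }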

      Along the way, we discovered multiple rustc and cargo-semver-checks bugs, and found out a lot of interesting edge cases about semver. Also, now you know another reason why it was so important to us to add those huge performance optimizations from a few months ago: https://predr.ag/blog/speeding-up-rust-semver-checking-by-over-2000x/

      1. 5

        Very cool! Happy to hear this might be integrated into cargo. I really like the fact that the Elm package manager enforces semver at publish time.

        1. 8

          Thank you! I was inspired by the Elm package manager when building cargo-semver-checks. It was one of those “once you know it’s possible, it turns out the problem isn’t super hard” kinds of fortuitous situations. I happened to be in the right place at the right time with the right tools at my disposal.

      2. 4

        Right now I’m working on similar tooling for a different API ecosystem (fuchsia.dev).

        Three questions:

        1. Do you think crates.io should reject detectable semver violations?
        2. Did you try building packages and then varying the micro version numbers of their dependencies to find practical, in-the-field breakages that your static analysis might not catch yet?
        3. How do you handle macros? Proc macros probably get you into the halting problem, but macro_rules might be tractable.
        1. 2

          Great questions!

          1. No, I explicitly think it shouldn’t. Semver violations are allowed when they serve the broader interest of the community, such as rolling back unintentionally-public functionality (has happened in Rust!), or fixing unsoundness errors or critical security issues (the Rust language itself does this!). I think maintainer judgment is final, and these tools should inform rather than enforce.
          2. I wasn’t sure which of two possible interpretations this was: is it “find crates that re-export 3rd party dependencies, where wiggling the 3rd party dependency can cause the re-exporting crate to violate semver” or is it “attempt to find semver violations in a different way, by looking for crates that might break when their dependency versions are wiggled in allowable ways”? The former would definitely produce more semver violations, but rustdoc JSON is currently not fully reliable for linking cross-crate data, so our analysis is crate-local only for now. We’re working on it, the rustdoc and Rust folks are working on it, it’ll happen sooner or later. The latter isn’t that interesting to me personally, because I think it will be vastly more computationally expensive, and will also find lots of cases where crates declare dependencies like ^1 but actually use a feature added in 1.2. This “minimal versions” problem is well-known, and so not that interesting. And if we never try downgrading versions, only upgrading, we’re likely to find only the same issues everyone else has already found by doing the same upgrade. In other words, I found our work interesting precisely because it had very high potential of discovering previously-unknown semver issues, and wouldn’t have been excited about it if it only told us what we mostly already knew.
          3. We currently don’t handle macros at all, beyond macro use in one’s own crate (at which point the macro is just a semver-irrelevant internal implementation detail). Rustdoc doesn’t give us a lot of information on macros (proc or otherwise), so we’ve been going after lower-hanging fruit elsewhere thus far. I also agree that macro_rules may be tractable, and I’ve been trading ideas with a few maintainers of macro-heavy crates on what would be most useful to them and most likely to find and prevent issues that might otherwise slip by. Their #1 request, btw, is proper handling of #[doc(hidden)], which macro-heavy crates use very often, and which is now our biggest source of false positives since we currently (incorrectly) treat those items as public API.
          1. 2

            For 2 I meant the latter but now I understand why it’s unappealing. For my own similar project I too am focused on static analysis of metadata rather than trying to compile things.

            For 3, does #[doc(hidden)] make macros inaccessible or just indicate to other crates that there’s no stability guarantee?

            1. 1

              For 2, makes total sense.

              For 3, the latter.

              Usually, the macros themselves are public API and not #[doc(hidden)]. The challenge is that macro expansion produces code that is in the crate where the macro is used, not the crate where it is defined. So anything the macro-expanded code uses from the macro-definition crate must be public — it’s being used cross-crate. But you often don’t want it to be public API because then you have to uphold a stability guarantee on “internal” implementation details.

              The usual answer then is that the macro is not #[doc(hidden)], but uses code that is public and #[doc(hidden)]. That way, the macro internals are public (therefore callable by the macro) but not subject to a stability guarantee because they are explicitly annotated as “not public API.”
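
              A sketch of that pattern, with made-up names:

              // In the macro-defining crate: `pub` so the expanded code can call
              // it cross-crate, but #[doc(hidden)] to signal "not public API."
              #[doc(hidden)]
              pub fn __double_impl(x: u64) -> u64 {
                  x * 2
              }

              // The macro itself is the real public API.
              #[macro_export]
              macro_rules! double {
                  ($x:expr) => {
                      $crate::__double_impl($x)
                  };
              }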

      3. 3

        cargo-semver-checks is really nice but for more complex crates it’s difficult (at least for me) to figure out what exactly is broken.

        Is there a forum for people to discuss the output and the concrete lint warnings?

        An example:

        $ git clone https://gitlab.com/sequoia-pgp/sequoia
        $ cd sequoia/openpgp/
        $ cargo semver-checks --default-features
             Parsing sequoia-openpgp v1.16.0 (current)
             Parsing sequoia-openpgp v1.16.0 (baseline, cached)
            Checking sequoia-openpgp v1.16.0 -> v1.16.0 (no change)
           Completed [   0.378s] 48 checks; 46 passed, 2 failed, 0 unnecessary
        
        --- failure trait_method_missing: pub trait method removed or renamed ---
        
        Description:
        A trait method is no longer callable, and may have been renamed or removed entirely.
                ref: https://doc.rust-lang.org/cargo/reference/semver.html#major-any-change-to-trait-item-signatures
               impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.23.0/src/lints/trait_method_missing.ron
        
        Failed in:
          method map of trait ValidAmalgamation, previously in file /home/wiktor/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sequoia-openpgp-1.16.0/src/cert/amalgamation.rs:401
        

        AFAIK this method was not touched for literally years (git blame shows 2020-04-11 22:52:18 +0200). Full output at https://paste.debian.net/1291329/

        Thanks for this incredibly useful tool!

        1. 2

          Thank you for checking it out, and for your candid feedback! It’s very much appreciated, and helps us make the tool better.

          If an output message is confusing or a suspected false-positive like this, feel free to open an issue and we can triage. Just make sure to mention what commit you’re using when running cargo-semver-checks, to make reproduction easier — otherwise branches drift and that can sometimes make it hard.

          That error is complaining that it cannot find the trait method ValidAmalgamation::map anymore, but it existed in 1.16.0 published on crates.io.

          Given you’re saying that method still exists and hasn’t been modified recently, this is a possible false-positive so definitely good to have in the issue tracker.

          1. 2

            Just for continuity for people that want to follow the discussion I’ve filed an issue there: https://github.com/obi1kenobi/cargo-semver-checks/issues/536

            1. 1

              Thanks for following up and opening the issue, I appreciate it. I’m about to head to RustConf and will be travelling for the next couple of weeks, but I’ll take a look as soon as I can!

      4. 2

        This sounds like really good work, congratulations!

        A suggestion: briefly describe cargo-semver-checks in the introduction, e.g. “cargo-semver-checks examines the JSON output of rustdoc for different versions of a crate and reports possible semver violations”.

      5. 2

        I like the approach! Especially in identifying the gulf that can be closed by tools, instead of blaming humans (regardless of fault or lack thereof).

        Thanks for taking the time to write this up and share it.

        1. 1

          Thank you! In general I feel the size of that gulf is commonly underestimated, so I’m very interested in building better tools. As one example: databases, compilers, and distributed systems have advanced massively over the last 10 years, but the tooling story is largely the same today as it was 10 years ago.

          If that sounds interesting and you’d like to keep an eye on my work, you might find subscribing to my blog worthwhile. It supports RSS and also has an email notification option: https://predr.ag/subscribe/

    5. 3

      I think hands-on learning is great for building understanding and intuition, and agree that there’s much more to compilers than just parsing.

      • I second the recommendation for Crafting Interpreters.
      • Advent of Code 2021 Day 24 asks you to implement a (simple) optimizing compiler. You’ll learn a lot by solving it!
      • I started a blog series implementing optimization passes one at a time for that Advent of Code problem. I have 3 episodes out already and a few more as drafts I really should finish 😅

      I believe it’s good to approach learning about this space from multiple directions, and see what clicks for you.

      There are also lots of compiler-like projects that aren’t compilers in the traditional sense — and there’s lots to learn from them as well: rust-analyzer, code linters and autoformatters, database query engines (“read a program and optimize it so it runs quickly” = compiler, no?), etc.

      I work on two compiler-like projects:

      • Trustfall, a compiler-based query engine that can query any data source: databases, APIs, complex file formats, or any combination. Here’s a talk I gave on it (10min).
      • A linter that catches semantic versioning violations in Rust crates called cargo-semver-checks. Here’s a recent post that describes its architecture and covers a recent optimization that made it over 2000x faster in one of our users’ workloads.

      I’m very happy to chat about either. You can find my email in commits under https://github.com/obi1kenobi or feel free to DM me on Twitter or another platform.

    6. 3

      That’s not as terrible as it sounds — it is shadowing a public type with a private one, but that’s a rather straightforward combination of glob imports. I was worried it could be something more unexpected.

      1. 3

        It’s certainly not language-breaking; I did say so in the intro:

        While I’m excited to have discovered this, I don’t think the problem is actually that severe.

        The bigger problem is that the current state of rustc lints and rustdoc make it impossible for tools to catch the problem. And humans are unlikely to catch it either, since we use glob imports precisely to avoid having to think about all the names they import.

        I think it’s a fun edge case (it’s surprising that you can break public API by only making private changes!), and it’s worth fixing especially since it’s broken real-world projects already.

    7. 1

      Hmm… The method signature sealing approach is a neat trick, but I’m rubbed the wrong way by this claim:

      Meanwhile, downstream code can both see and name the trait and its method, but cannot […] call the method

      It is possible to call the method: you just need to trick the type system into conjuring instances of the private token out of thin air, which can be done with divergent functions like

      fn get<T>() -> T { get() }
      

      or even just

      panic!("This returns `!`, which coerces to any type")
      

      Although downstream users cannot call the method (in the sense that they cannot cause it to actually execute), they can still very much refer to it in a manner that is not semver-safe and so effectively nullifies one of the key reasons for sealing the trait in the first place!

      1. 1

        I’m sorry, I think I might be missing something. It’s possible to produce code where a divergent function is called and its result is used in the place of the token to “appear to call” the trait’s method — but execution will never actually reach the trait method because it will diverge first! That’s the reason ! coerces to any type: the compiler can prove the remaining code unreachable and eliminate it.

        For example

        TypeThatImplsSealed::method(self, panic!("coerce `!` into the token"))
        

        never reaches the method() call because it diverges at panic!() first. So the method remains uncalled, even though the code type-checks.

        So perhaps I’ve misunderstood what you meant. If so, apologies!

        EDIT: I just noticed your edit, so let me reply to it here. I’m not a member of any official Rust team so I am not an authority on semver in Rust. But you might know that I created cargo-semver-checks, and I’ve dug quite deep into semver on my blog. Most people don’t know that some Rust breaking changes don’t require a major version. Breaking changes in code that serves no other purpose than to be broken are almost always non-major in Rust’s interpretation of semver, and you can find more details and citations for this statement in the link above.

        1. 1

          I guess it depends on your definition of the word ‘call’, but certainly from a semver perspective this is a call and changing the method later would still be a breaking change for downstream consumers. Divergent functions may not terminate, but their return values still have semantics.

          1. 1

            Unfortunately the “general” semver perspective and the Rust semver perspective don’t quite align. In “general semver” all breaking changes are major, in Rust that isn’t the case as my link above shows.

            In the Rust ecosystem, the only semver definition that makes sense to use is Rust’s, because that’s what cargo uses and ~everyone uses cargo. For Rust, I don’t think this counts as a call — especially because it doesn’t even pass lint, triggering a dead_code rustc warning.

            1. 1

              Whether it has a dead_code warning is irrelevant: it is a call, and this isn’t semver-compliant.

        2. 1

          I don’t think edits trigger notifications, so just using this comment to signal I’ve edited my comment in response to yours :)

    8. 11

      What I’d be more interested in seeing would be “things that people think are 10x but are actually -10x”.

      1. 25

        Tactical tornado engineering is a great example of this:

        Almost every software development organization has at least one developer who takes tactical programming to the extreme: a tactical tornado. The tactical tornado is a prolific programmer who pumps out code far faster than others but works in a totally tactical fashion. When it comes to implementing a quick feature, nobody gets it done faster than the tactical tornado. In some organizations, management treats tactical tornadoes as heroes. However, tactical tornadoes leave behind a wake of destruction. They are rarely considered heroes by the engineers who must work with their code in the future. Typically, other engineers must clean up the messes left behind by the tactical tornado, which makes it appear that those engineers (who are the real heroes) are making slower progress than the tactical tornado.

        From “A Philosophy of Software Design” by John Ousterhout. https://www.goodreads.com/author/quotes/14019088.John_Ousterhout

        1. 5

          …I totally recognize this in one of my coworkers. XD I think we have successfully channeled them into a place where their unrestrained enthusiasm gets applied towards the forces of good with a minimum of fallout: front line troubleshooting, proof-of-concept testing and customer demos. The fast pace and tactical absorption are very useful when things need to be solved Right Now, and then afterwards more cautious engineers can wade through the wreckage to cherry-pick lessons learned and scoop up the more interesting bash scripts to be turned into real tools and upstreamed.

          To their credit, they also recognize this is their MO and tend to stay away from development that would have deeper or wider impacts on core systems, and prefer to enjoy the endless novel problem-solving of operations-y stuff.

          1. 6

            This is definitely one of the roles where folks like that can shine. But it isn’t a panacea.

            A person in a similar role once told me that they wrote their own implementation of a cryptographic algorithm because “[the platform used in the product] didn’t allow importing the existing npm package for it” and that it wasn’t a problem because “they tested the implementation with all the standard test vectors and it passed all tests.”

            It took me two hours to explain the difference between “this implementation has no security vulnerabilities” and “we don’t know of any security vulnerabilities in this implementation.”

            Thankfully, I caught it early enough that it didn’t cause any serious issues. It would have been really bad if we had shipped it to a customer and it turned out to be problematic in some unforeseen way.

            1. 3

              oh ho ho wow, good catch! I had a similar experience once explaining that we weren’t going to ship a telemetry-monitoring tool to our customers that consisted of a bash script calling curl on undocumented HTTP endpoints. But I think your story wins. At least they did test their impl with all the standard test vectors?

              1. 3

                They did. But standard test vectors are proof of functionality, not safety.

                For example, test vectors didn’t protect the Java 15-18 ECDSA implementation from accepting (0, 0) as a valid signature for any message: https://twitter.com/tqbf/status/1516570590211153922

                Now, Oracle can just go shrug, stuff happens and fix it, but I was at a small-ish startup at the time and wasn’t willing to bet our future on being able to do the same. Not to mention that it would have much more likely been (in part) my mess to clean up, and certainly not the tactical tornado’s.

        2. 3

          I’ve seen people promoted for writing a lot of code. In particular, the kind of developer who seems to have a mindset of ‘never do in 5 lines of code what you could do in 200’. They look amazingly productive because they’ve added a load of new features to the product and the manager doesn’t notice that the rest of the team is less productive because they’re spending all of their time fixing bugs in this person’s code.

        3. 3

          These guys have long been a frickin pain in my backside. Mr. Minimum Viable Product, Except Not Even Actually Viable. One example left four years ago and we’re still finding messes he left.

          1. 1

            If one of these people is in a code base right at the start, sometimes you can never fix it. I remember starting a job and working on a particular API model. I asked who the SME was and someone told me “Well, I think it’s you now.” It was probably the most important data model in the system and one of the most complex ones. It was an absolute rat’s nest of JavaScript. That was 5 years ago and I don’t think it ever got better. I don’t think they’ll ever fix it.

        4. 2

          Too real. Whenever I hear the word “pragmatic” in a programming discussion I get tech debt PTSD.

        5. 2

          I like to think this is about time and place. There are times when going into tornado mode is super useful: prototyping, grenade-diving, etc.

          But you have to own the maintenance of your own shit.

    9. 5

      I’m putting the finishing touches on a new query optimization API for Trustfall, the query engine for all data sources I’ve been building (repo, conf talk, in-browser demo).

      The motivating use case for the new API is cargo-semver-checks, a Rust linter for semantic versioning. It uses Trustfall under the hood, and a recent performance experiment showed that for some workloads it runs ~2300x slower than necessary. This new API is the way to realize that ~2300x speedup.

      Of course, there’s a difference between a prototype implementation for a benchmark and the real production-grade API. But I’m making good progress and I’m hoping that by the end of the week, the new production-grade API will be generally available. This will benefit not just cargo-semver-checks but any other use case or data source using Trustfall as well.

      1. 3

        I had never heard of Trustfall and HYTRADBOI and both look so interesting! Trustfall fits in with my keen interest in a structured approach to editing, finding, etc—e.g. there’s Comby, Semgrep, Tree-sitter, Paredit, etc.

        1. 2

          If you decide to give Trustfall a try, I’d love to hear what you think! Please reach out with feedback or any issues!

    10. 1

      From a distance, it sounds like:

      • the interface-inclusion check was implemented in a convoluted way using a database
      • this approach works fine for medium-sized programs but it does not scale to very-large outliers
      • the author proposes to fix this by using more elaborate, not-released-yet features of the database system they are using

      I wonder if someone has tried to implement a reasonable algorithm directly, without relying on a database. This could be more upfront work but also probably quite a bit faster, and simpler to extend. In particular, I would expect semver-checking to be concerned not only with existence of functions at a given version, but also with their types: if the function has a different type now, is it more general than the previous one? That sounds very tricky to implement at the database level, so probably you have to mix it with procedural logic anyway.

      1. 4

        Unfortunately, the distance has muddied the facts a bit.

        The goal of using the “convoluted database” is not to “buy” performance. It’s to “sell” as little performance as possible while exchanging it for things we actually care about: leverage, maintainability, format-independence, newcomer-friendly lint syntax, easy code reviews, etc.

        The key difficulty in semver-checking is not in the algorithm itself, but in keeping the various lint rules working across Rust versions. Many semver linters have been attempted and built for Rust, and they largely were abandoned due to excessive maintenance burden.

        There’s no stable machine-readable format that describes APIs at a thorough-enough level in Rust. All of the following have been used as the foundation of semver-checking tools:

        • ASTs are stable, but AST-based semver-checking has both false-negatives (auto-trait info is not in the AST and can’t be checked) and false-positives (moving and re-exporting an item is a giant AST change that is semver-invisible).
        • rustdoc JSON is explicitly unstable, and has gone through 9 major versions in the 6 months cargo-semver-checks has been around. The 10th major version is just around the corner (open PR with passing tests, not merged yet).
        • The compiler’s internal APIs are also unstable, and relying on them requires that your users install a specific nightly Rust to use your tool. The semverver project was a Herculean effort to try to pull this off, and in the end it’s being deprecated due to excessive maintenance burden: https://github.com/rust-lang/rust-semverver/issues/373

        The “convoluted database” is the abstraction layer that keeps us from needing to reimplement every lint every time the rustdoc format changes. This is what has killed countless semver tools thus far.

        We have 40+ different lints with more added all the time. In the post, I described the simplest one. Most are far more complex; I invite you to check them out. They could certainly be rewritten procedurally, possibly even with some minor perf benefit. But then we’d be running a format compatibility risk + we’d have to review that procedural code incredibly closely because bugs are easy to miss in a large volume of code. In contrast, the query-based lints are simple to read and so easy that writing them is our most common onboarding task.

        The point of the post is that Trustfall gives us the resources we really care about (maintainability, leverage, easy lint writing and reviewing) without being a performance burden:

        • Judging by the adoption of cargo-semver-checks even sans the 2272x optimization, the community is clearly thrilled to give away performance to get semver-checking.
        • Adding the indexes was a few hours of work for one person, and sped up lints written by everyone without needing to change them at all.
        • By adding parallelism (as hinted in the intro) on top of the indexes, we could easily get to sub-second checking times on even gigantic crates. At that point, the check time would be 1/30th the rustdoc JSON generation time, and it wouldn’t be worth optimizing checking any further for a while.

        My words probably won’t change your mind, but consider this: of all the semver-linters out there, cargo-semver-checks is the one that is in the process of being merged into cargo. It’s also the only one to use the “convoluted database” approach instead of a more direct procedural or compiler-based approach. Funny coincidence, or something more? :)

    11. 2

      For some reason, I got to read this submission and https://lobste.rs/s/vyb9rm/stack_graphs_name_resolution_at_scale in succession. In the stack graphs submission, GitHub produces stack graphs for every push to the repository. To speed things up, they rely on the fact that if git push doesn’t contain some file, that file’s content hasn’t changed.

      That got me to thinking: could cargo-semver-checks use the same trick of leveraging git to update indices only for files that change? That should bring even more speedup as less work needs to be done.

      1. 1

        That submission was great, I originally saw the Strange Loop talk and loved it! And yes, the same trick could work here as well.

        In practice, though, we have to regenerate the rustdoc JSON data because cargo-semver-checks has 40+ checks implemented (with more added constantly), most of which need more data than stack graphs could resolve (e.g. the implemented traits for a type, which requires type-checking due to auto-traits). This involves a trip through rustc, which is then free to change all the item identifiers in the new rustdoc JSON file, which in turn invalidates the index. I’ve had some conversations with the rustdoc maintainers about identifier stability, and together we decided that’s a “not now but maybe in the future” work item.

        But recall there are two JSON files: one from the older “baseline” version and one from the newer “current / about-to-publish” version. The “current” JSON file has to be regenerated, but the “baseline” is almost always referring to a release that’s already on crates.io! That means we get to cache the baseline JSON file instead of rebuilding it, which saves a huge amount of time: one build avoided entirely, plus better build caching for the other build since nothing gets overwritten. This will ship in the next cargo-semver-checks version in the next few days, as soon as we’ve finished testing the alpha with our early-adopter users.

        Could we then also cache the indexes for the baseline? Probably! Right now, it just isn’t worth it:

        • Total checking-only time with the optimizations in the post is 8s.
        • Building the indexes as regular hashtables is a fraction of a second, I’d estimate probably ~0.2s tops per JSON file.
        • Reading and deserializing the 200MB JSON file is another fraction of a second, call it 1s total to read + build.
        • The “current” rustdoc JSON build from a warm build cache is ~30-35s. This is the step we can’t skip no matter what right now.
        • The “current” rustdoc JSON always needs a fresh index, so another 1s to read + build.

        So even if we switched to a fancier persisted index (or even turned cargo-semver-checks into a daemon that keeps the index always warm in RAM), we’d win maybe 1s total, but pay for it dearly in extra complexity.

        The bigger win we haven’t grabbed yet is parallelizing with rayon, as I hinted in the beginning of the post. We have 40+ queries over the same read-only dataset, all completely independent from one another. That’s just a “parallel for loop,” and rayon will easily get us another O(# of cores) speedup.
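
        That parallel for loop would look roughly like this (the types are stand-ins for the real Trustfall queries and rustdoc JSON data):

        use rayon::prelude::*;

        // Stand-in types, for illustration only.
        struct Query;
        struct Dataset;
        struct CheckResult;

        fn run_query(_query: &Query, _dataset: &Dataset) -> CheckResult {
            CheckResult
        }

        // All queries are independent and the dataset is read-only, so rayon
        // can fan the checks out across every available core.
        fn run_all_checks(queries: &[Query], dataset: &Dataset) -> Vec<CheckResult> {
            queries.par_iter().map(|q| run_query(q, dataset)).collect()
        }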

        At that point, we’re talking about a checking-only time of <1s for even the biggest crates, and we start optimizing things like “rustdoc JSON itself” and “JSON deserialization” :)

        But I do need to finish the new Trustfall API first!

    12. 2

      This is straightforward: after loading the rustdoc JSON, iterate over the items and construct the hashtables we need. […] Within the context of that request, the adapter can use the new optimization functionality to ask Trustfall: “Did the query have a specific item name it’s looking for?” If yes — fast path — it applies the index […]

      I guess cargo-semver-checks would use all possible indices and might as well build all of them at once, but, more generally, would it be reasonable to construct indices lazily, after getting the answer to “Did the query have a specific item name it’s looking for?”?

      1. 1

        Good questions!

        Building indexes generally is more expensive than answering any single query: you can kind of think of the index build as a query that misses all indexes, returns a giant amount of data, and now that data needs to be persisted somewhere (in RAM or on disk; either way it’s cheaper than not persisting it, though not free: malloc is faster than fsync, but it isn’t instantaneous).

        But if there’s reason to believe that the query is going to happen repeatedly and enough to offset the initial and ongoing cost of having the index, then by all means it should be built. There are algorithms that seek to answer this question in an “online” fashion, where they learn which indexes are worth building based on the behavior of the queries being used. Trustfall’s job is to enable such algorithms to be plugged in (under its “any optimization possible outside of Trustfall is just as easy within it too” rule for API design), and it will happily do so.

        In the case of cargo-semver-checks, we have an easier case where we know all the queries ahead of time — they are hardcoded into the tool. We also know that they run to completion (“fetch all rows, no limit”) and that in expectation they find nothing because semver errors are the exception, not the norm. A quick inspection of the queries (or a half hour with a profiler) makes the index decision very easy: the two indexes mentioned in the post (with a few extra caveats, like “only include public items in the index”) are always worth it, and the rest not so much.
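
        For a rough idea of what building one of those indexes involves, here’s a sketch with a drastically simplified item type (the real adapter works over rustdoc JSON):

        use std::collections::HashMap;

        struct Item {
            name: String,
            public: bool,
        }

        // Build the "public items by name" index once after loading the JSON;
        // queries that know the name they're looking for then do a point lookup
        // instead of scanning every item.
        fn build_index(items: &[Item]) -> HashMap<&str, Vec<&Item>> {
            let mut index: HashMap<&str, Vec<&Item>> = HashMap::new();
            for item in items.iter().filter(|i| i.public) {
                index.entry(item.name.as_str()).or_default().push(item);
            }
            index
        }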

        Another interesting quirk you might not have noticed: queries expand items twice — once in the previous version and once in the new one. In one of these, the answer to “Did the query have a specific item name it’s looking for?” is “no” (because we’re checking all items) and in the other one it is “yes” (because we’re in the middle of checking against a specific item). This means that the adapter can’t afford to neglect the “slow path” entirely, and certainly not delete it.

    13. 6

      I’m impressed by how well the notes (I mean footnotes, except they’re not in a footer) show and hide inline without needing JavaScript, markedly better than on any blog I recall having seen before.

      The optimization is impressive too, of course. :-)

      1. 4

        Thank you! I’m using a lightly edited version of the Tufte CSS theme available here. It clashes a bit with the rest of my site, but I picked it specifically because of the nice “sidenotes” that display in the margin on wide screens and inline on narrow / mobile layouts. I also learned about it from seeing it on a friend’s blog and commenting on it :)

        1. 3

          OT: Footnotes 3 and 4 are ‘buggy’ for me (safari, iPhone). They expand over the code block, hiding the code but also displaying the footnote with dark text over a dark background.

          On topic: nice with more applications for your tool. I previously saw your shortish presentation (from that conference), which I liked very much!

          1. 3

            Oh, fascinating — thanks for reporting the problem. Are any other footnotes problematic or do you think it might just be a weird interaction with the code block at the root of it?

            I unfortunately don’t have an iPhone so I’ll have to borrow a friend’s to figure out what’s going on exactly.

            Thanks for the kind words! I’m hoping I can make Trustfall useful in other domains as well, now that it hopefully has a bit more credibility :)

            1. 2

              In case it helps, for me, in Chrome on Android, those notes show as light text on the dark background. Opening them does hide the content of the code block, but I assume that’s intentional.

              1. 2

                Thanks, it does help!

    14. 1

      Note that on my system (W10, Firefox, dark UI settings) your website is dark, but all console output has a white background with white-grey text, making it barely readable.

      1. 1

        That’s interesting, I didn’t know dark UI settings were a thing in desktop browsers; thought the issue only happened on mobile. Is it some standardized parameter the browser sends that my hugo theme must be responding to?

        1. 3

          Yes, it’s a CSS media query (the same family as the ones for screen width or pixel density) and its name is prefers-color-scheme: https://developer.mozilla.org/en-US/docs/Web/CSS/@media/prefers-color-scheme

          If you open the dev tools, in Chrome’s Elements page, in the “styles” sidebar there’s a paint roller icon. Click it and it will let you override the dark mode / light mode setting in your browser so you can see what both versions look like.

          Fun fact: you can use <picture> tags together with that CSS media query to display a different image in dark mode compared to light mode. You can see this in action on my blog, where in dark mode the graphics use off-white text over a transparent background, and in light mode they use dark grey text over a transparent background: https://predr.ag/blog/speeding-up-rust-semver-checking-by-over-2000x/

          1. 1

            Oh that’s very interesting! I will certainly need to implement the picture switching on one of my posts, which has transparent vector art I put a lot of time into that becomes totally invisible in dark mode: https://ahelwer.ca/post/2018-12-07-chsh/

            1. 3

              You can do it without picture switching if you embed SVGs in the HTML. For simple line drawings with black/white strokes:

              <svg><path stroke="currentColor" fill="none">...</path></svg>
              
              :root { color-scheme: light dark; }
              figure > svg { color: #000; }
              @media (prefers-color-scheme: dark) {
                  figure > svg { color: #fff; }
              }
              

              A more complicated setup for colors:

              <svg><path class="svg-stroke-red">...</path></svg>
              
              :root {
                  color-scheme: light dark;
                  --red: #c22017;
              }
              @media (prefers-color-scheme: dark) {
                  :root { --red: #fa6961; }
              }
              .svg-stroke-red { stroke: var(--red); }
              

              You don’t have to use CSS variables here, but I like doing it this way so I can have my whole palette in one place and use it outside SVGs as well.

              I use this technique for the drawing here. I’ve also partially automated it for another website.

              1. 2

                Good tips, I just managed to get it working by inlining the SVGs! Might play around with choosing non-black colors depending on the colorscheme too.

                Changes: https://gitlab.com/ahelwer/ahelwer.gitlab.io/-/commit/c7547624c6f11e69c0a914bd29b22e9632625596


            2. 2

              Haha I clicked on the link, and was like “oh, very interesting minimalistic graphics!”

              Then I switched to light mode and realized how much I was missing :)

              Good luck, and help spread this knowledge far and wide!

    15. 3

      Minor formatting nit, perhaps specific to my system: I’m reading in dark mode on a laptop, and the text boxes containing the error messages, terminal printouts, and TLA+ code were difficult to read since they rendered as light grey text on a white background. It didn’t look in sync with the rest of the blog’s theme and I’m guessing it might just be a CSS glitch. Wanted to let you know so you could have a look — it’s the sort of thing that’s difficult to find if nobody tells you about it :)

      1. 2

        Apologies, thanks for the report! Seems the pygments highlighter isn’t playing well with whatever dark mode setting exists for this theme I’m using. I sort-of fixed the problem by highlighting all those code blocks as sh, which isn’t correct, but at least gets them readable. Given the topic of the post I really should figure out how to highlight all the code blocks on my blog with tree-sitter.

        1. 1

          I’d love to read a blog post about syntax highlighting the code blocks with tree-sitter :)

          Bonus points if it’s tree-sitter running client side as WASM, just for the fun of it!

    16. 2

      For folks that prefer to learn by watching rather than reading, the Strange Loop talk on this was one of the finest examples of technical communication I’ve ever had the pleasure of seeing with my own eyes: https://www.thestrangeloop.com/2021/incremental-zero-config-code-nav-using-stack-graphs.html

      My hat’s off to you, @dcreager :)

      1. 2

        Wow, thank you for the kind words!

    17. 2

      Off topic, but I just want to say that it’s neat how the side notes turn into toggleable inline notes when the viewport width is small, and clever how toggling notes works without JavaScript.

      1. 4

        Thank you! This is a feature of the Tufte CSS theme, which I’m using in lightly-edited form. The ability to have side notes with a sensible behavior on mobile and narrow viewports was the top reason I picked it.

    18. 4

      For the glob-import situation, here’s a proposal which I think would be fairly reasonable to do in an edition, or at least an optional lint: modules can only be glob-imported if they contain #![prelude] (or glob_importable or whatever). Adding items to such a module is a breaking change.
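
      For context on why adding items to a glob-importable module would have to be a breaking change, here’s the hazard from the downstream side (crate and type names are made up):

      use upstream::prelude::*; // exports `Encoder`
      use helpers::*;           // a newer release starts exporting its own `Encoder`

      fn main() {
          // This used to compile; after the `helpers` release it fails with
          // an error like: error[E0659]: `Encoder` is ambiguous
          let _enc = Encoder::new();
      }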

      1. 2

        In the meantime, it should be possible to implement part of this in a tool like cargo-semver-checks: once we have implemented lint-level control, maintainers should be able to raise the level of “item added” lints to semver-major for their prelude modules.

        I’d also suggest a tweak for the proposal: always allow glob imports in #[cfg(test)] code, and allow glob imports of enums (grabbing all their variants) if the glob import is local to a function. I think those are two common use cases for glob imports of non-preludes these days, and I don’t think they are particularly likely to cause trouble.

        1. 4

          always allow glob imports in #[cfg(test)] code

          Allowing glob imports within a crate might be sufficient here (and feels a little less magical, especially if this is to become a language rule and not a lint rule), though I recall there’s some weirdness around integration tests getting split into a separate crate?

          1. 2

            Ah yes, good idea. Perhaps it can be like #[non_exhaustive] which doesn’t prohibit pattern-matching when inside the defining crate.

            If integration tests can’t be easily made to work with glob imports, perhaps this can be solved by tooling: e.g. cargo fix or rust-analyzer could define an assist to replace a (technically-disallowed) glob import with explicit imports of everything actually used in the integration test file.

            1. 1

              I don’t think integration tests really need glob imports that much. Unlike unit tests, they’re usually testing a relatively small public interface.

    19. 8

      Fantastic article, with tons of high quality links for further reading. 10/10 A+.

      1. 5

        Thank you! That means a lot to me.

    20. 5

      This is a much better presentation of the argument that it makes than the spacebar heating xkcd.

      1. 3

        Thank you, much appreciated.