I’m impressed by how well the notes (I mean footnotes, except they’re not in a footer) show and hide inline without needing JavaScript, markedly better than on any blog I recall having seen before.
The optimization is impressive too, of course. :-)
Thank you! I’m using a lightly edited version of the Tufte CSS theme available here. It clashes a bit with the rest of my site, but I picked it specifically because of the nice “sidenotes” that display in the margin on wide screens and inline on narrow / mobile layouts. I also learned about it from seeing it on a friend’s blog and commenting on it :)
OT: Footnotes 3 and 4 are ‘buggy’ for me (safari, iPhone). They expand over the code block, hiding the code but also displaying the footnote with dark text over a dark background.
On topic: nice with more applications for your tool. I previously saw your shortish presentation (from that conference), which I liked very much!
Oh, fascinating — thanks for reporting the problem. Are any other footnotes problematic or do you think it might just be a weird interaction with the code block at the root of it?
I unfortunately don’t have an iPhone so I’ll have borrow a friend’s to figure out what’s going on exactly.
Thanks for the kind words! I’m hoping I can make Trustfall useful in other domains as well, now that it hopefully has a bit more credibility :)
In case it helps, for me, in Chrome on Android, those notes show as light text on the dark background. Opening them does hide the content of the code block, but I assume that’s intentional.
This is straightforward: after loading the rustdoc JSON, iterate over the items and construct the hashtables we need. […] Within the context of that request, the adapter can use the new optimization functionality to ask Trustfall: “Did the query have a specific item name it’s looking for?” If yes — fast path — it applies the index […]
I guess cargo-semver-checks would use all possible indices and might as well build all of them at once, but, more generally, would it be reasonable to construct indices lazily, after getting the answer to “Did the query have a specific item name it’s looking for?”?
Building indexes generally is more expensive than answering any single query: you can kind of think of the index build as a query that misses all indexes, returns a giant amount of data, and now that data needs to be persisted somewhere (in RAM or on disk, in any case it’s cheaper than not persisting it — malloc is faster than fsync but it isn’t instantaneous).
But if there’s reason to believe that the query is going to happen repeatedly and enough to offset the initial and ongoing cost of having the index, then by all means it should be built. There are algorithms that seek to answer this question in an “online” fashion, where they learn which indexes are worth building based on the behavior of the queries being used. Trustfall’s job is to enable such algorithms to be plugged in (under its “any optimization possible outside of Trustfall is just as easy within it too” rule for API design), and it will happily do so.
In the case of cargo-semver-checks, we have an easier case where we know all the queries ahead of time — they are hardcoded into the tool. We also know that they run to completion (“fetch all rows, no limit”) and that in expectation they find nothing because semver errors are the exception, not the norm. A quick inspection of the queries (or a half hour with a profiler) makes the index decision very easy: the two indexes mentioned in the post (with a few extra caveats, like “only include public items in the index”) are always worth it, and the rest not so much.
Another interesting quirk you might not have noticed: queries expand items twice — once in the previous version and once in the new one. In one of these, the answer to “Did the query have a specific item name it’s looking for?” is “no” (because we’re checking all items) and in the other one it is “yes” (because we’re in the middle of checking against a specific item). This means that the adapter can’t afford to neglect the “slow path” entirely, and certainly not delete it.
For some reason, I got to read this submission and https://lobste.rs/s/vyb9rm/stack_graphs_name_resolution_at_scale in succession. In stack graph submission, GitHub produces stack graphs for every push to the repository. To speed things up, they rely on the fact that if git push doesn’t contain some file, that file’s content haven’t changed.
That got me to thinking: could cargo-semver-check use the same trick of leveraging git to update indices only for files that change? That should bring even more speedup as less work needs to be done.
That submission was great, I originally saw the Strange Loop talk and loved it! And yes, the same trick could work here as well.
In practice, though, we have to regenerate the rustdoc JSON data because cargo-semver-checks has 40+ checks implemented (with more added constantly), most of which need more data than stack graphs could resolve (e.g. the implemented traits for a type, which requires type-checking due to auto-traits). This involves a trip through rustc, which is then free to change all the item identifiers in the new rustdoc JSON file, which in turn invalidates the index. I’ve had some conversations with the rustdoc maintainers about identifier stability, and together we decided that’s a “not now but maybe in the future” work item.
But recall there are two JSON files: one from the older “baseline” version and one from the newer “current / about-to-publish” version. The “current” JSON file has to be regenerated, but the “baseline” is almost always referring to a release that’s already on crates.io! That means we get to cache the baseline JSON file instead of rebuilding it, which saves a huge amount of time: avoided a build + better build caching for the other build since nothing got overwritten. This will ship in the next cargo-semver-checks version in the next few days, as soon as we’ve finished testing the alpha with our early-adopter users.
Could we then also cache the indexes for the baseline? Probably! Right now, it just isn’t worth it:
Total checking-only time with the optimizations in the post is 8s.
Building the indexes as regular hashtables is a fraction of a second, I’d estimate probably ~0.2s tops per JSON file.
Reading and deserializing the 200MB JSON file is another fraction of a second, call it 1s total to read + build.
The “current” rustdoc JSON build from a warm build cache is ~30-35s. This is the step we can’t skip no matter what right now.
The “current” rustdoc JSON always needs a fresh index, so another 1s to read + build.
So even if we switched to a fancier persisted index (or even turned cargo-semver-checks into a daemon that keeps the index always warm in RAM), we’d win maybe 1s total, but pay for it dearly in extra complexity.
The bigger win we haven’t grabbed yet is parallelizing with rayon, as I hinted in the beginning of the post. We have 40+ queries on the same read-only dataset + completely independent from one another. That’s just a “parallel for loop,” and rayon will easily get us another O(# of cores) speedup.
At that point, we’re talking about a checking-only time of <1s for even the biggest crates, and we start optimizing things like “rustdoc JSON itself” and “JSON deserialization” :)
But I do need to finish the new Trustfall API first!
the interface-inclusion check was implemented in a convoluted way using a database
this approach works fine for medium-sized programs but it does not scale to very-large outliers
the author proposes to fix this by using more elaborate, not-released-yet features of the database system they are using
I wonder if someone has tried to implement a reasonable algorithm directly, without relying on a database. This could be more upfront work but also probably quite a bit faster, and simpler to extend. In particular, I would expect semver-checking to be concerned not only with existence of functions at a given version, but also with their types: if the function has a different type now, is it more general than the previous one? That sounds very tricky to implement at the database level, so probably you have to mix it with procedural logic anyway.
Unfortunately, the distance has muddied the facts a bit.
The goal of using the “convoluted database” is not to “buy” performance. It’s to “sell” as little performance as possible while exchanging it for things we actually care about: leverage, maintainability, format-independence, newcomer-friendly lint syntax, easy code reviews, etc.
The key difficulty in semver-checking is not in the algorithm itself, but in keeping the various lint rules working across Rust versions. Many semver linters have been attempted and built for Rust, and they largely were abandoned due to excessive maintenance burden.
There’s no stable machine-readable format that describes APIs at a thorough-enough level in Rust. All of the following have been used as the foundation of semver-checking tools:
ASTs are stable, but AST-based semver-checking has both false-negatives (auto-trait info is not in the AST and can’t be checked) and false-positives (moving and re-exporting an item is a giant AST change that is semver-invisible).
rustdoc JSON is explicitly unstable, and has gone through 9 major versions in the 6 months cargo-semver-checks has been around. The 10th major version is just around the corner (open PR with passing tests, not merged yet).
The compiler’s internal APIs are also unstable, and relying on them requires that your users install a specific nightly Rust to use your tool. The semverver project was a Herculean effort to try to pull this off, and in the end it’s being deprecated due to excessive maintenance burden: https://github.com/rust-lang/rust-semverver/issues/373
The “convoluted database” is the abstraction layer that keeps us from needing to reimplement every lint every time the rustdoc format changes. This is what has killed countless semver tools thus far.
We have 40+ different lints with more added all the time. In the post, I described the simplest one. Most are far more complex; I invite you to check them out. They could certainly be rewritten procedurally, possibly even with some minor perf benefit. But then we’d be running a format compatibility risk + we’d have to review that procedural code incredibly closely because bugs are easy to miss in a large volume of code. In contrast, the query-based lints are simple to read and so easy that writing them is our most common onboarding task.
The point of the post is that Trustfall gives us the resources we really care about (maintainability, leverage, easy lint writing and reviewing) without being a performance burden:
By the adoption of cargo-semver-checks sans the 2272x optimization, the community is clearly thrilled to give away performance to get semver-checking.
Adding the indexes was a few hours of work for one person, and sped up lints written by everyone without needing to change them at all.
By adding parallelism (as hinted in the intro) on top of the indexes, we could easily get to sub-second checking times on even gigantic crates. At that point, the check time would be 1/30th the rustdoc JSON generation time, and is no longer worth optimizing checking any further for a while.
My words probably won’t change your mind, but consider this: of all the semver-linters out there, cargo-semver-checks is the one that is in the process of being merged into cargo. It’s also the only one to use the “convoluted database” approach instead of a more direct procedural or compiler-based approach. Funny coincidence, or something more? :)
I’m impressed by how well the notes (I mean footnotes, except they’re not in a footer) show and hide inline without needing JavaScript, markedly better than on any blog I recall having seen before.
The optimization is impressive too, of course. :-)
Thank you! I’m using a lightly edited version of the Tufte CSS theme available here. It clashes a bit with the rest of my site, but I picked it specifically because of the nice “sidenotes” that display in the margin on wide screens and inline on narrow / mobile layouts. I also learned about it from seeing it on a friend’s blog and commenting on it :)
OT: Footnotes 3 and 4 are ‘buggy’ for me (safari, iPhone). They expand over the code block, hiding the code but also displaying the footnote with dark text over a dark background.
On topic: nice with more applications for your tool. I previously saw your shortish presentation (from that conference), which I liked very much!
Oh, fascinating — thanks for reporting the problem. Are any other footnotes problematic or do you think it might just be a weird interaction with the code block at the root of it?
I unfortunately don’t have an iPhone so I’ll have borrow a friend’s to figure out what’s going on exactly.
Thanks for the kind words! I’m hoping I can make Trustfall useful in other domains as well, now that it hopefully has a bit more credibility :)
In case it helps, for me, in Chrome on Android, those notes show as light text on the dark background. Opening them does hide the content of the code block, but I assume that’s intentional.
Thanks, it does help!
I guess
cargo-semver-checks
would use all possible indices and might as well build all of them at once, but, more generally, would it be reasonable to construct indices lazily, after getting the answer to “Did the query have a specific item name it’s looking for?”?Good questions!
Building indexes generally is more expensive than answering any single query: you can kind of think of the index build as a query that misses all indexes, returns a giant amount of data, and now that data needs to be persisted somewhere (in RAM or on disk, in any case it’s cheaper than not persisting it —
malloc
is faster thanfsync
but it isn’t instantaneous).But if there’s reason to believe that the query is going to happen repeatedly and enough to offset the initial and ongoing cost of having the index, then by all means it should be built. There are algorithms that seek to answer this question in an “online” fashion, where they learn which indexes are worth building based on the behavior of the queries being used. Trustfall’s job is to enable such algorithms to be plugged in (under its “any optimization possible outside of Trustfall is just as easy within it too” rule for API design), and it will happily do so.
In the case of
cargo-semver-checks
, we have an easier case where we know all the queries ahead of time — they are hardcoded into the tool. We also know that they run to completion (“fetch all rows, no limit”) and that in expectation they find nothing because semver errors are the exception, not the norm. A quick inspection of the queries (or a half hour with a profiler) makes the index decision very easy: the two indexes mentioned in the post (with a few extra caveats, like “only include public items in the index”) are always worth it, and the rest not so much.Another interesting quirk you might not have noticed: queries expand items twice — once in the previous version and once in the new one. In one of these, the answer to “Did the query have a specific item name it’s looking for?” is “no” (because we’re checking all items) and in the other one it is “yes” (because we’re in the middle of checking against a specific item). This means that the adapter can’t afford to neglect the “slow path” entirely, and certainly not delete it.
For some reason, I got to read this submission and https://lobste.rs/s/vyb9rm/stack_graphs_name_resolution_at_scale in succession. In stack graph submission, GitHub produces stack graphs for every push to the repository. To speed things up, they rely on the fact that if git push doesn’t contain some file, that file’s content haven’t changed.
That got me to thinking: could
cargo-semver-check
use the same trick of leveraging git to update indices only for files that change? That should bring even more speedup as less work needs to be done.That submission was great, I originally saw the Strange Loop talk and loved it! And yes, the same trick could work here as well.
In practice, though, we have to regenerate the rustdoc JSON data because
cargo-semver-checks
has 40+ checks implemented (with more added constantly), most of which need more data than stack graphs could resolve (e.g. the implemented traits for a type, which requires type-checking due to auto-traits). This involves a trip through rustc, which is then free to change all the item identifiers in the new rustdoc JSON file, which in turn invalidates the index. I’ve had some conversations with the rustdoc maintainers about identifier stability, and together we decided that’s a “not now but maybe in the future” work item.But recall there are two JSON files: one from the older “baseline” version and one from the newer “current / about-to-publish” version. The “current” JSON file has to be regenerated, but the “baseline” is almost always referring to a release that’s already on crates.io! That means we get to cache the baseline JSON file instead of rebuilding it, which saves a huge amount of time: avoided a build + better build caching for the other build since nothing got overwritten. This will ship in the next
cargo-semver-checks
version in the next few days, as soon as we’ve finished testing the alpha with our early-adopter users.Could we then also cache the indexes for the baseline? Probably! Right now, it just isn’t worth it:
So even if we switched to a fancier persisted index (or even turned
cargo-semver-checks
into a daemon that keeps the index always warm in RAM), we’d win maybe 1s total, but pay for it dearly in extra complexity.The bigger win we haven’t grabbed yet is parallelizing with
rayon
, as I hinted in the beginning of the post. We have 40+ queries on the same read-only dataset + completely independent from one another. That’s just a “parallel for loop,” andrayon
will easily get us anotherO(# of cores)
speedup.At that point, we’re talking about a checking-only time of <1s for even the biggest crates, and we start optimizing things like “rustdoc JSON itself” and “JSON deserialization” :)
But I do need to finish the new Trustfall API first!
From a distance, it sounds like:
I wonder if someone has tried to implement a reasonable algorithm directly, without relying on a database. This could be more upfront work but also probably quite a bit faster, and simpler to extend. In particular, I would expect semver-checking to be concerned not only with existence of functions at a given version, but also with their types: if the function has a different type now, is it more general than the previous one? That sounds very tricky to implement at the database level, so probably you have to mix it with procedural logic anyway.
Unfortunately, the distance has muddied the facts a bit.
The goal of using the “convoluted database” is not to “buy” performance. It’s to “sell” as little performance as possible while exchanging it for things we actually care about: leverage, maintainability, format-independence, newcomer-friendly lint syntax, easy code reviews, etc.
The key difficulty in semver-checking is not in the algorithm itself, but in keeping the various lint rules working across Rust versions. Many semver linters have been attempted and built for Rust, and they largely were abandoned due to excessive maintenance burden.
There’s no stable machine-readable format that describes APIs at a thorough-enough level in Rust. All of the following have been used as the foundation of semver-checking tools:
cargo-semver-checks
has been around. The 10th major version is just around the corner (open PR with passing tests, not merged yet).semverver
project was a Herculean effort to try to pull this off, and in the end it’s being deprecated due to excessive maintenance burden: https://github.com/rust-lang/rust-semverver/issues/373The “convoluted database” is the abstraction layer that keeps us from needing to reimplement every lint every time the rustdoc format changes. This is what has killed countless semver tools thus far.
We have 40+ different lints with more added all the time. In the post, I described the simplest one. Most are far more complex; I invite you to check them out. They could certainly be rewritten procedurally, possibly even with some minor perf benefit. But then we’d be running a format compatibility risk + we’d have to review that procedural code incredibly closely because bugs are easy to miss in a large volume of code. In contrast, the query-based lints are simple to read and so easy that writing them is our most common onboarding task.
The point of the post is that Trustfall gives us the resources we really care about (maintainability, leverage, easy lint writing and reviewing) without being a performance burden:
cargo-semver-checks
sans the 2272x optimization, the community is clearly thrilled to give away performance to get semver-checking.My words probably won’t change your mind, but consider this: of all the semver-linters out there,
cargo-semver-checks
is the one that is in the process of being merged intocargo
. It’s also the only one to use the “convoluted database” approach instead of a more direct procedural or compiler-based approach. Funny coincidence, or something more? :)