It took us a while to arrive at the exact idea presented here.
My first attempts tried and failed to do smarter things, like track additions, deletions, renames, and moves.
FWIW, this approach is described in the linked rust-analyzer post (in the part which describes how ra does not work):
Moreover, this map is cheap to update. When a file change arrives, this file’s contribution from the index is removed, the text of the file is changed and the indexer runs on the new text and adds the new contributions.
It’s interesting that this is more efficient than what salsa (ra’s incremental engine) does, and beyond what it is capable of expressing. Essentially, what happens here is that we apply a diff to an input, use that diff to compute the diff of the derived data, and update the derived data in place. Salsa is dumber in that it can’t do this “diff for derived” thing; it can only recompute derived data from scratch. The work-around is to arrange derived data as a tree (every tree node is a query whose inputs are its children), so that only the path to the root needs to be recomputed.
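To make the contrast concrete, here’s a minimal sketch of the “patch the derived data” idea (toy code, not salsa’s or Sorbet’s; `index_file` and the data shapes are invented): updating a global name-to-files map costs time proportional to the edited file, whereas a recompute-only engine rebuilds the whole map.

```python
from collections import defaultdict

def index_file(text: str) -> set[str]:
    """Hypothetical indexer: pretend every identifier-like token is a definition."""
    return {tok for tok in text.split() if tok.isidentifier()}

class NameIndex:
    """Global name -> files map, patched per file change instead of rebuilt."""

    def __init__(self) -> None:
        self.names_by_file: dict[str, set[str]] = {}
        self.files_by_name: defaultdict[str, set[str]] = defaultdict(set)

    def update_file(self, path: str, new_text: str) -> None:
        # 1. Remove this file's old contribution from the derived map.
        for name in self.names_by_file.get(path, set()):
            self.files_by_name[name].discard(path)
        # 2. Re-run the indexer on the new text only.
        new_names = index_file(new_text)
        self.names_by_file[path] = new_names
        # 3. Add the new contribution back. Cost is proportional to the
        #    size of the edited file, not the size of the codebase.
        for name in new_names:
            self.files_by_name[name].add(path)
```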
In particular, it’s not fully incremental: some kinds of edits cause Sorbet to type check the whole codebase.
Two questions: are we talking about LSP specifically here? And does “typecheck the whole codebase” refer to literal typechecking of every function’s body, or only to the fundamentally global “name binding” phase (figuring out what the stuff outside of bodies means, without peeking inside)?
My theory here is that “maintain a fully type-checked view of the codebase, all the time” is
a) a fundamentally slow operation (there are O(1) edits which lead to O(N) changes, like a rename of a popular symbol), and
b) not a requirement, in the sense that you can provide great UX without it.
In particular, I think the right architecture here is for LSP to only provide on-the-fly errors for the files you are currently working with, and to have a separate, explicit “build” command which runs the actual build system to get the list of all errors.
IntelliJ makes the distinction (in the internal UI as well as in the UX surfaced to the user) between these two kinds of sources of errors. VS Code and LSP sadly conflate the two.
Having done that, it’s hard to decide which files depend on what changed and must be type checked again.
Of those three, Sorbet managed to do (1) and (3) passably well before we started.
That’s really surprising! Like, 3 is by far the hardest problem, with the nastiest bugs! Would love to read a post about that!
The approaches for 3 I know of:
On every change, invalidate types in all bodies. To paper over that, make “typechecking a function body” a lazy operation, and implement LSP features in such a way that they don’t need to typecheck the bodies of anything which is not on the screen (roughly the shape sketched below).
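Roughly, the shape is something like this (a toy sketch; the names here are made up, not rust-analyzer’s actual API): results are cached per body, the whole cache is dropped on every edit, and only the bodies a request actually touches get re-checked.

```python
class LazyBodyChecker:
    """Cache of per-function-body typecheck results, dropped wholesale on edit."""

    def __init__(self, typecheck_body) -> None:
        self._typecheck_body = typecheck_body  # assumed-expensive callback
        self._cache: dict[str, object] = {}

    def on_edit(self) -> None:
        # No attempt to diff derived data: just forget everything derived.
        self._cache.clear()

    def body_result(self, function_id: str):
        # Called only for bodies a request actually needs (e.g. those on screen).
        if function_id not in self._cache:
            self._cache[function_id] = self._typecheck_body(function_id)
        return self._cache[function_id]
```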
I’d first like to start by saying that, as evidenced by the post, this approach starts to break down when handling changes to the class hierarchy, so feel free to take this entire post with a grain of salt. I’m not trying to recommend this strategy as much as trying to show how much value it was able to capture for how little work it was in a system that didn’t take a principled approach to being incremental in the first place.
FWIW, this approach is described in the linked rust-analyzer post
Sorbet does invalidate indexing passes, but does not maintain per-file symbol tables. As the post hints at, “removing the file’s contribution from the symbol table” is harder than “replacing the file’s index result.” It’s possible that Sorbet could have been made to work more like the Java example in that section, but it would have likely involved a redesign of Sorbet’s constant resolution pass, at which point why not just do the query-based approach? (Answer: time/effort.)
are we talking about LSP specifically here?
Yes, the only thing that is cached for Sorbet at the command line is index results. There is no attempt to cache or reuse any other part of the pipeline at the command line. This step takes ~20 seconds on one machine vs the CPU-weeks used to run Ruby tests per build. The command line mode is basically only used in CI—the daily driver for the vast majority of developers is LSP, so it hasn’t made sense to optimize for making the command line version incremental in some way
And does “typecheck the whole codebase” refer to literal typechecking of every function’s body, or only to the fundamentally global “name binding” phase
The former, including all bodies. (In my writing about Sorbet, I exclusively use “resolving” to describe that name binding phase.) This lines up with the terms as they are used in Sorbet’s pipeline: https://github.com/sorbet/sorbet/blob/master/docs/pipeline.md
My theory here is that [this] is a fundamentally slow operation
Yep :)
I think the right architecture here is for LSP to only provide on-the-fly errors
We have heard many times from developers in our monorepo who hate it when their editor tells them one thing but CI tells them another thing. If they don’t see the full list of errors in their editor but they do see them in CI, they complain
It’s worth noting that there is some amount of separation between resolving and running inference on method bodies: once the resolve phase has finished, Sorbet allows the typechecking operation to be pre-empted or cancelled as new edits come in. I suppose you could conceptualize it as a “build” command that Sorbet runs automatically for every edit once resolver has finished
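In rough pseudocode terms (a sketch of the shape only, not Sorbet’s actual implementation; `CancelToken`, `typecheck_all`, and `BackgroundChecker` are invented names): after resolution finishes, body typechecking runs in the background, and a newer edit cancels the in-flight pass.

```python
import threading
from typing import Callable, Iterable, Optional

class CancelToken:
    def __init__(self) -> None:
        self._event = threading.Event()

    def cancel(self) -> None:
        self._event.set()

    @property
    def cancelled(self) -> bool:
        return self._event.is_set()

def typecheck_all(files: Iterable[str],
                  check_file: Callable[[str], None],
                  token: CancelToken) -> None:
    for f in files:
        if token.cancelled:  # a newer edit arrived; stop early
            return
        check_file(f)

class BackgroundChecker:
    def __init__(self, files: list[str], check_file: Callable[[str], None]) -> None:
        self._files = files
        self._check_file = check_file
        self._token: Optional[CancelToken] = None

    def on_edit_resolved(self) -> None:
        # Pre-empt the in-flight pass (if any), then start a fresh one.
        if self._token is not None:
            self._token.cancel()
        self._token = CancelToken()
        threading.Thread(
            target=typecheck_all,
            args=(self._files, self._check_file, self._token),
            daemon=True,
        ).start()
```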
3 is by far the hardest problem
Really? Sorbet implements what sounds like the same approach as the Kotlin post you shared (I could write something up, but it would likely live in Sorbet’s internal architecture docs, not be a blog post). That has worked with no changes since it was first implemented years ago. It typechecks too much stuff sometimes, which is maybe the only complaint? The article puts a number on how often this happens: it’s the “too many extra files” bucket at the bottom of the post. 13% of slow path edits, or 1.3% of all edits, would involve typechecking more than 200 downstream files (in excess of the files that needed to be re-resolved). Again, the only reason that number is 200 and not 2,000 or higher is because of the single-threaded limitation that the post mentions as being the solution to this bucket. So in practice we’ve found this name-based approach to be almost completely fine. I unfortunately can’t take credit here. It would have been either Dmitry or John who came up with that idea (I wouldn’t be surprised if Dmitry suggested it, because he was familiar with the work in Kotlin).
That’s really surprising!
I’m surprised to hear this is a problem, especially if the target language has some sort of dependency graph. In particular, I would have imagined that having some sort of package graph or import graph would reduce this problem to simply taking the transitive closure of something.
In Ruby, constant resolution is a fundamentally global operation: constants can come from anywhere, and you discover them in a fixed-point loop that takes turns resolving ancestor information (specified with constant literals) and resolving constant literals (whose meaning depends on ancestor information). So in essence, the dependency graph changes on every edit.
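As a toy model of that fixed point (invented data layout, nothing like Sorbet’s actual resolver): superclasses are written as constant references, and constant references are looked up through the ancestor chain, so the two steps alternate until nothing new resolves.

```python
def resolve(classes, toplevel):
    """classes: name -> {"superclass_ref": constant name or None,
                         "constants": {constant name: class name}}
    toplevel: constants visible everywhere (constant name -> class name)."""
    superclass = {}  # resolved so far: class name -> superclass class name

    def lookup(ref, from_cls):
        # Search from_cls, then its resolved-so-far ancestors, then top level.
        cur, seen = from_cls, set()
        while cur is not None and cur not in seen:
            seen.add(cur)
            if ref in classes[cur]["constants"]:
                return classes[cur]["constants"][ref]
            cur = superclass.get(cur)
        return toplevel.get(ref)

    changed = True
    while changed:
        changed = False
        for name, cls in classes.items():
            ref = cls["superclass_ref"]
            if ref is None or name in superclass:
                continue
            target = lookup(ref, name)     # constant resolution...
            if target is not None:         # ...feeds ancestor information,
                superclass[name] = target  # which feeds later constant lookups.
                changed = True
    return superclass
```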
For the “names are too coarse” problem I have vaguely thought about culling the space using Stripe’s in-progress Ruby package system (which doesn’t depend on constants but instead on file structure) but I haven’t had a chance to do that yet.
IIUC, IntelliJ also doesn’t do that. Rather, it’s essentially a global table of stuff, which you can query in various ways and which is transparently loaded from/saved to disk. In IJ, this infra is very dynamic (e.g., language plugins register new kinds of things they index at runtime), so I imagine it’s fairly straightforward to write code to “delete all stuff from this file” which would work for any bit of data stored. I think in Sorbet everything is static (because you only need to deal with Ruby), so I imagine the equivalent code is “for every data structure, do this custom thing”, which is quite hard to get right indeed!
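Something like the following shape (an assumed sketch, not IntelliJ’s actual API): each index keeps its entries keyed by file, so one generic loop can invalidate any kind of indexed data a plugin has registered.

```python
from collections import defaultdict

class FileKeyedIndex:
    def __init__(self, extract):
        self._extract = extract            # file text -> list of entries
        self._by_file = defaultdict(list)  # file -> its contributions

    def remove_file(self, path):
        self._by_file.pop(path, None)

    def index_file(self, path, text):
        self._by_file[path] = self._extract(text)

class IndexRegistry:
    def __init__(self):
        self._indices = {}

    def register(self, name, extract):
        # Plugins register new kinds of indexed data at runtime.
        self._indices[name] = FileKeyedIndex(extract)

    def on_file_changed(self, path, new_text):
        # One generic loop handles invalidation for every registered index.
        for index in self._indices.values():
            index.remove_file(path)
            index.index_file(path, new_text)
```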
who hate it when their editor tells them one thing but CI tells them another thing
Uhu, that’s why this distinction should be visible in the editor UX. The way it works in IntelliJ, “on the fly errors” are fully automatic, and then there’s a special “build” button/shortcut which does what CI would do, so the user is aware which is which. And this leads to an interesting workflow: when I am changing something, I want to first write new code, ignoring errors elsewhere, and then have a separate, single session of walking through compile errors across the project. So: a) do the local refactor, b) run the build, c) go through all the errors one by one.
Sorbet allows the typechecking operation to be pre-empted or cancelled as new edits come in. I suppose you could conceptualize it as a “build” command that Sorbet runs automatically for every edit once resolver has finished
Yeah, that makes sense. If you can also service goto definition and completion before everything is type checked, that should give you the best of both worlds: very fast direct requests, plus fast but asynchronous global error reporting. Basically, do what’s visible immediately, do everything else eventually.
Does Sorbet implement the “semantic tokens” LSP API? That’s the best stress test for “how fast the thing is interactively”.
That has worked with no changes since it was first implemented years ago
My surprise comes from seeing how coarse-grained incrementality was added to Kotlin and Rust, and in both cases quite a few nasty miscompilation bugs were discovered after incremental compilation was stabilized and publicly rolled out. I have two hypotheses for why it has turned out to be better for you:
it looks like this feature was designed in from the start, which should help.
Sorbet is not a compiler, so you might not notice the absence of errors due to some bugs in change tracking.
It does not. I agree that would be a good stress test. In practice our stress test is how long obviously wrong red squiggles persist, and how often completion requests are cancelled because the user typed faster than the request arrived
Sorbet is not a compiler, so you might not notice the absence of errors
Ah yes. I would not want this feature powering compilation! In general, I do not envy people in the position of actually using the type checker IR for compilation
The Sorbet Compiler uses Sorbet’s typechecking IR, but was never run from an editor/incremental context, so the name-based approach never affected compilation (mostly because the experiment was focused on production latency, not yet on developer velocity)
I was baited into this being a cooking recipe.