I found this fascinating. I’d assumed that modules would speed up incremental builds, because you don’t need to recompile the same headers multiple times. But because modules are coarser-grained than headers, it looks as if they often result in longer incremental compile times: changing a header recompiles everything that uses that module, rather than just everything that uses that header.
Sony’s ‘compilation database’ approach with fine-grained dependency tracking and caching would likely be a win here and should compose very cleanly with modules.
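To make the granularity point concrete, here is a toy sketch in Python of which files get rebuilt after a single header changes. The file names, the module name `util`, and both helper functions are made up for illustration, and the model ignores transitive includes.

```python
# Toy model: which translation units rebuild after one header changes?
# All file and module names here are hypothetical.

# Header-based build: each source file lists the headers it includes.
includes = {
    "a.cpp": {"vec.h"},
    "b.cpp": {"map.h"},
    "c.cpp": {"vec.h", "str.h"},
}

# Module-based build: the same headers are bundled into one module,
# and every source file imports the whole module.
module_headers = {"util": {"vec.h", "map.h", "str.h"}}
imports = {"a.cpp": {"util"}, "b.cpp": {"util"}, "c.cpp": {"util"}}


def rebuild_with_headers(changed):
    """Sources that include the changed header directly."""
    return sorted(src for src, hdrs in includes.items() if changed in hdrs)


def rebuild_with_modules(changed):
    """Sources that import any module containing the changed header."""
    dirty = {m for m, hdrs in module_headers.items() if changed in hdrs}
    return sorted(src for src, mods in imports.items() if mods & dirty)


print(rebuild_with_headers("map.h"))  # ['b.cpp']
print(rebuild_with_modules("map.h"))  # ['a.cpp', 'b.cpp', 'c.cpp']
```

With per-header tracking only `b.cpp` is rebuilt; with the single coarse module, all three files are.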
Slightly off topic: do you know whether precompiled headers still have any effect? Back in the 90s they were a huge speedup in C++, but lately I’ve tried enabling them in Xcode/Clang, and they seem to make no difference at all in build times … unless I touch one of the headers that gets precompiled, of course.
I suspect that they’d have the same problem. Precompiled headers are typically used in combination with prefix headers, and modules are largely implemented on the same machinery. The problem with prefix headers is that any change to a header in the prefix triggers recompilation of all of the files that use it.
I guess there’s a corollary here that modules probably are a win for platform-provided headers. Apple has used modules for Cocoa.h (which, last time I looked, was around 8 MiB of text that was parsed for almost every Objective-C compilation unit) for a while. That changes only when you update the SDK, so it should be a win for incremental compilation.
Objective-C (and C) are probably somewhat different because what’s in their header files rarely generates code. There are a few things provided as static inline functions, but header files are generally just there to provide a symbol table. The graphs in the article show that most time in the compiler is spent in IR generation and optimisation, so those languages will probably benefit a lot less from modules. Parsing headers is also already cheap: faster preprocessing was a priority for clang from the start, because ccache and distcc both run the preprocessor unconditionally and so were easily bottlenecked by gcc’s preprocessing speed. Hot header files are probably in the buffer cache, so you won’t be bottlenecked on disk read speed. In contrast, C++ headers typically contain templates that provide code that is emitted in the final binary, so a module that already has certain common template specialisations pre-computed will save time in the AST building phase. Sony’s Compilation Database also caches the IR and even generated binary code for template instantiations, so only one compilation unit needs to instantiate std::vector<int>’s methods (for example) and the others can all use the AST, IR, and the binary code for non-inlined methods.
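The "only one compilation unit needs to instantiate it" idea can be sketched as a shared cache keyed by instantiation name. This is a toy Python model, not Sony’s actual design: the cache layout, `instantiate`, and `compile_unit` are all invented for illustration.

```python
# Toy model of a shared template-instantiation cache across translation units.
# Not Sony's design; every name here is hypothetical.

cache = {}       # instantiation name -> "compiled" artefact
work_done = []   # records each time an instantiation is actually built


def instantiate(name):
    """Return the artefact for an instantiation, building it only once."""
    if name not in cache:
        work_done.append(name)  # stands in for AST building, IR and codegen
        cache[name] = f"code for {name}"
    return cache[name]


def compile_unit(uses):
    """Compile one translation unit that needs the given instantiations."""
    return [instantiate(name) for name in uses]


# Three translation units, all of which use std::vector<int>.
compile_unit(["std::vector<int>", "std::string"])
compile_unit(["std::vector<int>"])
compile_unit(["std::vector<int>", "std::map<int, int>"])

print(work_done)
```

Here `std::vector<int>` is built by the first unit only; the other two reuse the cached result, whereas a plain per-file build would repeat that work in every unit.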