Abstract:
GHC is the de facto main implementation of the Haskell programming language. Over its 30-year history it has served well the needs of pure functional programmers and researchers alike. However, GHC is not exemplary of good large-scale system design in a pure functional language. Rather ironically, it violates the properties that draw people to functional programming in the first place: immutability, modularity, and composability. These scars have become more noticeable as modern projects currently underway, such as the Haskell Language Server and cross-compilation, aim to fulfill user needs and desires far more diverse than before.
We believe a better GHC is possible. We write this paper to properly situate both the current state of GHC’s codebase and that better future state in the design space of large-scale, pure, functional systems. Firstly, we document in detail GHC’s architectural problems, such as low coherence and high coupling of mutable state, and their genesis. Secondly, we describe what we believe to be a superior design, drawing heavily on domain-driven design principles. Lastly, we sketch a plan to get this design implemented iteratively and durably, mentioning interactions with other ongoing refactorings (structured errors, Trees That Grow, etc.).
All of this is informed not just by our own experience working on GHC and deep dives into its history, but also by the traditional software engineering literature. The paper is written from an engineering perspective, with the hope that our collection and recapitulation may provide insight into future best practices for other pure functional software engineers.
Golden rule for maintaining a big, old project: leave the code no worse than you found it. Don’t add new crap, and try to cut some of the old crap with every new patch.
Pushing the ugly hacks/contracts/assumptions to the periphery with creeping refactoring is the one practice that has helped me keep my codebase sane for years.