1. 49
    1. 28

      Oh, modules in Rust, my favorite topic!

      As usual, I would say it’s important to keep two distinct contexts in mind:

      • a crates.io library with a semver-guarded API which has an open-ended set of consumers and must be evolved compatibly
      • an internal component of an application, with a closed set of usages, which can be updated relatively easily during refactors. The component still has an API, but it is not guarded by semver, and can allow itself to be much more transparent

      I think that the observations from the post make a lot of sense in the application context, but not so much in the library context. In an application, where you can be much more fluid and agile, you often want most of the modules to be relatively thin and transparent. You don’t want hard boundaries, at least for green code, as they get in the way of refactors.

      For libraries though, “size of the crate” is not the interesting metric. There, what you care about is the ratio between the size of the API surface and the size of the implementation. Great libraries have relatively small APIs, and relatively large and complex implementations (in contrast, in applications you can expect many modules to have proportional API/impl ratios).

      Rust’s module system is just about perfect for libraries: you can perfectly size both API and impl, and the language mostly gets out of the way here (cf. the emergent practice of header-only libraries in C++). For applications, yeah, there’s too much friction. I arrived at a specific routine here (https://matklad.github.io/2021/08/22/large-rust-workspaces.html), and splitting out subcrates is automatic for me now, but that’s still more toil than necessary.
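
      A minimal sketch of what a small API over a larger implementation can look like, assuming a hypothetical word-counting library (every name below is made up for illustration). Only `word_count` is public; the helpers live in a private module and can be reshaped freely without touching semver:

      ```rust
      // Private implementation module: nothing in here is part of the semver-guarded API.
      mod imp {
          /// Splits the input on whitespace.
          pub(super) fn tokenize(input: &str) -> Vec<&str> {
              input.split_whitespace().collect()
          }

          /// Strips leading/trailing non-alphanumeric characters and lowercases the rest.
          pub(super) fn normalize(word: &str) -> String {
              word.trim_matches(|c: char| !c.is_alphanumeric())
                  .to_lowercase()
          }
      }

      /// The entire public API surface of this hypothetical library.
      pub fn word_count(input: &str) -> usize {
          imp::tokenize(input)
              .into_iter()
              .map(imp::normalize)
              .filter(|w| !w.is_empty()) // tokens that were pure punctuation do not count
              .count()
      }
      ```

      Here `word_count("Hello, world! --")` counts two words, because the trailing `--` normalizes to an empty string and is filtered out.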


      But, in the case of applications, my bigger criticism would be that the logical organization into crates, and the physical compilation model via monomorphization, work against each other. Not as bad as in pre-modules C++, but this still creates a bunch of weird best practices:

      • either you enable LTO (non-default and slow), or you need to add magic inlines.
      • generics scale quadratically (different codegen units get separate copies of identical monomorphizations), so there’s pressure to contort the code to not have generics at the boundaries between the crates.
      • default Cargo project layout encourages many binary artifacts, which adds up to quadratic linking time (https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html).
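
      A sketch of the kind of contortion the first two points lead to, assuming the generic parameter exists only for caller convenience (the function names are hypothetical): an explicit `#[inline]` on a tiny non-generic function, and a thin generic shim that delegates to a non-generic inner function so that only the shim is instantiated per caller type.

      ```rust
      use std::path::{Path, PathBuf};

      /// Tiny non-generic helper. Without LTO, the explicit `#[inline]` is what
      /// makes its body available for inlining into other crates.
      #[inline]
      pub fn is_hidden(path: &Path) -> bool {
          path.file_name()
              .and_then(|name| name.to_str())
              .map_or(false, |name| name.starts_with('.'))
      }

      /// Generic shim at the crate boundary: this is the only part that gets
      /// monomorphized separately for each `P` used by downstream code.
      pub fn normalized<P: AsRef<Path>>(path: P) -> PathBuf {
          normalized_inner(path.as_ref())
      }

      /// Non-generic body: compiled once, in the defining crate.
      fn normalized_inner(path: &Path) -> PathBuf {
          // Re-collecting the components drops redundant separators and
          // non-leading `.` segments.
          path.components().collect()
      }
      ```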
      1. 3

        Hi, thanks for chiming in!

        Yeah I definitely agree that the points in the article are based on working with large apps, not so much with libraries. And yeah I’ve already read your routine re: organizing workspaces before, very helpful.

        And thanks for outlining those points about weird best practices.

      2. 1

        Thanks so much for your ‘flat is best’ blog post. I had written on some other site that seems to be having issues today, that the rust module system doesn’t map to the directory hierarchy. Your post succinctly shows that, and I’d say it’s a feature. The problem for me is when I crack open someone else’s repo who is using hierarchies and wonder why I can’t use a module in crates/firstlevel/secondlevel as firstlevel::secondlevel::...

        And note that none of my confusion is really Rust’s fault; it’s just that having used a variety of languages that did do that can make Rust’s more straightforward approach hard to grasp.

    2. 17

      Go was born out of frustration with C++ projects that compiled slowly and had dependencies that got out of hand. This is, IMO, the main reason to use Go in the first place. No wonder Go has solved this well.

    3. 9

      You might have noticed that when I’m talking about modules being organized as a tree, I say that it seems to be neatly organized. Because the dependencies are acyclic there, modules aren’t actually organized, this tree structure merely provides an illusion, while underneath there is likely a dependency mess, and there can be hundreds of files in that mess, and Rust seemingly encourages that; otherwise why would we need modules to be organized in a tree in the first place.

      Let me comment on the second sentence. It is long, with seven clauses:

      Clauses 1 to 6

      Because the dependencies are acyclic there, …

      Yes, this is correct.

      modules aren’t actually organized,

      (Either I’m misunderstanding and/or I disagree.) Modules are organized in a hierarchical (i.e. a tree) namespace.

      this tree structure merely provides an illusion,

      The tree structure provides no ‘illusion’ of dependencies. That is a separate concern. The tree structure describes the namespacing only. It would only be an illusion if one expected the namespace to also convey dependency information.

      while underneath there is likely a dependency mess

      I would say it this way: looking only at the structure of a namespace tree*, in general, provides zero information about dependencies. (When I say “only looking at the structure of a namespace tree”, I mean treating the names as opaque. Opaqueness means that particular names have no isolated meaning; the name themselves don’t provide any conceptual or contextual linkages.)

      and there can be hundreds of files in that mess,

      (granted, there could be hundreds of files)

      and Rust seemingly encourages that;

      The claim is that Rust encourages a dependency mess by way of its design decision around modules and crates. Am I characterizing the author’s position correctly? What is the core reasoning here?

      Solution(s)?

      What specifically could Rust do (by which I mean language design decisions) to improve the dependency mess problem?

      Asking the question in the above way, to me, immediately begs the question: perhaps code analysis tools, not language design choices, would be more effective? (Language design tradeoffs and interactions are fiendishly complex, hard to predict, and hard to change!)

      What do other languages do?

      If I grant that Rust is encouraging a dependency mess, isn’t this also true of every other language that organizes namespaces in a tree? (Note: I do not intend to use the negative aspects of ‘whataboutism’ here; I am not dismissing the underlying concern. Instead, I’m asking to see if I’m properly understanding the problem. If I am, and almost every other language fails at this, perhaps we as designers are failing – or perhaps the problem involves some impossibility – or some kind of stark tradeoff.)

      My three claims

      Sorry to break the flow of breaking down the sentence, but let me make these claims:

      1. namespacing is useful.
      2. namespacing by way of trees is so common to the point of ubiquity. [1]
      3. decomplecting namespacing from dependencies is a great design choice

      With these three points in mind, I’ll ask this: Could it be any other way? By this I mean: If a language has namespace-qualified modules and organizes them on a tree-based filesystem, is there any getting around the problem that the dependency relationships are not apparent from that ‘viewpoint’?

      Back to Clause 7

      otherwise why would we need modules to be organized in a tree in the first place.

      By this point, I can’t be confident that I am understanding the author’s intent.

      Neither can I seem to find an interpretation that makes sense to me.

      Generalizing the Argument?

      I know the author is making the claim that Go has advantages here. I’m open to that. My “big picture” goal here is to tease apart the concepts the author is referencing and “lift them” out of the particulars of Rust and Go. This is important, because I get the feeling that the Venn diagram of people that really like both is quite small. For those that do like both, I’m highly confident they like each language for different reasons.

      Notes

      [1]: I can think of a notable exception in the wonderful Unison language. The ‘bottom-most’ (immutable actually! very cool) identifiers are hash-qualified identifiers corresponding to hashed content. Unison also has other kinds of qualifiers, such as namespace-qualified identifiers which refer to the hashes. [2]

      [2]: Hierarchical namespacing thus seems to be ‘in the water’. This doesn’t mean we shouldn’t examine the idea, though.

      P.S. I put in considerable thinking to not simply repeat the same points from a HN comment I made on the same paragraph from Dmitry’s blog post. I took a different approach and tone. And I’ve come away much more curious about language design and how it achieves organization by way of namespacing and dependencies.

      1. 4

        Thanks for the thoughtful comment.

        Re: modules not being all that organized: my point is that, at least in my experience so far, organizing by namespace is a lot less useful if dependencies are decoupled from namespaces. Organization should generally make things easier to work with, and here the module tree looks kinda neat, but it gives a false sense of organization: if one wants to reorganize things by factoring something out of this crate, that thread is likely to unravel hard, due to dependencies being all over the place. Some people also said that they don’t take advantage of the cyclic dependencies there, and keep them acyclic; but that also suggests that allowing cyclic dependencies at that level was probably a poor idea. I understand that you and some other people disagree with this, and it’s fine, I don’t expect everyone to agree.

        The claim is that Rust encourages a dependency mess by way of its design decision around modules and crates. Am I characterizing the author’s position correctly? What is the core reasoning here?

        The point was that Rust encourages large crates (with hundreds of files in them), with cyclic dependencies between modules within a crate, despite the fact that this is likely to result in a dependency mess which is hard to unravel. As opposed to Go, which encourages packages of reasonable sizes.
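
        To make “cyclic dependencies between modules within a crate” concrete, here is a small sketch with two hypothetical sibling modules that import from each other. It compiles as a single crate; the same shape split into two crates would be rejected by Cargo, and the equivalent pair of Go packages would be rejected as an import cycle.

        ```rust
        // Two sibling modules in one crate that depend on each other.
        mod customer {
            use super::order::Order; // customer -> order

            pub struct Customer {
                pub id: u32,
                pub name: String,
            }

            /// Sums the totals of the orders that belong to this customer.
            pub fn total_spent(customer: &Customer, orders: &[Order]) -> u64 {
                orders
                    .iter()
                    .filter(|o| o.customer_id == customer.id)
                    .map(|o| o.total_cents)
                    .sum()
            }
        }

        mod order {
            use super::customer::Customer; // order -> customer, closing the cycle

            pub struct Order {
                pub customer_id: u32,
                pub total_cents: u64,
            }

            /// Formats a one-line description that needs data from both modules.
            pub fn describe(order: &Order, customer: &Customer) -> String {
                format!("{} owes {} cents", customer.name, order.total_cents)
            }
        }
        ```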

        What specifically could Rust do (by which I mean language design decisions) to improve the dependency mess problem? Asking the question in the above way, to me, immediately begs the question: perhaps code analysis tools, not language design choices, would be more effective?

        I think at this point it might be a bit too late to change much of the language decisions, but one thing that can be done is to at least allow a crate’s Cargo.toml to not specify dependencies at all, fully inheriting them from the workspace. Right now we already can inherit at least the versions - that’s better than nothing, but it’s still required to specify every dependency explicitly. Would be nice to have an option to not do that.
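
        For reference, a sketch of what the existing inheritance looks like today, assuming a hypothetical workspace with one member crate called `app` (the crate name, paths, and the choice of `serde` are illustrative only). The version and features live in the workspace root, but each member still has to list the dependency by name:

        ```toml
        # Workspace root: Cargo.toml
        [workspace]
        members = ["crates/*"]

        [workspace.dependencies]
        serde = { version = "1", features = ["derive"] }

        # Member crate: crates/app/Cargo.toml (hypothetical)
        [package]
        name = "app"
        version = "0.1.0"
        edition = "2021"

        [dependencies]
        # Version and features are inherited, but this line is still required.
        serde = { workspace = true }
        ```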

        As to adding code analysis tools - yeah again this would be better than nothing, but if the system simply encourages good practices which would eliminate the need of more tooling, isn’t it a trait of a well-designed system?

        But practically speaking, yes, at this point we can hardly do any better than adding more tools.

        If I grant that Rust is encouraging a dependency mess, isn’t this also true of every other language that organizes namespaces in a tree?

        It is indeed true that Rust is not alone in this. And it’s kinda surprising that Rust, overall being very strict as a language, just went with the herd when it comes to dependency management, and allowed it to be very lax. And also, one could say it hurt Rust probably more than some other languages using a similar dependency management design, because (1) the unit of compilation is a crate, which can be huge as discussed, and (2) the compilation in Rust is inherently slow, since Rust has to do a lot of work there.

        1. namespacing is useful.
        2. namespacing by way of trees is so common to the point of ubiquity.
        3. decomplecting namespacing from dependencies is a great design choice

        I agree with 1 and 2, but don’t agree with 3. By keeping deps and namespaces out of sync, we just introduce more surprises for the reader of the code, from my experience.

        otherwise why would we need modules to be organized in a tree in the first place.

        By this point, I can’t be confident that I am understanding the author’s intent.

        Neither can I seem to find an interpretation that makes sense to me.

        Organizing modules as a tree within a crate encourages crates of large sizes. And that tree being decoupled from the dependencies, which can even be cyclic, encourages a dependency mess.

        So as a result, instead of having numerous crates of reasonable sizes, we often have fewer crates of larger sizes. Since deps inside of a crate can be cyclic, the code is less organized as a whole, compared to numerous crates of reasonable sizes. And it also slows down compilation.

        1. 1

          Thanks for discussing.

          1. decomplecting namespacing from dependencies is a great design choice

          … but [I] don’t agree with 3. By keeping deps and namespaces out of sync, we just introduce more surprises for the reader of the code, from my experience.

          I’m not following how any language could synchronize “dependencies and namespaces”. This is what I mean, in the general sense: How can dependencies and namespacing be synchronized, given that namespaces are trees and dependencies are graphs? (By synchronize I mean a one-to-one mapping.) I don’t think they can. Some assumption would have to be relaxed to make a 1:1 mapping. This leads me to think you are making a different point.

          Caveat: I should admit that I have gaps in my knowledge of how Go works, so my apologies if this ignorance is limiting my understanding. I won’t be bothered at all if you spell out what should be completely obvious.

          1. 5

            Oh sure, sorry for not being clear enough, let me be more explicit:

            In Go, if you have a directory tree like that:

            mypackage/
            ├── bar/
            │   └── subbar/
            ├── baz/
            └── foo/
            

            It means we have exactly 5 packages, and all those packages can only depend on each other in an acyclic way (like crates in Rust). For example, all four subpackages (bar, subbar, baz, and foo) can depend on mypackage, and bar can also depend on subbar, but then mypackage cannot depend on any of those, because that would introduce a cycle. Or they can depend on each other in some other way, but again there should be no cycles.

            So it doesn’t have to match the exact directory tree 1:1, but still every directory in a tree is a distinct node in an acyclic dependency graph. So we have 1:1 mapping of the nodes in the directory tree and in the dependency graph. Does it make more sense now?
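
            As a rough model of that rule (written in Rust only to keep one language in this thread; this is not how the Go toolchain implements it), each directory becomes a node, each import becomes an edge, and the layout is accepted only if a depth-first search finds no cycle. The function names are hypothetical.

            ```rust
            use std::collections::HashMap;

            /// Depth-first search over the import graph.
            /// State values: 1 = currently on the DFS stack, 2 = fully explored.
            fn visit<'a>(
                pkg: &'a str,
                imports: &HashMap<&'a str, Vec<&'a str>>,
                state: &mut HashMap<&'a str, u8>,
            ) -> bool {
                match state.get(pkg).copied() {
                    Some(1) => return true,  // reached a package still on the stack: cycle
                    Some(_) => return false, // already fully explored, no cycle through here
                    None => {}
                }
                state.insert(pkg, 1);
                if let Some(deps) = imports.get(pkg) {
                    for dep in deps {
                        if visit(*dep, imports, state) {
                            return true;
                        }
                    }
                }
                state.insert(pkg, 2);
                false
            }

            /// Returns true if the graph (package -> packages it imports) contains a cycle.
            pub fn has_import_cycle<'a>(imports: &HashMap<&'a str, Vec<&'a str>>) -> bool {
                let mut state = HashMap::new();
                imports.keys().any(|pkg| visit(*pkg, imports, &mut state))
            }
            ```

            With the five packages above, where the four subpackages import mypackage and bar also imports subbar, the check passes; adding an import from mypackage to bar makes it fail.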

    4. 4

      For what it’s worth, Go also has workspaces since 1.18 as a way to organize groups of modules. This is quite similar to Rust workspaces, conceptually. It was necessary because in practice, many projects consist of multiple modules which also need to be developed concurrently, and this was inconvenient.

    5. 3

      It’s much harder to refactor Go packages though. In Rust you can copy things around and fix a few use statements. You can also forward things for compatibility reasons. Not so (nice) in Go.

      1. 1

        In Rust … You can also forward things for compatibility reasons.

        What do you mean by forward? (Was this a typo? Some way of talking about it I haven’t heard of?)

        Do you mean refer? re-export?

        1. 4

          I believe they mean re-export in this case. Specifically, the ability to extract code into a new crate and then re-export its contents from the original crate to avoid breaking downstream users. At least that’s how I read it.
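
          A minimal sketch of that reading, with a private module standing in for the newly extracted crate (all names here are hypothetical). The old path keeps resolving because the original location re-exports the moved item with `pub use`:

          ```rust
          // Stand-in for the newly extracted crate that now owns the type.
          mod shapes_core {
              pub struct Circle {
                  pub radius: f64,
              }

              impl Circle {
                  /// Area of the circle.
                  pub fn area(&self) -> f64 {
                      std::f64::consts::PI * self.radius * self.radius
                  }
              }
          }

          // The original location: downstream code that names `geometry::Circle`
          // keeps compiling, because the item is re-exported from its old path.
          pub mod geometry {
              pub use super::shapes_core::Circle;
          }
          ```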

      2. 1

        There are a lot of different code rewrite tools for Go. When I need to move something from one package to another internally, it’s not a big deal. For externally visible moves, it’s a bigger deal because it breaks the API, but you can usually set up forwarding to make it work.

    6. 3

      Nice article. As a casual reader of Go and Rust, it does seem to me that Rust project layouts are more baroque.

      1. 3

        The layout of Rust modules is by far the most confusing thing to me in the language. It feels like it was designed by a “genius” who when asked about it says, “it is simple really, you just ….”

        Rust should encourage lots of well-designed, minimal libraries that can be easily composed. I am going to have to spend more time understanding matklad’s posts above.

        1. 3

          It feels like it was designed by a “genius” who when asked about it says, “it is simple really, you just ….”

          I think this result is a combination of two things:

          • Rust’s module system actually brings two new, innovative aspects, which are not found in other languages:
            • libraries (crates) as a first-class language concept distinct from that of modules
            • actually hierarchical modules (Java reverse-dns notation looks hierarchical, but package names are mostly opaque strings, there’s no real nesting)
          • At the same time, the “concrete syntax” for expressing these ideas is unnecessarily complex. The mapping between the file system and logical modules is quirky. The rules were also changed in the 2018 edition, which is both evidence that the “surface” parts of the system were not great, and additional complexity for people to deal with now.
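
          A small sketch of the “real nesting” point (module and function names are hypothetical): privacy follows the module tree, so a child module can use a private item of its parent, while code outside the parent cannot.

          ```rust
          mod parser {
              // Private to `parser`, yet visible to everything nested inside it.
              fn is_separator(c: char) -> bool {
                  c == ','
              }

              pub mod csv {
                  /// Splits a line into fields using the parent's private helper.
                  pub fn fields(line: &str) -> Vec<&str> {
                      line.split(super::is_separator).collect()
                  }
              }
          }

          /// Code outside `parser` can reach `parser::csv::fields`,
          /// but not `parser::is_separator`.
          pub fn first_field(line: &str) -> &str {
              // `split` always yields at least one item, so index 0 is safe.
              parser::csv::fields(line)[0]
          }
          ```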
        2. 2

          The thing about minimal libraries is that if they’re super minimal you tend to look at them and say “why am I bothering with this instead of just writing it myself?” The problem with designing things around composition is that making libraries work together transparently often introduces dependencies or adds overhead.

          I have a well-designed (IMNSHO), minimal, composable RNG crate, great. I want to make it possible to serialize/deserialize the RNG state with serde. Makes sense, but now I depend on serde with the versioning headaches that involves. (One of the great things about serde is that if you believe their semver they haven’t had a breaking change in about 5 years, which I mostly believe, and I can only imagine the headaches it has caused them.) Oh, well for the people who don’t need that, I can add a feature flag that lets them only include serde when they need it. …My lib is not looking so minimal anymore.
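
          Roughly what that ends up looking like, as a sketch rather than my actual crate: a hypothetical xorshift generator whose serde derives are only compiled when a `serde` feature is enabled. This assumes `serde` (with its `derive` feature) is declared as an optional dependency in Cargo.toml, which implicitly creates a feature of the same name; without the feature the crate builds with no dependencies at all.

          ```rust
          /// Hypothetical minimal RNG state.
          #[derive(Debug, Clone, PartialEq)]
          // The serde derives are only compiled in when the `serde` feature is on.
          #[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
          pub struct Xorshift64 {
              state: u64,
          }

          impl Xorshift64 {
              /// Creates a generator; xorshift state must be non-zero, so 0 is remapped to 1.
              pub fn new(seed: u64) -> Self {
                  Self {
                      state: if seed == 0 { 1 } else { seed },
                  }
              }

              /// Advances the state with the 13/7/17 xorshift step and returns it.
              pub fn next_u64(&mut self) -> u64 {
                  let mut x = self.state;
                  x ^= x << 13;
                  x ^= x >> 7;
                  x ^= x << 17;
                  self.state = x;
                  x
              }
          }
          ```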

          One can argue that some of that is precisely the result of how Rust does libraries. But it’s a good example of how you can’t just magically Voltron together arbitrary libraries, afaict in any language. Making it possible is work.

          1. 4

            Modules (as in ML modules) should solve this problem. It’s funny how a combination of conditional compilation and traits is used in lieu of a module system.

            1. 2

              I mean, in theory. We will see I suppose!

            2. 2

              I am really excited by the work around Compile Time features in languages like Zig, Dlang and C++. Yes Rust has some comptime, but I think Zig really showed how far this facility can go.

              I am still amazed by what the Rust community has been able to pull off. 🤞🏽 they continue to have an open mind and attract the most creative, pragmatic doers. This has been their greatest power.