What is currently done by the Debian Rust Team is to manually patch the dependency specifications, to relax them. This is very labour-intensive, and there is little automation supporting either decisionmaking or actually applying the resulting changes.
Two years later, this is hilarious to read. For two years, they shipped a version of bcachefs-tools that simply didn’t work at all. Why? Well, they were shipping bindgen 0.66, which had a severe miscompilation bug that made bcachefs-tools simply not work. bcachefs-tools specified a minimum of 0.69, the version that fixed the bug, but the Debian maintainers patched it to allow 0.66. They shipped that broken version for two years in testing, and then decided to simply drop the package, because it was “too hard” to maintain.
It doesn’t necessarily apply to this scenario, but in general terms if someone said “I’m trying to maintain a stable distribution here; your software is neat but the dependency sprawl caused by your language choice is more or less YOLO so I’m just not going to ship it” then I think that would be a defensible position.
The only thing that’s YOLO is Debian’s choice to randomly patch stuff that they don’t understand. This isn’t just an issue with Rust, they’ve shipped horribly broken C stuff too!
So-called dependency sprawl is a red herring. Two different C programs can depend on two different versions of a C library. It’s a thing that happens. And Debian either packages two versions of the library or they break things. It’s been a problem for decades at this point. Rust just took a sledgehammer to the issue and made it impossible to ignore that Debian’s practices don’t really work all that well.
I think it would be a defensible position if they found a better way to deal with breakage such as the above. A number of places use a similar strategy.
It’s been a frequent issue with Debian that they would stick with horribly broken packages in the name of policy, e.g. it has driven a lot of Rubyists away from using apt. (Won’t go into details, but Debian Ruby was broken between 1.9 and 2.x for a long stretch, to the point where no one was willing to assist someone using it.)
Context: https://jonathancarter.org/2024/08/29/orphaning-bcachefs-tools-in-debian/
How does Debian handle Go? It sounds like a similar problem in principle.
I think that for a long time, Linux distros got away with it by ignoring most languages with modern package managers. I guess they had to tackle Python since a lot of distro tooling is in that language, but Python software doesn’t tend to have hundreds of dependencies like Rust does.
A lot of serious systems software being built in Rust has finally forced a reckoning.
Python is very much not built around “modern package managers”. They exist now - as they do for C and C++ - but the language itself and its core tooling is stuck firmly in PYTHONPATH and site-packages land.
The language itself is, but the programs that distros ship (e.g. sphinx) are usually developed upstream with pip and setuptools.
Nitpick: Rust does support dynamic linking, but it doesn’t come for free as with C or C++: https://doc.rust-lang.org/reference/linkage.html#linkage.
With --crate-type=dylib all dependencies have to be compiled with the same rustc version since there’s no stable Rust ABI, but I don’t think you need to change any source code.
With --crate-type=cdylib the C ABI is used so you can use any compiler and even access things from other languages, however I think you need to edit the source to add extern to all exported functions and use extern blocks in the depending code to access those functions.
(I haven’t used these flags before so the above might be inaccurate)
In addition, a problem that Rust shares with C++ is that upgrades of libraries usually require a recompilation of all dependents because of generics/templates and inline functions.
(C also has macros and inline functions, but library authors are more often conscious of keeping ABI compatibility.)
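To make the --crate-type=cdylib point above concrete, here is a minimal sketch of the producer side (the crate name and function are invented for illustration):

```rust
// Cargo.toml of a hypothetical crate "greeter":
//
//   [lib]
//   crate-type = ["cdylib"]
//
// src/lib.rs — exported through the C ABI, so any compiler (or language)
// that can call C functions can call this.

/// Unmangled symbol name + C calling convention.
#[no_mangle]
pub extern "C" fn greeter_add(a: i32, b: i32) -> i32 {
    a + b
}
```

Building that produces a shared library with a plain C symbol table, which is what makes it usable from other compilers and languages.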
Monomorphism is just not compatible with dynamic linking. This is an inherent structural limitation.
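To put the structural problem in Rust terms: a generic function is stamped out separately for every concrete type it is called with, so there is no single compiled body a shared library could export, whereas a trait-object version has one body that dispatches through a vtable at runtime. A rough sketch:

```rust
use std::fmt::Display;

// Monomorphised: the compiler generates a separate copy of this function
// for every T it is called with, inside the calling crate.
fn describe_generic<T: Display>(value: T) -> String {
    format!("value = {value}")
}

// Dynamically dispatched: one compiled body, called through a vtable,
// so a single exported symbol could serve every implementor.
fn describe_dyn(value: &dyn Display) -> String {
    format!("value = {value}")
}

fn main() {
    println!("{}", describe_generic(42u32));    // instantiated for u32
    println!("{}", describe_generic("hello"));  // instantiated for &str
    println!("{}", describe_dyn(&3.14f64));     // one body handles all types
}
```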
Swift has a nice approach here, where generics within a shared library are monomorphised, but ones exposed across shared library boundaries fall back to dynamic dispatch.
And they have a book about the whole thing, which is beyond awesome!
https://download.swift.org/docs/assets/generics.pdf
Yeah, it’s a really cool solution to a very difficult problem presented to them by Apple management.
Sometimes I do wonder how things would be if Rust were very slightly less particular about being explicit than it is today. Things like this, not being able to change the representation of String to inline small strings, etc.
I still regard hard-coding a string representation as one of the worst mistakes a language can make. I’ve seen 100% end-to-end throughput increases from moving to alternative string representations that are tailored for specific workloads. And, in most cases, they’ve been different representations. In one case, strings were rarely mutated and almost all of them were very short ASCII single-word strings. Embedding them in pointers saved a ludicrous number of allocations. In other cases, in-place editing was the bottleneck and representing the string as skip lists of buffers (later changed to a different representation for the top level because modern CPUs don’t like skip lists) was a similar win.
In some cases, knowing that you’re dealing with ASCII simplifies a load of operations. In most western languages, UTF-8 is the best choice for Unicode, but if you’re a CJK language (and most of the text is in that language, not things like HTML where a large proportion is markup) then UTF-16 will have much better cache locality. I have never seen a case where UTF-32 is the best choice, but it might be for some niche use.
A standard library should provide interface types for strings but allow the representations to be pluggable. This is trivial with something like Rust, with rich generics and compile-time monomorphisation.
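As a rough sketch of what pluggable representations behind a common interface could look like (the trait and both types here are invented for illustration, not anything from the standard library):

```rust
// A minimal "string representation" interface; a real design would need far more.
trait StrRepr {
    fn as_str(&self) -> &str;
    fn push_str(&mut self, s: &str);
}

// Ordinary heap-backed representation.
struct HeapString(String);

impl StrRepr for HeapString {
    fn as_str(&self) -> &str { &self.0 }
    fn push_str(&mut self, s: &str) { self.0.push_str(s); }
}

// Fixed-capacity inline representation: no heap allocation at all.
struct InlineString {
    buf: [u8; 32],
    len: usize,
}

impl InlineString {
    fn new() -> Self { Self { buf: [0; 32], len: 0 } }
}

impl StrRepr for InlineString {
    fn as_str(&self) -> &str {
        // Only ever holds bytes copied from &str values, so this is valid UTF-8.
        std::str::from_utf8(&self.buf[..self.len]).unwrap()
    }
    fn push_str(&mut self, s: &str) {
        let bytes = s.as_bytes();
        assert!(self.len + bytes.len() <= self.buf.len(), "sketch only: capacity exceeded");
        self.buf[self.len..self.len + bytes.len()].copy_from_slice(bytes);
        self.len += bytes.len();
    }
}

// Code written against the interface is generic over the representation and
// gets monomorphised, so the abstraction itself costs nothing at runtime.
fn shout<S: StrRepr>(buf: &mut S) {
    let upper = buf.as_str().to_uppercase();
    buf.push_str(" -> ");
    buf.push_str(&upper);
}

fn main() {
    let mut heap = HeapString(String::from("hello"));
    let mut inline = InlineString::new();
    inline.push_str("hi");

    shout(&mut heap);
    shout(&mut inline);

    println!("{} / {}", heap.as_str(), inline.as_str());
}
```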
I’ll be honest, I think code being compile-time generic over string representations would add a lot more generics than is necessary.
I think the only reasonable approach for generic strings is dynamic implementation switching hidden behind the same static type. bytes::Bytes is a good example: it represents a contiguous series of bytes in memory that’s cheap to clone, but can switch out implementations dynamically. Bytes can represent both a ref-counted Arc-like thing, as well as simply a static slice.
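For concreteness, a toy version of that pattern (one public type, several internal representations picked at runtime); this is only the shape of the idea, not how bytes::Bytes is actually implemented:

```rust
use std::sync::Arc;

// Internal representation, invisible to callers.
#[derive(Clone)]
enum Repr {
    Static(&'static [u8]),  // borrows from the binary, clone is a pointer copy
    Shared(Arc<Vec<u8>>),   // ref-counted heap buffer, clone bumps a counter
}

// The one public type everybody's signatures mention.
#[derive(Clone)]
pub struct CheapBytes(Repr);

impl CheapBytes {
    pub fn from_static(bytes: &'static [u8]) -> Self {
        CheapBytes(Repr::Static(bytes))
    }

    pub fn from_vec(bytes: Vec<u8>) -> Self {
        CheapBytes(Repr::Shared(Arc::new(bytes)))
    }

    pub fn as_slice(&self) -> &[u8] {
        match &self.0 {
            Repr::Static(s) => *s,
            Repr::Shared(v) => v.as_slice(),
        }
    }
}

fn main() {
    let a = CheapBytes::from_static(b"hello");
    let b = CheapBytes::from_vec(vec![1, 2, 3]);
    let c = b.clone(); // cheap: only an Arc refcount changes
    println!("{} {} {}", a.as_slice().len(), b.as_slice().len(), c.as_slice().len());
}
```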
I’ll be honest, I think code being compile-time generic over string representations would add a lot more generics than is necessary
For a lot of programs where string manipulation is not a performance bottleneck, there would be a single instantiation of the generics across the entire program and so, assuming a moderately well designed compiler, the overhead would be small.
I think the only reasonable approach for generic strings is dynamic implementation switching hidden behind the same static type
That’s what Objective-C does and it can be a big win, but you often statically know the representation, or want to be able to statically specialise for slightly larger small strings in a small-string optimisation, and there you end up paying dynamic dispatch overhead where you don’t need it. The fast range-based getters in Objective-C avoid this for read-only operations but mutation in loops is not fast: dynamic dispatch breaks inlining, which prevents vectorisation, and can cause a 4-16x slowdown (I’ve seen dropping down to C and manually specialising over the type give this kind of speedup).
I understand (and know this isn’t what you’re saying), but I would put this more in the category of “dynamic linking doesn’t support X” as opposed to “X doesn’t support dynamic linking”.
Most languages will have compile-time features that will be lost at the dylib boundary (e.g. #defines and type information in C, templating in C++, macros in Rust or Zig etc.).
I guess C resolves this by putting these things in headers that consumers use as a sort of side-band.
Everything that can be compatible with dylibs in Rust is compatible… I think… They would need their own version of headers for any extra information. Maybe the rlib format contains some already? Not sure.
DLLs on Windows work more or less like this. You don’t link to a DLL when you build an executable, you link to a .lib file, which is a static library that contains the code for dispatching functions to the DLL.
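On the consuming side in Rust, the analogue of that header side-band is just declaring the foreign symbols yourself in an extern block (reusing the hypothetical greeter_add from the sketch further up; nothing verifies the declaration against the actual library, exactly like a C header):

```rust
// Declarations for symbols exported by some cdylib (e.g. a libgreeter built
// with crate-type = ["cdylib"]). The compiler simply trusts these signatures.
#[link(name = "greeter")]
extern "C" {
    fn greeter_add(a: i32, b: i32) -> i32;
}

fn main() {
    // Calling across the C ABI is unsafe precisely because the declaration
    // above is an unchecked promise about the library's ABI.
    let sum = unsafe { greeter_add(2, 3) };
    println!("2 + 3 = {sum}");
}
```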
Again, the way legacy distros are organized is centered around how crappy, fragile and crude C and C++ and software written in them are, with an army of “package maintainers” having to babysit everything. Modern programming languages generally just work, with distros generally being concerned with higher-level e2e flows. I “maintain” some Rust packages in NixOS and it comes down to giving a :+1: to PRs to nixpkgs created by a bot that updates versions when a new release is detected upstream. Pretending that some downstream “package maintainer” can do a better job at testing and fixing problems in ever more complex software than the upstream developers is just silly.
Agreed. This was my strong reaction to the post too. However, pretending that upstreams can, or more accurately will, maintain long term stability and (security) support better than distribution developers is also silly. These are qualities I value very, very highly in a production system.
I think it’s debatable how much you should trust that support as it diverges more and more from fast paced upstreams, but clearly it’s at least closer than what many upstreams are willing to provide.
I don’t really know what the right answer is here.
However, pretending that upstreams can, or more accurately will, maintain long term stability and (security) support better than distribution developers is also silly.
Plenty of distros do the bare minimum of “maintenance”, shipping mostly upstream builds combined with trunk-based release cycles, and call it a day. I can’t remember the last time I used a maintenance-heavy distro like Debian/Ubuntu or Fedora, and I’m happier with my Linux than ever.
Sure, there are some new challenges - e.g. security update tracking, code reviews and source vetting, which are very valuable and were tied to maintenance-heavy distros - but these will just get figured out.
It’s easy to criticize Debian, but I take this as evidence that their volunteer maintainers are way overworked and just trying to keep up. Maybe those of us using modern languages that proliferate dependencies need to slow down and make our software more manageable.
Or maybe we need to stop relying on volunteers to maintain a core safety critical piece of infrastructure…
I don’t need Debian to supply my crates. I hated it when they did it with pip packages, now this. rustup is all I need from them regarding rust development.
However I do want Debian to supply great tools written in rust (thankfully just is in). The reasons to do that are: install speed, updates, curatorship/setup and documentation. I don’t care about shared libraries: ship me fat binaries built with crazy optimization options.