1. 68

  2. 17

    I’m working on a ‘pretty big’ (several kilolines) project in Rust, and two things frustrate me to no end:

    • All the stuff around strings. Especially in non-systems programming there’s so much with string literals and the like, and Rust requires a lot of fidgeting. Let’s not even get into returning heap-allocated strings cleanly from local functions. I think I get why it’s all like this, but it’s still annoying, despite all the aids involved.

    • Refactoring is a massive pain! It’s super hard to “test” different data structures, especially when it comes to stuff involving lifetimes. You have to basically rewrite everything. It doesn’t help that you can’t have “placeholder” lifetimes, so when you try removing a thing you gotta rewrite a bunch of code.

    The refactoring point is really important, I think, for people not super proficient in systems design. When you realize you gotta re-work your structure, especially when you have a bunch of pattern matching, you’re giving yourself a lot of busywork. For me this is very similar to the problem that other ADT-based languages (Haskell and the like) face. Sure, you’re going to check all usages, but sometimes I just want to add a field without changing 3000 lines.

    I still am really up for using it for systems stuff but it’s super painful, and makes me miss Python a lot. When I finally get a thing working I’m really happy though :)

    1. 4

      I would definitely like to learn more about the struggles around refactoring.

      1. 4

        Your pain points sound similar to what I disliked about Rust when I was starting. In my case these were symptoms of not “getting” ownership.

        The difference between &str and String/Box<str> is easy once you know it. If it’s not obvious to you, you will be unable to use Rust productively. The borrow checker will get in your way when you “just” want to return something from a function. A lot of people intuitively equate Rust’s references with returning and storing “by reference” in other languages. That’s totally wrong! They’re almost the opposite of that. Rust references aren’t for “not copying” (there are other types that do that too). They’re for “not owning”, and that has specific uses and serious consequences you have to internalize.
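
        A minimal sketch of that distinction, using hypothetical function names (`greeting`, `first_word`) that are not from any library: returning a `String` moves a heap allocation to the caller, while returning `&str` only works when the data already lives longer than the call.

```rust
// Hypothetical helpers for illustration only.

/// Builds a new string inside the function and moves ownership to the caller.
fn greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}

/// Returns a borrowed slice of the input; nothing is allocated or owned here.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

// This version would NOT compile: `s` is dropped when the function returns,
// so a reference to it cannot escape.
//
// fn broken(name: &str) -> &str {
//     let s = format!("Hello, {}!", name);
//     &s
// }

fn main() {
    let owned: String = greeting("Ferris");
    let borrowed: &str = first_word(&owned); // valid only while `owned` is alive
    println!("{} / {}", owned, borrowed);
}
```

        The commented-out `broken` function is the “just return something” case that the borrow checker rejects; switching the return type to `String` is usually the fix.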

        Similarly, if you add a reference (i.e. a temporary scope-limited borrow) to a struct, it blows up the whole program with lifetime annotations. It’s hell. <'a> everywhere. That’s not because Rust has such crappy syntax, but because it’s basically a mistake of using wrong semantics. It means data of the struct is stored outside of the struct, on stack in some random place. There’s a valid use-case for such stack-bound-temp-struct-wrappers, but they’re not nearly as common as when it’s done by mistake. Use Box or other owning types in structs to store by reference.
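
        As a rough illustration (the struct and function names here are made up), compare a struct that borrows its data with one that owns it. Only the borrowing one needs `<'a>`, and only the owning one can be returned from the function that created the data.

```rust
// Hypothetical types for illustration only.

/// Borrowing struct: the text lives somewhere else, so `'a` follows it around.
struct BorrowedConfig<'a> {
    name: &'a str,
}

/// Owning struct: the data lives inside the struct, no lifetime parameter.
struct OwnedConfig {
    name: String,
    payload: Box<[u8]>,
}

/// Works because the returned struct owns everything it refers to.
fn load() -> OwnedConfig {
    let raw = String::from("demo"); // local buffer
    OwnedConfig {
        name: raw, // moved into the struct, so it survives the return
        payload: vec![1, 2, 3].into_boxed_slice(),
    }
}

fn main() {
    let owned = load();

    // The borrowing struct is a temporary view; it cannot outlive `owned`.
    let view = BorrowedConfig { name: &owned.name };
    println!("{} ({} payload bytes)", view.name, owned.payload.len());
}
```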

        And these aren’t actually Rust-specific problems. In C the difference between &str and Box<str> is whether you must call free() on it, or must not. The <'a> is “be careful, don’t use it after freeing that other thing”. Sometimes C allows both ways, and structs have bool should_free_that_pointer;. That’s Cow<str> in Rust.
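
        The “sometimes borrowed, sometimes owned” case can be sketched with `std::borrow::Cow`. The `normalize` function below is a made-up example, not an API from anywhere: it borrows the input when nothing changes and allocates only when it has to.

```rust
use std::borrow::Cow;

/// Hypothetical helper: replaces spaces with underscores,
/// allocating a new string only when a replacement is needed.
fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', "_")) // new heap allocation
    } else {
        Cow::Borrowed(input) // no allocation, just a view of `input`
    }
}

fn main() {
    for text in ["no_spaces", "has some spaces"] {
        let result = normalize(text);
        let kind = match &result {
            Cow::Borrowed(_) => "borrowed",
            Cow::Owned(_) => "owned",
        };
        println!("{} -> {} ({})", text, result, kind);
    }
}
```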

        1. 4

          Indeed, but I think this proves the “Complexity” section of TFA. There are several ways to do things including:

          • References
          • Boxed pointers
          • RC pointers
          • ARC pointers
          • COW pointers
          • Cells
          • RefCells

          There’s a lot of expressive power there, and these certainly help in allowing memory-safe low-level programming. But it’s a lot of choice. Moreso than C++.
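
          For readers who haven’t met all of these, here is a small sketch of how most of them look in practice (Cow is left out). The function name `pointer_tour` and the values are invented for illustration.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Hypothetical tour of the pointer types; returns a few observable results.
fn pointer_tour() -> (usize, i32, u32, usize) {
    // Reference: a temporary, non-owning view of `value`.
    let value = 5;
    let view: &i32 = &value;

    // Box: single owner, heap allocated, freed when `boxed` goes out of scope.
    let boxed: Box<i32> = Box::new(*view + 1);

    // Rc: shared ownership within one thread, non-atomic reference count.
    let shared = Rc::new(String::from("shared"));
    let _second_owner = Rc::clone(&shared);
    let owners = Rc::strong_count(&shared);

    // Arc: shared ownership across threads, atomic reference count.
    let numbers = Arc::new(vec![1, 2, 3]);
    let worker = {
        let numbers = Arc::clone(&numbers);
        thread::spawn(move || numbers.iter().sum::<i32>())
    };
    let total = worker.join().expect("worker thread panicked");

    // Cell: mutation through a shared reference by copying values in and out.
    let hits = Cell::new(0u32);
    hits.set(hits.get() + 1);

    // RefCell: mutation through a shared reference, borrow rules checked at run time.
    let log = RefCell::new(Vec::new());
    log.borrow_mut().push("entry");
    let entries = log.borrow().len();

    (owners, *boxed + total, hits.get(), entries)
}

fn main() {
    println!("{:?}", pointer_tour());
}
```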

          1. 2

            Absolutely: with a GC all these are the same thing. C++ has all of them, just under different names, or as design patterns (e.g. you’ll need to roll your own Rc, because std::shared_ptr will need to use atomics in threaded programs).

            There are choices, but none of them are Rust-specific. They’re emergent from what is necessary to handle memory management and thread safety at the low level. Even if a C or C++ compiler doesn’t force you to choose, you will still need to choose yourself. If you mix up pointers that are like references with pointers that are like boxes, then you’ll have double-free or use-after-free bugs.

            1. 2

              There are choices, but none of them are Rust-specific. They’re emergent from what is necessary to handle memory management and thread safety at the low level.

              I disagree. ATS and Ada offer a different set of primitives to work with memory safe code. Moreover, some of these pointer types (like Cow) are used a lot less frequently than others. Rust frequently has multiple ways and multiple paradigms to do the same thing. There’s nothing wrong with this approach, of course, but it needs to be acknowledged as a deliberate design decision.

              1. 1

                I’d honestly like to know what Ada brings to the table here. AFAIK Ada doesn’t protect from use-after-free in implementations without a GC, and a typical approach is to just stay away from dynamic memory allocation. I see arenas are common, but that’s not unique to Ada. I can’t find info on what it does about mutable aliasing or iterator invalidation problems.

              2. 2

                The set of Boost smart pointers demonstrates some of the inherent complexity in efficient object ownership: https://www.boost.org/doc/libs/1_72_0/libs/smart_ptr/doc/html/smart_ptr.html

          2. 1

            It doesn’t help that you can’t have “placeholder” lifetimes

            I’m not sure what you mean, but maybe this can help? https://doc.rust-lang.org/std/marker/struct.PhantomData.html
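
            In case it helps, this is roughly what `PhantomData` does with a lifetime. It is only a guess at what “placeholder” might mean here, and `Slot` and `slot_for` are invented names: the struct carries a lifetime parameter without storing an actual reference.

```rust
use std::marker::PhantomData;

/// Hypothetical index type tied to the lifetime of a buffer
/// without holding a reference to it.
struct Slot<'a> {
    index: usize,
    _buffer: PhantomData<&'a [u8]>, // zero-sized marker, only carries `'a`
}

/// Hands out a slot that the compiler treats as borrowing from `buffer`.
fn slot_for<'a>(buffer: &'a [u8], index: usize) -> Option<Slot<'a>> {
    if index < buffer.len() {
        Some(Slot { index, _buffer: PhantomData })
    } else {
        None
    }
}

fn main() {
    let data = vec![10u8, 20, 30];
    let slot = slot_for(&data, 1).expect("index in range");
    println!("{}", data[slot.index]);
}
```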

          3. 8

            Thanks for posting this article! It captures most of the major issues that have influenced me not to use Rust for more than toy projects (yet). (For me this is mostly tooling, integration, and lack of other implementations.)

            I do want to note one of the issues which the author “deliberately omitted” as not a real problem:

            ”Dependencies (“stdlib is too small / everything has too many deps”) — given how good Cargo and the relevant parts of the language are, I personally don’t see this as a problem.”

            While I definitely recognize the merits of the “small-stdlib” model, it makes Rust more difficult to use in some environments. In particular, environments that lack good Internet connections are difficult to develop Rust in.

            For example: I think this is one reason I haven’t seen Rust make much headway in scientific high-performance computing. It’s difficult to develop Rust on supercomputers! In HPC, it’s common to have large data centers which are actively used for at-scale development, but bandwidth to the outside world is low because simulation workloads almost never call out of the local network. Getting an HPC center to invest in a better Internet connection can be an uphill battle.

            Additionally, many of the institutions with the biggest influence in scientific computing (e.g., US national labs) do a lot of their work in completely air-gapped environments! Moving data to these environments is often a manual process which may include human approvals. Having done some work in such an environment myself, it provides a major incentive to use languages with large stdlibs or where your Linux distro already packages your deps. ;-)

            One way these environments manage this issue is local mirrors and package registries, and I’ve definitely seen local Docker registries as an enabler for HPC systems to start using containers. I’m hoping to see this eventually for Rust, but even the Cargo Book points out that “At this time, there is no widely used software for running a custom registry.”

            1. 5

              Perhaps I’m missing something, but… why do you need to build the software on the supercomputer?

              I work in a semi-related area (electronics design tooling and bulk-simulation), and we deploy our software as a full Python virtual environment. It’s effectively a big statically-linked blob, hosted on a shared NFS mount and executed as a normal Linux process on all of the hosts that run it. It only really depends on the system libc.

              Rust binaries are the same way, except they’re actually statically linked. I’m imagining that you could compile your code on a laptop using the break-room wi-fi if that’s what you need to do (Rust packages are typically not large downloads, and cargo caches them automatically) and then transfer the resulting statically-linked binary over the air-gap. As with our software, you’d only need to depend on the system libc.

              What am I missing here? Perhaps that’s not allowed for security reasons?

              1. 6

                Mostly it’s a pain, and it slows down the development process. Think of it as increasing your effective build time!

                Additionally, a lot of problems in this kind of development only manifest when you run at scale. A developer might get an interactive allocation with some small number of nodes (usually single-digit), do some small-scale runs and fix a bunch of bugs, then submit a larger job for a scale run. Find some more bugs, try to fix them, iterate and continue to scale up. After a certain point, doing all your development on the cluster itself becomes the less painful option, and a lot of HPC development tooling has been built out to enable this pattern.

                That’s not to say the “develop locally, deploy to the cluster” pattern doesn’t happen. It definitely does, especially when you have external collaborators. And a lot of secure sites also maintain smaller “open” systems with Internet connections to enable this. But even the “open” systems tend to have slow connections, again due to the bias towards software that doesn’t require the Internet as part of its operation.

                (Security can also be a concern! But that gets even more complicated and depends on exactly what you’re working on. The data-movement and workflow problems are a bit more general.)

            2. 4

              rustc implements what is probably the most advanced incremental compilation algorithm in production compilers

              I think Scala gives Rust a run for its money in that regard.

              1. 4

                One thing spoke to me in this, as someone who’s relearned Rust several times over the years: I’ve had the continual misfortune of projects using it getting the ax, with a year or more between them. My first Rust version was 0.11 and the last I consciously used was 1.26, I think. I’m starting on another project using it and a major dependency requires nightly, so that’s a new adventure for me, having only ever used stable.

                In Kotlin, you write class Foo(val bar: Bar), and proceed with solving your business problem. In Rust, there are choices to be made, some important enough to have dedicated syntax.

                I feel this pain every time I delve back in. Last time, I had to dig into Rc for something for the first time and it has some unexpected gotchas. I don’t remember what they were now; this was 3 years ago.

                Perhaps I’ve been spoiled by using Ruby, Scala, and Java for most of my professional career. When I go look at Rust, I understand what’s in use but I’m not sure why it’s in use and how someone (re)learning might come across the correct tool. I love the Rust compiler’s helpful warning messages; does it suggest the more advanced memory management things like Arc and Box now?

                1. 2

                  Perhaps I’ve been spoiled by using Ruby, Scala, and Java for most of my professional career. When I go look at Rust, I understand what’s in use but I’m not sure why it’s in use and how someone (re)learning might come across the correct tool. I love the Rust compiler’s helpful warning messages; does it suggest the more advanced memory management things like Arc and Box now?

                  Each time that I fiddle in Rust, those are my biggest questions. I’d be happy to get linked to documentation/guides where I could learn how to make the best out of those.

                  Most of the time I read either “easy, just use lifetime annotations everywhere” or “you don’t need lifetime because since 2018 …”.

                  1. 2

                    In Kotlin, you write class Foo(val bar: Bar), and proceed with solving your business problem. In Rust, there are choices to be made, some important enough to have dedicated syntax.

                    This is exactly the point, though, right? It’s the same thing with C. You can just make the lazy choices if you want to just Get Shit Done when it doesn’t matter (in Rust this means a lot of calling .clone()) or you can make the choices a compiler cannot make for you and get that control. Lots of GC’d languages with no knobs exist, and that’s great, but we didn’t need Yet Another of those, we have a lot. If Kotlin works for you, great! But if you want to have the control of the low-level and the performance it can sometimes give you, that’s why Rust. And if you love Rust enough to write purely high-level stuff in it where you use .clone() everywhere, that will also work.
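
                    To make the “lazy choice” concrete, here is a small sketch with made-up types (`User`, `describe_owned`, `describe_borrowed`). The owned version costs an extra allocation per call but needs no thought about lifetimes; the borrowed version is the knob you can turn later.

```rust
// Hypothetical types and functions for illustration only.

#[derive(Clone, Debug)]
struct User {
    name: String,
    tags: Vec<String>,
}

/// "Lazy" version: takes ownership, so callers clone to keep their own copy.
fn describe_owned(user: User) -> String {
    format!("{} [{}]", user.name, user.tags.join(", "))
}

/// Borrowing version: no clone needed, but the caller must keep `user` alive.
fn describe_borrowed(user: &User) -> String {
    format!("{} [{}]", user.name, user.tags.join(", "))
}

fn sample_user() -> User {
    User {
        name: "ana".to_string(),
        tags: vec!["admin".to_string(), "dev".to_string()],
    }
}

fn main() {
    let user = sample_user();
    let first = describe_owned(user.clone()); // extra allocation, zero friction
    let second = describe_borrowed(&user);    // no copy, same result
    println!("{} / {}", first, second);
}
```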

                    1. 1

                      If you’re using rustup, a nightly install/update is as easy as stable; cargo commands then become cargo +nightly <build/foo>.

                      I feel your pain for the “time to MVP” stuff and prototyping around. On the other hand, I’m really glad I took some more time to prototype for some of my server-side projects, whereas in Kotlin my classes feel like a bloated dependency mess. And sometimes it can help to just use owned Strings and clone + unwrap your way to an MVP, leaving the performance optimization for later.

                    2. 4

                      Nim is roughly in the same space as Rust, but with somewhat better ergonomics in the language.

                      1. 7

                        Yes, and Zig and D. All three of these deserved a mention if Ada gets a spot, in my opinion.

                        1. 2

                          Just as my own sanity check, D is not memory safe: https://run.dlang.io/is/FxoDZr. I haven’t tried this, but I believe Ada rejects such code.

                          1. 2

                            Neither is Zig. Nim is memory safe thanks to GC, and possibly thanks to a borrowing system once they get it polished.

                            Incidentally, Araq posted just hours ago that async seems to be working with ORC: https://forum.nim-lang.org/t/6549#42774

                            1. 2

                              That won’t compile if you add @safe: to the top of the module or individually to each function: https://run.dlang.io/is/G9fIev

                          2. 4

                            Nim has a garbage collector… I wouldn’t call it the same space as Rust. More like same space as Go?

                            1. 3

                              People are using Rust for a lot of things where a GC would be just fine. So the space is kinda wide :)

                              1. 2

                                Maybe a GC would be fine for those cases, but by choosing Rust one is choosing to avoid a GC. That’s the thing that makes Rust special and different and worth talking about at all. Rust with a GC would just not generate much buzz. It would be just another language.

                                1. 2

                                  Agreed. Having a deterministic destructor is a feature not found in many languages.
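
                                  A small sketch of what “deterministic” means here, using an invented `Guard` type: destructors run at the end of the enclosing scope, in reverse declaration order, not whenever a collector gets around to it.

```rust
use std::cell::RefCell;

/// Hypothetical guard that records when it is dropped.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Returns the recorded order of events.
fn run() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Guard { name: "a", log: &log };
        let _b = Guard { name: "b", log: &log };
        log.borrow_mut().push("end of scope".to_string());
    } // `_b` is dropped first, then `_a`, exactly here
    log.into_inner()
}

fn main() {
    for event in run() {
        println!("{}", event);
    }
}
```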

                              2. 1

                                Nim has a garbage collector

                                It actually has a ref counting and borrow system similar to Rust now as a compiler flag.

                            2. 3

                              Automated complex refactors of multi-million line programs are not possible in Rust today.

                              With all due respect, a multi-million line code base is a problem of its own and should be separated into more manageable pieces. IDE support is not your main problem at this point :-)

                              1. 5

                                Trunk-based development in a monorepo is a fairly popular approach in some large organizations, where doing refactorings on a multi-million line code base would be reasonable. It all depends on context.

                                Added to that, the support for automated refactorings in modern Java IDEs like IntelliJ is outstandingly good. Part of this is that the language is simple enough to support this, part of it is that the market is huge. While I really think that Rust refactorings in the CLion/IntelliJ Rust plugin are coming along very nicely, they are nowhere near (and will probably never be) what is possible in Java.

                                1. 2

                                  Trunk-based development in a monorepo is a fairly popular approach

                                  I know, I’m working in one. This is why I’m saying it’s a problem in itself way beyond the IDE abilities :-)

                                  1. 1

                                    It all depends on how it’s done. As long as most development can be done in just a single or a few subprojects, then I think it is fine.

                                    The real benefit is to be able to do wide-ranging refactorings as a single change when needed. It’s such a huge win compared to coordinating many releases of many interconnected projects. That is such a hassle that things risk stagnation.

                                    Of course, if you always have to work in the full multi-million-line code base with no subdivisions, then I agree, that is a bad idea.