1. 51

  2. 46

    To me Rust’s strings “clicked” when I realized that C also has two kinds of strings:

    1. char* from malloc that you must call free() on.
    2. char* not from malloc, or pointing into the middle of someone else’s allocation, that you must never call free() on.

    Rust uses String for case 1, and &str for case 2.

    C uses char* for both, giving the illusion that they’re interchangeable, but they’re not: you’re going to leak or crash if you mix them up.
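
    A minimal sketch of the two cases in Rust, assuming nothing beyond the standard library. The function names `make_owned` and `first_word` are made up for illustration:

    ```rust
    /// Case 1: an owned, heap-allocated buffer.
    /// Whoever holds the `String` is responsible for it; it is freed when dropped.
    fn make_owned(name: &str) -> String {
        format!("hello, {}", name)
    }

    /// Case 2: a borrowed view into someone else's allocation.
    /// The returned `&str` points into `s` and never frees anything.
    fn first_word(s: &str) -> &str {
        s.split_whitespace().next().unwrap_or("")
    }
    ```

    Returning the `&str` from `first_word` after its source `String` has been dropped is rejected at compile time, which is the "must not free" half of the rule showing up in the types.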

    1. [Comment removed by author]

      1. 10

        Yes, of course. That’s fundamental. The point is that Rust puts “must free it” in the type, so the type system prevents you from mixing “must free” with “must not free”. In C, things are “dynamically typed” as far as memory management goes.

      2. 1

        Yeah, C++ added string_view for the case where you don’t own the data, while string is something you must deallocate. But it’s still hard to use, and people say string_view combined with certain coercions is a recipe for leaks or use-after-free (UAF).

        1. 1

          Indeed, welcome after, C++. To me, string_view is the end of the age-old code review fights over whether function arguments should be const std::string& or const char* (when you don’t need a mutable string). As a replacement for these, I don’t think string_view is any more of a recipe for leaks or use-after-free than these alternatives already were, so I wouldn’t say it’s hard to use. Just s/const std::string&/std::string_view/g on your code base, pretty much, for a pretty safe & easy performance win.

      3. 16

        Rust strings don’t seem hard. They are hard.

        But that’s because strings in general are either hard or (slow,unsafe)…

        1. 8

          Yeah, indeed. In my case it wasn’t the different string types, but the many different conversions between them. The not-quite-64 combinations of {source is String, &str, OsString, OsStr, CString, CStr, PathBuf, Path, Vec<u8>, &[u8]} * {which of those 8 types is the target type} * {is this conversion always successful, or lossy, or fallible, or should I go via another type} is a lot to take in if you haven’t internalized the common patterns yet.
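
          To make the "always successful, lossy, or fallible" split concrete, here is a rough sketch of a few of these conversions using only standard library calls. The wrapper function names are hypothetical and exist only to label each case:

          ```rust
          use std::ffi::{CString, NulError};
          use std::path::{Path, PathBuf};
          use std::string::FromUtf8Error;

          /// Always succeeds: any &str can become a PathBuf.
          fn str_to_path_buf(s: &str) -> PathBuf {
              PathBuf::from(s)
          }

          /// Fallible: the bytes might not be valid UTF-8.
          fn bytes_to_string(bytes: Vec<u8>) -> Result<String, FromUtf8Error> {
              String::from_utf8(bytes)
          }

          /// Lossy: invalid sequences are replaced with U+FFFD.
          fn bytes_to_string_lossy(bytes: &[u8]) -> String {
              String::from_utf8_lossy(bytes).into_owned()
          }

          /// Fallible: a path is not guaranteed to be valid UTF-8.
          fn path_to_str(path: &Path) -> Option<&str> {
              path.to_str()
          }

          /// Fallible: a C string cannot contain an interior NUL byte.
          fn str_to_c_string(s: &str) -> Result<CString, NulError> {
              CString::new(s)
          }
          ```

          The return type tells you which bucket a conversion is in: a plain value, a `Result`/`Option`, or a `_lossy` name.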

          Luckily, one can work in Rust just fine without learning the conversions up front; looking them up as needed works fine. The “I have an X and I want a Y” conversion cheat sheets below have been a big help for me; both coding and learning go faster now.

          Still a joy to work in!

          1. 3

            As someone coming from interpreted languages and higher-level compiled languages (e.g. Nim, Crystal), this article really helped Rust’s strings click for me, and helped me understand the why behind it all.

            1. 2

              While strings certainly can be complex in a language-agnostic sense, there are also the Rust-specific wrinkles of A) the borrow checker and B) the choice made for what type you get when you write a string literal in source code, both of which combine to create ergonomic issues and “oh, that’s just a pattern you have to memorize” things in Rust that aren’t attributable solely to the difficulty of strings in general.

              Also compounding this slightly is the proliferation of special-purpose string-like types Rust offers on top of the basic introductory-level actual string types.

            2. 1

              If you want what this article describes as “most languages” String semantics, just use String and don’t mutate. Doesn’t seem hard?

              1. 8

                Not really. As soon as you put a string literal in your code you’re working with &'static str too. Try to pass a String to two functions one after the other and you’ll have a bad time. To really push this “most languages” modus operandi you could try to use Arc<str> or Arc<String> everywhere and clone a lot, but by the time you understand why this works you’re most of the way to understanding String/&str anyway.
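
                A small sketch of what that looks like in practice, assuming two made-up helpers, `shout` (takes ownership) and `len_of` (borrows). All names here are illustrative only:

                ```rust
                use std::sync::Arc;

                /// Takes ownership: the caller's String is moved in and consumed.
                fn shout(s: String) -> String {
                    s.to_uppercase()
                }

                /// Borrows: the caller keeps the string afterwards.
                fn len_of(s: &str) -> usize {
                    s.len()
                }

                /// Calling `shout(s)` twice in a row does not compile (use of moved value),
                /// so the first call has to get its own copy.
                fn shout_twice(s: String) -> (String, String) {
                    let first = shout(s.clone());
                    let second = shout(s);
                    (first, second)
                }

                /// Borrowing sidesteps the problem: no move, no clone.
                fn len_twice(s: &str) -> (usize, usize) {
                    (len_of(s), len_of(s))
                }

                /// The "clone everywhere" style: cloning an Arc<str> bumps a
                /// reference count instead of copying the text.
                fn share(s: &str) -> (Arc<str>, Arc<str>) {
                    let a: Arc<str> = Arc::from(s);
                    let b = Arc::clone(&a);
                    (a, b)
                }
                ```

                Each of the three workarounds only makes sense once you know who owns the bytes, which is the same knowledge String/&str asks for.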

                1. 1

                  I figured it goes without saying that borrow semantics (and thus the requirement to move, borrow, or clone on use) would still be in play. That’s not string/str/String specific though.