1. 39
  1.  

  2. 18

    This is why I get a little uncomfortable when people suggest Rust fixes tons of security issues. Yes, it will fix some of them. No, just because a Rust program compiles doesn’t mean that it won’t have problems.

    Rust is memory safe. Nothing more, nothing less.

    1. 2

      Hey, while you’re here. :)

      I thought it was really odd that read_to_end() would return a vec<u8>, but there are zero write functions that accept such a type. It’s a strange asymmetry.

      1. 10

        It’s called old_io for a reason ;)

        The work on the new IO library was one of our most commented-on RFCs ever, and is not something I have particularly strong feels about, so I don’t know if that’s changed.

        That said, the corresponding function is write_all: http://doc.rust-lang.org/nightly/std/old_io/trait.Writer.html#tymethod.write_all

        And the reason the signatures are different is very instructive actually, so this is a great question! read_to_end() gives a Vec<u8> because it’s giving you ownership over what it creates. This makes sense, as it’s exactly what you want. However, there’s no reason for write_all() to take ownership: once you write some bytes, you can still do useful stuff with that buffer. There’s no reason to consume the original. Therefore, it makes sense to simply borrow a slice for a while. Does that make sense?

        1. 3

          There’s no way to return an owned array? Or have write_all borrow the vec for a bit?

          It’s not quite the same in C, but write takes a const pointer for the same reason. const doesn’t fully map onto the concept of owner, but allowing rust write_all to take a “const” vec would make things a lot more transparent.

          The types are easily enough converted, so it’s not the end of the world, but it does initially feel like the common problem where the type you want is exactly the type you don’t have. Perhaps more of a problem in short programs. I expect read and write will not be the bulk of the lines in real applications.

          To be honest, I had far more trouble with the array syntax. Just declaring it was kind of weird, and then every time I used it I’d get a compiler error saying I had one permutation of the tokens “u8, [, ], mut, &”, but the function being called only took some other permutation of the same tokens. :)

          1. 2

            There’s no way to return an owned array?

            That’s -> Vec<T>.

            Or have write_all borrow the vec for a bit?

            That’s (&[T]). I feel like we must have some kind of misunderstanding here, still. Unfortunately, I’m now boarding a plane from Europe to friggin' Australia, so hopefully someone else can chime in. :)

            Thanks again for the post.

            1. 1

              No worries, mate; my flight to Australia doesn’t leave til tomorrow.

              1. 14

                Woo Malaysia! I have a few hours' layover and some wifi. Let me re-go-over this from first principles, apologies if it starts off too simple.

                First, a re-cap of the types:

                • Vec<T>: a ‘vector.’ Heap allocated. Always mutable. Owns the underlying buffer.
                • [T; N]: an ‘array’. Fixed length of N elements of type T. Stack allocated. Owns the buffer.
                • &[T]: a ‘slice’. Doesn’t ever allocate/own its own memory, but points to some other memory somewhere. Always immutable.

                Because slices point to something else, they’re cheap (just a pointer and a length) and engage with the infamous ‘borrow checker.’

                So, let’s talk about these two methods:

                read_to_end(&mut self) -> IoResult<Vec<u8>>
                

                This function is called on a mutable Reader, and gives back the data. This is the “returns an owned array” thing you’re talking about, I’d think, even though it’s not an ‘array’ in Rust’s terms, but a vector. We need to return a Vec<T> here, because we’re gonna take ownership of the data we read. In other words, a function like this:

                fn read_to_end() -> &[u8]
                

                can’t possibly exist, unless it’s getting the data from a global or something, which isn’t very common. If you tried to implement it like this:

                 fn read_to_end() -> &[u8] {
                    let data = vec![];
                
                    // fill data with data
                
                    &data
                }
                

                You’ll get an error:

                hello.rs:1:21: 1:26 error: missing lifetime specifier [E0106]
                hello.rs:1 fn read_to_end() -> &[u8] {
                                               ^~~~~
                hello.rs:1:21: 1:26 help: this function's return type contains a borrowed value, but there is no value for it to be borrowed from
                hello.rs:1 fn read_to_end() -> &[u8] {
                                               ^~~~~
                

                Now, if we had implemented it method style:

                fn read_to_end(&mut self) -> &[u8] {
                    let data = vec![];
                
                    // fill data with data
                
                    &data
                }
                

                to match the signature of the thing we’re talking about, elision will assume that the lifetime is tied to the lifetime of self, and so it will pass that check, but give a different, more interesting error:

                hello.rs:10:6: 10:10 error: `data` does not live long enough
                hello.rs:10     &data
                                 ^~~~
                hello.rs:5:36: 11:2 note: reference must be valid for the anonymous lifetime #1 defined on the block at 5:35...
                hello.rs:5 fn read_to_end(&mut self) -> &[u8] {
                hello.rs:6     let data = vec![];
                hello.rs:7 
                hello.rs:8     // fill data with data
                hello.rs:9 
                hello.rs:10     &data
                            ...
                hello.rs:6:22: 11:2 note: ...but borrowed value is only valid for the block suffix following statement 0 at 6:21
                hello.rs:6     let data = vec![];
                hello.rs:7 
                hello.rs:8     // fill data with data
                hello.rs:9 
                hello.rs:10     &data
                hello.rs:11 }
                

                Our data is going to go out of scope at the end of the function, and would then get de-allocated, but we’re trying to return a reference too it. Oops! This would be a dangling pointer. Therefore, we want to take ownership, and just get the vector itself:

                fn read_to_end(&mut self) -> Vec<u8> {
                    let data = vec![];
                
                    // fill data with data
                
                    data
                }
                

                This works just fine.

                Whew! Now, in the Writer case, which type should we take? Well, we don’t plan on mutating the buffer, as you mention, we just want a pile of bytes to send down the Writer. Therefore, a slice makes sense:

                fn write_all(buff: &[u8]) 
                

                We just need to borrow these bytes for a bit. We don’t want to take ownership, because then we’d have to give it back:

                fn write_all(buff: Vec<u8>) -> Vec<u8>
                

                which just introduces more work for no benefit, and introduces mutability. Plus, because we can cheaply and easily make a slice that points to our vector, it works for basically all three types:

                fn write_all(buff: &[u8]) {
                    // code goes here
                }
                
                fn main() {
                    let vector = vec![1, 1, 1];
                    let array = [1; 3];
                    let slice = &[1; 3];
                
                    write_all(&vector);
                    write_all(&array);
                    write_all(slice);
                }
                

                (note that slice here is a slice to an array allocated on the stack)

                … does this help? I have some thoughts about how re-using a Vec<T> still wouldn’t exactly cause heartbleed, but this comment is long enough. Time to wander around the airport!

                1. 2

                  Very short summary: From briefly reading the documentation, I had no idea one could convert a vec<t> to a [t] with &.

                  Very helpful reply, thanks.

                  1. 1

                    Ahhh that would be an issue, yes. :) Glad it was helpful. Hope your flight was good, mine went pretty okay.

                  2. 1

                    vigorous applause

                    This was incredibly helpful, and explained why an API I’d written seemed to be an inordinate pain in the ass. Thanks for breaking this down for us.

                    1. 1

                      Excellent. :) Now I’m wondering if I shouldn’t fit something like this in the Book somewhere…

        2. 1

          memory safe is already huge

        3. 8

          A Reddit commenter suggests this is not heartbleed: http://www.reddit.com/r/rust/comments/2uii0u/heartbleed_in_rust/co8oegb

          Still, heed steves comment in this very thread.

          1. 5

            Opinions will clearly vary about what the real heartbleed is.

            In my opinion, the essence is reusing a buffer and failing to check a length, then copying out data from the buffer according to that length, leaking the previous contents of that buffer.

            This probably would have looked a lot more like heartbleed if I had kept a list of buffers, and pushed and popped, but I got lazy and didn’t want to write the C version for that, so I shortchanged the rust version.

            Defining heartbleed to be “the bug that lets you steal private keys” misses the point, I think. (fwiw, how many server keys were actually compromised besides the demo server? rolling keys was really more of a precaution. compare how many user passwords and cookies were compromised by passing through the same buffer.)

            1. 2

              I agree fully with your assessment of the vulnerability you wrote and just wanted to avoid that someone replicates the discussion here. What you describe is something to look out for.

              1. 2

                Actually, it might be pretty powerful to don the black hat and attempt to enumerate as many of these flaws as possible. A list that static analyzers (automated or human) can look for.

          2. 4

            There are languages that zero all buffers prior to use. I believe Go is one of them. While many of these have plenty of issues of their own, languages that prevent this particular vulnerability do exist. It absolutely makes sense for a to-the-metal systems language not to be one, though, and I think Ted has done a great job showing how buffer overruns can still happen there.

            1. 9

              To be fair to rust, it did zero the array when I declared it. But it obviously can’t zero it between uses, because (in this program) the object lifetime is unclear.

              One could allocate a new buffer for every packet, but to get to “zero overheard, fast as C” performance, some caching and reusing of buffers is necessary. If malloc/free is “too slow” for OpenSSL, someone will surely decide that new Vec<> is too slow for RustSSL.

              rust could certainly allocate clean buffers per packet, but then you’re only going to be running at LibreSSL levels of performance. :)

              1. 3

                Yeah, that makes sense; I misread your code, and thought what you were exploiting was the knowledge that the stack was going to be in the same position each time. I was wrong, and you’re indeed correct that this flaw could exist in any language.

                On a different note, though, this means that LibreSSL is doing fresh buffers each time? That may be slow, but it’s also a really welcome improvement from OpenSSL.

                1. 2

                  Each connection in libressl will use a freshly allocated buffer. The exact freshness depends on malloc (we don’t explicitly zero it). We could zero the buffers as well (before and/or after use), but that’s down a level from where libssl sits. Some systems (openbsd) will do that for you, others won’t. Leaving the cost/benefit tradeoff to the developer or sysadmin is the right call I think.