  2. 7

    So I think the technical details of the integer overflow are kinda beside the point (it’s not titled “let’s learn about implicit integer conversions today” for that reason), but for reference…

    -268435455 is less than 256, but (size_t)(-268435455 * 64) is 64.
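
    A quick sketch of that arithmetic, assuming a 32-bit int and the usual wraparound behaviour on signed overflow (which is formally undefined):

    #include <stdio.h>

    int main(void) {
        int num = -268435455;                      /* "less than 256", so a signed check passes */
        int bytes = num * 64;                      /* overflows a 32-bit int; typically wraps to 64 */
        printf("%d %zu\n", bytes, (size_t)bytes);  /* usually prints "64 64" */
        return 0;
    }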

    1. 1

      Maybe it’s not completely beside the point. Being able to recognize integer overflow situations would help protect against the ‘implicit backdoor’ situation in the article. If you could spot that a proposed change sets up an overflow, you might not ‘fix’ the patch mentioned in the article that way, for example.

      But, that’s not the only way to ‘implicitly backdoor’ something.

    2. 5

      I’m probably stupid. Is the problem that size_t is unsigned? How can a value of 256 possibly overflow size_t on any system with more than 8 bits?

      Edit: is the problem that the allocation can now overflow size_t?

      1. 6

        In the first case, num is implicitly converted to an unsigned value. In the second, both num and (int)limit are signed values. Negative values of num will never fail the test, which could lead to a buffer overflow.
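
        A tiny demonstration of the difference (a sketch, assuming num is an int and limit is a size_t):

        #include <stdio.h>

        int main(void) {
            int num = -1;
            size_t limit = 256;

            /* First form: num is converted to size_t, so -1 becomes SIZE_MAX
               and the "too big" check fires (this is what the warning is about). */
            printf("%d\n", num > limit);       /* prints 1 */

            /* Second form: both sides are signed, so -1 quietly passes the check. */
            printf("%d\n", num > (int)limit);  /* prints 0 */
            return 0;
        }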

        1. 1

          Isn’t the same thing going to happen for a negative value either way?

          1. 4

            Not all negative values are equal here. A very small value (e.g. -9223372036854775806) might be treated as < 256 (it’s negative), but when converted to size_t for malloc after * 64, you might end up with a small positive number, like malloc(128), which is perfectly valid. Left as an exercise is how to get malloc(0), which is where the exploit comes in.

            (disclaimer: I’m not 100% confident in this response)

            1. 5

              There is a much simpler, more mundane vulnerability here. The goal of the check

              if (num > limit)
              

              is to check the invariant that no more memory than the limit is allocated. If num is a user-controlled quantity, then e.g. a denial of service attack could be created by allocating a large amount of (virtual) memory. The problem with

              if (num > (int)limit)
              

              is that you could pick a negative number that compares smaller than limit here, but when implicitly converted to size_t in the malloc call becomes some large number.
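
              Roughly the shape of the function under discussion (a sketch with made-up names, not the article’s exact code):

              #include <stdlib.h>

              #define LIMIT 256

              /* Hypothetical reconstruction of the pattern being discussed. */
              void *allocate_bufs(int num) {
                  size_t limit = LIMIT;
                  if (num > (int)limit)     /* silences the warning, but any negative num passes */
                      return NULL;
                  return malloc(num * 64);  /* the signed multiply can wrap, and the (possibly
                                               negative) result is converted to size_t for malloc */
              }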

              1. 1

                denial of service attack could be created by allocating a large amount of (virtual) memory

                That doesn’t create the “backdoor” that the title implies. A “backdoor” isn’t a DoS, it’s a takeover.

                1. 1

                  I am not sure if the example was supposed to introduce a backdoor or to give an example of code that could solicit a fix that introduces a security vulnerability. Basically, to show this technique of letting others introduce vulnerabilities such as backdoors.

                  It could also be that he was hinting (as you pointed out elsewhere) that it is possible, through the multiplication, to craft a value that results in malloc(0). At least on Linux (according to the malloc man page), this returns either NULL, or a unique pointer value that can later be successfully passed to free() (following the C standard).

                  Edit: I wonder, though, what most implementations do. Couldn’t worse vulnerabilities be avoided by just returning a pointer with an address that lies outside the process’s address space, so that dereferencing the pointer would lead to a segmentation fault? (And then faking the free.)

                  Edit 2: I briefly eyeballed the glibc implementation, and it seems that malloc(0) does allocate memory and returns a valid pointer. Given that this is only a small amount of memory, it could indeed lead to buffer overflows.

                  Nice :).
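
                  For reference, a quick way to see what a given libc does with malloc(0) (the result is implementation-defined):

                  #include <stdio.h>
                  #include <stdlib.h>

                  int main(void) {
                      void *p = malloc(0);
                      printf("%p\n", p);  /* NULL or a unique minimal allocation, depending on the libc */
                      free(p);            /* either way, it must be safe to pass to free() */
                      return 0;
                  }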

              2. 1

                The implication is that the cast to int isn’t actually changing the memory, so it changes the value, which is a bug?

                1. 3

                  The implication is that the cast to int isn’t actually changing the memory, so it changes the value, which is a bug?

                  I think the confusion probably lies in a misunderstanding of Two’s Complement (edit: I guess this is where my misunderstanding shows. This is likely architecture dependent. Might be 1’s complement. Either way, add and subtract work without modification on these representations)? I don’t believe that a cast actually changes the bit patterns at all. This is why you have things like %u vs %d, which treat the representation of the bit patterns differently in printf, for instance. Observe:

                  $ cat /tmp/foo.c 
                  #include <stdio.h>
                  int main() {
                     unsigned int x = -2;
                     printf("%u %d %u %d\n", x, x, x + x, x + x);
                     return 0;
                  }
                  $ gcc /tmp/foo.c && ./a.out
                  4294967294 -2 4294967292 -4
                  $
                  

                  %d treats the argument like an int. %u treats the argument like an unsigned int. If you look at the generated assembly (gcc -S /tmp/foo.c && cat foo.s) of this, you’ll see only calls to movl, addl for the relevant parts (on x86 at least). No bit fiddling going on at all.

                  Same disclaimer. I’m not sure I 100% understand this all anymore.

                  1. 1

                    Makes sense! That’s what I was referring to when saying that it doesn’t change memory, so the value it represents differs depending on the type. Could the result also be different based on which system you’re running on, if endianness comes into play? :)

          2. 3

            I think the main point was that it’s easy to fix the warning, but that only masks the true problem, e.g., you can get malloc(0) with an overflow.
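
            For instance (a sketch assuming a 32-bit int and wraparound on signed overflow, which is formally undefined): -67108864 is -2^26, so it passes a signed check against 256, while -67108864 * 64 wraps around to 0.

            #include <stdio.h>
            #include <stdlib.h>

            int main(void) {
                int num = -67108864;              /* passes a signed "> 256" style check */
                int bytes = num * 64;             /* -2^32 wraps to 0 in a 32-bit int */
                void *p = malloc((size_t)bytes);  /* effectively malloc(0) */
                printf("%d %p\n", bytes, p);
                free(p);
                return 0;
            }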

          3. 5

            If we care at all about security, we should not be writing any software anymore (unless utterly unavoidable) in a language that allows such seemingly innocuous changes to introduce such horrendous security holes. It’s not even about this specific example, because C is so chock full of all kinds of things like this, but this screams “irresponsible and dangerous” to me.

            I see other comments on this post discussing the minutiae of integer representation and overflow; developers should not have to worry about this kind of thing every time they manipulate an integer, and in a language like Rust (which is more than capable of covering most use-cases of C, and many others besides) they don’t have to. It’s no more complicated than that: we have solved the problem. The only action required to take advantage of that is to move to a modern language.

            You may argue that C programmers are used to having to think about this kind of thing, and that therefore they are capable of preventing security issues like this. If so,

            1. Why do we still see so many memory safety-related security issues?
            2. Are they really all capable of that? It seems probable to me that many programmers aren’t even aware of this sort of issue, and despite that, it only takes one programmer who is new to C, fresh out of university, tired and/or not paying attention, or any number of other things to allow one bug like this to slip through.

            We don’t expect programmers to write all their code in assembly anymore; at a certain point we accepted that C could do the job just as well in 98% of cases (I’m aware I’m paraphrasing the history quite a bit here), and its compilers represented tools for reducing their cognitive load and improving their productivity and the quality of their code. A large number of tasks they used to have to perform manually were now able to be done automatically, and undoubtedly that helped to prevent numerous mistakes. Surely it’s time now to accept that we have a problem with security issues due to memory unsafety, and that we have the tools available to prevent them?

            1. 1

              That does not solve the problem of maintaining legacy code. Also, you are completely right in saying that we shouldn’t be dealing with this kind of issue anymore, but compilers/linters can be smart enough nowadays to check for this kind of problem.

              I believe that some kind of sanitized version of C, free of so many cases of undefined behavior, with a catchy name such as “Modern C” would do a great favor in that regard.

              1. 2

                There was a “better C” project called Cyclone. Many of its ideas live on in Rust actually.

                1. 1

                  That sounds interesting!

                2. 0

                  That does not solve the problem of maintaining legacy code.

                  It certainly doesn’t - but most of that code could perfectly well be rewritten in Rust. Given the amount of effort being put into patching security holes and maintaining that legacy C code anyway, it doesn’t seem to me that the suggestion to rewrite it is all that crazy. People in the Rust community are already doing this kind of thing.

                  compilers/linters can be smart enough nowadays to check for this kind of problem

                  True, and some of them (Rust) can prevent it at compile time.

                  I believe that some kind of sanitized version of C, free of so many cases of undefined behavior, with a catchy name such as “Modern C” would do a great favor in that regard.

                  Why the insistence on it being a variant of C? What you’re describing is Rust, but with too many other benefits and nice features to list.

                  1. 2

                    Why the insistence on it being a variant of C?

                    Just being lazy and looking for the 20% of the effort that solves 80% of the problems.

                    If I build OP’s second example with -Wconversion -Werror, I get the following output:

                    ~$ cc -c test.c -Werror -Wconversion
                    test.c: In function ‘allocatebufs’:
                    test.c:10:23: error: conversion to ‘size_t’ {aka ‘long unsigned int’} from ‘int’ may change the sign of the result [-Werror=sign-conversion]
                         return malloc(num * 64);
                                       ~~~~^~~~
                    cc1: all warnings being treated as errors
                    

                    So, just as an example: what if the compiler had something like a -hardened option which both (a) enables all relevant checks and (b) treats all of them as errors?

                    Obviously there is more to making C a safe language, and it can be painstakingly hard to get it right. But at least that would be an improvement. I do believe people should write their new code in a safer language, such as Rust, but at least we can do something about legacy code quality as well.

                    1. 1

                      So, just as an example: what if the compiler had something like a -hardened option which both (a) enables all relevant checks and (b) treats all of them as errors?

                      at least we can do something about legacy code quality as well.

                      In principle I agree that this is important and would be helpful, but I’m doubtful that the C compiler has (or even can be reasonably expected to gain) the capability to catch everything that it would need to for this to be a solution to the more general problem of memory unsafety. The advantage of a language like Rust is that it’s been designed from the ground up to deal with issues like this at compile time; I’m almost inclined to say that a partial C solution would be worse than the current situation, due to the possibility of giving programmers a false sense of security.

                    2. 1

                      People in the Rust community are already doing this kind of thing.

                      FWIW, I did not write ripgrep because I was concerned about the security of grep. I wrote it because I wanted to build a fundamentally different tool that exploited better algorithms to make it faster. (OK, technically, the initial version existed because I wanted to benchmark the underlying regex library in a real world use case.)

                      1. 1

                        Fair enough; I apologise if I sounded like I was trying to put words in your mouth about your motivations, that wasn’t my intention. It was only meant to be an example of a Rust rewrite of an existing tool.

                        Edit: the same goes for the other examples I linked to.

                3. 1

                  Shouldn’t num be size_t in the first place? A signed argument type signals that the function is perfectly happy to accept negative numbers by design, when it clearly isn’t. This function would arouse suspicion merely by looking at its signature. If code of this quality gets through review, security issues are guaranteed by default.
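
                  A sketch of what that could look like (hypothetical names, not from the article): make the count unsigned and check the multiply explicitly.

                  #include <stdint.h>
                  #include <stdlib.h>

                  #define LIMIT 256

                  /* Hypothetical fixed-up signature: an unsigned count makes negative inputs
                     impossible to express, and the second check rules out overflow in the
                     multiply (redundant while LIMIT is 256, but harmless if the limit grows). */
                  void *allocate_bufs(size_t num) {
                      if (num > LIMIT || num > SIZE_MAX / 64)
                          return NULL;
                      return malloc(num * 64);
                  }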