1. 20
  1. 9

    This kind of code makes me very nervous. Reasoning about the flow control is a lot of effort. If you want to write code that’s maintainable and auditable, then you should use option types for anything that can store a valid value or a not-present placeholder. This is easy in C for pointers because all pointers are option types (just badly designed ones that explode if you forget the not-present check). You can do it for a lot of integer types if you’re willing to define something like 0 or INT_MAX as the not-present marker. For other structs then you can often find a bit or even a whole char spare somewhere to indicate the not-present value.

    If you do that, then you have a single cleanup function that unconditionally tries to destroy every field. The cleanup functions for the type used in every field do the check for the not-present value. Your code ends up looking something like this:

    struct driver_data
    {
    	struct A *a;
    	struct B b;
    	opaque_t c;
    };
    
    // Each of these either initialises the pointed-to value and returns true,
    // or sets rc_out to the return code and returns false.
    _Bool acquire_a(struct A ** a, int *rc_out);
    _Bool register_b(struct B* b, int *rc_out);
    _Bool make_c(opaque_t *c, int *rc_out);
    
    // Each of these destroys the pointed-to object and sets the value 
    // to the not-valid state.
    void destroy_c(opaque_t *c);
    void unregister_b(struct B* b);
    void release_a(struct A**a);
    
    int driver_init(struct parent_dev *parent)
    {
    	if (!(parent->private_data = kzalloc(sizeof struct driver_data,
    	                                   GFP_KERNEL));
    	{
    		return -ENOMEM;
    	}
    	int rc = 0;
    	if (acquire_a(&parent->private_data->a, &rc) &&
    	    register_b(&parent->private_data->b, &rc) &&
    	    make_c(&parent->private_data->c, &rc)
    	{
    		return 0;
    	}
           
    	driver_cleanup(parent);
    
    	return rc;
    }
    
    void driver_cleanup(struct parent_dev *parent)
    {
    	if (parent->data)
    	{
    		destroy_c(&parent->data->c);
    		unregister_b(&data->b);
    		release_a(&data->a);
    	}
    	kfree(parent->data);
    }
    

    Even this is a bit messy and is one of the reasons that I much prefer C++11 or newer to C for low-level development. The C++ version of this provides a constructor and a destructor for each of these and looks like this:

    struct driver_data : public generic_driver_data
    {
    	std::unique_ptr<A> *a = { acquire_a() };
    	std::optional<struct B> b = { register_b() };
    	std::optional<opaque_t> c = { make_c() };
            explicit operator bool() const
    	{
    		return a && b && c;
    	}
    };
    
    int driver_init(parent_dev *parent)
    {
    	std::unique_ptr<driver_data> data{new (GFP_KERNEL) driver_data()};
    	if (!data)
    	{
    		return -ENOMEM;
    	}
    	if (!(*data))
    	{
    		return -ENOSPC;
    	}
    	parent->private_data = std::move(data);
    	return 0;
    }
    

    This is much less code and is far harder to get wrong. This assumes the existence of an operator new that takes the flags that kzalloc takes as extra arguments beyond the required size_t and is marked noexcept. Such an implementation is allowed to fail and return nullptr. If it does, then the constructor is not called. If this succeeds, the constructor is called and is a default implementation, which will construct each field, in turn, by calling the functions from the original example.

    The user-defined conversion operator on driver_data determines if the struct is fully initialised by checking that each field is initialised. The default destructor will destroy the fields in the reverse order to their creation. Because each of these is some form of option type, the real cleanup code will be called only if the field has been initialised.

    Note the complete lack of any cleanup code in the C++ version. This is all generated automatically from the RAII pattern. If data goes out of scope without the pointee being moved to the parent device, then the pointee is destroyed (if it is not null). When the pointee is destroyed, each field is destroyed. No resource leaks and no complex flow control to follow in the source. There may be some space overhead for std::optional with some of the fields, but you can create a custom version with your own not-present placeholder for common types if you need to avoid this.

    This assumes that parent_dev has a field that looks like this:

    	std::unique_ptr<generic_driver_data> private_data;
    

    Depending on the following definition:

    struct generic_driver_data
    {
    	virtual ~generic_driver_data() = 0;
    };
    

    Now there is no need for the explicit cleanup at all because the destructor on driver_data will be called when the parent_dev is destroyed (or when it explicitly resets its private_data field). If you want to avoid a vtable on driver_data then you can hoist this one level up by making parent_dev a templated class that you specialise on the device type (you need this vtable somewhere. In Linux each device has a pointer to one already as an explicit field).

    The C++11 version is less code, is trivial to review, and has far fewer things that can go wrong. It will generate almost identical code to a correct implementation of the C version (if compiled without exceptions or RTTI, neither of which it needs). There’s really no good reason for choosing to write systems code in C these days.

    1. 2

      The article alludes to an idea similiar to the one you present with the acquire/register/make functions.

      I think having some kind of validity flag is a nice property and can really help when you are debugging. When you are in control of the code base in question, you can also design the data structures to support this kind of usage. In my case, the types are given by whatever Linux kernel my driver has to interface with. There are usually about zero guarantees that a data structure won’t change between even patches. Because of this I did not and do not feel comfortable to pick a combination of fields in any given struct as “non-valid marker”.

      I think the C code you give will only work if the zero-initialized value is also the one that gets recognized as invalid. The reason is that when e.g. acquire_a fails, make_c will not run. Therefore, c is still left in zero-initialized state by kzalloc. In order for destroy_c to be no-op, it needs to recognize zero-initialized as invalid. Is this what you intended or am I missing something? This would also imply constraints on data structures I do not control.

      I consider the given alternative a workable workalike in the case where you can control the data structures. I would feel inclined to use it if the acquire/release, register/unregister, make/release function pairs already existed with the somewhat funky proposed signature.

      I would also expect that the code generated by a C++ compiler for the same thing done with RAII to be pretty much the same as the one generated for my C code. I agree that with C++ it’s easy to get the desired behavior without having to work on the control flow myself.

      I came up with the described structure in a context, i.e. writing Linux drivers in C, where I don’t control all the code and I think that for this context the described approach holds utility.

      1. 2

        Yup, if you’re writing code for the Linux kernel then you’re basically stuck writing code that’s verbose and difficult to maintain. With FreeBSD, the kernel headers generally work in C++ code, so you can use C++ within your own module but Linux uses a lot of things that are not in the common subset of C and C++ throughout the headers and so it’s basically impossible to use any language other than C.

      2. 1

        There C version above will get flagged on code review by maintainers. And the C++ part is of course is wholly irrelevant to Linux kernel development, could as well be in Haskell.

      3. 3

        I like it. This is an example of pattern I’ve seen come up over and over again. We say we’re using “general purpose programming languages”, but really they are algorithm-definition languages. In the case of initialization and shutdown, you’ve got two algorithms, but really you want to specify one particular arrangement of interdependent resources. When you have to write two or more procedures to realize such an abstraction, you always have to worry about them staying consistent – usually in a way in which information is missing, making it difficult to analyze! Absent defining a DSL, one solution is to make a single parameter that is dynamically overloaded to perform two separate algorithms, utilizing common structure.

        Another case where we’ve seen this is the trend in IMGUI and React.js-style UI frameworks, and especially the “hooks” idea. A single render function is overloaded to provide initialization, shutdown, state management, dom updating, static-html rendering, etc. If you have an external DSL – like some html/xml-extensions (a la older Angular) – or if you have a purpose-built compiler – like Svelte – then you don’t need to play this game. When you do play this game, you need to be careful to write code that respects both execution contexts. For example, see the weird linters that enforce the “rules of hooks” in React.js, as well as all the caveats for server-side rendering frameworks.

        1. 2

          Thought of another good example: Functional Lenses.

          One way to think of a lens is as a pair of getter and a setter functions. However, you can define both with a single parameterized “update” function. The parameter is a procedure applied to the part-to-be-updated within an expression to rebuild the updated whole. You can use the update function as a getter if you call it with a procedure that unwinds the stack, returning its argument. To use it as a setter, you simply parameterize with a function that ignores its argument and returns a constant.

        2. 2

          This looks very much like code seen through lens of an assembly programmer. Cleanup after failed init is similar to shutdown cleanup, so why not just jump straight in there to reuse a few instructions?

          1. 1

            Yes, that’s very much what I saw :D Which brought me to your question, reformulated for the scope of the article: The code’s already there, but how to reach it, in C?

          2. 2

            You could manage this with a counter?

            int driver_init(struct foo *p) {
              int rc = 0;
              int rescount = 0;
            
              p->a = acquire_a();
              if (!p->a) { rc = -EBADNESS; goto error; }
              rescount++; // got one
            
              p->b = acquire_b();
              if (!p->b) { rc = -ETHING; goto error; }
              rescount++; // got two
            
              // successful return
              return rc;
            error:
              driver_destroy(p, rescount);
              return rc;
            }
            
            void driver_shutdown(p) {
              driver_destroy(p, INT_MAX);
            }
            
            void driver_destroy(struct foo *p, int rescount) {
              // Note reverse order!
              if (rescount > 0) {
                release_a(p->a);
                rescount--;
              }
              if (rescount > 0) {
                release_b(p->b);
                rescount--;
              }
            }
            

            You can get rid of the reverse ordering if you are happy to include the number of resource in driver_destroy() and do rescount = total_number_of_resources - rescount at the top (and always decrement, and test for <= 0):

            void driver_destroy(struct foo *p, int rescount) {
              int rescount = 2 - rescount; // we can see we have two resources, so can verify this number
              if (rescount <= 0) {
                release_a(p->a);
              }
              rescount--;
            
              if (rescount <= 0) {
                release_b(p->b);
              }
              rescount--;
            }
            
            1. 3

              It seems that you are releasing resources in the order you acquired them. Typically you want to release them in the reverse order, i.e. release b then a as b might depend on a. Did you intend to switch release_a and release_b in the second example?

              The destroy function can be rewritten without doing arithmetic at runtime:

              switch(rescount) {
                   case 2:
                      release_b(p->b); // note fallthrough
                  case 1:
                     release_a(p->a); // note fallthrough
                  case 0:
                       // nothing
              }
              

              The downside is that you now have absolute numbers which you have to match to the increments in the driver_init function.

              By turning the cases into goto labels you win the ability to give names to them instead of numbers.

              1. 2

                Thinking about this more, you either need to couple the number of resources to free into the driver_shutdown method (seems very fragile) or add the INT_MAX in as another case label (although since we’re no longer counting, this could be any sentinel instead of INT_MAX).

                1. 2

                  Did you intend to switch release_a and release_b in the second example?

                  No I didn’t, thanks, that was an error.

                  I think the switch/case is nice (avoids the decrement boilerplate). It has the same issue with coupling the freeing code to the number of items, but is a little more obvious (2 -> 1 -> 0) rather than (rescount = 2 - rescount).

              2. 2

                When I think about duplication these days I ask, “what can go wrong?” If things going out of sync will be caught in tests, then I prefer a little duplication to the extra coiling in the control flow. If it won’t get caught in tests, I will first try to add the tests to catch it. But these things highly depend on context.

                Do you have a redundant return in your code? Eliding details:

                int driver_init_or_shutdown(struct parent_dev *parent, bool init) {
                	int rc = 0;
                        ....
                	if (init) {
                		...
                		return rc;  // A
                	} else {
                                ...
                		goto out_full;
                	}
                	return rc;  // B
                
                out_full:
                        ...
                	return rc;
                }
                

                A and B seem duplicative here. You could drop either. Dropping B looks nicer to me.

                1. 2

                  Thanks for pointing out the duplicate return in the code sample. I have just updated the article accordingly. Removing the return at B also allows me to get rid of the goto out_full and the corresponding label.

                  As a non-Lobsters-commenting reader has pointed out, some code bases only allow goto for error handling. The original article used goto out_full for the regular shutdown flow. This is gone now, gotos are only used in error cases.

                  1. 3

                    That looks much cleaner! I’d also get rid of the else clause, and move it down to the same indentation as the rest of the shutdown code:

                      data = parent->private_data;
                      destroy_c(data->c);
                    out_no_c:
                      ...
                    
                    1. 1

                      Hey, good point! I have just updated the article. Thanks for your feedback! :)

                2. 2

                  I’m not super familiar with how you’d do this in C, but what would make the following work (I’m not 100% certain of how to do lambdas in C - I’ve used java-ish syntax ( () -> foo ) is a function point to something which returns foo.

                  
                  struct cleanup_data = init_cleanup_data();
                  
                  int driver_init(struct parent_dev *parent) {
                    int rc = 0
                    
                    rc = with_cleanup(rc,
                      () -> { data = kzalloc(sizeof struct driver_data, GFP_KERNEL); return data; }, // allocation
                      () -> { kfree(data); return -ENOMEN; }); // cleanup
                    rc = with_cleanup(rc,
                      () -> { data->a = acquire_a(); return data->a; }, // allocate
                      () -> { release_a(data->a); return -ENOSPC; }); // cleanup
                    rc = with_cleanup(rc,
                      () -> { return register_b(&data->b); },
                      () -> { unregister_b(&data->b); return -ENOSPC; });
                    rc = with_cleanup(rc,
                      () -> { data->c = make_c(); return IS_ERR(data->c); },
                      () -> { return -ENOMEM; };
                  
                    return rc;
                  }
                  
                  int with_cleanup(int rc, int (*allocFunction)(), int (*cleanupFunction)()) {
                    push_cleanup(cleanup_data, cleanupFunction);
                    if (!rc) return rc;
                    if (!allocFunction()) return 0;
                    while (cleanup = pop_cleanup(cleanup_data)) {
                      rc = cleanup();
                    }
                    return rc;
                  }
                  

                  The main benefit is that resource init and cleanup are very close to each other. This makes it more difficult to make a mistake.

                  1. 1

                    There are a lot of better ways of doing this in pretty much any language that isn’t C. C++ makes it a lot more concise and less error-prone and is ABI compatible with C. The only reason that it can’t be used here is that Linus has a religious aversion to the language and ensures that the kernel headers can’t be parsed in C++ mode.

                    1. 1

                      C has no lambdas. Each place were you have “() ->” would have to be a separate function.

                    2. 1

                      The reason why no one does this is because it introduces an extra level of indentation, and you get around 4, perhaps 5 levels total. Of course the modern response is a systemized version of the second strategy in the form of devm resources.

                      1. 1

                        devm_ resources do scratch the same itch. An issue I encounter with them is that they don’t combine well with manually managed ones. Expanding on the article, if A and C are managed, but B is not, the devm machinery will release C and A in the correct order, first C, then A. But there’s no correct place to release B, I have to do it either before releasing C,A or after. Otherwise I have to write an extra function to release B and plug it into devm.

                      2. 1

                        I would change if (init) to if (!init) goto cleanup. It saves an indent for the meat of the function.

                        1. 1

                          C really lacks defer. Not much more to be said here. The split resource allocation and release required by goto doesn’t work well for me, I find it too easy to get lost in.

                          1. 1

                            I’d also love a defer in C. It works well when you want to clean up something at the end of the current scope.

                            However, in my example, the _init and _shutdown functions are called at different times. If I defer in the init function, all the deferred cleanups will be run at the end of the init function - but I want them to run only in case of error! This could be managed by e.g. looking at rc. But then the cleanup code for _shutdown still has to be written, leaving the duplication mentioned in the problem statement.

                            Defer is much more general in Go because you can start a concurrent function with go. You can extend the lifetime/runtime of that function as desired and use defer to clean up when it has reached the end.

                            1. 1

                              Doesn’t Go’s defer run on end-of-function instead of end-of-scope? I’m thinking more of Zig, which also includes errdefer https://ziglang.org/documentation/0.9.0/#errdefer

                              1. 1

                                You are right, Go’s defer runs at the end of the function.

                                I just had a look at the Zig link. errdefer is a cool feature. This solves the first problem I state above - not running the cleanup code when returning success on init. It does not solve the problem of having to write a separate shutdown function that consists in essence of all the errdefered calls in reverse order.

                                The code presented in the article covers both needs. Regarding it being “easy get lost in”, there’s - unfortunately - a bit of discipline required to get it right. This is the bad news. The good news is that it’s not the kind of absolute discipline you need for, say, memory management. To check whether you got it right, you read the function from both ends at the same time (e.g. in two editor panes) and verify that the order of gotos matches the order of the labels and release actions.