Why not? Macros can be very useful. For example, say I have a dispatch table to call functions with a common signature and set of local variables. If there are 30 different functions, a macro defining the function and declaring the common variables means that if something changes I only have to change it in one place. This is more than just an ease-of-coding thing: if I change from signed to unsigned or change the width of an integer and forget to change it in one place, there can be serious and hard-to-find consequences.
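As a rough sketch of that pattern (handler names and types here are hypothetical, not from any particular codebase): the shared signature and the common locals each live in one macro, so a signedness or width change happens in exactly one place.

```c
#include <assert.h>
#include <stdint.h>

/* Shared signature and shared locals live in one place; if the integer
 * type ever changes from int32_t to uint32_t, only these two macros
 * change. All names here are hypothetical. */
#define HANDLER_ARGS (int32_t input)
#define COMMON_LOCALS int32_t result = 0, scratch = 0

static int32_t handle_double HANDLER_ARGS {
    COMMON_LOCALS;
    scratch = input;
    result = scratch * 2;
    return result;
}

static int32_t handle_negate HANDLER_ARGS {
    COMMON_LOCALS;
    (void)scratch;
    result = -input;
    return result;
}

/* The dispatch table: every entry has the same signature by construction. */
static int32_t (*dispatch[]) HANDLER_ARGS = { handle_double, handle_negate };
```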
Don’t use fixed-size buffers.

Always use static, fixed-size buffers allocated in the BSS, if you can get away with it (that is, you know the maximum size at compile time). Allocation can fail at runtime, and adding checks everywhere for this is error-prone. If you’re allocating and freeing chunks of memory at runtime, you run the risk of use-after-free, reference miscounts, etc.

If the size of a block isn’t known until runtime, but is known at startup, allocate the necessary memory at startup and free it at shutdown.

Only as a last resort should you be doing allocation and freeing repeatedly during runtime, when the set of objects and their sizes depends on data only accessible while running.
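The allocate-at-startup case might look like this minimal sketch (the pool structure and all names are hypothetical): one checked allocation at startup, one free at shutdown, and no allocation checks anywhere else in the program.

```c
#include <assert.h>
#include <stdlib.h>

/* Size unknown at compile time but known at startup (say, from a config
 * file): allocate once, fail fast, free at shutdown. */
struct pool {
    size_t count;
    double *items;
};

static int pool_init(struct pool *p, size_t count_from_config) {
    p->items = calloc(count_from_config, sizeof *p->items);
    if (p->items == NULL)
        return -1;              /* the only allocation check needed */
    p->count = count_from_config;
    return 0;
}

static void pool_shutdown(struct pool *p) {
    free(p->items);
    p->items = NULL;
    p->count = 0;
}
```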
I feel the writer is not so experienced with C.
Not only are some of the recommendations generic, like “Prefer maintainability” (when should we not prefer maintainability?) or “Use a disciplined workflow” (yes, but what kind of workflow?), some of them go against common C best practices, like “Do not use a typedef to hide a pointer or avoid writing ‘struct’”.

Given that opaque pointers are standard practice in the C library and are highly recommended for hiding complexity and allowing implementations to change, I don’t know where he got these ideas.
Opaque pointers hidden behind typedefs are something I’ve never been totally comfortable with, though I guess I’ve been using them without knowing! Where in libc are they used?
typedef void* lobster_handle_t; is probably the most common way–of which I’m aware–of exposing types and structs for public consumption without giving away internal implementation details to users. This is doubly useful if you have, for example, the same interface implemented differently on different platforms: your _win32.c and _posix.c variants are chosen based on #ifdefs, but user code including your headers only ever sees the opaque pointer.

Wouldn’t a lobster handle just be a claw?

Or the tail
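To make the shape concrete, here’s a minimal sketch of a lobster_handle_t-style API (names invented to match the example above). One common variation: typedef a pointer to an incomplete struct rather than void *, which keeps type checking while still hiding the layout.

```c
#include <assert.h>
#include <stdlib.h>

/* --- public header side: users see only an opaque handle --- */
typedef struct lobster *lobster_handle_t;  /* struct lobster is incomplete here */
lobster_handle_t lobster_open(int claws);
int lobster_claws(lobster_handle_t h);
void lobster_close(lobster_handle_t h);

/* --- implementation side (the _posix.c / _win32.c variant) --- */
struct lobster {
    int claws;              /* layout can change without breaking users */
};

lobster_handle_t lobster_open(int claws) {
    lobster_handle_t h = malloc(sizeof *h);
    if (h != NULL)
        h->claws = claws;
    return h;
}

int lobster_claws(lobster_handle_t h) { return h->claws; }

void lobster_close(lobster_handle_t h) { free(h); }
```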
Forward declaration is the new hotness:

It brings no benefits to C code because all pointer types implicitly cast to each other, but in C++ they don’t and it’s definitely preferred there.

Whoa, no they don’t.

void * implicitly converts to any other type of (non-function) pointer, and vice versa, but that’s it. (Many compilers do allow for function pointer <-> void * conversions, even implicitly, but I think that’s an extension for POSIX compatibility.)

MSVC/GCC/clang all allow it, but they do warn about it by default.

T isn’t a valid type name in C. You have to use struct T unless you supply a typedef. FILE, for example.
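A minimal sketch of the forward-declaration style under discussion (parser is a hypothetical name): the header exposes only an incomplete struct, so callers can hold pointers to it but must write struct T unless a typedef is supplied.

```c
#include <assert.h>

/* Header side: forward declaration only; users can hold a
 * struct parser * but cannot see or touch its members. */
struct parser;                          /* incomplete type */
int parser_depth(const struct parser *p);

/* Implementation side: the complete definition. */
struct parser {
    int depth;
};

int parser_depth(const struct parser *p) { return p->depth; }
```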
Correct me if I’m wrong, but doesn’t one usually use a FILE * rather than working with a raw FILE?

Sorry, I was thinking just “opaque pointer”, not one hidden behind a typedef. An example of a completely opaque type (from the perspective of the standard library) is va_list. Extending beyond the C standard library, you have things like pthread_t in POSIX (which could be “the standard library” depending on your definition), which is of unspecified type.

Keep in mind, va_list is not necessarily a pointer, and it’s only opaque in the sense that its contents are undefined and unportable. On x86-64 Linux, for example, it’s a 24-byte struct, and may be defined (depending on your compiler, headers, and phase of moon) as:

struct __va_list_struct {
    unsigned int gp_offset;
    unsigned int fp_offset;
    union {
        unsigned int overflow_offset;
        char *overflow_arg_area;
    };
    char *reg_save_area;
};
Right, I was trying to think of an example that is an explicitly opaque type hiding behind a typedef. It’s always interesting to see how POSIX and/or C sometimes mandates some things as completely undefined by type, but not others. jmp_buf has to be an array type, for example, but is not specified beyond that, and va_list is explicitly of any type at all.

time_t

Standard C does not mandate a definition at all (it could be an integer, could be a float, could be a structure). POSIX defines it though.

Time is an illusion. Lunchtime doubly so.
FILE * is the more visible example.
if I change from signed to unsigned or change the width of an integer and forget to change it in one place, there can be serious and hard-to-find consequences.

Agree, which is why using typedefs to make maximal use of C’s sad type system is a better move than a mere macro. Also, macros can do weird things when expanded in code, and it’s easy to end up with a codebase that is unreadable and ungreppable because of having to continually expand non-intuitive macros. They’re handy, in moderation, but overuse is not so great.

Only as a last resort should you be doing allocation and freeing repeatedly during runtime, when the set of objects and their sizes depends on data only accessible while running.

Spoken like a true Fortran programmer! ;)
More seriously, anything that is actually interactive and of any real practical use is easier coded with dynamic allocation. Also, the number of people that properly write fixed-size allocation code without leaving gigantic security holes and undefined behavior open is small. Better just to use malloc and free and know that you have problems than to hope somebody didn’t mismatch a buffer size with a differently-spec'ed memmove call.

That said, in a library, if you don’t allow users to specify their own allocation routines you are bad and you should feel bad.
~
Overall, I agree that this advice is not so great, probably because the author hasn’t had to deal with producing libraries for others to consume. That very much colors how these things are evaluated.
curls up in a ball, rocks back and forth, crying
They’re handy, in moderation, but overuse is not so great.

That’s true of just about anything, but yes, macros are a sharp tool. It’s very easy to hurt yourself if they’re not used carefully, but like any sharp tool sometimes there’s a good use case. Never say never. :)
More seriously, anything that is actually interactive and of any real practical use is easier coded with dynamic allocation.

True, but not everything need be interactive. The most critical code I work on right now is highly dynamic at runtime, but does no memory allocation after startup. We calculate the sizes of various structures based on parameters provided by the system at startup, and allocate memory once. This is necessary for various reasons, but most importantly because of performance; we deal with tens of thousands of work units a second, of varying size. Repeatedly allocating and freeing blocks would rapidly result in fragmentation.
We originally thought about allocating fixed-size blocks, since most modern allocators would handle that well so long as there weren’t any other allocations happening. Things like tcmalloc would still probably be okay, but at the end of the day we decided to use a static allocation scheme with what amounts to a large array with chase pointers in each slot, making allocation an O(1) operation with zero fragmentation (basically a slab allocator). Additionally, we can use mlock to keep those pages in memory to avoid any indeterminacy with swapping.

Variable-sized data is fed into a ring buffer with chase pointers, and we keep pointers to things in the ring in the slab-allocated structures; we never copy out of the ring. We track the ring pointers and invalidate any data in a block that gets overwritten while in use (which is surprisingly cheap if you do it right).
(Sorry, that was a big digression, but I really like working on that code.)
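Not that code, obviously, but the slab-with-chase-pointers idea can be sketched roughly like this (slot count and payload size are made up): a static array threaded onto a free list, so both alloc and free are O(1) pointer swaps with zero fragmentation.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_COUNT 64

struct slot {
    struct slot *next_free;      /* the "chase pointer" */
    unsigned char payload[120];  /* fixed-size work unit, size made up */
};

static struct slot slots[SLOT_COUNT];  /* lives in the BSS, sized at compile time */
static struct slot *free_head;

/* Thread every slot onto the free list once, at startup. */
static void slab_init(void) {
    for (size_t i = 0; i + 1 < SLOT_COUNT; i++)
        slots[i].next_free = &slots[i + 1];
    slots[SLOT_COUNT - 1].next_free = NULL;
    free_head = &slots[0];
}

/* O(1): pop the head of the free list; NULL means the slab is full. */
static struct slot *slab_alloc(void) {
    struct slot *s = free_head;
    if (s != NULL)
        free_head = s->next_free;
    return s;
}

/* O(1): push the slot back onto the free list. */
static void slab_free(struct slot *s) {
    s->next_free = free_head;
    free_head = s;
}
```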
Also, the number of people that properly write fixed-size allocation code without leaving gigantic security holes and undefined behavior open is small.

I would argue that writing strncpy(foo, bar, BUFSIZE) is less error-prone than strncpy(foo, bar, dynamically_allocated_size_that_changes). (I admit that’s a contrived example.)

Again, obviously, not everything can work this way. There are times when you have to use dynamic allocation, but, at least in my experience, people have a bigger problem tracking reference counts and avoiding use-after-free than they do dealing with fixed-size buffers.
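One caveat worth flagging even in the fixed-size version (this sketch reuses the foo/bar/BUFSIZE names from the example above): strncpy does not NUL-terminate when the source is longer than the limit, so the constant-size call still needs an explicit terminator.

```c
#include <assert.h>
#include <string.h>

enum { BUFSIZE = 8 };
static char foo[BUFSIZE];

/* Copy at most BUFSIZE-1 bytes and terminate explicitly, because
 * strncpy leaves the buffer unterminated on truncation. */
static void copy_name(const char *bar) {
    strncpy(foo, bar, BUFSIZE - 1);
    foo[BUFSIZE - 1] = '\0';
}
```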
it’s easy to end up with a codebase that is unreadable and ungreppable because of having to continually expand non-intuitive macros

That’s true, although macros are also sometimes used to fix the problem that C codebases are often hard to grep in the first place. The Linux kernel uses a whole series of WARN macros partly for that reason. Lots easier to grep for WARN_ONCE in a big source tree than to pore through every inline use of printk.
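A sketch of a WARN_ONCE-style macro (modeled on the kernel’s idea, not its actual definition; the warn_count counter exists only so the once-per-call-site behavior is observable):

```c
#include <stdio.h>

static int warn_count;  /* demonstration only; real code would just print */

/* Fires at most once per call site, and gives you one greppable token
 * (WARN_ONCE) instead of many ad-hoc fprintf calls scattered inline. */
#define WARN_ONCE(msg)                                    \
    do {                                                  \
        static int warned_;                               \
        if (!warned_) {                                   \
            warned_ = 1;                                  \
            warn_count++;                                 \
            fprintf(stderr, "WARN %s:%d: %s\n",           \
                    __FILE__, __LINE__, (msg));           \
        }                                                 \
    } while (0)
```

The static flag inside the do/while gives each expansion its own state, which is what makes it once per call site rather than once per program.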