This is probably the most important principle in C API design.
I code mainly in C++, but I keep using this design for the C wrappers I need to create for my C++ APIs (to use as DLL exports and as binding points to other languages.)
I was just reading K&R C the other day. It seems in the first version of C, declarations were optional. Object model in C is surprisingly elegant. If data and methods are separate then they can be evolved separately - only C allows this. On a trivial note, copy paste is better than Inheritance because the copied code can evolve separately instead of changing every time the base class changes.
In terms of generality,
Pointers > Lexical Scope
Function Pointers > Closures, Virtual Methods
Gotos > Exceptions
Arrays, Structs > Objects
Co-routines > Monads
C with namespaces, pattern matching, garbage collection, generics, nested functions and defer is the C++ that I wish had happened. Go is good but I miss the syntax of C. I recently came across the Pike scripting language, which looks surprisingly clean.
It seems in the first version of C, declarations were optional.
Yup, which sucked. It combined the lack of compiler checks of a dynamic language with the data-corruption bugs of native code. For instance, what happens when you pass a long as the third argument, to a function whose implementation takes an int for that parameter? 😱
Object model in C is surprisingly elegant. If data and methods are separate then they can be evolved separately - only C allows this.
Maybe I'm unsure what you're getting at, but many languages including Objective-C, Swift and Rust allow methods to be declared separately from the data, including adding more methods afterwards, even in separate binaries.
copy paste is better than Inheritance because the copied code can evolve separately instead of changing every time the base class changes.
But it's worse than inheritance because, when you fix a bug in the copied code, you have to remember to also fix it every place it was pasted. I had a terrible time of this in an earlier job where I maintained a codebase written by an unrepentant copy/paster. This is the kind of nightmare that led to the DRY principle.
For instance, what happens when you pass a long as the third argument, to a function whose implementation takes an int for that parameter? 😱
Usually nothing, or rather, exactly what you would want 🙂. Last I checked, K&R C requires function parameters to be converted to the largest matching integral type, so long and int get passed the same way. All floating point parameters get passed as double. In fact, I remember when ANSI C came out that one of the consequences was that you could now have actual float parameters. Pointers are the same size anyway, and there were no struct-by-value parameters.
It still wasn't all roses: messing up argument order or forgetting a parameter. Oops. So function prototypes: 👍👍
#include <stdio.h>

/* Old-style (K&R) definition: no prototype, parameter types declared separately. */
int a( a, b )
int a;
int b;
{
    return a + b;
}

int main()
{
    long c = 12;    /* a long passed where the function expects an int */
    int b = 3;
    printf("%d\n", a(c, b));
}
[/tmp]cc -Wall hi.c
[/tmp]./a.out
15
Except, of course, when the sizes differed.
No. The sizes do differ in the example. Once again: arguments are passed (and received) as the largest matching integral type.
A lot of this is assuming arguments are passed in registers. Passing on the stack can result in complete nonsense as you could have misaligned the stack, or simply not made a large enough frame.
I don't mean copy paste everything, use functions for DRY of course … just to get the effect of inheritance, copy paste is better. Inheritance, far from the notions of biology or taxonomy, is similar to a lawyer contract that states all changes of A will be available to B, just like land inheritance. Every time some maintainer changes a class in React, Angular, Ruby, Java, C++, Rust, Python frameworks and libraries everyone has to change their code. If for every release of a framework you have to rewrite your entire code, calling that code reuse is wrong and fraudulent. If we add any method, rename any method, or change any implementation of any method that is not a trivial fix, we should create a new class instead of asking millions of developers to change their code.
when you fix a bug in the copied code, you have to remember to also fix it every place it was pasted.
If instead we used copy paste, there would be no inheritance hierarchy but just flattened code, if that makes sense, and you can modify it without affecting other developers. If we want to add new functionality to an existing class we should use something like plugins/delegation/mixins but never modify the base class … but absolutely no one uses or understands this pattern and everyone prefers to diddle with the base class.
In C such massive rewrites won't happen, because everything is manually wired instead of automatically being inherited. You can always define new methods without bothering if you are breaking someone's precious interface. You can always nest structs and cast them to reuse code written for the previous struct. Combined with judicious use of function pointers and vtables you will never need to group data and code in classes.
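A minimal sketch of that struct-nesting trick, with made-up names (shape and circle are purely illustrative):
#include <stdio.h>

/* Code written against the "base" struct. */
struct shape { const char *name; };

void shape_print(struct shape *s) { printf("shape: %s\n", s->name); }

/* A later struct embeds the base as its first member, so a pointer to it can
   be cast to struct shape * and reused with the existing functions. */
struct circle {
    struct shape base;   /* must be the first member for the cast to be valid */
    double radius;
};

int main(void) {
    struct circle c = { { "circle" }, 2.0 };
    shape_print((struct shape *)&c);   /* reuse the code written for struct shape */
    return 0;
}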
Every time some maintainer changes a class in React, Angular, Ruby, Java, C++, Rust, Python frameworks and libraries everyone has to change their code.
That is simply not true. There are a lot of changes you can make to a class without requiring changes in subclasses. As a large-scale example, macOS and iOS frameworks (Objective-C and Swift) change in every single OS update, and the Apple engineers are very careful not to make changes that require client code to change, since end users expect that an OS update will not break their apps. This includes changes to classes that are ubiquitously subclassed by apps, like NSView, NSDocument, UIViewController, etc. I could say exactly the same thing about .NET or other Windows system libraries that use OOP.
I'm sure that in many open source projects the maintainers are sloppy about preserving source compatibility (let alone binary), because their "customers" are themselves developers, so it's easy to say "it's easier to change the signature of this method and tell people to update their code". But that's more laziness (or "move fast and break stuff") than a defining feature of inheritance.
In C such massive rewrites won't happen
Yes, because everyone's terrified of touching the code for fear of breaking stuff. I've used code like that.
How?
In C you would just create a new function; rightfully, touching working code except for bug fixes is taboo. I can probably point to kernel drivers that use C vtables that haven't been touched in 10 years. If you want to create an extensible function, use a function pointer. How many times has the sort function been reused?
OO programmers claim that the average Joe can write reusable code by simply using classes. If even the most well-paid, professional programmers can't write reusable code and writing OO code requires high training, then we shouldn't lie about OO being for the average programmer. Even if you hire highly trained programmers, code reuse is fragile, requiring constant vigilance of the base classes and interfaces. Why bother with fragile base classes at all?
Technically you can avoid this problem by never touching the base class and always adding new classes and interfaces. I think classes should have a version suffix, but I don't think it will be a popular idea and it requires too much discipline. OO programmers on average prefer adding a fly method to a fish class as a quick fix over creating a bird class, and that's just a disaster waiting to happen.
I don't understand why you posted that link. Apple release notes describe new features, and sometimes deprecations of APIs that they plan to remove in a year or two. They apply only to developers, of course; compiled apps continue to work unchanged.
OO is not trivial, but it's much better than resorting to flat procedural APIs. Zillions of developers use it on iOS, Mac, .NET, and other platforms.
My conclusion - OO is fragile and needs constant rewrites by developers who use OO code and procedural APIs are resilient.
Your conclusion is not supported by evidence. Look at a big, widely used C library, such as ICU or libavcodec. You will have API deprecations and removals. Both of these projects do it nicely so you have foo2(), foo3() and so on. In OO APIs, the same thing happens, you add new methods and deprecate the old ones over time. For things like glib or gtk, the churn is even more pronounced.
OO covers a variety of different implementation strategies. C++ is a thin wrapper around C: with the exception of exceptions, everything in C++ can be translated to C (in the case of templates, a lot more C) and so C++ objects are exactly like C structs. If a C/C++ struct is exposed in a header then you can't add or remove fields without affecting consumers, because in both languages a struct can be embedded in another and the size and offsets are compiled into the binary.
In C, you use the opaque pointer idiom to avoid this. In C++ you use the pImpl pattern, where you have a public class and a pointer to an implementation. Both of these require an extra indirection. You can also avoid this in C++ by making the constructor for your class private and having factory methods. If you do this, then only removing fields modifies your ABI, because nothing outside of your library can allocate it. This lets you put fast paths in the header that directly access fields, without imposing an ABI / API contract that prevents adding fields.
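For the C half of that, a minimal sketch of the opaque-pointer idiom (widget and its functions are hypothetical names), shown here as a header plus implementation: the public header only forward-declares the struct, so the implementation can add or remove fields without breaking consumers.
/* widget.h -- public header: the struct is only forward-declared */
typedef struct widget widget;              /* opaque: size and fields are hidden */
widget *widget_create(int initial_value);
int     widget_value(const widget *w);
void    widget_destroy(widget *w);

/* widget.c -- private implementation: fields can change freely */
#include <stdlib.h>
struct widget {
    int value;                             /* new fields can be added here later */
};
widget *widget_create(int initial_value) {
    widget *w = malloc(sizeof *w);
    if (w) w->value = initial_value;
    return w;
}
int widget_value(const widget *w) { return w->value; }
void widget_destroy(widget *w) { free(w); }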
In C++, virtual methods are looked up by vtable offset, so you can't remove virtual functions and you can't add virtual functions if your class is subclassed. You also can't change the signature of any existing virtual methods. You can, however, add non-virtual methods because these do not take part in dynamic dispatch and so are exactly the same as C functions that take the object pointer as the first parameter.
In a more rigid discipline, such as COM, the object model doesn't allow directly exposing fields and freezes interfaces after creation. This is how most OO APIs are exposed on Windows and we (Microsoft) have been able to maintain source and binary compatibility with programs using these APIs for almost three decades.
In Objective-C, fields (instance variables) are looked up via an indirection layer. Roughly speaking, for each field there's a global variable that tells you its offset. If you declare a field as having package visibility then the offset variable is not exposed from your library and so can't be named. Methods are looked up via a dynamic dispatch mechanism that doesn't use fixed vtable offsets, and so you are able to add both fields and methods without changing your downstream ABI. This is also true for anything that uses JIT or install-time compilation (Java, .NET).
You raise the problem of behaviour being automatically inherited, but this is an issue with the underlying problem, not with the OO framing. If you are just consuming types from a library then this isn't an issue. If you are providing types to a library (a way of representing a string that's efficient for your use, or a new kind of control in a GUI, for example), then the library will need to perform operations on that type. A new version of the library may need to perform more operations on that type. If your code doesn't provide them, then it needs to provide some kind of default. In C, you'd do this with a struct containing callback function pointers that carried its size (or a version cookie) in the first field, so that you could dispatch to some generic code in your functions if the library consumer didn't provide an implementation. If you're writing in an OO language then you'll just provide a default implementation in the superclass.
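A rough sketch of that C convention, with invented names (filter_ops, library_run): the consumer fills in a callback table whose first field records its size, and the library falls back to a default for any hook the consumer was compiled without.
#include <stddef.h>   /* offsetof, size_t */

/* Callback table supplied by the library consumer. Hooks are only ever
   appended; the size field records how much of the struct the consumer knows about. */
struct filter_ops {
    size_t size;                  /* consumer sets this to sizeof(struct filter_ops) */
    void (*process)(void *data);  /* hook present since version 1 */
    void (*flush)(void *data);    /* hook added in a later library version */
};

/* True if the consumer's table is big enough to contain the given field. */
#define OPS_HAS(ops, field) \
    ((ops)->size >= offsetof(struct filter_ops, field) + sizeof((ops)->field))

static void default_flush(void *data) { (void)data; /* generic fallback in the library */ }

void library_run(const struct filter_ops *ops, void *data) {
    if (OPS_HAS(ops, process) && ops->process)
        ops->process(data);
    if (OPS_HAS(ops, flush) && ops->flush)
        ops->flush(data);
    else
        default_flush(data);      /* consumers built before flush existed get the default */
}

static void my_process(void *data) { (void)data; }

int main(void) {
    struct filter_ops ops = { sizeof ops, my_process, 0 };   /* no flush provided */
    library_run(&ops, 0);
    return 0;
}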
Oh, and you don't say what kernel you're referring to. I can point to code in Linux that's needed to be rewritten between minor revisions of the kernel because a C API changed. I can point to C++ code in the XNU kernel that hasn't changed since the first macOS release when it was rewritten from Objective-C to C++. Good software engineering is hard. OO is not a magic bullet, but going back to '70s-style designs doesn't avoid the problems unless you're also willing to avoid writing anything beyond the complexity of things that were possible in the '70s. Software is now a lot more complex than it was back then. The Version 6 UNIX release was only about 83KLoC: individual components of clang are larger than that today.
It absolutely is. Please reuse code from an earlier version of any framework released in the last 50 years. OO was sold as the magic bullet that will solve all reuse and software engineering problems.
Do you think homeopathy is medicine just because people dress up and play the role of doctors doing science?
How many times has the sort function been reused by using function pointers? Washing machines don't make clothes dirtier than the clothes you put in.
Both of these projects do it nicely so you have foo2(), foo3() and so on.
If they are doing it that way, then that's the way to go. Function signatures are the only stable interface you need. Don't use fragile interfaces, classes and force developers to rewrite every time a new framework is released because someone renamed a method.
For the rest of your arguments, why even bother with someone else's vtables when you can build your own, trivially.
My point is simply this - How is rewriting code, code reuse?
This is what Windows and Mac OS programmers do every day. My experience with COM is that the Windows APIs built on it have great API/ABI stability.
I don't know much about COM but if it provides API/ABI stability then that's great and that's what I am complaining about here. It seems to be an IPC of sorts; how would it compare to REST, which can be implemented on top of basic functions?
COM is a language-agnostic ABI for exposing object-oriented interfaces. It has been used to provide stable ABIs for object-oriented interfaces for around 30 years to Windows APIs. It is not an IPC mechanism, it is a binary representation. It is a strong counter-example to your claim that OO APIs cannot be made stable (and one that I mentioned already in the other thread).
I'm not sure about the IPC parts (there is a degree of "hosting"); however, DCOM provides RPC with COM.
It absolutely is. Please reuse code from an earlier version of any framework released in the last 50 years. OO was sold as the magic bullet that will solve all reuse and software engineering problems.
I've reused code written in C, C++, and Objective-C over multiple decades. Of these, Objective-C is by a very large margin the one that caused the fewest problems. Your argument is "OO was oversold, so let's use the approach that was used back when people found the problems that motivated the introduction of OO".
How many times has the sort function been reused by using function pointers? Washing machines don't make clothes dirtier than the clothes you put in.
I don't know what this means. Are you trying to claim that C standard library qsort is the pinnacle of API design? It provides a compare function, but not a swap function, so if your structures require any kind of copying beyond a byte-by-byte copy then it's a problem. How do you reuse C's qsort with a data type that isn't a contiguous buffer? With C++'s std::sort (which doesn't use function pointers), you can sort any data structure that supports random access iteration.
If they are doing it that way, then that's the way to go. Function signatures are the only stable interface you need.
That's true, if your library is producing types but not consuming them. If code in your library needs to call into code provided by library consumers, then this is not the case. Purely procedural C interfaces are easy to keep backwards compatible if they are not doing very much. The zlib interface, for example, is pretty trivial: consume a buffer, produce a buffer. The more complex a library is, the harder it is to maintain a stable API. OO gives you some tools that help, but it doesn't solve the problem magically.
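As a rough illustration of that buffer-in, buffer-out shape, a sketch using zlib's one-shot helpers (compress, uncompress, compressBound); error handling is trimmed and the buffer sizes are arbitrary:
#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(void) {
    const char *input = "hello, hello, hello, hello";
    uLong in_len = (uLong)strlen(input) + 1;       /* include the terminator */

    /* consume a buffer, produce a buffer */
    Bytef comp[128];
    uLongf comp_len = compressBound(in_len);       /* worst-case output size */
    if (compress(comp, &comp_len, (const Bytef *)input, in_len) != Z_OK)
        return 1;

    Bytef out[128];
    uLongf out_len = sizeof out;
    if (uncompress(out, &out_len, comp, comp_len) != Z_OK)
        return 1;

    printf("%lu bytes -> %lu compressed -> %lu back: %s\n",
           (unsigned long)in_len, (unsigned long)comp_len,
           (unsigned long)out_len, (const char *)out);
    return 0;
}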
Don't use fragile interfaces, classes and force developers to rewrite every time a new framework is released because someone renamed a method.
Absolutely none of that is intrinsic to OO. If you rename a C struct field or a function, people will need to rewrite their code. The set of things that you can change without breaking compatibility is strictly larger in an OO language than in a purely procedural language.
For the rest of your arguments, why even bother with someone else's vtables when you can build your own, trivially.
Why use any language feature when you can just roll your own in macro assembly?
Compilers are aware of the semantics and so can perform better optimisations.
Compilers are aware of the semantics and so can give better error messages.
Compilers are aware of the semantics and so can do better type checking.
Consistency across implementations: C library X and C library Y use different idioms for vtables (e.g. compare ICU and glib: two completely different vtable models). Library users need to learn each one, increasing their cognitive burden. Any two libraries in the same OO language will use the same dispatch mechanism.
My point is simply this - How is rewriting code, code reuse?
Far better in OO languages (and far better in hybrid languages that provide OO and generic abstractions) than in purely procedural ones. This isn't the '80s anymore. No one is claiming that OO is a magic bullet that solves all of your problems.
Are you trying to claim that C standard library qsort is the pinnacle of API design?
Personal attacks are not welcomed in this forum or any forum. If you can't use technical arguments to debate you are never going to win.
It is an example of code reuse that absolutely doesnât break.
Absolutely none of that is intrinsic to OO. If you rename a C struct field or a function, people will need to rewrite their code.
It is absolutely intrinsic to OO because interfaces and classes are multiple levels deep. It is a fractal of bad design. Change one thing and everything breaks.
There is a strong culture of not breaking interfaces in C and using versioning but the opposite is true for OO where changing the base class and interface happens for every release. Do you actually have fun rewriting code between every new release of an MVC framework?
Why use any language feature when you can just roll your own in macro assembly?
Again, personal attacks are not welcomed in this forum or any forum.
Vtables are trivial. They are not a new feature. All your optimisations can equally apply to vtables.
This isn't the '80s anymore.
Lies don't become truths just because time has passed.
If code in your library needs to call into code provided by library consumers, then this is not the case.
Use function pointers to provide hooks, or am I missing something?
Are you trying to claim that C standard library qsort is the pinnacle of API design?
Personal attacks are not welcomed in this forum or any forum. If you can't use technical arguments to debate you are never going to win.
That was not an ad hominem, that was an attempt to clarify your claims. It was unclear what you were claiming with references to a sort function. An ad hominem attack looks more like this:
Do you think homeopathy is medicine just because people dress up and play the role of doctors doing science?
This is an ad hominem attack and one that I ignored when you made it, because I'm attempting to have a discussion on technical aspects.
It is an example of code reuse that absolutely doesnât break.
It's also an example of an interface with trivial semantics (it's covered in the first term of most undergraduate computer science courses) and whose requirements have been stable for longer than C has been around. The C++ std::sort template is also stable and defaults to using OO interfaces for defining the comparison (overloads of the compare operators). The Objective-C -sort family of methods on the standard collection classes are also unchanged since they were standardised in 1992. The Smalltalk equivalents have remained stable since 1980.
You have successfully demonstrated that it's possible to write stable APIs in situations where the requirements are stable. That's orthogonal to OO vs procedural. If you want to produce a compelling example, please present something where a C library has changed the semantics of how it interacts with a type provided by the library consumer (for example a plug-in filter to a video processing library, a custom view in a GUI, or similar) and an OO library making the same change has required more code modification.
Absolutely none of that is intrinsic to OO. If you rename a C struct field or a function, people will need to rewrite their code.
It is absolutely intrinsic to OO because interfaces and classes are multiple levels deep. It is a fractal of bad design. Change one thing and everything breaks.
This is an assertion, but it is not supported by evidence. I have provided examples of the same kinds of breaking changes being required in widely used C libraries that do non-trivial things. You have made a few claims here:
Something about interfaces. I'm not sure what this is, but COM objects are defined in terms of interfaces and Microsoft is still able to support the same interfaces in 2021 that we were shipping for Windows 3.1 (though since we no longer support 16-bit binaries these required a recompile at some point between 1995 and now).
Classes are multiple levels deep. This is something that OO enables, but not something that it requires. The original GoF design patterns book recommended favouring composition over inheritance and some OO languages don't even support inheritance. Most modern C++ style guides favour composition with templates over inheritance. Inheritance is useful when you want to define a subtype relationship with code reuse.
Something (OO in general? A specific set of OO patterns? Some OO library that you don't like?) is a fractal of bad design. This is an emotive and subjective claim, not one that you have supported. Compare your posts with the article that I believe coined that phrase: it contains dozens of examples of features in PHP that compose poorly.
There is a strong culture of not breaking interfaces in C and using versioning but the opposite is true for OO where changing the base class and interface happens for every release. Do you actually have fun rewriting code between every new release of an MVC framework?
You're comparing culture, not language features. You can write code today against the OpenStep specification from 1992 that will compile and run fine on modern macOS with Cocoa (I know of some code that has been through this process). That's an OO MVC API that's retained source compatibility for almost 30 years. The only breaking changes were the switch from int to NSInteger for better support for 64/32-bit compatibility and these changes also affected the purely procedural APIs. They were not breaking changes for code targeting 32-bit platforms. The changes over the '90s in the Classic MacOS Toolbox (C APIs) were far more invasive.
A lot of JavaScript frameworks and pretty much everything from Google make breaking API changes every few months but that's an issue of developer culture, not one of the language abstractions.
Why use any language feature when you can just roll your own in macro assembly?
Again, personal attacks are not welcomed in this forum or any forum.
This is not a personal attack. It is your point. You are saying that you should not use a feature of a language because you can implement it in a lower-level language. Why stop at vtables?
Vtables are trivial. They are not a new feature. All your optimisations can equally apply to vtables.
No they can't. It is undefined behaviour to write to the vtable pointer in a C++ object for the lifetime of an object. Modern C++ compilers use this optimisation for devirtualisation. If the concrete type of a C++ object is known at compile time (after inlining) then calls to virtual functions can be replaced with direct calls.
Here is a reduced example. The C version with custom vtables is called in the function can_not_inline; the C++ version using C++ vtables is called in the function can_inline. In both cases, the object is passed to a function that the compiler can't see before the call. In the C case, the language semantics allow this to modify the vtable pointer; in the C++ case they do not. This means that the C++ version knows that the foo call has a specific target, while the C version must be conservative. The C++ version can then inline the call, which doesn't do anything in this trivial example and so elides it completely.
This isn't the '80s anymore.
Lies don't become truths just because time has passed.
No, but claims that were believed to be true and were debunked are no longer claimed. In the '80s, OO was claimed to be a panacea that solved all problems. That turned out to be untrue. Like many other things in programming, it is a set of useful tools that can be applied to make things better or worse.
If code in your library needs to call into code provided by library consumers, then this is not the case.
Use function pointers to provide hooks, or am I missing something?
You are missing a lot of detail. Yes, you can provide function pointers as hooks. Now what happens when a new version of your library needs to add a new hook? What happens when that hook interacts in subtle ways with the others? These are the kinds of problems that make OO APIs fragile, but they also make procedural APIs fragile.
OO is fragile. Procedural code is resilient.
Assertions are not evidence. Assertions that contradict the experience of folks who have been working with these APIs for decades need strong evidence.
The only breaking changes were the switch from int to NSInteger for better support for 64/32-bit compatibility and these changes also affected the purely procedural APIs.
And that doesn't count as evidence. Please read what I wrote. OO programmers constantly rename things to break backwards compatibility for no good reason at all. Code rewrite is not code reuse, by definition. Do C programmers do this?
We are discussing how C does things and maintains backwards compatibility, not COM. You say COM and I say POSIX / libc, which is older. The fact that you cite COM is in itself proof that objects are insufficient.
In Python 3 … print was made into a function and almost overnight 100% of code was made useless. This is the daily life of OO programmers for the release of every major version of a framework.
In a database, how many times do you change the schema? Well, structs and classes are like a schema. Inheritance changes the schema. Interface renames change the schema. Changing method names is like changing the column name. Just like in database design, you should not change the schema but use foreign keys to extend the tables with additional data. Perhaps OO needs a new "View" layer like SQL.
No, but claims that were believed to be true and were debunked are no longer claimed …. Like many other things in programming, it is a set of useful tools that can be applied to make things better or worse.
The keyword is "debunked", like snake oil.
I propose a mandatory version suffix for all classes to avoid this. The compiler creates a new class for every change made to a class, no matter how small. If you are changing the class substantially, create a completely new name; don't ship it by the same name and break all code. For ABI do something like COM if that worked.
These are the kinds of problems that make OO APIs fragile, but they also make procedural APIs fragile.
You are right. They make procedural APIs using vtables fragile, not to mention slow. So use it sparingly? 99% of code should be procedural. I only see vtables being useful in creating bags of event handlers.
The only breaking changes were the switch from int to NSInteger for better support for 64/32-bit compatibility and these changes also affected the purely procedural APIs.
And that doesn't count as evidence. Please read what I wrote. OO programmers constantly rename things to break backwards compatibility for no good reason at all. Code rewrite is not code reuse, by definition. Do C programmers do this?
You've now changed your argument. You were saying that OO is fragile; now you're saying that OO programmers (which OO programmers?) rename things and that breaks things. Okay, but if procedural programmers rename things that also breaks things. So now you're not talking about OO in general, you're talking about some specific examples of OO (but you're not naming them). You've been given examples of widely used rich OO APIs that have retained huge degrees of backwards compatibility, so your argument now seems to be nothing to do with OO in general but an attack on some unspecified people that you don't like who write bad code.
We are discussing how C does things and maintains backwards compatibility, not COM. You say COM and I say POSIX / libc, which is older. The fact that you cite COM is in itself proof that objects are insufficient.
Huh? COM is a standard for representing objects that can be shared across different languages. I also cited OpenStep / Cocoa (the latter is an implementation of the former), which uses the Objective-C object model.
POSIX provides a much simpler set of abstractions than either of these. If you want to compare something equivalent, how about GTK? It's a C library that's a bit newer than POSIX but that lets you do roughly the same set of things as OpenStep. How many GTK applications from even 10 years ago work with a modern version of GTK without modification? GTK 1 to GTK 2 and GTK 2 to GTK 3 both introduced significant backwards compatibility breaks.
In Python 3 … print was made into a function and almost overnight 100% of code was made useless. This is the daily life of OO programmers for the release of every major version of a framework.
Wait, so your argument is that a procedural API in a multi-paradigm language changed, which broke everything, and that's a reason why OO is bad?
In a database, how many times do you change the schema? Well, structs and classes are like a schema. Inheritance changes the schema. Interface renames change the schema. Changing method names is like changing the column name. Just like in database design, you should not change the schema but use foreign keys to extend the tables with additional data. Perhaps OO needs a new "View" layer like SQL.
I don't even know where to go with that. OO provides a way of expressing the schema. The schema doesn't change because of OO, the schema changes because the requirements change. OO provides mechanisms for constraining the impact of that change.
Again, your argument seems to be:
There exists a set of things in OO that, if modified, break backwards compatibility.
People who write OO code will change these things.
OO is bad.
But it's also possible to say the same thing with OO replaced with procedural, functional, generic, or any other style of programming. If you want to make this point convincingly then you need to demonstrate that the set of things that break backwards compatibility in OO are more likely to be changed than in another style. So far, you have made a lot of assertions, but where I have presented examples of OO APIs with a long history of backwards compatibility and procedural APIs performing equivalent things with weaker guarantees, you have failed to present any examples.
I propose a mandatory version suffix for all classes to avoid this.
So, like COM?
The compiler creates a new class for every change made to a class, no matter how small. If you are changing the class substantially, create a completely new name; don't ship it by the same name and break all code.
So, like COM?
For ABI do something like COM if that worked.
So, you want COM? But you want COM without OO? In spite of the fact that COM is an OO standard?
These are the kinds of problems that make OO APIs fragile, but they also make procedural APIs fragile.
You are right. They make procedural APIs using vtables fragile, not to mention slow. So use it sparingly? 99% of code should be procedural. I only see vtables being useful in creating bags of event handlers.
It's not just about vtables, it's about any kind of rich abstraction that introduces coupling between the producers and consumers of an interface.
Let's go back to the C sort function that you liked. There's a C standard qsort. Let's say you want to sort an array of strings by their locale-aware order. It has a callback, so you can define a comparison function. Now you want to sort an array that has an external indexing structure for quickly finding the first entry with a particular prefix. Oops, qsort doesn't have any kind of hook for defining how to do the move or for receiving a notification when things are moved, so you can't keep the data structure up to date; you need to recalculate it after the sort. After a while, you realise that resizing the array is expensive and so you replace it with a skip list. Oh dear, qsort can't sort anything other than an array, so you now have to implement your own sorting function.
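The first step in that story is easy enough; a minimal sketch of the qsort call with a locale-aware comparator (using the standard strcoll):
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* qsort hands the comparator pointers to the elements; the elements here are
   char * values, so the comparator receives char ** and dereferences once. */
static int cmp_locale(const void *a, const void *b) {
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    return strcoll(sa, sb);              /* locale-aware comparison */
}

int main(void) {
    const char *words[] = { "pear", "Apple", "banana" };
    setlocale(LC_COLLATE, "");           /* collate according to the user's locale */
    qsort(words, sizeof words / sizeof words[0], sizeof words[0], cmp_locale);
    for (size_t i = 0; i < sizeof words / sizeof words[0]; i++)
        printf("%s\n", words[i]);
    return 0;
}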
Compare that to C++'s std::sort. It is given two random-access iterators. These are objects that define how to access the start and end of some collection. If I need to update some other data structure when entries in the list move, then I overload their copy or move constructors to do this. The iterators know how to move through the collection, so when I move to a skip list I don't even have to modify the call to std::sort, I just modify the begin() and end() methods on my data structure.
I am lazy. I regularly work on projects with millions of lines of code. I want to write the smallest amount of code possible to achieve my goal and I want to have to modify the smallest amount of code when the requirements change. Object orientation gives me some great tools for this. So does generic programming. Pure procedural programming would make my life much harder and I don't like inflicting pain on myself, so I avoid it where possible.
You have the patience of a saint to continue arguing with this person as they continue to disregard your experience. I certainly don't have the stamina for it, but despite the bizarreness of the slapfight, your replies are really insightful when it comes to API design.
I had a lot of the same misconceptions (and complete conviction that I was right) in my early 20s, and I am very grateful to the folks who had the patience to educate me. In hindsight, I'm astonished that they put up with me.
I think more languages could benefit from COM's techniques but I don't think it is a part of the C++ core. I would use a minimal and flexible version of it but it seems to be doing way too many Win32-specific things.
As @david_chisnall has pointed out many times already, this has nothing to do with OO. GTK has exhibited the exact same thing. GCC has done something similar with its internals. Renaming things such that code that relies on the API has to change has nothing at all to do with any specific programming paradigm.
Please stop your screed on this topic. It's pretty clear from the discussion that you are not grasping what is being said. I urge you to spend some time and study the replies above.
Fine. I would compare GUI development with Tk, which is more idiomatic in C.
As I have pointed out, if people used versioning for interfaces, things wouldn't break every time an architecture astronaut or an undisciplined programmer changes a name, amplifying code rewrites. It is clear that the problem applies to vtables as well, and to naming in general, and is not solved within OO, which exacerbates the effects of simple changes.
You can conclude whatever you like, but after taking a look at your blog, I'm going to back away slowly from this discussion and find a better use for my time. Best of luck with your jihad.
Glad you discovered my blog. I'd recommend you start with Simula the Misunderstood. The language is a bit coarse, though. The entire discussion has however inspired me to write - Interfaces, a fractal of bad design. I see myself more like James Randi exposing homeopathy, superstitions, faith healers and fortune telling.
This is probably the most important principle in C API design.
I code mainly in C++, but I keep using this design for the C wrappers I need to create for my C++ APIs (to use as DLL exports and as binding points to other languages.)
I was just reading K&R C the other day. It seems in the first version C, declarations were optional. Object model in C is surprisingly elegant. If data and methods are separate then they can be evolved separately - only C allows this. On a trivial note, copy paste is better than Inheritance because the copied code can evolve separately instead of changing every time the base class changes.
In terms of generality,
Pointers > Lexical Scope
Function Pointers > Closures, Virtual Methods
Gotos > Exceptions
Arrays, Structs > Objects
Co-routines > Monads
C with namespaces, pattern matching, garbage collection, generics, nested functions and defer is the C++ that I wish had happened. Go is good but I miss the syntax of C. I recently came across Pike scripting language which looks surprisingly clean.
Yup, which sucked. It combined the lack of compiler checks of a dynamic language, with the data-corruption bugs of native code. For instance, what happens when you pass a long as the third argument, to a function whose implementation takes an int for that parameter? đ±
Maybe Iâm unsure what youâre getting at, but many languages including Objective-C, Swift and Rust allow methods to be declared separately from the data, including adding more methods afterwards, even in separate binaries.
But itâs worse than inheritance because, when you fix a bug in the copied code, you have to remember to also fix it every place it was pasted. I had a terrible time of this in an earlier job where I maintained a codebase written by an unrepentant copy/paster. This is the kind of nightmare that led to the DRY principle.
Usually nothing, or rather, exactly what you would want đ. Last I checked, K&R C requires function parameters to be converted to the largest matching integral type, so long and int get passed the same way. All floating point parameters get passed as double. In fact, I remember when ANSI-C came out that one of the consequences was that you could now have actual float parameters. Pointers are the same size anyway, no struct by value parameters.
It still wasnât all roses: messing up argument order or forgetting a parameter. Oops. So function prototypes: đđ
Except, of course, when the sizes differed.
No. The sizes do differ in the example. Once again: arguments are passed (and received) as the largest matching integral type.
I changed the
printf()
of the example to show this:Result:
A lot of this is assuming arguments passed in registers. Passing on the stack can result in complete nonsense as you could have misaligned the stack, or simply not made a large enough frame.
I donât mean copy paste everything, use functions for DRY ofcourse ⊠just to get the effect of inheritance copy paste is better. Inheritance, far from the notions of biology or taxonomy is similar to a lawyer contract that states all changes of A will be available to B just like land inheritance. Every time some maintainer changes a class in React, Angular, Ruby, Java, C++, Rust, Python frameworks and libraries everyone has to change their code. If for every release of a framework you have to rewrite your entire code, calling that code reuse is wrong and fraudulent. If we add any method, rename any method, change any implementation of any method that is not a trivial fix; we should create a new class instead of asking millions of developers to change their code.
If instead we used copy paste, there would be no inheritance hierarchy but just flattened code if that makes sense and you can modify it without affecting other developers. If we want to add new functionality to an existing class we should use something like plugins/delegation/mixins but never modify the base class ⊠but absolutely no one uses or understands this pattern and everyone prefers to diddle with the base class.
In C such massive rewrites wonât happen, because everything is manually wired instead of automatically being inherited. You can always define new methods without bothering if you are breaking someoneâs precious interface. You can always nest structs and cast them to reuse code written for the previous struct. Combined with judicious use of function pointers and vtables you will never need to group data and code in classes.
That is simply not true. There are a lot of changes you can make to a class without requiring changes in subclasses. As a large-scale example, macOS and iOS frameworks (Objective-C and Swift) change in every single OS update, and the Apple engineers are very careful not to make changes that require client code to change, since end users expect that an OS update will not break their apps. This includes changes to classes that are ubiquitously subclassed by apps, like NSView, NSDocument, UIViewController, etc. I could say exactly the same thing about .NET or other Windows system libraries that use OOP.
Iâm sure that in many open source projects the maintainers are sloppy about preserving source compatibility (let alone binary), because their âcustomersâ are themselves developers, so itâs easy to say âitâs easier to change the signature of this method and tell people to update their codeâ. But thatâs more laziness (or âmove fast and break stuffâ) than a defining feature of inheritance.
Yes, because everyoneâs terrified of touching the code for fear of breaking stuff. Iâve used code like that.
How ?
In C you would just create a new function and rightfully touching working code except for bug fixes is taboo. I can probably point to kernel drivers that use C vtables that havenât been touched in 10 years. If you want to create an extensible function, use a function pointer. How many times has the sort function been reused ?
OO programmers claim that the average joe can write reusable code by simply using classes. If even the most well paid, professional programmers canât write reusable code and writing OO code requires high training then we shouldnât lie about OO being for the average programmer. Even if you hire highly trained programmers, code reuse is fragile requiring constant vigilance of the base classes and interfaces. Why bother with fragile base classes at all ?
Technically you can avoid this problem by never touching the base class and always adding new classes and interfaces. I think classes should have a version suffix but I donât think it will be a popular idea and requires too much discipline. OO programmers on average prefer adding a fly method to a fish class as a quick fix to creating a bird class and thats just a disaster waiting to happen.
I donât understand why you posted that link. Apple release notes describe new features, and sometimes deprecations of APIs that they plan to remove in a year or two. They apply only to developers, of course; compiled apps continue to work unchanged.
OO is not trivial, but itâs much better than resorting to flat procedural APIs. Zillions of developers use it on iOS, Mac, .NET, and other platforms.
My conclusion - OO is fragile and needs constant rewrites by developers who use OO code and procedural apis are resilient.
Your conclusion is not supported by evidence. Look at a big, widely used, C library, such as ICU or libavcodec. You will have API deprecations and removals. Both of these projects do it nicely so you have
foo2()
,foo3()
and so on. In OO APIs, the same thing happens, you add new methods and deprecate the old ones over time. For things like glib or gtk, the churn is even more pronounced.OO covers a variety of different implementation strategies. C++ is a thin wrapper around C: with the exception of exceptions, everything in C++ can be translated to C (in the case of templates, a lot more C) and so C++ objects are exactly like C structs. If a C/C++
struct
is exposed in a header then you canât add or remove fields without affecting consumers because in both languages astruct
can be embedded in another and the size and offsets are compiled into the binary.In C, you use the opaque pointers idiom to avoid this. In C++ you use the pImpl pattern, where you have a public class and a pointer to an implementation. Both of these require an extra indirection. You can also avoid this in C++ by making the constructor for your class private and having factory methods. If you do this, then only removing fields modifies your ABI, because nothing outside of your library can allocate it. This lets you put fast paths in the header that directly access fields, without imposing an ABI / API contract that prevents adding fields.
In C++, virtual methods are looked up by vtable offset, so you canât remove virtual functions and you canât add virtual functions if your class is subclassed. You also canât change the signature of any existing virtual methods. You can; however, add non-virtual methods because these do not take place in dynamic dispatch and so are exactly the same as C functions that take the object pointer as the first parameter.
In a more rigid discipline, such as COM, the object model doesnât allow directly exposing fields and freezes interfaces after creation. This is how most OO APIs are exposed on Windows and we (Microsoft) have been able to maintain source and binary compatibility with programs using these APIs for almost three decades.
In Objective-C, fields (instance variables) are looked up via an indirection layer. Roughly speaking, for each field thereâs a global variable that tells you its offset. If you declare a field as having package visibility then the offset variable is not exposed from your library and so canât be named. Methods are looked up via a dynamic dispatch mechanism that doesnât use fixed vtable offsets and so you are able to add both fields and methods without changing your downstream ABI. This is also true for anything that uses JIT or install-time compilation (Java, .NET).
You raise the problem of behaviour being automatically inherited, but this is an issue related to the underlying problem, not with the OO framing. If you are just consuming types from a library then this isnât an issue. If you are providing types to a library (e.g. a way of representing a string thatâs efficient for your use or a new kind of control in a GUI, for example), then the library will need to perform operations on that type. A new version of the library may need to perform more operations on that type. If your code doesnât provide them, then it needs to provide some kind of default. In C, youâd do this with a struct containing callback function pointers that carried its size (or a version cookie) in the first field, so that you could dispatch to some generic code in your functions if the library consumer didnât provide an implementation. If youâre writing in an OO language then youâll just provide a default implementation in the superclass.
Oh, and you donât say what kernel youâre referring to. I can point to code in Linux thatâs needed to be rewritten between minor revisions of the kernel because a C API changed. I can point to C++ code in the XNU kernel that hasnât changed since the first macOS release when it was rewritten from Objective-C to C++. Good software engineering is hard. OO is not a magic bullet but going back to â70s-style designs doesnât avoid the problems unless youâre also willing to avoid writing anything beyond the complexity of things that were possible in the â70s. Software is now a lot more complex than it was back then. The Version 6 UNIX release was only about 83KLoC: individual components of clang are larger than that today.
It absolutely is. Please reuse code from an earlier version of any framework released in the last 50 years. OO was sold as the magic bullet that will solve all reuse and software engineering problems.
Do you think homeopathy is medicine just because people dress up and play the role of doctors doing science ?
How many times has the sort function been reused by using function pointers ? Washing machines donât make clothes dirtier than the clothes you put in.
If they are doing it that way, then thats the way to go. Function signatures are the only stable interface you need. Donât use fragile interfaces, classes and force developers to rewrite every time a new framework is released because someone renamed a method.
For the rest of your arguments, why even bother with someone elseâs vtables when you can build your own, trivially.
My point is simply this - How is rewriting code, code reuse ?
This is what Windows and Mac OS programmers do every day. My experience with COM is the Windows APIs built on it have great API/ABI stability.
I donât know much about COM but if it provides API/ABI stability then thatâs great and thats what I am complaining about here. It seems to be an IPC of sorts, how would it compare to REST which can be implemented on top of basic functions ?
COM is a language-agnostic ABI. for exposing object oriented interfaces. It has been used to provide stable ABIs for object oriented interfaces for around 30 years to Windows APIs. It is not an IPC mechanism, it is a binary representation. It is a strong counter example to your claim that OO APIs cannot be made stable (and one that I mentioned already in the other thread).
Iâm not sure about the IPC parts (there is a degree of âhostingâ); however, DCOM provides RPC with COM.
Iâve reused code written in C, C++, and Objective-C over multiple decades. Of these, Objective-C is by a very large margin the one that caused the fewest problems. Your argument is âOO was oversold, so letâs use the approach that was used back when people found the problems that motivated the introduction of OOâ.
I donât know what this means. Are you trying to claim that C standard library
qsort
is the pinnacle of API design? It provides a compare function, but not a swap function so if your structures require any kind of copying between a byte-by-byte copy then itâs a problem. How do you reuse Câsqsort
with a data type that isnât a contiguous buffer? With C++âsstd::sort
(which doesnât use function pointers), you can sort any data structure that supports random access iteration.Thatâs true, if your library is producing types but not consuming them. If code in your library needs to call into code provided by library consumers, then this is not the case. Purely procedural C interfaces are easy to keep backwards compatible if they are not doing very much. The zlib interface, for example, is pretty trivial: consume a buffer, produce a buffer. The more complex a library is, the harder it is to maintain a stable API. OO gives you some tools that help, but it doesnât solve the problem magically.
Absolutely none of that is intrinsic to OO. If you rename a C
struct
field or a function, people will need to rewrite their code. The set of things that you can break without breaking compatibility is strictly larger in an OO language than in a purely procedural language.Why use any language feature when you can just roll your own in macro assembly?
Far better in OO languages (and far better in hybrid languages that provide OO and generic abstractions) than in purely procedural ones. This isnât the â80s anymore. No one is claiming that OO is a magic bullet that solves all of your problems.
Personal attacks are not welcomed in this forum or any forum. If you canât use technical arguments to debate you are never going to win.
It is an example of code reuse that absolutely doesnât break.
It is absolutely intrinsic to OO because interfaces, classes are multiple level deep. It is a fractal of bad design. Change one thing everything breaks.
There is a strong culture of not breaking interfaces in C and using versioning but the opposite is true for OO where changing the base class and interface happens for every release. Do you actually have fun rewriting code between every new release of an MVC framework ?
Again, personal attacks are not welcomed in this forum or any forum.
Vtables are trivial. They are not a new feature. All your optimzations can equally apply to vtables.
Lies donât become truths just because time has passed.
Use function pointers to provide hooks or I am missing something.
OO is fragile. Procedural code is resilient.
That was not an ad hominem, that was an attempt to clarify your claims. It was unclear what you were claiming with references to a sort function. An ad hominem attack looks more like this:
This is an ad hominem attack and one that I ignored when you made it, because Iâm attempting to have a discussion on technical aspects.
Itâs also an example of an interface with trivial semantics (itâs covered in the first term of most undergraduate computer science course) and whose requirements have been stable for longer than C has been around. The C++
std::sort
template is also stable and defaults to using OO interfaces for defining the comparison (overloads of the compare operators). The Objective-C-sort
family of methods on the standard collection classes are also unchanged since they were standardised in 1992. The Smalltalk equivalents have remained stable since 1980.You have successfully demonstrated that itâs possible to write stable APIs in situations where the requirements are stable. Thatâs orthogonal to OO vs procedural. If you want to produce a compelling example, please present something where a C library has changed the semantics of how it interacts with a type provided by the library consumer (for example a plug-in filter to a video processing library, a custom view in a GUI, or similar) and an OO library making the same change has required more code modification.
This is an assertion, but it is not supported by evidence. I have provided examples of the same kinds of breaking changes being required in widely used C libraries that do non-trivial things. You have made a few claims here:
Youâre comparing culture, not language features. You can write code today against the OpenStep specification from 1992 that will compile and run fine on modern macOS with Cocoa (I know of some code that has been through this process). Thatâs an OO MVC API thatâs retained source compatibility for almost 30 years. The only breaking changes were the switch from
int
toNSInteger
for better support for 64/32-bit compatibility and these changes also affected the purely procedural APIs. They were not breaking changes for code targeting 32-bit platforms. The changes over the â90s in the Classic MacOS Toolbox (C APIs) were far more invasive.A lot of JavaScript frameworks and pretty much everything from Google make breaking API changes every few months but thatâs an issue of developer culture, not one of the language abstractions.
This is not a personal attack. It is your point. You are saying that you should not use a feature of a language because you can implement it in a lower-level language. Why stop at vtables?
No they canât. It is undefined behaviour to write to the vtable pointer in a C++ object for the lifetime of an object. Modern C++ compilers use this optimisation for devirtualisation. If the concrete type of a C++ object is known at compile time (after inlining) then calls to virtual functions can be replaced with direct calls.
Here is a reduced example. The C version with custom vtables is called in the function
can_not_inline
the C++ version using C++ vtables is called in the functioncan_inline
. In both cases, the object is passed to a function that the compiler canât see before the call. In the C case, the language semantics allow this to modify the vtable pointer, in the C++ case they do not. This means that the C++ version knows that thefoo
call has a specific target, the C version must be conservative. The C++ version can then inline the call, which doesnât do anything in this trivial example and so elides it completely.No, but claims that were believed to be true and were debunked are no longer claimed. In the â80s, OO was claimed to be a panacea that solved all problems. That turned out to be untrue. Like many other things in programming, it is a set of useful tools that can be applied to make things better or worse.
You are missing a lot of detail. Yes, you can provide function pointers as hooks. Now what happens when a new version of your library needs to add a new hook? What happens when that hook interacts in subtle ways with the others? These are the kinds of problems that make OO APIs fragile, but they also make procedural APIs fragile.
Assertions are not evidence. Assertions that contradict the experience of folks who have been working with these APIs for decades need strong evidence.
And that doesnât count as evidence. Please read what I wrote. OO programmers constantly rename things to break backwards compatibility for no good reason at all. Code rewrite is not code reuse, by definition. Do C programmers do this ?
We are discussing how C does things and maintains backwards compatibility not COM. You say COM and I say POSIX / libc which is older. The fact that you cite COM is in-itself proof that objects are insufficient.
In Python 3, print was made into a function and almost overnight 100% of existing code was made useless. This is the daily life of OO programmers with every major release of a framework.
In a database, how many times do you change the schema? Well, structs and classes are like a schema. Inheritance changes the schema. Interface renames change the schema. Changing method names is like changing a column name. Just like in database design, you should not change the schema but use foreign keys to extend the tables with additional data. Perhaps OO needs a new 'View' layer like SQL.
The keyword is 'debunked', like snake oil.
I propose a mandatory version suffix for all classes to avoid this. The compiler creates a new class for every change made to a class, no matter how small. If you are changing the class substantially, create a completely new name; don't ship it under the same name and break all code. For ABI, do something like COM, if that worked.
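A sketch of what that proposal might look like, with hypothetical names; once a suffixed interface ships it is frozen, and later revisions get a new name. This is roughly how COM evolves interfaces: a new interface identity per revision, with shipped interfaces treated as immutable.

/* Hypothetical versioned interfaces: IWidget_v1 is frozen once shipped. */
struct IWidget_v1 {
    virtual void draw() = 0;
    virtual ~IWidget_v1() = default;
};

/* A later release extends rather than edits.  Code written against
 * IWidget_v1 keeps compiling unchanged; new code opts in to v2 explicitly. */
struct IWidget_v2 : IWidget_v1 {
    virtual void draw_scaled(double factor) = 0;
};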
You are right. They make procedural APIs using vtables fragile, not to mention slow. So use them sparingly? 99% of code should be procedural. I only see vtables being useful in creating bags of event handlers.
You've now changed your argument. You were saying that OO is fragile, now you're saying that OO programmers (which OO programmers?) rename things and that breaks things. Okay, but if procedural programmers rename things that also breaks things. So now you're not talking about OO in general, you're talking about some specific examples of OO (but you're not naming them). You've been given examples of widely used rich OO APIs that have retained huge degrees of backwards compatibility, so your argument seems now to be nothing to do with OO in general but an attack on some unspecified people that you don't like who write bad code.
Huh? COM is a standard for representing objects that can be shared across different languages. I also cited OpenStep / Cocoa (the latter is an implementation of the former), which uses the Objective-C object model.
POSIX provides a much simpler set of abstractions than either of these. If you want to compare something equivalent, how about GTK? It's a C library that's a bit newer than POSIX but that lets you do roughly the same set of things as OpenStep. How many GTK applications from even 10 years ago work with a modern version of GTK without modification? GTK 1 to GTK 2 and GTK 2 to GTK 3 both introduced significant backwards compatibility breaks.
Wait, so your argument is that a procedural API, in a multi-paradigm language, changed, which broke everything, and that's a reason why OO is bad?
I don't even know where to go with that. OO provides a way of expressing the schema. The schema doesn't change because of OO; the schema changes because the requirements change. OO provides mechanisms for constraining the impact of that change.
Again, your argument seems to be:
But it's also possible to say the same thing with OO replaced with procedural, functional, generic, or any other style of programming. If you want to make this point convincingly then you need to demonstrate that the set of things that break backwards compatibility in OO is more likely to be changed than in another style. So far, you have made a lot of assertions, but where I have presented examples of OO APIs with a long history of backwards compatibility and procedural APIs performing equivalent things with weaker guarantees, you have failed to present any examples.
So, like COM?
So, like COM?
So, you want COM? But you want COM without OO? In spite of the fact that COM is an OO standard?
It's not just about vtables, it's about any kind of rich abstraction that introduces coupling between the producers and consumers of an interface.
Let's go back to the C sort function that you liked. There's a C standard qsort. Let's say you want to sort an array of strings by their locale-aware order. It has a callback, so you can define a comparison function. Now you want to sort an array that has an external indexing structure for quickly finding the first entry with a particular prefix. Oops, qsort doesn't have any kind of hook for defining how to do the move or for receiving a notification when things are moved, so you can't keep the data structure up to date; you need to recalculate it after the sort. After a while, you realise that resizing the array is expensive and so you replace it with a skip list. Oh dear, qsort can't sort anything other than an array, so you now have to implement your own sorting function.

Compare that to C++'s std::sort. It is given two random-access iterators. These are objects that define how to access the start and end of some collection. If I need to update some other data structure when entries in the list move, then I overload the elements' copy or move constructors to do this. The iterators know how to move through the collection, so when I move to a skip list I don't even have to modify the call to std::sort; I just modify the begin() and end() methods on my data structure.

I am lazy. I regularly work on projects with millions of lines of code. I want to write the smallest amount of code possible to achieve my goal and I want to have to modify the smallest amount of code when the requirements change. Object orientation gives me some great tools for this. So does generic programming. Pure procedural programming would make my life much harder and I don't like inflicting pain on myself, so I avoid it where possible.
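For reference, here is the part that qsort does handle well: a locale-aware string sort via the comparison callback. This is a minimal sketch; the word list and locale are just examples. The hooks it lacks, for element moves and for non-array containers, are exactly the ones described above.

#include <cstdlib>
#include <cstring>
#include <clocale>
#include <cstdio>

/* qsort hands the comparator pointers to the array elements; the elements
 * here are char pointers, so each argument is really a pointer to one. */
static int compare_locale(const void *a, const void *b) {
    const char *sa = *static_cast<const char *const *>(a);
    const char *sb = *static_cast<const char *const *>(b);
    return std::strcoll(sa, sb);
}

int main() {
    std::setlocale(LC_COLLATE, "");   /* collate according to the environment */
    const char *words[] = { "pear", "apple", "orange" };
    std::qsort(words, sizeof words / sizeof words[0], sizeof words[0],
               compare_locale);
    for (const char *w : words) std::puts(w);
}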
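And the std::sort side, reduced to its simplest form. This is not the skip-list example above, just an illustration of the point that the call site does not care what the container is, only that begin() and end() hand back random-access iterators; the containers and comparator are assumptions for the sketch.

#include <algorithm>
#include <deque>
#include <iostream>
#include <string>
#include <vector>

int main() {
    /* Any strict weak ordering works as the comparator. */
    auto by_length = [](const std::string &a, const std::string &b) {
        return a.size() < b.size();
    };

    std::vector<std::string> v = { "pear", "fig", "orange" };
    std::sort(v.begin(), v.end(), by_length);

    /* Swapping the container for another random-access sequence changes
     * nothing at the call to std::sort; only begin()/end() differ. */
    std::deque<std::string> d = { "pear", "fig", "orange" };
    std::sort(d.begin(), d.end(), by_length);

    for (const auto &s : v) std::cout << s << '\n';
    for (const auto &s : d) std::cout << s << '\n';
}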
You have the patience of a saint to continue arguing with this person as they continue to disregard your experience. I certainly don't have the stamina for it, but despite the bizarreness of the slapfight, your replies are really insightful when it comes to API design.
I had a lot of the same misconceptions (and complete conviction that I was right) in my early 20s, and I am very grateful to the folks who had the patience to educate me. In hindsight, I'm astonished that they put up with me.
This page lists all the changes in Objective-C over the last 10 years. Plenty of renames.
I think more languages could benefit from COM's techniques, but I don't think it is part of the C++ core. I would use a minimal and flexible version of it, but it seems to be doing way too many Win32-specific things.
As @david_chisnall has pointed out many times already, this has nothing to do with OO. GTK has exhibited the exact same thing. GCC has done something similar with its internals. Renaming things so that code relying on the API has to change has nothing at all to do with any specific programming paradigm.
Please stop your screed on this topic. It's pretty clear from the discussion that you are not grasping what is being said. I urge you to spend some time and study the replies above.
Fine. I would compare GUI development with Tk, which is more idiomatic in C.
As I have pointed out, if people used versioning for interfaces, things wouldn't break every time an architecture astronaut or an undisciplined programmer changes a name, amplifying code rewrites. It is clear that the problem applies to vtables as well as to naming in general, and it is not solved within OO, which exacerbates the effects of simple changes.
You can conclude whatever you like, but after taking a look at your blog, I'm going to back away slowly from this discussion and find a better use for my time. Best of luck with your jihad.
Glad you discovered my blog. I'd recommend you start with 'Simula the Misunderstood'. The language is a bit coarse, though. The entire discussion has, however, inspired me to write 'Interfaces: a fractal of bad design'. I see myself more like James Randi, exposing homeopathy, superstitions, faith healers and fortune telling.