Object Oriented Programming with C
There are a few pros and cons associated with using Object Oriented Programming in the C language.
Pro | Con |
---|---|
Enable Extensibility | Potential performance degradation from Virtual Dispatch |
Increased Maintainability | Increased Verbosity |
Partial Visibility Feature | Steep Increase in Complexity |
Single Inheritance Only | Single Inheritance Only |
Enabling Extensibility
This paradigm allows programming languages other than C to extend a library written in C, while enabling that library to manage objects created from other programming languages, and vice versa.
Assume we have the following object defined in C:
struct BaseClass
{
int Value;
};
And its associated Virtual Dispatch Table, otherwise known as the “VTable”, as follows:
struct BaseClassVTable
{
int (*Sum)(struct BaseClass* this, int value); // Function pointer slot; a struct cannot hold a plain function declaration
};
With our “BaseClass” object defined in C, we can proceed to inherit from it going forward. The code above is similar to its C# counterpart as follows:
public abstract class BaseClass
{
public int Value;
}
You can implement a “Virtual” method as well in C by writing a BaseClass function which can be swapped out by inheriting classes if desired.
int BaseClass_Sum(struct BaseClass* this, int val)
{
if (this == NULL)
return val;
return this->Value + val;
}
However, we cannot/shouldn't call BaseClass_Sum directly going forward, because this function should be treated as having “Protected” visibility: it is a concern of inheriting classes, not of the end-user who wishes to instantiate the class and use the Sum method it provides. We need to create a “Dispatch” function whose role is to look into the inheriting class' VTable and find the correct function to call for that class.
int BaseClass_Sum_dispatch(struct BaseClass* this, int val)
{
return ((struct BaseClassVTable*) (((void**)this)[-1]))->Sum(this, val);
}
And this would be similar to the following snippet in C#:
public abstract class BaseClass
{
public int Value;
public virtual int Sum(int val)
{
return this.Value + val;
}
}
To inherit from BaseClass in C, we can do the following:
// These would normally live in a header file; they are placed here so the following snippet is self-contained
#include <stdbool.h>
#include <stdlib.h>
struct SumClass;
bool SumClass_ValidateThis(struct SumClass* this);
struct SumClass {
struct BaseClass baseObj;
};
// The signature matches the VTable slot, so "this" arrives as a BaseClass pointer
int SumClass_Sum(struct BaseClass* this, int val)
{
if (SumClass_ValidateThis((struct SumClass*)this))
return -1; // Validation failed, returns -1
return this->Value + val + 1;
}
static struct BaseClassVTable SumClassVTable = {
.Sum = SumClass_Sum
};
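// Returns true when validation FAILS (NULL pointer or foreign VTable), false when "this" is a valid SumClass
bool SumClass_ValidateThis(struct SumClass* this)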
{
if (this == NULL)
{
return true;
}
if (((void **)this)[-1] != (void *)&SumClassVTable)
{
return true;
}
return false;
}
struct SumClass* SumClass_New(int val)
{
void **newObj = (void **)calloc(1, sizeof(void *) + sizeof(struct SumClass));
if (newObj == NULL)
return NULL; // Allocation failed
*newObj = &SumClassVTable;
struct SumClass *ptr = (struct SumClass *)&(newObj[1]);
ptr->baseObj.Value = val;
return ptr;
}
Several things are happening above. Starting with the declaration of SumClass: it must ALWAYS have the base object struct as its first member, for both practical and conventional reasons:
- The inheriting class may still use Base Class functions, and those functions access Base Class members through the same pointer, so the base members must sit at offset zero.
- Consistently defining the base object member first is a good habit that helps avoid mistakes.
The inheriting class defines a new function, SumClass_Sum, that overrides the Base Class method.
A new VTable has to be defined for SumClass so that a dispatch function knows which function to call; in this case, the Sum entry points to SumClass_Sum.
SumClass_ValidateThis is an error-checking function that ensures the object passed as the “this” parameter is actually an instance of the inheriting class and not some other object. It validates this by comparing the stored VTable address, and it returns true when validation fails.
With all of the above defined, SumClass_New allocates enough memory for SumClass plus one pointer reserved for the VTable address. It stores the VTable address in that first slot, assigns any member values, and returns a pointer to the members rather than to the VTable slot.
With those implemented, the end-user can instantiate SumClass and then invoke its Sum method as follows:
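The memory layout described above can be sketched in a few lines. This is a minimal, self-contained illustration rather than the library's actual code: the helper names Object_VTableOf, SumClass_NewSketch and LayoutDemo are hypothetical, and the object is reduced to the BaseClass members for brevity.

```c
#include <assert.h>
#include <stdlib.h>

struct BaseClass { int Value; };

struct BaseClassVTable {
    int (*Sum)(struct BaseClass *this, int val);
};

static int SumClass_Sum(struct BaseClass *this, int val)
{
    return this->Value + val + 1;
}

static struct BaseClassVTable SumClassVTable = { .Sum = SumClass_Sum };

/* Hypothetical helper: read the VTable pointer stored just before the object. */
static struct BaseClassVTable *Object_VTableOf(void *obj)
{
    return (struct BaseClassVTable *)((void **)obj)[-1];
}

/* Allocation layout: [ VTable pointer ][ object members ... ]
 *                                      ^ pointer handed to the caller */
static struct BaseClass *SumClass_NewSketch(int val)
{
    void **block = calloc(1, sizeof(void *) + sizeof(struct BaseClass));
    if (block == NULL)
        return NULL;
    block[0] = &SumClassVTable;                      /* hidden slot */
    struct BaseClass *obj = (struct BaseClass *)&block[1];
    obj->Value = val;
    return obj;
}

/* Calls Sum through the hidden slot, then releases the whole block. */
static int LayoutDemo(void)
{
    struct BaseClass *obj = SumClass_NewSketch(2);
    assert(obj != NULL);
    assert(Object_VTableOf(obj) == &SumClassVTable);
    int result = Object_VTableOf(obj)->Sum(obj, 3);  /* 2 + 3 + 1 */
    free((void **)obj - 1);   /* free from the start of the allocation */
    return result;
}
```

Note that the pointer passed to free is the start of the allocation (one slot before the object), not the pointer the caller was given.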
#include <stdio.h>
int main()
{
struct SumClass* sumClass = SumClass_New(2);
int result = BaseClass_Sum_dispatch(&sumClass->baseObj, 3); // Pass the base object, which sits at offset zero
printf("%i\n", result);
return 0;
}
The result would be: 6
This indicates that SumClass_Sum was run: it adds SumClass's Value integer, the provided integer parameter, and finally the constant 1, resulting in 2 + 3 + 1 = 6
Potential performance degradation from Virtual Dispatch
As you may have observed in the earlier BaseClass_Sum_dispatch function, each call first loads the VTable address, then loads the function address from the VTable, and finally forwards the arguments, including the this parameter, to the function listed there. This indirection is the performance cost associated with virtual dispatch. It can be alleviated by the static dispatch optimization (devirtualization) that some compilers, such as LLVM/Clang, offer.
Essentially the process is as follows:
BaseClass_Sum_dispatch -> Get VTable address from this parameter -> Calls SumClass_Sum
Sometimes the compiler can optimize this into a static dispatch by eliminating the first two steps and calling SumClass_Sum directly. The performance degradation can remain, however, when BaseClass_Sum_dispatch is called from another programming language that is not optimized together with the C library, so you don't get the benefit of static dispatch optimization. One way to alleviate this is to compile the library into an Intermediate Representation such as LLVM IR, which allows external code from another programming language to emit newly compiled code with static dispatch optimization at runtime.
In most circumstances, severe performance degradation is unlikely unless the virtual dispatch code is called repeatedly, for example at every iteration of a hot loop, where the per-call cost accumulates.
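When the dispatch does sit inside a hot loop, one manual mitigation is to look the function up once and reuse the pointer. The sketch below compares both forms under the assumption that the object's class does not change during the loop; VTableOf, SumAll_Dispatched, SumAll_Hoisted and HoistDemo are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct BaseClass { int Value; };
struct BaseClassVTable { int (*Sum)(struct BaseClass *this, int val); };

static int SumClass_Sum(struct BaseClass *this, int val)
{
    return this->Value + val + 1;
}
static struct BaseClassVTable SumClassVTable = { .Sum = SumClass_Sum };

/* Hypothetical helper: the VTable pointer lives one slot before the object. */
static struct BaseClassVTable *VTableOf(struct BaseClass *this)
{
    return (struct BaseClassVTable *)((void **)this)[-1];
}

static int BaseClass_Sum_dispatch(struct BaseClass *this, int val)
{
    return VTableOf(this)->Sum(this, val);
}

/* One VTable lookup per element. */
static int SumAll_Dispatched(struct BaseClass *obj, const int *vals, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += BaseClass_Sum_dispatch(obj, vals[i]);
    return total;
}

/* One VTable lookup for the whole loop. */
static int SumAll_Hoisted(struct BaseClass *obj, const int *vals, size_t n)
{
    int (*sum)(struct BaseClass *, int) = VTableOf(obj)->Sum;
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += sum(obj, vals[i]);
    return total;
}

static int HoistDemo(void)
{
    void **block = calloc(1, sizeof(void *) + sizeof(struct BaseClass));
    assert(block != NULL);
    block[0] = &SumClassVTable;
    struct BaseClass *obj = (struct BaseClass *)&block[1];
    obj->Value = 2;

    const int vals[] = { 1, 2, 3 };
    int a = SumAll_Dispatched(obj, vals, 3);
    int b = SumAll_Hoisted(obj, vals, 3);
    free(block);
    assert(a == b);   /* both forms compute the same total */
    return b;         /* (2+1+1) + (2+2+1) + (2+3+1) */
}
```

Both loops produce the same result; whether the hoisted form is measurably faster depends on the compiler and the call site, so it is worth profiling before applying it.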
Increased Maintainability
This assertion is somewhat controversial, depending on the critic. Other paradigms, such as functional programming, can serve as alternatives to object oriented programming, but in this context they bring problems of their own:
- Lack of familiarity for users coming from C++, Java, C#, and other imperative programming languages
- Difficulty extending an implementation written in C when FFI is the goal
- Increased complexity in compiler-related code; declarative programming, for example, leaves the implementation details to the compiler, which can limit flexibility for the end-user
- Lack of structure in data types
It is important to note that this variant of Object Oriented Programming in C supports single inheritance only, mainly for reasons of maintainability and practicality.
The downside is increased verbosity: whenever a new member or function is added to the base class, it has to be replicated or implemented in every inheriting class. This is precisely why designing the classes early on is crucial to the overall project design and reduces the cost of maintaining the library.
Partial Visibility Feature
Some level of visibility control can be achieved by moving declarations that should remain private or protected into a separate header, although this can increase verbosity.
Generally, an Object Oriented Programming implementation in C has the following structure:
include/
BaseClass.h
private/
BaseClass_priv.h
BaseClass_vtable.h
src/
BaseClass_impl.c
It shares some of the characteristics seen in C#, which can help maintainability.
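One common way to get this kind of visibility in C is an incomplete (opaque) struct type in the public header, with the full layout kept in the private header. The sketch below collapses the three files into one listing so it compiles on its own; BaseClass_New, BaseClass_GetValue, BaseClass_Destroy and VisibilityDemo are hypothetical names, and the VTable slot is left out for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- include/BaseClass.h (public) ---- */
struct BaseClass;                       /* incomplete type: members are hidden */
struct BaseClass *BaseClass_New(int value);
int BaseClass_GetValue(const struct BaseClass *this);
void BaseClass_Destroy(struct BaseClass *this);

/* ---- private/BaseClass_priv.h (for inheriting classes only) ---- */
struct BaseClass {
    int Value;
};

/* ---- src/BaseClass_impl.c ---- */
struct BaseClass *BaseClass_New(int value)
{
    struct BaseClass *obj = calloc(1, sizeof *obj);
    if (obj != NULL)
        obj->Value = value;
    return obj;
}

int BaseClass_GetValue(const struct BaseClass *this)
{
    assert(this != NULL);
    return this->Value;
}

void BaseClass_Destroy(struct BaseClass *this)
{
    free(this);
}

/* ---- end-user code: would include only the public header ---- */
static int VisibilityDemo(void)
{
    struct BaseClass *obj = BaseClass_New(7);
    assert(obj != NULL);
    int value = BaseClass_GetValue(obj);   /* access goes through the API */
    BaseClass_Destroy(obj);
    return value;
}
```

With the files actually split, end-user code that includes only the public header cannot write obj->Value, because the compiler sees only the incomplete type.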
Steep Increase in Complexity
The main issue with Object Oriented Programming in C is that it does not scale well, especially on large projects, due to the sheer amount of verbose code and the testing it requires. Generally, each function has to be written twice, since it may have a base implementation and a dispatch function; each property or data member needs two more functions for its setter and getter; and event handler members need multiple functions.
Creating a base class in C breaks down into multiple steps:
- Create the public header file and the private header file
- Create the VTable header file
- Create the implementation file, write everything in implementation first
- Define the Base Class object, members, functions, events, and so forth
- Define the dispatch functions for each function, member, events, and so forth
- Move VTable definition over to VTable header file and include VTable header file directly in implementation file
- Declare all functions from implementation for public header file and private header file depending on visibility
- Ensure all dispatch functions are declared within public header file
Now that the base class object is done, you need to inherit from it, which can be done as follows:
- Create the public header file
- Create the implementation file
- Define the inheriting class structure and ensure that the base class object member is included within the inherited class as the first member
- Define the VTable static variable (should be immutable) and define each function pointer member for either base class function or inherited class function as needed
- Implement the validation stub that checks the this parameter to ensure it is not NULL and that its VTable address points to the VTable static variable above
- Define the inheriting class implementation as needed
- Implement the New/Destroy functions for the inheriting class
- Declare inheriting class specific functions into public header
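The Destroy half of the New/Destroy step is easy to get wrong, because the pointer handed to the caller is not the start of the allocation. The following is a minimal sketch assuming the hidden-slot layout used by SumClass_New earlier; SumClass_Destroy and DestroyDemo are hypothetical names, and the bool return convention is an assumption made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct BaseClass { int Value; };
struct BaseClassVTable { int (*Sum)(struct BaseClass *this, int val); };
struct SumClass { struct BaseClass baseObj; };

static int SumClass_Sum(struct BaseClass *this, int val)
{
    return this->Value + val + 1;
}
static struct BaseClassVTable SumClassVTable = { .Sum = SumClass_Sum };

static struct SumClass *SumClass_New(int val)
{
    void **block = calloc(1, sizeof(void *) + sizeof(struct SumClass));
    if (block == NULL)
        return NULL;
    block[0] = &SumClassVTable;
    struct SumClass *obj = (struct SumClass *)&block[1];
    obj->baseObj.Value = val;
    return obj;
}

/* Returns true if the object was released, false if "this" was rejected. */
static bool SumClass_Destroy(struct SumClass *this)
{
    if (this == NULL)
        return false;
    void **block = (void **)this - 1;          /* step back to the VTable slot */
    if (*block != (void *)&SumClassVTable)
        return false;                          /* not a SumClass: do not free */
    free(block);                               /* free the whole allocation */
    return true;
}

static int DestroyDemo(void)
{
    struct SumClass *obj = SumClass_New(2);
    assert(obj != NULL);
    bool released = SumClass_Destroy(obj);     /* valid object */
    bool rejected = !SumClass_Destroy(NULL);   /* NULL is refused */
    return released && rejected;
}
```

The VTable check catches objects of another class, but it cannot detect a pointer that was already destroyed, so callers still need to avoid double frees.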
As you can observe, this calls for stringent planning of the base class implementation to minimize the number of changes that cascade to the inheriting objects.
There are a few tricks that can alleviate the pain of the above:
- Write a generator program that emits C-style Object Oriented code
- Use preprocessor macros to reduce the amount of verbose code
- Copy and paste from existing code as needed
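As an illustration of the preprocessor trick, a single macro can stamp out the dispatch function for each VTable entry. This is only a sketch: the macro name BASECLASS_DEFINE_DISPATCH, the extra Scale method and MacroDemo are hypothetical, and the macro assumes every method takes one int argument and returns int.

```c
#include <assert.h>
#include <stdlib.h>

struct BaseClass { int Value; };
struct BaseClassVTable {
    int (*Sum)(struct BaseClass *this, int val);
    int (*Scale)(struct BaseClass *this, int factor);   /* hypothetical second method */
};

/* Generates BaseClass_<name>_dispatch for a method taking one int argument. */
#define BASECLASS_DEFINE_DISPATCH(name)                                     \
    static int BaseClass_##name##_dispatch(struct BaseClass *this, int arg) \
    {                                                                       \
        struct BaseClassVTable *vt =                                        \
            (struct BaseClassVTable *)((void **)this)[-1];                  \
        return vt->name(this, arg);                                         \
    }

BASECLASS_DEFINE_DISPATCH(Sum)
BASECLASS_DEFINE_DISPATCH(Scale)

static int SumClass_Sum(struct BaseClass *this, int val)
{
    return this->Value + val + 1;
}
static int SumClass_Scale(struct BaseClass *this, int factor)
{
    return this->Value * factor;
}
static struct BaseClassVTable SumClassVTable = {
    .Sum = SumClass_Sum,
    .Scale = SumClass_Scale
};

static int MacroDemo(void)
{
    void **block = calloc(1, sizeof(void *) + sizeof(struct BaseClass));
    assert(block != NULL);
    block[0] = &SumClassVTable;
    struct BaseClass *obj = (struct BaseClass *)&block[1];
    obj->Value = 2;

    int sum = BaseClass_Sum_dispatch(obj, 3);      /* 2 + 3 + 1 */
    int scaled = BaseClass_Scale_dispatch(obj, 4); /* 2 * 4 */
    free(block);
    return sum + scaled;
}
```

The trade-off is that macro-generated functions are harder to step through in a debugger, which is one reason a separate generator program may be preferable for larger class hierarchies.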
Single Inheritance Only
This is considered both a pro and a con in the scope of C programming. It is a benefit because it reduces the complexity of Object Oriented Programming compared with the programming languages that allow multiple inheritance. Multiple inheritance usually relies on the compiler to manage the implementation details of VTables and dispatch routines.
The project was done in C for one important reason: to enable the Foreign Function Interface, a mechanism that allows external programming languages to call C functions and to share data in a coherent way.
The practical problem with multiple inheritance in C is that a single data structure would end up containing the members of multiple base classes. Suppose we have the following code:
struct BaseClassA {
int A;
};
int BaseClassA_GetValue(struct BaseClassA* this)
{
return this->A;
}
struct BaseClassB {
int B;
};
int BaseClassB_GetOtherValue(struct BaseClassB* this)
{
return this->B;
}
struct InheritedClass {
struct BaseClassA objA;
struct BaseClassB objB;
};
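struct InheritedClass* ptr = InheritedClass_New();
BaseClassA_GetValue((struct BaseClassA*)ptr); // Works: objA sits at offset zero
BaseClassB_GetOtherValue((struct BaseClassB*)ptr); // Reads objA's storage, because objB sits at a non-zero offset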
See the problem?
The implementation for BaseClassB would have to be changed to use the correct address, or to obtain offset information, in order to retrieve its data. Supporting multiple inheritance in C would drive up the complexity so much that creating a new programming language may become necessary to alleviate the verbosity of such code. That is what C++, C#, Java, and so forth offer, at the cost of Foreign Function Interface compatibility and ABI compatibility.
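To make the offset problem concrete, the sketch below adjusts the pointer by the offset of the second base before calling the BaseClassB function, which is roughly the kind of adjustment a compiler for a language with multiple inheritance performs for you. InheritedClass_AsB and OffsetDemo are hypothetical names, and the hidden VTable slot is omitted to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

struct BaseClassA { int A; };
struct BaseClassB { int B; };

struct InheritedClass {
    struct BaseClassA objA;   /* offset zero */
    struct BaseClassB objB;   /* non-zero offset */
};

static int BaseClassA_GetValue(struct BaseClassA *this) { return this->A; }
static int BaseClassB_GetOtherValue(struct BaseClassB *this) { return this->B; }

/* Hypothetical helper: move the pointer to where the BaseClassB part begins. */
static struct BaseClassB *InheritedClass_AsB(struct InheritedClass *this)
{
    return (struct BaseClassB *)((char *)this + offsetof(struct InheritedClass, objB));
}

static int OffsetDemo(void)
{
    struct InheritedClass obj = { .objA = { .A = 10 }, .objB = { .B = 20 } };

    /* The first base needs no adjustment: a plain cast is enough. */
    int a = BaseClassA_GetValue((struct BaseClassA *)&obj);

    /* The second base must be reached through its offset. */
    int b = BaseClassB_GetOtherValue(InheritedClass_AsB(&obj));

    assert(offsetof(struct InheritedClass, objB) != 0);
    return a + b;
}
```

Every call site, and every dispatch function that looks for a VTable relative to this, would need this kind of adjustment, which is where the complexity described above comes from.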