Moving to C++ From C

The main reason that I've been using C89 for so many years is that it is the lowest common denominator across every platform I've seen; my code runs with little complaint on any ANSI C compiler, and that is the main reason that I have always used C.

However, over the past year, the features of C++ have had me yearning. Everything I've always done to program in an object-oriented fashion in C seems caveman-like and old-fashioned; doing object-oriented programming in an object-oriented language would be a godsend compared to how much work it takes in a language without features that expedite object-oriented design.

All of my codebase is in C. Every personal project that I've ever done up to this very second has been in C. It's almost become a matter of loyalty and discipline to maintain the utmost compatibility by using the earliest standard of C. But at the same time, I know that working in a more advanced (and complex) language would make my progress soar, because it would lower the time needed to develop and implement my projects.

I have more or less mastered C; I've read many books on it, spent years learning it, and written a couple of compilers in it. I have a decent amount of experience in C++: the commonalities with C, a couple of C++ classes that I had to take, and a couple of books that I've read.

One of the few major reasons that I've committed to C is the lack of name mangling, which means I can link code compiled with different compilers. I know you can extern "C" in C++, and achieve the same effect, but if I implement all of the logic in C++, there's a limit to what I can expose with a C interface. Another is the lack of any realloc()-like function. Perhaps at some point, I can override new with my custom allocator, which could be used to implement realloc(). However, I know that overriding new is bad news. I could find another solution, but with new[], delete[], constructors, and destructors involved, it is much harder than just reallocating raw blocks. The third is the ubiquity of C compilers; for many platforms, there is a C compiler in some form. I've always kept build requirements minimal, and requiring a C++ runtime feels as if it would add bloat to the basic requirements. However, I know that for most platforms, there are C++ compilers now. Additionally, there are now C++ compilers that can themselves be built with very bare-bones C compilers, for platforms that were never supported before.

My basic question is at what point do I give up portability, distributivity, and compatibility for ease of use, power, and flexibility?
One of the few major reasons that I've committed to C is the lack of name mangling, which means I can link code compiled with different compilers. I know you can extern "C" in C++, and achieve the same effect, but if I implement all of the logic in C++, there's a limit to what I can expose with a C interface.
It's surprisingly common for C++ code bases to provide a C interface or "wrapper". Even as a C++ developer, it's sometimes really nice to find that some library has a well thought out C API, so that knowledge is still handy.
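For example, a minimal sketch of what such a wrapper can look like (the Widget type and function names here are hypothetical, purely for illustration):

/* widget.h -- C-compatible header, usable from both languages */
#ifdef __cplusplus
extern "C" {
#endif

typedef struct Widget Widget;   /* opaque handle; C callers never see the class */

Widget* widget_create(int size);
void    widget_update(Widget* w);
void    widget_destroy(Widget* w);

#ifdef __cplusplus
}
#endif

// widget.cpp -- the implementation is free to use full C++ internally
#include "widget.h"

struct Widget {                 // a real C++ class behind the opaque handle
    explicit Widget(int n) : size(n) {}
    void update() { /* ... */ }
    int size;
};

Widget* widget_create(int size) { return new Widget(size); }
void    widget_update(Widget* w)  { w->update(); }
void    widget_destroy(Widget* w) { delete w; }

Because the header declares the functions with C linkage, any C compiler (or a differently-mangling C++ compiler) can link against the compiled library.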
Another is the lack of any realloc()-like function. Perhaps at some point, I can override new with my custom allocator, which could be used to implement realloc().

If you want to manage a resizable block of memory in C++, the standard solution is to use std::vector, which does the right thing™ by the C++ class model, calling constructors/destructors on each of the objects within the block as required.
However, even though C++ gives you the option of having "higher level" classes, with operators and constructors, etc... plain old data (POD) types are still common/useful in C++ code, just like in C code. So, if you have a situation where the simplest solution is using a POD struct and realloc, it's fine to do so.
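For instance (a quick sketch, not from the original post):

#include <vector>
#include <cstdlib>

struct Particle { float x, y, z; };  // POD: no constructor/destructor needed

int main() {
    // The C++ way: std::vector manages growth and object lifetimes.
    std::vector<Particle> v;
    v.resize(16);                     // grows (and later frees) automatically

    // The C way is still fine for PODs: realloc may extend the block in place.
    Particle* p = (Particle*)std::malloc(16 * sizeof(Particle));
    Particle* grown = (Particle*)std::realloc(p, 32 * sizeof(Particle));
    if (grown) p = grown;             // on failure, the old block survives
    std::free(p);
    return 0;
}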
requiring a C++ runtime feels as if it would add bloat to the basic requirements

Some compilers will (by default) require your users to install a C++ runtime, but you should also have the option of statically linking the parts of the runtime that you actually use into your binaries.
My basic question is at what point do I give up portability, distributivity, and compatibility for ease of use, power, and flexibility?

As above:
* If you're writing a library and want to distribute a binary version that's as widely usable as possible, then a C interface over your C++ code might still be useful.
* You can usually statically link to the runtime to simplify distribution of applications.
* C++98/C++03 is very widely supported now (except for a few dark corners of the spec which nearly every compiler failed to implement and nobody uses as a result), nearly as much as C89! C++11 compilers are still being developed, so C++11 code is definitely less portable than C.
A question for Hodgman: is there a downside to statically linking in the C and C++ runtimes other than the few KB of increased code size it brings? It eliminates the possibility of moving the .dll files or failing to redistribute said .dll files. In games this is a big concern, as users don't often know the details of what the files do and might accidentally delete them.



It's surprisingly common for C++ code bases to provide a C interface or "wrapper". Even as a C++ developer, it's sometimes really nice to find that some library has a well thought out C API, so that knowledge is still handy

The main thing I worry about is that if I compile one library with an excellent class interface and want to use it from another library, my options are either to hide the interface behind a C interface, defeating the purpose of the easy-to-use class, or to compile both the library and the code using it together into one executable.
I guess I can't really picture how I could use already compiled classes in another project without losing the C++ class interfaces.

If you want to manage a resizable block of memory in C++, the standard solution is to use std::vector, which does the right thing™ by the C++ class model, calling constructors/destructors on each of the objects within the block as required.
However, even though C++ gives you the option of having "higher level" classes, with operators and constructors, etc... plain old data (POD) types are still common/useful in C++ code, just like in C code. So, if you have a situation where the simplest solution is using a POD struct and realloc, it's fine to do so.

Using vectors (theirs or mine) would be an amazing thing, compared to having to manually manage it all the time.
However, in the implementation of a class, I would likely be using realloc() for primitive data types; imagine an implementation of std::string that manages a char array. Calling new might exhaust memory when requesting a larger size, when realloc() could easily satisfy the request by extending the block in place and avoiding a copy. I've always had a fear of prematurely failing to allocate memory. However, having half of your code use one allocator and the rest use another sounds like bad news as well.
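(To make that concern concrete -- a sketch assuming a raw char buffer like a string class might manage:)

#include <cstddef>
#include <cstdlib>
#include <cstring>

// new[]-based growth: the old and new blocks must coexist during the copy,
// so peak usage is oldSize + newSize -- this is the exhaustion risk.
char* grow_with_new(char* old, std::size_t oldSize, std::size_t newSize) {
    char* grown = new char[newSize];
    std::memcpy(grown, old, oldSize);
    delete[] old;
    return grown;
}

// realloc-based growth: the allocator may simply extend the block in place,
// needing no extra memory and no copy. NOTE: only valid if 'old' came from
// malloc, never from new[].
char* grow_with_realloc(char* old, std::size_t newSize) {
    char* grown = (char*)std::realloc(old, newSize);
    return grown ? grown : old;       // the old block is untouched on failure
}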


Some compilers will (by default) require your users to install a C++ runtime, but you should also have the option of statically linking the parts of the runtime that you actually use into your binaries.

I often forget about statically linking, because I try to avoid it, but my biggest question is how I would statically link code compiled by another compiler. For one, would it not restrict me to a C interface anyway? And secondly, even if the binaries were stored in the exact same format, with debugging symbols, it seems possible that another compiler might produce a mangled name that collides with a name in this module for a different function definition.


If you're writing a library and want to distribute a binary version that's as widely usable as possible, then a C interface over your C++ code might still be useful.

I could likely pull off a C interface for others to use; I've been doing it all along. However, I would prefer to use a C++ interface when I use the library in another project...


You can statically link to the runtime to simplify distribution of applications.

This seems like a decent possibility, though it results in having multiple copies of the same code in different applications, unless I use a C interface. However, if I want to be link-compatible, it'd probably require that I use the same compiler, which means compiling both codebases together anyway.


C++98/C++03 is very widely supported now (except for a few dark corners of the spec which nearly every compiler failed to implement and nobody uses as a result), nearly as much as C89! C++11 compilers are still being developed, so C++11 code is definitely less portable than C.

Knowing me, I'd only ever use C++98. :) However, I do recognize that all of my platforms now have C++ compilers, so my prudence in that area might be for naught in this day and age...
For the record, C++98 and C++03 are pretty much the same thing. AFAIK, C++03 was basically just a "bug fix" revision of the spec, not a collection of new features like C++11 is.
is there a downside to statically linking in the C and C++ runtimes other than the few KB of increased code size it brings?
One down-side is that there may be bugs (e.g. security holes that allow hackers to exploit your program) in the C++ runtime -- if you dynamically link to them, then Microsoft (or whoever) can release a patch/update for the runtime, which will fix these bugs across many different programs.
I guess I can't really picture how I could use already compiled classes in another project without losing the C++ class interfaces.
Well the simple answer is: don't rely on using long-compiled code. Keep the source for your libraries around so you can recompile them on your current compiler.
For perspective, if I don't have the source for a library, I just don't use it, and this works fine for me. And besides the mentioned reasons (name mangling, etc), you usually want to make sure that all of your code uses similar compiler/linker settings anyway -- e.g. all code for my engine is compiled using specific settings (depending on build type) to disable exceptions & RTTI, to determine how floating-point code is treated, which instruction sets are allowed (e.g. SSE), the level of security/debugging checks, whether intrinsics are used, the optimisation level, whether link-time code-gen will be used, what kind of debug/symbol databases are generated... It can be a real pain if different libraries have been compiled in different ways, even if on the same compiler.
I must compile both the library and the code using it together into one executable.

There's no reason the library can't still be a (static or dynamic) library, separate from the code that uses it (which may be an executable). You just need to ensure they're both compiled by the same compiler.
Calling new might exhaust memory when requesting a larger size, when realloc() could easily satisfy the request by extending the block in place and avoiding a copy. I've always had a fear of prematurely failing to allocate memory. However, having half of your code use one allocator and the rest use another sounds like bad news as well.

Are you writing code for embedded systems with limited RAM and no virtual memory?
If not, then new/malloc are very unlikely to fail... and if so, then you should probably be avoiding all kinds of dynamic memory allocation as much as possible anyway!
Cases where a malloc will fail but a realloc to a slightly larger size will work are pretty rare -- it only happens when the allocator has left free space that coincidentally sits immediately after your to-be-resized allocation, and the fact that there is such fragmentation is a problem anyway...

Well the simple answer is: don't rely on using long-compiled code. Keep the source for your libraries around so you can recompile them on your current compiler.
For perspective, if I don't have the source for a library, I just don't use it, and this works fine for me. And besides the mentioned reasons (name mangling, etc), you usually want to make sure that all of your code uses similar compiler/linker settings anyway -- e.g. all code for my engine is compiled using specific settings (depending on build type) to disable exceptions & RTTI, to determine how floating-point code is treated, which instruction sets are allowed (e.g. SSE), the level of security/debugging checks, whether intrinsics are used, the optimisation level, whether link-time code-gen will be used, what kind of debug/symbol databases are generated... It can be a real pain if different libraries have been compiled in different ways, even if on the same compiler.

Gah... Well, I guess that's the way it has to go. Long gone are the days when keeping everything down to the calling convention in sync was all it took. I guess having to recompile ALL of the code you're using is the price to pay. I'm assuming that if you have any sort of modular interface between libraries, using a C interface is the way to go.


There's no reason the library can't still be a (static or dynamic) library, separate from the code that uses it (which may be an executable). You just need to ensure they're both compiled by the same compiler.

Well, for it to be dynamic, it'd have to have a C interface, making the point moot. Unless you're really good at guessing how the compiler mangles the name. And this removes the possibility of compiling a library for distribution, and having multiple applications use it, because they must all have the same compiler (maybe even the same version), or it no longer works and I need to ship a new version of everything that works together.

Basically, I am used to the ideology that I distribute a library that an application depends on, and I can swap out that library with a newer version if I choose, while having multiple applications still use it. I suppose that's no longer the way to go, and compiling and statically linking everything together is the current way of doing things. I know I can still do it if I use the C interface, but I must go from awesome class to plain old data structure back to another awesome class, unless I decide to serialize the data to smuggle it through customs or something. It's just one of those things that keeps me from C++, that in order to maintain status quo, I need to find ways around the features I'd stand to gain.


Are you writing code for embedded systems with limited RAM and no virtual memory?
If not, then new/malloc are very unlikely to fail... and if so, then you should probably be avoiding all kinds of dynamic memory allocation as much as possible anyway!
Cases where a malloc will fail but a realloc to a slightly larger size will work are pretty rare -- it only happens when the allocator has left free space that coincidentally sits immediately after your to-be-resized allocation, and the fact that there is such fragmentation is a problem anyway...

Well... Yes, actually. My phone has 23 MB of usable RAM, with much of it taken up by the OS, so even running one other application (like SMS messaging) puts a chokehold on the amount of free memory. Additionally, my memory allocator allows you to provide your own pool and set how much memory you want to use, so if I decide that I want it to have only a certain amount of memory, to keep more important tasks from being killed to reclaim memory, that further restricts things. In some cases, it could be a valid concern.

Ultimately, I suppose I'm trying to work out whether what I stand to gain in ease of use is worth the trouble.
I'm assuming that if you have any sort of modular interface between libraries, using a C interface is the way to go.
Yep, for plugin systems and the like, C might be a good choice.
Well, for it to be dynamic, it'd have to have a C interface, making the point moot

It can be dynamic, as in, a .dll or .so file... but yes, it's not replaceable by the user, unless they create a build from the same compiler.
I am used to the ideology that I distribute a library that an application depends on, and I can swap out that library with a newer version if I choose, while having multiple applications still use it

From a professional QA perspective, that kind of environment isn't feasible. When shipping a game, QA needs to be able to reliably reproduce the same binary distribution on every machine (e.g. if a bug comes in from a user, you need to be able to replicate the user's installation on one of your PCs).
In a world where one of your dependencies is installed into some shared system directory, and is outside of your control, you basically just can't do QA on the game.
So in order to be in control of the quality of the product you're selling, you need to bundle all of your dependencies into the game's installation, whether that be putting the .dll/.so files in the same place as the executable, or statically linking those libraries into the executable file itself. In this case, there is going to be a shared compiler between all components, so you (the game author) can update DLLs with new versions.

I understand the system-wide library is a popular methodology, especially on Linux systems... but a big studio would never be able to release a game using that model.
Well... Yes, actually. My phone has 23 MB of usable RAM

In that case... If you're writing "good" C++ code, then you should be using std::vector, and shared_ptr and unique_ptr to do your memory management... but in the embedded case, then the oft-criticized "C with classes" style of C++ can actually be useful.
The old PS2-era engines that I worked with were all written in C++, but in a very C-style of it (no usage of std::*, no constructors/destructors, etc).

Personally, I still think that RAII, proper use of the rule-of-three, and constructors/destructors are key to good C++ code in any style (whereas the typical "C with classes" style usually shuns these C++ concepts, I'd still recommend using them), but for embedded systems, I do shun pretty much any part of std::* that deals with memory.
However, the point I was making before is, I also shun C's malloc/free in these situations.
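(For reference, the rule-of-three in its minimal form -- the Buffer class below is a made-up example, not engine code:)

#include <cstddef>
#include <cstring>

class Buffer {
public:
    explicit Buffer(std::size_t n) : size(n), data(new char[n]) {}

    // Rule of three: if you need any one of these, you almost
    // certainly need all three.
    Buffer(const Buffer& other)                // 1. copy constructor
        : size(other.size), data(new char[other.size]) {
        std::memcpy(data, other.data, size);
    }
    Buffer& operator=(const Buffer& other) {   // 2. copy assignment
        if (this != &other) {
            char* copy = new char[other.size];
            std::memcpy(copy, other.data, other.size);
            delete[] data;
            data = copy;
            size = other.size;
        }
        return *this;
    }
    ~Buffer() { delete[] data; }               // 3. destructor

private:
    std::size_t size;
    char* data;
};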

In my engine I use a custom new keyword, which uses a "scope" allocator, which internally uses a stack allocator. The vast majority of the engine is scope/stack allocated in this way (I can count the malloc calls on one hand), and although this type of allocator doesn't support random free/realloc semantics, it instead makes allocations almost free, almost eliminates fragmentation, removes leaks like RAII smart pointers (but without the burden of ref-counting or GC), still respects C++ destructors, and makes your memory allocation patterns extremely predictable, which is great for RAM-constrained situations.
This may be a tad off-topic, but I think that C works quite well for the component-entity system as found in things like Unity.
This is because, rather than a game object inheriting from all the classes that provide its functionality, components are added to it instead.
This means that some of the useful functionality found in C++ might be wasted.

I have noticed that there are only a couple of pure C game engines, so perhaps C isn't so popular because most game libraries are implemented in C++ without any C interfaces?

Perhaps C will make a comeback in games dev some day :)

From a professional QA perspective, that kind of environment isn't feasible. When shipping a game, QA needs to be able to reliably reproduce the same binary distribution on every machine (e.g. if a bug comes in from a user, you need to be able to replicate the user's installation on one of your PCs).
In a world where one of your dependencies is installed into some shared system directory, and is outside of your control, you basically just can't do QA on the game.
So in order to be in control of the quality of the product you're selling, you need to bundle all of your dependencies into the game's installation, whether that be putting the .dll/.so files in the same place as the executable, or statically linking those libraries into the executable file itself. In this case, there is going to be a shared compiler between all components, so you (the game author) can update DLLs with new versions.

I understand the system-wide library is a popular methodology, especially on Linux systems... but a big studio would never be able to release a game using that model.

I suppose you're right. For testing, you need determinism. I've long considered the implication that a different version might break a game, so you must include the version with which the game is known to work. It's also been evident that most games distributed on read-only media have all of the code that they use stored in one place, so two games that use the same engine will each carry a copy of the engine.

I presume that this is the better way to do it. Also, dynamic linking is a pain, and it is yet another opportunity for the program to fail, since you have to worry about whether or not every symbol was resolved.


In that case... If you're writing "good" C++ code, then you should be using std::vector, and shared_ptr and unique_ptr to do your memory management... but in the embedded case, then the oft-criticized "C with classes" style of C++ can actually be useful.
The old PS2-era engines that I worked with were all written in C++, but in a very C-style of it (no usage of std::*, no constructors/destructors, etc).

Some of the main reasons that I consider C++ are no longer having to manually call constructors and destructors, templates, operator overloading, default parameters, having member functions, and being able to worry less about validating parameters.

By validating parameters, I mean constantly checking whether the pointer to the object upon which I'm operating is NULL, or whether objects passed to me by address are NULL. The this pointer can be assumed to be valid, from what I've seen, and passing by reference in C++ gives a lot more reassurance that the other object isn't going to be a NULL pointer.
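(A hypothetical sketch of the difference:)

struct Vec3 { float x, y, z; };

// C style: every pointer parameter needs a defensive NULL check.
int vec3_add(Vec3* out, const Vec3* a, const Vec3* b) {
    if (!out || !a || !b) return -1;  // validate everything, every call
    out->x = a->x + b->x;
    out->y = a->y + b->y;
    out->z = a->z + b->z;
    return 0;
}

// C++ style: references can't legitimately be null, so the checks
// disappear and the signature itself documents the contract.
Vec3 add(const Vec3& a, const Vec3& b) {
    Vec3 r = { a.x + b.x, a.y + b.y, a.z + b.z };
    return r;
}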


Personally, I still think that RAII, proper use of the rule-of-three, and constructors/destructors are key to good C++ code in any style (whereas the typical "C with classes" style usually shuns these C++ concepts, I'd still recommend using them), but for embedded systems, I do shun pretty much any part of std::* that deals with memory.
However, the point I was making before is, I also shun C's malloc/free in these situations.

I shy away from malloc()/free() as well; many of my objects' constructors accept a pointer to an optional block of preallocated memory, so most objects that I use are allocated on the stack to be reclaimed when I go out of scope. I try my hardest to allow objects to be stack-allocated, because if I don't heap-allocate anything, I don't need to manually call a destructor; the memory is reclaimed automatically. However, having real destructors would be a great relief: I could put the cleanup code there, instead of watching every place where I might exit the function and inserting a destruct call for every constructed object.
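(That relief is exactly what a destructor buys -- a minimal RAII sketch, with a hypothetical File wrapper:)

#include <cstdio>

class File {
public:
    explicit File(const char* path) : f(std::fopen(path, "rb")) {}
    ~File() { if (f) std::fclose(f); }  // cleanup runs on EVERY exit path
    bool ok() const { return f != 0; }
private:
    std::FILE* f;
    File(const File&);                  // non-copyable (C++98 idiom)
    File& operator=(const File&);
};

void parse(const char* path) {
    File file(path);
    if (!file.ok()) return;  // early exit: destructor still runs
    // ... any number of other early returns, all safe ...
}                            // normal exit: destructor runs here too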


In my engine I use a custom new keyword, which uses a "scope" allocator, which internally uses a stack allocator. The vast majority of the engine is scope/stack allocated in this way (I can count the malloc calls on one hand), and although this type of allocator doesn't support random free/realloc semantics, it instead makes allocations almost free, almost eliminates fragmentation, removes leaks like RAII smart pointers (but without the burden of ref-counting or GC), still respects C++ destructors, and makes your memory allocation patterns extremely predictable, which is great for RAM-constrained situations.

How do you pull off having a custom new? I'm concerned that others' modules that I might use would rely on new having default behavior; I realize that a module should free its own memory, all the same.


I have noticed that there are only a couple of pure C games engines so perhaps it isn't so popular because most game libraries are implemented in C++ without any C interfaces?

For me, the sheer number of manual constructor and destructor calls is enough to make me consider porting my years-in-the-making engine and libraries to C++. All the other features are immensely useful and would help me write programs faster, but object construction, destruction, and validation are tedious, and wear away at my morale.
I shy away from malloc()/free() as well; many of my objects' constructors accept a pointer to an optional block of preallocated memory, so most objects that I use are allocated on the stack to be reclaimed when I go out of scope.
In C++, the equivalent is "placement new", which is how you manually call a constructor on a block of memory (whereas regular new allocates memory and calls the constructor). However, you've got to be very careful doing this kind of stuff if you want to keep regular C++ behaviour -- destructors won't be called when the memory goes out of scope, so when using placement new, you must also manually call the destructors of your objects at the appropriate times.
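(A minimal placement-new sketch, for illustration:)

#include <new>  // declares the placement form of operator new

struct Foo {
    explicit Foo(int v) : value(v) {}
    int value;
};

int main() {
    char buffer[sizeof(Foo)];       // pre-allocated storage (real code must
                                    // also guarantee suitable alignment)
    Foo* f = new (buffer) Foo(42);  // construct in place: no heap allocation
    // ... use f ...
    f->~Foo();                      // manual destructor call is mandatory;
    return 0;                       // nothing happens automatically
}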
How do you pull off having a custom new? I'm concerned that others' modules that I might use would rely on new having default behavior

I linked to the code above; that bit is open source.
Instead of overriding the new operator with my own version, I've chosen to invent my own keyword via a macro - eiNew (ei is my engine's "macro prefix", short for "eight", the name of the engine). If I use any 3rd party code that relies on new, then it behaves as usual.
My eiNew macro uses a stack allocator to grab enough memory for the new object, then uses placement new to construct the object in that area. It also adds the address of the object to a linked-list belonging to a "scope" object (which is a RAII-type object), which is used to call the destructor when the scope is destructed.

I also have an eiAlloc macro, which does the same thing, but without calling constructors/destructors.

e.g. Using the built-in call stack, we can write nice C++ code like:

class Foo { ... };

{
    Foo obj1( 42 );   // increase "the call stack" by sizeof(Foo), call constructor with arg "42"
    Foo obj2( 1337 ); // again
} // obj2 and obj1 are out of scope, and are destructed

But for cases where I want to use memory other than the built-in call stack, my eiNew macro mimics this regular behaviour for my own buffers:

char buffer[1024];
StackAlloc stack( buffer, 1024 );
{ // *marker
    Scope a( stack ); // N.B. Scope objects could also be created with eiNew, instead of being created in the call-stack
    Foo* obj1 = eiNew( a, Foo )( 42 );   // increases stack's pointer by sizeof(Foo), constructs a Foo with param "42", adds obj into the scope's destruction list
    Foo* obj2 = eiNew( a, Foo )( 1337 ); // again...
} // "a" has gone out of scope, obj2 and obj1 have their destructors called, the stack pointer in "stack" is reset back to where it was at "*marker"

Most of the time when I use malloc, it's to grab big buffers, like "buffer" above, and then I use eiNew for everything.
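(To make the mechanics concrete, here's a rough sketch of how such a macro might be put together -- guesswork in the spirit of the scope/stack technique, not the engine's actual code; alignment and overflow handling are omitted for brevity:)

#include <new>
#include <cstddef>

class StackAlloc {
public:
    StackAlloc(void* buffer, std::size_t size)
        : base((char*)buffer), top(0), capacity(size) {}
    void* Alloc(std::size_t bytes) {              // just bump a pointer
        void* p = base + top;
        top += bytes;
        return p;
    }
    std::size_t Mark() const { return top; }
    void Reset(std::size_t mark) { top = mark; }  // "frees" everything at once
private:
    char* base; std::size_t top, capacity;
};

class Scope {
    struct Finalizer { void (*dtor)(void*); void* obj; Finalizer* next; };
    template <class T> static void CallDtor(void* p) { ((T*)p)->~T(); }
public:
    explicit Scope(StackAlloc& s) : stack(s), mark(s.Mark()), list(0) {}
    ~Scope() {
        // Run destructors newest-first, then rewind the stack pointer.
        for (Finalizer* f = list; f; f = f->next) f->dtor(f->obj);
        stack.Reset(mark);
    }
    template <class T> void* AllocWithFinalizer() {
        Finalizer* f = (Finalizer*)stack.Alloc(sizeof(Finalizer));
        void* obj = stack.Alloc(sizeof(T));
        f->dtor = &CallDtor<T>; f->obj = obj; f->next = list; list = f;
        return obj;
    }
private:
    StackAlloc& stack; std::size_t mark; Finalizer* list;
};

// eiNew( a, Foo )( 42 ) expands to: new ((a).AllocWithFinalizer<Foo>()) Foo( 42 )
#define eiNew(scope, T) new ((scope).AllocWithFinalizer<T>()) T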


These ideas for making C++ (constructors/destructors) interact nicely with the simplicity of "stack allocators" come from the "scope/stack" link in my last post.

