Using Modern C++ to Eliminate Memory Problems

Some tips for avoiding common errors in C++ programs.

0. Introduction
Programming is hard. Programming in C++ is even harder. Unfortunately, it is often made unnecessarily hard by programmers' reluctance to adopt modern, safer methods and idioms. Bring up the topic of C++ at the lunch table and -- especially if there are Java programmers present -- you will be greeted with the customary horror stories of buffer overruns, memory leaks, and wild pointer errors that led to caffeine binges and marathon programming sessions.

Sadly, these kinds of errors occur far too often in C++ programs. Not because the language is inherently unsafe, but because many C++ programmers don't know how to use it safely. If you are tired of these kinds of errors in your C++ programs, you've come to the right place. Relax, put down that Java compiler before it goes off, and follow the simple rules outlined in this article.

1. Use std::string instead of char * or char []
Character arrays are the only way to encapsulate string data in C. They're quick and easy to use, but unfortunately their use can be fraught with peril. Let's look at some of the more common errors that occur with character pointers and arrays. Keep in mind that most if not all of these problems will go undetected by the compiler.

Ex. 1 - Forgetting to allocate enough space for string terminator

    // Oops! No room for the '\0' terminator! C accepts this initializer;
    // a conforming C++ compiler rejects it.
    char myName[4] = "Dave";
    strcpy(anotherName, myName); // Might copy four characters or 4000

Ex. 2 - Forgetting to allocate memory for a char *

    char * errorString;
    ...
    strcpy(errorString, "SomeValueDeterminedAtRuntime");

Usually this error is caught rather quickly with a segmentation violation.

Ex. 3 - Returning a pointer to space allocated on the stack

    char * getName()
    {
        char name[256];
        strcpy(name, "SomeStaticValue");
        ...
        strcat(name, "SomeValueDeterminedAtRuntime");
        return name;
    }

    char * myName = getName();

Once the function returns, the stack space allocated to name is reclaimed, so myName is left pointing at memory that may be overwritten at any time.

Ex. 4 - The dread function sprintf()

    char buf[128];
    sprintf(buf, "%s%d", someValueGottenAtRuntime, someInteger);

Unless you are absolutely sure of how much space you need, it's all too easy to overrun a buffer with sprintf().

Now, let's revisit each example and show how a std::string eliminates the aforementioned problems:

Ex. 1a

    std::string myName = "Dave";
    std::string anotherName = myName;

Ex. 2a

    std::string errorString;
    ...
    errorString = "SomeValueDeterminedAtRuntime";

Ex. 3a

    std::string getName()
    {
        std::string name;
        name = "SomeStaticValue";
        ...
        name += "SomeValueDeterminedAtRuntime";
        return name;
    }

    std::string myName = getName();

Ex. 4a

    // std::ostringstream is declared in <sstream>
    std::string buf;
    std::ostringstream ss;
    ss << someValueGottenAtRuntime << someInteger;
    buf = ss.str();

This one's a no-brainer, folks. Avoid the headaches associated with character arrays and pointers and use std::string. For legacy functions that expect a character pointer, you can use std::string's c_str() member function.
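To make the formatting and c_str() points concrete, here is a minimal sketch. The names makeLabel, legacyLength, and demoStrings are hypothetical, and std::strlen merely stands in for whatever legacy function expects a character pointer.

```cpp
#include <cassert>
#include <cstring>   // std::strlen, standing in for a legacy C API
#include <sstream>   // std::ostringstream
#include <string>

// Hypothetical helper: builds a label from a runtime string and an integer
// without a fixed-size buffer, in the spirit of Ex. 4.
std::string makeLabel(const std::string & prefix, int number)
{
    std::ostringstream ss;
    ss << prefix << number;
    return ss.str();   // returned by value, so nothing dangles (compare Ex. 3)
}

// Hypothetical wrapper around a legacy function that takes a const char *.
std::size_t legacyLength(const std::string & s)
{
    // The pointer from c_str() stays valid only while s is not modified.
    return std::strlen(s.c_str());
}

// Small self-check showing the expected behavior.
void demoStrings()
{
    assert(makeLabel("player", 42) == "player42");
    assert(legacyLength("Dave") == 4);
}
```

Calling demoStrings() from main() exercises both helpers. Note that the std::string grows as needed, so there is no buffer size to guess.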

2. Use standard containers instead of homegrown containers
Besides std::string, the standard library provides the following container classes that you should prefer over your homegrown alternatives: vector, deque, list, set, multiset, map, multimap, stack, queue, and priority_queue. It is beyond the scope of this article to describe these in detail; however, you can probably work out what most of them are from their names. For a proper treatment of the subject, I highly recommend the book by Josuttis listed in my references.

An important feature of the standard containers is that they are all template classes. This is a powerful concept. Templates let you define lists (or stacks or vectors) of *any* data type. The compiler generates type-safe code for each type of list you create. With C, you either needed a list for each type of data it would hold (e.g. IntList, MonsterList, StringList) or the list would hold a void * that pointed to data in each node; somewhat the antithesis of type-safety.

Let's look at a simple example with the commonly used std::vector. You'll want to use std::vector (or std::deque) instead of variable-length arrays.

    #include <iostream>
    #include <vector>
    using namespace std;

    int main()
    {
        vector<int> v;

        // Add elements to the end of the vector.
        // The vector class handles resizing
        v.push_back(1);
        v.push_back(2);
        v.push_back(3);
        v.push_back(4);

        // Careful - bounds-checking not performed
        cout << v[2] << endl;

        // iterate like you would with arrays
        for (vector<int>::size_type i = 0; i < v.size(); i++) {
            cout << v[i] << endl;
        }

        // iterate with an iterator
        vector<int>::iterator p;
        for (p = v.begin(); p != v.end(); p++) {
            cout << *p << endl;
        }
    }

In addition to providing generic, type-safe containers for any data type, these classes also provide multiple ways to search and iterate, and like std::string, they manage their own memory - a huge win over rolling these things yourself. I can't stress enough how important it is to familiarize yourself with the standard containers. Josuttis' book is an invaluable reference that is always within arm's reach of my keyboard.
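As a small illustration of the searching and bounds-checking points, the sketch below uses two standard facilities: vector's at() member, which throws std::out_of_range for a bad index, and the std::find algorithm. The helper names isOutOfRange, contains, and demoVector are hypothetical.

```cpp
#include <algorithm>  // std::find
#include <cassert>
#include <stdexcept>  // std::out_of_range
#include <vector>

// Returns true if reading index i through at() is rejected.
bool isOutOfRange(const std::vector<int> & v, std::vector<int>::size_type i)
{
    try {
        (void)v.at(i);   // unlike v[i], at() checks the index
        return false;
    } catch (const std::out_of_range &) {
        return true;
    }
}

// Returns true if value is stored somewhere in v.
bool contains(const std::vector<int> & v, int value)
{
    return std::find(v.begin(), v.end(), value) != v.end();
}

// Small self-check showing the expected behavior.
void demoVector()
{
    std::vector<int> v;
    for (int i = 1; i <= 4; ++i) {
        v.push_back(i);
    }
    assert(!isOutOfRange(v, 2));
    assert(isOutOfRange(v, 10));
    assert(contains(v, 3));
    assert(!contains(v, 42));
}
```

Using at() costs a check on every access, so it is a trade-off rather than a blanket replacement for operator[].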

3. Manage your class data to avoid resource leaks and crashes
There are other resources besides pointers that you might have to manage in your program: file handles, sockets, semaphores, etc. And sometimes you will manage raw pointers. The way to effectively manage resources in C++ is to wrap them in a class that performs their allocation and initialization in the constructor and their deallocation in the destructor:

    class ResourceHog {
        SomeObject * object_;
        FILE * file_;
    public:
        ResourceHog()
        {
            object_ = new SomeObject();
            file_ = fopen("myfile", "r");
        }
        ~ResourceHog()
        {
            delete object_;
            if (file_) fclose(file_); // fopen() returns NULL on failure
        }
    };

The constructor allocates and opens resources, the destructor closes and deallocates them. But you're not done yet. When your objects are copied or assigned, their copy constructors and assignment operators are called.

"But I didn't write a copy constructor or assignment operator!", you protest.

Perhaps you didn't, but your compiler did for you, and their default behavior is to copy your member data member by member -- probably not what you want if your object manages pointers or other resources. Consider the following:

    class String {
        char * data_;
    public:
        // Member functions
    };

    String s1("hello, memory problems");
    String s2 = s1; // String copy constructor called
    String s3;
    s3 = s1;        // String assignment operator called

If you didn't write a copy constructor or assignment operator for String to allocate space for the data_ member and copy its contents, then your compiler generated ones that merely copy the data_ *pointer*. Now you have three String objects, each holding a copy of the same pointer but not a copy of the data it points to.

When the first of these String objects goes out of scope, its destructor deletes the pointer and the memory it points to is reclaimed. The other two objects still hold that pointer, which now refers to reclaimed memory -- i.e., they are "dangling pointers". When one of their destructors runs, it deletes the same memory a second time. That is undefined behavior, and in practice it often shows up as an access violation and a crash.

One way to fix this problem is to not let client code copy or assign objects of your class. The trick here is to *declare* the copy constructor and assignment operator, but not implement them:

    class String {
        char * data_;

        // Copy c'tor with no implementation
        String(const String &);

        // Assignment op with no implementation
        String & operator=(const String &);
    public:
        // Member functions
    };

Now code that tries to copy or assign objects of your class will not compile. That's because it can't access your private functions. Friend and member functions that try the same will compile, but the linker will stop them since these functions are not defined.
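On a C++11 or later compiler, the same intent can be stated directly by marking the two functions as deleted, which turns every attempted copy into a compile-time error, even inside members and friends. The sketch below is an assumption beyond the original text: the class name Buffer and its 64-byte allocation are illustrative only, and the type traits from <type_traits> are used just to make the effect checkable.

```cpp
#include <cassert>
#include <type_traits>  // std::is_copy_constructible, std::is_copy_assignable

// Hypothetical resource-owning class; name and size are illustrative only.
class Buffer {
    char * data_;
public:
    Buffer() : data_(new char[64]) {}
    ~Buffer() { delete [] data_; }

    // C++11: explicitly forbid copying and assignment.
    Buffer(const Buffer &) = delete;
    Buffer & operator=(const Buffer &) = delete;
};

// Both checks are evaluated at compile time.
static_assert(!std::is_copy_constructible<Buffer>::value,
              "Buffer must not be copy-constructible");
static_assert(!std::is_copy_assignable<Buffer>::value,
              "Buffer must not be copy-assignable");
```

With this class, a line such as `Buffer b2 = b1;` fails to compile with a message naming the deleted function, rather than failing later at link time.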

Not allowing copying and assignment is, in fact, my default MO when designing a new class. Rarely do you need multiple copies of objects floating around your program. If you do need a copy constructor and assignment operator, they are pretty easy to code up. You're just creating a new object from an existing one:

    Matrix::Matrix(const Matrix & m)
    {
        data_ = new float[16];
        memcpy(data_, m.data_, 16 * sizeof(float));
    }

    Matrix & Matrix::operator=(const Matrix & m)
    {
        if (this != &m) { // Check for self-assignment
            float * newData = new float[16];
            delete [] data_;
            data_ = newData;
            memcpy(data_, m.data_, 16 * sizeof(float));
        }
        return *this;
    }

A good rule of thumb is that if you need a destructor, then you also need a copy constructor and assignment operator (or you need to prevent other code from calling them as shown above).
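Putting that rule of thumb together, a complete class might look like the sketch below. The default constructor, the get/set accessors, and demoMatrix are assumptions added so the example is self-contained; only the copy constructor and assignment operator follow the fragments above.

```cpp
#include <cassert>
#include <cstring>  // std::memcpy

// Hypothetical complete Matrix: destructor, copy constructor, and
// assignment operator are all defined together.
class Matrix {
    float * data_;
public:
    Matrix() : data_(new float[16]()) {}   // 16 floats, zero-initialized
    ~Matrix() { delete [] data_; }

    Matrix(const Matrix & m) : data_(new float[16])
    {
        std::memcpy(data_, m.data_, 16 * sizeof(float));
    }

    Matrix & operator=(const Matrix & m)
    {
        if (this != &m) {                  // check for self-assignment
            // Allocate first: if new throws, *this is left untouched.
            float * newData = new float[16];
            std::memcpy(newData, m.data_, 16 * sizeof(float));
            delete [] data_;
            data_ = newData;
        }
        return *this;
    }

    // Illustrative accessors; no bounds checking is performed.
    float get(int i) const { return data_[i]; }
    void set(int i, float value) { data_[i] = value; }
};

// Small self-check: copies own their data independently of the original.
void demoMatrix()
{
    Matrix a;
    a.set(0, 1.5f);
    Matrix b = a;     // copy constructor
    Matrix c;
    c = a;            // assignment operator
    a.set(0, 9.0f);   // modifying a must not affect the copies
    assert(b.get(0) == 1.5f);
    assert(c.get(0) == 1.5f);
    assert(a.get(0) == 9.0f);
}
```

Each Matrix deletes only its own array, so the three objects can be destroyed in any order without touching memory that another one owns.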

4. Use virtual destructors in class hierarchies
Compile and run the following program and observe what happens when the pointer to Base is deleted:

    #include <iostream>
    using namespace std;

    class Base {
        // Base private data
        // Base private member functions
    public:
        ~Base() { cout << "~Base()" << endl; }
        // Base public member functions
    };

    class Derived : public Base {
        // Derived private data
        // Derived private member functions
    public:
        ~Derived() { cout << "~Derived()" << endl; }
        // Derived public member functions
    };

    int main()
    {
        Base * bp = new Derived();
        delete bp;
    }

On my system, the output looks like this:

    $ myprogram
    ~Base()
    $

The Derived destructor wasn't called! This means that anything the Derived part of the object owns is never cleaned up, and you have no way of getting it back. That's a leak. Strictly speaking, deleting a derived object through a pointer to a base class whose destructor is not virtual is undefined behavior, so other systems may behave differently. Fortunately, it is easily prevented by declaring your base class destructors as virtual:

    virtual ~Base() { cout << "~Base()" << endl; }

Now the destructor is called polymorphically through the pointer to Base. What this means is that the object is destructed in the reverse of the order in which it was constructed: the Derived destructor is called, then the Base destructor, which is what you want.

If you make ~Base() virtual in the above example, you'll see that the object is now properly destructed, most-derived class first:

    $ myprogram
    ~Derived()
    ~Base()
    $

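The destruction order can also be checked programmatically instead of by reading console output. The sketch below records destructor calls in a string; the class names Shape and Circle, the destructionLog variable, and the helper functions are all hypothetical.

```cpp
#include <cassert>
#include <string>

// Records destructor calls so the order can be inspected (illustrative only).
static std::string destructionLog;

class Shape {                       // hypothetical base class
public:
    virtual ~Shape() { destructionLog += "~Shape;"; }
};

class Circle : public Shape {       // hypothetical derived class
public:
    ~Circle() { destructionLog += "~Circle;"; }
};

// Deletes a Circle through a pointer to Shape and returns the call order.
std::string destroyThroughBasePointer()
{
    destructionLog.clear();
    Shape * sp = new Circle();
    delete sp;                      // virtual dispatch runs ~Circle() first
    return destructionLog;
}

// Small self-check showing the expected order.
void demoVirtualDestructor()
{
    assert(destroyThroughBasePointer() == "~Circle;~Shape;");
}
```

Because ~Shape() is virtual, ~Circle() is virtual as well without being declared so, which is why the derived destructor runs before the base one.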
5. Use smart pointers instead of raw pointers
In addition to wrapping pointers in classes as shown above, "smart" or "managed" pointers can be a great help in managing memory. The standard library offers std::auto_ptr, which is designed to prevent memory leaks in the face of thrown exceptions. As shown here, raw pointers are prone to leaks when exceptions are thrown before they are cleaned up:

    Foo * fp = new Foo();
    someFunctionThatMightThrow();
    delete fp;

If the function throws an exception before you delete fp, you've leaked memory. You can fix this with std::auto_ptr, whose destructor will clean up the pointer it owns:

    std::auto_ptr<Foo> fp(new Foo());

    // if it throws, ~auto_ptr() cleans up the Foo *
    someFunctionThatMightThrow();

auto_ptr has limitations, however. Its copying semantics transfer ownership of the raw pointer. Therefore, if you pass an auto_ptr *by value* to a function, the copy inside the function gains ownership of the raw pointer. When the function returns, that copy is destroyed along with the raw pointer -- almost certainly not what you want! If you want to avoid this kind of behavior, use a smart pointer that implements "reference counting", like Boost's shared_ptr:

    boost::shared_ptr<Foo> fp(new Foo());

You can now pass fp around and not worry about ownership issues. shared_ptr will manage the references to it and delete the raw pointer when the last reference goes out of scope. Smart pointers are often used in containers:
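A minimal sketch of both points from this section follows, written against the C++11 standard library rather than the classes named above. That choice is an assumption: std::shared_ptr is the standardized counterpart of boost::shared_ptr, and std::unique_ptr replaces std::auto_ptr, which was deprecated in C++11 and removed in C++17. Foo's instance counter, alwaysThrows, and the two helper functions are hypothetical.

```cpp
#include <cassert>
#include <memory>     // std::shared_ptr, std::unique_ptr
#include <stdexcept>  // std::runtime_error
#include <vector>

// Hypothetical class that counts live instances so cleanup can be observed.
struct Foo {
    static int liveCount;
    Foo()  { ++liveCount; }
    ~Foo() { --liveCount; }
};
int Foo::liveCount = 0;

// Stand-in for someFunctionThatMightThrow(); here it always throws.
void alwaysThrows() { throw std::runtime_error("failure"); }

// Returns the number of Foo objects still alive after the exception.
int liveAfterException()
{
    try {
        std::unique_ptr<Foo> fp(new Foo());
        alwaysThrows();   // stack unwinding destroys fp, which deletes the Foo
    } catch (const std::runtime_error &) {
        // Nothing to clean up by hand.
    }
    return Foo::liveCount;
}

// Returns the number of Foo objects still alive after the container is gone.
int liveAfterContainer()
{
    {
        std::vector<std::shared_ptr<Foo> > foos;
        std::shared_ptr<Foo> first(new Foo());
        foos.push_back(first);
        foos.push_back(first);            // both elements share one Foo
        assert(first.use_count() == 3);   // first plus two container copies
        assert(Foo::liveCount == 1);
    }   // the last reference is released here and the Foo is deleted
    return Foo::liveCount;
}
```

Both functions return 0 when no other Foo objects exist, which shows that neither the exception nor the destruction of the vector leaves a Foo behind.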


