# Container class questions

This topic is 4158 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

Hello. I started writing a dynamic array class, mostly as a learning experience (but also for possible future use). As it's how Flavian Brebion is doing it for Infinity, I'm using the C memory management functions. I thank him for his assistance. So far it looks like this:
template <typename _type> class DynamicArray
{
public:
DynamicArray<_type>(unsigned int reservedMemory = 0);
~DynamicArray();
void append(const _type& element);
_type operator[](int params){return *(arrayLocation+params);};

protected:
size_t objectSize;
_type* arrayLocation;
unsigned int maxSize;
unsigned int numElements;
};

template <typename _type> DynamicArray<_type>::DynamicArray(unsigned int reservedMemory) :
objectSize(sizeof(_type)),
arrayLocation((_type*) malloc(reservedMemory * objectSize)),
maxSize(reservedMemory),
numElements(0)
{}

template <typename _type> void DynamicArray< _type >::append(const _type& element)
{
if(numElements >= maxSize)
{
arrayLocation = (_type*) realloc((void*)arrayLocation, (numElements+1)*objectSize);
maxSize+=1;
}

new(arrayLocation + numElements) _type(element);
numElements++;
}

template <typename _type> DynamicArray<_type>::~DynamicArray()
{
_type* arrayLocationCopy = arrayLocation;

for(unsigned int x = 0; x < numElements; x++)
{
arrayLocationCopy->~_type();
arrayLocationCopy++;
};

free(arrayLocation);
}

It's incomplete. Anyway, at first I couldn't figure out how to destruct elements; integral types don't require destruction (I think?) while objects do. I tried this on a whim:
~_type();

and it surprisingly compiled without error. Is that syntax correct? Also, if you have any suggestions based on what I've shown, please share them. There's relatively little information out there on writing container classes.

##### Share on other sites
Yes, that's right - assuming you're using placement new, which you are.

For some reason, there have been a whole bunch of threads lately on people writing their own dynamic arrays. Scan down the thread list and you'll see them. At least one of them goes into this topic.

##### Share on other sites
Quote:
 arrayLocation = (_type*) realloc((void*)arrayLocation, (numElements+1)*objectSize);

Do remember that realloc basically finds a new buffer in memory, forcefully copies the bytes from the old buffer to the new one without any regard for what the memory actually represents, then brutally frees the old buffer without any form of contents destruction. So, if _type is not a POD type, your container will simply stop working for mysterious reasons at the worst possible moment.

Rule: non-POD objects cannot be copied byte-by-byte, they must be initialized in the new buffer as copies of the objects in the old buffer, and the originals must then be destructed. There is simply no way to achieve this using realloc: you will have to malloc a new buffer, do the copy, do the destruction, and free the old one.

##### Share on other sites
Quote:
 Original post by phil_tYes, that's right - assuming you're using placement new, which you are.For some reason, there have been a whole bunch of threads lately on people writing their own dynamic arrays. Scan down the thread list and you'll see them. At least one of them goes into this topic.

Ah, thanks for clearing that up, and informing me of the other topics. I'll have a look.
Quote:
 Original post by ToohrVykRule: non-POD objects cannot be copied byte-by-byte, they must be initialized in the new buffer as copies of the objects in the old buffer, and the originals must then be destructed. There is simply no way to achieve this using realloc: you will have to malloc a new buffer, do the copy, do the destruction, and free the old one.

Thanks. That's why I started this topic: something was bound to be wrong with my code.

However, that's not exactly what realloc() does, is it? If it can resize the memory block while retaining contiguity, it does (without affecting anything already in the block). So I guess I need to write a function that does either what you've described or realloc() as appropriate. Which necessitates a noob question. How do I check if a memory location is already allocated?

[Edited by - Baiame on July 25, 2007 3:07:02 PM]

##### Share on other sites
Quote:
 However, that's not exactly what realloc() does, is it? If it can resize the memory block while retaining contiguity, it does (without affecting anything already in the block). So I guess I need to write a function that does either what you've described or realloc() as appropriate. Which necessitates a noob question. How do I check if a memory location is already allocated?

Let's not go there.

Where and how memory is allocated is unknown - it's not specified or standardized - the compiler writers looked into it, and made it work on given platform. It has also received a lot of review (every application ever written) and works most of the time. But, what may work on XP Pro SP2 on AMD can break Vista, on Intel processor, or under gcc. There are zero guarantees, except what documentation for new states - it allocates a chunk of memory, or throws bad_alloc.

You have exactly two options for managing memory:
1) Pre-allocate buffer, then override new to provide pointers from there
2) Let new allocate it

That's it. malloc/realloc are bad. Unless you're overriding global new handler, you really really don't want to mess with them.

PS. 1) would be the option that allows you to check if area of memory is already allocated, since you'd be allocating chunks by yourself.

Your allocation code must also respect data alignment, so you'll need to possible over-allocate by 16 bytes to be able to align the memory on a boundary (and even that is considered a hack, since it can theoretically fail)

##### Share on other sites
Quote:
 Original post by BaiameHowever, that's not exactly what realloc() does, is it? If it can resize the memory block while retaining contiguity, it does (without affecting anything already in the block). So I guess I need to write a function that does either what you've described or realloc() as appropriate. Which necessitates a noob question. How do I check if a memory location is already allocated?

You can't know whether realloc can resize the buffer without moving its contents or not before you actually call the function. So, if your container contains non-POD objects, you have to move the contents of the buffer to a temporary buffer before calling realloc (just in case it moves the buffer), which kind of defeats the purpose. C++ provides no standard equivalent of realloc.

In short: have your generic class allocate-copy-destruct-deallocate, and then specialize your class for known POD types (int et al) to use realloc.

##### Share on other sites
Quote:
 Original post by AntheusWhere and how memory is allocated is unknown...

I thought that allocation checks would be more standardized. Can you please point to a couple ways of doing it? Overriding the new operator seems a bit excessive at this stage.
Quote:
 Original post by AntheusYour allocation code must also respect data alignment, so you'll need to possible over-allocate by 16 bytes to be able to align the memory on a boundary (and even that is considered a hack, since it can theoretically fail)

Quote:
 Original post by ToohrVykYou can't know whether realloc can resize the buffer without moving its contents or not before you actually call the function.

I know. But realloc() obviously does some kind of checks for memory occupation- that's what I want to do. Of course the program would end up basically doing the same thing twice, but it'd still be far faster than copying each element to a new buffer every time the array resizes. Flavian pointed to faster resizing as one of the advantages of his dynamic array over the STL vector (he definitely achieved it using realloc() in some way, but I'm not sure whether he overrided new).

##### Share on other sites
Quote:
 I know. But realloc() obviously does some kind of checks for memory occupation- that's what I want to do.

All the checks are internal to the library. realloc from a given standard C library implementation knows whether memory is occupied or not because that memory was allocated and handled by that very same library implementation. If you were to use realloc from a different implementation, it just wouldn't work, because it relies on implementation-specific helper functions. Since your program is not a part of the standard library, it can only use the public parts of it.

Even worse, the C++ memory models involves contiguous buffers. It has no concept of "memory being available past the end of a buffer". So, not only do the libraries not provide this checking functionality, but it is explicitly described as non-portable by the standard.

Quote:
 Of course the program would end up basically doing the same thing twice, but it'd still be far faster than copying each element to a new buffer every time the array resizes. Flavian pointed to faster resizing as one of the advantages of his dynamic array over the STL vector (he definitely achieved it using realloc() in some way, but I'm not sure whether he overrided new).

Sadly, you have no choice here: if you wish to use a realloc-like feature with non-POD objects (POD objects, again, are a breeze), you will have to keep track of memory yourself, by allocating a very large buffer from the OS or library, and then using this buffer to allocate sub-buffers to the rest of your program.

##### Share on other sites
By over-allocating and using placement new, you are already implementing what realloc tries to do, the only difference is you reserve the memory forever so you are guaranteed that after a reallocation you will have to add lots more elements before another reallocation occurs.

##### Share on other sites
Quote:
 I know. But realloc() obviously does some kind of checks for memory occupation- that's what I want to do. Of course the program would end up basically doing the same thing twice, but it'd still be far faster than copying each element to a new buffer every time the array resizes. Flavian pointed to faster resizing as one of the advantages of his dynamic array over the STL vector (he definitely achieved it using realloc() in some way, but I'm not sure whether he overrided new).

(POD types only):
Quote:
 Now it sounds obvious, but when I allocated memory in my CArray class, I used malloc/realloc/free, and there's no object constructor/destruction with those. I didn't have much trouble so far because as soon as my classes start to become complex, I prefer to store pointers to them. But every once in a while, I store an object itself, and that's where leaks and bugs appear.

(use standard library):
Quote:
 Second lesson learnt: the std::vector class is not slow. It is bug-free ( unlike mine ), and performs pretty much as fast.

(Copying is where you lose time, not re-allocating - also, constructors/destructors must be called)
Quote:
 The first one being: using malloc/realloc avoids the copy/fill mechanism used in std::vector, meaning no construction/destructions needed when the array grows. A first good point for me. Of course, this time I had to take care of correctly constructing/destructing objects, meaning playing with "placement new" and other goodies.. but no big deal.

Final lesson:
Quote:
 After this, I benchmarked my classes: the old ( buggy ) one, std::vector, and my new CArray class. Verdict: std::vector and my old class are on par as far as performance goes, but my new CArray class, in the context of appending millions of elements, is in average 2 to 3.5 times faster ! So I haven't wasted my time, finally.

Quote:
 Another good point: because I knew what is the most frequent usage I'm doing in my engine of my CArray class ( namely, appending data ), I could optimize the case and even create a special interface to append many elements to the array. I called this the "beginAppend/doAppend/endAppend" mechanism.

USE STANDARD LIBRARY, unless you're solving a particular, non-standard behaviour.

malloc/realloc are secondary in this story - it's different allocation strategy for batch additions that makes things fast - it also uses special interface - std::vector's performance is irrelevant, and couldn't be improved, only understanding of the problem solved helped.

You could achieve that with an accessor to std::vector, that would batch additions, and submit them to std::vector as a batch, after calling resize (or reserve, I forget).

1. 1
2. 2
Rutin
21
3. 3
4. 4
A4L
15
5. 5
khawk
14

• 13
• 26
• 10
• 11
• 9
• ### Forum Statistics

• Total Topics
633737
• Total Posts
3013612
×