Jump to content
  • Advertisement
Sign in to follow this  
Baiame

Container class questions

This topic is 4158 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hello. I started writing a dynamic array class, mostly as a learning experience (but also for possible future use). As it's how Flavian Brebion is doing it for Infinity, I'm using the C memory management functions. I thank him for his assistance. So far it looks like this:
template <typename _type> class DynamicArray
{
public:
	DynamicArray<_type>(unsigned int reservedMemory = 0);
	~DynamicArray();
	void append(const _type& element);
	_type operator[](int params){return *(arrayLocation+params);};

protected:
	size_t objectSize;
	_type* arrayLocation;
	unsigned int maxSize;
	unsigned int numElements;
};
template <typename _type> DynamicArray<_type>::DynamicArray(unsigned int reservedMemory) : 
	objectSize(sizeof(_type)),
	arrayLocation((_type*) malloc(reservedMemory * objectSize)),
	maxSize(reservedMemory),
	numElements(0)
{}

template <typename _type> void DynamicArray< _type >::append(const _type& element)
{
	if(numElements >= maxSize)
	{
		arrayLocation = (_type*) realloc((void*)arrayLocation, (numElements+1)*objectSize); 
		maxSize+=1;
	}

	new(arrayLocation + numElements) _type(element);
	numElements++;
}

template <typename _type> DynamicArray<_type>::~DynamicArray()
{
	_type* arrayLocationCopy = arrayLocation;

	for(unsigned int x = 0; x < numElements; x++)
	{
		arrayLocationCopy->~_type();
		arrayLocationCopy++;
	};

	free(arrayLocation);
}
It's incomplete. Anyway, at first I couldn't figure out how to destruct elements; integral types don't require destruction (I think?) while objects do. I tried this on a whim:
~_type();
and it surprisingly compiled without error. Is that syntax correct? Also, if you have any suggestions based on what I've shown, please share them. There's relatively little information out there on writing container classes.

Share this post


Link to post
Share on other sites
Advertisement
Yes, that's right - assuming you're using placement new, which you are.

For some reason, there have been a whole bunch of threads lately on people writing their own dynamic arrays. Scan down the thread list and you'll see them. At least one of them goes into this topic.

Share this post


Link to post
Share on other sites
Quote:
arrayLocation = (_type*) realloc((void*)arrayLocation, (numElements+1)*objectSize);


Do remember that realloc basically finds a new buffer in memory, forcefully copies the bytes from the old buffer to the new one without any regard for what the memory actually represents, then brutally frees the old buffer without any form of contents destruction. So, if _type is not a POD type, your container will simply stop working for mysterious reasons at the worst possible moment.

Rule: non-POD objects cannot be copied byte-by-byte, they must be initialized in the new buffer as copies of the objects in the old buffer, and the originals must then be destructed. There is simply no way to achieve this using realloc: you will have to malloc a new buffer, do the copy, do the destruction, and free the old one.

Share this post


Link to post
Share on other sites
Quote:
Original post by phil_t
Yes, that's right - assuming you're using placement new, which you are.

For some reason, there have been a whole bunch of threads lately on people writing their own dynamic arrays. Scan down the thread list and you'll see them. At least one of them goes into this topic.

Ah, thanks for clearing that up, and informing me of the other topics. I'll have a look.
Quote:
Original post by ToohrVyk
Rule: non-POD objects cannot be copied byte-by-byte, they must be initialized in the new buffer as copies of the objects in the old buffer, and the originals must then be destructed. There is simply no way to achieve this using realloc: you will have to malloc a new buffer, do the copy, do the destruction, and free the old one.

Thanks. That's why I started this topic: something was bound to be wrong with my code.

However, that's not exactly what realloc() does, is it? If it can resize the memory block while retaining contiguity, it does (without affecting anything already in the block). So I guess I need to write a function that does either what you've described or realloc() as appropriate. Which necessitates a noob question. How do I check if a memory location is already allocated?

[Edited by - Baiame on July 25, 2007 3:07:02 PM]

Share this post


Link to post
Share on other sites
Quote:
However, that's not exactly what realloc() does, is it? If it can resize the memory block while retaining contiguity, it does (without affecting anything already in the block). So I guess I need to write a function that does either what you've described or realloc() as appropriate. Which necessitates a noob question. How do I check if a memory location is already allocated?


Let's not go there.

Where and how memory is allocated is unknown - it's not specified or standardized - the compiler writers looked into it, and made it work on given platform. It has also received a lot of review (every application ever written) and works most of the time. But, what may work on XP Pro SP2 on AMD can break Vista, on Intel processor, or under gcc. There are zero guarantees, except what documentation for new states - it allocates a chunk of memory, or throws bad_alloc.

You have exactly two options for managing memory:
1) Pre-allocate buffer, then override new to provide pointers from there
2) Let new allocate it

That's it. malloc/realloc are bad. Unless you're overriding global new handler, you really really don't want to mess with them.

PS. 1) would be the option that allows you to check if area of memory is already allocated, since you'd be allocating chunks by yourself.

Your allocation code must also respect data alignment, so you'll need to possible over-allocate by 16 bytes to be able to align the memory on a boundary (and even that is considered a hack, since it can theoretically fail)

Share this post


Link to post
Share on other sites
Quote:
Original post by Baiame
However, that's not exactly what realloc() does, is it? If it can resize the memory block while retaining contiguity, it does (without affecting anything already in the block). So I guess I need to write a function that does either what you've described or realloc() as appropriate. Which necessitates a noob question. How do I check if a memory location is already allocated?


You can't know whether realloc can resize the buffer without moving its contents or not before you actually call the function. So, if your container contains non-POD objects, you have to move the contents of the buffer to a temporary buffer before calling realloc (just in case it moves the buffer), which kind of defeats the purpose. C++ provides no standard equivalent of realloc.

In short: have your generic class allocate-copy-destruct-deallocate, and then specialize your class for known POD types (int et al) to use realloc.

Share this post


Link to post
Share on other sites
Quote:
Original post by Antheus
Where and how memory is allocated is unknown...

I thought that allocation checks would be more standardized. Can you please point to a couple ways of doing it? Overriding the new operator seems a bit excessive at this stage.
Quote:
Original post by Antheus
Your allocation code must also respect data alignment, so you'll need to possible over-allocate by 16 bytes to be able to align the memory on a boundary (and even that is considered a hack, since it can theoretically fail)

I had read about memory alignment, and I'll take it into account.
Quote:
Original post by ToohrVyk
You can't know whether realloc can resize the buffer without moving its contents or not before you actually call the function.

I know. But realloc() obviously does some kind of checks for memory occupation- that's what I want to do. Of course the program would end up basically doing the same thing twice, but it'd still be far faster than copying each element to a new buffer every time the array resizes. Flavian pointed to faster resizing as one of the advantages of his dynamic array over the STL vector (he definitely achieved it using realloc() in some way, but I'm not sure whether he overrided new).

Share this post


Link to post
Share on other sites
Quote:
I know. But realloc() obviously does some kind of checks for memory occupation- that's what I want to do.


All the checks are internal to the library. realloc from a given standard C library implementation knows whether memory is occupied or not because that memory was allocated and handled by that very same library implementation. If you were to use realloc from a different implementation, it just wouldn't work, because it relies on implementation-specific helper functions. Since your program is not a part of the standard library, it can only use the public parts of it.

Even worse, the C++ memory models involves contiguous buffers. It has no concept of "memory being available past the end of a buffer". So, not only do the libraries not provide this checking functionality, but it is explicitly described as non-portable by the standard.

Quote:
Of course the program would end up basically doing the same thing twice, but it'd still be far faster than copying each element to a new buffer every time the array resizes. Flavian pointed to faster resizing as one of the advantages of his dynamic array over the STL vector (he definitely achieved it using realloc() in some way, but I'm not sure whether he overrided new).


Sadly, you have no choice here: if you wish to use a realloc-like feature with non-POD objects (POD objects, again, are a breeze), you will have to keep track of memory yourself, by allocating a very large buffer from the OS or library, and then using this buffer to allocate sub-buffers to the rest of your program.

Share this post


Link to post
Share on other sites
By over-allocating and using placement new, you are already implementing what realloc tries to do, the only difference is you reserve the memory forever so you are guaranteed that after a reallocation you will have to add lots more elements before another reallocation occurs.

Share this post


Link to post
Share on other sites
Quote:
I know. But realloc() obviously does some kind of checks for memory occupation- that's what I want to do. Of course the program would end up basically doing the same thing twice, but it'd still be far faster than copying each element to a new buffer every time the array resizes. Flavian pointed to faster resizing as one of the advantages of his dynamic array over the STL vector (he definitely achieved it using realloc() in some way, but I'm not sure whether he overrided new).


Always read the whole story:

Quoted from googled reference:

(POD types only):
Quote:
Now it sounds obvious, but when I allocated memory in my CArray class, I used malloc/realloc/free, and there's no object constructor/destruction with those. I didn't have much trouble so far because as soon as my classes start to become complex, I prefer to store pointers to them. But every once in a while, I store an object itself, and that's where leaks and bugs appear.


(use standard library):
Quote:
Second lesson learnt: the std::vector class is not slow. It is bug-free ( unlike mine ), and performs pretty much as fast.


(Copying is where you lose time, not re-allocating - also, constructors/destructors must be called)
Quote:
The first one being: using malloc/realloc avoids the copy/fill mechanism used in std::vector, meaning no construction/destructions needed when the array grows. A first good point for me. Of course, this time I had to take care of correctly constructing/destructing objects, meaning playing with "placement new" and other goodies.. but no big deal.


Final lesson:
Quote:
After this, I benchmarked my classes: the old ( buggy ) one, std::vector, and my new CArray class. Verdict: std::vector and my old class are on par as far as performance goes, but my new CArray class, in the context of appending millions of elements, is in average 2 to 3.5 times faster ! So I haven't wasted my time, finally.

Quote:
Another good point: because I knew what is the most frequent usage I'm doing in my engine of my CArray class ( namely, appending data ), I could optimize the case and even create a special interface to append many elements to the array. I called this the "beginAppend/doAppend/endAppend" mechanism.


USE STANDARD LIBRARY, unless you're solving a particular, non-standard behaviour.


malloc/realloc are secondary in this story - it's different allocation strategy for batch additions that makes things fast - it also uses special interface - std::vector's performance is irrelevant, and couldn't be improved, only understanding of the problem solved helped.

You could achieve that with an accessor to std::vector, that would batch additions, and submit them to std::vector as a batch, after calling resize (or reserve, I forget).

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

GameDev.net is your game development community. Create an account for your GameDev Portfolio and participate in the largest developer community in the games industry.

Sign me up!