Sign in to follow this  

[C++] How do I allocate a multidimensional array with new?

This topic is 2660 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Let's say I want to allocate:


hash_set<Object *> Collisionmatrix [Worldwidth][Worldheight] ;






If I do something like
hash_set<Object *> * Collisionmatrix = new hash_set<Object *> [Worldwidth * Worldheight] ;

then I cannot use
Collisionmatrix [i][j]
Because the compiler does not know the size of each line.

Share this post


Link to post
Share on other sites
If the width and height of the world are not constant, have you considered using boost::multi_array (or similar) to hold the hash sets?

Quote:

then I cannot use
Collisionmatrix [i][j]
Because the compiler does not know the size of each line.

You can still index it in a random manner, using (i * width + j) to compute the linear co-ordinates from the multi-dimensional ones. This is what boost::mutli_array does internally.

Share this post


Link to post
Share on other sites
Quote:
Original post by Dizzy_exe
hash_set<Object *> (*Collisionmatrix)[Worldheight] = (hash_set<Object *> (*)[Worldheight])new hash_set<Object *> [Worldwidth * Worldheight] ;


Thanks. I think that will do the trick.

Share this post


Link to post
Share on other sites
It won't work if Worldheight is not a compile-time constant (some compilers might have an extension that allow this, but it won't be portable). If the width and height is constant, you should just use a fixed array like in your original post.

Whatever you choose to do, you should certainly be using an RAII wrapper to alleviate the headache of worrying about memory management.

Share this post


Link to post
Share on other sites
Quote:
Original post by rip-off
It won't work if Worldheight is not a compile-time constant (some compilers might have an extension that allow this, but it won't be portable). If the width and height is constant, you should just use a fixed array like in your original post.

Whatever you choose to do, you should certainly be using an RAII wrapper to alleviate the headache of worrying about memory management.


You are right, Worldheight is variable.
I have created my own array class (I don't have boost installed):

template <class T>
class Dynamic2darray
{
const int Numofcolumns ;
T * Tp ;

public:

Dynamic2darray ( int Localnumofrows, int Localnumofcolumns ) :
Numofcolumns ( Localnumofcolumns ),
Tp ( ( Localnumofrows > 0 && Numofcolumns > 0) ?
new T [Localnumofrows * Numofcolumns] : NULL )
{
}

~Dynamic2darray()
{
delete [] Tp ;
}

inline T * operator [] ( int i )
{
return Tp + ( Numofcolumns * i ) ;
}

} ;


Share this post


Link to post
Share on other sites
Quote:
Original post by Dizzy_exe
hash_set<Object *> (*Collisionmatrix)[Worldheight] = (hash_set<Object *> (*)[Worldheight])new hash_set<Object *> [Worldwidth * Worldheight] ;


Dear god...

Share this post


Link to post
Share on other sites
Quote:
Original post by hiigara
Quote:
Original post by Dizzy_exe
hash_set<Object *> (*Collisionmatrix)[Worldheight] = (hash_set<Object *> (*)[Worldheight])new hash_set<Object *> [Worldwidth * Worldheight] ;


Thanks. I think that will do the trick.
The aforementioned compilation issues aside, please don't do it that way :| Your 'Dynamic2darray' class is a much better solution.
Quote:
(I don't have boost installed):
Well, you could install it - it's allowed :)

I know Boost is a big download, but multi_array is header-only, so once you've got it downloaded, all you have to do is include the multi_array header and you're good to go.
Quote:
Original post by Zipster
You could also do:

vector<vector<hash_set<Object*> > > Collisionmatrix;
IMO, that seems like a step backward, even from the OP's ad hoc multidimensional array class. (I don't see any particular reason to use a jagged array for this if it can be avoided.)

Share this post


Link to post
Share on other sites
Quote:
Original post by hiigara
I have created my own array class (I don't have boost installed):
*** Source Snippet Removed ***

Careful, your class isn't Rule of Three compliant.

Share this post


Link to post
Share on other sites
Quote:
Original post by jyk
IMO, that seems like a step backward, even from the OP's ad hoc multidimensional array class. (I don't see any particular reason to use a jagged array for this if it can be avoided.)

I couldn't disagree more. Why write a custom container class if it can be avoided, especially when the only downside is the possibility a multidimensional array could be jagged? The solution is... don't create a jagged array. It's easy enough to avoid, and doesn't lead you down the path of reinventing the wheel and potentially dealing with extra bugs as a consequence. Already SiCrane pointed out a common mistake made by the OP when writing containers.

The best solution would probably still be to use a 1D vector and simply calculate the linear coordinates from multi-dimensional coordinates (as per rip-off). This gives you the best of both worlds -- the benefit of a one contiguous block of memory without the pitfalls of writing containers.

Share this post


Link to post
Share on other sites
I'll both agree and disagree, to an extent.

Yes, rewriting containers that already exist is generally unwise, and the OP would be best served using an existing library class such as boost's in this case. Yes, it is thoroughly all too easy to make mistakes and screw up containers, as has been demonstrated already.

However, containers are not mystical magical classes which are somehow more dangerous or difficult than any other non-POD class. If one doesn't know to follow the rule of three with a container, who's to say one will do it correctly with anything else? If one doesn't understand the benefits and power of RAII for containers, who's to say one will use it correctly in any other context?

In fact, I would argue that (comparatively) simple data containers are a great way to learn about both concepts. You get to see first-hand the benefits of writing accessible and intuitive interfaces for nontrivial logic, you get to experience the benefits of RAII, you get to learn why the rule of three is a rule, and you get to spend a bit of time reasoning about the kinds of issues that will affect every single nontrivial, non-POD class you ever write in the C++ language. Maybe there's a better place to learn all that stuff in one shot, but I certainly can't think of it offhand.


Lastly: nested vectors are a performance and memory fragmentation nightmare. Have N rows of cells and need to erase the entire row N/2? Enjoy that nasty deep copy.

Share this post


Link to post
Share on other sites
Quote:
Original post by Zipster
I couldn't disagree more.
Hehe...didn't realize I'd elicit that strong a reaction :)

But, I'll have to disagree. At the very, very least, if one doesn't want to use multi_array, one can write a class somewhat similar to the OP's, but using std::vector for storage. You're only talking about (literally) a few lines of code, and what you get is greater code clarity and (possibly) better cache coherency.
Quote:
The best solution would probably still be to use a 1D vector and simply calculate the linear coordinates from multi-dimensional coordinates (as per rip-off). This gives you the best of both worlds -- the benefit of a one contiguous block of memory without the pitfalls of writing containers.
From a usability standpoint, I don't think you want to be writing out that (admittedly trivial) computation by hand every time you want to index the array. The wrapper we're talking about here doesn't have to do anything more than provide an accessor of some sort (e.g. operator()()) that takes care of the indexing details for you. Sure, whenever you write code there's a chance you'll mess something up, but as far as container wrappers go, it doesn't get much simpler than the case under discussion here.

Anyway, the easiest solution would still be to use multi_array, which is what I'd recommend to the OP.

Share this post


Link to post
Share on other sites
It is probably simpler to make one big array and calculate the position in a function, but this can also be done using an array of array, so if you don't want to use boost and stick with plain old C++, this should do the trick:

#define X            100
#define Y 100

char ** pMultiArray;
int i;

pMultiArray = new char * [ X ];
for( i = 0; i < X; i++ )
pMultiArray[ i ] = new char[ Y ];

//
// Do your stuff...
//

for( i = 0; i < X; i++ )
delete [] pMultiArray[ i ];
delete [] pMultiArray;


You can access variables like this:
pMultiArray[ i ][ j ] = 1234;

Share this post


Link to post
Share on other sites
Just because you are writing a container, doesn't mean you cannot re-use the standard library. I'd use std::vector<> as the underlying storage for the 2D class. Best of both worlds, doesn't depend on Boost but still avoids most of the pitfalls of trying to manage the memory yourself.

template<class T>
class vector2d
{
public:
vector2d(int width, int height)
:
width_(width),
storage(width * height)
{
}

T &operator()(int x, int y) {
return storage[index(x,y)];
}

const T &operator()(int x, int y) const {
return storage[index(x,y)];
}

int width() const { return width_; }

int height() const {
return storage.size() / width_;
}

private:
typename std::vector<T>::size_type index(int x, int y) {
// range check here if you want
// the vector usually does this for us in debug anyway
return y * width_ + x;
}

int width_;
std::vector<T> storage;
};


Something like that *. Look ma, I obey the rule of three without even trying!

* Warning: code not tested or even compiled!

Share this post


Link to post
Share on other sites
Quote:
Original post by jyk
Hehe...didn't realize I'd elicit that strong a reaction :)

BECAUSE I HATE YOU SO MUCH! But really, in retrospect it did come across stronger than I intended because...

Quote:
But, I'll have to disagree. At the very, very least, if one doesn't want to use multi_array, one can write a class somewhat similar to the OP's, but using std::vector for storage. You're only talking about (literally) a few lines of code, and what you get is greater code clarity and (possibly) better cache coherency.

...after I posted I realized that just putting a vector in there would solve all your problems. Perhaps I got a little caught up in the fact that I personally wouldn't create a wrapper, and then ended up defending my position versus the one that made the most sense for the OP :)

Share this post


Link to post
Share on other sites
Quote:
Original post by Bearhugger
It is probably simpler to make one big array and calculate the position in a function, but this can also be done using an array of array, so if you don't want to use boost and stick with plain old C++, this should do the trick:

*** Source Snippet Removed ***

You can access variables like this:
pMultiArray[ i ][ j ] = 1234;
There's really no reason to put yourself through that in C++ (under normal circumstances at least). At the very least, use a vector or vector's; the end result will be basically equivalent, and the resulting code will be much cleaner (and most likely much safer as well).

Really, allocating multi-dimensional arrays manually like that in C++ is just onerous; it's really not something you want to be doing if you can possibly avoid it (and usually you can).

Share this post


Link to post
Share on other sites
Quote:
Original post by Bearhugger
It is probably simpler to make one big array and calculate the position in a function, but this can also be done using an array of array, so if you don't want to use boost and stick with plain old C++, this should do the trick:

*** Source Snippet Removed ***

You can access variables like this:
pMultiArray[ i ][ j ] = 1234;


Nitpick: Though that's not a straight multi-dimensional-array or array[n]-of-array[m] anymore but rather a jagged array, which would also allow for non-rectangular "formations", e.g. triangles.

Share this post


Link to post
Share on other sites
Quote:
Original post by rip-off
Something like that *. Look ma, I obey the rule of three without even trying


Actually, you need to either reorder the member variables or add a custom assignment operator as copy-assigning the vector can throw after the width_ has been copy-assigned, leaving the vector2d in an inconsistent state.

</ducks>

EDIT: fully elucidated

Share this post


Link to post
Share on other sites
Quote:
There's really no reason to put yourself through that in C++ (under normal circumstances at least). At the very least, use a vector or vector's; the end result will be basically equivalent, and the resulting code will be much cleaner (and most likely much safer as well).

Really, allocating multi-dimensional arrays manually like that in C++ is just onerous; it's really not something you want to be doing if you can possibly avoid it (and usually you can).


Well, like I said, using one unidimensional array is the most effective way of dealing with it in my opinion, but I'm not gonna lose sleep over an array of pointers. I can trust myself to delete the pointers because I always write the delete right after I'm done writing the block of code, so I never miss a delete..

Also, using vectors for fixed-size arrays is overkill in my opinion, not to mention it's extra overhead you might not need, especially for huge vectors, because the multiple resize calls you'll be doing to fill each vector with elements will pass through all the new items and initialize them to 0. (Or whatever you tell it to initialize to.) The performance loss is usually not worth worrying for, but in rare cases, it does, like when you use vectors as variable-size buffers for loading huge chunks of data. I had *significant* slowdowns to my game when I tried using that method to load the assets from my archive file into my game.

Share this post


Link to post
Share on other sites
Quote:
Original post by Bearhugger
I can trust myself to delete the pointers because I always write the delete right after I'm done writing the block of code, so I never miss a delete..


That's wrong.

Q: In your example, what do you think will happen if the marked line ("THIS ONE") ...
[...]
pMultiArray = new char * [ X ];
for( i = 0; i < X; i++ )
pMultiArray[ i ] = new char[ Y ]; // <-- THIS ONE

//
// Do your stuff...
//

for( i = 0; i < X; i++ )
delete [] pMultiArray[ i ];
delete [] pMultiArray;




... will throw?


Quote:
Also, using vectors for fixed-size arrays is overkill in my opinion,


0) Misconception of "bottleneck"
1) Misconception of modern compiler technology:

#include <vector>
#include <cstdlib>
#include <iostream>

int main() {
const int s = 128;

std::vector<float> foo(s), bar(s);

// init
for (int i=0; i<s; ++i) {
foo[i] = rand();
bar[i] = rand();
}

// add
std::vector<float> frob(s);
for (int i=0; i<s; ++i)
frob[i] = foo[i] + bar[i]; // dump below

for (int i=0; i<s; ++i)
std::cout << frob[i];
}


g++ -S -mfpmath=sse -march=native -O3 foo.cc


cat foo.s
<snip>
.L31:
movups (%r12,%rax), %xmm0
movups (%r13,%rax), %xmm1
addps %xmm1, %xmm0
movaps %xmm0, (%rbp,%rax)
addq $16, %rax
cmpq $512, %rax
jne .L31
<snip>


--> Utilizes SSE code and properly unrolled the loop. Where is your overhead now?

[Edited by - phresnel on August 30, 2010 1:41:54 PM]

Share this post


Link to post
Share on other sites
Quote:
I can trust myself to delete the pointers because I always write the delete right after I'm done writing the block of code, so I never miss a delete..
Don't forget about exception safety.

In any case, why saddle yourself with even that minor task if you don't have to? If you don't like std::vector for some reason, you can write a simple RAII wrapper of your own that handles deallocation for you.
Quote:
Also, using vectors for fixed-size arrays is overkill in my opinion
Who said anything about fixed-size arrays? The OP stated earlier that:
Quote:
...Worldheight is variable.
So I think it's safe to assume we're dealing with variable-size arrays here.
Quote:
not to mention it's extra overhead you might not need, especially for huge vectors, because the multiple resize calls you'll be doing to fill each vector with elements will pass through all the new items and initialize them to 0. (Or whatever you tell it to initialize to.) The performance loss is usually not worth worrying for, but in rare cases, it does, like when you use vectors as variable-size buffers for loading huge chunks of data. I had *significant* slowdowns to my game when I tried using that method to load the assets from my archive file into my game.
I'm not sure what method you were using, but not many people would recommend loading large amounts of data into a vector using push_back(), etc., at least not without using reserve(). Typically, by using reserve() or resize(), you can avoid repeated allocations and copies.

Also, with Visual Studio at least, an easy trap to fall into is measuring performance with iterator debugging turned on. In VS 2008 at least, iterator debugging is on even in the default release configuration, which I imagine has led quite a few people astray with respect to the standard library containers and algorithms. (Not saying that's the case here, but it does seem to happen fairly often.)

Anyway, I'm not saying that vector is the right solution in every case, but it does seem that often when people avoid it (and other parts of the standard library), it's for reasons that aren't particularly sound.

Share this post


Link to post
Share on other sites
Please don't debug my code, that was a quick snippet to illustrate the idea of the multiple pointer method. In a real program I would initialize pointers to NULL, check for new returning NULL, and yada-yada. Of course, you can set new to throw bad_alloc on allocation failure if you prefer that behavior, and, of course, in that case, you have to catch it and do cleanup and stuff. Personally though, for reasons that are beyond the scope of this topic, I absolutely hate the exception programming system so I prefer doing it old-school style and check for return values instead.

Also, multiple push_back *is* slow when you have large amount of bytes to store. My RPG loads its assets and data from a custom archive format I designed, and, because the buffers need to have variable sizes, I decided to use a std::vector and call push_back to store the decompressed data bytes into the vector buffer. It was quite slow. Because I called reserve, I expected it to be super fast, and to be honest I have no idea why it was slow, but for whatever reason, it didn't work well and there was considerable slowdowns. I guess I could have used resize and write directly in the buffer using the [] operator (it would probably have fixed the problem) but I went straight to using new and delete because I knew this would work.

The overhead caused by vectors is that when you resize a vector (declaring a vector by passing it an initial size is the same thing), the STL does not simply allocate sufficient memory: in the case of a growing vector, it passes through all new items to initialize them. This is unlike the new operator, which just allocates the data while leaving the garbage inside. (Unless you are in debug mode.) In a nut shell: vector = allocation + initialization, New = allocation only. So, in the example posted above, the overhead is not at the allocation where you displayed the assembler, it's at the declaration.

Although it will not affect the TC if it's done correctly (more on that farther), another thing I recently discovered concerning vectors is that they can potentially waste memory. As you probably know, vectors are optimized to be resizable arrays whose length changes dynamically, and in order to prevent too much reallocation, the object is programmed to increase its size by 50% everytime it needs to grow. And that's great because it's a nice compromise between the speed of an array and the power of a linked list when you need an array of unknown size. However, as far as I know, the buffer never shrinks. What this means is that, if you have some huge chunk of data that you store to load data, let's say a 1MB buffer, then the vector will always have 1MB allocated even if your next loads are 300 bytes. Only the destructor seems to free that memory.

Don't believe me? See for yourself. The second cout is to illustrate the initialization process.
#include <vector>
#include <iostream>

using namespace std;

int main()
{
int i;
vector< int > v;

v.resize( 1000000 );
v.resize( 50 );
cout << "Capacity: " << v.capacity() << endl;
cout << "Index 23: " << v[ 23 ] << endl;

cin >> i;
return 0;
}


STL vectors seem to be internally wired to be resizable. In the case of the TC question, if I understand correctly, the multi-dimensional array is sized once, when the map is loaded, then it remains constant until another map is loaded. So that is not something I'd use a vector for: I just need an array, and the only time it is resized is while loading while the screen is black. Still, if the TC wants to use the vector method, it is imperative when repopulating the vectors to call clear on the vector of vector instead of resize or it would lead in waste of memory after unloading a really large map.

Of course, I know that memory is cheap for PCs and that the performance losses are unnoticable 99% of the time. But I believe it's a bad habit to use vectors everywhere. Not saying I hate vectors, I use them a lot and I suggest people to use them as well. It's just that I suggest reserving those for arrays that are constantly changing in size.

In any case, this is getting off-topic. I just wanted to present a template without boost or STL.

Share this post


Link to post
Share on other sites
Quote:
Original post by Bearhugger
[...] I would initialize pointers to NULL, check for new returning NULL, and yada-yada. Of course, you can set new to throw bad_alloc on allocation failure if you prefer that behavior, and, of course, in that case, you have to catch it and do cleanup and stuff. Personally though, for reasons that are beyond the scope of this topic, I absolutely hate the exception programming system so I prefer doing it old-school style and check for return values instead.

Than C++ is not for you, because by the standard, new throws an exception upon allocation failure (<tt<new returning NULL is behaviour from before any standard existed (read MSVC 1998). In which case you will have quite a number of memory leaks in your posted snippet.

Quote:
Also, multiple push_back *is* slow

That is the reason why I did <b<not[/b] push_back() in my code. That is also the reason why push_back() is not the only granted way to resize arrays. Please, look back at my code.


Quote:
This is unlike the new operator, which just allocates the data while leaving the
garbage inside.

Please, oh please, take some more lessons on C++. new[] will call the default constructor of passed type, new will let you call any constructor. If you want garbage, use placement-new. C++0x will change that a bit, then you can call any constructor.

Quote:
New = allocation only

Nope.

Quote:
So, in the example posted above, the overhead is not at the allocation where you displayed the assembler, it's at the declaration.

No, I did only display assembly from the addition-code, not from allocation.


Quote:
As you probably know, vectors are optimized to be resizable arrays whose length changes dynamically, and in order to prevent too much reallocation, the object is programmed to increase its size by 50% everytime it needs to grow

But where is there overhead because of that?

Quote:
And that's great because it's a nice compromise between the speed of an array and the power of a linked list when you need an array of unknown size. However, as far as I know, the buffer never shrinks. What this means is that, if you have some huge chunk of data that you store to load data, let's say a 1MB buffer, then the vector will always have 1MB allocated even if your next loads are 300 bytes. Only the destructor seems to free that memory.

You can swap an empty vector with a used one. A common C++ trick.


edit: oops, striked through part was from an earlier revision of this reply

[Edited by - phresnel on August 31, 2010 1:51:05 PM]

Share this post


Link to post
Share on other sites

This topic is 2660 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this