Multidimensional Arrays in C/C++

Started by
13 comments, last by Russell 20 years, 1 month ago
I have heard that doing a 2D array lookup is slower than doing a 1D array lookup, and 3D array lookup is slower than 2D, and so on. It seems to me that a 2D array lookup should be the same as two 1D array lookups, and a 3D array lookup should be the same as three 1D array lookups, but people seem to think 3D array lookups are REALLY slow. So what is the truth here? Here is why I think a 2D array lookup should be the same as two 1D array lookups. Let''s say I have two arrays: int b1[10], b2[10];. I make another array: int * a[2]; a[0] = b1; a[1] = b2; Now I should be able to do a 2D array lookup like: int i = a[1][5]; However, shouldn''t this be the same as doing two 1D array lookups? I''m just doing two 1D array lookups.
Advertisement
No. These two things behave differently :

int a[10][20];
int * a[10]; //then fill a with int[20]'s

the first object is actually an array of the form a[200], and when you access a[j][k] it actually accesses a[j+10*k]. The "10*k" part is the slow part about it : to access an element you do 1 mul and 2 add, then a dereferencing :

a[j][k] == *(a+j+10*k)

the second object is actually an int** object, so you are basically adding, dereferencing, adding again, and dereferencing again :

a[j][k] == *(*(a+j)+k)

the second method is of course faster, but it's slower to set up (because you need to allocate additional memory and set it up. Still, a good way would be to allocate on the stack something like this :

int a[10][20];
int * b[10];
for( j = 0; j < 10; ++j ) { b[j] = &(a[j][0]); }

And then use b[j][k] to access the elements.

Victor Nicollet, INT13 game programmer



[edited by - ToohrVyk on March 20, 2004 11:07:48 AM]
I thought that even a 1D array lookup required a multiply (unless it is an array of bytes). Since an int is 4 bytes (usually), to lookup in int a[10], it has to do *(a + (i * 4)).

Another issue is, how smart is the compiler? I hear a lot of people recommend to just do it the most straightforward way, avoid all of the pointer stuff, and let the compiler convert it to that method if that will be faster.
With all of the "real" stuff that your game is probably doing--graphics, sound, physics, etc, etc--you should NOT be worrying about whether a[10][10] or a[100] is faster.

Worrying about optimizing the least important 1% of your code is simply wasted effort. You''d be better off trying to get more speed out of your rendering loop. (If you even need any, most computers are fast enough that you don''t care.)
AP : why would optimizing the rendering system only mean he doesn''t have to optimize bidimensional arrays? Don''t forget bitmaps! In such cases even an array read can become a huge waste of time, since you usually do a lot of them each frame.

One of my collaborators recently worked on an application that did something like 15,360,000 pixel acecesses per second. In such cases optimization IS required (especially when you only have 200 MHz to work with in the first place, AND many other things to do, AND a lousy chipset overall).

OP : of course, any lookup requires a multiply (which is why []-accessing is slower than getting a copy of the array and incrementing it), but n-dimensional arrays require n*(n-1)/2 multiplies (in addition to the "normal" multiply of a 1D array).

Victor Nicollet, INT13 game programmer

And the multiply-by-four for integers is built in to the processor. AFAIK it takes no more time than a ''normal'' lookup
If your data is two dimensional by nature then whether you store it all in one long 1D array, or in a 2D array you are going to have about the same asm instructions either way. So for readability''s sake, just use a 2D array.
If you want to make sure it is fast the best thing you can do is use array dimensions that are a power of two in size. If the compiler sees that it has to multiply by a power of two then it will replace that with a shift.
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
Multi dimentional array access DOES require multiple multiplies - compared to SINGLE dimention lookups - but this is completely unavoidable ... cause you have to do the multiply sometime ... the only thing you can optimize is when ... the the user wants to get item [3] [2] on a chess board ... someone has to do the mutliple to figure out that this really is item index [26] (3 * 8 + 2).

also, when the array is declared all at once it is faster, not slower, than when the arrays are built using pointers to arrays ... in which case they use multiple multiplies AND a pointer dereference per dimension.

int a1[20];
int a2[20][20];
int a2b[20*20];
int a3[20][20][20];
int a3b[20*20*20];

when you use a2b[index];

the compiler does something like: *(a2b + (index * 4))

when you use a2[x][y];

the compiler does something like:

*(a2b + (x * 80) + (y * 4))

so the [ + x * 80 ] is the extra work ... BUT this work is actually DOING SOME FUNCTION that you need, it''s comverting a number from a logical domain (rows / column) to a physical domain (single dimentional address space) ... if you try to use a2b you will find YOU writing the equivelent code to solve the problem ... so it''s NOT an optimization, it''s a functional change.

also, smart compilers do optimize out that multiply in loops - it''s just standard variable computation hoisting ... if the compiler can see that for a given inner loop a certain computation is contanst, it compute it once, prior to the loop ...

so this:

// assume this might be a nested loop - variable x is already set ... outside this code ...
for(int y=0; y<20; ++y)
cout << a2[x][y];

if the compiler cannot or does not optimze this (such as in debug mode) - then EVERY time inside the loop, the [+ x * 80] code will be done ...

but most (all I know of) modern compilers can optimize this so that the [x * 80] part is done before the loop, and is simple added in each time ... - and most compilers can go even farther and precompute the [a2 + (x * 80)] part, so inside the loop the code is JUST this:

*(precomputedSum + (y * 4))

EXACTLY the same as a one dimentional array.
quote:Original post by Anonymous Poster
With all of the "real" stuff that your game is probably doing...
Ahem... My program is a chess program, so...
quote:graphics
No graphics.
quote:sound
No sound.
quote:physics
No physics.
quote:etc, etc
Aha! You hit on one. Enhanced Transposition Cutoffs (ETC)! A decent tree searching pruning technique.
quote:you should NOT be worrying about whether a[10][10] or a[100] is faster.

Worrying about optimizing the least important 1% of your code is simply wasted effort. You''d be better off trying to get more speed out of your rendering loop. (If you even need any, most computers are fast enough that you don''t care.)
The chess board is an array. The piece lists are 2D arrays or 3D arrays, depending upon which I choose. A chess program will spend almost all of its time scanning one of these arrays (board or piece lists).

If there is a detectable difference between using a 2D array or a 3D array for piece lists, I''d like to know that. I''ll be accessing and scanning these quite a bit.
I was just thinking about one time when I turned a 2D array lookup into a single bitwise OR. Instead of doing this: a[x][y];, I did this: a[x|y];. I just modifed the values that x and y contained so it would work.

If the size of each dimension of the array was a power of two, I imagine if I wanted to take the time, I could do this for a 3D array also, right? a[x|y|z];?

This topic is closed to new replies.

Advertisement