which way is the most cache friendly?

Is there an advantage to doing array[y][x] instead of array[x+y*sizex]?
If so, you can still allocate a contiguous (cache-friendly) array like this:

object **array;
array = new object*[sizey];
array[0] = new object[sizex*sizey]; // note that you have only two 'new's
for (int f = 1, k = sizex; f < sizey; f++, k += sizex)
{
    array[f] = &array[0][k]; // each row pointer points into the single block
}

and you can still access it linearly via array[0][index].
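For illustration, here is a minimal self-contained sketch of the same layout in use (not from the original post: float stands in for the undefined 'object' type, and the 512x512 size is an arbitrary assumption). Both access styles below touch the same contiguous block, so either can be used without losing cache friendliness.

#include <cstddef>

int main()
{
    const std::size_t sizex = 512, sizey = 512; // assumed dimensions

    // Chuck3d's layout: one block of sizex*sizey elements plus a row-pointer table.
    float **array = new float*[sizey];
    array[0] = new float[sizex * sizey];
    for (std::size_t y = 1; y < sizey; ++y)
        array[y] = &array[0][y * sizex];

    // 2D-style access: array[y][x]
    for (std::size_t y = 0; y < sizey; ++y)
        for (std::size_t x = 0; x < sizex; ++x)
            array[y][x] = 0.0f;

    // Linear access over the same contiguous memory: array[0][i]
    for (std::size_t i = 0; i < sizex * sizey; ++i)
        array[0][i] = 1.0f;

    delete[] array[0];
    delete[] array;
    return 0;
}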

Chuck.

[edited by - Chuck3d on February 10, 2004 3:06:41 PM]
!o)
I'll take one more shot, since it's now very clear to me what made our views differ.
quote: Original post by Jan Wassenberg
In theory, you are correct about (n -> inf) => cache is a constant factor, but that's entirely worthless in practice (lacking infinite mem)
Not at all worthless! The cache is still a lot smaller than main memory, and in applications where the matrices don't fit in the cache, a cache lookup is essentially O(1), and indeed *in practice*. No infinite memory is needed to notice this. Processing an image or a heightmap a few megabytes in size, for example, is enough to hit the cache's worst-case lookup time, making it a constant factor. This is consistent with what you said earlier as well. We can't reach infinity with any real memory, but we reach the asymptotic behaviour soon enough, because caches are so small compared to conventional memory. This is what matters IMO (and this is not nitpicking!).
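As a rough illustration of that constant factor (my own sketch, not from the posts: the 4096x4096 float image and both functions are assumptions), compare two traversal orders over a buffer much larger than any cache. Both do exactly W*H additions, so their Big-O is identical; only the cache-determined constant differs, and it stops growing once the data is far bigger than the cache.

#include <cstddef>
#include <vector>

// Assumed size, chosen so the buffer (64 MB of floats) dwarfs any CPU cache.
const std::size_t W = 4096, H = 4096;

// Row-major traversal: consecutive accesses share cache lines, so few misses.
float sum_row_major(const std::vector<float>& img)
{
    float s = 0.0f;
    for (std::size_t y = 0; y < H; ++y)
        for (std::size_t x = 0; x < W; ++x)
            s += img[y * W + x];
    return s;
}

// Column-major traversal: each access jumps W floats ahead, so almost
// every access misses the cache once the image exceeds the cache size.
float sum_col_major(const std::vector<float>& img)
{
    float s = 0.0f;
    for (std::size_t x = 0; x < W; ++x)
        for (std::size_t y = 0; y < H; ++y)
            s += img[y * W + x];
    return s;
}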

But yeah, I see how it could be useful to say that the algorithm may behave like O(n^3) if n is small enough. The notation is still helpful: for example, when n<10 the 'real' complexity function may exceed C*n^3, but as n rises into 10<n<1000 we get the asymptotic behaviour of O(n^3). And once we break past the n=1000 barrier, the cache becomes small enough relative to the data that O(n^2) is the better approximation. Maybe I was too eager to assume that n should be thought of as very big (though still within the realistic bounds of conventional memory), where O(n^2) in the original case holds with or without the cache. But if we want better approximations for *small* n, then we may indeed get O(n^3) when n<1000.

This is precisely where our difference arose. Maybe you only deal with small matrices and arrays, so in your field (or what you call "in practice") the estimate O(n^3) was correct (the best approximation as long as n<1000). But for what I had assumed n to be, e.g. a bitmap image, O(n^2) was the correct limit (the best approximation as long as 1000<n<100000).
quote: Maybe you only deal with small matrices and arrays, so in your field (or what you call "in practice") the estimate O(n^3) was correct (the best approximation as long as n<1000). But for what I had assumed n to be, e.g. a bitmap image, O(n^2) was the correct limit (the best approximation as long as 1000<n<100000).

Alright, good summary, we're on the same page here.
Yeah, it's always funny to see the assumptions we make about certain things we 'always'/'never' use.
BTW, who's the man behind the AP mask?
E8 17 00 42 CE DC D2 DC E4 EA C4 40 CA DA C2 D8 CC 40 CA D0 E8 40E0 CA CA 96 5B B0 16 50 D7 D4 02 B2 02 86 E2 CD 21 58 48 79 F2 C3

