Should I give the length of pointer array or just look until it = null?

Started by johnnyBravo
15 comments, last by intransigent-seal 18 years, 7 months ago
Hi, in functions that use pointer arrays, e.g. void func(float *pArray, int size); Speed-wise, must I pass the size integer, or could I just use a while loop that keeps going until it finds a null? E.g.

void func(float *pArray) {
    int i=0;

    while(pArray[i]) {
       i++;
    }
}


In terms of program correctness, it's less error prone to pass the size. It's even less error prone to use a std::vector, which stores its own size and lets you use iterators.
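For illustration, a minimal sketch of that approach (the function name sum is made up for the example, not from the original post):

#include <vector>

// The container carries its own size, so the callee can never be
// handed a length that doesn't match the data.
float sum(const std::vector<float>& values) {
    float total = 0.0f;
    for (std::vector<float>::const_iterator it = values.begin();
         it != values.end(); ++it) {
        total += *it;
    }
    return total;
}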
Unless the last element of the array is guaranteed to be NULL (which is almost never the case with arrays), you should pass the size. Otherwise, you could end up reading past the end of the array into memory you don't own, and that'll eventually give you an error.
[size="2"][size=2]Mort, Duke of Sto Helit: NON TIMETIS MESSOR -- Don't Fear The Reaper
Pass the size!!!

Reasoning: not passing the size means you have to walk every element just to find out how many there are. That can be a very expensive operation on a large array, and sometimes you need the count without wanting to sift through all the data.

This is one of the reasons why std::string and std::vector can outperform their raw C counterparts in a more maintainable fashion.

Plus, you're not checking for null in your example; you're checking for equivalence to zero. You may as well have written:

while( pArray[i] != 0.0f )


Note that this means you're screwed if pArray should be allowed to contain a legitimate 0.0, as it'll stop parsing there instead of at the end of the array. You also MUST explicitly set the last element of the array to zero, which is nasty, especially the error that will occur when you forget this step (and trust me, you will eventually).

Also, as SiCrane mentions, using std::vector would be even less error prone, because it associates the size directly with the array, which prevents you from mismatching the two.
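To make the zero-in-the-data problem concrete, here's a small sketch; it assumes the zero-terminated convention the OP proposed:

#include <stdio.h>

// Counting elements up to a 0.0f sentinel: the legitimate 0.0f in the
// middle of the data ends the walk early, so only 2 of the 4 values are seen.
int main() {
    float data[] = { 1.0f, 2.0f, 0.0f, 3.0f, 0.0f }; // last 0.0f is the sentinel
    int i = 0;
    while (data[i] != 0.0f)
        ++i;
    printf("counted %d elements, expected 4\n", i);
    return 0;
}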
It's also more performant to use an explicit length in tight loops. Probably because it helps the compiler unroll the loop.

For example,

int spin (int * array, int count)
{
  int j = 0;
  for (int i = 0; array[i]; ++i)
    j += array[i];
  return j;
}


is five times slower than

int spin (int * array, int count)
{
  int j = 0;
  for (int i = 0; i < count; ++i)
    j += array[i];
  return j;
}


when compiling with "gcc -O3".
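If anyone wants to reproduce this, a rough timing harness along these lines should do (spin_sentinel and spin_counted are just the two variants above, renamed); the exact ratio will vary with compiler, flags and hardware:

#include <cstdio>
#include <ctime>

int spin_sentinel (int * array)
{
  int j = 0;
  for (int i = 0; array[i]; ++i)
    j += array[i];
  return j;
}

int spin_counted (int * array, int count)
{
  int j = 0;
  for (int i = 0; i < count; ++i)
    j += array[i];
  return j;
}

int main()
{
  const int N = 1 << 20;
  static int data[N + 1];            // one extra slot for the terminating zero
  for (int i = 0; i < N; ++i)
    data[i] = 1;                     // non-zero, so the sentinel loop keeps going
  data[N] = 0;

  volatile int sink = 0;             // stops the calls being optimised away
  std::clock_t t0 = std::clock();
  for (int r = 0; r < 100; ++r)
    sink += spin_sentinel(data);
  std::clock_t t1 = std::clock();
  for (int r = 0; r < 100; ++r)
    sink += spin_counted(data, N);
  std::clock_t t2 = std::clock();

  std::printf("sentinel: %ld ticks, counted: %ld ticks\n",
              (long)(t1 - t0), (long)(t2 - t1));
  return 0;
}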
Quote:Original post by johnnyBravo
Hi, in functions that use pointer arrays, e.g. void func(float *pArray, int size);

Speed-wise, must I pass the size integer, or could I just use a while loop that keeps going until it finds a null? E.g.
*** Source Snippet Removed ***


ITYM
void func(float **pArray) { // pointer to pointer to float
                            // i.e. array of pointers to float
    int i=0;

    while(pArray[i]) {
        i++;
    }
}
Quote:Original post by Anonymous Poster
Quote:Original post by johnnyBravo
Hi, in functions that use pointer arrays, e.g. void func(float *pArray, int size);

Speed-wise, must I pass the size integer, or could I just use a while loop that keeps going until it finds a null? E.g.
*** Source Snippet Removed ***


ITYM
*** Source Snippet Removed ***


Switching to an array of pointers to floats is a pointless waste of resources. The size of a float is going to be the same as that of a pointer, except on a 64-bit platform, where a float will be half the size of a pointer.

Not to mention needless dereferencing of pointers will kill your cache and performance. I think he just saw comparison to null with such a construct (e.g. argv) and mistakenly abstracted that to simpler arrays in general. You _can_ do it; it just works very badly when the array is not an array of pointers.
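For reference, argv is the textbook case: the standard guarantees argv[argc] is a null pointer, so the count-free walk really is safe there:

#include <stdio.h>

// argv is a null-terminated array of pointers, so no count is needed;
// this is the construct the walk-until-null idiom actually comes from.
int main(int argc, char *argv[]) {
    (void)argc; // the count is redundant here
    for (int i = 0; argv[i] != 0; ++i)
        printf("argv[%d] = %s\n", i, argv[i]);
    return 0;
}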
Quote:Original post by Nathan Baum
It's also more performant to use an explicit length in tight loops. Probably because it helps the compiler unroll the loop.

For example,


int spin (int * array, int count)
{
  int j = 0;
  for (int i = 0; array[i]; ++i)
    j += array[i];
  return j;
}


is five times slower than


int spin (int * array, int count)
{
  int j = 0;
  for (int i = 0; i < count; ++i)
    j += array[i];
  return j;
}


when compiling with "gcc -O3".


I thought it might be worth pointing out that count is a value that cannot be determined until runtime so the compiler cannot unroll the loop at all (it can't assume that it's going to be a multiple of anything other than 1).

I would guess that the actual speed up comes from the fact that you're simply incrementing a value in a register and comparing it to another register value (or fixed value) rather than constantly offsetting and dereferencing a pointer to load the value into a register and compare it (even if that data is cached). But I'm just making this up so take it with a grain of salt.
Quote:Original post by Anonymous Poster
I thought it might be worth pointing out that count is a value that cannot be determined until runtime so the compiler cannot unroll the loop at all (it can't assume that it's going to be a multiple of anything other than 1).

I would guess that the actual speed up comes from the fact that you're simply incrementing a value in a register and comparing it to another register value (or fixed value) rather than constantly offsetting and dereferencing a pointer to load the value into a register and compare it (even if that data is cached). But I'm just making this up so take it with a grain of salt.


The compiler is free to inline the function as a whole where possible, so unrolling could account for some of the difference. Also, since the element's value is always used, the compiler is free to reuse it for both the conditional check in the loop and the addition to j.

I'd blame it on the processor myself; I have the feeling that it's a lot easier to do good branch prediction with a simple counted loop than with one that depends on the data. But I could be totally wrong. Assembly optimization is not my specialty by any stretch of the imagination.
Quote:Original post by Anonymous Poster
Quote:Original post by Nathan Baum
It's also more performant to use an explicit length in tight loops. Probably because it helps the compiler unroll the loop.

For example,


int spin (int * array, int count)
{
  int j = 0;
  for (int i = 0; array[i]; ++i)
    j += array[i];
  return j;
}


is five times slower than


int spin (int * array, int count)
{
  int j = 0;
  for (int i = 0; i < count; ++i)
    j += array[i];
  return j;
}


when compiling with "gcc -O3".

I thought it might be worth pointing out that count is a value that cannot be determined until runtime so the compiler cannot unroll the loop at all (it can't assume that it's going to be a multiple of anything other than 1).

Haven't you heard of Duff's device? One can (partially) unroll a loop even when its size isn't known at compile-time, so long as the size is known at run-time.

When applying Duff's device manually, the performance is about half-way between the two alternatives above. No doubt this is partly because Duff's device makes it harder for gcc to do its own loop unrolling optimisation, and partly because it makes some other optimisations harder to do.
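For the curious, a hand-rolled Duff's device version of spin would look roughly like this (spin_duff is my name for it; it assumes count > 0):

int spin_duff (int * array, int count)
{
  int j = 0;
  int i = 0;
  int n = (count + 7) / 8;      // number of 8-element chunks, rounded up
  switch (count % 8) {          // jump into the middle of the first chunk
  case 0: do { j += array[i++];
  case 7:      j += array[i++];
  case 6:      j += array[i++];
  case 5:      j += array[i++];
  case 4:      j += array[i++];
  case 3:      j += array[i++];
  case 2:      j += array[i++];
  case 1:      j += array[i++];
          } while (--n > 0);
  }
  return j;
}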
Quote:
I would guess that the actual speed up comes from the fact that you're simply incrementing a value in a register and comparing it to another register value (or fixed value) rather than constantly offsetting and dereferencing a pointer to load the value into a register and compare it (even if that data is cached). But I'm just making this up so take it with a grain of salt.

Since the loop body accesses the same element that the loop test uses, I would expect that any vaguely competent compiler would produce code which dereferenced the pointer only once per iteration. I'm sure there are other optimisations which gcc can only apply to the faster one, though.
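In other words, the compiler can treat the sentinel version as if it had been written like this, with each element loaded only once per iteration (spin_once is just an illustrative name):

int spin_once (int * array)
{
  int j = 0;
  int v;
  for (int i = 0; (v = array[i]) != 0; ++i)
    j += v;   // reuses the value already loaded for the loop test
  return j;
}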

This topic is closed to new replies.
