Jump to content
  • Advertisement
Sign in to follow this  
johnnyBravo

Should I give the length of pointer array or just look until it = null?

This topic is 4681 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hi, in functions that use pointer arrays, eg void func(float *pArray, in size); Speed wise, must I enter the size integer or could I just use a while loop to keep going until it finds a null, eg
void func(float *pArray) {
    int i=0;

    while(pArray) {
       i++;
    }
}


Share this post


Link to post
Share on other sites
Advertisement
In terms of program correctness, it's less error prone to pass the size. It's even less error prone to use a std::vector which stores it's size and you can use iterators with.

Share this post


Link to post
Share on other sites
Unless the last element of the array is guaranteed to be NULL (which is almost never the case with arrays), then you should pass the size. Else, you could end up reading past the end of the array into memory held by other programs, and that'll eventually give you an error.

Share this post


Link to post
Share on other sites
Pass the size!!!

Reasoning: Not passing the size means that you have to count every element to find out this number. This can be a very expensive operation if you have a large array - and sometimes you need this number without needing to sift through all the data.

This is one of the reasons why std::string and std::vector can outpreform their raw C counterparts in a more mantainable fashion.

Plus, you're not checking for null in your example, you're checking for equivilance to zero. You may as well have written:

while( pArray != 0.0f )


Note that this means you're screwed if pArray should be allowed to contain a NaN or 0.0, as it'll stop parsing there instead of the end of the array. You also MUST explicitly set the end of the array to zero, which is nasty, especially the error that will occur when you froget this step (and trust me, you will eventually).

Also, as SiCrane mentions, using std::vector would be even less error prone, because it associates the size directly with the array, which prevents you from mismatching them. It'd be even better to use.

Share this post


Link to post
Share on other sites
It's also more performant to use an explicit length in tight loops. Probably because it helps the compiler unroll the loop.

For example,


int spin (int * array, int count)
{
int j;
for (int i = 0; array; ++i)
j += array;
return j;
}


is five times slower than


int spin (int * array, int count)
{
int j;
for (int i = 0; i < count; ++i)
j += array;
return j;
}


when compiling with "gcc -O3".

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Quote:
Original post by johnnyBravo
Hi, in functions that use pointer arrays, eg void func(float *pArray, in size);

Speed wise, must I enter the size integer or could I just use a while loop to keep going until it finds a null, eg
*** Source Snippet Removed ***


ITYM

void func(float **pArray) { // pointer to pointer to float
// i.e. array of pointers to float
int i=0;

while(pArray) {
i++;
}
}

Share this post


Link to post
Share on other sites
Quote:
Original post by Anonymous Poster
Quote:
Original post by johnnyBravo
Hi, in functions that use pointer arrays, eg void func(float *pArray, in size);

Speed wise, must I enter the size integer or could I just use a while loop to keep going until it finds a null, eg
*** Source Snippet Removed ***


ITYM
*** Source Snippet Removed ***


Switching to having an array of pointers to floats is a pointless waste of resources. The size of a float is going to be the same as that of a pointer - except on a 64 bit platform where it'll be half the size.

Not to mention needless dereferencing of pointers will kill your cache and preformance. I think he just saw comparison to null with such a construct (e.g. argv) and mistakenly abstracted that to simpler arrays in general. You _can_ do it, it just works very badly when the array is not of pointers.

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Quote:
Original post by Nathan Baum
It's also more performant to use an explicit length in tight loops. Probably because it helps the compiler unroll the loop.

For example,


int spin (int * array, int count)
{
int j;
for (int i = 0; array; ++i)
j += array;
return j;
}


is five times slower than


int spin (int * array, int count)
{
int j;
for (int i = 0; i < count; ++i)
j += array;
return j;
}


when compiling with "gcc -O3".


I thought it might be worth pointing out that count is a value that cannot be determined until runtime so the compiler cannot unroll the loop at all (it can't assume that it's going to be a multiple of anything other than 1).

I would guess that the actual speed up comes from the fact that you're simply incrementing a value in a register and comparing it to another register value (or fixed value) rather than constantly offsetting and dereferencing a pointer to load the value into a register and compare it (even if that data is cached). But I'm just making this up so take it with a grain of salt.

Share this post


Link to post
Share on other sites
Quote:
Original post by Anonymous Poster
I thought it might be worth pointing out that count is a value that cannot be determined until runtime so the compiler cannot unroll the loop at all (it can't assume that it's going to be a multiple of anything other than 1).

I would guess that the actual speed up comes from the fact that you're simply incrementing a value in a register and comparing it to another register value (or fixed value) rather than constantly offsetting and dereferencing a pointer to load the value into a register and compare it (even if that data is cached). But I'm just making this up so take it with a grain of salt.


The compiler is free to inline the function as a whole where possible, so unrolling could account for some difference. Also, since the end result of pArray is allways used, the compiler is free to reuse the value for both the conditional check in the loop as well as the incrementation of J.

I'd blame it on the processor myself, I have the feeling that it's a lot easier to do good branch prediction with a simple loop rather than depending on the results of the data. But I could be totally wrong. Assembly optimization is not my specialty by any stretch of the imagination.

Share this post


Link to post
Share on other sites
Quote:
Original post by Anonymous Poster
Quote:
Original post by Nathan Baum
It's also more performant to use an explicit length in tight loops. Probably because it helps the compiler unroll the loop.

For example,


int spin (int * array, int count)
{
int j;
for (int i = 0; array; ++i)
j += array;
return j;
}


is five times slower than


int spin (int * array, int count)
{
int j;
for (int i = 0; i < count; ++i)
j += array;
return j;
}


when compiling with "gcc -O3".

I thought it might be worth pointing out that count is a value that cannot be determined until runtime so the compiler cannot unroll the loop at all (it can't assume that it's going to be a multiple of anything other than 1).

Haven't you heard of Duff's device? One can (partially) unroll loops even if one doesn't know the size of the loop at compile-time: so long as one knows that the size of the loop will be known at run-time.

When applying Duff's device manually, the performance is about half-way between the two alternatives above. No doubt this is partly because Duff's device makes it harder for gcc to do its own loop unrolling optimisation, and partly because it makes some other optimisations harder to do.
Quote:

I would guess that the actual speed up comes from the fact that you're simply incrementing a value in a register and comparing it to another register value (or fixed value) rather than constantly offsetting and dereferencing a pointer to load the value into a register and compare it (even if that data is cached). But I'm just making this up so take it with a grain of salt.

Since the loop body accesses the same element that the loop test uses, I would expect that any vaguely competent compiler would produce code which dereferenced the pointer only once per iteration. I'm sure there are other optimisations which gcc can only apply to the faster one, though.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

Participate in the game development conversation and more when you create an account on GameDev.net!

Sign me up!