#### Archived

This topic is now archived and is closed to further replies.

# Nearest power of 2

This topic is 5600 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

The next power of 2 is for the numbers from 0 to 15:

p=5;if (i>=p) {    p=8;    labelcritticall15:;    if (i>p) {        p=p+p;    }} else {    p=2;    if (i>=p) {        goto labelcritticall15;    }    p=1;}

i stands for input and p stands for power.

Also developed by Critticall tool.

##### Share on other sites
p=8;if (i>p) {    p=p+p;} else {    p=5;    if (i>=p) {        p=8;    } else {        p=2;        if (i>=p) {            if (i!=p) {                p=4;            }        } else {            p=1;        }    }}

Another one. Faster for 5%.

##### Share on other sites
And another 1% better.

p=5;if (i>=p) {    p=8;    if (i>p) {        p=p+p;    }} else {    p=2;    if (i>=p) {        if (i>p) {            p=4;        }    } else {        p=1;    }}

To generalize it now, to any bit number is quite easy. Isn't it?

##### Share on other sites
How we can generalize it to 5 bits numbers?

This is an obvius way:
if (i>16) {p=32;goto over;} if (i==16) {p=16;goto over;} p=5;if (i>=p) {    p=8;    if (i>p) {        p=p+p;    }} else {    p=2;    if (i>=p) {        if (i>p) {            p=4;        }    } else {        p=1;    }}over:;

Critticall will smooth it to:

p=16;if (i>p) {    p=p+p;} else {    p=5;    if (i>=p) {        p=8;        if (i>p) {            p=p+p;        }    } else {        p=2;        if (i>=p) {            if (i>p) {                p=4;            }        } else {            p=1;        }    }}

##### Share on other sites
Seems like it's generating the b-search approach and we've already beaten that one. Also would you mind using Charles exsiting framework so we can get comparable numbers?

##### Share on other sites
Well ... it's the same thing. _WE_ are faster, not this BSR version.

This b tree is better. I don't care what you say, until you run and see!

##### Share on other sites
Quote:
 Original post by Thomas_CRWell ... it's the same thing. _WE_ are faster, not this BSR version.This b tree is better. I don't care what you say, until you run and see!

I said b-search as in "binarysearch" and honestly we've beaten hardcoded binarytrees just have a look at the 1kb LUT version.

And really you have no reason to be rude just because I asked you to use the excellent framework everyone else is using so we can compare speeds.

EDIT: added the Criticall version to the testing framework here's the numbers

nextpow2_Naive : 42.906250 nextpow2_Logical : 10.898438 nextpow2_Recursive : 67.859375 nextpow2_IEEE : 8.250000 nextpow2_BSR : 14.710938 nextpow2_3DNow : 9.523438 nextpow2_DD : 4.898438 nextpow2_DD2 :8.250000 nextpow2_DD3 : 6.281250 NextPow2_Critticall : 37.257813

note that the framework makes heavy used of inlineing to get
good performace so testing through function pointers can yield quite diffrent numbers.

I still consider my DD3 version using 1kb LUT to be the best practial solution, for lower numbers where memory is at a premium the IEEE version is a really good choice and for the general case the logical version is probably the best choice.

##### Share on other sites
What is the problem here?

You _have to_ measure for _all_ the numbers from 0 to 2^(N+1)-1. Where N denotes the bitness.

For example, if your function is meant to work for 10 bits numbers, you must measure it for 0 to 1023. There is no point to pick one number and measure for this one only.

Now, suppose that we are actually measuring for 21 bits number. Half of them are greater than 1048576. In all those cases we compare the number with 1048576 and set the result to 2097152. Few clock cycles and it is done. For the next half of the half, we have two comparisons. Several comparisons and one set for the waste majority of numbers. That also means several CPU cycles.

And yes, for some small numbers this can actually be a slower way. But they are just a tiny minority.

All in all, you can't beat the binary approach, toward which Critticall is also converting.

Here, you do a FLAW measurement. Do you understand that?

##### Share on other sites
Either you don't understand what Im saying or you're just ignorant but the fact remains that a hardcoded binarysearch isn't the fastest way to do this. All benchmarks will have flaws and that's the universal truth about benchmarking still you can see the numbers the Criticall goop just stinks and doesn't let the compiler carry out its job in a good fashion. Ill run some additional tests and report again.

##### Share on other sites
I understand you perfectly. I insist however, that you make test on ALL the numbers of a certain bit witdth.

Otherwise, your test is pointless. Worse, it's wrong.

##### Share on other sites

This topic is 5600 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

This topic is now closed to further replies.

• ### Game Developer Survey

We are looking for qualified game developers to participate in a 10-minute online survey. Qualified participants will be offered a \$15 incentive for your time and insights. Click here to start!

• 16
• 13
• 9
• 11
• 15