# A Fast pow(a,b) and SSE


Hi,

So, my ray-tracer takes a very large amount of time (28%!) doing pow calculations. Despite this, I have been unable to find anything faster. I have tested:
http://martin.ankerl.com/2007/10/04/optimized-pow-approximation-for-java-and-c-c/
http://www.dctsystems.co.uk/Software/power.html

And none of them give any speedup, even for the quality hits they incur. I'm using Visual Studio with full optimizations. I am now attempting to use the SSE pow functions here. However, I am running into a problem:

Error: variable "__m128" is not a type name

. . . and, when attempting to compile:

error C2146: syntax error : missing ';' before identifier '__m128'

I've looked all over for how to fix this, but I can't solve it. I have tried:
[source lang="cpp"]#include <xmmintrin.h>[/source]
. . . xor:
[source lang="cpp"]#include <emmintrin.h>[/source]
. . . but to no avail.

CPU is Intel Core 2 Duo T8300 (2.4GHz, dual core)

Thanks,
G

##### Share on other sites
What part of the computation makes use of pow? Is it the evaluation of the Phong reflection model? Perhaps there are ways to avoid calling pow so many times to begin with.

##### Share on other sites
Which compiler are you using? What version?

##### Share on other sites
The pow function is being used for calculating specular and for calculating hemispherical random samples weighted by cos^n(theta).
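
For readers wondering where pow enters the sampling step, here is a sketch of the standard inverse-CDF method for a cos^n(theta) lobe. The names `Vec3` and `sampleCosPowerLobe` are hypothetical and the actual code in this ray-tracer may look different:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical helper: maps two uniform random numbers u1, u2 in [0, 1]
// to a unit direction around the +z axis whose density is proportional
// to cos^n(theta), where theta is the angle from +z.
inline Vec3 sampleCosPowerLobe(float u1, float u2, float n)
{
    assert(n >= 0.0f);
    const float twoPi = 6.28318530718f;
    // Inverting the CDF of cos^n(theta) gives cos(theta) = u1^(1/(n+1)).
    // This is the pow call that shows up in the profile.
    float cosTheta = std::pow(u1, 1.0f / (n + 1.0f));
    float sinTheta = std::sqrt(std::max(0.0f, 1.0f - cosTheta * cosTheta));
    float phi = twoPi * u2;
    Vec3 dir = { sinTheta * std::cos(phi), sinTheta * std::sin(phi), cosTheta };
    return dir;
}
```

Because the exponent 1/(n+1) is fractional, this pow cannot be replaced by repeated multiplication the way an integer specular exponent can.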

The compiler is Visual Studio 2010 Ultimate's compiler.

##### Share on other sites
Post all your source for the file which you can't compile...

##### Share on other sites
[quote]Post all your source for the file which you can't compile...[/quote]
It's that code from the article I linked. I added the #includes which I thought would make it work, but it still fails early:
[source lang="cpp"]#pragma once
#include <xmmintrin.h> //I've alternately tried: "#include <emmintrin.h>"

#define EXP_POLY_DEGREE 3

#define POLY0(x, c0) _mm_set1_ps(c0)
#define POLY1(x, c0, c1) _mm_add_ps(_mm_mul_ps(POLY0(x, c1), x), _mm_set1_ps(c0))
#define POLY2(x, c0, c1, c2) _mm_add_ps(_mm_mul_ps(POLY1(x, c1, c2), x), _mm_set1_ps(c0))
#define POLY3(x, c0, c1, c2, c3) _mm_add_ps(_mm_mul_ps(POLY2(x, c1, c2, c3), x), _mm_set1_ps(c0))
#define POLY4(x, c0, c1, c2, c3, c4) _mm_add_ps(_mm_mul_ps(POLY3(x, c1, c2, c3, c4), x), _mm_set1_ps(c0))
#define POLY5(x, c0, c1, c2, c3, c4, c5) _mm_add_ps(_mm_mul_ps(POLY4(x, c1, c2, c3, c4, c5), x), _mm_set1_ps(c0))

__m128 exp2f4(__m128 x) // <= FAILS HERE!!!
{
__m128i ipart;
__m128 fpart, expipart, expfpart;

//...[/source]
-G

##### Share on other sites
Did you try #include <Windows.h> ?

Often when something fails for that reason, and in such a simple single header case, it's because the header depends on a define/typedef from the windows.h header...

It could also be a case of fail for a header included before the one you pasted... (I'm assuming it's a header due to the #pragma once in it)

##### Share on other sites
[quote]Did you try #include <Windows.h> ?

Often when something fails for that reason, and in such a simple single header case, it's because the header depends on a define/typedef from the windows.h header...[/quote]
I attempted:
[source lang="cpp"]#include <Windows.h>
#include <xmmintrin.h>
#include <emmintrin.h>[/source]
. . . but it didn't work.

[quote]It could also be a case of fail for a header included before the one you pasted... (I'm assuming it's a header due to the #pragma once in it)[/quote]
Yes, it is a header, and nope, everything else works fine.
Thanks,
-G

##### Share on other sites
Not sure how this works in VS but for GCC you usually also have to turn on SSE instructions in the compiler to be able to use those headers.
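
As a rough way to check this point, the sketch below tests the predefined macros that signal SSE code generation: `__SSE__` on GCC and Clang (set by `-msse`, and on by default for x86-64), `_M_IX86_FP` on 32-bit MSVC (1 for `/arch:SSE`, 2 for `/arch:SSE2`), and `_M_X64` on 64-bit MSVC. The helper names are hypothetical:

```cpp
#include <cassert>
#include <xmmintrin.h>

// Hypothetical helper: true if the compiler was told to generate SSE code.
inline bool sseEnabledAtCompileTime()
{
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: adds two 4-float arrays lane by lane with SSE.
// Unaligned loads and stores are used so that any float pointer is accepted.
inline void add4(const float* a, const float* b, float* out)
{
    assert(a && b && out);
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
```

If `add4` compiles, the `__m128` type and the intrinsics are visible to the compiler, so any remaining error is likely coming from somewhere else.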

##### Share on other sites
I just tried pasting that code into a new console app using VS2010 Express, and using both includes it compiled just fine.

Commenting out all the includes gives "error C2146: syntax error : missing ';' before identifier 'exp2f4'". Note the differing location of the error message.

I suspect your problem is a missing ; at the end of a class definition in another file.
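
To illustrate that suspicion, here is a sketch with a stand-in class (`Ray` and `double4` are hypothetical, not taken from the poster's project). As written it compiles. If the marked semicolon is deleted, the compiler typically reports the error at the next declaration, which here is the function using `__m128`, rather than at the class itself:

```cpp
#include <xmmintrin.h>

// Stand-in for a class defined in a header that gets included earlier.
class Ray
{
public:
    float origin[3];
    float direction[3];
};  // <-- deleting this ';' moves the reported error to the declaration below

// With the ';' in place, __m128 is parsed as a type and this compiles.
__m128 double4(__m128 x)
{
    return _mm_add_ps(x, x);    // doubles each of the four lanes
}
```

The practical check is to look at whatever is included or declared just before the line the compiler complains about.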
