Geometrian

A Fast pow(a,b) and SSE

This topic is 2638 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Hi,

So, my ray-tracer takes a very large amount of time (28%!) doing pow calculations. Despite this, I have been unable to find anything faster. I have tested:
http://bytes.com/topic/c/answers/761659-fast-power-function
http://martin.ankerl.com/2007/10/04/optimized-pow-approximation-for-java-and-c-c/
http://www.dctsystems.co.uk/Software/power.html

And none of them give any speedup, even for the quality hits they incur. I'm using Visual Studio with full optimizations. I am now attempting to use the SSE pow functions here. However, I am running into a problem:

Error: variable "__m128" is not a type name

. . . and, when attempting to compile:

error C2146: syntax error : missing ';' before identifier '__m128'

I've looked all over for how to fix this, but I can't solve it. I have tried:
[source lang="cpp"]#include <xmmintrin.h>[/source]
. . . xor:
[source lang="cpp"]#include <emmintrin.h>[/source]
. . . but to no avail.


CPU is Intel Core 2 Duo T8300 (2.4GHz, dual core)


Thanks,
G

What part of the computation makes use of pow? Is it the evaluation of the Phong reflection model? Perhaps there are ways to avoid calling pow so many times to begin with.

The pow function is being used for calculating specular and for calculating hemispherical random samples weighted by cos^n(theta).

The compiler is Visual Studio 2010 Ultimate's compiler.
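One way to avoid some of the pow calls in the specular term, assuming the exponent happens to be a non-negative integer (the thread does not say whether it is), is exponentiation by squaring. The sketch below is only an illustration; `ipow` is a hypothetical helper name, not something from the linked articles.

```cpp
// Hypothetical helper: raise `base` to a non-negative integer power using
// O(log n) multiplications instead of a general-purpose pow() call.
// Only applicable when the exponent is known to be an integer.
inline float ipow(float base, unsigned int n) {
    float result = 1.0f;
    while (n != 0u) {
        if (n & 1u) {
            result *= base;  // this bit of the exponent is set: fold it in
        }
        base *= base;        // square the base for the next bit
        n >>= 1u;
    }
    return result;
}
```

For a specular term this would be called as something like `ipow(cosAngle, shininess)`. It does not help with the fractional exponents that the cos^n(theta) sampling needs.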

[quote]Post all your source for the file which you can't compile...[/quote]
It's that code from the article I linked. I added the #includes which I thought would make it work, but it still fails early:
[source lang="cpp"]#pragma once
#include <xmmintrin.h> //I've alternately tried: "#include <emmintrin.h>"

#define EXP_POLY_DEGREE 3

#define POLY0(x, c0) _mm_set1_ps(c0)
#define POLY1(x, c0, c1) _mm_add_ps(_mm_mul_ps(POLY0(x, c1), x), _mm_set1_ps(c0))
#define POLY2(x, c0, c1, c2) _mm_add_ps(_mm_mul_ps(POLY1(x, c1, c2), x), _mm_set1_ps(c0))
#define POLY3(x, c0, c1, c2, c3) _mm_add_ps(_mm_mul_ps(POLY2(x, c1, c2, c3), x), _mm_set1_ps(c0))
#define POLY4(x, c0, c1, c2, c3, c4) _mm_add_ps(_mm_mul_ps(POLY3(x, c1, c2, c3, c4), x), _mm_set1_ps(c0))
#define POLY5(x, c0, c1, c2, c3, c4, c5) _mm_add_ps(_mm_mul_ps(POLY4(x, c1, c2, c3, c4, c5), x), _mm_set1_ps(c0))

__m128 exp2f4(__m128 x) // <= FAILS HERE!!!
{
    __m128i ipart;
    __m128 fpart, expipart, expfpart;

    //...[/source]
-G
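For context, a routine named exp2f4 is presumably one half of the usual identity pow(a, b) = 2^(b * log2(a)), valid for a > 0. The scalar sketch below shows that identity using the standard library; it is not the article's SSE code, and `pow_via_exp2` is a hypothetical name. On its own it is unlikely to be faster than pow. Any speedup would come from replacing the exp2 and log2 calls with cheaper approximations.

```cpp
#include <cmath>

// Hypothetical scalar illustration of pow(a, b) = 2^(b * log2(a)).
// Assumes a > 0; zero or negative bases are not handled here.
inline float pow_via_exp2(float a, float b) {
    return std::exp2(b * std::log2(a));
}
```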

Did you try #include <Windows.h> ?

Often when something fails for that reason, and in such a simple single header case, it's because the header depends on a define/typedef from the windows.h header...

It could also be a failure in a header included before the one you pasted... (I'm assuming it's a header due to the #pragma once in it)

[quote]Did you try #include <Windows.h> ?

Often when something fails for that reason, and in such a simple single header case, it's because the header depends on a define/typedef from the windows.h header...[/quote]
I attempted:
[source lang="cpp"]#include <Windows.h>
#include <xmmintrin.h>
#include <emmintrin.h>[/source]
. . . but it didn't work.
[quote]It could also be a failure in a header included before the one you pasted... (I'm assuming it's a header due to the #pragma once in it)[/quote]
Yes, it is a header, and nope, everything else works fine.
Thanks,
-G

Not sure how this works in VS but for GCC you usually also have to turn on SSE instructions in the compiler to be able to use those headers.
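A rough way to check this at compile time is to test the macros the compilers predefine. As far as I know, GCC and Clang define `__SSE2__` when SSE2 code generation is enabled (for example with -msse2, which is the default on x86-64), while MSVC defines `_M_X64` on 64-bit targets and sets `_M_IX86_FP` to 2 under /arch:SSE2 on 32-bit ones. The names `RT_HAS_SSE2` and `sse2_enabled` below are hypothetical.

```cpp
// Hypothetical feature macro: 1 when the compiler may emit SSE2 instructions.
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#define RT_HAS_SSE2 1
#else
#define RT_HAS_SSE2 0
#endif

// Reports the compile-time setting. It does not probe the CPU at run time.
inline bool sse2_enabled() {
    return RT_HAS_SSE2 != 0;
}
```

Note that MSVC's intrinsic headers declare `__m128` regardless of the /arch setting, so a missing flag would not by itself explain the error reported in this thread.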

I just tried pasting that code into a new console app using VS2010 Express, and using both includes it compiled just fine.

Commenting out all the includes gives "error C2146: syntax error : missing ';' before identifier 'exp2f4'". Note the differing location of the error message.

I suspect your problem is a missing ; at the end of a class definition in another file.
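To illustrate that suspicion, here is a minimal sketch of how such a mistake would surface. `Material`, `scale4` and `first_lane` are hypothetical names standing in for whatever is defined in the headers included earlier. The code compiles as written. Deleting the marked semicolon would make the compiler complain at the next declaration, which may live in a different file from the actual mistake.

```cpp
#include <emmintrin.h>  // SSE2 header; also pulls in <xmmintrin.h> for __m128

// Hypothetical class from a header that is included earlier.
struct Material {
    float shininess;
};  // <- without this ';' the error is reported at the next declaration

// First declaration after the class: the place where the compiler would
// report the problem if the ';' above were missing.
inline __m128 scale4(__m128 x, float s) {
    return _mm_mul_ps(x, _mm_set1_ps(s));  // multiply all four lanes by s
}

// Extract the lowest lane so the result can be inspected as a plain float.
inline float first_lane(__m128 v) {
    return _mm_cvtss_f32(v);
}
```

Checking the last class or struct in each header included before the SSE header is probably the quickest way to confirm or rule this out.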
