NDIR

Member Since 22 Sep 2007
Offline Last Active May 20 2015 04:17 PM

Posts I've Made

In Topic: Triangulation

20 May 2015 - 04:39 AM

For this exact problem (triangulating 3D polygons with holes that lie on an arbitrary plane) I'm using gluTessBeginPolygon and the other gluTess* functions. It does the job beautifully.


In Topic: Triangle is not visible - PathTracing

04 May 2015 - 04:46 AM

Where do you normalize the normal?


In Topic: Speeding up cpu side of rendering engine

08 January 2015 - 03:21 PM

...

    componentMap currentComponentMap = gameObjects->at(i)->GetComponentMap();

...

 

 

You should use references (&) here so the component map is not copied on every iteration. Also, calling .find everywhere is probably bad for performance...
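To illustrate the point about references, here is a minimal sketch. The thread does not show the real types, so Component, GameObject, the string key and ReadValue below are hypothetical stand-ins; only the names componentMap and GetComponentMap come from the quoted line. It assumes the getter can return the map by reference, and it does the lookup once and reuses the iterator rather than calling find repeatedly.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-ins: the real Component and componentMap are not shown in the thread.
struct Component { int value = 0; };
using componentMap = std::unordered_map<std::string, Component>;

class GameObject {
public:
    // Returning a reference lets callers look at the map without copying it.
    const componentMap& GetComponentMap() const { return components_; }
    componentMap& GetComponentMap() { return components_; }

private:
    componentMap components_;
};

// Hypothetical helper: binds the map by const reference (no copy) and
// performs a single find, reusing the iterator for the result.
int ReadValue(const GameObject& object, const std::string& name) {
    const componentMap& components = object.GetComponentMap();
    auto it = components.find(name);
    return it != components.end() ? it->second.value : -1;
}
```

Writing `componentMap currentComponentMap = ...` instead would copy the whole map each time, even when the getter itself returns a reference.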


In Topic: Poor STL threads performance

19 September 2014 - 01:59 PM

 

Do you actually split the work between the threads or do you simply multiply the work with every thread?

 

Not sure what you mean, but I tried multiple approaches.

The one I wrote above, and this one too:

 

for(int m_prime = 0; m_prime < N; m_prime++)
{
     threads.push_back(std::thread(&Ocean::DoFFT, this, 1, m_prime * N));
     if(threads.size() == 4)
     {
           //.. join all 4 threads before moving to the next batch.
     }
}

 

But like Hogman mentioned, I am spawning 64 threads even with this approach; I'll just create some kind of thread pool of 4 permanent threads instead.

Side question though: assuming I make this thing work properly with the expected FPS boost, would I get better performance with OpenMP (so the code is still executed on the CPU), or should I jump directly to an OpenCL implementation?

Thanks for your help!

 

 

Take 4 threads (from a pool; don't create new ones every frame / update) and split the work between those 4 threads.

So thread 1 executes (in Ocean::DoFFT):

 

for (int m_prime = 0; m_prime < N/4; m_prime++) {
    fft->fft(h_tilde, h_tilde, 1, m_prime * N);
    fft->fft(h_tilde_slopex, h_tilde_slopex, 1, m_prime * N);
    fft->fft(h_tilde_slopez, h_tilde_slopez, 1, m_prime * N);
    fft->fft(h_tilde_dx, h_tilde_dx, 1, m_prime * N);
    fft->fft(h_tilde_dz, h_tilde_dz, 1, m_prime * N);
}

 

Thread 2 executes:

 

for (int m_prime = N/4; m_prime < N/4 + N/4; m_prime++) {
    fft->fft(h_tilde, h_tilde, 1, m_prime * N);
    fft->fft(h_tilde_slopex, h_tilde_slopex, 1, m_prime * N);
    fft->fft(h_tilde_slopez, h_tilde_slopez, 1, m_prime * N);
    fft->fft(h_tilde_dx, h_tilde_dx, 1, m_prime * N);
    fft->fft(h_tilde_dz, h_tilde_dz, 1, m_prime * N);
}

 

And so on for threads 3 and 4, each taking the next quarter of the rows.
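The per-thread loops above all follow the same pattern, so the split can be written once. The following is only a sketch of the partitioning: ParallelForRows is a hypothetical helper, not part of the Ocean class, and for brevity it starts its threads on each call, whereas a real implementation would hand the slices to the permanent pool mentioned above.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: calls work(begin, end) on contiguous, non-overlapping
// slices of [0, count), one slice per thread, and waits for all of them.
// Threads are created per call here only to keep the sketch short.
void ParallelForRows(std::size_t count, std::size_t threadCount,
                     const std::function<void(std::size_t, std::size_t)>& work) {
    if (threadCount == 0) threadCount = 1;
    // Round up so the last slice absorbs any remainder.
    const std::size_t chunk = (count + threadCount - 1) / threadCount;
    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < threadCount; ++t) {
        const std::size_t begin = std::min(count, t * chunk);
        const std::size_t end = std::min(count, begin + chunk);
        if (begin == end) break;  // no rows left for this thread
        threads.emplace_back([&work, begin, end] { work(begin, end); });
    }
    for (std::thread& thread : threads) thread.join();
}
```

With N rows and 4 threads, the callback would run the five fft->fft calls for every m_prime in its [begin, end) slice, which for N divisible by 4 matches the N/4-sized ranges shown above.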


In Topic: Poor STL threads performance

19 September 2014 - 08:39 AM

Do you actually split the work between the threads or do you simply multiply the work with every thread?

