UPDATE: I've edited the source code since the original post, as it made no sense at all.
Hi there,
So, I ported this ocean rendering algorithm to DirectX: http://www.keithlantz.net/2011/11/ocean-simulation-part-two-using-the-fast-fourier-transform/
It works great, but it's slow because the FFT is computed on the CPU.
I've noticed that this part of the code is very costly:
for (int m_prime = 0; m_prime < N; m_prime++) {
    fft->fft(h_tilde, h_tilde, 1, m_prime * N);
    fft->fft(h_tilde_slopex, h_tilde_slopex, 1, m_prime * N);
    fft->fft(h_tilde_slopez, h_tilde_slopez, 1, m_prime * N);
    fft->fft(h_tilde_dx, h_tilde_dx, 1, m_prime * N);
    fft->fft(h_tilde_dz, h_tilde_dz, 1, m_prime * N);
}
So I tried to use C++11 threads to make this faster, and I ended up with worse performance (it went from 16 fps to 5 fps in a Debug build).
I don't have my code in front of me, but it basically looked like this:
void Ocean::Update(float tick)
{
    ...
    std::vector<std::thread> threads;
    for (int m_prime = 0; m_prime < N; m_prime++)
    {
        // One thread per FFT call, so 5 threads are spawned for each row.
        threads.push_back(std::thread(&Ocean::DoFFT, this, h_tilde, h_tilde, 1, m_prime * N));
        threads.push_back(std::thread(&Ocean::DoFFT, this, h_tilde_slopex, h_tilde_slopex, 1, m_prime * N));
        threads.push_back(std::thread(&Ocean::DoFFT, this, h_tilde_slopez, h_tilde_slopez, 1, m_prime * N));
        threads.push_back(std::thread(&Ocean::DoFFT, this, h_tilde_dx, h_tilde_dx, 1, m_prime * N));
        threads.push_back(std::thread(&Ocean::DoFFT, this, h_tilde_dz, h_tilde_dz, 1, m_prime * N));
        // Wait for this row's threads before moving on to the next row.
        for (std::size_t i = 0; i < threads.size(); i++)
        {
            threads[i].join();
        }
        threads.clear();
    }
    ...
}

// Thin wrapper so the FFT call can be used as a thread entry point.
void Ocean::DoFFT(Complex* in, Complex* out, int stride, int offset)
{
    fft->fft(in, out, stride, offset);
}
With N = 64, the outer loop runs 64 times, creating 5 threads per iteration as written. There are probably a couple of syntax errors in there as I am not fluent in C++, but you get the idea.
I also tried to create a maximum of 4 threads at a time, but it didn't help much (it barely reached 12 fps).
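To make the "maximum of 4 threads" idea concrete, here is a minimal sketch of one way to cap the thread count: split the rows into contiguous chunks and give each chunk to one thread, so threads are created once per update rather than once per FFT call. The helper name ForEachRowChunked is hypothetical and not part of the original code, and this is not necessarily how the capped version above was written.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not from the original code): calls per_row(row) for
// every row in [0, n_rows), with the rows split into contiguous chunks that
// are processed by at most n_workers threads.
void ForEachRowChunked(int n_rows, unsigned n_workers,
                       const std::function<void(int)>& per_row)
{
    assert(n_rows >= 0 && n_workers > 0);

    const int workers = static_cast<int>(n_workers);
    const int chunk = (n_rows + workers - 1) / workers;  // rows per thread, rounded up

    std::vector<std::thread> threads;
    threads.reserve(n_workers);

    for (int w = 0; w < workers; ++w) {
        const int begin = w * chunk;
        const int end = std::min(n_rows, begin + chunk);
        if (begin >= end) break;  // fewer rows than workers: nothing left to hand out

        // Each thread walks its own range of rows sequentially.
        threads.emplace_back([begin, end, &per_row] {
            for (int row = begin; row < end; ++row) per_row(row);
        });
    }

    // Join once, after all chunks have been started.
    for (std::thread& t : threads) t.join();
}
```

Inside Ocean::Update this could be called as ForEachRowChunked(N, 4, [this](int m_prime) { ... }), with the five fft->fft calls for that row in the lambda body. That assumes fft->fft can safely be called from several threads at once, which I have not verified; if the FFT object keeps internal scratch buffers, each thread would need its own instance.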
Any idea what could be wrong here?
Ultimately I'd like to move this code to a compute shader (OpenCL), but I wanted to test it on the CPU first.
Thanks,
Yann