I am writing a library that will operate on huge arrays.
For now I am building its basics, such as copy functions.
I thought it would be a good idea to split a 1D array into a few smaller chunks and call std::copy on each chunk in its own OpenMP thread, but this approach fails. The errors differ from run to run, but the cause seems to be a memory access violation.
The code is as follows:
INT_TYPE TN = omp_get_num_threads();
INT_TYPE dN = N / TN; // array size divided by the number of threads
TN--; // decrement because threads are numbered from 0 to TN-1
#pragma omp parallel default(none) shared(First, Last, Result, N, dN, TN)
{
    INT_TYPE TID = omp_get_thread_num();
    if (TID == TN) std::copy(First + TID * dN, Last, Result);
    else std::copy(First + TID * dN, First + (TID + 1) * dN, Result); // this line causes an error
}
I checked the ranges many times; there is no way the range one copy operates on overlaps any other.
Why doesn't it work? Is using the standard library in such a manner a poor idea, or is my concept just bad?
It is not correct for all threads to write to the same place. And is it `result' or `Result'? Or are you not copying and pasting actual code?
You can do it with memcpy or with copy, but you need to add the offset to the destination as well.
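To illustrate the point about the destination offset, here is a minimal sketch, not the poster's actual library code. The name chunked_copy and its parameters are made up for illustration. It loops over chunk indices with `parallel for` instead of using thread IDs, so it does not rely on omp_get_num_threads(), which returns 1 when called outside a parallel region. If the posted snippet matches the real code, that call would make dN equal to N and send every thread except thread 0 past the end of the array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the original post): copies the n elements
// starting at `first` to `result`, split into `chunks` independent pieces.
template <typename T>
void chunked_copy(const T* first, std::size_t n, T* result, int chunks)
{
    assert(chunks >= 1);
    const std::size_t dn = n / static_cast<std::size_t>(chunks); // base chunk size

    // Each iteration handles one chunk. Without OpenMP enabled, the pragma is
    // ignored and the loop simply runs serially.
    #pragma omp parallel for
    for (int c = 0; c < chunks; ++c) {
        const std::size_t begin = static_cast<std::size_t>(c) * dn;
        // The last chunk also takes the remainder when n is not divisible.
        const std::size_t end = (c == chunks - 1) ? n : begin + dn;
        // Key fix: the destination is offset by `begin`, just like the source.
        std::copy(first + begin, first + end, result + begin);
    }
}
```

For example, chunked_copy(src.data(), src.size(), dst.data(), 4) would copy a std::vector in four pieces, assuming dst already holds at least src.size() elements. Build with your compiler's OpenMP flag (such as -fopenmp for GCC) to actually run the chunks in parallel.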
I am not an expert in multithreaded code, but I can't imagine you'll get much gain in the performance of copying memory around by using several threads: The bandwidth of your RAM is probably the limiting factor, and throwing more threads at it might make the situation worse, because now the hardware needs to handle simultaneous accesses.
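Rather than guess about the bandwidth limit, you could measure it. The following is only a rough sketch under my own assumptions: time_single_copy is a made-up helper, and one timed run of a plain std::copy is a crude baseline, not a proper benchmark. Comparing its result with a threaded variant on the same array size would show whether extra threads help on your machine.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the seconds taken by one std::copy of n doubles.
double time_single_copy(std::size_t n)
{
    if (n == 0) return 0.0;
    std::vector<double> src(n, 1.0), dst(n, 0.0);

    const auto start = std::chrono::steady_clock::now();
    std::copy(src.begin(), src.end(), dst.begin());
    const auto stop = std::chrono::steady_clock::now();

    // Read one copied element through a volatile so the copy is not
    // discarded as dead code by the optimizer.
    volatile double sink = dst[n / 2];
    (void)sink;

    return std::chrono::duration<double>(stop - start).count();
}
```

Dividing 2 * n * sizeof(double) by the returned time gives a rough figure for bytes read plus bytes written per second. Repeating the measurement several times and using arrays much larger than the CPU caches should make the number more meaningful.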
Well, I also used the proper offset, and `result' vs. `Result' is just my mistake, because I typed the code out instead of copying and pasting it. Anyway, std::copy is very fast, so I'll give up on threading it :] Thanks for the answers.