OpenMP and std::copy

4 comments, last by Antheus 12 years ago
Hello,

I am writing a library that will operate on huge arrays.
For now I am building its basics, such as copy functions.
I thought it would be a good idea to split a 1D array into a few smaller chunks and call std::copy on each chunk in its own OpenMP thread, but this approach fails. The errors are different each time, but the cause seems to be a memory access violation.
The code is as follows:


INT_TYPE TN=omp_get_num_threads( );
INT_TYPE dN=N/TN; //array size divided by threads number
TN--; //decrement because threads are numbered from 0 to TN-1
#pragma omp parallel default (none) shared(First,Last,Result,N,dN,TN)
{
INT_TYPE TID=omp_get_thread_num( );
if (TID==TN) std::copy(First+TID*dN,Last,Result);
else std::copy(First+TID*dN,First+(TID+1)*dN,Result); //this line causes an error
}


I checked the ranges many times; the range each copy operates on cannot overlap any other.
Why doesn't it work? Is using the standard library in such a manner a poor idea, or is my concept just bad?

Thanks in advance,
Regards
I don't know what `Result' is, but chances are it's not something where a bunch of threads can generate output at the same time.
It's just a

double* result

which was allocated earlier. Using private pointers to the addresses corresponding to First+TID*dN etc. didn't work either.
memcpy also results in an error.
It is not correct for all threads to write to the same place. And is it `result' or `Result'? Or are you not copying and pasting actual code?

You can do it with memcpy or with copy, but you need to add the offset to the destination as well.
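As a rough sketch of what "add the offset to the destination" could look like: the function name chunked_copy and the parameters first, n and result are made up for illustration and are not the original poster's code. The sketch also asks for the thread count inside the parallel region, because omp_get_num_threads() returns 1 when it is called outside of one. The #ifdef guards are only there so the snippet still builds without OpenMP enabled.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: copies [first, first + n) to result,
// giving each thread one contiguous chunk.
void chunked_copy(const double* first, std::size_t n, double* result)
{
    #pragma omp parallel
    {
#ifdef _OPENMP
        // Only meaningful inside the parallel region.
        const std::size_t threads = static_cast<std::size_t>(omp_get_num_threads());
        const std::size_t tid = static_cast<std::size_t>(omp_get_thread_num());
#else
        const std::size_t threads = 1;
        const std::size_t tid = 0;
#endif
        const std::size_t chunk = n / threads;
        const std::size_t begin = tid * chunk;
        // The last thread also takes the remainder of the division.
        const std::size_t end = (tid + 1 == threads) ? n : begin + chunk;

        // Same offset on the source and on the destination.
        std::copy(first + begin, first + end, result + begin);
    }
}
```

Each thread writes to result + begin, so no two threads touch the same output elements.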

I am not an expert in multithreaded code, but I can't imagine you'll get much gain in the performance of copying memory around by using several threads: The bandwidth of your RAM is probably the limiting factor, and throwing more threads at it might make the situation worse, because now the hardware needs to handle simultaneous accesses.
Well, I also used the proper offset, and result vs. Result is just my mistake, because I am typing the code out, not copying it. Anyway, std::copy is very fast as it is, so I'll give up on threading it :] Thanks for the answers.
The problem here is you're not thinking with portals. Um... OpenMP.


#pragma omp parallel for
for (i = 0; i < N; i++)
result[i] = source[i];

OMP now chooses how to split this into blocks, how many threads to use and so on.

It also exposes a potential flaw, namely 'result' being shared, which may cause certain side-effects and unexpected stalls.
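A self-contained version of the loop above might look like the following. This is only a sketch: the function name parallel_for_copy and the use of std::vector are assumptions for illustration, not something from the thread. The loop index is a signed type because older OpenMP implementations require that.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wrapper around the parallel-for copy.
void parallel_for_copy(const std::vector<double>& source, std::vector<double>& result)
{
    result.resize(source.size());

    const double* src = source.data();
    double* dst = result.data();
    const long n = static_cast<long>(source.size());

    // OpenMP decides how many threads to use and how to split the
    // iteration range; each iteration writes a distinct element.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
        dst[i] = src[i];
}
```

If the compiler is run without OpenMP support, the pragma is ignored and this is just an ordinary serial loop.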

This topic is closed to new replies.
