Misery

OpenMP and std::copy


Hello,

I am writing a library that will operate on huge arrays.
For now I am building its basics, such as copy functions.
I thought it would be a good idea to split a 1D array into a few smaller chunks and call std::copy on each chunk in its own OpenMP thread, but this approach fails. The errors differ from run to run, but the cause appears to be a memory access violation.
The code is as follows:


INT_TYPE TN = omp_get_num_threads();
INT_TYPE dN = N / TN; // array size divided by the number of threads
TN--; // decrement because threads are numbered from 0 to TN-1
#pragma omp parallel default(none) shared(First, Last, Result, N, dN, TN)
{
    INT_TYPE TID = omp_get_thread_num();
    if (TID == TN)
        std::copy(First + TID * dN, Last, Result);
    else
        std::copy(First + TID * dN, First + (TID + 1) * dN, Result); // this line causes an error
}


I have checked the ranges many times; the range each copy operates on cannot overlap any other.
Why doesn't it work? Is using the standard library in such a manner a poor idea, or is my concept just bad?

Thanks in advance,
Regards

I don't know what `Result' is, but chances are it's not something where a bunch of threads can generate output at the same time.

It's just a

double* result

which was allocated earlier. Using private pointers to the addresses corresponding to First+TID*dN etc. didn't work either.
memcpy also results in an error.

It is not correct for all threads to write to the same place. And is it `result' or `Result'? Or are you not copying and pasting actual code?

You can do it with memcpy or with copy, but you need to add the offset to the destination as well.
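For illustration, here is a minimal sketch of the chunked copy with the same offset applied to both source and destination. The function name `chunked_copy` and the use of `std::ptrdiff_t` are my own assumptions, not the original poster's code. One more thing worth checking: `omp_get_num_threads()` returns 1 when called outside a parallel region, so the sketch queries it inside the region. If the original `TN` was computed outside, `dN` would equal `N` and every thread but the first would read past the end of the array, which might explain the access violations.

```cpp
#include <algorithm>
#include <cstddef>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: copies [first, last) to result, one contiguous chunk per thread.
void chunked_copy(const double* first, const double* last, double* result)
{
    const std::ptrdiff_t n = last - first;

    #pragma omp parallel
    {
#ifdef _OPENMP
        // Query the team size INSIDE the parallel region; outside it is always 1.
        const std::ptrdiff_t threads = omp_get_num_threads();
        const std::ptrdiff_t tid = omp_get_thread_num();
#else
        // Fallback when compiled without OpenMP: a single "thread" copies everything.
        const std::ptrdiff_t threads = 1;
        const std::ptrdiff_t tid = 0;
#endif
        const std::ptrdiff_t chunk = n / threads;
        const std::ptrdiff_t begin = tid * chunk;
        // The last thread also takes the remainder when n is not divisible by threads.
        const std::ptrdiff_t end = (tid == threads - 1) ? n : begin + chunk;

        // The same offset is applied to the source AND the destination.
        std::copy(first + begin, first + end, result + begin);
    }
}
```

It would be called as `chunked_copy(First, Last, Result)`. Build with your compiler's OpenMP switch (for example `-fopenmp` on GCC) to get more than one thread.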

I am not an expert in multithreaded code, but I can't imagine you'll get much gain in the performance of copying memory around by using several threads: The bandwidth of your RAM is probably the limiting factor, and throwing more threads at it might make the situation worse, because now the hardware needs to handle simultaneous accesses.
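Rather than take my word on the bandwidth point, it is easy to measure. A rough sketch, assuming C++11; `time_copy_ms` and `serial_copy_ms` are hypothetical helper names, and a single run like this is only a crude indication, not a proper benchmark:

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs copy_fn once and returns the elapsed wall time in milliseconds.
template <typename CopyFn>
double time_copy_ms(CopyFn copy_fn)
{
    const auto start = std::chrono::steady_clock::now();
    copy_fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Times a plain single-threaded std::copy of n doubles.
double serial_copy_ms(std::size_t n)
{
    std::vector<double> source(n, 1.0);
    std::vector<double> result(n); // value-initialised, so its pages are already touched
    return time_copy_ms([&] {
        std::copy(source.begin(), source.end(), result.begin());
    });
}
```

Any threaded variant can be passed to `time_copy_ms` as a lambda in the same way. Use arrays much larger than the CPU caches and repeat the measurement several times before drawing conclusions.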

Well, I also tried the proper offset. As for result vs. Result, that's just my mistake, because I am typing the code out here, not copying and pasting it. Anyway, std::copy is very fast, so I'll give up on threading it :] Thanks for the answers.

The problem here is you're not thinking with portals. Um... OpenMP.


#pragma omp parallel for
for (int i = 0; i < N; i++)
    result[i] = source[i];

OMP now chooses how to split this into blocks, how many threads to use and so on.

It also exposes a potential flaw, namely 'result' being shared, which may cause certain side-effects and unexpected stalls.
