Misery

short OpenMP performance question

Misery    354
Hello,

I have a question about OpenMP directives for small cases. Let's say I have a class that manages big arrays with OpenMP. If an array is big, there's no question that OpenMP should give better performance. But what about an array containing, let's say, 10 elements? Will the compiler be smart enough to avoid creating, for example, 8 threads to process such a small array? Or should I rather use an if-else statement?
[code]
if (N > 100)
{
    // OpenMP code
}
else
{
    // serial code
}
[/code]

If so, how big should the array be for OpenMP to give some advantage?

Thanks in advance,
Regards,
Misery

Washu    7829
You can dictate the chunk size, which would allow you to specify a pseudo-minimum work unit for a thread.
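
As a rough sketch of what dictating the chunk size can look like, the loop below uses the `schedule(static, chunk)` clause. The function name `scale_in_place` and the `min_chunk` parameter are made up for illustration, and any concrete chunk value is an assumption to be tuned by measurement. Note that the chunk size only controls how iterations are dealt out: the thread team is still started even when one thread ends up with all the work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: scales every element of `data` in place.
// `min_chunk` is an illustrative tuning knob, not a recommended value.
void scale_in_place(std::vector<double>& data, double factor, int min_chunk)
{
    assert(min_chunk > 0);  // the chunk size must be a positive integer
    const long n = static_cast<long>(data.size());

    // schedule(static, min_chunk) hands out iterations in blocks of
    // min_chunk, round-robin over the threads of the team. If n is not
    // larger than min_chunk, a single thread receives the whole loop.
    #pragma omp parallel for schedule(static, min_chunk)
    for (long i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}
```

Calling `scale_in_place(v, 3.0, 64)` on a 10-element vector leaves every iteration with one thread, while a vector of several thousand elements is split into blocks of 64. Built without OpenMP support, the pragma is ignored and the loop simply runs serially.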

Misery    354
[quote name='Washu' timestamp='1333441781' post='4927818']
You can dictate the chunk size, which would allow you to specify a pseudo-minimum work unit for a thread.
[/quote]

OK, but still: how much work makes it worth creating the threads?

Antheus    2409
[quote]Will compiler be smart enough to avoid creating for example 8 threads to process such a small array?[/quote]

Hopefully yes.

If it doesn't, it has just shared a cache line across 8 cores, meaning each write forces that line to bounce between cores and stalls them. Considering false sharing is one of the biggest complaints about OMP, I wouldn't trust it to be smart about stuff like that.

But a much more important question here is about the actual work being done. OMP isn't magic, so it should be used for problems that can make use of it. If you are multiplying a 4x4 matrix once, it's a poor fit; the constant overhead will dominate. For large problems, it's not an issue.
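
For the small-versus-large split, OpenMP has an `if` clause on the `parallel` construct, which avoids writing the loop twice. The sketch below is only an illustration: `sum_of_squares` and `kParallelThreshold` are invented names, and the value 10000 is an assumed placeholder, since the real break-even point depends on the work per element and on the machine and has to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed cut-off for illustration only; measure on the target machine.
constexpr std::size_t kParallelThreshold = 10000;

// Hypothetical example: sums the squares of all elements.
double sum_of_squares(const std::vector<double>& data)
{
    const long n = static_cast<long>(data.size());
    double total = 0.0;

    // When the if() condition is false, the region is executed by a team
    // of one thread, so small arrays skip the multi-thread start-up cost.
    #pragma omp parallel for reduction(+:total) if(data.size() > kParallelThreshold)
    for (long i = 0; i < n; ++i) {
        total += data[i] * data[i];
    }
    return total;
}
```

With this in place, a 10-element array runs on the calling thread only, while an array above the threshold is divided among the team. The same source still compiles and gives the same result when OpenMP is disabled.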


The core issue here is the unsolved problem of parallel and concurrent programming: how to distribute the load?

Using an if statement, however, also indicates workloads that vary too wildly for OMP to be a good fit. A task queue would be considerably better. Parallel loops assume even and consistent work; large variations are a contraindication for such processing.
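
One way to get task-queue behaviour without leaving OpenMP is its own `task` construct. This is a swapped-in illustration rather than the dedicated task queue suggested above, and it assumes the work arrives as a collection of arrays of very different sizes; `scale_all` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: scales many arrays of uneven size, one task each.
void scale_all(std::vector<std::vector<double>>& arrays, double factor)
{
    #pragma omp parallel
    #pragma omp single
    {
        // A single thread creates the tasks; the rest of the team
        // executes them as they become available.
        for (std::size_t k = 0; k < arrays.size(); ++k) {
            #pragma omp task firstprivate(k) shared(arrays, factor)
            {
                for (double& x : arrays[k]) {
                    x *= factor;
                }
            }
        }
    }  // implicit barrier: every task has finished past this point
}
```

Because idle threads pick up whichever task is queued next, one huge array and many tiny ones balance out better than a fixed split of loop iterations would. Very small arrays may still cost more as tasks than as plain serial work, so batching several of them into one task is a reasonable refinement.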
