[C++] - Multithreaded Bubble Sort

3 comments, last by Shannon Barber 11 years, 4 months ago
Hello,
I'm working on a multithreaded bubble sort example in C/C++.
I have an array of N numbers,
then I create M threads,
and divide my array of numbers into M parts - one part per thread.
Then I sort (N / M) numbers in each thread...
but what should I do next?
I don't know how to "merge" the results of the multithreaded sorting.
Right now I end up with an array that has M sorted sections, but I need it to be sorted entirely.
Any ideas?
PS: I need to use THREADS and Bubble sort together. The question is only how to merge the results....
Thanks for any advice and comments
Look up merge sort - it inherently merges two sorted lists together, you'll find how to do it there. The result is basically a simplified selection sort, taking advantage of the fact that the two lists are already sorted, which runs in O(n) time and uses a temporary array. You could extend this to an arbitrary number of lists, or you could do the merging in parallel too (which would be more useful, because doing the merge step on a single thread defeats your use of multithreading in the sorting step).
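
For what it's worth, the merge step described above could look roughly like this. It is a minimal sketch, assuming int elements held in std::vector; the names merge_sorted and merge_all are made up for illustration, and merge_all simply folds the sections together one at a time on a single thread rather than merging in parallel.

```cpp
#include <cstddef>
#include <vector>

// Merge two already-sorted vectors into a new sorted vector.
// Runs in O(a.size() + b.size()) and uses one temporary output array.
std::vector<int> merge_sorted(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        // Take the smaller head; <= keeps equal elements from `a` first.
        if (a[i] <= b[j]) out.push_back(a[i++]);
        else              out.push_back(b[j++]);
    }
    // One input is exhausted; copy whatever remains of the other.
    while (i < a.size()) out.push_back(a[i++]);
    while (j < b.size()) out.push_back(b[j++]);
    return out;
}

// Hypothetical helper: merge M sorted sections by folding them in one by one.
// Simple, but each fold re-copies the accumulated result, so pairwise
// (tree-shaped) merging would do less work for large M.
std::vector<int> merge_all(const std::vector<std::vector<int>>& sections) {
    std::vector<int> result;
    for (const auto& section : sections) result = merge_sorted(result, section);
    return result;
}
```

The standard library's std::merge does the same two-way merge if you would rather not hand-roll it.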

Is this for a school assignment, by the way? If not, you might be better served by trying to implement a multithreaded merge sort, which is more interesting and is actually useful.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

Have a look at parallel odd-even sorting if you really have to go with a bubble sort approach.
If this is about implementing an actually efficient parallel sorting algorithm, you should look at other approaches, possibly an odd-even mergesort, which can be implemented as a sorting network built from simple compare-and-swap (CAS) elements.
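
A rough sketch of what parallel odd-even (transposition) sorting can look like, under some assumptions of mine: int elements, a function name (odd_even_sort) invented here, and threads re-created for every phase to keep the code short. A real implementation would more likely keep a fixed set of threads and synchronize them with a barrier.

```cpp
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Odd-even transposition sort: alternate between comparing pairs that start
// at even indices and pairs that start at odd indices. Pairs within a phase
// never overlap, so they can be compared and swapped without any locking.
void odd_even_sort(std::vector<int>& data, unsigned num_threads) {
    const std::size_t n = data.size();
    if (n < 2) return;
    if (num_threads == 0) num_threads = 1;
    // n phases are always enough for this algorithm.
    for (std::size_t phase = 0; phase < n; ++phase) {
        const std::size_t first = phase % 2;        // even phases start at 0, odd at 1
        const std::size_t pairs = (n - first) / 2;  // disjoint pairs in this phase
        std::vector<std::thread> workers;
        for (unsigned t = 0; t < num_threads; ++t) {
            // Thread t handles pairs [lo, hi) of this phase.
            const std::size_t lo = pairs * t / num_threads;
            const std::size_t hi = pairs * (t + 1) / num_threads;
            if (lo == hi) continue;
            workers.emplace_back([&data, first, lo, hi] {
                for (std::size_t p = lo; p < hi; ++p) {
                    const std::size_t i = first + 2 * p;
                    if (data[i] > data[i + 1]) std::swap(data[i], data[i + 1]);
                }
            });
        }
        // Joining here acts as the barrier between phases.
        for (auto& w : workers) w.join();
    }
}
```

Each phase is just the compare-and-swap step of bubble sort applied to non-overlapping pairs, which is why this is usually described as the parallel form of bubble sort.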

I gets all your texture budgets!


Look up merge sort - it inherently merges two sorted lists together, you'll find how to do it there. The result is basically a simplified selection sort, taking advantage of the fact that the two lists are already sorted, which runs in O(n) time and uses a temporary array. You could extend this to an arbitrary number of lists, or you could do the merging in parallel too (which would be more useful, because doing the merge step on a single thread defeats your use of multithreading in the sorting step).

Is this for a school assignment, by the way? If not, you might be better served by trying to implement a multithreaded merge sort, which is more interesting and is actually useful.

I second this. Split array into two, spawn two threads and each handle its own segment. Recursively do this until you reach the smallest subset.

Merging the array shouldn't be too complicated either. Since each thread spawns two subthreads, you just wait until both threads have finished executing, then merge, then flag the parent thread.
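
Something along these lines, as a sketch only. Two things here are my own assumptions and not part of the description above: the depth parameter, which stops spawning threads after a few levels instead of recursing with threads all the way down to single elements, and the use of std::async and std::inplace_merge in place of hand-written thread and merge code. The function name is made up.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <vector>

// Sort data[lo, hi) by recursive halving. While depth > 0 the left half is
// sorted on a new thread and the right half on the current one; below that
// the recursion continues on a single thread.
void parallel_merge_sort(std::vector<int>& data, std::size_t lo, std::size_t hi, int depth) {
    if (hi - lo < 2) return;
    const std::size_t mid = lo + (hi - lo) / 2;
    if (depth > 0) {
        auto left = std::async(std::launch::async, [&data, lo, mid, depth] {
            parallel_merge_sort(data, lo, mid, depth - 1);
        });
        parallel_merge_sort(data, mid, hi, depth - 1);
        left.get();  // wait for the child thread before merging
    } else {
        parallel_merge_sort(data, lo, mid, 0);
        parallel_merge_sort(data, mid, hi, 0);
    }
    // Both halves are sorted now; merge them in linear time.
    const auto first = data.begin() + static_cast<std::ptrdiff_t>(lo);
    const auto middle = data.begin() + static_cast<std::ptrdiff_t>(mid);
    const auto last = data.begin() + static_cast<std::ptrdiff_t>(hi);
    std::inplace_merge(first, middle, last);
}
```

Calling parallel_merge_sort(v, 0, v.size(), 2) would use up to four threads at the deepest threaded level.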
This is actually kind of hard to do. Everyone is saying merge sort because if you sit down to make bubble sort a parallel algorithm, you quickly realize you have to execute it 'backwards' to make it an easily parallelizable problem. Then it turns into a merge sort.

Bubble sort iterates over the entire array over and over until nothing changes position.
Merge sort divides and conquers, sort top-half, sort bottom-half, recursive, unwind and shuffle into place (linear algorithm now since both sub-arrays are known-sorted).

To parallelize bubble sort you have to lock each element of the array.
You need a parallel array of spin-locks, one per element (that's POSIX; on Win32 the closest equivalent is a 'critical section'), and you need to lock the two elements you are about to compare. Compare, swap if needed, then unlock. Each lock stays with its array position; only the values are swapped.

Now you break the array into n pieces, one piece for each thread.
You need to check the element before and after your chunk (don't blow the bounds of the array!) to see if you need to swap them.
Each thread bubble-sorts its chunk of the array starting from the top down.
The thread pauses (use a semaphore) if it makes a pass and nothing is swapped.
If another thread tosses a new element into a chunk, it has to kick that chunk's semaphore to tell that thread it has to start sorting again.
Keep going until all threads are paused and it should be done.
Then you set an exit flag and kick the semaphores to shake-them-loose and terminate.
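
Here is a simplified sketch of the locked compare-and-swap idea, with some deliberate substitutions that I want to be upfront about: std::mutex stands in for spin-locks, and the semaphore pause/wake protocol is replaced by whole rounds that the main thread joins, repeating until a round makes no swap. In this sketch the locks are tied to array positions and are never moved. The function name is made up.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

void locked_bubble_sort(std::vector<int>& data, unsigned num_threads) {
    const std::size_t n = data.size();
    if (n < 2) return;
    if (num_threads == 0) num_threads = 1;
    std::vector<std::mutex> locks(n);   // locks[i] guards position i of data
    const std::size_t pairs = n - 1;    // pair i is (data[i], data[i + 1])
    std::atomic<bool> swapped{true};
    while (swapped.load()) {            // stop after a round with no swaps
        swapped.store(false);
        std::vector<std::thread> workers;
        for (unsigned t = 0; t < num_threads; ++t) {
            // Thread t owns pairs [lo, hi); its last pair reaches one element
            // into the next chunk, which is why the locks are needed.
            const std::size_t lo = pairs * t / num_threads;
            const std::size_t hi = pairs * (t + 1) / num_threads;
            if (lo == hi) continue;
            workers.emplace_back([&data, &locks, &swapped, lo, hi] {
                for (std::size_t i = lo; i < hi; ++i) {
                    // Always lock the lower index first so threads cannot deadlock.
                    std::lock_guard<std::mutex> a(locks[i]);
                    std::lock_guard<std::mutex> b(locks[i + 1]);
                    if (data[i] > data[i + 1]) {
                        std::swap(data[i], data[i + 1]);
                        swapped.store(true);
                    }
                }
            });
        }
        for (auto& w : workers) w.join();
    }
}
```

A round without any swap means every adjacent pair was seen in order while nothing moved, so the array is sorted at that point. Locking two mutexes per comparison is expensive, so treat this as an illustration of the scheme rather than something that will beat a single-threaded sort.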
- The trade-off between price and quality does not exist in Japan. Rather, the idea that high quality brings on cost reduction is widely accepted.-- Tajima & Matsubara
I'm guessing this is homework or a uni assignment or such, because no sane person would ask you to use bubble sort with threads for something real. Bubble sort is probably the most embarrassingly poor sort algorithm in existence, and throwing "multithreading" at a poor algorithm to make it faster (when better single-threaded algorithms exist) is a poor approach.

However, after thinking about it for a minute just now, I figured that it is actually a quite interesting exercise. And, in fact, not so silly at all.

Bubble sort is, surprisingly, actually quite a good fit for multithreading (not perfect, but quite good!). It runs several passes over the complete set of data, only ever examining two adjacent values and swapping them if they're not in order. It's an O(N^2) algorithm on average, so smaller pieces of data are considerably faster to sort: a partition 1/4 the size needs about 1/16 of the operations. With 4 threads each sorting one such partition in parallel, the total work drops to about 1/4 and the elapsed time of the sorting step to about 1/16. For that step you're doing better than the linear speedup that Amdahl's law allows as a best case, although the merge afterwards still has to be paid for.
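
To spell out the arithmetic, here is a rough operation count. It assumes bubble sort costs about N^2/2 comparisons on N elements and that the data is split into M equal partitions (N and M as in the original question); the symbol C is just notation introduced for this note.

```latex
\begin{align*}
C_{\text{single}}    &\approx \frac{N^2}{2} \\
C_{\text{partition}} &\approx \frac{1}{2}\left(\frac{N}{M}\right)^2 = \frac{N^2}{2M^2} \\
C_{\text{total}}     &\approx M \cdot \frac{N^2}{2M^2} = \frac{N^2}{2M}
\end{align*}
```

For M = 4 each partition costs 1/16 of the single-threaded sort, and all four together cost 1/4. The cost of combining the partitions afterwards is not included.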

You can trivially partition the set into N pieces and run the N pieces in N threads, in parallel. You can then, after syncing at a barrier, merge the partitions by taking the i-th element of every partition in turn, which gives an "almost sorted" dataset. On a perfectly evenly distributed dataset, it would be sorted, not "almost sorted", but of course you want it to work for any data. A final run of bubble sort over the whole set makes the "almost sorted" set sorted. Bubble sort handles "almost sorted" data efficiently: the number of passes it needs is set by how far the most displaced element has to travel toward the front, so the closer the interleaved data is to sorted, the fewer passes remain.
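
A sketch of the three steps just described (sort partitions in parallel, interleave, final bubble sort), assuming int elements in a std::vector. The function names and the parts parameter are invented for this example, and joining the threads plays the role of the barrier.

```cpp
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Plain bubble sort on data[lo, hi); stops after a pass without swaps.
void bubble_sort_range(std::vector<int>& data, std::size_t lo, std::size_t hi) {
    bool swapped = true;
    while (swapped) {
        swapped = false;
        for (std::size_t i = lo; i + 1 < hi; ++i) {
            if (data[i] > data[i + 1]) {
                std::swap(data[i], data[i + 1]);
                swapped = true;
            }
        }
    }
}

void threaded_bubble_sort(std::vector<int>& data, std::size_t parts) {
    const std::size_t n = data.size();
    if (n < 2) return;
    if (parts == 0) parts = 1;
    if (parts > n) parts = n;
    // Partition k covers [bounds[k], bounds[k + 1]).
    std::vector<std::size_t> bounds(parts + 1);
    for (std::size_t k = 0; k <= parts; ++k) bounds[k] = n * k / parts;

    // Step 1: one thread per partition; join() is the barrier.
    std::vector<std::thread> workers;
    for (std::size_t k = 0; k < parts; ++k) {
        const std::size_t lo = bounds[k], hi = bounds[k + 1];
        workers.emplace_back([&data, lo, hi] { bubble_sort_range(data, lo, hi); });
    }
    for (auto& w : workers) w.join();

    // Step 2: interleave, taking the i-th element of every partition in turn.
    std::vector<int> mixed;
    mixed.reserve(n);
    const std::size_t longest = (n + parts - 1) / parts;
    for (std::size_t i = 0; i < longest; ++i)
        for (std::size_t k = 0; k < parts; ++k)
            if (bounds[k] + i < bounds[k + 1]) mixed.push_back(data[bounds[k] + i]);
    data.swap(mixed);

    // Step 3: a final single-threaded bubble sort finishes the job.
    bubble_sort_range(data, 0, n);
}
```

The result is correct for any input because step 3 is a full bubble sort; how much time steps 1 and 2 save depends on how close to sorted the interleaved data turns out to be.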

Now, if you want to do it more elegantly than the trivial approach of N threads sorting N pieces, you can for example have N threads sort 4*N pieces, using a worker queue. This has a little added complexity, but considers that not all sub-partitions will take the same number of iterations. Thus, you avoid CPU cores going idle while they wait for the slowest one to finish.
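
One simple way to get that kind of worker queue is a shared atomic counter that hands out piece indices. The sketch below makes that assumption (a real queue class would work just as well), uses names I made up, and only covers the sorting of the pieces; combining them afterwards is a separate step.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Plain bubble sort on data[lo, hi); stops after a pass without swaps.
void bubble_sort_range(std::vector<int>& data, std::size_t lo, std::size_t hi) {
    bool swapped = true;
    while (swapped) {
        swapped = false;
        for (std::size_t i = lo; i + 1 < hi; ++i) {
            if (data[i] > data[i + 1]) {
                std::swap(data[i], data[i + 1]);
                swapped = true;
            }
        }
    }
}

// Split data into 4 * num_threads pieces and let the threads pull pieces
// until none are left. Returns the piece boundaries for the later merge.
std::vector<std::size_t> sort_pieces(std::vector<int>& data, unsigned num_threads) {
    const std::size_t n = data.size();
    if (num_threads == 0) num_threads = 1;
    std::size_t pieces = 4 * static_cast<std::size_t>(num_threads);
    if (pieces > n) pieces = (n > 0) ? n : 1;
    // Piece k covers [bounds[k], bounds[k + 1]).
    std::vector<std::size_t> bounds(pieces + 1);
    for (std::size_t k = 0; k <= pieces; ++k) bounds[k] = n * k / pieces;

    std::atomic<std::size_t> next{0};  // index of the next unclaimed piece
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&data, &bounds, &next, pieces] {
            for (;;) {
                const std::size_t k = next.fetch_add(1);  // claim a piece
                if (k >= pieces) break;                   // queue is empty
                bubble_sort_range(data, bounds[k], bounds[k + 1]);
            }
        });
    }
    for (auto& w : workers) w.join();
    return bounds;
}
```

A thread that draws pieces needing few passes simply comes back for more, so no core sits idle waiting for the slowest piece.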

