Archived

This topic is now archived and is closed to further replies.

CProgrammer

sorting question

Recommended Posts

CProgrammer    303
I need a sorting algo that is very fast for sorting arrays that are already almost sorted. QuickSort is very bad at this, so is there an algorithm that is very good? -CProgrammer

Share this post


Link to post
Share on other sites
Kylotan    10008
Insertion sort is what you''re after, I expect. It approaches O(N) for almost-ordered data.

Take a look here also, perhaps: http://www.whitsoftdev.com/qsort/

[ MSVC Fixes | STL Docs | SDL | Game AI | Sockets | C++ Faq Lite | Boost
Asking Questions | Organising code files | My stuff | Tiny XML | STLPort]

Share this post


Link to post
Share on other sites
Sneftel    1788
In what way is the array almost sorted? sorted except for a few randomly inserted elements, or what?


"Sneftel is correct, if rather vulgar." --Flarelocke

Share this post


Link to post
Share on other sites
bastardos    100

You might just want to go with the old-fashioned bubble sort now.

while (not sorted) {
if (next value < last value) swap the values
}

.bas

[sPiKie] mmorpg isnt hard in vb

Share this post


Link to post
Share on other sites
petewood    819
quote:
Original post by CProgrammer
I need a sorting algo that is very fast for sorting arrays that are already almost sorted. QuickSort is very bad at this, so is there an algorithm that is very good?
-CProgrammer


This seems a very specific request. Can you describe the context for us? It may be that there are other containers that are more appropriate which will keep themselves sorted or it may be better to choose where to insert data so that it remains sorted.

Share this post


Link to post
Share on other sites
Guest Anonymous Poster   
Guest Anonymous Poster
Believe it or not, in some cases it is faster to randomize the data before using quicksort. STL provides the random_shuffle algorithm for this. I dunno about insertion sort approaching O(n) but randomizing will give you a single O(n) followed by a very fast quicksort (since the data is randomized). I''d do some playing around with both insertion sort and randomizing before quicksort.

Share this post


Link to post
Share on other sites
CProgrammer    303
AP: Wow, so you think shuffling and then sorting with quickSort is faster than just using bubblesort? What do the others think, cause I dunno.

Kylotan: Thanks I'll look into that as well.
-CProgrammer

[edited by - CProgrammer on January 24, 2004 12:48:14 AM]

Share this post


Link to post
Share on other sites
Guest Anonymous Poster   
Guest Anonymous Poster
Same AP here:

Randomizing the data before a Quicksort will probably always be faster than a bubble sort. Bubble sort is O(n^2) whereas randomizing the data is a single theta(n) cost, plus the randomized data will give the Quicksort nearly best-case performance.

Insertion sort is O(n^2) but it''s expected performance will be a lot better than bubble sort. Insertion sort is probably good on sorted data as the guy above said, but I think his terminology is wrong.

Try both insertion sort and std::random_shuffle+std::sort. I''d guess the guy above is right that insertion sort will win, but you don''t know unless you try.

Share this post


Link to post
Share on other sites
Zahlman    1682
Mergesort doesn''t have bad cases like Quicksort AFAIK. And in my tests it''s nearly as fast as far as the constant goes, and of course it''s also normally O(n lg n).

For short lists, bubble sort *will* be faster than Quicksort or Mergesort written recursively, whatever the data. The value of "short" depends on how well your compiler optimizes tail recursion, and details of implementation. I found the crossover point to be around 40 elements when I did tests for my undergrad thesis (on a Sparc architecture compiling with an outdated version of gcc and no optimization flags).

If you *know* your data is "nearly sorted"... is it possible that you could maintain it in a completely sorted state? Encapsulate your list, with sortedness being an invariant in your class design. If you take that approach, insertion sort is almost certainly your only sensible option - just find the right place to insert via binary search every time you add something.

Share this post


Link to post
Share on other sites
Beer Hunter    712
quote:
Original post by Anonymous Poster
Believe it or not, in some cases it is faster to randomize the data before using quicksort.
Only if your implementation of quicksort always chooses the first or last element as a pivot. There are far better ways to choose a pivot.

Share this post


Link to post
Share on other sites
Megahertz    286
I''ve seen this page around for a while, and from the looks of it the O/E Transpose and the Shear sort blow the pants off the quick sort.

http://www.cs.rit.edu/~atk/Java/Sorting/sorting.html

Is the performance this much greater and if so, why is the shear sort not more recommended over quicksort?

-=[ Megahertz ]=-

Share this post


Link to post
Share on other sites
SiCrane    11839
Did you notice that Shear sort and O/E Transpose are both parallel algorithms? (i.e. use more than one processor. A lot more than one processor to be effective.)

Share this post


Link to post
Share on other sites
The worst case for Quick Sort being sorted order is only true when using a naïve way to pick the pivot. Most STL implementations use far better approaches.

I would really advice you to benchmark it, for linear arrays (i.e. vectors) beating most std::sort implementatins for small datasets (sub million size) can be done using a very naïve merge sort or even just a simple variant of quick sort if you can live with worst case but that doesn''t seems to be possible in this case.

Share this post


Link to post
Share on other sites
Megahertz    286
I did notice that. Perhaps I''m confused as to how these applets run. Are they running on the client side or is there some sort of server side, trickery involved that im not aware of. My assumption obviously is that is client side. (I''m not real knowledgeable in Java so I honestly couldn''t tell you)

If it is client side, I''m then confused as to how it still beats the other sorts. I AM on a single processor machine. I think theres some code there for it, I''ll take a look as see if I can see what im mising about this.


-=[ Megahertz ]=-

Share this post


Link to post
Share on other sites
osmanb    2082
Also, people should remember that std::sort in pretty much EVERY C++ compiler now is not using quicksort, it''s using introsort, which avoids the n^2 behavior of quicksort, even on pathologically bad inputs. And further, once your list size is small enough, that std::sort will switch over to using insertion sort automatically. Unless you have some staggering evidence that using std::sort is too slow for whatever you''re doing, don''t reinvent the wheel.

Share this post


Link to post
Share on other sites