Which sorting algorithm?

Started by
19 comments, last by Rockoon1 16 years, 1 month ago
If you want to see lots of different sorting algorithms, you probably won't find more than I've implemented here.

Possible candidates for common stable array sorting algorithms that are about O(n) on nearly sorted data include:

Improved BubbleSort,
ShakerSort,
InsertionSort,
StrandSort,
SplayTreeSort,
RadixSort.

You can find C++ implementations of all of these by following the above link.
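Of the algorithms above, plain InsertionSort is the simplest way to see the near-O(n) behaviour on almost-sorted input: when each element is already close to its final position, the inner loop exits almost immediately. A minimal sketch (the `int` element type is just for illustration):

```cpp
#include <cstddef>
#include <vector>

// Stable insertion sort. On nearly sorted input the while loop body
// rarely executes, so the whole pass approaches O(n) comparisons.
void insertion_sort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        int key = a[i];
        std::size_t j = i;
        // Shift larger elements right until key's slot is found.
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = key;
    }
}
```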

dmatter: Merge Sort can be done in-place without O(n) extra space. The stable_sort in VS does this.
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
Hm, "the majority of the elements already sorted" wasn't precise enough :)

The first 256 elements are unsorted, and the following 65280 elements are sorted in descending order. I ended up sorting only the first 256 elements with my "stable quicksort" in ascending order and then merging the two sorted lists into a new list.

Since the first part is sorted in ascending order and the second part is sorted in descending order, the merging can be done quite nicely (no need to check if one of the lists is already completely processed).
According to this website:
http://warp.povusers.org/SortComparison/
The fastest sorting algorithm for almost sorted arrays (beginning or end of the array is unsorted) is Shell Sort! :)
Quote:Original post by DevFred
Hm, "the majority of the elements already sorted" wasn't precise enough :)

The first 256 elements are unsorted, and the following 65280 elements are sorted in descending order. I ended up sorting only the first 256 elements with my "stable quicksort" in ascending order and then merging the two sorted lists into a new list.

Since the first part is sorted in ascending order and the second part is sorted in descending order, the merging can be done quite nicely (no need to check if one of the lists is already completely processed).
That's known as a bitonic merge, which as you've noticed can be done easily enough without using extra space.
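A minimal sketch of that bitonic merge, assuming the two runs live in one array (ascending run followed by descending run) with `int` elements: because both cursors index the same array and walk toward each other, the loop needs no "is this run exhausted?" test.

```cpp
#include <cstddef>
#include <vector>

// Merge a bitonic array (ascending run followed by a descending run)
// into a new ascending array. No exhaustion checks: the two cursors
// start at the two smallest candidates and simply walk until they cross.
std::vector<int> bitonic_merge(const std::vector<int>& a) {
    std::vector<int> out;
    out.reserve(a.size());
    if (a.empty()) return out;
    std::size_t lo = 0, hi = a.size() - 1;
    while (lo <= hi) {
        if (a[lo] <= a[hi]) out.push_back(a[lo++]);
        else                out.push_back(a[hi--]);
    }
    return out;
}
```

For example, on {1, 4, 7, 9, 5, 2} the cursors compare 1 against 2, then 4 against 2, and so on, producing the fully sorted result in a single pass.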

Note, I did not suggest ShellSort because it is not stable. It could have been used, but be wary of what guardian24 just mentioned: ShellSort is reasonably good for reasonably sorted data, but it won't beat plain InsertionSort on data that is very close to already sorted. The reason is that ShellSort does far more comparisons in that case.
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
Quote:Original post by DevFred
The first 256 elements are unsorted, and the following 65280 elements are sorted in descending order. I ended up sorting only the first 256 elements with my "stable quicksort" in ascending order and then merging the two sorted lists into a new list.

If they're already in two lists, and you want to keep the first list sorted, why are you not using a sorted list in the first place?

Insertions involve an O(log n) search repeated for each of the m elements to be inserted. (Any of binary_search, lower_bound, or upper_bound used with insert.) That's much faster than appending your 256 elements and then re-sorting everything. It is roughly the same computational work as sorting your 256 elements and then doing an insertion sort, but much less programmer work.
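A sketch of that search-then-insert step, assuming a sorted std::vector<int> as the container (the post above doesn't fix a container; note that with a contiguous container the insert itself still shifts the tail):

```cpp
#include <algorithm>
#include <vector>

// Insert `value` into an already-sorted vector, keeping it sorted.
// std::upper_bound finds the slot in O(log n) comparisons and places
// new duplicates after existing equal elements (insertion-order stable);
// vector::insert then shifts the tail, which is O(n) element moves.
void sorted_insert(std::vector<int>& sorted, int value) {
    auto pos = std::upper_bound(sorted.begin(), sorted.end(), value);
    sorted.insert(pos, value);
}
```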
Quote:Original post by frob
Quote:Original post by DevFred
The first 256 elements are unsorted, and the following 65280 elements are sorted in descending order. I ended up sorting only the first 256 elements with my "stable quicksort" in ascending order and then merging the two sorted lists into a new list.

If they're already in two lists, and you want to keep the first list sorted, why are you not using a sorted list in the first place?

Insertions involve an O(log n) search repeated for each of the m elements to be inserted. (Any of binary_search, lower_bound, or upper_bound used with insert.) That's much faster than appending your 256 elements and then re-sorting everything. It is roughly the same computational work as sorting your 256 elements and then doing an insertion sort, but much less programmer work.


You forgot to count the cost of an insertion.

I believe the only data structure that offers both an O(log n) search and an O(1) insertion is a tree.

Certainly there is no "list" data structure that offers both. A linear sorted array offers O(log n) search but O(n) insertion, while a linked list offers an O(n) search and an O(1) insertion.

Neither of these (obviously) resolves down to something as good as an O(n log n) sort. The former resolves to O(n^2) (the searches total only O(n log n), but the element shifts dominate) and the latter to O(n^2) as well .. neither is as good as you thought, right?

I realize that some linked-list variants offer a better-than-O(n) average search while still offering O(~1) insertion, but implementing one of those seems like a lot of work when the theoretical best it can do is in fact O(n log n), just like every decent comparison-based sort.

Additionally, linked lists and trees are generally not cache friendly. Cache-oblivious implementations are often quite complex and relatively "new", so you cannot expect any help from a compiler's standard library.

Seriously, his best bet is to load those 256 items into an array, sort it with a solid O(n log n) algorithm, and then do a bitonic merge of his two lists.
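That whole pipeline can be sketched with the standard library, assuming `int` elements and taking the head length as a parameter: sorting the small head is O(m log m), and viewing the descending tail through reverse iterators turns it into an ascending range that std::merge can consume directly.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// `data` = `head` unsorted elements followed by a descending-sorted tail.
// Returns the fully sorted (ascending) result.
std::vector<int> sort_and_merge(std::vector<int> data, std::ptrdiff_t head) {
    std::stable_sort(data.begin(), data.begin() + head);  // sort the small head
    std::vector<int> out;
    out.reserve(data.size());
    std::merge(data.begin(), data.begin() + head,   // ascending head
               data.rbegin(), data.rend() - head,   // descending tail, read backwards
               std::back_inserter(out));
    return out;
}
```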
Quote:Original post by Rockoon1
You forgot to count the cost of an insertion.

I believe the only data structure that offers both an O(log n) search and an O(1) insertion is a tree.

Certainly there is no "list" data structure that offers both. A linear sorted array offers O(log n) search but O(n) insertion, while a linked list offers an O(n) search and an O(1) insertion.

There are several binary search algorithms for previously sorted containers.


std::binary_search()
Complexity: At most log(last - first) + 2 comparisons.

If you are trying to get a stable sort, you can use
std::upper_bound() or std::lower_bound().
Both take at most log(last - first) + 1 comparisons.

All three return an iterator suitable for insert().

Coupled with a std::list<>, amortized performance is as I described above.
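For comparison, here is that std::list variant sketched literally; the caveat raised later in the thread applies, since std::lower_bound on a list's bidirectional iterators makes O(log n) comparisons but must still advance iterators O(n) times.

```cpp
#include <algorithm>
#include <list>

// Search-then-insert on a std::list. The comparison count is
// logarithmic, but std::lower_bound still traverses the list
// linearly, so each insertion is O(n) overall despite the O(1) splice.
void list_sorted_insert(std::list<int>& sorted, int value) {
    auto pos = std::lower_bound(sorted.begin(), sorted.end(), value);
    sorted.insert(pos, value);  // O(1) once the position is known
}
```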
Quote:Original post by Rockoon1
You forgot to count the cost of an insertion.

I believe the only data structure that offers both an O(log n) search and an O(1) insertion is a tree.

Certainly there is no "list" data structure that offers both. A linear sorted array offers O(log n) search but O(n) insertion, while a linked list offers an O(n) search and an O(1) insertion.

Hmm, IIRC a std::deque typically offers O(~log n) search and O(~1) insertion - don't quote me on that though, I don't know what the actual complexity guarantees of the standard say about this.
Quote:Original post by frob
There are several binary search algorithms for previously sorted containers.


std::binary_search()
Complexity: At most log(last - first) + 2 comparisons.

If you are trying to get a stable sort, you can use
std::upper_bound() or std::lower_bound().
Both take at most log(last - first) + 1 comparisons.

All three return an iterator suitable for insert().

Coupled with a std::list<>, amortized performance is as I described above.


You can't do a binary search on a linked list, and any implementation of lower_bound or upper_bound for a linked list will be O(N). The only way to efficiently insert the elements one at a time is if you're using a tree or a similar structure (like a skip list). Linked lists are inefficient for searching and arrays are inefficient for inserting, so in either case merging is more efficient.

Quote:Original post by dmatter
Hmm, IIRC a std::deque typically offers O(~log n) search and O(~1) insertion - don't quote me on that though, I don't know what the actual complexity guarantees of the standard say about this.


A deque offers O(1) insertion only at the beginning and end. Insertions at random locations are O(N).
Quote:Original post by Vorpy
Quote:Original post by dmatter
Hmm, IIRC a std::deque typically offers O(~log n) search and O(~1) insertion - don't quote me on that though, I don't know what the actual complexity guarantees of the standard say about this.


A deque offers O(1) insertion only at the beginning and end. Insertions at random locations are O(N).

Yeah, that's what I meant.
Having now applied a few more brain cells to this, it occurs to me that Rockoon1 probably meant insertions at arbitrary locations. [rolleyes]

This topic is closed to new replies.
