Sorting algorithm

29 comments, last by snk_kid 19 years, 5 months ago
Hi, I have some code that looks something like this:

struct SORT {
    int index;
    int value;
};

SORT sort[10];
for (int n = 0; n < 10; n++) {
    sort[n].index = n;
    sort[n].value = rand() % 32;
}

Now I want to order the indices so that the element with the greatest value has its index member set to 0, etc. But I can't figure out how to do this. Please help.
Problems everywhere
Quicksort will do this for you. The link is to Java source, but it's easy to understand.
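If you'd rather stay in C++, here is a minimal quicksort sketch (Lomuto-style partition; the function name and layout are illustrative, not taken from the linked source):

// Minimal quicksort on an int array (Lomuto partition scheme).
// Sorts a[lo..hi] in ascending order. Recursion depth can be
// O(n) in the worst case: fine for a sketch, not for production.
void quicksort(int a[], int lo, int hi)
{
    if (lo >= hi) return;
    int pivot = a[hi];                  // last element as pivot
    int i = lo;
    for (int j = lo; j < hi; j++) {
        if (a[j] < pivot) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
            i++;
        }
    }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;   // put pivot in place
    quicksort(a, lo, i - 1);
    quicksort(a, i + 1, hi);
}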
[size="1"] [size="4"]:: SHMUP-DEV ::
For large datasets, radix sort is unbeatable.

AFAIK it's the only sorting algorithm that runs in linear time, and it's not hard to implement either.
Actually, radix sort is O(n) only if you allow it to have O(n²) memory complexity, which can sometimes be impossible (I mean, just imagine what you would need to sort 100,000 integers), and is usually a big drawback on huge datasets.

Otherwise, radix sort usually runs in O(n log N), where N is the size of the biggest integer, because it reduces the number of bins by increasing the number of passes. Sadly, if the biggest integer is n (if, for instance, you're sorting a set of mostly distinct values), it will be in O(n log n).

- Quicksort is fast, O(n log n), and has a very tight inner loop (which allows it to be faster in practice than other O(n log n) sorts), but it has O(n²) worst-case complexity. If you know your data is not already sorted or almost sorted, use it.

- Merge sort is theoretically fast, O(n log n), but does not work well with arrays (unlike the others, it cannot be done in place). On the other hand, it's a good alternative for lists.

- Heap sort is fast as well, O(n log n), can be done in place just like quicksort, and does not suffer from the same worst-case complexity as quicksort. If your data might be almost totally sorted, use it instead of quicksort.

- Bubble sort is slow, O(n²), but is useful if your data is almost sorted (in which case it runs close to O(n), making it the fastest sort algorithm to run on sorted data).
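To show that last point concretely, here is a bubble sort sketch with the usual early-exit flag (names are mine); on already-sorted input it stops after a single O(n) pass:

// Bubble sort with an early-exit flag: if a full pass performs
// no swaps, the array is already sorted and we stop.
void bubble_sort(int a[], int n)
{
    bool swapped = true;
    while (swapped) {
        swapped = false;
        for (int j = 0; j + 1 < n; j++) {
            if (a[j] > a[j + 1]) {
                int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t;
                swapped = true;
            }
        }
        n--;   // the largest element of this pass is now in place
    }
}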

A final word: if you're using C++, sort algorithms are available as a standard part of the language, such as std::sort in <algorithm>, which sorts a range.
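Applied to the original question, a sketch might look like this (reusing the SORT struct from the first post; the array and comparison-function names are mine):

#include <algorithm>
#include <cstdlib>

struct SORT { int index; int value; };

// Order by value, greatest first.
bool by_value_desc(const SORT& a, const SORT& b)
{
    return a.value > b.value;
}

int main()
{
    SORT items[10];
    for (int n = 0; n < 10; n++) {
        items[n].index = n;
        items[n].value = rand() % 32;
    }

    std::sort(items, items + 10, by_value_desc);

    // items[0] now holds the greatest value. If the index member
    // should end up as the rank (greatest = 0, as the original
    // post asked), reassign it after sorting:
    for (int n = 0; n < 10; n++)
        items[n].index = n;

    return 0;
}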
The above was me... looks like it didn't accept my password.
I'm fairly certain that radix sort can be done in O(n + k) memory and O(k * n) time complexity, where n is the number of elements and k is the size of each element (usually in bytes, since that keeps the lookup table small). Since k would be a constant per application, it simplifies to O(n) on both counts.
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
Extrarius: yes, k is the number of digits, in base B (in your case, B = 256), of N, the largest number. So k = log_B N, hence my O(n log N) statement.

However, in the computer world, n is smaller than N, so quicksort, merge sort, and heap sort are all in O(n log N) (because n log n < n log N), which can all be simplified to O(n) by your reasoning.

By saying that "k is a constant", you're reducing the problem from a general setting to the 64-bit or 32-bit integer environment we work with. And in this environment, log n is also smaller than a constant (the maximum pointer size), so all O(n log n) complexities end up being O(n).
Quote: Original post by ToohrVyk
Extrarius: yes, k is the number of digits, in base B (in your case, B = 256), of N, the largest number. So k = log_B N, hence my O(n log N) statement.

However, in the computer world, n is smaller than N, so quicksort, merge sort, and heap sort are all in O(n log N) (because n log n < n log N), which can all be simplified to O(n) by your reasoning.

By saying that "k is a constant", you're reducing the problem from a general setting to the 64-bit or 32-bit integer environment we work with. And in this environment, log n is also smaller than a constant (the maximum pointer size), so all O(n log n) complexities end up being O(n).

I'm sorry, but I don't think I get you.

Maybe what you're trying to say is that radix sort depends on the actual number of bits to sort, instead of the number of entries? That would be totally true, yet it's still linear in the number of entries nonetheless. Sure, sorting a dataset of 64-bit numbers will be twice as slow as an equal-sized dataset of 32-bit numbers, but as I said: for large datasets, linear time complexity just can't be beaten.

As far as memory consumption is concerned: it's also linear. True, it's not in-place, but you only need one back buffer plus a couple of bytes for an offset/count table. Not a big deal either, really.
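For what it's worth, a minimal sketch of that scheme: byte-wise (base 256) LSD radix sort on unsigned ints, with one back buffer and a 256-entry count/offset table (the names are mine):

#include <cstring>

// LSD radix sort, one byte per pass (four passes for 32 bits).
// 'buf' is the single back buffer; 'count' doubles as the offset
// table after the prefix sum.
void radix_sort(unsigned int* a, unsigned int* buf, int n)
{
    for (int shift = 0; shift < 32; shift += 8) {
        int count[256] = { 0 };
        for (int i = 0; i < n; i++)          // histogram pass
            count[(a[i] >> shift) & 0xFF]++;
        int offset = 0;
        for (int b = 0; b < 256; b++) {      // prefix sum -> offsets
            int c = count[b];
            count[b] = offset;
            offset += c;
        }
        for (int i = 0; i < n; i++)          // stable scatter
            buf[count[(a[i] >> shift) & 0xFF]++] = a[i];
        std::memcpy(a, buf, n * sizeof(unsigned int));
    }
}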

It seems your opinion on radix sorting is kind of wrong...
Assuming you know how to sort, then this problem is only slightly different. Normally you would do your comparison as sort[i].value > sort[j].value. Here you use sort[sort[i].index].value > sort[sort[j].index].value. Assuming i < j, then if that condition is true you swap the indexes, i.e. temp = sort[i].index; sort[i].index = sort[j].index; sort[j].index = temp. The values stay in the position they started in. The indexes move from row to row. At the end, the index stored in the first row is the index of the row with the smallest value.
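In code, that looks something like this (a sketch using the SORT struct from the first post; only the index members ever move):

// After this runs, sort[0].index is the index of the row with the
// smallest value; the value members stay where they started.
void index_sort(SORT sort[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        for (int j = i + 1; j < n; j++) {
            if (sort[sort[i].index].value > sort[sort[j].index].value) {
                int temp = sort[i].index;
                sort[i].index = sort[j].index;
                sort[j].index = temp;
            }
        }
    }
}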
Keys to success: Ability, ambition and opportunity.
Eelco: radix sort is linear in the number of entries. What I want to point out is that O-notation hides constants from you, and that in the case of radix sort, that constant is somewhere around log N. This is why in almost all cases, where n < N, I might prefer a heap sort with O(n log n) complexity over a sort in O(n) that is actually a hidden O(n log N).

And how about pushing that reasoning to the extreme? Here:

I recently wrote a routine that sorts, in place, using a heapsort-like algorithm, a table of 2^32 numbers *at most* (you won't be needing more for most applications). Of course, you can submit smaller datasets.

The sort occurs in O(N log N), where N = 2^32, because that's how heap sort performs on my table. Since the algorithm doesn't depend on the number of entries, it works in O(1).

Yes, I see your point: my algorithm only allows a number of entries smaller than 2^32. Indeed, I did make that approximation here: a pointer is only 32 bits, hence the limited amount of memory. But radix sort makes the same approximation: an integer is only 32 bits, hence the limited number of passes.

Do you understand my point now? (And do you agree?)

[BTW, don't say someone's opinion is wrong if you admitted, in the same post, that you didn't understand it]
