# QuickSort algorithm


## Recommended Posts

Hi,

QuickSort is a sorting algorithm that is used a lot.

I use a callback to compare two elements:

```cpp
Int32 (*Compare)( const T&, const T& )
```


The complete function (a member of my container class, using `m_Data` and `m_nData`) is:

```cpp
void QuickSort( const std::size_t First, const std::size_t Last,
                Int32 (*Compare)( const T&, const T& ) )
{
    // Check for valid parameters.
    if( ( First >= Last ) || ( m_nData <= 1 ) )
        return;

    // Initialize positions; t tracks the pivot element (initially the first).
    std::size_t t = First;
    std::size_t p = First;
    std::size_t q = Last;

    // Partition.
    while( true )
    {
        while( Compare( m_Data[ p ], m_Data[ t ] ) > 0 )
            ++p;

        while( Compare( m_Data[ q ], m_Data[ t ] ) < 0 )
            --q;

        if( p > q )
            break;

        if( p < q )
        {
            const T Temp = m_Data[ p ];
            m_Data[ p ] = m_Data[ q ];
            m_Data[ q ] = Temp;

            // Keep t pointing at the pivot element if it was swapped.
            if( p == t )
                t = q;
            else if( q == t )
                t = p;
        }

        ++p;
        --q;
    }

    // Recursion.
    QuickSort( First, q, Compare );
    QuickSort( p, Last, Compare );
}
```

An example of a compare callback is:

```cpp
Int32 ArrayCompare( const UInt32& a, const UInt32& b )
{
    return a < b ? 1 : a > b ? -1 : 0;
}
```


Using this callback, the array is sorted in ascending order.
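For reference, here is a minimal standalone sketch of the same callback convention (a positive result means `a` should come first). The insertion-sort driver and its name are only illustrative, not the container's member function:

```cpp
#include <cstdint>
#include <cstddef>

using Int32  = std::int32_t;
using UInt32 = std::uint32_t;

// Same convention as above: positive result means `a` should come first.
Int32 ArrayCompare( const UInt32& a, const UInt32& b )
{
    return a < b ? 1 : a > b ? -1 : 0;
}

// Illustrative standalone driver: a simple insertion sort using the same
// function-pointer callback signature as the QuickSort member function.
template<class T>
void InsertionSort( T* data, std::size_t count, Int32 (*Compare)( const T&, const T& ) )
{
    for( std::size_t i = 1; i < count; ++i )
    {
        T key = data[ i ];
        std::size_t j = i;
        // Compare( key, x ) > 0 means `key` sorts before `x`, so shift x up.
        while( j > 0 && Compare( key, data[ j - 1 ] ) > 0 )
        {
            data[ j ] = data[ j - 1 ];
            --j;
        }
        data[ j ] = key;
    }
}
```

Sorting `{ 5, 2, 9, 1, 7 }` with `ArrayCompare` yields `{ 1, 2, 5, 7, 9 }`.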

I have two questions:

- Is it possible to get better performance? How?

- Is it possible to have a compare callback that returns a boolean? How?

A boolean compare callback would look like:

```cpp
bool (*Compare)( const T&, const T& )
```


Thanks

Edited by Alundra

##### Share on other sites

- Is it possible to get better performance? How?

Here are some of the ways I can think of:
1. pass in better data
2. turn on compiler optimizations

- Is it possible to have a compare callback that returns a boolean? How?

uhhhh...

```cpp
return a > b;
```


##### Share on other sites

If you use a templated functor instead of a function pointer, the compiler can optimize the code much better, and you can still call the function the same way (and in new ways too, such as with comparison lambdas or function objects):

```cpp
template<class Fn>
void QuickSort( const std::size_t First, const std::size_t Last, const Fn& Compare )
...
int compare( const int& a, const int& b ) {...}
...
myIntContainer.QuickSort( 0, 10, &compare );
```

```cpp
while( Compare( m_Data[ p ], m_Data[ t ] ) >= 0 ) // note that (x>=y) == !(x<y)
...
while( Compare( m_Data[ q ], m_Data[ t ] ) < 0 )
```

Move the `< 0` inside the compare function (so it returns a bool); then, because `(x>=y)` is the same as `!(x<y)`, your code becomes:

```cpp
while( !Compare( m_Data[ p ], m_Data[ t ] ) )
...
while( Compare( m_Data[ q ], m_Data[ t ] ) )
```

If you want to look at an optimized implementation, the code for std::sort is a good place to look.
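A possible self-contained sketch of this approach: a free-function quicksort templated on the comparator, taking a std::sort-style boolean "less" predicate and using a middle pivot (these details are mine, not Hodgman's code):

```cpp
#include <cstddef>
#include <utility>

// Sorts the inclusive range [First, Last]. `Less( a, b )` returns true when
// `a` must sort before `b` -- the same convention std::sort uses.
template<class T, class Fn>
void QuickSort( T* data, std::ptrdiff_t First, std::ptrdiff_t Last, const Fn& Less )
{
    if( First >= Last )
        return;

    const T Pivot = data[ First + ( Last - First ) / 2 ]; // middle pivot, by value
    std::ptrdiff_t p = First;
    std::ptrdiff_t q = Last;

    // Hoare-style partition.
    while( p <= q )
    {
        while( Less( data[ p ], Pivot ) ) ++p;
        while( Less( Pivot, data[ q ] ) ) --q;
        if( p <= q )
        {
            std::swap( data[ p ], data[ q ] );
            ++p;
            --q;
        }
    }

    QuickSort( data, First, q, Less );
    QuickSort( data, p, Last, Less );
}
```

Called as `QuickSort( arr, 0, n - 1, []( int a, int b ){ return a < b; } );`, the comparator can be inlined rather than invoked through a pointer.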

Edited by Hodgman

##### Share on other sites

- Is it possible to get better performance? How?

In addition to Hodgman's implementation changes:
1) Choose a random pivot element rather than always the first. This greatly reduces the probability of hitting O(n^2) behavior on poorly distributed (semi-sorted to fully sorted) input data. You may also want to compute the median of the first, middle and last elements, or the median of a random subset (a trade-off between better medians and more computation to get them).
2) Drop to a simpler sort, e.g. insertion sort, when the data gets small enough (roughly 8-16 elements) to reduce overhead. A less conservative switch (which IIRC std::sort does) is to switch to heapsort once a certain recursion depth (or memory footprint) is reached. This bounds the stack depth, and heapsort's cache-unfriendly memory access patterns hurt less on the smaller ranges it then handles.
3) If you are using primitive types, use a SIMD sorting network when the data set for a particular recursion is small enough.
4) Separate your sorting keys from your data for cache and swap efficiency.
5) Sort distributed.
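To illustrate points 1 and 2, here is a hedged sketch combining a median-of-three pivot with an insertion-sort cutoff; the names and the threshold of 16 are my own choices, not from the post:

```cpp
#include <cstddef>
#include <utility>

// Median of three values, used as the pivot.
template<class T>
const T& MedianOfThree( const T& a, const T& b, const T& c )
{
    if( a < b )
        return b < c ? b : ( a < c ? c : a );
    return a < c ? a : ( b < c ? c : b );
}

// Simple insertion sort over the inclusive range [First, Last].
template<class T>
void InsertionSort( T* data, std::ptrdiff_t First, std::ptrdiff_t Last )
{
    for( std::ptrdiff_t i = First + 1; i <= Last; ++i )
        for( std::ptrdiff_t j = i; j > First && data[ j ] < data[ j - 1 ]; --j )
            std::swap( data[ j ], data[ j - 1 ] );
}

template<class T>
void QuickSort( T* data, std::ptrdiff_t First, std::ptrdiff_t Last )
{
    // Point 2: drop to insertion sort on small ranges.
    if( Last - First < 16 )
    {
        if( First < Last )
            InsertionSort( data, First, Last );
        return;
    }

    // Point 1: median-of-three pivot instead of always the first element.
    const T Pivot = MedianOfThree( data[ First ],
                                   data[ First + ( Last - First ) / 2 ],
                                   data[ Last ] );
    std::ptrdiff_t p = First;
    std::ptrdiff_t q = Last;
    while( p <= q )
    {
        while( data[ p ] < Pivot ) ++p;
        while( Pivot < data[ q ] ) --q;
        if( p <= q )
        {
            std::swap( data[ p ], data[ q ] );
            ++p;
            --q;
        }
    }
    QuickSort( data, First, q );
    QuickSort( data, p, Last );
}
```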

##### Share on other sites

- Is it possible to get better performance? How?

Many sorting libraries will use different sorting algorithms depending on the data they encounter.

Quicksort has a high overhead compared to a few other sort routines. Also, any time you get a bad pivot value you move toward quicksort's worst-case scenario. You chose the first value as the pivot, so sorting an already-sorted list gives you the worst-case O(n^2) performance. For any deterministic pivot-selection heuristic, someone can construct a worst-case data set.

So the answer is to change your algorithm. For a small number of items, insertion sort is generally fastest; if the count is small enough, sort with that algorithm instead. Quicksort also has a really bad worst-case scenario, so if multiple bad pivots are detected, switch over to heapsort instead.

##### Share on other sites

Look into introsort. It has a worst-case complexity of O(n log n), while plain quicksort is O(n^2). It's what frob is describing; I just wanted to give the name of the algorithm to aid in googling.
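A minimal sketch of the introsort idea (illustrative only, not the std::sort implementation): recurse as quicksort, but once a depth budget of about 2*log2(n) is exhausted, heapsort the remaining range, which caps the worst case at O(n log n):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Quicksort until the depth budget runs out, then heapsort the rest.
void IntroSortImpl( std::vector<int>& v, std::ptrdiff_t First, std::ptrdiff_t Last, int DepthBudget )
{
    if( First >= Last )
        return;

    if( DepthBudget == 0 )
    {
        // Pivot choices went badly; heapsort this range in O(n log n).
        std::make_heap( v.begin() + First, v.begin() + Last + 1 );
        std::sort_heap( v.begin() + First, v.begin() + Last + 1 );
        return;
    }

    const int Pivot = v[ First + ( Last - First ) / 2 ];
    std::ptrdiff_t p = First;
    std::ptrdiff_t q = Last;
    while( p <= q )
    {
        while( v[ p ] < Pivot ) ++p;
        while( Pivot < v[ q ] ) --q;
        if( p <= q )
        {
            std::swap( v[ p ], v[ q ] );
            ++p;
            --q;
        }
    }
    IntroSortImpl( v, First, q, DepthBudget - 1 );
    IntroSortImpl( v, p, Last, DepthBudget - 1 );
}

void IntroSort( std::vector<int>& v )
{
    if( v.size() > 1 )
        IntroSortImpl( v, 0, static_cast<std::ptrdiff_t>( v.size() ) - 1,
                       2 * static_cast<int>( std::log2( v.size() ) ) );
}
```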

##### Share on other sites
One concern I have here is that either:

a) you have a buffer overrun on this line

```cpp
while( Compare( m_Data[ q ], m_Data[ t ] ) < 0 )
```

because q starts off equal to Last, or

b) you are using a different range scheme from the standard library. The end() method of the C++ standard library returns an iterator that is one past the end of the container.

You might want to take a look at how your quicksort varies from some of the dozens of sorting algorithms on my site; I've got introsort on there too. Link below.
Median-of-three pivot selection is one improvement often used too.

Edited by iMalc

##### Share on other sites

Quicksort has a high overhead compared to a few other sort routines. Also, any time you get a bad pivot value you move toward quicksort's worst-case scenario. You chose the first value as the pivot, so sorting an already-sorted list gives you the worst-case O(n^2) performance. For any deterministic pivot-selection heuristic, someone can construct a worst-case data set.

It would be less confusing to say "higher" overhead; in the grand scheme of things it's low-to-average overhead. Certainly a well-tuned implementation should still switch to a simpler algorithm for very small ranges.

With a reasonable pivot selection, quicksort is astronomically unlikely to behave significantly worse on real data than the expected O(n lg n) average. If the heuristic uses randomness, you can't even construct a data set to defeat it.
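For illustration, randomized pivot selection can be as simple as the following; the helper name and RNG choice are mine, not from the thread:

```cpp
#include <cstddef>
#include <random>

// Illustrative random pivot selection for an inclusive [First, Last] range.
std::size_t RandomPivotIndex( std::size_t First, std::size_t Last )
{
    static std::mt19937 rng{ std::random_device{}() };
    std::uniform_int_distribution<std::size_t> dist( First, Last );
    return dist( rng );
}
```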

##### Share on other sites

I've been told timsort is really smart. I don't know how it compares, however; whatever std::sort implements is fine for me.

##### Share on other sites

I've been told timsort is really smart. I don't know how it compares, however; whatever std::sort implements is fine for me.

Yeah, it probably is. The code for timsort is also huge and complicated, though, and it requires O(n) extra space. For learning purposes I wouldn't recommend going near it.

Introsort (a variation on quicksort) can be modified to produce the same best, worst, and average cases as timsort, but without the high memory usage, and probably a lower constant overhead too.