Darkbouncer4689

C++ STL Algorithms


Hey all,

I've been brushing up on my algorithms and data structures lately and today I was messing around with everything in the STL. I had always known about qsort, but it seems like there are some easier functions available if you happen to already be using an STL container such as vector.

Namely something like:

[code]#include <algorithm>
#include <vector>

std::vector<int> myVector;
// ... fill myVector ...
std::sort(myVector.begin(), myVector.end());[/code]

I'm guessing that sort is implemented as a quicksort, but there is one part that I'm not sure about. The template for sort is

[code]template <class RandomAccessIterator>
void sort(RandomAccessIterator first, RandomAccessIterator last);[/code]

I know that for the basic quicksort (taking the first element as the pivot) the worst case is a sorted or reverse-sorted list, where it runs in O(n^2). This can be avoided by choosing a random pivot, which makes its expected run time O(n log n). I'm guessing this is what RandomAccessIterator is about, but I'm not familiar with them. If I simply use vector.begin() and vector.end(), are they converted to RandomAccessIterators, or do I need to do something special to ensure an O(n log n) run time?

Thanks in advance!
[quote]I'm guessing this is what RandomAccessIterator is about[/quote]

No, it's not. RandomAccessIterator is simply a template parameter. std::sort expects you to pass it iterators that support random access. This means, for instance, that std::sort cannot be used on a std::list.

[url="http://www.cplusplus.com/reference/algorithm/sort/"]http://www.cplusplus...algorithm/sort/[/url]
[quote name="www.cplusplus.com"]
Approximately N*logN comparisons on average (where N is last-first).
In the worst case, up to N[sup]2[/sup], depending on specific sorting algorithm used by library implementation.
[/quote]

As you can see, the details are left to the implementation, so there is no single right answer. If you are really concerned about the details, you can check your STL headers out for yourself.
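To make the random-access requirement concrete, here's a minimal sketch (the function name is mine, just for illustration): std::sort compiles for a std::vector, but not for a std::list, which only provides bidirectional iterators and offers its own member sort() instead.

[code]#include <algorithm>
#include <list>
#include <vector>

void sort_examples()
{
    const int data[] = { 3, 1, 2 };

    std::vector<int> v(data, data + 3);
    std::sort(v.begin(), v.end());    // OK: vector iterators are random access

    std::list<int> l(data, data + 3);
    // std::sort(l.begin(), l.end()); // error: list iterators aren't random access
    l.sort();                         // list's member sort (a merge sort) instead
}[/code]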
Some modern implementations use [url="http://en.wikipedia.org/wiki/Introsort"]introsort[/url], which is as fast as quicksort in practice and is O(n*log(n)) even in the worst case.
[quote name='Chris_F' timestamp='1323665526' post='4893003']
No, it's not. RandomAccessIterator is simply a template parameter. std::sort expects you to pass it iterators that support random access. This means, for instance, that std::sort cannot be used on a std::list.
[/quote]

I'm a bit confused here, what exactly is an iterator that supports random access?

Thanks!
Well, a forward iterator is one that can be incremented, meaning you can always iterate forward and get the next object, but you can't iterate backwards and get the previous one. A bidirectional iterator is one that can iterate in both directions, meaning you can get the next object or the previous one. A random access iterator means you can get [b]any[/b] object: the next, the previous, one 5 places ahead or one 5 places behind...
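A quick illustration of those categories (variable names are mine; the commented-out line is the one that would not compile):

[code]#include <list>
#include <vector>

void iterator_categories()
{
    std::vector<int> v(10);
    std::vector<int>::iterator r = v.begin() + 5;
    ++r;          // forward: step to the next element
    --r;          // bidirectional: step back to the previous element
    r += 3;       // random access: jump 3 places ahead in one step
    r -= 3;       // ...or 3 places behind
    int x = r[2]; // ...or index relative to the iterator
    (void)x;

    std::list<int> l(10);
    std::list<int>::iterator b = l.begin();
    ++b; --b;     // list iterators are bidirectional...
    // b += 3;    // ...but not random access: this would not compile
}[/code]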
An iterator is just a "concept", it's kind of like an "interface" or an "abstract base class", but it's invisible. It's a contract that you make, saying "this class satisfies these requirements".

e.g. to satisfy the RandomAccessIterator contract, a type needs to be able to (among other things):
* be advanced by an integer: e.g. i = i + 7 should move the iterator forwards 7 places
* be incremented/decremented with the ++/-- operators
* support the square bracket operator, e.g. foo = i[7]
* support dereferencing, e.g. foo = *i;

N.B. standard raw pointers satisfy all of these requirements, which means that a raw pointer is a RandomAccessIterator, and thus can be passed to any template-algorithm that requires one:
[code]const char* text = "Hello!";
const char* begin = text;
const char* end = text+6;
std::sort( begin, end );
printf(text);[/code]

Basically, if an algorithm says that it operates on RandomAccessIterators, it actually means that it will operate on any type that you pass to it, but it expects this type to satisfy the above list of requirements.
Okay, this is making sense now. I appreciate the help. I'm having trouble finding any examples that use RandomAccessIterator; for regular iterators I could do vector<int>::iterator itr. How can this be done with a random access iterator?

Edit: Okay, so the random access is handled for me then? If I make a sorted vector and call sort on it, passing in the vector.begin() and vector.end() iterators, the expected run time will still be O(n log n)?
The iterators belong to a class hierarchy. Because a Random Access Iterator satisfies the requirements of a Bidirectional Iterator in that it can definitely go forwards and backwards, RandomAccessIterator is derived from BidirectionalIterator. Also, because a Bidirectional Iterator satisfies the requirements of a plain old Iterator in that it can definitely go forwards, BidirectionalIterator is derived from an ordinary Iterator.

e.g. If a function takes a BidirectionalIterator then it can be passed a RandomAccessIterator or a BidirectionalIterator, but not an ordinary forwards-only iterator.

Since a lot of algorithms out there don't actually require random access, they don't force you to have a RandomAccessIterator. That doesn't mean that you can't give them one, as it still fits the requirements even if it goes above and beyond them (see the example below).
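For example (a minimal sketch, function name mine): std::reverse only requires bidirectional iterators, so it accepts a std::list as well as a std::vector, whose random access iterators go above and beyond the requirement.

[code]#include <algorithm>
#include <list>
#include <vector>

void reverse_examples()
{
    std::list<int> l(5);
    std::reverse(l.begin(), l.end()); // OK: meets the minimum requirement

    std::vector<int> v(5);
    std::reverse(v.begin(), v.end()); // OK: exceeds the requirement
}[/code]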
Hodgman's example doesn't work because the string is const. This works better:
[code]char text[] = "Hello!";
char* begin = text;
char* end = text+6;
std::sort( begin, end );
printf(text);[/code]
An iterator is considered "random access" if you can move an arbitrary number of elements forward or back in constant time. For example, an array pointer is random access because you can look up any element in the array through simple pointer arithmetic. In some containers, such as list, you need to visit each element in order to find the element adjacent to it, so the container only supports bidirectional iteration. Certain structures such as singly-linked lists or connections to external resources are unidirectional, so you can only move forward.

So "random access iterators" have nothing to do with avoiding quick-sort's worst case whatsoever.

The actual implementation of quick-sort can vary considerably, but the objective is to choose pivot points as close to the middle of the range as often as possible. A particular implementation MAY experience its worst case when the elements are sorted. Or it may check for that, possibly resulting in a sorted range being the best case instead. A naive implementation might overflow the stack in its worst case, but this is easily countered by only recursively calling the function on the smaller of the two partitions (see the sketch below). Finally, pure quick-sort results in too many recursive function calls, so smarter implementations will opt to perform some other sort (e.g. insertion) when the partitions are split small enough.
The main purpose of random pivot selection is actually about preventing deliberate exploitation of quick-sort's weakness. Simple issues such as preventing poor performance when sorting an already sorted range can be solved by other means.
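Here's a hedged sketch of that smaller-partition trick (this is NOT std::sort's actual implementation, just an illustration): by recursing only into the smaller half and looping on the larger one, the stack depth stays O(log n) even when the partitioning itself degrades to O(n^2) time.

[code]#include <algorithm>
#include <cstddef>

// Quicksort over the inclusive range a[lo..hi].
void quicksort(int* a, std::ptrdiff_t lo, std::ptrdiff_t hi)
{
    while (lo < hi)
    {
        const int pivot = a[lo + (hi - lo) / 2]; // middle element as pivot
        std::ptrdiff_t i = lo, j = hi;
        while (i <= j) // Hoare-style partition around the pivot value
        {
            while (a[i] < pivot) ++i;
            while (a[j] > pivot) --j;
            if (i <= j) { std::swap(a[i], a[j]); ++i; --j; }
        }
        // Recurse into the smaller partition, loop on the larger one.
        if (j - lo < hi - i) { quicksort(a, lo, j); lo = i; }
        else                 { quicksort(a, i, hi); hi = j; }
    }
}[/code]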


Maybe a dumb question, but why do you need to ensure a n log n run time?

Unless you are sorting VERY large lists in a real-time environment, and you have profiled that this is a verified choke-point in your program (and it matters), you are simply doing "optimization for optimization's sake." And that is nearly always a bad thing.
Actually, lists don't need to be that big for an N log N algorithm to be much faster than an N^2 one. For example, if you have 10,000 objects the N log N algorithm should be about 750 times quicker: the ratio is N^2 / (N log2 N) = N / log2 N = 10,000 / ~13.3 ≈ 750 (using a base 2 logarithm and ignoring the constant factors).
Regarding the iterators -- just read this:
[url="http://www.sgi.com/tech/stl/Iterators.html"]http://www.sgi.com/t.../Iterators.html[/url]
and then this:
[url="http://www.cplusplus.com/reference/std/iterator/"]http://www.cplusplus...e/std/iterator/[/url]
This should be all that you need to get started! :-)

Interestingly, you might notice that (asymptotic algorithmic complexity aside) C++'s std::sort is faster than C's qsort, since the comparison function can be inlined and thus more effectively optimized by the compiler:
[url="http://stackoverflow.com/questions/4708105/performance-of-qsort-vs-stdsort"]http://stackoverflow...sort-vs-stdsort[/url]
[quote name='tbrick' timestamp='1323704288' post='4893131']
Maybe a dumb question, but why do you need to ensure a n log n run time?

Unless you are sorting VERY large lists in a real-time environment, and you have profiled that this is a verified choke-point in your program (and it matters), you are simply doing "optimization for optimization's sake." And that is nearly always a bad thing.
[/quote]

Because having your application randomly hang for no apparent reason, in a way that is nearly impossible to debug, is bad practice. Having an average time of 0.1 seconds but spikes of 2.5 minutes is not acceptable.

Ruby on Rails exhibited some such fundamental flaws in parsing, which caused no end of problems in production due to naive or incorrect algorithms being used for some fundamental operations. The biggest pain was the unexpected delays, since average times were as expected.

For something as fundamental as sorting, the excuse of premature optimization simply doesn't hold.

Another example: I once used a third-party multi-threading task library. It crashed hard and fast. Turns out, it used a recursive algorithm for partitioning, resulting in n^2 stack growth. If the developer of a library makes such a fundamental mistake, the library loses any and all credibility on the quality of the rest of its code. In my case it led to automatic rejection of any third-party threading library not published by one of the proven players in the field.
[quote name='Adam_42' timestamp='1323707411' post='4893163']
Actually lists don't need to be that big for an N log N algorithm to be much faster than an N^2 one. For example if you have 10,000 objects the N log N algorithm should be about 750 times quicker (using a base 2 logarithm, and ignoring the constant factors).[/quote]In that case, why not go the whole hog and use an O(N) sorting algorithm instead, which would be about 13 times faster than the O(N log N) one, and 10,000 times faster than the O(N[sup]2[/sup]) one [img]http://public.gamedev.net/public/style_emoticons/default/tongue.gif[/img]
[quote name='Hodgman' timestamp='1323733221' post='4893305']
[quote name='Adam_42' timestamp='1323707411' post='4893163']
Actually lists don't need to be that big for an N log N algorithm to be much faster than an N^2 one. For example if you have 10,000 objects the N log N algorithm should be about 750 times quicker (using a base 2 logarithm, and ignoring the constant factors).[/quote]In that case, why not go the whole hog and use an O(N) sorting algorithm instead, which would be about 13 times faster than the O(N log N) one, and 10,000 times faster than the O(N[sup]2[/sup]) one [img]http://public.gamedev.net/public/style_emoticons/default/tongue.gif[/img]
[/quote]

AFAIK, there's no such thing as a general sorting algorithm with worst-case time complexity below O(n log n) (linearithmic) that remains practical:
[url="http://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms"]http://en.wikipedia....n_of_algorithms[/url]

You might fare better in special (not general) cases where you know something extra about the data to be sorted, such as integer sorting.

In theory, there are some "general" algorithms w/ O(n) complexity, but they "are impractical for real-life use due to extremely poor performance or a requirement for specialized hardware."

EDIT: [url="http://arxiv.org/abs/0811.3448"]this[/url] abstract explains the situation pretty well:
"For all the sorting algorithms, it is an accepted performance limit that sorting algorithms are linearithmic or O(N lg N). The linearithmic lower bound in performance stems from the fact that the sorting algorithms use the ordering property of the data. The sorting algorithm uses comparison by the ordering property to arrange the data elements from an initial permutation into a sorted permutation.
Linear O(N) sorting algorithms exist, but use a priori knowledge of the data to use a specific property of the data and thus have greater performance. In contrast, the linearithmic sorting algorithms are generalized by using a universal property of data-comparison, but have a linearithmic performance lower bound. The trade-off in sorting algorithms is generality for performance by the chosen property used to sort the data elements. "

From the above-cited paper:
"Linearithmic Sorting

A sort algorithm performs sorting, and most sorting algorithms are comparison based sorting algorithms. The comparison sorting algorithms include such algorithms as the merge sort, quick sort, and heap sort. These sorting algorithms use comparison to arrange the elements in the sorted permutation, and are general-purpose in nature.

The comparison sorting algorithms have a well-known theoretical [Johnsonbaugh and Schaefer 2004] performance limit that is the least upper bound for sorting that is linearithmic or O(N lg N) in complexity. This theoretical lower bound is from the basis of the comparison sorting algorithm, using sorting to arrange the data elements. A decision tree for N elements is of logarithmic height lg N. Thus the time complexity involves the cost that is the cost of using a decision tree to compare elements lg N and the number of elements N. Hence the theoretical least upper bound is the product of the two costs involved, O(N lg N), which is linearithmic complexity."
[quote name='Matt-D' timestamp='1323744816' post='4893366']
AFAIK, there's no such thing as a [b]comparison-based[/b] sorting algorithm with worst-case time complexity below O(n log n) (linearithmic)
...
[/quote]^^ fixed that for you (bold bit).

Any data where you can express ordering as a bitwise greater/less-than comparison can be sorted with an integer sort. This includes integers, floating-point numbers, text, etc... pretty much all the fundamental data-types. Composite sorting rules simply place higher-priority rules in more significant bits. However, the complexity of most of these algorithms is related to the size of your integer 'key', so once it increases past a certain point, you'd be better off with the 'inferior' O(N log N) algorithms.

It's very common to see these kinds of O(K*N) sorting algorithms used inside game engines on decently sized data-sets (see the sketch below). On smaller data-sets, introsort is good enough (and has lower 'constant' overhead).
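As a concrete, hedged sketch of one such O(K*N) algorithm -- a least-significant-digit radix sort on 32-bit unsigned keys, one byte per pass, so K = 4 passes:

[code]#include <cstddef>
#include <cstdint>
#include <vector>

void radix_sort(std::vector<std::uint32_t>& keys)
{
    std::vector<std::uint32_t> temp(keys.size());
    for (int shift = 0; shift < 32; shift += 8)
    {
        std::size_t count[257] = { 0 };
        for (std::size_t i = 0; i < keys.size(); ++i)  // histogram this byte
            ++count[((keys[i] >> shift) & 0xFF) + 1];
        for (int b = 0; b < 256; ++b)                  // prefix sum -> offsets
            count[b + 1] += count[b];
        for (std::size_t i = 0; i < keys.size(); ++i)  // stable scatter
            temp[count[(keys[i] >> shift) & 0xFF]++] = keys[i];
        keys.swap(temp);                               // next pass reads results
    }
}[/code]

Each pass is a stable counting sort on one byte, so after the fourth pass the keys are fully sorted.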
[quote name='Hodgman' timestamp='1323747575' post='4893389']
[quote name='Matt-D' timestamp='1323744816' post='4893366']
AFAIK, there's no such thing as a [b]comparison-based[/b] sorting algorithm with worst-case time complexity below O(n log n) (linearithmic)
...
[/quote]^^ fixed that for you (bold bit).

Any data where you can express ordering as a bitwise greater/less-than can be sorted with an integer sort. This includes integers, floating-point numbers, text, etc... pretty much all the fundamental data-types. Composite sorting rules simply place higher-priority rules in more significant bits, however, the complexity of most of these algorithms is related to the size of your integer 'key', so once it increases past a certain point, you'd be better off with the 'inferior' O(N log N) algorithms.

It's very common to see these kinds of O(K.N) sorting algorithms used inside game engines on decently sized data-sets. On smaller data-sets, introsort is good enough (and has lower 'constant' overhead).
[/quote]

Well, if you're concerned about using precise terminology (a worthy concern, so do not take this as a criticism), I believe you wanted to say that you "specified" or "explained", not "fixed" that. ;-)
Correct me if I'm wrong, but you're not disagreeing with me that "general-purpose" in this context (sorting algorithms) is interchangeable with "comparison-based", while integer sorting is still special-purpose (even "pretty much all the fundamental data-types" is not "all the data-types", which is important for anyone concerned with precision), right?
The key-size dependence and the trade-offs it introduces are what I meant by "general" in conjunction with the "remains practical" part; I'm not disagreeing with your explanation here.
Should've put a cheeky [img]http://public.gamedev.net/public/style_emoticons/default/tongue.gif[/img] onto FTFY.

I wouldn't necessarily agree that comparison-based is general-purpose whereas bit-based is special-purpose -- I've personally not run into a case where I couldn't convert between one method and the other.
My sources tell me that C++0x requires a guaranteed worst-case complexity of O(n * log(n)) for std::sort.

Though it was only specified as average-case O(n * log(n)) in C++03, most implementations tended to use introsort anyway, which meets the C++0x requirement.

Bottom line is that you should probably not be concerned about the worst case if you use std::sort on any compiler made or updated this millennium.
>>[b]AFAIK, there's no such thing as a general sorting algorithm with worst-case time complexity below O(n log n) (linearithmic) that remains practical:[/b]

Also, just to add, it has been proven that there can be no comparison-based sorting algorithm that has a worst case below O(n*log(n)).
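For reference, the counting argument behind that proof: a comparison sort can be modeled as a binary decision tree with at least N! leaves (one per possible input permutation), so its height h, the worst-case number of comparisons, satisfies

2^h >= N!, hence h >= log2(N!) >= N*log2(N) - N*log2(e) = Omega(N*log(N))

where the last inequality follows from Stirling's approximation.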
It just occurred to me that even if the key size is unreasonably large, you might still devise an O(n) sorting algorithm with hashing. Just hash each value and record the number of occurrences, as per counting sort. But then you need a way to visit the hash table in sorted order. You'd need a hash algorithm that inserted the values sorted, which is problematic as that's the very problem we're attempting to solve! However, if your hash algorithm was just key MOD tableSize, then you could lower the space required to a fixed multiple of the number of values, instead of a multiple of the key size, which may be prohibitive. This would multiply the time complexity by a factor of keySize / tableSize, however, resulting in a very clear trade-off: you can sacrifice time complexity for storage space, or vice versa.

But suppose you ran the values through a standard hash function, counted each occurrence, and then ran an introsort on the hash table itself. You'd be doing an O(n log n) sort with respect to the number of *distinct* values. Depending on the data set, this might yield amazing performance.

Just something to consider.
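A hedged sketch of that hash-then-sort idea (assumes a C++11 compiler for std::unordered_map; names are mine): count duplicates in a hash table, then sort only the distinct values, which is O(n) counting plus O(D log D) sorting, where D is the number of distinct values.

[code]#include <algorithm>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

std::vector<int> sort_via_hash_counts(const std::vector<int>& input)
{
    // One pass to count each distinct value.
    std::unordered_map<int, std::size_t> counts;
    for (std::size_t i = 0; i < input.size(); ++i)
        ++counts[input[i]];

    // Introsort only the distinct (value, count) pairs.
    std::vector<std::pair<int, std::size_t> > distinct(counts.begin(), counts.end());
    std::sort(distinct.begin(), distinct.end()); // pairs compare by value first

    // Expand the counts back out into a fully sorted sequence.
    std::vector<int> result;
    result.reserve(input.size());
    for (std::size_t i = 0; i < distinct.size(); ++i)
        result.insert(result.end(), distinct[i].second, distinct[i].first);
    return result;
}[/code]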

[quote name='taz0010' timestamp='1323785669' post='4893503']
But suppose you ran the values through a standard hash function, counted each occurrence, and then ran an introsort on the hash table itself. You'd be doing an O(n log n) sort with respect to the number of *distinct* values. Depending on the data set, this might yield amazing performance.[/quote]

Mergesort.
