Confused on Big Oh


Based on everyone's feedback, I still have a question:

Since n^3 grows faster than n^2, does that mean an n^3 algorithm has a slower run time than an n^2 algorithm? Please correct me if I am wrong.


No.

The notation only shows how the cost grows in the long run, for large data sets. There is an invisible constant multiplier which is left off, and there are potentially smaller polynomial terms that are also left off.

An algorithm with n^2 growth but a very large constant factor can be slower than an extremely fast algorithm with n^3 growth, at least for the input sizes you actually encounter.

For a small data set, the more complex algorithm may have a faster clock time. Sometimes the specific hardware has an optimization that makes the more complex algorithm a natural fit for its computation model, while the less complex algorithm is a bad fit requiring more processor work.

And sometimes one algorithm can be run in main memory or in cache memory where the other one must be swapped out to disk, causing actual performance to plummet.

Big-O notation is useful in CS theory classes, but in the practical world it only serves as a general guide to see if an algorithm is sane or not for the problem you are trying to solve.

Expanding on frob's answer above. Refer to my previous post.

Big O notation defines an upper bound on the growth of a function.

It's important to note the word "growth" in there. n^3 will grow to require more operations faster than n^2, but without considering the C and k constants (recall: f(n) is O(g(n)) if there exist constants C and k such that f(n) <= C*g(n) for all n >= k), it makes no guarantees about actual performance.

Essentially, the way you should view the formulas is that when you substitute a value for n, the result is roughly the number of operations required.

So more operations required means more time required, but Big O doesn't tell us exactly how many operations are required: if you look back at how a Big O bound is established, that information is lost in the process.

Let's say I have the following task: I have a binder of unsorted papers, and I want to find a particular paper. If I have n papers in the binder, then in the worst case I would perform n checks to see if each paper is the one I want.

That would give me a function we can define as f(n) = n, since I have to perform n checks, and it would also be O(n).

Now suppose that, in addition to looking through the papers in the binder, I also had to perform a calculation or some other operation for every page (maybe I now need to perform two comparisons, since there are two possible ways to match the page I want). My function would be f(n) = 2n, which intuitively makes sense as taking more time, since we have to do twice as many comparisons.

It's important to note that this second function would also be O(n), since Big O notation isn't so much interested in comparing the exact number of operations between functions as in their growth, which in the case of both functions is linear (n). This also illustrates why the C and k constants are important: for the first example C = 1 and k = 0, and for the second example C = 2 and k = 0.
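To make the binder example concrete, here is a minimal C++ sketch (the Paper struct, its fields, and the search functions are all made up for illustration). Both searches touch every element once in the worst case, so one performs n comparisons and the other 2n, yet both are O(n):

#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

struct Paper {
    std::string title;
    std::string tag;  // a second field we might also have to match against
};

// One comparison per paper: f(n) = n checks in the worst case.
std::size_t FindByTitle(const std::vector<Paper>& binder, const std::string& wanted) {
    std::size_t checks = 0;
    for (const Paper& p : binder) {
        ++checks;
        if (p.title == wanted) break;
    }
    return checks;
}

// Two comparisons per paper: f(n) = 2n checks in the worst case, still O(n).
std::size_t FindByTitleOrTag(const std::vector<Paper>& binder, const std::string& wanted) {
    std::size_t checks = 0;
    for (const Paper& p : binder) {
        checks += 2;
        if (p.title == wanted || p.tag == wanted) break;
    }
    return checks;
}

int main() {
    std::vector<Paper> binder(1000, Paper{"misc", "misc"});  // no match -> worst case
    std::cout << FindByTitle(binder, "taxes") << " checks vs "
              << FindByTitleOrTag(binder, "taxes") << " checks\n";  // 1000 vs 2000
}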

I'd also like to mention that there is a whole family of notations like Big O which all have to be considered to do a proper comparison of algorithms. Big O gives us an upper bound, but there are still average and lower bounds to consider... Most people generally only look at the worst case/upper bound for a quick sanity check. See http://en.wikipedia.org/wiki/Big_O_notation#Use_in_computer_science

In the real world, much of this doesn't really matter, as frob mentioned. But one real-world example where it does matter: if you read through the implementation details of some generic data structures in the .NET Framework, they're actually implemented in such a way that, depending on the number of items, they'll use different algorithms. This is because for small data sets (smaller values of n) an algorithm with a greater order of complexity actually performed better, but as the data set increases (for larger values of n) - a different algorithm is preferred because it has a lower order of complexity.
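As a hedged sketch of that switching idea (this is not the actual .NET implementation; the class name, the cutoff value, and the methods below are all invented for illustration), here is a tiny C++ map-like container that does a linear scan over a flat array while it is small and only migrates to a hash table once it grows past a cutoff:

#include <cstddef>
#include <iostream>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

class HybridMap {
public:
    void Insert(const std::string& key, int value) {
        if (use_hash_) { hash_[key] = value; return; }
        for (auto& kv : flat_) {                    // small n: a linear scan is cheap
            if (kv.first == key) { kv.second = value; return; }
        }
        flat_.emplace_back(key, value);
        if (flat_.size() > kThreshold) Migrate();   // large n: O(1) average lookups win
    }

    std::optional<int> Find(const std::string& key) const {
        if (use_hash_) {
            auto it = hash_.find(key);
            return it == hash_.end() ? std::nullopt : std::optional<int>(it->second);
        }
        for (const auto& kv : flat_) {
            if (kv.first == key) return kv.second;
        }
        return std::nullopt;
    }

private:
    void Migrate() {
        for (auto& kv : flat_) hash_.emplace(std::move(kv.first), kv.second);
        flat_.clear();
        use_hash_ = true;
    }

    static constexpr std::size_t kThreshold = 8;  // assumed cutoff, not what .NET uses
    std::vector<std::pair<std::string, int>> flat_;
    std::unordered_map<std::string, int> hash_;
    bool use_hash_ = false;
};

int main() {
    HybridMap m;
    for (int i = 0; i < 20; ++i) m.Insert("key" + std::to_string(i), i);
    if (auto v = m.Find("key13")) std::cout << "key13 -> " << *v << '\n';
}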

Big-O notation is useful in CS theory classes, but in the practical world it only serves as a general guide to see if an algorithm is sane or not for the problem you are trying to solve.

Good to know.

Expanding on frob's answer above. Refer to my previous post.

This is because for small data sets (smaller values of n) an algorithm with a greater order of complexity actually performed better, but as the data set increases (for larger values of n) - a different algorithm is preferred because it has a lower order of complexity.

Is this always the case? Why do algorithms with a greater order of complexity perform better on small values, and algorithms with a lower order of complexity perform better on larger values? Can I get an example of a more complex and a less complex algorithm?


Not always the case. It just turns out that sometimes you'll have algorithm that performs k*n^3 operations on average (where k is, say, 10) and another algorithm that performs c*n^2 operations on average (where c is, say, 100). For values of n from 1 to 10 the first algorithm is as fast or faster. After that, the second algorithm is faster.
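You can see that crossover by just tabulating the two cost functions (a minimal sketch; the constants 10 and 100 are the ones chosen in the example above):

#include <iostream>

int main() {
    // Operation counts for the two hypothetical algorithms: 10*n^3 vs 100*n^2.
    for (long long n : {1, 5, 10, 11, 20, 100}) {
        long long cubic = 10 * n * n * n;
        long long quadratic = 100 * n * n;
        std::cout << "n=" << n << "  10*n^3=" << cubic << "  100*n^2=" << quadratic
                  << (cubic <= quadratic ? "  (cubic is as fast or faster)\n"
                                         : "  (quadratic is faster)\n");
    }
}

At n = 10 the two counts are equal (10,000 operations each); for every larger n the n^2 algorithm wins, and the gap only widens.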

I had to go digging around the .NET Framework documentation on MSDN, but I found this page talking about Array.Sort. It says a little about which algorithms it uses in the Remarks section:

This method uses the introspective sort (introsort) algorithm as follows:

  • If the partition size is fewer than 16 elements, it uses an insertion sort algorithm.

  • If the number of partitions exceeds 2 * LogN, where N is the range of the input array, it uses a Heapsort algorithm.

  • Otherwise, it uses a Quicksort algorithm.
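Here is a rough, self-contained C++ sketch of that kind of introsort dispatch. To be clear, this is not the .NET Framework's code: the helper functions, the 16-element cutoff, and the 2*log2(n) depth budget below are just my own illustration of the rules quoted above.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

constexpr std::size_t kSmallCutoff = 16;  // assumed small-partition threshold

// Simple insertion sort for small ranges [lo, hi).
void InsertionSort(std::vector<int>& a, std::size_t lo, std::size_t hi) {
    for (std::size_t i = lo + 1; i < hi; ++i) {
        int key = a[i];
        std::size_t j = i;
        while (j > lo && a[j - 1] > key) { a[j] = a[j - 1]; --j; }
        a[j] = key;
    }
}

// Quicksort partition (Lomuto scheme) around the last element of [lo, hi).
std::size_t Partition(std::vector<int>& a, std::size_t lo, std::size_t hi) {
    int pivot = a[hi - 1];
    std::size_t store = lo;
    for (std::size_t i = lo; i + 1 < hi; ++i) {
        if (a[i] < pivot) { std::swap(a[i], a[store]); ++store; }
    }
    std::swap(a[store], a[hi - 1]);
    return store;
}

void IntroSort(std::vector<int>& a, std::size_t lo, std::size_t hi, int depth_budget) {
    if (hi - lo < kSmallCutoff) {
        InsertionSort(a, lo, hi);  // small partitions: the simple O(n^2) sort wins
        return;
    }
    if (depth_budget <= 0) {
        // Recursion got too deep: fall back to heapsort, guaranteed O(n log n).
        std::make_heap(a.begin() + lo, a.begin() + hi);
        std::sort_heap(a.begin() + lo, a.begin() + hi);
        return;
    }
    std::size_t p = Partition(a, lo, hi);  // otherwise: quicksort
    IntroSort(a, lo, p, depth_budget - 1);
    IntroSort(a, p + 1, hi, depth_budget - 1);
}

int main() {
    std::vector<int> data = {9, 4, 7, 1, 8, 2, 6, 3, 5, 0, 12, 11, 10, 15, 14, 13, 20, 17};
    int depth_budget = 2 * static_cast<int>(std::log2(data.size())) + 1;
    IntroSort(data, 0, data.size(), depth_budget);
    for (int v : data) std::cout << v << ' ';
    std::cout << '\n';
}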


This is completely made-up and a very forced example, but imagine two different algorithms with loops like below, where they both call some inner function (DoAlgorithmInner1/DoAlgorithmInner2) that we assume are both of equal cost.
for( int i=0; i < n; ++i )
  for( int j=0; j < 100; ++j )
    DoAlgorithmInner1( i, j );

for( int i=0; i < n; ++i )
  for( int j=0; j < n; ++j )
    DoAlgorithmInner2( i, j );
The first is O(n*100), which is O(n*constant), which is just O(n).
The second is O(n^2).

Say that n is only 10 -- the first one calls the inner function 10*100 times, the second calls it 10*10 times.
Say that n is 10000 -- the first one calls the inner function 10000*100 times, the second calls it 10000*10000 times.

So for any case where n is less than 100, the n^2 algorithm actually does less work.


For a real world example -- try implementing a comparison sort, like quicksort, which is O(n log n) and also a radix sort, which is O(n).
The constants, which are removed in the big-O analysis, are actually very high for the radix sort, which makes it behave like the above example.
e.g. n*1000 might be more expensive than n*n*10.
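To get a feel for where those radix-sort constants come from, here is a minimal LSD radix sort sketch for 32-bit unsigned keys (the fixed key width is an assumption on my part). Even though it is O(n), it makes four full passes over the data, does a counting pass plus a scatter pass per byte, and needs a second n-sized buffer; those are exactly the constants the big-O analysis throws away:

#include <array>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

void RadixSort(std::vector<std::uint32_t>& a) {
    std::vector<std::uint32_t> tmp(a.size());
    for (int shift = 0; shift < 32; shift += 8) {       // 4 passes, one per byte
        std::array<std::size_t, 257> count{};           // counting buckets
        for (std::uint32_t v : a) ++count[((v >> shift) & 0xFF) + 1];
        for (std::size_t i = 1; i < count.size(); ++i) count[i] += count[i - 1];
        for (std::uint32_t v : a) tmp[count[(v >> shift) & 0xFF]++] = v;  // stable scatter
        a.swap(tmp);
    }
}

int main() {
    std::vector<std::uint32_t> data = {170, 45, 75, 90, 802, 24, 2, 66};
    RadixSort(data);
    for (auto v : data) std::cout << v << ' ';
    std::cout << '\n';
}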

Is this always the case? Why do algorithms with a greater order of complexity perform better on small values, and algorithms with a lower order of complexity perform better on larger values?

It's often the case.

Complex algorithms perform worse on small data sets because it often doesn't require a lot of cleverness to perform an operation on a small data set quickly. Depending on what you need to do, small data sets can often be dealt with efficiently "by brute force", that is, by doing what you need to do in the most straightforward manner without worrying about algorithmic complexity. Whereas performing the same operation on the data set in the algorithmically superior manner often requires additional set-up costs, additional function call overhead, additional dynamic memory usage, and so forth. These sorts of costs come out in the wash when the data set is huge.

A common example of this situation is standard sort implementations. Often, if the array being sorted is larger than some small size (7 to 10 elements, I think, are common cutoffs), QuickSort is used; if it is smaller, selection sort is used.

I found the data structure I was actually thinking of, even though other people have offered lots of great examples too.

http://msdn.microsoft.com/en-ca/library/system.collections.specialized.hybriddictionary.aspx

See the remarks section of that class and the classes it links to. Offers lots of great insight.

Overall, every algorithm has its advantages and disadvantages; there are no super algorithms that are magically better at everything, simply some that are better at solving certain problems.


Interesting. I will bear that in mind.

