Confused on Big Oh


I've been reading the definition of Big-O, but I want to test whether I'm interpreting it right using my own words. The book I'm using doesn't provide examples and uses a rather complicated and dry definition, so I'll make up examples to test whether my understanding of Big-O holds true.

In my own words, Big-O is used when the growth of the function on the left is less than or equal to the growth of the function on the right.

My examples should be correct based on what I stated

n^2 = O(n^2)

n^2 = O(n^3)


In lay terms, it is approximately how many times it needs to go over the data.

Some examples:

A binary search is O(log n). It only touches a tiny bit of the data, and only does it once.

A linear search is O(n). It touches each element at most once.
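To make those two concrete, here is a minimal sketch of both in C (the function names and the int-array setup are mine, purely for illustration):

#include <stddef.h>

/* O(n): in the worst case this looks at every element once. */
int linear_search(const int *a, size_t n, int key)
{
    for (size_t i = 0; i < n; ++i)
        if (a[i] == key)
            return (int)i;
    return -1;
}

/* O(log n): halves the remaining range on every step, so it
   touches at most about log2(n) elements. Requires a sorted. */
int binary_search(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;               /* searches [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] == key)
            return (int)mid;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}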

A naive matrix-vector multiply is O(n^2), and a naive gravity simulation can be O(n^2): every object in an outer loop must touch every other object in an inner loop. It is one loop nested inside another, such as

for (size_t i = 0; i < n; ++i)          /* every item...         */
    for (size_t j = 0; j < n; ++j)      /* ...against every item */
        do_something(i, j);             /* placeholder work      */

When you start getting to O(n^3) you are leaving what is practical in games. Every object touches every object, which in turn touches every object.

for (size_t i = 0; i < n; ++i)
    for (size_t j = 0; j < n; ++j)
        for (size_t k = 0; k < n; ++k)
            do_something(i, j, k);      /* placeholder work */

O(n!) is the running time of the traveling salesman problem's most direct solution: try every possible ordering of the cities and keep the best. The problem itself is NP-hard.
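To get a feel for how fast O(n!) blows up, here is a hedged brute-force sketch of that approach in C; the names and distance-matrix setup are mine, and real code would fill dist[][] with actual data:

#include <stdio.h>

#define N 5                     /* tiny on purpose: N! grows brutally fast */

static double dist[N][N];       /* pairwise distances; fill in before use */
static int    used[N];
static double best = 1e18;

/* Recursively extend a partial tour. There are (N-1)! complete
   tours starting from city 0, hence the factorial running time. */
static void search(int city, int count, double cost)
{
    if (count == N) {                        /* tour complete: close it */
        double total = cost + dist[city][0];
        if (total < best)
            best = total;
        return;
    }
    for (int next = 0; next < N; ++next) {
        if (!used[next]) {
            used[next] = 1;
            search(next, count + 1, cost + dist[city][next]);
            used[next] = 0;
        }
    }
}

int main(void)
{
    /* ... fill dist[][] with real distances here ... */
    used[0] = 1;
    search(0, 1, 0.0);
    printf("best tour cost: %f\n", best);
    return 0;
}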


It can be used to describe how many times an algorithm needs to go over the data, but it is also used to describe how a change in input size will affect the algorithm's processing time.

If I'm understanding correctly, I think the OP is asking about the formal definition of Big-O, meaning the arithmetic of it, which is something like:

Let f(x) and g(x) be two functions. f(x) = Big-O( g(x) ) if and only if there exists a constant K such that f(x) is always less than K * g(x) for big values of x.
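In symbols, one common way to write that (the absolute values only matter if the functions can go negative) is:

\[
f(x) = O\bigl(g(x)\bigr) \iff \exists K > 0,\ \exists x_0 : |f(x)| \le K \, |g(x)| \ \text{for all } x > x_0
\]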

I think I have that right.

Then the idea would be that, in theory, you could come up with an f(x) that is exactly the running time of some algorithm, in some unit, on an input of size x, and then say, "Well, see, we don't care about all this complexity in this function f(x) that we came up with. We just care that it is O(n log n), which is good enough time efficiency for our purposes," or whatever.

A lot of times when Big-O is taught, the instructor throws a definition like the above on the blackboard before students have sufficient context to understand why we are all of a sudden talking about "constants K" and "sufficiently large values of x" in our algorithms class.


Happened to me last semester, actually. The professor put the formal definition of Big-O on the slide with all those funky symbols (the math symbols were used instead of words, and a lot of people had never seen some of them), and it confused a lot of people. Needless to say, I don't think we ever did learn how to prove Big-O, and in this semester's class, the continuation of last semester's, we still have a hard time calculating Big-O.

I understand Big-O and basically know what it is saying. But ask me to prove something formally for Big-O and, right now, you will get back an empty sheet of paper. The professor did a pretty terrible job of explaining it to us.

The definition is basically the formal mathematical way of saying that you don't have to worry about lower-order or slower-growing terms (for "big values" of x, the highest-order term dwarfs all others), and that you don't have to worry about constant factors on terms (K can be made arbitrarily large, so any constant factors in f(x) are irrelevant). So if f(x) is something like 5n² + 10n + 7, the complexity is O(n²).
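As a quick worked check of that claim (the constants here are mine; any pair that works is fine): for n ≥ 1, every term is at most its own coefficient times n², so

\[
5n^2 + 10n + 7 \;\le\; 5n^2 + 10n^2 + 7n^2 \;=\; 22n^2 \quad (n \ge 1),
\]

and K = 22 with n0 = 1 witnesses 5n² + 10n + 7 = O(n²).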

Concerning your examples:

My examples should be correct based on what I stated

n^2 = O(n^2)
n^2 = O(n^3)

n² = O(n²)

n² = O(n³)

Both are correct in terms of the formal definition (jwezorek got it right, except for the informal "big values of x" part):

"Every appearance of O(f(n)) means precisly this: There are positive constants M and n0 such that the number xn represented by O(f(n)) satisfies the condition |xn| ? M |f(n)|, for all integers n ? n0."

(The Art of Computer Programming, Volume 1, p. 107)

Your intuition "Big Oh is used when you have the growth of a function on the left less than or equal to the growth of a function on the right." is actually quite correct.

Let's say you have the functions f(n) = n² and g(n) = n³. The only thing you have to do to prove that f(n) = O(g(n)) is to find a factor M and some value n0 for which M * g(n) is always bigger than or equal to f(n). In this example you could pick M = 1 and n0 = 1. Then you get 1² ≤ 1 * 1³, which is obviously true. Usually you are not required to argue that this holds for every n greater than n0, as that is often quite obvious. If you do have to, that part can get quite tricky.

This is how your example would look graphically:

[Graph: f(n) = n² in blue and g(n) = n³ in red; g overtakes f at n = 1 and stays above it.]

As you can see, the red function g(n) overtakes the blue f(n) at n = 1 and always stays bigger than it. This works only if you know that the functions you are working with are monotonically increasing, i.e. they keep on growing forever. Luckily, that is true for every function you will encounter in the analysis of algorithms: you never have the case where the input gets bigger but the time it takes to process it gets smaller.
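If you prefer a numeric spot check over a graph, a throwaway loop like this one (M and n0 taken from the example above; checking a finite range is evidence, not a proof) does the job:

#include <stdio.h>

int main(void)
{
    const double M  = 1.0;   /* constant from the example */
    const long   n0 = 1;     /* threshold from the example */

    for (long n = n0; n <= 1000; ++n) {
        double f = (double)n * n;        /* f(n) = n^2 */
        double g = (double)n * n * n;    /* g(n) = n^3 */
        if (f > M * g) {
            printf("bound fails at n = %ld\n", n);
            return 1;
        }
    }
    printf("f(n) <= M * g(n) held for all n in %ld..1000\n", n0);
    return 0;
}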

A good rule of thumb for the growth of functions and how "fast" you can expect corresponding algorithms to run is:

Very fast:

  • Constant O(1): Access of a variable or an item in an array
  • Logarithmic O(log n): Find an item using binary search

Fast:

  • Linear O(n): Find an item using linear search, Depth-/Breadth-first search in graphs (single for-loop)
  • Linear * Logarithmic O(n log n): The fastest a sorting algorithm based on comparisons can get (Quicksort, Mergesort, etc.); see the sketch after the graph below

Okay (anything above O(n³) is kinda slow):

  • Polynomial O(n^x): Naive sorting algorithms like Bubblesort and Insertionsort (double for-loop)

Really slow, only applicable for small problem sizes:

  • Exponential O(x^n), O(n!), ... : Building all permutations of a set, Travelling Salesman Problem

[Graph: growth of the complexity classes above, from constant up to exponential.]
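To tie the O(n log n) row above to real code, here is a bare-bones Mergesort sketch in C (simplified for illustration; the helper names are mine). It splits the array log n times and does O(n) merge work per level, which is where n log n comes from:

#include <stdlib.h>
#include <string.h>

/* Merge the sorted halves a[lo..mid) and a[mid..hi) using tmp. */
static void merge(int *a, int *tmp, size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof *a);
}

/* log n levels of halving, O(n) merge work per level => O(n log n). */
static void msort(int *a, int *tmp, size_t lo, size_t hi)
{
    if (hi - lo < 2)
        return;
    size_t mid = lo + (hi - lo) / 2;
    msort(a, tmp, lo, mid);
    msort(a, tmp, mid, hi);
    merge(a, tmp, lo, mid, hi);
}

void merge_sort(int *a, size_t n)
{
    int *tmp = malloc(n * sizeof *tmp);
    if (!tmp)
        return;              /* allocation failed; leave input untouched */
    msort(a, tmp, 0, n);
    free(tmp);
}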

Edit: Ah, posts that also explain it quite well appeared while I was typing... But I hope my post can offer some additional clarification if needed.

The easiest way to understand Big O notation, in my opinion, is as follows:

Big O notation defines an upper bound on the growth of a function. (Note: this isn't the actual definition, but it suffices to explain the first few steps. I'll expand on the actual definition below.)

So, to find the Big O of the function f(x) = 2x^3 - x + 5:

We begin by considering the "upper bound" part of the definition. Essentially, if we plotted this function on a graph, we want a similar function that acts as an upper bound, so we're looking at functions that are greater than or equal to our original function. In other words, we're looking for f(x) <= g(x), where g(x) is the function that acts as our upper bound. The rest is essentially simplification.

I generally begin by turning all the negative/subtraction signs into positive/addition signs, as follows:

2x^3 - x + 5 <= 2x^3 + x + 5

The next step is to consider how this function will grow for some large value of x. Immediately we can see that x^3 will dominate the growth of our function and the - x + 5 portion will become insignificant, so we'll progress further with our simplification:

2x^3 - x + 5 <= 2x^3 + x + 5 <= 2x^3 + x^3 + 5x^3

All we did was bump every term up to x^3. It seems pretty arbitrary, but it is valid because x <= x^3 and 5 <= 5x^3 whenever x >= 1, and it follows the requirements of the definition below. If we had an x^5, then we'd have used x^5 instead, or whatever part of the function dominates its growth.

Now, finally, we can simplify using basic addition:

2x^3 - x + 5 <= 2x^3 + x + 5 <= 2x^3 + x^3 + 5x^3 <= 8x^3

Finally, we can drop even the 8 and simply write that our function is O(x^3). This is generally sufficient for a quick analysis, though your math teachers will generally want the rest as well.

As you can probably tell, x^3 by itself does not actually bound our function from above (because we dropped the 8), but it provides enough information for a quick comparison of algorithms, so it worked for our partial solution. This is where we consider the actual definition. I could use all the fancy math symbols, but they're a pain to write and not that easy to understand anyway, so I'll explain in English.

Essentially we have two functions, f(x) and g(x), both from positive numbers to positive numbers. We say that f(x) is O(g(x)) if there are two constants C and k such that:

f(x) <= C*g(x) whenever x > k

We actually already derived part of this with our previous work since we had f(x) <= 8x^3. C = 8 in this case.

The second part is essentially the "some large value of x" I mentioned just before the second simplification. We need to pin down what that large value is, and we can often get away with a bit of trial and error.

If k = 1, then x can be any value greater than 1. So we can begin by considering the case of x = 2.

2x^3 - x + 5 <= 8x^3

2(2)^3 - 2 + 5 <= 8(2)^3

16 - 2 + 5 <= 64

19 <= 64

In this case it worked on the first try. As an example of alternate values of C and k, try C = 2 and k = 5 (that pair works because for x >= 5 the - x + 5 part is non-positive, so 2x^3 - x + 5 <= 2x^3). Often you have to play around with the value of k to make it work; in fact, often you have to play around with both C and k to make f(x) <= C*g(x) hold. I know my teacher often gave us problems where he provided g(x) but asked us to find values for C and k.
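That trial and error is easy to mechanize. Here is a throwaway C sketch (the helper name find_k and the integer-only scan are my own simplifications) that hunts for the smallest whole-number k making f(x) <= C*g(x) hold over a test range:

#include <stdio.h>

static double f(double x) { return 2*x*x*x - x + 5; }  /* the function above  */
static double g(double x) { return x*x*x; }            /* the candidate bound */

/* Scan integers from 'limit' down and report the smallest k such that
   f(x) <= C * g(x) for every integer x in (k, limit]. A spot check
   over integers, not a proof. */
static long find_k(double C, long limit)
{
    long k = limit;
    for (long x = limit; x >= 1; --x) {
        if (f((double)x) > C * g((double)x))
            break;                /* bound fails at x, so k can't go lower */
        k = x - 1;
    }
    return k;
}

int main(void)
{
    printf("C = 8: k = %ld\n", find_k(8.0, 1000));  /* prints 0: holds everywhere  */
    printf("C = 2: k = %ld\n", find_k(2.0, 1000));  /* prints 4, so k = 5 works too */
    return 0;
}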

The following is generally what my professors accepted as sufficient for a proof:

2x^3 - x + 5 <= 2x^3 + x + 5 <= 2x^3 + x^3 + 5x^3 <= 8x^3

O(x^3) where C = 8, and k = 1.

Your profs may want more or less and they might try to trip you up with things like fractions.

PS: in case you're wondering about the point of the C and k values, and why we don't just immediately look at the function, say "Great, it's O(x^3)," and get on with our lives: it's because k is essentially the input size at which the bound starts to apply. It lets you compare two functions f(x) and z(x) that describe the search performance of theoretical data structures: say f(x) is O(n) but its k is 1 million, versus z(x), which is O(n^2) but with k equal to 1. Suppose we will have fewer than 100 values stored in this data structure. If we considered only the Big-O part, f(x) would be the clear winner, but in practice z(x) would be the better choice for our application.
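To put toy numbers on that idea (these are mine, invented for illustration): suppose the O(n) structure's real cost is f(n) = 1000000 · n steps, while the O(n^2) one costs z(n) = n² steps. Then for n = 100,

\[
f(100) = 10^8 \qquad \text{vs.} \qquad z(100) = 10^4,
\]

so the "asymptotically worse" structure is faster by four orders of magnitude, and it stays ahead until n reaches 1,000,000.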

You put it into words way better than I did, orizvi :) I missed the "upper bound" part, and that's definitely a term anyone studying algorithms should know!

Thanks... I just wish I'd figured it out a few semesters ago, when I was failing exams over it.

I had the exact same problem as the people above. Teachers for some reason like to explain it in the most confusing way possible.

