How to analyze run time complexity in code in a trivial way


How would I know, in a trivial way, whether an algorithm's run-time is one of the four types (quadratic, linear, constant, or logarithmic) based on the code itself and the definitions of the four terms?

I did some research on the definitions of the four types of run-time, but I am putting them in my own words, so hopefully my phrasing is correct:

1) an algorithm's run-time is constant if the run-time is the same no matter the size of the algorithm's input

I have a question about this one: if the run-time is constant, why is it O(1)? Why choose 1 of all numbers?

2) an algorithm's run-time is linear if the run-time increases in direct proportion to the input size

3) an algorithm's run-time is logarithmic if the run-time increases proportionally to the logarithm of the input size. I have a question about this too: are we assuming the log of the input is to base 10, or is the base of the log arbitrary?

4) an algorithm's run-time is quadratic if the run-time increases proportionally to the square of the input size.

On a side note, what is the difference between lg and log?

Here's an example:

log log n = _(log log(n^3))  <- I think the example is taking the log of log n

What should n be assumed to be in order to evaluate the statement in a trivial way?

On a side note, I'm not familiar with this notation 'lg'. What does lg mean?

Example:

n*lg n = _(n*lg lg n)


a trivial way

The only way I see is to change the input set and check the run-time performance. The standard way would be to analyse the algorithm on paper; it is not without reason that this is part of the theoretical branch of computer science. :)
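If you go the empirical route, a minimal C sketch might look like this (run_algorithm() here is just a made-up quadratic stand-in for whatever you actually want to measure): double the input size each round and see whether the time stays flat, roughly doubles, or roughly quadruples.


#include <stdio.h>
#include <time.h>

/* Made-up stand-in for the algorithm you want to measure (here: a quadratic double loop). */
static long run_algorithm(int n)
{
    long work = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            work++;
    return work;
}

int main(void)
{
    /* Double the input size each round and watch how the time grows:
       roughly constant -> O(1), doubling -> O(n), quadrupling -> O(n^2), ... */
    for (int n = 1000; n <= 16000; n *= 2) {
        clock_t start = clock();
        long result = run_algorithm(n);
        double seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
        printf("n = %6d: %.3f s (result = %ld)\n", n, seconds, result);
    }
    return 0;
}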

no loops with no recursive calls = constant

1 loop without recursive calls = linear

Others usually need deeper analysis, but most of the time it's not hard.

  1. O is some constant, so O(1) means constant*1;
  2. Yes, but cache & other things might make it slower than linear when you compare different amounts of data;
  3. It doesn't matter; the point is that it's between constant and linear. It also means the more data you add, the smaller the difference in time will be (assuming just log n). Binary search, sketched below, is the classic example;
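To make the logarithmic case concrete: binary search halves the remaining range on every step, so doubling the input only adds about one more iteration. A minimal sketch in C (it assumes the array is already sorted):


/* Binary search in a sorted array: the range is halved on every iteration,
   so the loop runs at most about log2(n) times -> O(log n). */
int binary_search(const int *a, int n, int key)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == key)
            return mid;        /* found */
        else if (a[mid] < key)
            lo = mid + 1;      /* discard the lower half */
        else
            hi = mid - 1;      /* discard the upper half */
    }
    return -1;                 /* not found */
}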

lg is logarithm with base 10, ln is with base e.

O is some constant, so O(1) means constant*1;

O doesn't mean "some constant". Informally, O(f) means "grows no faster than f, ignoring constant factors and lower-order terms" (there is a formal definition involving behaviour for large n). So if you have an algorithm whose running time is (2n)^2 + n log(n), the asymptotic complexity is O(n^2).
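For completeness, the usual formal definition: f(n) is in O(g(n)) if there exist constants c > 0 and n0 such that f(n) <= c * g(n) for all n >= n0. In the example above, (2n)^2 + n log(n) = 4n^2 + n log(n) <= 5n^2 once n is large enough, so the whole thing is O(n^2).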

We use O(1) by convention: since 1 is independent of the problem size n, you could choose any nonzero finite constant and get the same asymptotic complexity as per the definition. 1 just seemed the most meaningful choice. If another number had been chosen, people would be asking "why O(3)?", so...


There is really only one way to do it:

First, analyze your code piece by piece and write down the complexity using Big-O notation.

Most loops are easy to analyze:


for(i = 0; i < n; i++) {
   var++; // O(1)
} // O(n) * O(1)

for(i = 0; i < 2 * n; i++) {
   f(); // f is in O(log(n))
} // O(2*n) * O(log(n))

for(i = 0; i < n; i++) {
   for(j = 0; j < n; j++) {
      for(k = n; k > 1; k /= 2) {
         var++; // O(1)
      } // O(log(n))
   } // O(n) * O(log(n))
} // O(n) * O(n) * O(log(n))

But there are also trickier cases:


for(i = 1; i < n; i *= 2) {
   g(i); // g in O(i)
}

Here you can't simply multiply the complexities because g depends on i.

The complexity of the loop's body doubles with every iteration: 1, 2, 4, 8, 16, ... and there are about log(n) terms in the series. The total work is therefore the log(n)-th partial sum of a geometric series with a common ratio of 2.
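Written out: 1 + 2 + 4 + ... + 2^k = 2^(k+1) - 1, and since the largest term 2^k is less than n, the sum is less than 2n. So the whole loop is O(n), noticeably better than the O(n*log(n)) you would get by naively multiplying the log(n) iterations by the O(n) cost of the most expensive one.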

When using recursion, things get even weirder. For most divide and conquer algorithms, you can use the master theorem.
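For example, here is a minimal merge-sort sketch in C (error handling omitted): it makes two recursive calls on halves and does linear work to merge, so the recurrence is T(n) = 2*T(n/2) + O(n), which the master theorem resolves to O(n*log(n)).


#include <stdlib.h>
#include <string.h>

/* Merge-sort skeleton: two recursive calls on halves plus O(n) merging,
   so the recurrence is T(n) = 2*T(n/2) + O(n) = O(n * log(n)) by the master theorem. */
static void merge_sort(int *a, int n)
{
    if (n < 2)                       /* base case: O(1) */
        return;

    int mid = n / 2;
    merge_sort(a, mid);              /* T(n/2) */
    merge_sort(a + mid, n - mid);    /* T(n/2) */

    /* Merge the two sorted halves through a temporary buffer: O(n). */
    int *tmp = malloc(n * sizeof *tmp);
    int i = 0, j = mid, k = 0;
    while (i < mid && j < n)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < n)   tmp[k++] = a[j++];
    memcpy(a, tmp, n * sizeof *tmp);
    free(tmp);
}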

Then, simplify using the rules of the Big-O notation:

c*O(f) = O(c*f) = O(f)

O(f)*O(g) = O(f*g)

O(O(f)) = O(f)

O(f) + c = O(f+c) = O(f)

O(f) + O(f) = O(f)

O(f+g) = O(max(f,g))
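For example, applying these rules to the annotated loops above:

O(n) * O(n) * O(log(n)) = O(n * n * log(n)) = O(n^2 * log(n))

O(2*n) * O(log(n)) = O(2 * n * log(n)) = O(n * log(n))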

So, yeah. I don't think there is a trivial way.

I have a question about this too: are we assuming log of some input to the base 10 or is the base of the log arbitrary?

The base of the logarithm doesn't matter when analyzing the asymptotic complexity, because it's just a constant factor that can be ignored.

O(log_2(n)) = O(log(n) / log(2)) = O(log(n) * 1.4426...) = O(log(n))   (taking log here as the natural log, so 1 / log(2) = 1.4426...)


On a side note what is the difference between lg and log?

Different subjects regularly deal with different bases and so tend to be lazy and create shorthand notation.

'log' written on its own is usually assumed to be base 10 if a base isn't specified, 'ln' is base e, and 'lg' is, I think, commonly base 2, but that notation is fairly informal (that's my understanding).

Where I live we learned that ln is base e, lg is base 10 and ld is base 2. lg being (mis-)used as base 2 is new to me; inside O() I just assumed people picked an arbitrary logarithm so they could write one character less. This is what I just found while searching for it:

http://mathworld.wolfram.com/BinaryLogarithm.html

http://en.wikipedia.org/wiki/Binary_logarithm

There is not just the O notation; there are other related notations like little-o, Theta, big Omega and little omega, which are not as popular although they can give other clues about complexity. http://en.wikipedia.org/wiki/O_notation

It can also be frustrating to find two algorithms advertised at the same complexity when one is secretly much slower. The people who calculated the complexity must have had more exact data in order to prove it, only to throw half of it away and leave everyone else speculating about which one is actually better. I'd very much like to know whether one is n^2 - 100*n*log n and the other 10*(n^2 + 1000*n*log n), or whether both are n log n on average but one is best case n and worst case n log n while the other is best case n log n and worst case n^2.

I would think the 1 comes from n^0 = 1. Any other number would be substitutable, but arbitrary. As far as evaluating goes, you could also walk through a small data set and see how many times it loops. Basically, how many times does the innermost loop run in the worst-case scenario? Then compare that number to your n. Though I am not very well versed in Big O, so there may be a reason why this would not work.
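As a sketch of that counting idea in C (the nested loop is just the n-by-halving example from earlier in the thread): bump a counter in the innermost body and compare it against the formula you suspect.


#include <stdio.h>
#include <math.h>

int main(void)
{
    /* Count how many times the innermost body runs for a few input sizes
       and compare the count against the suspected formula (here n * log2(n)). */
    for (int n = 1000; n <= 16000; n *= 2) {
        long count = 0;
        for (int i = 0; i < n; i++)
            for (int k = n; k > 1; k /= 2)
                count++;                  /* innermost body */
        printf("n = %6d  count = %9ld  count / (n * log2(n)) = %.2f\n",
               n, count, count / (n * log2((double)n)));
    }
    return 0;
}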



For a very large number of the algorithms you'll end up writing in everyday work, it's pretty easy. It tends to simply be a matter of counting up the number of times a loop iterates and possibly accounting for nested loops.

O(n):


for (int i = 0; i < n; ++i)
{
    ...
}


O(n^2):


for (int i = 0; i < n; ++i)
{
    for (int j = 0; j < n; ++j)
    {
        ...
    }
}


The same method can be extended to include any function calls made within the loop bodies. Say, for instance:


for (int i = 0; i < n; ++i)
{
    for (int j = 0; j < n; ++j)
    {
        f(j, n); // f() is O(n).
    }
}

The function call is basically another nested loop, hence O(n^3).

But this kind of analysis really only works for very simple algorithms that do simple iteration. Once you get into recursion, you need to start doing some more math to be able to come to asymptotically tight analyses. For some illustrative examples, take a look at some of the material here: http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/

In particular:

http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/notes/99-recurrences.pdf

http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/notes/98-induction.pdf

http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/notes/00-intro.pdf

http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/notes/05-dynprog.pdf

http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/notes/06-sparsedynprog.pdf

http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/notes/14-amortize.pdf

http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/notes/16-unionfind.pdf

http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/notes/27-lowerbounds.pdf

The first two links are particularly important/relevant. The rest sort of just show you the analyses in action. I would highly recommend knowing how to solve for recurrences. You cannot hope to analyze anything but the simplest algorithms without knowing how to solve for recurrences.
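For a quick taste of why loop counting stops working, take naive recursive Fibonacci: there is no loop to count at all, and the recurrence T(n) = T(n-1) + T(n-2) + O(1) grows exponentially (roughly O(1.618^n)), which you only see by actually solving the recurrence.


/* Naive recursive Fibonacci: nothing to "count" loop-wise, yet the recurrence
   T(n) = T(n-1) + T(n-2) + O(1) grows exponentially, roughly O(1.618^n). */
long fib(int n)
{
    if (n < 2)
        return n;                     /* base case: O(1) */
    return fib(n - 1) + fib(n - 2);   /* two recursive calls */
}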

On a side note what is the difference between lg and log?

Asymptotically, none. Logs to any base are big theta of each other.

It can also be frustrating to find two algorithms advertised at the same complexity when one is secretly much slower. The people who calculated the complexity must have had more exact data in order to prove it, only to throw half of it away and leave everyone else speculating about which one is actually better. I'd very much like to know whether one is n^2 - 100*n*log n and the other 10*(n^2 + 1000*n*log n), or whether both are n log n on average but one is best case n and worst case n log n while the other is best case n log n and worst case n^2.

But, this.


