How to code and use commonly used algorithms

Started by
65 comments, last by Nice Coder 19 years, 2 months ago
Greetings In this thread place little micro-tutorials on how to code and use commonly used algorithms. For eg. A simple sorting algorithm, a bubble sort. What to use it for What not to use it for Why you should use it Problems with using it ect. I will edit my post to add the algorithms we have already used. Please note that double posts are allowed. Current index: Absolute function Binary search x2 Hash table Simple case boyer-moore string search Memory manager Bitreversing algorithm Recursive Floodfill Algoritithms awaiting a tutorial Complex Boyer-Moore string search algorithm Binary heap Quicksort From, Nice coder [Edited by - Nice Coder on January 9, 2005 4:58:43 AM]
Click here to patch the mozilla IDN exploit, or click Here then type in Network.enableidn and set its value to false. Restart the browser for the patches to work.
Advertisement
Binary search

It searches a sorted array of values, for a particular value in o(log2 n) time (worst case).

This is a very fast search algorithm.
For eg, if you have an array that is 18,446,744,073,709,551,616 elements big, it will find any given element in at most 64 iterations.

Why is it that fast?

It is that fast, because at every comparison, it divides the search space in half. This means that after 2 iterations, the search space is 1/4 of the origional. this allows it to not search most of the array.

You should use it whenever you have an array, and you need to find a given element quickly. for eg, in a search engine, or pretty much anything.

psudocode:
Function binarysearch(array(), Element)max = upperbound(array)min = lowerbound(array)doGuess = makeinteger((max - min) / 2)if guess is negitive then return falureif array[guess] = element then	return guesselseif array[guess] > Element then	max = guess - 1elseif array[guess] < element then	min = guess + 1else	return falureend ifloop forever	


From,
Nice coder

[Edited by - Nice Coder on January 2, 2005 3:53:12 AM]
Click here to patch the mozilla IDN exploit, or click Here then type in Network.enableidn and set its value to false. Restart the browser for the patches to work.
Well, I'd just point you to your CRLS or your Sedgewick algorithms books.
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian W. Kernighan
There is also the sometimes-helpful National Institute of Standards and Technology's Dictionary of Algorithms and Data Structures
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
Quote:Original post by Nice Coder
Greetings

In this thread place little micro-tutorials on how to code and use commonly used algorithms.

For eg. A simple sorting algorithm, a bubble sort.

What to use it for

What not to use it for

Why you should use it

Problems with using it

ect.



In C++ no doubt: use STL and their algorithm.
For example if you use std::set data structure (in general it is implemented as a binary tree) the method set::insert can be used to insert and/or find an element in O(log2 N) without need to reinvent the wheel.

If you need to sort a sequence use the algorithm sort: no need to specify the type of the sequence (vector, list,...). However if the sequence is a vector the algorithm is O(NlogN ).

Use your own algorithm leads to bad programming...

However if you like to do this I report the classic merge-sort algo for vectors ( O(NlogN) )

// pseudo C code void Sort(TYPE* vector, INT lo0, INT hi0){  if (lo0 >= hi0){    	return;  }	  INT  lo = lo0;  INT  hi = hi0;  INT mid = (lo + hi)/2;		  while (lo < hi)  {    for(;lo<hi && vector[lo]<vector[mid];lo++);     for(;lo<hi && vector[mid]<vector[hi];hi--);    if (lo < hi)    {        TYPE etmp  = vector[lo]:        vector[lo] = vector[hi];        vector[hi] = etmp; 	lo++;        hi--;    }  }	  if (hi < lo)  {   INT tmp = lo; lo = hi; hi = tmp;  }	  Sort(vector, lo0, lo);  Sort(vector, lo == lo0 ? lo+1 : lo, hi0);	}
Thats why this is here, so people can have a set of good algorithms, programmed in a language they can use, which contains more algorithms then then the stl.

From,
Nice coder
Click here to patch the mozilla IDN exploit, or click Here then type in Network.enableidn and set its value to false. Restart the browser for the patches to work.
A fast string search.

Time complexity o(n / m) where n is the length of the string that your searching, and m is the length of the string your searching for.

Simple case, where each letter is used only once.

Psudocode:
function stringsearch(text, find)lof = lengthof(find)index = lofdoif the index'th carachter of text is equal to the last character of find then	check backwards, and see if the index - 2'th character of text is the same as the second last character of find.	Continue looping until you have checked the entire string. (Find), against the substring(string, index - lof,lof)	If you find an inequality then go back to the index += statement	otherwise, you have found your index = index + lofloop until index is larger then the length of the text


More complex case to follow

From,
Nice coder
Click here to patch the mozilla IDN exploit, or click Here then type in Network.enableidn and set its value to false. Restart the browser for the patches to work.
Your algo is O(n) where n is the sequence lenght because in the >>WORST<< case you do n/m * m = n char tests.
Otherwise if m = n (it is reduced to a string compare) you would have O(1) that is not the case!!!

For those who use std::string there is

starting_index = string.find(substring);


I'm afraid that your algorithm does not work in this case

search for: aaa (len = 3)
in : bbaaa (len = 5)

you start from index=3 and go back -> a[OK], a[OK], b[BAD]
increase index = 3 + 3 = 6 -> 6>5 -> exit -> NOT FOUND -> WRONG!!!

The correct algo is

int search(const text, str){  for(int i=0; i<len(text); i++){     if(string_compare_maxnchar(text+i, str, len(str))==equals){         return(i);     }  }  return(-1);}


where string_compare_maxnchar(const s1,const s2, int n) compares max n char (strncmp in C) -> O(n*m) = O(n) in practical WORST case
Quote:Original post by Nice Coder
Function binarysearch(array(), Element)max = upperbound(array)min = lowerbound(array)doGuess = makeinteger((max - min) / 2)if guess is negitive then return falureif array[guess] = element then	return guesselseif array[guess] > Element then	max = guess - 1elseif array[guess] < element then	min = guess + 1else	return falureend ifloop forever	
A newbies guide to the binary search algorithm: All you thought you knew, and more...

You've made a very common major performance mistake (or two) here.
You've put the most unlikely if case first. The equal case should be the last case. Also since you have then already determined if it was < or > and acted appropriately then it must be equal so there is no test required. This cuts down the number of compares to about half what it was.

In fact, it's actually better if you leave the = part out of the loop entirely, since it's so rare, and only stop when min = max (at which point you will be at the item to compare for equality).
This all cuts it down to 1 compare per iteration, rather than the three you have above (try the biggest item). Performance also levels with exactly logn comparisons each time, rather than between 1 and 3logn (about 2.5logn on average)

Taking the = comparison out of the loop also has the unexpected advantage of making sure that you find the first occurance of an item which has duplicates in the array, which makes it easier to go through all duplicates (should that be of any usefullness in a given situation)
Or, if the item is known to be always present and you just what to find it's index, then the equals test can be removed entirely!

Now the C++ code...
template <class T, class Size, class TKey>Size BinarySearch(const T a[], Size n, const TKey &find) {	Size l = 0, r = n; //r starts at the item past the last one	while (l < r) {		Size m = (l + r) / 2; //midpoint		if (a[m] < find)			l = m + 1;		else			r = m;	}	if (a[l] == find)		return l;	else		return NOT_FOUND;}

Sure both are O(logn) so both are relatively fast, but a little extra thought effortlesssly gives you another 2x speedup + added niceities. Feel free to use the code above in your own programs. You may need to overload the < and == operators.
[smile]

No offence intended Nice Coder, just enlightening everybody furthur.

[Edited by - iMalc on January 2, 2005 2:35:17 PM]
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
What to use it for: flipping 2^n bits with n = {0,1,2,3,4,5...}

What not to use it for: everything else

Why you should use it: because it's quick and compact

Problems with using it: none

How to use:
//example: flip last 16 bits of int x (starting at LSB)
int x = 0xfb10a386;int i;x = ((x & 0x00FF) << 8) | ((x & 0xFF00) >> 8);x = ((x & 0x0F0F) << 4) | ((x & 0xF0F0) >> 4);x = ((x & 0x3333) << 2) | ((x & 0xCCCC) >> 2);x = ((x & 0x5555) << 1) | ((x & 0xAAAA) >> 1);

For a how-it-works, see:
http://www.gamedev.net/community/forums/topic.asp?topic_id=291109

Bas

This topic is closed to new replies.

Advertisement