what's the fastest search algorithm

Started by
15 comments, last by Brother Bob 12 years, 1 month ago
How about doing a sort first and then a search?
Wouldn't that, on average, be faster than going through the list linearly?
Sorting is O(n log n); searching is O(n) unsorted or O(log n) sorted. Sorting only gives you a better total complexity if you're going to do more than about log n searches on the same data.

The AP wanted to plug interpolation search.
You're talking about two separate operations, which would be less efficient.
It would have to run through the complete list first to sort and then again to search.
Even at O(n log n) + O(n), or O(log n) + O(1), it'd still be less efficient, and the only way you'll get that kind of performance (O(1)) is from something like hashing anyway.
Plus, if it's a tree and you sort it first, it'll simply degenerate into a linked list, which is adverse to the goal here.

Most of the searching algorithms I've seen around the net cater to text searches: SFS, Boyer-Moore, Knuth-Morris-Pratt, etc.

It'd be best to get some cheap books on Amazon.com that have multiple examples of algorithms. Your best friend in this field (programming, development) is physical, paper text, and luckily this subject is popular and has aged quite well, so you can find a lot of good books cheaply.
That was the best lesson I learned from a degree in computer science.

EDIT: Trap beat me to it. Got my reply...
Daniel Miller - http://formulaic.net
Sorting Algorithms [grin]
Optimal searching has worst-case O(log n) time complexity when we assume a sequential model of computation; a parallel model allows worst-case O(1). The key thing to keep in mind is that when people make statements about optimality, they are inherently assuming a model of computation, typically a sequential random-access machine. So when somebody says that optimal comparison sorts have O(n log n) complexity, this is only true on a sequential machine; comparison networks can achieve better performance.

On the other hand, for your purposes assuming a sequential RAM computer works just fine.
If you are going to search through the vector many times, it would be better to use Merge Sort and then use Binary Search to find your terms. The thing is that you only need to run Merge Sort once (unless you are going to put more terms into the list). This means the time for that part of the program is O(n log n) + k·O(log n), where k is the number of times you search through the list.
Don't bring six-year-old threads back from the dead.

This topic is closed to new replies.
