Jump to content
  • Advertisement
Sign in to follow this  
Kenneth Godwin

String Searching Algorithms and practical performance...

This topic is 3674 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

While other algorithms that I know of (e.g. Apostolico Giancarlo, Boyer Moore) they don't beat Horspool for purposes of searching large blocks of text for small (five or less character) patterns. Does anyone know a algorithm that is not theoretically faster, but actually is when dealing with small patterns? The following is a couple 'test results' for the same set of data. 48297 words, 433017 characters, two sets of tags per word. The IndexOf Method took 451.3765785 seconds to process. The BoyerMoore Method took 252.5482641 seconds to process. The Horspool Method took 115.2101059 seconds to process. The TurboBoyerMoore Method took 241.7624532 seconds to process. The ApostolicoGiancarloMatch Method took 241.9990094 seconds to process. 6000 words, 53792 characters, two sets of tags per word. The IndexOf Method took 5.5662522 seconds to process. The BoyerMoore Method took 2.1252076 seconds to process. The Horspool Method took 1.004881 seconds to process. The TurboBoyerMoore Method took 2.1488536 seconds to process. The ApostolicoGiancarloMatch Method took 2.1401635 seconds to process.

Share this post


Link to post
Share on other sites
Advertisement
What is the usage pattern/characteristics of your search needs? Will the large text be changing frequently or relatively static: Would preprocessing the large text be benificial or would it have to be done every search anyways?

Share this post


Link to post
Share on other sites
Quote:
Original post by Extrarius
What is the usage pattern/characteristics of your search needs? Will the large text be changing frequently or relatively static: Would preprocessing the large text be benificial or would it have to be done every search anyways?


I'm just looking for the fastest algorithm available for dealing with small patterns (5 characters or less).

Preprocessing won't help since the end result of the search is saved atm. (e.g. If an identical search occurs and the source text is the same, it just re-displays the same result rather than attempting another search). The source text is also modified frequently so unless the preprocessing would save the time over the course of 2-3 searches, it wouldn't help.

At this point, I'm pretty sure Horspool is the only one that is practical for what I'm doing. I just figured I'd ask GameDev in case someone would go 'Oh this is a better idea'.





Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!