Jump to content
  • Advertisement
Sign in to follow this  
guangwang1983

Fast string look up

This topic is 3882 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hi, I have a collection of around 3000-4000 items with string keys for my application. I am trying to come up with a method that given a string key, it will return me the item within no more than 5 millisec. What is the best way I can do it? I was thinking about STL map, but I was told that it doesnt do any hashing so it could be very slow. Any help will be much appreciated.

Share this post


Link to post
Share on other sites
Advertisement
Try a hashtable or hashmap. It's not part of the current standard, but some STL vendors have included an implementation as an extension. I would guess boost might have something along those lines as well.

EDIT: 5 milliseconds is a *long* time for a simple lookup. Have you actually tried the STL map? You are assuming it's slow, but have you timed it?

Share this post


Link to post
Share on other sites
Quote:
Original post by guangwang1983 it will return me the item within no more than 5 millisec. What is the best way I can do it?


What kind of hardware are you running?

If it's anything close to modern PCs, then you should be able to do hundreds of lookups *per* millisecond using std::map or std::hash_map classes. YMMV depending on string size.

Of course, then there's different hash_map implementations, there was a very long thread around here regarding tries, and there's other lookups, including custom containers, or even direct mappings.

Quote:
I was thinking about STL map, but I was told that it doesnt do any hashing so it could be very slow.


How about testing it yourself? Surely the time needed to post this question, wait for responses, and then evaluate different perspectives (several hours) will take more effort and time than simply writing a 5 line test to see the order of magnitude?

Share this post


Link to post
Share on other sites
I'm with mossmoss... you could potentially just iterate through 3000-4000 strings and find the one you want in 5ms. That's a long time, and std::map should suffice.

Share this post


Link to post
Share on other sites
Quote:
Original post by Zahlman
For strings, you could also try a trie...


Sure, but let's try something first...

#include <map>
#include <string>
#include <iostream>
#include <fstream>

#include <boost/timer.hpp>

typedef std::map< std::string, int > StringMap;


int main(int argc, char* argv[])
{
StringMap keys;
std::ifstream f("d:\\words.txt");

int c(0);
std::string line;
while (std::getline(f, line)) keys.insert(std::make_pair(line, c++));


boost::timer t;
c = 0;
int n = 1000000;
for (int i = 0; i < n; i++) {
if (keys.find("hello") != keys.end()) c++;
}
double t1 = t.elapsed();
std::cout << "Keys found: " << c << std::endl;
std::cout << "Elapsed time: " << (t1 * 1000) << " ms" << std::endl;
std::cout << "Lookups per ms: " << (n / t1 / 1000) << std::endl;


return 0;
}


Using words.txt, which contains ~21000 entires.

The results:
Quote:

Keys found: 1000000
Elapsed time: 875 ms
Lookups per ms: 1142.86


Or, under 1 microsecond per lookup.... 6 minutes coding time, including google lookup.

Share this post


Link to post
Share on other sites
Also, if you're doing lots of string comparisons in your program, it can be useful to create a 'fixed-string' class, or some other kind of 'global string table' so that you can replace all of your string comparisons with simple pointer comparisons.

e.g. Here's a quick idea:
class FixedString {
public:
//copy constructor, operator=, operator==, operator bool etc...
private:
friend class StringTable
FixedString();
FixedString(const std::string& s) : ptr(&s) {}
const std::string* ptr;
};
class StringTable
{
public:
typedef std::set<std::string> Table;
static FixedString get( const std::string& s )
{
static Table table;
pair<Table::iterator, bool> r = table.insert( s );
return FixedString(*r.first);
}
};


StringTable will only keep a single copy of any string registered with it. New FixedString's are immutable and can only be created by StringTable, and FixedString is just a wrapper for a raw pointer so it can compare itself to other FixedStrings without having to actually do a string compare.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!