Sign in to follow this  

std::map with many elements

This topic is 3665 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hi, i am using std::map for many elements almost 20.000. But the insertion of those elements seems to be a litlle bit slow... Is there anything like std::string::reserve method on map? I already looked the references but haven´t found any hint. Ideas?

Share this post


Link to post
Share on other sites
std::map is a red-black tree so there is no way to "reserve space" ahead of time like std::string or std::vector do. If you want faster lookup you want to use a hash table. This is available as an extension to the standard and is named hash_map. Whether or not this is available to you depends on whether the vendor of your compiler/tools has included extensions to the standard library.

Share this post


Link to post
Share on other sites
Quote:
Original post by fpsgamer
std::map is a red-black tree so there is no way to "reserve space" ahead of time like std::string or std::vector do. If you want faster lookup you want to use a hash table. This is available as an extension to the standard and is named hash_map. Whether or not this is available to you depends on whether the vendor of your compiler/tools has included extensions to the standard library.


While the standard library may not have a method to reserve memory for a tree it is certainly possible for the underlying implementation of a tree be a contiguous chunk of memory so it is possible for tree's to have reserve methods.

As for the original poster, what are your keys? Possibly changing to a 32 bit valued key could speed up the process or even implementing a faster allocator for the tree. It also may be possible you aren't using the best possible data structure. With a little bit more context we can help you figure out what your best data structure could be.

-= Dave

Share this post


Link to post
Share on other sites
While you can't tell std::map to preallocate, you CAN give it an allocator which is more efficient than the default. Check out boost::pool for some good pre-written allocators to give it.

Share this post


Link to post
Share on other sites
Don't go blindy switching to a hash table - they can be significantly slower in the average case if you choose a bad hash functions and they can also cause security risks in certain circumstances.

Share this post


Link to post
Share on other sites
Ok, i will decribe what my program is doing: It takes a list of words from a text file 20.000~500.000 words, and maps them to a map that the key is a string and the value is a int, the int holds how much times the word appears on the text.

Share this post


Link to post
Share on other sites
We've gotten some code out of this poster where this same problem was posted on another message board. It seems only fair that it gets posted here too.
std::map<std::string, int> dictionary;
boost::regex expression("([a-z]+)", regex::perl|regex::icase);
...


	ifstream file(filename);
cmatch result;
char data[10000];

while (!file.eof())
{
file.getline(data, 10000);

string line(data);

sregex_token_iterator begin(line.begin(), line.end(), expression, 1);
sregex_token_iterator end;

while (begin != end)
{
dictionary[(*begin++)]++;
}
}

Share this post


Link to post
Share on other sites
Strings and maps really don't get along as well as one might hope in a situation like this. A string comparison operation needs to look at the characters of the strings in order until it finds a difference, so strings with long common prefixes take more time to compare. With a map, these strings end up being close to each other in the tree so that comparisons with several close matches might be needed to locate the key.

Since the alphabetical order isn't important in this situation, a faster comparison than lexicographic order can be used. A key that has a hash code precomputed in addition to the actual string could have a very fast comparison oprator implemented by sorting first based on hash value and then only comparing strings when the hashes are equal. Or just use a hash table at this point.

Share this post


Link to post
Share on other sites
From my engine code:

u32 hash(const char* _sz)
{
u32 nRet = 5381;
while(*_sz)
nRet = (nRet << 5) + nRet + ((*_sz++)&0xDF); // Mask out bit 5 for case-insensitivity
return nRet;
}


You might want to remove the mask if you want the hash to be case sensitive, or you want to correctly handle extended ASCII.

Share this post


Link to post
Share on other sites

This topic is 3665 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this