std::map with many elements

Started by
9 comments, last by Evil Steve 16 years, 4 months ago
Hi, i am using std::map for many elements almost 20.000. But the insertion of those elements seems to be a litlle bit slow... Is there anything like std::string::reserve method on map? I already looked the references but haven´t found any hint. Ideas?
Advertisement
std::map is a red-black tree so there is no way to "reserve space" ahead of time like std::string or std::vector do. If you want faster lookup you want to use a hash table. This is available as an extension to the standard and is named hash_map. Whether or not this is available to you depends on whether the vendor of your compiler/tools has included extensions to the standard library.
Quote:Original post by fpsgamer
std::map is a red-black tree so there is no way to "reserve space" ahead of time like std::string or std::vector do. If you want faster lookup you want to use a hash table. This is available as an extension to the standard and is named hash_map. Whether or not this is available to you depends on whether the vendor of your compiler/tools has included extensions to the standard library.


While the standard library may not have a method to reserve memory for a tree it is certainly possible for the underlying implementation of a tree be a contiguous chunk of memory so it is possible for tree's to have reserve methods.

As for the original poster, what are your keys? Possibly changing to a 32 bit valued key could speed up the process or even implementing a faster allocator for the tree. It also may be possible you aren't using the best possible data structure. With a little bit more context we can help you figure out what your best data structure could be.

-= Dave
Graphics Programmer - Ready At Dawn Studios
While you can't tell std::map to preallocate, you CAN give it an allocator which is more efficient than the default. Check out boost::pool for some good pre-written allocators to give it.
What type is the key type and the mapped type?
If they are user-defined classes, what do their copy-constructors look like?
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
Don't go blindy switching to a hash table - they can be significantly slower in the average case if you choose a bad hash functions and they can also cause security risks in certain circumstances.
And make sure you're not building in debug mode, if using MSVC.
Ok, i will decribe what my program is doing: It takes a list of words from a text file 20.000~500.000 words, and maps them to a map that the key is a string and the value is a int, the int holds how much times the word appears on the text.
We've gotten some code out of this poster where this same problem was posted on another message board. It seems only fair that it gets posted here too.
std::map<std::string, int> dictionary;boost::regex expression("([a-z]+)", regex::perl|regex::icase);...


	ifstream file(filename);	cmatch result;	char data[10000];	while (!file.eof())	{		file.getline(data, 10000);		string line(data);		sregex_token_iterator begin(line.begin(), line.end(), expression, 1);		sregex_token_iterator end;		while (begin != end)		{			dictionary[(*begin++)]++;		}	}
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
Strings and maps really don't get along as well as one might hope in a situation like this. A string comparison operation needs to look at the characters of the strings in order until it finds a difference, so strings with long common prefixes take more time to compare. With a map, these strings end up being close to each other in the tree so that comparisons with several close matches might be needed to locate the key.

Since the alphabetical order isn't important in this situation, a faster comparison than lexicographic order can be used. A key that has a hash code precomputed in addition to the actual string could have a very fast comparison oprator implemented by sorting first based on hash value and then only comparing strings when the hashes are equal. Or just use a hash table at this point.

This topic is closed to new replies.

Advertisement