[C++] Use map::operator[] for inserting

Started by
16 comments, last by T1Oracle 18 years, 6 months ago
I want some-how to subclass STL map<SomeType, std::string> for inserting char* and std::string. This is the declaration of map::operator[]

	mapped_type& operator[](const key_type& _Keyval)
		{	// find element matching _Keyval or insert with default mapped
		iterator _Where =
			insert(value_type(_Keyval, mapped_type())).first;
		return ((*_Where).second);
		}

I mean, I want to convert char* to std::string when using both of

MyMap[AAA] = "asdfasdg";
MyMap[AAA] = some_scl_string;

ai-blog.org: AI is discussed here.
Advertisement
Doesn't the "constant in quotes" get automatically converted to std::string anyway?
i'm pretty sure it does
(at least when i use std::map)
Quote:Original post by Nitage
Doesn't the "constant in quotes" get automatically converted to std::string anyway?


Well, yes. Sorry.
But when I'm writing the following: (and using it in previous circumstances)
  map<SomeType, string>::mapped_type& operator[](const SomeType SomeType1)  {    return MyMap[SomeType1];  }

me receiving the error C2679.
ai-blog.org: AI is discussed here.
Sorry to everybody!!
Just have found a mistake in my main code.
I was accessing to MyMap[AAA], but MyMap was a pointer to map<...>
ai-blog.org: AI is discussed here.
I solved that by making my own string class for use in STL maps. It's very simple and can convert from "str *" to "std:string" in it's assingment operator. It also can output those forms using cast operators. Plus it has a "printf" function for generating string using the printf format. I created this class for STL maps since I read somewhere that the "std:string" class was slow in STL maps and since this thing is so simple I figured it would be faster. I also created for all the string conversions that it handles so easily. Here's my code:
#pragma once#include<string>#include <stdarg.h>namespace strLib{	class qckstr	{	public:		// void constructor		qckstr()		{			m_length=0;			m_pStr=NULL;		}		// copy constructor		qckstr(const qckstr& qStr)		{			m_length=0;			m_pStr=NULL;			Set(qStr.m_pStr);		}		// construct qckstr from a C style string		qckstr(const char *pStr)		{			m_length=0;			m_pStr=NULL;			Set(pStr);		}		// destructor		~qckstr()		{			Destroy();		}		// returns the length of the string		inline unsigned Length() const		{			return m_length;		}		// dealocates memory and resets string		void Destroy()		{			delete [] m_pStr;			m_length=0;			m_pStr=NULL;		}		// resets string to the value of C style string		bool Set(const char *pStr)		{			// check user pointer			if (!pStr||!*pStr)				return false;			// check local pointer			if (m_pStr)				Destroy();			// allocate space and copy string			m_length=strlen(pStr);			m_pStr=new char[m_length+1];			if (!m_pStr)			{				m_length=0;				return false;			}			strcpy(m_pStr,pStr);			return true;		}		// uses the printf syntax to format a string, returns a reference to the result		qckstr &Sprintf(const char *format,...)		{			Destroy();			va_list arglist;			va_start(arglist,format);			m_length=_vsnprintf(NULL,0,format,arglist);			if (!m_length)				return *this;			m_pStr=new char[m_length+1];			if (!m_pStr)			{				m_length=0;				return *this;			}			va_start(arglist,format);			vsprintf(m_pStr,format,arglist);			va_end(arglist);			return *this;		}		// assigns a C style string to qckstr		inline qckstr &operator=(const char *pStr)		{			if (m_pStr!=pStr)				Set(pStr);			return *this;		}		// assigns a stl string to qckstr		inline qckstr &operator=(const std::string &stlStr)		{			if (m_pStr!=stlStr)				Set(stlStr.c_str());			return *this;		}		// assigns a qckstr string to qckstr		inline qckstr &operator=(const qckstr &qStr)		{			if (this!=&qStr)				Set(qStr.m_pStr);			return *this;		}		inline bool operator<(const qckstr &qStr) const		{			return (strcmp(m_pStr,qStr.m_pStr)<NULL);		}		inline bool operator>(const qckstr &qStr) const		{			return (strcmp(m_pStr,qStr.m_pStr)>NULL);		}		inline bool operator==(const qckstr &qStr) const		{			return (strcmp(m_pStr,qStr.m_pStr)==NULL);		}		inline bool operator!=(const qckstr &qStr) const		{			return (strcmp(m_pStr,qStr.m_pStr)!=NULL);		}		inline bool operator<=(const qckstr &qStr) const		{			return (strcmp(m_pStr,qStr.m_pStr)<=NULL);		}		inline bool operator>=(const qckstr &qStr) const		{			return (strcmp(m_pStr,qStr.m_pStr)>=NULL);		}		// casts as a stl string, creates a temporary stl string		inline operator std::string()		{			return std::string(m_pStr);		}		// casts as a C style string (should only be used for const strings)		inline operator const char *()		{			return (const char *)m_pStr;		}		// casts as a qckstr		inline operator const qckstr ()		{			return (const qckstr)*this;		}	protected:		unsigned m_length; // length of string		char	*m_pStr; // pointer to string data	};};

Anyway when I use that class I can do stuff like:
std::map<strLib::qckstr,int> myMap;myMap["string constant"];char *str="another constant";myMap[str];
Programming since 1995.
"since I read somewhere that the "std:string" class was slow in STL maps"

Don't listen to rumors. The STL was designed for speed and efficiency. Always use it unless you can PROVE that it is too slow for your purposes.
Quote:Original post by T1Oracle
I solved that by making my own string class for use in STL maps.


There is nothing to solve as the OP made a mistake anyways [grin].

Quote:Original post by T1Oracle
... I created this class for STL maps since I read somewhere that the "std:string" class was slow in STL maps and since this thing is so simple I figured it would be faster...


Define slow, what aspect is slow? did you profile to know for sure? i don't mean to rude or put you down but your implementation is naive, it doesn't really do any better than a modern imp of std::basic_string. Most modern implementations make std::basic_string nothing more than a handle to a private reference counted representation with COW semantics (copy on write), granted COW semnatics does not have much thread safety but it's better than the greedy navie method the one which your code does among other things.

To get a better understanding you might want to read these articles:

Reference Counting - Part I
Reference Counting - Part II
Reference Counting - Part III.

Those articles use/write custom string type as the example.
What if the "lazily" copied string goes out of scope?

link <-Slow STL string performance. I'm sure I found it somewhere else though, but this is all I have right now.

*edit* Also, I only use my class for STL maps and the printf feature, after that I usually convert it to a STL string. I'm most concerned about performance in the STL map since I use those heavily in my code.
Programming since 1995.
Try posting a benchmark from a compiler which isn't horrendously obsolete.
#include <algorithm>#include <cstdarg>#include <ctime>#include <deque>#include <fstream>#include <iostream>#include <iterator>#include <map>#include <string>#include <vector>#include <boost/lambda/algorithm.hpp>#include <boost/lambda/bind.hpp>#include <boost/lambda/lambda.hpp>namespace strLib{	class qckstr	{	public:		// void constructor		qckstr()		{			m_length=0;			m_pStr=NULL;		}		// copy constructor		qckstr(const qckstr& qStr)		{			m_length=0;			m_pStr=NULL;			Set(qStr.m_pStr);		}		// construct qckstr from a C style string		qckstr(const char *pStr)		{			m_length=0;			m_pStr=NULL;			Set(pStr);		}		qckstr(std::string const & str)		{			m_length=0;			m_pStr=NULL;			Set(str.c_str());		}		// destructor		~qckstr()		{			Destroy();		}		// returns the length of the string		inline unsigned Length() const		{			return m_length;		}		// dealocates memory and resets string		void Destroy()		{			delete [] m_pStr;			m_length=0;			m_pStr=NULL;		}		// resets string to the value of C style string		bool Set(const char *pStr)		{			// check user pointer			if (!pStr||!*pStr)				return false;			// check local pointer			if (m_pStr)				Destroy();			// allocate space and copy string			m_length=strlen(pStr);			m_pStr=new char[m_length+1];			if (!m_pStr)			{				m_length=0;				return false;			}			strcpy(m_pStr,pStr);			return true;		}		// uses the printf syntax to format a string, returns a reference to the result		qckstr &Sprintf(const char *format,...)		{			Destroy();			va_list arglist;			va_start(arglist,format);			m_length=_vsnprintf(NULL,0,format,arglist);			if (!m_length)				return *this;			m_pStr=new char[m_length+1];			if (!m_pStr)			{				m_length=0;				return *this;			}			va_start(arglist,format);			vsprintf(m_pStr,format,arglist);			va_end(arglist);			return *this;		}		// assigns a C style string to qckstr		inline qckstr &operator=(const char *pStr)		{			if (m_pStr!=pStr)				Set(pStr);			return *this;		}		// assigns a stl string to qckstr		inline qckstr &operator=(const std::string &stlStr)		{			if (m_pStr!=stlStr)				Set(stlStr.c_str());			return *this;		}		// assigns a qckstr string to qckstr		inline qckstr &operator=(const qckstr &qStr)		{			if (this!=&qStr)				Set(qStr.m_pStr);			return *this;		}		inline bool operator<(const qckstr &qStr) const		{			return (strcmp(m_pStr,qStr.m_pStr)<0);		}		inline bool operator>(const qckstr &qStr) const		{			return (strcmp(m_pStr,qStr.m_pStr)>0);		}		inline bool operator==(const qckstr &qStr) const		{			return (strcmp(m_pStr,qStr.m_pStr)==0);		}		inline bool operator!=(const qckstr &qStr) const		{			return (strcmp(m_pStr,qStr.m_pStr)!=0);		}		inline bool operator<=(const qckstr &qStr) const		{			return (strcmp(m_pStr,qStr.m_pStr)<=0);		}		inline bool operator>=(const qckstr &qStr) const		{			return (strcmp(m_pStr,qStr.m_pStr)>=0);		}		// casts as a stl string, creates a temporary stl string		inline operator std::string()		{			return std::string(m_pStr);		}		// casts as a C style string (should only be used for const strings)		inline operator const char *() const		{			return (const char *)m_pStr;		}		// casts as a qckstr		inline operator const qckstr () const		{			return (const qckstr)*this;		}	protected:		unsigned m_length; // length of string		char	*m_pStr; // pointer to string data	};};struct ReverseString	:	std::unary_function< std::string, std::string >{	public:		std::string operator()(std::string string)		{			std::reverse(string.begin(), string.end());			return string;		}};struct ReverseQckstr	:	std::unary_function< strLib::qckstr, strLib::qckstr >{	public:		strLib::qckstr operator()(strLib::qckstr string)		{			std::string str(string.operator const char *());			std::reverse(str.begin() , str.end());			return strLib::qckstr(str);		}};int main(){	using namespace boost::lambda;	std::ifstream reader("words.lst");	std::deque< std::string > wordsAsStrings;	std::copy(std::istream_iterator< std::string >(reader), std::istream_iterator< std::string >(), std::back_inserter(wordsAsStrings));	std::random_shuffle(wordsAsStrings.begin(), wordsAsStrings.end());	std::deque< strLib::qckstr > wordsAsQckstrs(wordsAsStrings.begin(), wordsAsStrings.end());	std::deque< std::string > reverseWordsAsStrings;	std::transform(wordsAsStrings.rbegin(), wordsAsStrings.rend(), std::back_inserter(reverseWordsAsStrings), ReverseString());	std::deque< strLib::qckstr > reverseWordsAsQckstrs;	std::transform(wordsAsQckstrs.rbegin(), wordsAsQckstrs.rend(), std::back_inserter(reverseWordsAsQckstrs), ReverseQckstr());	std::clock_t times[5];	std::map< std::string, std::string > stringMap;	std::map< strLib::qckstr, strLib::qckstr > qckstrMap;	std::vector< unsigned int > accumulatorString(wordsAsStrings.size(), 0);	std::vector< unsigned int > accumulatorQckstr(wordsAsStrings.size(), 0);	std::vector< unsigned int > randomIndices;	std::generate_n(std::back_inserter(randomIndices), 1000000, bind(std::rand) * wordsAsStrings.size() / RAND_MAX);	times[0] = std::clock();	for (std::deque< std::string >::const_iterator word = wordsAsStrings.begin(), wordsEnd = wordsAsStrings.end(), reverseWord = reverseWordsAsStrings.begin(); word != wordsEnd; ++word, ++reverseWord)	{		stringMap.insert(std::make_pair(*word, *reverseWord));	}	times[1] = std::clock();	for (std::deque< strLib::qckstr >::const_iterator word = wordsAsQckstrs.begin(), wordsEnd = wordsAsQckstrs.end(), reverseWord = reverseWordsAsQckstrs.begin(); word != wordsEnd; ++word, ++reverseWord)	{		qckstrMap.insert(std::make_pair(*word, *reverseWord));	}	times[2] = std::clock();	for (std::vector< unsigned int >::const_iterator wordIndex = randomIndices.begin(), wordIndicesEnd = randomIndices.end(); wordIndex != wordIndicesEnd; ++wordIndex)	{		accumulatorString[*wordIndex] += stringMap[wordsAsStrings[*wordIndex]][0];	}	times[3] = std::clock();	for (std::vector< unsigned int >::const_iterator wordIndex = randomIndices.begin(), wordIndicesEnd = randomIndices.end(); wordIndex != wordIndicesEnd; ++wordIndex)	{		accumulatorQckstr[*wordIndex] += stringMap[wordsAsQckstrs[*wordIndex]][0];	}	times[4] = std::clock();	std::cout << "std::string insert:    " << (times[1] - times[0]) << '\n';	std::cout << "std::string lookup:    " << (times[3] - times[2]) << '\n';	std::cout << "strLib::qckstr insert: " << (times[2] - times[1]) << '\n';	std::cout << "strLib::qckstr lookup: " << (times[4] - times[3]) << '\n';}

gcc 3.3.1:std::string insert:    828std::string lookup:    3531strLib::qckstr insert: 1500strLib::qckstr lookup: 4250msvc++ 7.1:std::string insert:    687std::string lookup:    2828strLib::qckstr insert: 1125strLib::qckstr lookup: 3047

That was run on a word list of 173528 words. I did have to add a constructor-from-std::string to your string class. I was also highly tempted to rename it to slowstr.

The reason strings are inefficient with std::map (that's strings in general, not std::string in particular) is because of the algorithm behind std::map and the nature of the English language. The requirements on std::map necessitate the use of some form of balanced binary tree. To lookup in a binary tree you must perform upto log2 comparisons. String comparisons will search through a string until they find a character mismatch. The problem with the English language is that many words share a common prefix (super, under, trans etc.) which means in some cases five or more character comparisons are needed to determine if a word is lexographically before or after another word.

A more efficient system is to use a hashing container, which (with an appropriate hash function) can greatly reduce the number of character comparisons needed.

Enigma

This topic is closed to new replies.

Advertisement