Inheriting from std::string::iterator (source included)

3 comments, last by MrEvil 18 years ago
Well, I was a little frustrated with not having a case-insensitive find() in std::string, and I didn't want to go so far as to construct a regular expression to find a case-insensitive substring. I wanted to use the existing standard library functions to do the job, so I figured the best way was to inherit from a standard iterator. I've never done any STL-compatible coding before, so I'd like some feedback on my ucase_iterator:

class ucase_iterator : public string::iterator
{
public:
	ucase_iterator(pointer ptr) : string::iterator(ptr)
	{
	}

	char operator*() const
	{
		return (make_ucase(**(const_iterator *)this));
	}

	char m_cTemp;

	const char* operator->() const
	{
		(char)m_cTemp = **(ucase_iterator*)this;
		return (&m_cTemp);
	}

	static char make_ucase(char c)
	{
		if(c >= 'a' && c <= 'z')
			return(c + ('A'-'a'));
		else
			return(c);
	}
};
Not much to it. I pass the constructor argument on to the string::iterator constructor, and I replace the * and -> operators. The * operator was easy, since it returns a value type (should be a const reference, but with a char it makes no practical difference). I just took the result of the superclass's operator*, passed it to my make_ucase function, and returned the uppercase char. The -> operator was a little more trouble. It demands a pointer to be returned. I don't want to make the original string uppercase (that was the whole point of making this iterator), so I created a char member that I fill in with the uppercase letter, then return a pointer to that. The problem is that the -> operator is const, so I have to do some seriously hacky casting to let me get the job done. To do a case-insensitive search, I do something like this:

string strSource = "This is a sentence with no caps";
string strFind = "SeNTenCe";

ucase_iterator iBeginFind(strFind._Myptr());
ucase_iterator iEndFind(strFind._Myptr());
iEndFind += strFind.length();

ucase_iterator iBeginSource(strSource._Myptr());
ucase_iterator iEndSource(strSource._Myptr());
iEndSource += strSource.length();

ucase_iterator iResult = search<ucase_iterator,ucase_iterator>(
	iBeginSource, iEndSource, 
	iBeginFind, iEndFind);

if(iResult == iEndSource)
	cout << "No match.\n";
else
	cout << int(iResult-iBeginSource) << '\n';
Anyway, it seems to work okay. It definitely feels clumsy, though, and I'd like some feedback if I'm doing something incredibly dumb here.
The pointer returned by operator -> doesn't have to point into an array; in essence it can just be "&**this". You could store a single char inside the iterator and return its address from operator ->.

A much better solution to your problem is to use the fifth argument of std::search, which is a predicate function. Something like this should work:
bool compare_case_insensitive(char, char)
{
	// fill this in
}

std::string::iterator result = std::search(
	source.begin(), source.end(),
	find.begin(), find.end(),
	compare_case_insensitive);
Returning a reference or pointer to a local variable is undefined behaviour. Also, your "hacky casting" is unnecessary: operator* is a const member and can thus be safely called without a cast (and const_cast would have been the better cast to use had one been necessary). A better way to implement this is to simply pass a case-insensitive character comparison function to one of the standard library algorithms, for example:
#include <algorithm>
#include <cctype>
#include <iostream>
#include <iterator>
#include <string>

// Compares two characters for equality, ignoring case.
struct case_insensitive_compare
{
	bool operator()(char lhs, char rhs) const
	{
		// Cast to unsigned char first: passing a negative char value to std::toupper is undefined behaviour.
		return std::toupper(static_cast<unsigned char>(lhs)) == std::toupper(static_cast<unsigned char>(rhs));
	}
};

int main()
{
	std::string main_string("This is the string I want to search");
	std::string search_string("ring I want");
	std::string::iterator location = std::search(main_string.begin(), main_string.end(), search_string.begin(), search_string.end(), case_insensitive_compare());
	if (location == main_string.end())
	{
		std::cout << "search string was not found\n";
	}
	else
	{
		std::cout << "search string was found at offset " << std::distance(main_string.begin(), location) << '\n';
	}
}

(code untested)

Σnigma
Heh heh. I should've seen that fifth parameter to search. Anyway, it's been a somewhat educational experience.
I also recommend using std::search. Whichever method you choose, though, I'd use std::toupper instead of your make_ucase function. The main advantage is that it works in other locales and character sets (not that you're likely to be using EBCDIC, but it's a bonus) and, given a suitable locale, with characters such as è, é, á, ü, ÿ, etc.
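To tie the suggestions in this thread together, here is a minimal sketch of a case-insensitive find built on std::search with a std::toupper-based predicate. The names equal_ignoring_case and find_ignoring_case are made up for illustration and are not part of the standard library. It assumes a single-byte character set; std::toupper follows the current C locale, so accented letters only fold if a suitable locale has been set, and multi-byte encodings such as UTF-8 are not handled.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: true if the two characters are equal ignoring case.
// The unsigned char casts avoid undefined behaviour when char is signed
// and holds a negative value (e.g. accented letters in Latin-1).
inline bool equal_ignoring_case(char lhs, char rhs)
{
	return std::toupper(static_cast<unsigned char>(lhs)) ==
	       std::toupper(static_cast<unsigned char>(rhs));
}

// Hypothetical helper: offset of the first case-insensitive match of
// needle in haystack, or std::string::npos if there is no match.
inline std::string::size_type find_ignoring_case(const std::string& haystack,
                                                 const std::string& needle)
{
	// Mirror std::string::find, which reports an empty needle at offset 0.
	if (needle.empty())
		return 0;

	std::string::const_iterator it = std::search(
		haystack.begin(), haystack.end(),
		needle.begin(), needle.end(),
		equal_ignoring_case);

	if (it == haystack.end())
		return std::string::npos;
	return static_cast<std::string::size_type>(it - haystack.begin());
}
```

With the strings from the original post, find_ignoring_case(strSource, strFind) should give the same offset as the ucase_iterator version, without touching either string and without any custom iterator.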

