Splitting a string into a Vector

7 comments, last by SiCrane 18 years, 6 months ago
I have tried to write my own function for splitting up a string so that each word in the string is placed into its own element of a vector. Here is my code:
vector<string> newWords;
	int caret = 0, wordCount = 0;

	// Split the line into individual strings.
	for(int i = 0; i < str.length(); ++i){
		if(str.at(i) == ' '){
			newWords.push_back(str.substr(caret, i));
			caret = i + 1;
		}
	}
But it's just not working properly. Does anyone know what's wrong with it?
If it's possible for us to conceive it, then it must be possible.
The second parameter of substr is the number of characters in the substring, not the position of its end in the source string. Change i to i - caret.
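To make the difference concrete, here is a minimal sketch of the original loop with only that one change applied. The function name splitAtSpaces is made up for illustration and is not from the original post. Note that the loop still only emits a word when it reaches a space, so whatever follows the last space is not added.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper wrapping the original loop; only the substr() call is changed.
std::vector<std::string> splitAtSpaces(const std::string& str)
{
	std::vector<std::string> newWords;
	std::size_t caret = 0;

	for (std::size_t i = 0; i < str.length(); ++i) {
		if (str.at(i) == ' ') {
			// substr(pos, count): count is a length, so pass i - caret, not i.
			newWords.push_back(str.substr(caret, i - caret));
			caret = i + 1;
		}
	}
	return newWords;
}
```

With the original substr(caret, i), the second word would have run past its own end, because i keeps growing while caret moves forward.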

EDIT: Alternatively, using the wonders of the SC++L:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

int main()
{
	std::string pie = "i liek pi.  do u liek pi?";
	std::vector< std::string > words;
	std::istringstream sentence(pie);
	std::copy(std::istream_iterator< std::string >(sentence), std::istream_iterator< std::string >(), std::back_inserter(words));
	std::copy(words.begin(), words.end(), std::ostream_iterator< std::string >(std::cout , "\n"));
}
Enigma
Sweet, it works, but now it's missing the last word I enter. How can I fix this?

Edit: Don't worry, I used Enigma's code and it works fine.
Alternatively, the wonders of using Boost:
#include <boost/algorithm/string.hpp>
#include <iterator>
#include <iostream>
#include <ostream>
#include <vector>
#include <string>

int main(void)
{
	std::string str("I like pie, do you like pie?");
	std::vector<std::string> words;
	boost::split(words, str, boost::is_any_of(" ,?"), boost::token_compress_on);
	std::copy(words.begin(), words.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
}


Prints:
I
like
pie
do
you
like
pie
--Michael Fawcett
Quote:Original post by Enigma
EDIT: Alternatively, using the wonders of the SC++L:

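The copy with a back_inserter isn't necessary; you can use the iterator range constructor of std::vector instead.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

int main()
{
	std::string pie = "i liek pi.  do u liek pi?";
	std::istringstream sentence(pie);
	// The extra parentheses around the first argument keep this from being parsed as a function declaration.
	std::vector< std::string > words((std::istream_iterator< std::string >(sentence)), std::istream_iterator< std::string >());
	std::copy(words.begin(), words.end(), std::ostream_iterator< std::string >(std::cout , "\n"));
}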
Why would anyone want to replace a simple, elegant, straightforward block of string parsing code that fits in 3 lines with the atrocious, unreadable crap that comes with the STL and Boost? I mean... I know they have their advantages, but for this?

Were you guys tossing ideas around for fun, or were you actually suggesting that that was better?
Very simple. The code I wrote with the STL worked the first time I ran it. The code the OP had obviously didn't. And I didn't even need to worry about off-by-one errors, write an explicit loop, or even think that hard.
Well, I'm not saying that it doesn't work. I like the STL as much as the next guy... it's powerful stuff.

But for something like this, it just seems like overkill... a simple string tokenizing loop should be simple enough that one very fast glance will indicate what it does.

It just seems like declaring a single variable shouldn't use more parentheses and brackets than the entire "original" way...

It's just a preference... I was just wondering if other people actually like looking at code like that if they don't have to. No biggie.
From my point of view, it's reversed. Writing an explicit loop for something that can be done inside an object's constructor is overkill. And reading the explicit loop is harder than reading the constructor declaration since the explicit loop does things with indices that make it not immediately clear what's going on. Does substr() take the length or the two indices? It only parses words when it finds a space, is it intentional that it leaves off the last word? Should it only be spaces? What about tabs and other whitespace?

Keep in mind that you're comparing the complexity of the broken version of the original code to correct versions of the STL code. Getting the substr() arguments correct increases the number of symbols in the original code. Changing the loop so that it adds the last word as well also increases the complexity.
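To illustrate, here is a rough sketch of what the hand-written loop could look like once both fixes are in and any whitespace counts as a separator. The name splitWords is made up for this example, and treating a run of whitespace as a single separator is an assumption about the intended behavior, chosen to match what the istream_iterator version does.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical hand-written equivalent of the stream-based split.
std::vector<std::string> splitWords(const std::string& str)
{
	std::vector<std::string> words;
	std::size_t i = 0;

	while (i < str.length()) {
		// Skip any run of whitespace (spaces, tabs, newlines).
		while (i < str.length() && std::isspace(static_cast<unsigned char>(str[i])))
			++i;

		// Scan to the end of the current word.
		std::size_t start = i;
		while (i < str.length() && !std::isspace(static_cast<unsigned char>(str[i])))
			++i;

		// substr() takes a start position and a length; this also catches the last word.
		if (i > start)
			words.push_back(str.substr(start, i - start));
	}
	return words;
}
```

It is still short, but it is no longer three lines, and each of the index expressions is a place where an off-by-one error could hide.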

This topic is closed to new replies.
