I'm Not So Good At Writing Parsers
I've never been really good at this, so I'm all ears as to how I can parse the following data solidly.
From a file a string is read that can be any length containing any number of floats:
"1.0,1.0,2.0,123.34"
What is a solid way of parsing this without using any existing libraries?
Dave
Use the strtok function to tokenize on the "," character and save the results into a vector of some sort.
http://www.cplusplus.com/ref/cstring/strtok.html
You'll need to use *some* library, unless you plan on writing this in assembly and using raw system calls to perform file I/O. Assuming Python, you could just read it in and use split on the read string.
I've never been fond of strtok. Isn't that C anyhow? I'd like to write the algorithm myself.
Dave
If you'd prefer std::strings you can use getline().
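string myData;
// getline returns the stream, so the loop ends when a read fails;
// testing eof() instead would skip the last value, which has no trailing comma
while (getline(myFloatFile, myData, ',')) // get data until a comma is encountered
{
    // convert std::string myData to a float with a stringstream or what have you, push back
}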
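The conversion step left as a comment in the loop above could look something like the sketch below. It is only one way to do it: the function name readFloats is made up for illustration, and it assumes fields that fail to convert should simply be skipped.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): reads comma-separated floats
// from any input stream and returns them in order.
std::vector<float> readFloats(std::istream& in)
{
    std::vector<float> values;
    std::string field;

    // getline returns the stream, which tests false only once a read fails,
    // so the final field (which has no trailing comma) is still processed.
    while (std::getline(in, field, ','))
    {
        std::istringstream converter(field);
        float value;
        if (converter >> value)   // assumption: silently skip non-numeric fields
            values.push_back(value);
    }
    return values;
}
```

Because it takes a std::istream, the same function should work with an std::ifstream opened on the data file or with an std::istringstream wrapped around a string that has already been read.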
From the strtok-man page: Never use these functions.
The idea is good though. If you're using C++, you might find the following piece of code useful:
std::vector<std::string> StringUtils::tokenize(std::string const &str, std::string const &delims)
{
    std::vector<std::string> tokens;
    size_t pos, pos2;
    pos = str.find_first_not_of(delims, 0);
    while (pos != std::string::npos)
    {
        pos2 = str.find_first_of(delims, pos);
        if (pos2 == std::string::npos)
            pos2 = str.length();
        tokens.push_back(str.substr(pos, pos2-pos));
        pos = str.find_first_not_of(delims, pos2);
    }
    return tokens;
}
It returns a vector of strings, each containing a number.
So, your code will be something like this:
vector<string> tokens = tokenize(myStr, ",");
for( vector<string>::iterator i = tokens.begin(); i != tokens.end(); ++i )
    doSomethingWithNumber( atof(i->c_str()) );
So where does strtok stand with us these days? I was under the impression that it was old hat now.
Dave
OK, thanks guys. It seems that the simplest solution is the getline method. Thanks also for your contribution, DaBono.
Dave
Quote: Original post by Dave
So where does strtok stand with us these days.
non-reentrant, not thread safe
There's strtok_r() if you want to be re-entrant.
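For reference, a minimal sketch of the strtok_r() route might look like this. It assumes a POSIX system (strtok_r is not part of standard C or C++), and the function name parseWithStrtokR is invented for the example.

```cpp
#include <cstdlib>
#include <string.h>   // strtok_r is POSIX and is declared in <string.h>
#include <string>
#include <vector>

// Hypothetical helper: splits a comma-separated list with strtok_r.
std::vector<double> parseWithStrtokR(const std::string& text)
{
    // strtok_r writes NUL bytes into its input, so work on a mutable copy.
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');

    std::vector<double> values;
    char* state = nullptr;  // caller-owned tokenizer state; this is what makes it re-entrant
    for (char* token = strtok_r(&buffer[0], ",", &state);
         token != nullptr;
         token = strtok_r(nullptr, ",", &state))
    {
        values.push_back(std::strtod(token, nullptr));
    }
    return values;
}
```

The state pointer lives in the caller rather than in a hidden static, so two tokenizations can be in progress at once without trampling each other.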
However, when scanning a list of values, I prefer to just do it manually. Using C style libraries:
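// requires <stdlib.h> (strtod) and <string.h> (strcspn)
const char * str = "1,2,3.14";
char * end;
while( str && *str ) {
    end = 0;
    double d = strtod( str, &end );
    if( !end || end == str ) {
        break; // nothing was converted
    }
    handle_double( d ); // put in an array or whatever
    // strcspn returns a length, not a pointer, so add it to end
    str = end + strcspn( end, "-+0123456789." ); // wind forward to next number
}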
If you want to be a little more specific about what characters you accept for delimiters, you can instead use 'str = end + strspn( end, ", " )' (for when you only want to accept commas and spaces). Like strcspn, strspn returns a length rather than a pointer, so it has to be added to end.
You can formulate this same loop using std::string::find_first_of().
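A rough sketch of that std::string formulation follows. The function name parseDoubles is made up, and it keeps strtod for the actual conversion, so it is one possible reading of the suggestion rather than the only one.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: the same scan as the strtod loop, but driven by
// std::string::find_first_of instead of strcspn.
std::vector<double> parseDoubles(const std::string& text)
{
    static const char* const numberChars = "-+0123456789.";
    std::vector<double> values;

    std::string::size_type pos = text.find_first_of(numberChars);
    while (pos != std::string::npos)
    {
        const char* start = text.c_str() + pos;
        char* end = nullptr;
        double d = std::strtod(start, &end);
        if (end == start)
            break;   // nothing converted (e.g. a lone '-'); give up
        values.push_back(d);

        // Resume where strtod stopped and look for the start of the next number.
        pos = text.find_first_of(
            numberChars, static_cast<std::string::size_type>(end - text.c_str()));
    }
    return values;
}
```

Anything that is not a sign, digit, or decimal point is treated as a separator here, so commas, spaces, and stray text between numbers are all skipped.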