get type of string in C++

12 comments, last by rip-off 12 years, 10 months ago
Is there a way to determine the type of data that is stored in a string, like this:
[source lang="cpp"]
magic("0.334") => float/double
magic("954") => int
magic("c++") => string
[/source]
I would be satisfied if it could distinguish between float, int and others.
Hackish solutions are accepted :D
------------------------------
Join the revolution and get a free donut!
Hi there,

I can think of two things that you can do without really re-inventing the wheel. First, there's the C-function atof. This function takes a C-style string as a parameter and returns a double.

[source lang="cpp"]
double atof ( const char * str );
[/source]
Now, if this function fails (for example, if your string contains 'c++'), it will return 0.0 (HUGE_VAL relates to out-of-range values, not to unparsable input; see the link). This works nicely until you try to parse the string '0.0', right? If that case doesn't occur, just go with it :) However, if this poses a problem, you could use stringstreams. Construct a std::istringstream from your string and use the stream extraction operator (>>) to parse it. This will either extract your value or, if any problem occurs, set the failbit (see the above link on how to check it).
Hope that helps, Michael.
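To make the difference concrete, here is a minimal sketch of both ideas. The helper names `atof_cannot_tell_apart` and `parses_as_double` are made up for illustration; only `std::atof`, `std::istringstream` and `fail()` come from the standard library.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>

// atof has no error channel: unparsable text and a genuine zero both give 0.0.
bool atof_cannot_tell_apart(const char* a, const char* b)
{
    return std::atof(a) == std::atof(b);
}

// Hypothetical helper: true if a double can be extracted from the start of s.
bool parses_as_double(const std::string& s)
{
    std::istringstream in(s);
    double value = 0.0;
    in >> value;          // sets failbit when no number could be read
    return !in.fail();
}
```

Note that this only checks the start of the string; something like "0.334abc" would still pass unless you also check what is left in the stream.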
Thank you so much... I already knew about stringstreams, but I didn't know that the failbit existed... this is what I've been searching for :)

EDIT: one last question though:
If I call the stringstream operator twice, will the second call overwrite the first's failbit, if it has been set?
Like this:

<stringstream operation fails>
//flag is set
<stringstream operation is successful>
//flag is not set?
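For what it's worth, the stream state is sticky: once failbit is set, later extractions are not even attempted until `clear()` is called, so a later read never resets the flag on its own. A small sketch to illustrate (the function name `second_read_succeeds` and the input "abc 42" are just assumptions for the demo):

```cpp
#include <sstream>
#include <string>

// Reads an int twice from the same stream; the first read hits "abc" and fails.
bool second_read_succeeds(bool clear_between)
{
    std::istringstream in("abc 42");
    int value = 0;

    in >> value;                 // fails: "abc" is not an int, failbit is now set

    if (clear_between) {
        in.clear();              // reset the state flags
        std::string skipped;
        in >> skipped;           // consume the offending token "abc"
    }

    in >> value;                 // a no-op while failbit is still set
    return !in.fail() && value == 42;
}
```

So if you want to reuse the same stream, call `clear()` (and skip the bad input) between attempts, or simply construct a fresh istringstream for every test.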
Another option would be to use a chain of boost::lexical_cast operations along with a set of exception handlers to catch bad_lexical_cast exceptions; this would let you cascade through a series of types from most specific to most generic, until one succeeds. This could be considered the "type" of the data encoded in the string.
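A rough sketch of that cascade follows. To keep it free of dependencies it swaps in `std::stoi` and `std::stod` (C++11) for `boost::lexical_cast`, so it is not the Boost version itself; the enum `guessed_type` and the function `guess_type` are hypothetical names. Unlike `lexical_cast`, the `std::sto*` functions accept trailing characters and skip leading whitespace, hence the explicit length check.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

enum class guessed_type { integer, real, text };   // hypothetical result type

// Try the most specific type first and fall through to the most generic one.
guessed_type guess_type(const std::string& s)
{
    try {
        std::size_t used = 0;
        std::stoi(s, &used);                        // throws if no conversion is possible
        if (used == s.size()) return guessed_type::integer;
    } catch (const std::exception&) {
        // not an int (or out of range), try the next candidate
    }

    try {
        std::size_t used = 0;
        std::stod(s, &used);
        if (used == s.size()) return guessed_type::real;
    } catch (const std::exception&) {
        // not a double either
    }

    return guessed_type::text;                      // everything else is just a string
}
```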

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

[source lang="cpp"]const char *data;
// ...
char *endptr;
strtol(data, &endptr, 10);
if(data != endptr) {
    // String is an integer.
} else {
    strtod(data, &endptr);
    if(data != endptr) {
        // String is a real.
    } else {
        // String is a string.
    }
}[/source]

Keep in mind, this only tells you if the string begins with the specified data type. If you want to test whether the entire string is consumed by the conversion, you'll have to test endptr against the end of the string as well.
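A sketch of the stricter variant, which also requires the conversion to stop at the terminating '\0'. The names `parsed_kind` and `classify_with_strto` are made up for the example. Be aware that `strtod` also accepts things like "1e5", "inf" and "nan", so those count as reals here.

```cpp
#include <cstdlib>

enum class parsed_kind { integer, real, text };   // hypothetical result type

parsed_kind classify_with_strto(const char* data)
{
    char* end = nullptr;

    std::strtol(data, &end, 10);
    // Something was converted AND the conversion reached the end of the string.
    if (end != data && *end == '\0') return parsed_kind::integer;

    std::strtod(data, &end);
    if (end != data && *end == '\0') return parsed_kind::real;

    return parsed_kind::text;
}
```

With this check "0.334" is no longer reported as an integer just because it starts with "0".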
Or, as another option that avoids getting tangled up in Boost, you could use the power of isdigit ( http://www.cplusplus...cctype/isdigit/ ) and isalpha ( http://www.cplusplus...cctype/isalpha/ ) and write the magic function yourself :). This would also help you avoid the problem of atof returning 0.0 on failure.
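A hand-rolled version could look roughly like this. It is only a sketch under a simple assumption about the accepted format: an optional sign, digits, and at most one '.', with no exponents. The names `token_kind` and `scan` are invented for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum class token_kind { integer, real, text };   // hypothetical result type

token_kind scan(const std::string& s)
{
    std::size_t i = 0;
    if (i < s.size() && (s[i] == '-' || s[i] == '+')) ++i;   // optional sign

    std::size_t digits = 0;
    bool seen_dot = false;
    for (; i < s.size(); ++i) {
        // isdigit must be given an unsigned char value, not a plain char.
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (std::isdigit(c)) ++digits;
        else if (c == '.' && !seen_dot) seen_dot = true;
        else return token_kind::text;                        // anything else: plain text
    }

    if (digits == 0) return token_kind::text;                // "", "-" and "." are not numbers
    return seen_dot ? token_kind::real : token_kind::integer;
}
```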
Another couple of useful functions for these checks are strspn and strcspn. They return the length of the initial part of a string that consists only of (strspn) or contains none of (strcspn) a given set of characters. So you can do a quick first-pass check:

[source]
strspn(p,"-0123456789")==strlen(p)
[/source]
will test to see if the string contains only valid integer characters.

These functions make it very easy to write quick parsers:

[source]
p+=strspn(p," \t\n\r");
[/source]
will skip over leading whitespace for example.
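Putting the two snippets together: the one-line check above also accepts an empty string and things like "1-2", so a slightly stricter sketch might look like this (the helper name `looks_like_int` is made up):

```cpp
#include <cstddef>
#include <cstring>

// True if p is optional whitespace, an optional '-', then nothing but digits.
bool looks_like_int(const char* p)
{
    p += std::strspn(p, " \t\n\r");            // skip leading whitespace
    if (*p == '-') ++p;                        // allow a single leading minus sign
    const std::size_t digits = std::strspn(p, "0123456789");
    return digits > 0 && p[digits] == '\0';    // at least one digit, nothing after it
}
```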

I use something similar to the lexical_cast cascade suggested above for parsers that don't have a consistent line order (with C-style parsing, but the principle is the same). With C-style parsing (sscanf) I often don't even have to check return values, because when the function "fails" it doesn't touch the variables (I don't know about the C++ stuff).
I've found it a very simple and flexible way to parse text files (maybe slow, but parsing text files is not a common operation in the lifespan of an application anyway).
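For the sscanf route, a small sketch that does check the return value and uses `%n` to make sure the whole string was consumed. The wrapper `read_int` is a hypothetical helper; it leaves `out` untouched on failure. As for the C++ side: since C++11 a failed `>>` into an arithmetic variable stores zero, so stream extraction does not leave the variable alone.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: parse text as an int; out is only written on success.
bool read_int(const char* text, int& out)
{
    int value = 0;
    int used = 0;

    // sscanf returns the number of converted fields; %n does not count as one.
    if (std::sscanf(text, "%d%n", &value, &used) != 1) return false;

    // Reject input with trailing characters, e.g. "12abc" or "0.334".
    if (static_cast<std::size_t>(used) != std::strlen(text)) return false;

    out = value;
    return true;
}
```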
How about parsing your string with Boost.Spirit.Qi?

[source lang="cpp"]#include <boost/spirit/include/qi.hpp>
#include <boost/foreach.hpp>
#include <boost/variant.hpp>
#include <iostream>
#include <string>
#include <vector>

namespace qi = boost::spirit::qi;

struct variant_printer : public boost::static_visitor<void>
{
    template <typename T>
    void operator() (const T& t) const
    {
        std::cout << t << '\n';
    }
};

int main()
{
    std::string input = "1.337 2 lol 3.14 hi";
    std::string::iterator begin = input.begin();
    std::string::iterator end = input.end();

    using qi::int_;
    using qi::float_;
    using qi::char_;
    using qi::lit;
    using qi::ascii::space;

    qi::real_parser<double, qi::strict_real_policies<double> > const strict_double;

    typedef boost::variant<double, int, std::string> parsed_type;

    std::vector<parsed_type> values;

    qi::rule<std::string::iterator, std::string()> word = +(~char_(' '));

    qi::phrase_parse(begin, end,
                     *(strict_double | int_ | word),
                     space,
                     values);

    BOOST_FOREACH(parsed_type v, values)
    {
        boost::apply_visitor(variant_printer(), v);
    }
}
[/source]
I've personally used template functions for this:

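[source lang="c++"]
#include <iostream>
#include <sstream>
#include <string>

// True if the whole of s can be read as a T.
template<class T> bool is(const std::string &s)
{
    std::istringstream i(s);
    T t;

    // Step 1: the value itself must convert.
    i >> t;
    if(i.fail()) return false;

    // Step 2: nothing but whitespace may be left over.
    std::string r;
    i >> r;
    if(!i.fail()) return false;

    return true;
}

// Any input counts as a string.
template<> bool is<std::string>(const std::string &)
{
    return true;
}

void f()
{
    if(is<float>("23.0")) std::cout << "yep\n";
}
[/source]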

A stringstream will stop at the first non-converted character, so the second step in the first function attempts to read the remainder of the string into a std::string to check whether anything is left over. Otherwise "23.0xyz" would return true as an int or a float.

The specialisation for std::string is necessary because stringstream's >> operator stops at whitespace, so "hello world" would return false for std::string if it went through the first function.

It may be necessary to specialise for const char* and char* - can't remember if these will call the std::string specialisation or not offhand.
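On that last point, as far as I can tell no extra specialisation is needed: T is the type you are testing for, not the type of the argument, and the parameter is always a const std::string&, so string literals, char* and const char* are all converted to std::string on the way in. A compact variant of the template to illustrate (it uses std::ws and eof() for the leftover check, which is my substitution rather than the code above):

```cpp
#include <sstream>
#include <string>

// True if the whole of s (apart from trailing whitespace) reads as a T.
template <class T>
bool is(const std::string& s)
{
    std::istringstream in(s);
    T value{};
    if (!(in >> value)) return false;   // the value itself must convert
    in >> std::ws;                      // swallow any trailing whitespace
    return in.eof();                    // anything left means it was not purely a T
}

// Any input counts as a string, including text with spaces in it.
template <>
bool is<std::string>(const std::string&)
{
    return true;
}
```

Calling `is<int>(p)` with a `const char* p`, or `is<std::string>(buf)` with a `char buf[]`, picks the right version because only the explicit template argument matters.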
