riruilo

What is the C++ version of sscanf?


Quote:
Original post by fcoelho
Use std::istringstream. You can check an example here.


I'll take type-safe, type-checked (at compile time) format strings over current C++ streams any day. Let's just hope that C++0x generalizes the ability to analyze string literals at compile time with constexpr functions (currently only possible with constexpr user-defined literals) and variadic templates.

Quote:
Original post by snk_kid
I'll take type safe, type checked (at compile-time) format strings over current C++ streams any day.
What are you talking about? Printf certainly has a nice little domain language going, and though I occasionally wish for positional arguments and user-defined types as standard features I must admit that it covers my day-to-day needs admirably.
Scanf, on the other hand, is virtually useless for anything but the most rudimentary and sloppy type of parsing. As soon as you need any sort of error handling or try to read anything more complicated than a space-separated set of integers, you end up with convoluted formats with silent arguments, bizarre %[...] catchalls, and length limit parameters (naturally sent as integers rather than size_t, just to trip up unwary users) for every string. Also, someone needs to be shot for making %f a float type rather than a double.

Not that istreams are all that much better but at least there aren't quite as many ways to mess up with standard streams.

istringstream makes useless copies of the string, allocates memory often when it's not necessary. And it is not so simple to use it.
I'd suggest using C string functions here because they also help to develop bug seeking skills. ^_^

Quote:
Original post by implicit
What are you talking about? Printf certainly has a nice little domain language going, and though I occasionally wish for positional arguments and user-defined types as standard features I must admit that it covers my day-to-day needs admirably.


I'm talking about something that doesn't currently exist in C/C++ in an ideal form (yes I already know about GCC doing some form of compile-time type checking and there is boost.format, don't even go there).

In OCaml/F#, format strings are analyzed by the compiler at compile time, which derives a function type from the given format; that type is carried by a parametric type typically called format/Format. The standard library I/O functions use it, so if you apply them to arguments of the wrong type(s) and/or the wrong number of arguments, you get a compile-time error. You can use this special format type in your own functions, too.

As I've already stated, C++0x has a "variadic template constexpr user-defined literal": effectively an operator overload templated over an arbitrarily long sequence of characters (a variadic template with non-type parameters), which you can implicitly instantiate with a string literal.

This is the only ideal way to analyze string(s) (literals) at compile-time in C++0x. It's stupid to limit this functionality to only user-defined literals, this should be applicable to any constexpr function.

The ability to analyze strings at compile time is not just nice for format strings, so why not give us a more flexible way to achieve it instead of the restricted way in the current working draft of C++0x?
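
As a rough sketch of what compile-time analysis of a format string by an ordinary constexpr function can look like: the code below assumes the relaxed constexpr rules of C++14 (loops inside constexpr functions), which go beyond the C++0x draft discussed in this thread. `count_specifiers` and `arity_matches` are hypothetical helpers, and the check covers only the number of arguments, not their types.

```cpp
#include <cstddef>

// Counts conversion specifiers in a printf-style format string.
// "%%" is treated as a literal percent sign and is not counted.
constexpr std::size_t count_specifiers(const char* fmt) {
    std::size_t count = 0;
    while (*fmt != '\0') {
        if (*fmt == '%') {
            ++fmt;
            if (*fmt == '\0') break;      // stray '%' at the very end
            if (*fmt != '%') ++count;     // a real conversion such as %d
        }
        ++fmt;
    }
    return count;
}

// True if the number of arguments equals the number of specifiers.
template <typename... Args>
constexpr bool arity_matches(const char* fmt, const Args&...) {
    return count_specifiers(fmt) == sizeof...(Args);
}

// Both checks are evaluated by the compiler; a mismatch is a build error.
static_assert(count_specifiers("100%% done") == 0, "%% is not a specifier");
static_assert(arity_matches("%d items at %f each", 3, 1.5), "argument count mismatch");
```

Checking the argument types as well would require mapping each conversion letter to a type and comparing it against `Args...`, which this sketch leaves out.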

My point is that while format strings kind of suck in C/C++, this does not mean the idea sucks in general. In fact, they are done very well in more type-safe languages.

Quote:
Original post by implicit
Scanf, on the other hand, is virtually useless for anything but the most rudimentary and sloppy type of parsing.


Something similar can be said for C++ I/O streams. If you're going to parse something complicated, you should be using a regular expression library, and if that isn't suitable, a lexer/parser framework.

Quote:
Original post by implicit
As soon as you need any sort of error handling or try to read anything more complicated than a space-separated set of integers, you end up with convoluted formats with silent arguments, bizarre %[...] catchalls, and length limit parameters (naturally sent as integers rather than size_t, just to trip up unwary users) for every string. Also, someone needs to be shot for making %f a float type rather than a double.


That is an issue specifically with C/C++ (something that could easily be rectified in C++0x with variadic templates and constexpr functions or an alternate method). This is not a problem with the idea of format strings in general.
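
A minimal sketch of the variadic-template idea, under stated assumptions: the placeholder is a bare `%` (no conversion letter), output goes through `operator<<`, and `format_into` / `format_text` are hypothetical names rather than any standard or Boost API. The type of each argument is resolved by the compiler, so there is no `%f`-versus-`%lf` mismatch to get wrong; the argument count, however, is only checked at run time here.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Base case: no arguments left, so any remaining placeholder is an error.
inline void format_into(std::ostream& out, const char* fmt) {
    for (; *fmt != '\0'; ++fmt) {
        if (*fmt == '%') {
            if (fmt[1] == '%') ++fmt;   // "%%" prints a single '%'
            else throw std::runtime_error("format: too few arguments");
        }
        out << *fmt;
    }
}

// Recursive case: print text up to the next placeholder, then the value.
template <typename T, typename... Rest>
void format_into(std::ostream& out, const char* fmt,
                 const T& value, const Rest&... rest) {
    for (; *fmt != '\0'; ++fmt) {
        if (*fmt == '%') {
            if (fmt[1] == '%') {
                ++fmt;                  // "%%" prints a single '%'
            } else {
                out << value;           // overload chosen from T at compile time
                format_into(out, fmt + 1, rest...);
                return;
            }
        }
        out << *fmt;
    }
    throw std::runtime_error("format: too many arguments");
}

// Convenience wrapper that returns the formatted text as a std::string.
template <typename... Args>
std::string format_text(const char* fmt, const Args&... args) {
    std::ostringstream out;
    format_into(out, fmt, args...);
    return out.str();
}
```

For example, `format_text("% + % = %", 1, 2, 3)` yields `"1 + 2 = 3"`, and passing too few arguments throws `std::runtime_error`.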

[Edited by - snk_kid on December 26, 2009 7:38:53 AM]

Quote:
Original post by snk_kid
...
Okay, I'll buy that. I got the impression that you thought that scanf with compile-time checking in GCC or extended with type-safety as-is to C++ was a useful way of doing parsing, but perhaps there is a library or language out there which has managed to do parsing with vaguely scanf-like templates in a reasonable way.

It's just that I've spent way too much time trying to get scanf to work in the past, and as soon as you need any sort of error detection it just doesn't work. Simply trying to figure out whether you've matched the whole line with sscanf is almost enough to give me a migraine.
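
For what it's worth, one common idiom for the whole-line check is the `%n` conversion, which stores the number of characters consumed so far. The sketch below assumes a line that should contain exactly two integers; `parse_two_ints` is a hypothetical helper name.

```cpp
#include <cstdio>

// Returns true only if `line` consists of exactly two integers
// (optionally surrounded by whitespace) and nothing else.
bool parse_two_ints(const char* line, int& a, int& b) {
    int consumed = -1;  // stays -1 if %n is never reached
    // The space before %n skips any trailing whitespace. %n itself reads
    // nothing and is not counted in sscanf's return value.
    const int converted = std::sscanf(line, "%d %d %n", &a, &b, &consumed);
    return converted == 2 && consumed >= 0 && line[consumed] == '\0';
}
```

It works, but it also illustrates the complaint: the check relies on an extra output argument and on knowing that `%n` does not affect the return value.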

Quote:
Original post by implicit
Quote:
Original post by snk_kid
...
Okay, I'll buy that. I got the impression that you thought that scanf with compile-time checking in GCC or extended with type-safety as-is to C++ was a useful way of doing parsing, but perhaps there is a library or language out there which has managed to do parsing with vaguely scanf-like templates in a reasonable way.


For C++, Boost.Format gets you some of the way there, but as far as I know it only works with output streams, and I would probably get into trouble using it at work on a large project. At work I have to write and use lowest-common-denominator C++ on large projects with lots of coders.

Quote:
Original post by fcoelho
Use std::istringstream. You can check an example here.


Thanks a lot. I will try it right now.
I forgot to say I use std::string, never char.
I know I can use my_string.c_str(), but I prefer to use an STL approach.

Thanks.

Quote:
Original post by snake5
istringstream makes useless copies of the string, allocates memory often when it's not necessary.

I really doubt that a half-decent implementation would do that.
Quote:
And it is not so simple to use it.
I'd suggest using C string functions here because they also help to develop bug seeking skills. ^_^
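std::istringstream a_stringstream("1 2 3.5 text");  // example input (assumed; not in the original post)
int num1, num2;
double decimal_number;
std::string a_string;
// parse a string containing two ints, a floating-point number, and a string
a_stringstream >> num1 >> num2 >> decimal_number >> a_string;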
You just have to keep in mind how it parses whitespace. Save your bug-seeking skills for where they're needed.

Actually, I usually use plain stringstream, which does input and output. It makes it a little easier. This page is a good reference. For stuff more complicated than that, you'll want regexes or a parsing framework. Boost.Regex and Boost.Spirit can be pretty nice. Or, depending on what you're using it for, an existing data format with pre-built parsers might be more convenient, like XML or JSON.
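
A slightly fuller sketch of the stream approach, starting from a std::string and adding the error checks that the one-liner above leaves out. `Record` and `parse_record` are hypothetical names chosen for this example; the stream calls are standard.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type, used only for this illustration.
struct Record {
    int first;
    int second;
    double ratio;
    std::string label;
};

// Parses "<int> <int> <double> <word>" and rejects anything else.
bool parse_record(const std::string& line, Record& out) {
    std::istringstream in(line);
    // Each >> skips leading whitespace. Reading into a std::string stops
    // at the next whitespace, so `label` is a single word.
    if (!(in >> out.first >> out.second >> out.ratio >> out.label)) {
        return false;       // a conversion failed or input ran out
    }
    in >> std::ws;          // discard trailing whitespace, if any
    return in.eof();        // true only if nothing else is left on the line
}
```

If the last field may contain spaces, `std::getline(in, out.label)` after the numeric fields is the usual alternative to `>>`.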

Quote:
Original post by theOcelot
Quote:
Original post by snake5
istringstream makes useless copies of the string, allocates memory often when it's not necessary.

I really doubt that a half-decent implementation would do that.


It's not always possible to prevent. Consider, for example, the following:

string s1, s2, s3, s4;

s1 = "Hello,";
s2 = " Worl";
s3 = "d!";
s4 = s1 + s2 + s3;

How many times is memory allocated in statement 4? How many times is memory deallocated in statement 4?




The correct answer is that memory is allocated 3 times and deleted 2 times.

Allocation 1: A temporary T1 to hold the result of s1 + s2
Allocation 2: A temporary T2 to hold the result of T1 + s3, i.e. s1 + s2 + s3
Allocation 3: A copy made of T2 to store in s4
Deallocation 1: T1
Deallocation 2: T2

But this is hardly optimal. You could do better by eliminating Allocation 3 and Deallocation 2. After all, s4 is an exact copy of the result of the expression, so why make another allocation to hold a copy when the internals of the temporary itself could be used?

C++0x rvalue references and move constructors solve this. It's not possible under C++ currently.

This is, of course, just one example. There are numerous cases in STL, Boost, and other library code where temporaries / copies are not preventable but could be eliminated under C++0x. The return value optimization helps sometimes, but not always.
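
A small sketch of the move behaviour described above, assuming a C++11 or later compiler. `join_greeting` and `move_reuses_buffer` are hypothetical names. The exact allocation counts depend on the implementation: with a small-string optimization, strings as short as the ones in this example may not touch the heap at all.

```cpp
#include <string>
#include <utility>

// The same expression as in the post. Under C++11 the temporary holding
// s1 + s2 + s3 is move-assigned into s4 instead of being copied.
std::string join_greeting() {
    std::string s1 = "Hello,";
    std::string s2 = " Worl";
    std::string s3 = "d!";
    std::string s4;
    s4 = s1 + s2 + s3;
    return s4;
}

// Reports whether move construction handed the source's buffer to the new
// string. For heap-allocated (long) strings this is what mainstream
// implementations do; the standard does not spell it out as a guarantee.
bool move_reuses_buffer(std::string source) {
    const char* before = source.data();
    std::string target(std::move(source));
    return target.data() == before;
}
```

Calling `move_reuses_buffer(std::string(1000, 'x'))` is a quick way to see the buffer being handed over rather than copied.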

Quote:
Original post by cache_hit
Quote:
Original post by theOcelot
Quote:
Original post by snake5
istringstream makes useless copies of the string, allocates memory often when it's not necessary.

I really doubt that a half-decent implementation would do that.


It's not always possible to prevent. Consider, for example, the following:

He was talking about stringstream, not the string concatenation operator.
