jorgander

std namespace have something analogous to sscanf?

Is there a function in the std namespace analogous to sscanf that parses input from a passed-in argument? What I'd like to avoid is creating a std object that allocates memory every time I want to parse something from a memory buffer. Example:
// Without std:
bool Parse(const char * Value, float & Size, bool & Relative)
{
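    const size_t Length = strlen(Value);
    Relative = ( Length > 0 && Value[Length - 1] == '%' );   // guard against an empty string
    return ( sscanf(Value, "%f", &Size) == 1 );               // %f, not %d: Size is a float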
}

// With std:
bool Parse(const char * Value, float & Size, bool & Relative)
{
    std::stringstream temp(Value);
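    temp.seekg(-1, std::ios_base::end);   // last character; an offset of 1 would seek past the end
    Relative = ( temp.peek() == '%' );
    temp.seekg(0, std::ios_base::beg);
    temp >> Size;
    return ( !temp.fail() );              // good() is also false when the read merely reached end-of-input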
}

In the latter example, I'm assuming the stringstream constructor allocates memory and copies the data, which like I said is what I'm trying to avoid.
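
For comparison, here is a minimal sketch of a variant that parses straight from the caller's buffer with std::strtof from <cstdlib>, so no stream or string object is constructed. The name ParseSize and the choice to leave Size untouched on failure are assumptions made for illustration, not something from the original code.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: reads a leading float from a NUL-terminated buffer and
// reports whether the text ends in '%'. The function itself constructs no
// std::string or stream; it only walks the caller's buffer.
bool ParseSize(const char* value, float& size, bool& relative)
{
    assert(value != nullptr);

    const std::size_t length = std::strlen(value);
    relative = (length > 0 && value[length - 1] == '%');

    char* end = nullptr;
    const float parsed = std::strtof(value, &end);
    if (end == value) {
        return false;   // nothing was consumed, so there was no number
    }
    size = parsed;      // only written on success
    return true;
}
```

Called as ParseSize("12.5%", size, relative), this should leave size at 12.5 and relative set to true. Note that std::strtof honours the current C locale, which matters if the decimal separator is not '.'.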

Quote:
Original post by jorgander
In the latter example, I'm assuming the stringstream constructor allocates memory and copies the data, which like I said is what I'm trying to avoid.

In the former example you're assuming that sscanf() doesn't allocate and copy. Hmm, is micromanaging library code really going to net you that big bonus come payday?

That is correct, I do not know what sscanf does behind the scenes either, but I think it's safe to say that the std version *does* copy the data. The only way I can be sure is to roll my own, which I'd rather not do for simple stuff like this.

As to your second point, no this is not for pay, it's my own personal work. And it's for stuff like reading data from a file, such as a 3D model file, that could contain hundreds of thousands of values.

It's a simple question, really: Does std have support for non-OO stuff like this, or some way that its objects won't copy data during construction, just retain a pointer to it? In my example, would passing std::ios_base::in as the second constructor parameter (specifying the stringstream as read-only afaik) still result in it being copied?

Quote:
Original post by SiCrane
sscanf() is in namespace std if you include the cstdio header.


I chose sscanf as an example because I do actually have a function like the example I gave. I should have been more descriptive in the first post, but I didn't want to write a book just to ask it. My question is more generally, do std objects/methods always copy data? In the simplest case of determining if two char *'s are equal:

regular strcmp: strcmp(first, second) == 0
std string: std::string(first).compare(second) == 0

strcmp() probably does not copy anything, whereas std::string probably makes a copy during construction.

Quote:
Original post by jorgander
My question is more generally, do std objects/methods always copy data? In the simplest case of determining if two char *'s are equal:

regular strcmp: strcmp(first, second) == 0
std string: std::string(first).compare(second) == 0

strcmp() probably does not copy anything, whereas std::string probably makes a copy during construction.

Why would std::string::compare() copy data? The only time that a std::string should copy data is when it is constructed from, or assigned, other data - which comparing doesn't involve.

Now if you are comparing two string constants, then yes, you must construct the strings before comparing them, but if you already have two std::strings, comparing them is going to do exactly the same thing as strcmp.

Also don't lose sight of the fact that a std::string is (at a very basic level), little more than an object-oriented wrapper around a C character buffer. Most operations (with the exception of constructing from string literals), will have basically the same implementations.

If sscanf is suitable for your needs, why not use it? It is part of the standard library.

As to strcmp versus string::compare, this looks a little stupid. Naturally the string version would create two temporary strings, so you get the same complexity + overhead of creating temporary strings.

However, if you used strings in the first place, there would be no additional strings needed to perform a ==. And the latter can potentially be more efficient than strcmp, because it can start by checking if the strings are the same length in the first place.

Quote:
Original post by SiCrane
strcmp() is also in namespace std if you include the cstring header.


Hehe, I used strcmp as an example to get my point across. And although I guess I would ask about it too, I was wondering more about formatted i/o. And by "in the std namespace" I suppose I meant a class derived from std::istream or something along those lines. Perhaps I need to do a bit more research before I ask more. Anyway, thanks for the replies.

Quote:
Original post by jorgander
Is there a function in the std namespace analogous to sscanf that parses input from a passed-in argument? What I'd like to avoid is creating a std object that allocates memory every time I want to parse something from a memory buffer. Example:

*** Source Snippet Removed ***

In the latter example, I'm assuming the stringstream constructor allocates memory and copies the data, which like I said is what I'm trying to avoid.


Good grief! Use a stringstream and get on with life :)

Is stringstream or string copying showing up in your profiling? I'm guessing not. Write the clearest code possible for starters. Make it whacky and unsafe only when it needs to be made whacky and unsafe.

There used to be a class called strstream, which allowed you to manage your own memory. But it was so daft that the standards committee thankfully decided to deprecate it.

Usually, you'd allocate one instance of stringstream per model loader, then use that.

That means it will grow to the size of the largest element, after which no further allocations should be needed. Allocating a new stream for each value is your own choice in this case.
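
A minimal sketch of that idea, assuming a loader that reads whitespace-separated floats one line at a time. The class name ModelLoader and its ParseLine member are made up for illustration. The characters of each line are still copied into the stream by str(); what is saved is constructing a fresh stream per value, and whether the internal buffer's storage is actually reused is left to the library implementation.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical loader that keeps one stream alive for its whole lifetime.
class ModelLoader {
public:
    // Parses every whitespace-separated float found in one line of text.
    std::vector<float> ParseLine(const std::string& line)
    {
        stream_.clear();       // drop eofbit/failbit left over from the previous line
        stream_.str(line);     // replace the contents and rewind the read position
        assert(!stream_.bad());

        std::vector<float> values;
        float value = 0.0f;
        while (stream_ >> value) {
            values.push_back(value);
        }
        return values;
    }

private:
    std::istringstream stream_;   // constructed once, reused for every line
};
```

The clear() call matters: without it, the failbit set when the previous line ran out of numbers would make every later extraction fail immediately.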

The author of this paper is referring more or less to what I'm asking about when he says on page 27: "Since istreams want to read from objects, and not arbitrary data, ...", then on page 30: "Examining the source and disassembly of the various methods showed that the ifstream IO, under Visual C++, simply spent a lot of time doing housekeeping". This and other sources like it helped confirm what I already suspected about std streams: they generally aren't ideal for quick operations such as this. I'll restate my original question and say that if anyone knows differently, please let me know.

Once again, thanks for your replies.

Quote:
Original post by jorgander
The author of this paper is referring more or less to what I'm asking about when he says on page 27: "Since istreams want to read from objects, and not arbitrary data, ...", then on page 30: "Examining the source and disassembly of the various methods showed that the ifstream IO, under Visual C++, simply spent a lot of time doing housekeeping".

Now I don't know about VC++'s stdlib, but libstdc++ has an option to disable buffer interfacing with the C-style printf family of functions. This cuts most of the housekeeping out, and can speed things up a lot.

Quote:
Original post by swiftcoder
Now I don't know about VC++'s stdlib, but libstdc++ has an option to disable buffer interfacing with the C-style printf family of functions. This cuts most of the housekeeping out, and can speed things up a lot.


This option being ... ?
I'm interested: my project manipulates a lot of strings and streams, and speeding this up a bit would definitely be good :)

Quote:
Original post by paic
Quote:
Original post by swiftcoder
Now I don't know about VC++'s stdlib, but libstdc++ has an option to disable buffer interfacing with the C-style printf family of functions. This cuts most of the housekeeping out, and can speed things up a lot.


This option being ... ?
I'm interested: my project manipulates a lot of strings and streams, and speeding this up a bit would definitely be good :)

It is described under "Pathetic Performance? Ditch C" - though I see on re-reading that it only affects cin, cout and cerr. File streams and string streams should be pretty much as fast as their C equivalents already.
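
For reference, the standard call that controls this is std::ios_base::sync_with_stdio. A minimal sketch follows; the wrapper name DisableStdioSync is made up for illustration, and untying cin from cout is an extra step that is commonly paired with it rather than part of the option itself.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical wrapper; both calls inside it are standard library functions.
// Returns whether the standard streams were synchronised before the call.
bool DisableStdioSync()
{
    // Stop keeping cin/cout/cerr in step with C stdio (printf, scanf, ...).
    const bool was_synced = std::ios_base::sync_with_stdio(false);

    // Optional extra: do not flush cout before every read from cin.
    std::cin.tie(nullptr);
    assert(std::cin.tie() == nullptr);

    return was_synced;
}
```

This should be called before any input or output has happened on the standard streams, and once it is in effect, mixing printf with cout on the same output can interleave unpredictably. As noted above, it has no effect on file streams or string streams.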

Quote:
Original post by jorgander
regular strcmp: strcmp(first, second) == 0
std string: std::string(first).compare(second) == 0

strcmp() probably does not copy anything, whereas std::string probably makes a copy during construction.


But if you were doing things sanely, you would have already had std::string objects to compare.

You should also be aware that std::string is probably smarter about copies than you think. It can very often do things faster, too, because it carries length information around with it: if you do iterated strcat() calls, for example, you're constantly implicitly re-strlen()ing, and iterating through the string each time for that. You might be smart enough, when concatenating things in a loop to keep track of the "end point", but when you start doing things across function calls, you find you need to pass a second parameter for the length... and then you get the idea to bind these two things together in a struct... and before you know it, you've reinvented the wheel (probably badly).

And BTW, you don't need to compare them like that: with std::strings, you can just write first == second. Imagine that!

I think this illustrates something important: instead of looking for something natural, your C background guides your eye to the first thing that looks vaguely like the interface of the C equivalent, which is constrained by that language. This is the same phenomenon that leads to abominations like 'for i in range(len(container)):' in Python. I think of this as brain damage. Regrettably, it's very, very common.

Oh, and just for the heck of it - I would have written the original example as something more like (not tested):


// A wrapper I like to keep handy. Caller is responsible for checking stream
// state or setting exception flags, according to how it wants to do things.
template <typename T> T extract(std::istream& is) {
    T result; is >> result; return result;
}

// FIXME: needs a more descriptive function name.
std::pair<float, bool> Parse(const std::string& Value) {
    std::stringstream temp(Value);
    temp.exceptions(std::ios::failbit);   // a failed extraction now throws std::ios_base::failure
    // The empty() check avoids indexing Value[-1] on an empty string.
    return std::make_pair(extract<float>(temp), !Value.empty() && Value[Value.length() - 1] == '%');
}

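I think you can do something like this to avoid allocating a copy of Value inside the stringstream, although what setbuf does for a string buffer is implementation-defined, so the request may simply be ignored:
std::stringstream temp;
// pubsetbuf takes a non-const char*, so the const on Value has to be cast away
temp.rdbuf()->pubsetbuf(const_cast<char*>(Value), std::strlen(Value) + 1);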
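
A more portable way to get the same effect is to derive from std::streambuf and point its get area at the existing memory with setg. The sketch below is only an illustration: MemoryBuf and ParseFloatInPlace are made-up names, and while the character data is not copied, constructing the std::istream itself still has some cost.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>

// Hypothetical read-only stream buffer that refers to caller-owned memory.
// The memory must outlive the buffer and any stream using it.
class MemoryBuf : public std::streambuf {
public:
    MemoryBuf(const char* data, std::size_t size)
    {
        assert(data != nullptr);
        // setg() wants non-const pointers, but this class never writes
        // through them: no output area is set and pbackfail() is not overridden.
        char* begin = const_cast<char*>(data);
        setg(begin, begin, begin + size);
    }
};

// Extracts one float directly from [data, data + size) without copying it.
bool ParseFloatInPlace(const char* data, std::size_t size, float& out)
{
    MemoryBuf buffer(data, size);
    std::istream input(&buffer);
    input >> out;
    return !input.fail();
}
```

Because the default underflow() reports end-of-file once the get area is exhausted, the stream simply hits EOF at the end of the supplied range, so the buffer does not need to be NUL-terminated.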
