Is std::string's memory always contiguous?

Started by
18 comments, last by visitor 15 years, 3 months ago
I've always searched for a way to use std::string as a buffer, and I've figured out a way, but I wanted to know whether the C++ specification says a std::string's buffer must be contiguous. Thanks.
--------------------------------------Not All Martyrs See Divinity, But At Least You Tried
I believe it's implementation defined and is not guaranteed to be a contiguous block of memory... let me dig this up though and see if I can validate what I just said... I'm pretty sure it's not though. Meaning you can't do

std::string s = "Hello World";
std::cout << &s[0];

You can use c_str() or data() to get a const char* that's contiguous though. If you're looking for a buffer to store things in, you may want to consider std::vector<char>
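A minimal sketch of the std::vector<char> suggestion, in case it helps: the vector serves as the writable buffer, and its bytes are then copied into a std::string so that find() and friends are still available. fill_buffer is a hypothetical stand-in for whatever API actually writes into the buffer (a file read, a Win32 call), not a real library function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for an API that writes raw bytes into a caller-supplied
// buffer. Returns the number of bytes written.
std::size_t fill_buffer(char* dest, std::size_t capacity) {
    static const char source[] = "Hello World";
    const std::size_t n = sizeof(source) - 1;  // exclude the trailing '\0'
    assert(capacity >= n);
    std::memcpy(dest, source, n);
    return n;
}

// Read into a vector<char>, whose storage is contiguous, then copy the bytes
// into a std::string to get find(), substr(), and so on.
std::string read_as_string() {
    std::vector<char> buffer(64);
    const std::size_t written = fill_buffer(&buffer[0], buffer.size());
    return std::string(&buffer[0], written);
}
```

The cost is one extra copy from the vector into the string, which is the overhead to weigh against keeping the string interface.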

[edit]

Well the C++ International Standard doesn't mention the word "contiguous" in the "String Classes" section, and my memory tells me it isn't stored contiguously, so I'm going to say that's my evidence to back up my original statement [smile]
[ I was ninja'd 71 times before I stopped counting a long time ago ] [ f.k.a. MikeTacular ] [ My Blog ] [ SWFer: Gaplessly looped MP3s in your Flash games ]
The memory returned by .c_str() or .data() will be contiguous, but the returned pointer is only defined as valid until the next modifying (non-const) operation on the string.
But you probably have some better choices for a buffer, depending on what you are doing.

vector<char> is going to be better, because it won't interpret binary data as a string; thus '\0' won't break a vector, whereas it will invoke undefined behavior on a string.

stringstream might be better for throwing string data into a buffer because, like any other iostream, it has the << operator overloaded for POD types, so you can quickly insert formatted data.
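To illustrate the stringstream suggestion, here is a small sketch that formats a few values into an in-memory buffer with operator<<. The function name and its parameters are made up for the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds a formatted line in memory using the same << operators as std::cout.
// describe_player and its parameters are hypothetical, for illustration only.
std::string describe_player(const std::string& name, int score, double ratio) {
    std::ostringstream out;
    out << name << " scored " << score << " (ratio " << ratio << ")";
    assert(out.good());  // formatting into memory is not expected to fail
    return out.str();    // str() returns a copy of the accumulated characters
}
```

For example, describe_player("Ann", 42, 0.5) yields the text "Ann scored 42 (ratio 0.5)" with the default stream formatting.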
Well, if the buffer is contiguous, even for the one operation, then it'd be safe to do this:

char str1[] = "Hello World";
std::string str2(11,(char)0);

memcpy(&str2[0],str1,11);


which is in the vein of what I'd want to do with it, more or less (fstream stuff, Win32 buffers).

I doubt the buffer would move around much, no more than a vector's would when it's resized.


The reason I didn't want to use a vector is that I'd lose std::string's string searching and whatnot, and to use C's string library I'd have to add '/0's and whatnot.

Plus, std::string is generally considered good C++ practice.
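For reference, the memcpy idea above can be wrapped up as in the sketch below. It assumes a standard library that stores the string's characters contiguously; C++11 and later require this, while the older C++03 text did not spell it out. copy_into_string is a name invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Copies `length` bytes from `source` into a std::string by writing through
// &result[0]. Assumption: the library stores the characters contiguously
// (required from C++11 onward).
std::string copy_into_string(const char* source, std::size_t length) {
    // Size the string first; only the first size() characters may be written.
    std::string result(length, '\0');
    if (length > 0) {
        std::memcpy(&result[0], source, length);
    }
    assert(result.size() == length);
    return result;
}
```

The important detail is that the string is given its final size before the copy, because writing beyond size() through that pointer is not allowed even when capacity() is larger.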
--------------------------------------Not All Martyrs See Divinity, But At Least You Tried
Quote:Original post by KulSeran
The memory returned by .c_str() or .data() will be contiguous, but the returned pointer is only defined as valid until the next modifying (non-const) operation on the string.
But you probably have some better choices for a buffer, depending on what you are doing.

vector<char> is going to be better, because it won't interpret binary data as a string; thus '\0' won't break a vector, whereas it will invoke undefined behavior on a string.

stringstream might be better for throwing string data into a buffer because, like any other iostream, it has the << operator overloaded for POD types, so you can quickly insert formatted data.



Isn't '/0' just the escape sequence for ASCII NUL, the same as (char)0 or char x = 0? I don't see how that would break a std::string.
--------------------------------------Not All Martyrs See Divinity, But At Least You Tried
Quote:Original post by godsenddeath
Isn't '/0' just the escape sequence for ASCII NUL, the same as (char)0 or char x = 0? I don't see how that would break a std::string.

It wouldn't break it, but it may yield some unexpected behavior if the programmer doesn't know about that:

std::string sString( "Hello\0World!" ); // sString.c_str() would now return a pointer to the C-string "Hello"

But you can do the following to append the whole string:

std::string sString;
sString.append( "Hello\0World", 11 );
[/CODE
and use c_str() to read the whole contiguous buffer.
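A short sketch of the difference described above, for anyone who wants to check it: constructing from a C-string literal stops at the first '\0', while append() with an explicit length keeps every character. The function names are invented for the example.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Built from a C-string literal: the constructor stops at the first '\0'.
std::string from_c_string() {
    return std::string("Hello\0World");
}

// Built with an explicit length: all 11 characters, including the embedded
// '\0', are stored.
std::string with_explicit_length() {
    std::string s;
    s.append("Hello\0World", 11);
    return s;
}

// Spells out what each string actually holds.
void check_embedded_null() {
    assert(from_c_string().size() == 5);
    assert(with_explicit_length().size() == 11);
    // c_str() points at all 11 characters, but C functions such as strlen
    // still stop at the first '\0'.
    assert(std::strlen(with_explicit_length().c_str()) == 5);
}
```

So the string itself is fine with embedded nulls; only code that treats the result as a C-string sees it cut short, which is why size() should be passed along with the pointer.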
Well, after a little informed searching, I was wrong. Somewhere I had gotten the impression that std::strings were free to discard data after the first encountered NUL, but that isn't the case.
Forget I said anything about it; it seems OK.
I'll keep that in mind, thanks
--------------------------------------Not All Martyrs See Divinity, But At Least You Tried
Quote:Original post by godsenddeath
the reason I didn't want to use a vector is that I'd lose std::string's string searching and whatnot

I can understand that

Quote:and to use C's string library I'd have to add '/0's and whatnot.

But I don't understand this. I'm going to assume '/0' is a typo and you meant '\0' [grin]. Do you mean you're trying to read a chunk of a text file, but if you were to read the chunk as a std::vector<char> that you'd have to add '\0' to the end of the vector in order for it to behave like a C-string? Why not use the std::string constructor that lets you pass in a char* buffer that's not necessarily null terminated? This link shows a little about which constructor I'm talking about. It's the fourth one down in the list. In other words, something like this:

#include <iostream>
#include <string>

int main()
{
    // Make our non-null terminated chunk of text (pretend it was read from a file, and that's
    // why there isn't a '\0' tacked onto the end of it)
    const char buffer[] = { 'H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '!' };

    // Copy the contents of buffer into the string.  We tell the constructor that buffer
    // contains 12 chars.  Note that the constructor doesn't require a null terminated
    // buffer.
    std::string str(buffer, 12);

    // And success!
    std::cout << str << std::endl;

    // And now we read a new non-null terminated chunk of text from our file
    const char newBuffer[] = { 'H', 'i' };

    // And we can either create a new string with newBuffer's contents, or we can reset
    // str's value to that of newBuffer by using the assign method.  It's similar to the
    // constructor.  We pass an array of const chars and tell it how big that array is.
    str.assign(newBuffer, 2);

    // And again, success!
    std::cout << str << std::endl;
}


Quote:I doubt the buffer'd move around much, no more than a vector's would when its resized

The string's buffer is never, ever guaranteed to be contiguous. Even if you set it to "hello world" and then never, ever manipulate the std::string, it is not obligated to store "hello world" in one contiguous buffer. Not changing the string's buffer doesn't change the fact that it's not guaranteed to be contiguous in the first place.
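Worth noting for anyone reading this later: the statement above matches the wording of the C++03 standard, but C++11 and later do require that the characters of a std::basic_string be stored contiguously, expressed as the identity &*(s.begin() + n) == &*s.begin() + n for 0 <= n < s.size(). The sketch below simply checks that identity at run time; is_contiguous and check_contiguity are names invented for the example, and the check is only meaningful on a library that already provides the guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true when character i of s lives at address &s[0] + i for every
// index, which is the contiguity identity C++11 and later require.
bool is_contiguous(const std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (&s[i] != &s[0] + i) {
            return false;
        }
    }
    return true;
}

// Example use on the string discussed in the thread.
void check_contiguity() {
    const std::string text = "hello world";
    assert(is_contiguous(text));
}
```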

@bosin: try using lowercase code tags and adding a ']' to that last tag [smile]
[ I was ninja'd 71 times before I stopped counting a long time ago ] [ f.k.a. MikeTacular ] [ My Blog ] [ SWFer: Gaplessly looped MP3s in your Flash games ]
Quote:Original post by MikeTacular
But I don't understand this. I'm going to assume '/0' is a typo and you meant '\0' [grin]. Do you mean you're trying to read a chunk of a text file, but if you were to read the chunk as a std::vector<char> that you'd have to add '\0' to the end of the vector in order for it to behave like a C-string? Why not use the std::string constructor that lets you pass in a char*? This link shows a little about which constructor I'm talking about. It's the fourth one down in the list. In other words, something like this:


Yes, I meant '\0'. As for why not: that adds dynamic allocation overhead and an extra step I'd like to avoid if at all possible.

Quote:The string's buffer is never, ever guaranteed to be contiguous. Even if you set it to "hello world" and then never, ever manipulate the std::string, it is not obligated to store "hello world" in one contiguous buffer. Not changing the string's buffer doesn't change the fact that it's not guaranteed to be contiguous in the first place.


It's not guaranteed, huh? That's a pity.
--------------------------------------Not All Martyrs See Divinity, But At Least You Tried

