kristoferos

std::string bug?


Hello, is it possible to have a string containing the '\x00' character without it being treated as the end of the string? For example, I want to have some data after the '\x00' character, such as "\xFF\xFF\xFF\xFF\x00\xFF".

What problems did you encounter? Using G++, I can put null characters in my strings without any effort.

Quote:
MSDN page:
Objects of type string belonging to the C++ template class basic_string<char> are not necessarily null terminated. The null character '\0' is used as a special character in a C-string to mark the end of the string but has no special meaning in an object of type string and may be a part of the string just like any other character.

An STL string does not have to be terminated by the \x00 character, but then you do actually have to tell it what size it should be.
C string-processing functions mostly rely on null termination by nature, so I suppose it really depends on what you are planning to do with that string.

Quote:
Original post by paulecoyote
An STL string does not have to be terminated by the \x00 character, but then you do actually have to tell it what size it should be.



std::string foo = "foo";
foo[1] = '\0';


Now foo is the string made of the characters f, \0, o (in that order).

And if it does give you trouble, there's a nifty container, std::vector, that would work just fine without any second thoughts about it.

Oh, and null-terminated strings really are crap for many, many things; for heavy string work they are simply among the worst there is.

The problem is most likely that you tried to assign it from a string literal. A string literal can contain \0, but when it is converted to a std::string through a plain char pointer, the constructor or assignment treats the first \0 as the end of the string. If you do mystring = std::string(string_with_nulls, length) instead of mystring = string_with_nulls, it will work fine. If you have nulls in the string, the c_str member function is also of little use: anything that treats the result as a C string will stop at the first null.
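
To make the difference concrete, here is a minimal sketch using the six bytes from the original question. The names kRaw, kRawLen, from_c_string and from_buffer are made up for illustration; only the two std::string constructors are standard.

```cpp
#include <cstddef>
#include <string>

// The six bytes from the original question: FF FF FF FF 00 FF.
// sizeof also counts the terminator the compiler appends, hence the -1.
static const char kRaw[] = "\xFF\xFF\xFF\xFF\x00\xFF";
static const std::size_t kRawLen = sizeof(kRaw) - 1;

// std::string(const char*): scans for the first '\0' and stops there.
std::string from_c_string() {
    return std::string(kRaw);
}

// std::string(const char*, size_t): copies exactly kRawLen bytes.
std::string from_buffer() {
    return std::string(kRaw, kRawLen);
}
```

With these definitions, from_c_string() should hold 4 characters and from_buffer() all 6, with the null at index 4.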

Thanks for all of your replies. Passing the length as an argument worked, but how would I get the length of big strings that change all the time?

Quote:
Original post by ToohrVyk
Quote:
Original post by paulecoyote
An STL string does not have to be terminated by the \x00 character, but then you do actually have to tell it what size it should be.



std::string foo = "foo";
foo[1] = '\0';


Now foo is the string made of the characters f, \0, o (in that order).


I meant if you used strcpy or something like that, which uses the null terminating character as a stop. What you've done there is declare something with 3 elements and change the 2nd one, so that is bound to work.

Quote:
Original post by paulecoyote
I meant if you used strcpy or something like that, which uses the null terminating character as a stop. What you've done there is declare something with 3 elements and change the 2nd one, so that is bound to work.


Then the obvious solution is to not use the functions in the cstring header and use std::string's many member functions or algorithms that are iterator aware. For example, why use strcpy when you are working with std::string? std::string has a perfectly usable copy constructor and assignment operator.

You can do what you asked; just be aware that if you EVER have to use the .c_str() function, you will almost certainly not get the behavior you want, because all c_str users are going to expect a null-terminated C-style string.
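
A small sketch of that pitfall, assuming the consumer of c_str() measures the string the way C functions do. The helper names c_length and sample are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length as seen by C APIs that scan for a terminator.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Builds the five characters a, b, '\0', c, d without relying on a terminator.
std::string sample() {
    return std::string("ab\0cd", 5);
}
```

Here sample().size() is 5, but c_length(sample()) is only 2. Functions that take a pointer together with a length, such as std::ostream::write or std::fwrite, can be given data() and size() and will see every byte.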

*sigh* Code:


// The space after \x00 keeps 'e' and 'f' from being read as part of the hex escape.
const char* testString=("abcd\x00 efghijk");
cout << "testString: " << testString << endl;
std::string nullTest(testString);
if (4==nullTest.length())
{
    cout << "Well looks like only the first bit of the string was copied across, just like I was trying to say all along: " << nullTest << endl;
}
else
{
    cout << "All the string was copied, my bad." << endl;
}

std::string nullTest2;
nullTest2.resize(13); // not strictly needed: assign() below sets the length itself
nullTest2.assign( testString, 13);
if ((13==nullTest2.length()) && ('k'==nullTest2[12]))
{
    cout << "Well looks like the whole string was copied this time, because you told it what size it should be and used assign to tell it how many characters to copy: " << nullTest2 << endl;
}
else
{
    cout << "Not all the string was copied, my bad." << endl;
}








Edit - updated to more comprehensive example

As you see (and probably know), SiCrane, when you have nulls embedded in a C-style string, using the std::string copy constructor still does not copy across the whole string. I was assuming from the original post that he was assigning C-style strings to a std::string.

Copying between different instances of std::string is of course a whole different kettle of herring [wink], OP, being that they are a lot smarter than C-style strings.

[Edited by - paulecoyote on August 3, 2005 3:06:28 AM]

Quote:
Original post by kristoferos
Thanks for all of your replies. Passing the length as an argument worked, but how would I get the length of big strings that change all the time?


Err, good question. Windows represents the environment variable block as a sequence of null-terminated strings and marks the end of the block with two null characters next to each other. This lends itself to looping until you come across that double-null marker, but it doesn't work if your data has that marker in the body.
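
Walking such a block could look roughly like this. It is only a sketch: split_double_null is a made-up name, and it assumes the caller guarantees the block really does end with an empty string.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Splits a block of back-to-back C strings that ends with an empty string
// (i.e. two consecutive '\0' bytes) into separate std::string values.
std::vector<std::string> split_double_null(const char* block) {
    std::vector<std::string> out;
    while (*block != '\0') {
        std::size_t len = std::strlen(block);
        out.push_back(std::string(block, len));
        block += len + 1;  // skip the entry and its terminator
    }
    return out;
}
```

Because an empty entry is indistinguishable from the end marker, this layout cannot carry arbitrary binary data, which is the limitation mentioned above.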

This is more about storage strategy and depends on your data. For example, you may want to write how many characters are in the string first, then follow it with the string itself. Maybe something like "5\tabcde", where 5 is how many characters to expect, the tab is thrown away (and marks the end of the number), and the rest that follows is the 5 characters.
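
The length-prefix idea might be sketched like this. The encode and decode names and the exact record layout (decimal count, one tab, then the payload) are assumptions for illustration, not an established format, and the sketch does not guard against absurdly large counts.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Writes "<decimal length>\t<payload>"; the payload may contain any bytes.
std::string encode(const std::string& payload) {
    std::ostringstream out;
    out << payload.size() << '\t';
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    return out.str();
}

// Reads one record back. Returns false if the prefix is malformed or the
// record holds fewer bytes than the prefix promises.
bool decode(const std::string& record, std::string& payload) {
    std::size_t tab = record.find('\t');
    if (tab == std::string::npos || tab == 0) return false;
    std::size_t len = 0;
    for (std::size_t i = 0; i < tab; ++i) {
        if (record[i] < '0' || record[i] > '9') return false;
        len = len * 10 + static_cast<std::size_t>(record[i] - '0');
    }
    if (record.size() - tab - 1 < len) return false;
    payload.assign(record, tab + 1, len);
    return true;
}
```

Once the bytes are inside a std::string, size() always reports the real length no matter how often the contents change, so the count only has to be written out at the point where the data leaves the string.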

Quote:
Original post by paulecoyote
*sigh* Code:

All that code proves is that you don't understand the term "copy constructor". A copy constructor is a constructor for a class X that takes as an argument a cv qualified reference to X. That's it, no other constructor for a class X is a copy constructor.

Deyja already posted the correct way to create a std::string from a non-null terminated pointer to a char array. My statement was directed exclusively at your post about the strcpy() function, which still stands.

Quote:
Original post by SiCrane
Quote:
Original post by paulecoyote
*sigh* Code:

All that code proves is that you don't understand the term "copy constructor". A copy constructor is a constructor for a class X that takes as an argument a cv qualified reference to X. That's it, no other constructor for a class X is a copy constructor.

Deyja already posted the correct way to create a std::string from a non-null terminated pointer to a char array. My statement was directed exclusively at your post about the strcpy() function, which still stands.


Changed the one line to use the copy constructor, rather than just the constructor. The code still has the same output.


const char* testString=("abcd\x00 efghijk");
cout << "testString: " << testString << endl;
std::string nullTest;
// Just for SiCrane
nullTest = (testString);
if (4==nullTest.length())
{
    cout << "Well looks like only the first bit of the string was copied across, just like I was trying to say all along: " << nullTest << endl;
}
else
{
    cout << "All the string was copied, my bad." << endl;
}

std::string nullTest2;
nullTest2.resize(13);
nullTest2.assign( testString, 13);
if ((13==nullTest2.length()) && ('k'==nullTest2[12]))
{
    cout << "Well looks like the whole string was copied this time, because you told it what size it should be and used assign to tell it how many characters to copy: " << nullTest2 << endl;
}
else
{
    cout << "Not all the string was copied, my bad." << endl;
}




Quote:
Original post by paulecoyote
but then you do actually have to tell it what size it should be.

... which Deyja followed up with a concrete example (telling it how many characters were in the source string, i.e. telling it what size it should be):
Quote:
Original post by Deyja
std::string(string_with_nulls,length)


Quote:
Original post by kristoferos
Thanks for all of your replies, passing the length as an argument worked


So we are agreed that the correct way is to tell an STL string how many characters to expect for a string with embedded nulls, right? Through whatever method (construction, resizing, whatever).

Quote:
Original post by SiCrane
std::string has a perfectly usable copy constructor and assignment operator.

But I stand by what I was saying: passing a C-style string with embedded null characters to an STL string copy constructor won't work, as that code fragment was trying to illustrate.

As for the strcpy stuff, I meant processing before the data is handed over to the STL string; as you say, it isn't very advisable to use that stuff on an STL string. In the context of the original poster's problem, he seems to be reading things in from somewhere else and then putting them in the string.

Quote:
Original post by paulecoyote

Changed the one line to use the copy constructor, rather than just the constructor. The code still has the same output.

You still either don't seem to grasp that a char * is not a std::string or you still don't grasp what a copy constructor is. You are still not calling a copy constructor, in fact you are even farther from calling a copy constructor because you are calling an assignment operator. Let me repeat: a copy constructor for an object X is a constructor that takes a cv qualified reference to X. That's it. Constructing a string from a char * is not copy construction. Only constructing a std::string from a std::string is copy construction.

Quote:

But I stand by what I was saying: passing a C-style string with embedded null characters to an STL string copy constructor won't work, as that code fragment was trying to illustrate.

Which, you don't seem to understand, no one is debating. (Edit: aside from the fact that you insist on calling it a copy constructor when it isn't.)

[Edited by - SiCrane on August 3, 2005 9:42:43 AM]

Quote:
Original post by paulecoyote
As you see (and probably know), SiCrane, when you have nulls embedded in a C-style string, using the std::string copy constructor still does not copy across the whole string.


Yes it does. The C string ends at the first '\0' you put in there. The subsequent characters are not part of the string.

Of course, it does not copy the characters you put after the '\0', which are not part of the string, but are part of the character array.

Quote:
Changed the one line to use the copy constructor, rather than just the constructor. The code still has the same output.


I'm sorry, but you're not using a copy constructor in there.


std::string a( 10, '\0' );
std::string b(a); // copy constructor
std::string c = "foo!";
c = a; // assignment operator




In the above example, a contains only '\0' characters. Still, the copy constructor and the assignment operator both preserve all 10 of these '\0' characters, something strcpy wouldn't do, which was precisely SiCrane's point.

ah bollocks. Fine. So I slightly messed up my terminology *again* with a crappy example that I'm sick to the teeth of now that I was doing to illustrate a point when I should have probably been doing something else anyway. Obviously I'm just like the crappest coder ever. Perhaps I should be shot. Or hung. Geeze I was just trying to help and I'm getting picked on. Sometimes I wonder why I bother posting, it's obviously all rubbish anyway.


const char* testString=("abcd\x00 efghijk");

//Constructor. It's got a parameter. Wooooooo doggy
std::string arse(testString);

//Copy constructor, different from an assignment because the = sign is used at time of declaration
std::string biting = testString;

//Default constructor. Default cos there's nothing there.
std::string hussies;
//Overridden assignment. It's different from the copy constructor. It's an implementation of =
hussies = testString;





if you change that bloody line to
std::string nullTest = (testString);

it still compiles with the same result, because of the null terminating thingy there and that's all I was trying to get across. Which I think everyone agrees on anyway. So I don't know why I'm wound up. Sometimes I can *really* understand why people smoke pot. And why I shouldn't come here in the daytime.

paulecoyote: I can see what you've been saying, but your last post is still incorrect:
const char* cStyleString=("abcd\x00 efghijk");

//Constructor: std::string(char const *)
std::string string(cStyleString);

//Not a copy-constructor. Exactly the same as the above constructor: std::string(char const *)
std::string string2 = cStyleString;

//Default constructor: std::string()
std::string string3;

//Assignment operator: std::string & operator=(char const *)
string3 = cStyleString;

//Copy constructor: std::string(std::string const &)
std::string string4(string2);

//Copy constructor. Same as above: std::string(std::string const &)
std::string string5 = string2;

//Copy assignment operator: std::string & operator=(std::string const &)
string5 = string3;

The point SiCrane has been trying to make is that once you have your null-containing text inside a string you can deal with it safely and easily:
std::string initialString("Hello.\0Goodbye.\0Hello", 21);
std::string appendedString = initialString + " again.";
// appendedString = "Hello.\0Goodbye.\0Hello again."
appendedString.replace(appendedString.find("Goodbye"), 7, "Farewell");
// appendedString = "Hello.\0Farewell.\0Hello again."
std::cout << appendedString << '\n';
// prints "Hello.\0Farewell.\0Hello again.\n"
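
Since a terminal usually shows nothing for a '\0' byte, a tiny helper can make the embedded nulls visible when checking results like the ones above. This is just a sketch, and show_nulls is a made-up name.

```cpp
#include <string>

// Returns a copy of s in which every '\0' byte is replaced by the two
// visible characters backslash and zero, so embedded nulls show up in output.
std::string show_nulls(const std::string& s) {
    std::string out;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (s[i] == '\0') out += "\\0";
        else out += s[i];
    }
    return out;
}
```

Passing appendedString through show_nulls before streaming it should print the text with a literal \0 wherever a null byte sits.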

Enigma

I ran into a similar problem when I was writing my socket code. I was using std::strings for the socket buffer, and things got all messed up when I received a \x00. My solution was to use a std::vector instead, and when I knew I was dealing with string data, and I needed to convert to a string, I did this:

std::vector<char> theVector; // Vector filled with data
theVector.push_back(0);
const char* szData = &theVector[0]; // Vector data is guaranteed to be contiguous
std::string strData = szData;
theVector.erase(theVector.end()-1);


A bit hacky, but it works.

/me Waits for reasons why that's horrible code [smile]

Quote:
Original post by Evil Steve
I ran into a similar problem when I was writing my socket code. I was using std::strings for the socket buffer, and things got all messed up when I received a \x00. My solution was to use a std::vector instead, and when I knew I was dealing with string data, and I needed to convert to a string, I did this:
*** Source Snippet Removed ***
A bit hacky, but it works.

/me Waits for reasons why that's horrible code [smile]


Well for one, it'll truncate the string when you reach your first null... ;-) (Of course, without knowing the context you've used this in that may very well be intentional)

I would use:

std::string strData( &(theVector[0]) , theVector.size() );
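
One caveat with the line above: indexing element 0 of an empty vector is not allowed, so an empty buffer needs a guard. A possible alternative, sketched here with the made-up name bytes_to_string, is the iterator-range constructor.

```cpp
#include <string>
#include <vector>

// Copies every byte of the buffer, embedded nulls included. The iterator-range
// constructor is also well defined for an empty vector, where &buffer[0] is not.
std::string bytes_to_string(const std::vector<char>& buffer) {
    return std::string(buffer.begin(), buffer.end());
}
```

The result has the same size as the vector, and no temporary terminator has to be pushed and erased.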

Also, here's a nifty little wrapper that can solve all your null termination woes:

template < size_t size > std::string null_inclusive_string( const char (& string)[ size ] ) {
    return std::string( string , size );
}

std::string foo = null_inclusive_string( "foo\0bar\0baz\0Mary had a little lamb... FOR DINNER!!! MUAHAHAHAHAHAHA" );


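It works with fixed-size character arrays, which includes string literals (a string literal is an array of const char in standard C++, so this should not be specific to my version of GCC). Note that the size counts the array's terminating null as well. It's nifty :-). But of note: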

char foo[1000] = "hi";
std::string bar = null_inclusive_string( foo );
assert( bar.size() == 1000 );

Which may not be your desired intention. But of course one can use:
char foo[] = "hi"; //foo fixed sized @ 3
std::string bar = null_inclusive_string( foo );
assert( bar.size() == 3 );
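
If the trailing terminator is not wanted in the result, a variant of the wrapper above can drop it. This is a sketch under the assumption that the argument is a string literal or an exactly sized array; literal_to_string is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Variant of the wrapper above that leaves out the terminator the compiler
// appends to every string literal, keeping only the characters written.
template <std::size_t N>
std::string literal_to_string(const char (&text)[N]) {
    return std::string(text, N - 1);
}
```

With this, literal_to_string("foo\0bar") has size 7 rather than 8. From C++14 onward, the standard s suffix from std::string_literals gives the same result for "foo\0bar"s.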

</my2¢>
