jeff8j

c++ string pointer to char*

I am trying to avoid making an unnecessary copy of my data, which I believe string->c_str() would do,
so I'm trying to point directly to the string data, like &string[0].
My problem is that I have a string pointer, like this:

[code]
void somefunction( std::string* data ){
someotherfunction( data->c_str() ); //needs char*
}
[/code]

How can I do the equivalent of &data[0] with the pointer?

Your code should work as long as [font=courier new,courier,monospace]someotherfunction[/font] takes a [font=courier new,courier,monospace]const char*[/font].

If the problem is that it takes a [font=courier new,courier,monospace]char*[/font] (but does promise not to write to the buffer), then you can use the const-correctness hack:[code]someotherfunction( const_cast<char*>(data->c_str()) );//C++ style
someotherfunction( (char*)(data->c_str()) );//or C-style casting[/code]
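
A minimal sketch of that cast in context. The names legacy_length and the std::size_t return type are assumptions made up for illustration (the original someotherfunction is not shown in the thread), and the cast is only acceptable because the callee never writes through the pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical legacy API: declared with char*, but it only reads the buffer.
std::size_t legacy_length(char* text) {
    return std::strlen(text);
}

// Same shape as the function in the question, returning a value so the
// result can be checked.
std::size_t somefunction(std::string* data) {
    // const_cast removes the const from c_str()'s result. No copy is made;
    // writing through this pointer would be an error.
    return legacy_length(const_cast<char*>(data->c_str()));
}
```

Called as somefunction(&s) with a std::string s, this passes the string's own buffer to the legacy function.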

@yewbie I'm not quite sure what you're doing there, and it seems like you're still making duplicates of the string.

@hodgman It works with ->c_str(), but I'm trying to avoid making an extra copy of the data. That's why I was trying to point directly to the data.

Now that I'm looking at everything, I'm not sure c_str() is causing the extra data; I think it might have been me. So, going up a step: does c_str() allocate any memory, or is it a pointer to the data?

[quote name='Hodgman' timestamp='1348150523' post='4982032']
Your code should work as long as [font=courier new,courier,monospace]someotherfunction[/font] takes a [font=courier new,courier,monospace]const char*[/font].

If the problem is that it takes a [font=courier new,courier,monospace]char*[/font] (but does promise not to write to the buffer), then you can use the const-correctness hack:[code]someotherfunction( const_cast<char*>(data->c_str()) );//C++ style
someotherfunction( (char*)(data->c_str()) );//or C-style casting[/code]
[/quote]


Which, by the way, should ring the warning bells and blare the sirens. Simply put, casting away const is perhaps one of the biggest code smells you will ever encounter.

That's not to say it isn't the right way to go, just that it is something you need to be extremely wary of. It *IS* a code smell; it is just possible that what stinks is in fact a badly coded library and not your code.

Thanks yewbie.
It was me that was doing the extra memory allocation; I falsely blamed c_str(). But that page brings up a question: it says c_str() may or may not allocate memory. I'm handling big data on limited resources, so "maybe makes a copy" is a bit of an alarm. Is there a way to do my original idea of &data[0], but for strings, so that I can make sure I get a direct pointer?

I'm not worried about writing back to it; it will remain const, so no worry there. I'm just trying to guarantee that two 100MB strings plus program and OS overhead can run safely on a 512MB device.

[quote name='jeff8j' timestamp='1348150935' post='4982036']
Now that I'm looking at everything, I'm not sure c_str() is causing the extra data; I think it might have been me. So, going up a step: does c_str() allocate any memory, or is it a pointer to the data?
[/quote]
In C++11, c_str() and data() are the same, and both just return a pointer to the data (not a pointer to a copy of the data). Calling these functions is O(1).

In C++03, however, c_str() may return a copy of the internal buffer (iirc, C++03 didn't require strings to be stored in a contiguous buffer like C++11 does). For all practical purposes, chances are your compiler (if using C++03) does use a contiguous internal buffer and c_str() just returns a pointer to it.

Note that &data()[0] is not safe! There's no guarantee the internal string is null-terminated (in C++03), plus (again, iirc) there's no guarantee that it points to a contiguous buffer. However, most implementations use a contiguous array, so just using c_str() should be good enough and should never create a copy, unless you're working on an exotic system or with a weird compiler. Edited by Cornstalks
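
One way to check this on a given toolchain is to compare the pointers directly. The helper name shares_buffer is made up for this sketch. Under C++11 the check is expected to hold for every string; under C++03 a true result only tells you what that particular implementation does:

```cpp
#include <cassert>
#include <string>

// Returns true when c_str(), data() and &s[0] all refer to the same buffer,
// i.e. none of them handed back a separate copy.
bool shares_buffer(const std::string& s) {
    if (s.empty()) {
        // Avoid indexing an empty string; compare the two accessors only.
        return s.c_str() == s.data();
    }
    return s.c_str() == s.data() && s.data() == &s[0];
}
```

Running it once against a large string on the target compiler (for example a cross compiler) answers the "does it copy?" question for that implementation.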

Thanks Cornstalks, that's good news. I would assume it's C++11; I'm not sure about the cross compiler though, but I'll give it the benefit of the doubt for now.

@SiCrane That's what I was thinking, but I get the warning "cast to pointer from integer of different size" and a segfault when running.

Thanks everyone, it's much clearer in my head now. Everything is working with c_str().

One more thing to keep in mind with c_str()... DO NOT SAVE A COPY OF THE POINTER.

Period, ever. Nope, very zilch never. Got it?

A c_str() pointer dies with the containing object, leaving your reference dangling like a hand grenade waiting to explode. Also, any non-const method call on the originating std::string object may invalidate your pointer.

Therefore, always request a new c_str().
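
A small sketch of that rule, with a made-up helper name (length_after_append). The commented-out line marks the pattern to avoid; the pointer is fetched only after the string has stopped changing:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

std::size_t length_after_append(std::string& s, const std::string& suffix) {
    // const char* stale = s.c_str(); // do not keep this across the append
    s += suffix;                      // non-const call: the buffer may move
    const char* fresh = s.c_str();    // request the pointer again after the change
    return std::strlen(fresh);
}
```

Requesting the pointer again costs nothing measurable, since c_str() is O(1) on the implementations discussed in this thread.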

In your original post only pointers are being copied. However c_str() can create a new internal cstring that IS guaranteed to be null terminated. Using that function is likely causing the extra memory usage. And as was said before, any modification of the std::string, or its destruction, will leave the pointer returned from c_str() dangling. There is no way around this with std::string::c_str(). If you want other functionality, such as a class that holds one and only one C string inside it, then write your own; it isn't hard.

If you want to keep the value returned from c_str() you will need to copy it yourself.

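[source lang="cpp"]
// Needs <string> for std::string and <cstring> for strlen/strcpy.
std::string mystr = "Hello world!";

const char *cstr = mystr.c_str();

// Owning copy: stays valid after mystr is modified or destroyed.
char *copiedCStr = new char[strlen(cstr) + 1]; // +1 for the null terminator

strcpy(copiedCStr, cstr);
// Release it later with delete[] copiedCStr;
[/source] Edited by EddieV223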

[quote name='EddieV223' timestamp='1348202840' post='4982237']
[b]However c_str() can create a new internal cstring[/b] that IS guaranteed to be null terminated.
[/quote]
In C++03, yes. In C++11, no.

[quote name='EddieV223' timestamp='1348202840' post='4982237']
Using that function is likely causing the extra memory usage.
[/quote]
I actually find that very unlikely. The popular implementations that I know of certainly don't. It's technically possible, yes (if he's using C++03), but it's unlikely he's working with an implementation that does so, just because most implementations (both C++03 and C++11) don't create a whole copy of the string when c_str() is called. As the OP stated in a later post: "It was me that was doing the extra memory allocation; I falsely blamed c_str()."

[quote name='EddieV223' timestamp='1348202840' post='4982237']
If you want to keep the value returned from c_str() you will need to copy it yourself.
[code]
std::string mystr = "Hello world!";

const char *cstr = mystr.c_str();

char *copiedCStr = new char[mystr.size() + 1]; // no need to call an O(n) function when an O(1) function is available

strcpy(copiedCStr, cstr);
[/code]
[/quote]
Made a small improvement. Edited by Cornstalks
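
For what it's worth, the same owning copy can be sketched without manual new[] and delete[] by letting std::vector<char> hold the bytes. The function name copy_c_string is made up for this example, and v.data() assumes C++11 (use &v[0] on older compilers):

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Copies the characters of s plus the terminating null into a vector,
// which frees its memory automatically when it goes out of scope.
std::vector<char> copy_c_string(const std::string& s) {
    const char* p = s.c_str();
    return std::vector<char>(p, p + s.size() + 1); // +1 keeps the '\0'
}
```

The result can be handed to C-style APIs as v.data(). It is still a full copy, so it does not help with the memory budget in the original question.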

@Serapth I'm just using it for reading anyway. It's an encryption function, so one pointer for input and one for output is how I'm doing it.

@Cornstalks You're right, it was my fault; it's not allocating any noticeable amount. But it turns out I'm not using C++11: when I try the C++11 threads, I get a message saying that's coming in the future. So I'm still concerned about what happens on various compilers, when cross compiling, or when an older version of C++ is used.

To try to be on the safe side (I'm sure I'm not guaranteed anything, but it lets me sleep better), after getting some sleep I was able to set up the pointer with
(const unsigned char*)&(*data)[0]
That works just fine and, fingers crossed, will never allocate any memory across the board.

I would not do that. Ever. Any sane implementation on which [code](const unsigned char*)&(*data)[0][/code] works will not allocate any extra memory because c_str() can already just return the internal buffer. Any implementation which does allocate memory will not have a contiguous memory block for its string and [code](const unsigned char*)&(*data)[0][/code] will just cause horrible problems.
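
A sketch of how the same pointer could be obtained through the standard accessor instead. The checksum function is a made-up stand-in for the encryption routine mentioned earlier, and the size is passed separately so that binary data with embedded zero bytes is handled:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical consumer: reads exactly `size` bytes and never writes.
unsigned checksum(const unsigned char* bytes, std::size_t size) {
    unsigned sum = 0;
    for (std::size_t i = 0; i < size; ++i) {
        sum += bytes[i];
    }
    return sum;
}

unsigned checksum_of(const std::string* data) {
    // data() yields a pointer to size() characters; no &(*data)[0] required.
    const unsigned char* bytes =
        reinterpret_cast<const unsigned char*>(data->data());
    return checksum(bytes, data->size());
}
```

Because the length comes from size(), the consumer does not depend on a null terminator at all.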

As I and others have said, &(*data)[0] is horribly unsafe. Plus, it's an obvious enough operation that if it is a safe operation, you can expect your Standard Library implementors to have implemented c_str() as something like that. Your Standard Library implementors are actually brilliant people, and they won't unnecessarily copy a string if it can be avoided.

If you're so concerned about the performance of c_str(), why not just look at how it's defined and make sure it doesn't create a copy? My gcc implementation defines [font=courier new,courier,monospace]std::basic_string::c_str()[/font] as just [font=courier new,courier,monospace]_M_data()[/font], and [font=courier new,courier,monospace]_M_data()[/font] is defined as [font=courier new,courier,monospace]return _M_dataplus._M_p;[/font], so it absolutely does not create a copy.

Just check your implementation of c_str() if you're paranoid, and only use implementations that don't create a copy. But again, as I've said, it's unlikely you'll find an implementation that creates a copy of the data if you're not using an exotic system/compiler (and if you were, you'd probably already be aware of things like this).

@cornstalks I'm not too concerned anymore, knowing it's not copying, at least on my current system, but I would almost rather it break than start swapping out and become horribly slow. I'm just worried about cross compiling for ARM devices, since I'm sure far less time goes into those implementations than into x86, but I'll cross that bridge if I ever come to it.

@yewbie lol, pretty much. I don't have to worry about null termination and keeping track of the size for binary data. Not that it's difficult, but why do it when strings are cleaner and easier? There is this scenario where I would have more control, but I don't believe I need it now that we've covered everything.

[quote name='jeff8j' timestamp='1348248277' post='4982438']
@cornstalks I'm not too concerned anymore, knowing it's not copying, at least on my current system, but I would almost rather it break than start swapping out and become horribly slow.
[/quote]
The problem is if it doesn't break, at least not at first. It could work for you and your test cases, and then fail horribly for a set of customers. Plus, even if it never breaks, it introduces a potential security flaw.

[quote name='jeff8j' timestamp='1348248277' post='4982438']
I'm just worried about cross compiling for ARM devices, since I'm sure far less time goes into those implementations than into x86, but I'll cross that bridge if I ever come to it.
[/quote]
I would still be very surprised if an ARM implementation had c_str() create and return a copy. It's more work for the programmer who writes the std::basic_string implementation to create a string class that uses multiple allocations instead of one contiguous allocation, and it's even more work to create a copy than to return a pointer to it. So if they're lazy, c_str() is, in my opinion, even more likely to not create a copy.

[quote name='Hodgman' timestamp='1348150523' post='4982032']
Your code should work as long as [font=courier new,courier,monospace]someotherfunction[/font] takes a [font=courier new,courier,monospace]const char*[/font].

If the problem is that it takes a [font=courier new,courier,monospace]char*[/font] (but does promise not to write to the buffer), then you can use the const-correctness hack:[code]someotherfunction( const_cast<char*>(data->c_str()) );//C++ style
someotherfunction( (char*)(data->c_str()) );//or C-style casting[/code]
[/quote]

You beat me to the punch, Hodge. The const keyword promises that the data won't be modified through that pointer unless the const is explicitly cast away.
