const char* not crashing?

This topic is 3100 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

Why doesn't the following code crash?
#include <string>
#include <iostream>

const char* test()
{
  std::string s = "hallo";
  s += "!!";
  return s.c_str();
}

int main()
{
  const char* c = test();
  
  std::cout << c;
 
  std::cin.get(); 
}

At the end of test(), the std::string gets deleted. How is the memory handled so that the const char* keeps pointing to something valid, and when is that deleted? The string isn't one simple constant but two added ones, so it can't just point to the literal strings in the code.

Quote:
Original post by Lode
Why doesn't the following code crash?

*** Source Snippet Removed ***

At the end of test(), the std::string gets deleted. How is the memory handled so that the const char* keeps pointing to something valid, and when is that deleted?

The string isn't one simple constant but two added ones, so it can't just point to the literal strings in the code.
That's undefined behaviour. I'd guess that the only reason it works is that either the memory occupied by the string data isn't filled with gibberish when it's freed (i.e. you're not using the debug CRT), or your STL implementation keeps small strings inside the string object itself, which lives on the stack, and the stack isn't overwritten with gibberish when the function ends.

Quote:
Original post by Lode
Why doesn't the following code crash?

At the end of test(), the std::string gets deleted. How is the memory handled so that the const char* keeps pointing to something valid, and when is that deleted?

The string isn't one simple constant but two added ones, so it can't just point to the literal strings in the code.
std::string::c_str() typically returns a pointer to the string's internal buffer, and even though that buffer will have been freed, the memory won't have been overwritten yet. Given that it is a small string, and it has only just been freed, you probably won't trigger the operating system's memory protection features when you access that memory.

Obviously, you should never do this, and should return the std::string itself instead.
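
A minimal sketch of that advice, assuming the function is free to return by value. The names make_greeting and greeting_length are made up for illustration; the point is only that the pointer from c_str() is used while the owning string is still alive.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical replacement for test(): return the string itself, so the
// caller owns the characters.
std::string make_greeting()
{
  std::string s = "hallo";
  s += "!!";
  return s;  // moved or copy-elided; the buffer outlives the call
}

// The pointer obtained from c_str() is only used while `owner` exists.
std::size_t greeting_length()
{
  const std::string owner = make_greeting();
  const char* c = owner.c_str();  // valid until owner is destroyed or modified
  assert(c[owner.size()] == '\0');
  return std::strlen(c);
}
```

This does not help if the interface really has to hand a const char* across a C boundary; it only shows the ownership rule the original code breaks.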

Well it's for a C-style dll interface, I need to return it as a const char* :(

And apparently other people have also been doing it with code like what I posted, which is what made me wonder about it and post it here in the first place.

What were the developers of C smoking when they decided not to add useful strings to the language and do they realise what they have caused still 30 years later?

Quote:
Original post by Lode
What were the developers of C smoking when they decided not to add useful strings to the language

They didn't smoke anything. They just thought "Hey, let's create a portable assembly language!".

Quote:
Original post by Lode
Well it's for a C-style dll interface, I need to return it as a const char* :(


There's no really good solution for this. The best thing you can do is have the caller give you some buffer where you'll deposit the result. Make sure the caller also specifies the size of the buffer, so you know when to stop if the string is too long.
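
A rough sketch of the caller-supplied buffer approach. The function name get_greeting and its return convention (characters written, truncating when the buffer is too small) are assumptions made for illustration, not anything from the original code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C-style export: copies the text into a buffer the caller owns.
// Returns the number of characters written, not counting the terminator.
extern "C" std::size_t get_greeting(char* buffer, std::size_t buffer_size)
{
  assert(buffer != nullptr && buffer_size > 0);

  std::string s = "hallo";
  s += "!!";

  // Leave room for the terminator and stop early if the text is too long.
  const std::size_t n = std::min(s.size(), buffer_size - 1);
  std::memcpy(buffer, s.data(), n);
  buffer[n] = '\0';
  return n;
}
```

The caller declares something like char text[64] and passes sizeof text; no memory ever changes hands, which is why this variant tends to be the least error-prone.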

Alternatively, you can return a pointer to a malloc()ed block of memory that they can release using free(). In this case, document this fact as loudly as you can, because someone will make a memory leak out of it.

For completeness, you could also return a pointer to some global buffer (similar to making s static in your example). However, this has its own problems.

Quote:
Original post by Lode
What were the developers of C smoking when they decided not to add useful strings to the language and do they realise what they have caused still 30 years later?

What were the developers of FORTRAN smoking? What were the developers of APL smoking? Who smoked so much to actually invent any 2nd generation programming language?

And what were you smoking when you decided to use C where C++ was intended?

Don't take this seriously, it's just a small, benevolent jab; I'm not sure what I smoked to write such BS ;)


C was always about being minimalistic and close to the metal; DevFred's phrase about C being a "portable assembly language" is not far-fetched.

Also, C was always for people "who know what they do", i.e. when they return a pointer-to-char, they know why and what will happen. Pretty much like when you are logged in as root on a Unix box, C shows no mercy for programmer-fail. That analogy was no accident, btw.

One last note: C really does not have strings, it only has arrays of char. And a bit of syntactic sugar to make initialization a bit more comfortable:

#include <stdio.h>
int main () {
  /* An array of char, spelled out element by element, terminator included. */
  char mem[] = {'h','e','l','l','o',',',' ','w','o','r','l','d','\0'};
  char *str = mem;
  /* ... */
  puts (str);
}
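
For comparison, here is a small sketch of the sugar itself: a string literal initializes the same array and appends the terminator automatically. The variable and function names are invented for the example.

```cpp
#include <cassert>
#include <cstring>

// Spelled out element by element, terminator included.
const char spelled_out[] = {'h','e','l','l','o',',',' ','w','o','r','l','d','\0'};

// The same array written with a string literal; the compiler adds the '\0'.
const char with_literal[] = "hello, world";

static_assert(sizeof(spelled_out) == sizeof(with_literal),
              "both arrays have the same number of bytes");

// Checks that the two arrays hold identical bytes.
void demo()
{
  assert(std::memcmp(spelled_out, with_literal, sizeof(spelled_out)) == 0);
  assert(std::strlen(with_literal) == sizeof(with_literal) - 1);
}
```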

Quote:
Original post by Lode
What were the developers of C smoking when they decided not to add useful strings to the language and do they realise what they have caused still 30 years later?


C is a glorified PDP-11 assembler -- the creators, Kernighan and Ritchie, didn't think: they just wanted to get UNIX done.

Quote:
Original post by Lode
What were the developers of C smoking when they decided not to add useful strings to the language and do they realise what they have caused still 30 years later?

I think it's more like "hey, we need some way to handle text in our new language. I have an idea on how we can do so much better than FORTRAN's Hollerith variables....".

Just evaluate their design decisions in the context of what was available in the 1960s.

Quote:
Original post by Bregma
Quote:
Original post by Lode
What were the developers of C smoking when they decided not to add useful strings to the language and do they realise what they have caused still 30 years later?

I think it's more like "hey, we need some way to handle text in our new language. I have an idea on how we can do so much better than FORTRAN's Hollerith variables....".

Just evaluate their design decisions in the context of what was available in the 1960s.


At first glance, those Hollerith variables look like labeled data sections in some assembler. Would that assumption be correct?

Quote:
Original post by alvaro
There's no really good solution for this. The best thing you can do is have the caller give you some buffer where you'll deposit the result. Make sure the caller also specifies the size of the buffer, so you know when to stop if the string is too long.
This would be my preference.

Quote:
Original post by alvaro
Alternatively, you can return a pointer to a malloc()ed block of memory that they can release using free(). In this case, document this fact as loudly as you can, because someone will make a memory leak out of it.
Yeah, strdup() does this. It's quite ugly, and is a great way to leak memory if you're not careful, but it's certainly an option.

Quote:
Original post by alvaro
For completeness, you could also return a pointer to some global buffer (similar to making s static in your example). However, this has its own problems.
That'd be my second choice. The only problem really comes when you have multiple threads or you need to cope with any length of string.
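
One possible way around the threading half of that problem, sketched under the assumption that a C++11 compiler is available: make the storage thread_local, so each thread gets its own buffer, and use a std::string so the length is not fixed. The name last_greeting is hypothetical, and the pointer is still only good until the next call on the same thread.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical export: the returned pointer stays valid until the next call
// made on the SAME thread. Other threads use their own copy of `storage`.
extern "C" const char* last_greeting()
{
  thread_local std::string storage;
  storage = "hallo";
  storage += "!!";
  return storage.c_str();
}
```

This keeps the convenient "just return a pointer" signature, at the price of a lifetime rule the caller has to know about.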

Quote:
Original post by Oxyd
Quote:
Original post by Lode
What were the developers of C smoking when they decided not to add useful strings to the language and do they realise what they have caused still 30 years later?


C is a glorified PDP-11 assembler -- the creators, Kernighan and Ritchie, didn't think: they just wanted to get UNIX done.


They actually had UNIX up and running before C, written solely in assembler. But then they wanted to have a portable Unix. Find here an interesting story.

Plus initially, Unix and C were planned to subvert the Soviet Union.

Quote:
Original post by Lode
Well it's for a C-style dll interface, I need to return it as a const char* :(

And apparently other people have also been doing it with code like what I posted, which is what made me wonder about it and post it here in the first place.

What were the developers of C smoking when they decided not to add useful strings to the language and do they realise what they have caused still 30 years later?


#include <string.h>

there's lots of string functionality there.... get your facts right man

Quote:
Original post by Evil Steve
Quote:
Original post by alvaro
There's no really good solution for this. The best thing you can do is have the caller give you some buffer where you'll deposit the result. Make sure the caller also specifies the size of the buffer, so you know when to stop if the string is too long.
This would be my preference.

Or you could take some influence from MS on this matter, where if the buffer is a null pointer the size is returned for you to create the buffer. In effect it requires calling the function twice.
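
A sketch of that two-call convention. The name query_greeting and the exact return values are assumptions for illustration; real Windows APIs each document their own variation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical export. With buffer == NULL it only reports the required size
// (terminator included). Otherwise it copies the text if it fits and returns
// the number of bytes written, or 0 if the buffer is too small.
extern "C" std::size_t query_greeting(char* buffer, std::size_t buffer_size)
{
  std::string s = "hallo";
  s += "!!";
  const std::size_t needed = s.size() + 1;

  if (buffer == nullptr) return needed;
  if (buffer_size < needed) return 0;
  std::memcpy(buffer, s.c_str(), needed);
  return needed;
}

// Caller side: the first call asks for the size, the second fetches the text.
std::string fetch_greeting()
{
  std::vector<char> buffer(query_greeting(nullptr, 0));
  const std::size_t written = query_greeting(buffer.data(), buffer.size());
  assert(written == buffer.size());
  return std::string(buffer.data());
}
```

The cost is that the text is produced twice, so this fits best when generating it is cheap or the result can be cached between the two calls.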

Quote:

Original post by alvaro
Alternatively, you can return a pointer to a malloc()ed block of memory that they can release using free(). In this case, document this fact as loudly as you can, because someone will make a memory leak out of it.

The advantage of allocating and freeing on the same side of the DLL boundary is that it has fewer potential pitfalls, IMHO.
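
One way to keep allocation and release on the same side is to export a matching release function next to the one that allocates. This is only a sketch; greeting_create, greeting_destroy and use_greeting are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <string>

// Hypothetical export: the library allocates the copy with its own runtime.
// Returns NULL if the allocation fails.
extern "C" const char* greeting_create()
{
  std::string s = "hallo";
  s += "!!";
  char* copy = new (std::nothrow) char[s.size() + 1];
  if (copy != nullptr) {
    std::memcpy(copy, s.c_str(), s.size() + 1);  // includes the '\0'
  }
  return copy;
}

// Hypothetical export: the library also releases it, so the caller never has
// to match the library's allocator.
extern "C" void greeting_destroy(const char* text)
{
  delete[] text;  // deleting a null pointer is a no-op
}

// Caller side: copy the text out, then hand the pointer back to the library.
std::string use_greeting()
{
  const char* text = greeting_create();
  assert(text != nullptr);  // sketch only: real code should handle failure
  std::string result(text);
  greeting_destroy(text);
  return result;
}
```

The caller can still leak by forgetting greeting_destroy, but it can no longer free the block with the wrong runtime.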

Quote:
Original post by Lode

What were the developers of C smoking when they decided not to add useful strings to the language and do they realise what they have caused still 30 years later?


The thing is, when C was invented, it might as well have targeted machines with, I dunno, 1024 BYTES of working memory.

Implementing strings that might waste up to half the string's size in memory would have been considered... obscene.

Things seem silly today, when you just do "Hello" + "World" + "!!", and dozens of string buffers and heap allocations get thrown around in there somewhere, in the nice 64-bit OS with exabytes of memory.

But it wasn't so long ago, that individual bytes really mattered. Overlays and 64k/640k boundary anyone?

Quote:
const char* test()
{
std::string s = "hallo";
s += "!!";
return s.c_str();
}


And since we are talking about "what were they smoking":

<clippy>It looks like you are trying to concatenate two strings. Would you like to use strcat</clippy>.

Quote:
Original post by Evil Steve
Quote:
Original post by alvaro
For completeness, you could also return a pointer to some global buffer (similar to making s static in your example). However, this has its own problems.
That'd be my second choice. The only problem really comes when you have multiple threads or you need to cope with any length of string.

The other problem is that an unsuspecting user might try something like this:
#include <stdio.h>
#include <ctype.h>

char const * capitalize(char const *s) {
  static char buffer[1024];
  int i;
  for(i=0; s[i]!='\0' && i<1023; ++i)
    buffer[i] = toupper((unsigned char)s[i]);
  buffer[i]='\0'; /* terminate the copy; s is const and must not be written to */
  return buffer;
}

int main() {
  /* Prints `HILLARY HILLARY HILLARY' (with my compiler, but there's no guarantee) */
  /* All three calls return the same pointer, so whichever ran last wins. */
  printf("%s %s %s\n",
         capitalize("Hillary"),
         capitalize("Rodham"),
         capitalize("Clinton"));
}



Quote:
Original post by alvaro
Quote:
Original post by Evil Steve
Quote:
Original post by alvaro
For completeness, you could also return a pointer to some global buffer (similar to making s static in your example). However, this has its own problems.
That'd be my second choice. The only problem really comes when you have multiple threads or you need to cope with any length of string.

The other problem is that an unsuspecting user might try something like this:
*** Source Snippet Removed ***
Ah yes - I forgot about that one.

Quote:
Original post by Antheus
Quote:
Original post by Lode

What were the developers of C smoking when they decided not to add useful strings to the language and do they realise what they have caused still 30 years later?


The thing is, when C was invented, it might as well have targeted machines with, I dunno, 1024 BYTES of working memory.


The PDP-11 had a 16-bit address space. A later model allowed for up to 4MiB. The KA10 processor on the PDP-10 already allowed for up to 1152 kilobytes of memory.

And from http://cm.bell-labs.com/cm/cs/who/dmr/hist.html:
Quote:

[note from me: they got the PDP-11 in early 1970]

The first PDP-11 system

Once the disk arrived, the system was quickly completed. In internal structure, the first version of Unix for the PDP-11 represented a relatively minor advance over the PDP-7 system; writing it was largely a matter of transliteration. For example, there was no multi-programming; only one user program was present in core at any moment. On the other hand, there were important changes in the interface to the user: the present directory structure, with full path names, was in place, along with the modern form of exec and wait, and conveniences like character-erase and line-kill processing for terminals. Perhaps the most interesting thing about the enterprise was its small size: there were 24K bytes of core memory (16K for the system, 8K for user programs), and a disk with 1K blocks (512K bytes). Files were limited to 64K bytes.

Quote:
Original post by phresnel

Quote:
Perhaps the most interesting thing about the enterprise was its small size: there were 24K bytes of core memory (16K for the system, 8K for user programs), and a disk with 1K blocks (512K bytes). Files were limited to 64K bytes.


Ok, so I was off by a factor of 8.

Quote:
Original post by Antheus
Quote:
Original post by phresnel

Quote:
Perhaps the most interesting thing about the enterprise was its small size: there were 24K bytes of core memory (16K for the system, 8K for user programs), and a disk with 1K blocks (512K bytes). Files were limited to 64K bytes.


Ok, so I was off by a factor of 8.


Unfortunately, that's not insignificant (heh we are having fun [smile]). For example, I'd really like to have a 32 core processor. Also, I think you can count the whole 24K, or at least 16K, as C was used for userland as well as kernel country (the whole purpose of C was to rewrite Unix in the beginning). So yay, someone give me a 64 core CPU, or 256GiB of RAM. Also, this was only what was installed on the very first PDP-11; as said, addresses were 16 bits long.

Quote:
Original post by Atrix256
Quote:
Original post by Lode
when they decided not to add useful strings to the language


#include <string.h>

there's lots of string functionality there.... get your facts right man


Having "string functionality" is not remotely the same thing as having "useful strings". First off, the functionality has to be applied manually, and the string doesn't properly exist as an object. Second, it's a real pain to use.

Holding up string.h here is sort of like setting up a dog's dinner bowl in the kitchen, having your child fill it daily, and emptying it daily when no one is looking; and then claiming to have bought the child a dog. The functionality is there but the expected agent isn't, and it's not really all that useful.

Quote:
Original post by phresnel
Unfortunately, that's not insignificant (heh we are having fun [smile]).


Yes, it is.

What matters is the "big enough" criterion. 1, 8, 24 or 64 kB is not "big enough".

4GB today however is.

There was that old joke about "copy the internet to the floppy". While at that time the mere notion of being able to carry the internet around with you seemed silly, today it's not. The internet doesn't fit on a floppy, but it fits into a shipping container.

Quote:
For example, I'd really like to have a 32 core processor.


So why not buy one?

Will you suddenly be able to solve bigger problems? Or just be able to solve all of the existing ones up to 32 times faster?

The difference is that the majority of hardware today is "good enough" for all the problems it encounters. Not only that, but netbooks show that computers from 5 years ago were good enough for most purposes.

But before then - they were not. The same rationale applies to some earlier design choices. It wasn't a matter of convenience, but simply of what was practical and possible. The assumption that the entire working data set would fit into local storage was not a realistic consideration.

Looks like an extended stack allocation lifetime issue. As I don't own a copy of the C++ standard, doing a bit of googling reveals GotW #88: A Candidate For the "Most Important const"
Quote:
Herb Sutter
Normally, a temporary object lasts only until the end of the full expression in which it appears. However, C++ deliberately specifies that binding a temporary object to a reference to const on the stack lengthens the lifetime of the temporary to the lifetime of the reference itself, and thus avoids what would otherwise be a common dangling-reference error.

Could it be this? :)

Quote:
Original post by Naurava kulkuri
Could it be this? :)

No. That applies to const references being bound to non-reference return types, not pointers being bound to pointer return types.
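
To make the distinction concrete, here is a small sketch with invented names (make_text, extended_lifetime). The rule from the GotW article covers the first case only; the original test() is the second kind.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns by value, so each call yields a temporary std::string.
std::string make_text()
{
  std::string s = "hallo";
  s += "!!";
  return s;
}

std::size_t extended_lifetime()
{
  // Binding the temporary to a const reference keeps it alive for as long
  // as `ref` is in scope, so this is well defined.
  const std::string& ref = make_text();
  return ref.size();
}

// By contrast, nothing is extended here: the temporary dies at the end of
// the full expression, so `p` would already dangle on the next line.
//   const char* p = make_text().c_str();
```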
