szecs

String literals: how, when, and where do they expire (if at all)?

Once a string literal is defined/declared (what is the correct term?), does it last for as long as the program runs, can it be overwritten 'randomly', or does its expiration follow some kind of rule? Can this be done to 'collect' strings:
/*global*/ char *strings[100];
...
void Some_func(......)
{
...
    strings[i] = "blabla";
...
}
void Some_other_func(......)
{
...
    strings[k] = "blablabla";
...
}

void print()
{
    for(int i = 0; i < 100; i++)
        printfunc(strings[i]);
}
?

From the C++ standard, 2.13.4/1:
Quote:
An ordinary string literal has type "array of n const char" and static storage duration

Objects with static storage duration last for the entire length of the program's execution.

Typically they are implemented by storing the character data in a read-only part of the executable's image.
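
A minimal sketch of what that means in practice, assuming a hypothetical helper `greeting()` that is not from the original post: the pointer obtained from a literal inside a function can still be read after the function has returned.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns a pointer to a string literal.
// The literal has static storage duration, so the pointer stays
// valid after the function returns.
const char* greeting()
{
    return "blabla";
}

// Calls greeting() and reads the characters after the call has returned.
std::size_t greeting_length()
{
    const char* s = greeting();
    assert(s != nullptr);
    return std::strlen(s);
}
```

Nothing is copied or allocated here; `greeting()` only hands out the address of data that exists for the whole run of the program.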

Quote:
Original post by mattd
Typically they are implemented by storing the character data in a read-only part of the executable's image.


Related to that:
Quote:
Original post by szecs

/*global*/ char *strings[100];
...
strings[i] = "blabla";
That's deprecated in C++03 and ill-formed as of C++11, so your compiler should at least give you a warning. String literals are const, and assigning them to a non-const pointer is A Bad Thing: writing through such a pointer is undefined behavior.

(In before const char* vs char* const, etc [smile])
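
For reference, a small sketch of the two const placements mentioned above. The names `message`, `fixed_message`, and `repoint` are made up for illustration.

```cpp
#include <cassert>
#include <cstring>

// Pointer to const char: the characters cannot be modified through it,
// but the pointer itself can be re-pointed at another literal.
const char* message = "blabla";

// Const pointer to const char: neither the characters nor the pointer
// can be changed after initialization.
const char* const fixed_message = "blablabla";

// Re-points `message`; this is fine because only the pointer changes.
void repoint()
{
    message = fixed_message;
    assert(message == fixed_message);
    // message[0] = 'B';        // would not compile: characters are const
    // fixed_message = message; // would not compile: pointer is const
}
```

For an array that collects literals, the first form (`const char*` elements) is the one that matches the question, since the slots have to be assignable.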

Yes, I forgot to copy the const in front of that.

So is
...
const char *strings[100];
....

OK, or is it inelegant, dangerous, etc.?

So literals will remain in memory even if I declare/define them inside a function?

Quote:
Original post by szecs
So literals will remain in memory even if I declared/defined them inside a function?

Yes.

Quote:
Original post by mattd
From the C++ standard, 2.13.4/1:
Quote:
An ordinary string literal has type "array of n const char" and static storage duration

Objects with static storage duration last for the entire length of the program's execution.

Typically they are implemented by storing the character data in a read-only part of the executable's image.



I see. I always wondered how it would execute something like: char* bleh = "hello";, because of course, there's no 'new' operator! You're saying it essentially does, char* bleh = new char[]{'h', 'e', 'l', 'l', 'o'}; and simply deletes the memory used up at the end?

Quote:
Original post by Zotoaster
Quote:
Original post by mattd
From the C++ standard, 2.13.4/1:
Quote:
An ordinary string literal has type "array of n const char" and static storage duration

Objects with static storage duration last for the entire length of the program's execution.

Typically they are implemented by storing the character data in a read-only part of the executable's image.



I see. I always wondered how it would execute something like: char* bleh = "hello";, because of course, there's no 'new' operator! You're saying it essentially does, char* bleh = new char[]{'h', 'e', 'l', 'l', 'o'}; and simply deletes the memory used up at the end?

No, that would imply that the character array is dynamically allocated on the heap, which it isn't. It's stored in the image of the executable, as in you could load up the EXE file in a hex editor and find your string (in some form or another). It's just loaded directly with the program.
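
To make the contrast concrete, here is a rough sketch with a hypothetical `heap_copy` helper (not something from this thread). The literal is never allocated or freed by the program; only the explicit heap copy is.

```cpp
#include <cassert>
#include <cstring>

// Copies a string into heap memory; the caller owns the result and
// must release it with delete[].
char* heap_copy(const char* text)
{
    char* copy = new char[std::strlen(text) + 1];
    std::strcpy(copy, text);
    return copy;
}

// Contrasts the two: the literal needs no cleanup, the heap copy does.
bool compare_literal_and_copy()
{
    const char* literal = "hello";      // static storage, never deleted
    char* copy = heap_copy(literal);    // dynamic storage
    copy[0] = 'H';                      // the copy is writable
    bool differs = std::strcmp(literal, copy) != 0;
    delete[] copy;                      // only the heap copy is released
    return differs;
}
```

Calling `delete[]` on the pointer to the literal itself would be an error, since that memory never came from `new`.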

A program doesn't generally need to know whether a pointer points to heap memory or static memory. A memory manager might need to know that, but that would be by the programmer's design.

On 32-bit Windows each process has an address space 4 GB in size, because a 32-bit variable can distinguish 2^32 byte addresses (2^32 bytes == 4 GB). The address space is virtual: a process isn't assigned 4 GB of hardware memory, and most of that space is empty.

When a program is launched, the operating system creates a process and the address space for that process, and then maps the image of the program into that address space. The image of the program is the exe file; that's where the static variables are stored. The image is mapped rather than loaded, because the exe file isn't simply copied as-is from the hard drive to memory: sections of the exe file are placed at specific addresses in the address space. This is a rough sketch, of course.
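
The 2^32 arithmetic can be checked at compile time. This is only an illustration; `address_space_bytes` is a made-up name.

```cpp
#include <cassert>
#include <cstdint>

// Number of distinct byte addresses expressible with `bits` address bits.
// Valid for bits < 64.
constexpr std::uint64_t address_space_bytes(unsigned bits)
{
    return std::uint64_t{1} << bits;
}

// 2^32 bytes is 4294967296 bytes, which is 4 * 1024^3.
static_assert(address_space_bytes(32) == 4294967296ULL, "2^32 bytes");
static_assert(address_space_bytes(32) / (1024ULL * 1024 * 1024) == 4, "4 GB");
```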

Quote:
Original post by m_switch
Is there really a difference between


const char* const foo = "bar";


and


static const char* const foo = "bar";


then?


I don't know about C++, but I presume it's similar to C99, where at global scope the latter has internal linkage whereas the former has external linkage. (Internal linkage means the declaration is only linked with uses in the translation unit where it occurs. External linkage means uses in different translation units refer to the same variable.)

Sorry, had problems with sluggish connection delaying my edits...

Inside a function, I don't think there's any semantic difference (since your non-static pointer was const). Maybe there's a difference in that if you take the address of the static variable, its value is still valid outside the function? Can someone clarify whether static variables in a function work like globals in that regard? I realized there's a hole in my knowledge, and I'm too lazy for language spec digging at 3 AM... :p

Quote:
Original post by Beyond_Repair
Maybe there's a difference in that if you take the address of the static variable, its value is still valid outside the function?

Right.
Quote:
Can someone clarify if static variables in a function work as globals in that regard?

Both globals and static local variables have static storage duration. So they are valid throughout the lifetime of the program, that is, until you return from main or call std::exit.
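
A short sketch of the static local case, using a hypothetical `next_count()` function: the address handed out by one call is still the same live object on the next call.

```cpp
#include <cassert>

// Hypothetical counter: `count` is a static local, so it is initialized
// once and keeps its value (and its address) between calls.
int* next_count()
{
    static int count = 0;
    ++count;
    return &count;   // valid: `count` outlives the call
}

// Calls the function twice and checks that both calls saw the same object.
bool same_object_across_calls()
{
    int* first = next_count();
    int* second = next_count();
    return first == second && *second >= 2;
}
```

Returning `&count` would be a bug if `count` were an ordinary (automatic) local, because that object would be gone once the function returned.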

Quote:
Original post by m_switch
Is there really a difference between


const char* const foo = "bar";


and


static const char* const foo = "bar";


then?


For globals, const variables default to external linkage in C and internal linkage in C++. This means there would be no difference in C++ but, in C, the former could be accessed from any translation unit while the latter is restricted to the current translation unit.

For variables local to a function, let's say the next line is a return statement. Local variables without static storage duration won't exist once the function returns, so it's only valid to return a pointer to a local variable if that variable has static storage duration.

Let's say we're going to "return foo;". Here are some cases for how foo could be defined:
#define foo "bar" // Case 1
const char* const foo = "bar"; // Case 2
static const char* const foo = "bar"; // Case 3
const char foo[] = "bar"; // Case 4
static const char foo[] = "bar"; // Case 5
Case 4 is the only one that doesn't work. Case 1 returns a pointer that came from a string literal; string literals have static storage duration so this is ok. Cases 2 and 3 also return pointers that came from a string literal so they're also ok. In case 5, the array has static storage duration so the pointer that's returned will be ok. In case 4, the array will cease to exist after the function returns so the pointer will no longer be valid once the function exits.

If we were going to "return &foo;", cases 1, 3, and 5 give a pointer that stays valid, although the type differs. In case 3 it's a pointer to a pointer (const char* const*). In cases 1 and 5 it's a pointer to an array (const char (*)[4]), because string literals and arrays are lvalues that do have addresses, and both have static storage duration here. In cases 2 and 4, foo won't exist once the function exits, so the pointer won't be valid outside of the function.
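
As a rough sketch, here are some of the cases above written as complete functions. The function names are invented for illustration, and case 4 is left out on purpose because it would return a dangling pointer.

```cpp
#include <cassert>
#include <cstring>

// Case 2: a local pointer to a literal; returning its value is fine
// because the literal itself has static storage duration.
const char* case2()
{
    const char* const foo = "bar";
    return foo;
}

// Case 5: a static local array; it decays to a pointer to storage
// that outlives the call.
const char* case5()
{
    static const char foo[] = "bar";
    return foo;
}

// Case 3 with "return &foo;": the pointer variable is static, so its
// address stays valid after the call.
const char* const* case3_address()
{
    static const char* const foo = "bar";
    return &foo;
}

// Case 4 is deliberately omitted: returning a non-static local array
// would hand back a dangling pointer.
```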

Thanks for your replies, but I'm still confused a bit:

Can I use the following safely?
const char *Strings[MAX_STRINGS]; // global variable
int index = 0; // global
...
void some_func(const char *input)
{
...
    Strings[index++] = input;
...
}
void other_func()
{
...
    some_func("blabla");
...
}
...
int main()
{
...
    some_func("some_string");
...
    other_func();
...
    print_all(Strings);
...
}


Assuming that index < MAX_STRINGS always holds, of course.

Yes. Put more simply, the pointer you can get from a string literal will point to valid character data for the string for the length of the program.
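
Here is one way the collection from the question might look when filled in, as a sketch only. The capacity of 100 and the bounds check are assumptions, and the counter is named `string_count` rather than `index` because some platforms declare a legacy function called `index` in their C string headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t MAX_STRINGS = 100;     // assumed capacity
const char* Strings[MAX_STRINGS];        // zero-initialized (static storage)
std::size_t string_count = 0;            // stands in for `index`

// Stores the pointer only; no characters are copied. Returns false
// when the table is full instead of writing out of bounds.
bool some_func(const char* input)
{
    if (string_count >= MAX_STRINGS)
        return false;
    Strings[string_count++] = input;
    return true;
}

void other_func()
{
    some_func("blabla");
}
```

This is only safe as long as every pointer passed in refers to a literal or to some other object with static storage duration. Passing the address of a local char buffer would leave a dangling pointer in the table once that function returns.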

Quote:
Original post by Konfusius
You can cast away the constness of the static variable and assign it another value.


Casting away constness and then modifying an object that was originally defined as const has undefined behavior, whatever its storage duration.
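
A minimal sketch of where the line falls, with made-up names: the cast is harmless when the underlying object was never const, and the problematic write is left commented out.

```cpp
#include <cassert>

// The object is NOT defined const; only the access path is const.
int counter = 1;
const int* read_only_view = &counter;

// Defined const: modifying this through any cast is undefined behavior.
const int fixed = 1;

// Legal: the cast only removes constness from the pointer type, and the
// object it points to was never const to begin with.
void bump_through_cast()
{
    int* writable = const_cast<int*>(read_only_view);
    *writable += 1;
    assert(fixed == 1);
    // *const_cast<int*>(&fixed) = 2;  // compiles, but undefined behavior
}
```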
