Array and pointer question

19 comments, last by xeddiex 19 years, 4 months ago
No no no, this isn't an issue of accessing unallocated memory. Oluseyi already said it:

Quote:
The first example sets a pointer to your program's string table, which is arbitrarily stored (ie, you can't make any guarantees as to how or where they're stored; we usually expect it to be hard-coded at the top of the executable).


The string is in the executable, along with the rest of the application's code (which is in the form of x86 machine code). On Windows, the compiler typically places string literals in a read-only section of the executable image, and that section is mapped into memory without write permission, so writing to it triggers an access violation (which is not an unreasonable rule).

And as was stated, it's a platform-specific thing. I've worked on platforms where you are allowed to modify string literals. (It's still a bad thing to do, though).
After careful reading I came to the conclusion that...
You cannot change char *str = "dog" (str[1] = 'x';) because the string is a "static" string literal.

Correct?
one..
Yes! That is correct.

char *str = "dog"; /* perfectly legal */
str[0] = 'c'; /* compiles, but undefined behavior */

I am sure this is where your program crashes: the literal that str points to should be treated as if str were declared const char *str. The only way you can change the contents is by copying them into a new "array" first.
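
A minimal sketch of that copy-then-modify approach in C. The helper name copy_and_patch and its parameters are made up for illustration; the literal is only ever read, and the write goes to the caller's own array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src (terminator included) into dst,
   then changes the first character of the copy.
   Returns 0 on success, -1 if src is empty or does not fit. */
int copy_and_patch(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t len = strlen(src);
    if (len == 0 || len + 1 > dst_size)
        return -1;

    memcpy(dst, src, len + 1);  /* the literal itself is only read */
    dst[0] = 'c';               /* writing to our own array is fine */
    return 0;
}
```

Called with a char buf[4] and "dog", this leaves "cog" in buf. Declaring pointers to literals as const char * in the first place lets the compiler reject str[0] = 'c' instead of leaving it to fail at run time.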

That is it!
The whole array vs. pointer thing is confusing, and made worse because many tutorials say they're the same thing. My apologies since this has all been covered, but somewhat confusingly and piecemeal. Here's my take. Given:

char *string1 = "Doggie";
char string2[7] = "Doggie";
char string3[] = "Doggie";
char string4[6] = "Doggie";
char string5[10] = "Doggie";

Now, string1 is a pointer to a string literal. I believe string literals are "const char *" in C++ and "char *" in C, but the behavior of modifying a string literal is undefined in both cases. What does this mean? Anything can happen when you try to modify it. In your case, this meant it crashed the program (it was in read only memory). In other cases, it might work fine. In yet other cases, it might cause other problems, like when you modify "Doggie" it also modifies "My Doggie" used elsewhere in the program (overlapping to save space). (Strictly speaking, the literal itself is an array, char[7] in C and const char[7] in C++, which decays to those pointer types.) Do a search for "DeathStation 9000" in comp.lang.c for an amusing look at undefined behavior (including nasal demons). Note, sizeof(string1) is likely 4 (that is, a pointer to char takes 4 bytes on a typical 32-bit platform; on a 64-bit platform it is usually 8).

string2 is an array of 7 char's. This is equivalent to writing:
char string2[7] = {'D','o','g','g','i','e','\0'};
Note that initializing this way DOES include the terminating '\0'. sizeof(string2) is 7.

string3 is also an array of 7 char's. The length of the array is determined by the length of the initializer, which is 7 in this case (6 letters + 1 '\0'). This is equivalent to string2, but if you change the string you don't have to worry about changing the size of the array. sizeof(string3) is 7.

string4 is an array of 6 char's. This is equivalent to writing:
char string4[6] = {'D','o','g','g','i','e'};
Note that there is no terminating '\0', so this is not, technically, a string and will cause problems if you pass it to functions expecting strings. This exact-fit form is legal in C only; C++ rejects it because there is no room for the terminator. Making string4 an array of fewer than 6 char (while still initializing it with "Doggie") is not legal in either language: the initializer is too long, and a conforming compiler must issue a diagnostic (some only warn and then truncate the string). sizeof(string4) is 6.
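
If you do end up with an unterminated array like string4, one way to handle it safely is to always pass its capacity along. A small sketch in C; bounded_length is a hypothetical helper, not a standard function:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of chars before the first '\0',
   never looking past the first cap bytes of buf. */
size_t bounded_length(const char *buf, size_t cap)
{
    assert(buf != NULL);

    /* memchr stops after cap bytes, so an unterminated array is safe here */
    const char *end = memchr(buf, '\0', cap);
    return end ? (size_t)(end - buf) : cap;
}
```

For string4 this returns 6 without reading past the array, whereas strlen(string4) would run off the end. For printing, printf("%.*s", (int)sizeof string4, string4) limits the output to the array in the same way.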

string5 is an array of 10 char's. string5[6] holds the terminating '\0', and string5[7], string5[8], and string5[9] are initialized to 0 as well (which is the same as '\0'): when an initializer is shorter than the array, the remaining elements are zero-initialized. In any case, they are dead weight unless you intend to use them (string functions won't get past the first '\0' in the array). sizeof(string5) is 10.
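
The state of the unused tail of string5 is easy to check for yourself. A small sketch in C; the function name is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many elements after the six letters of "Doggie" are zero. */
size_t count_zero_tail(void)
{
    char string5[10] = "Doggie";
    size_t zeros = 0;

    assert(sizeof string5 == 10);

    /* index 6 is the literal's own terminator; 7..9 are the padding */
    for (size_t i = 6; i < sizeof string5; ++i) {
        if (string5[i] == '\0')
            ++zeros;
    }
    return zeros;
}
```

All four trailing elements count as zero: the terminator plus the three padding elements.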

If you really want to understand the whole pointer vs. array thing, understand why sizeof(stringN) is what it is for each N. Keep in mind that sizeof(char) is defined to be 1. If we're talking int's instead of char's, sizeof(string1) will likely stay the same (that is, it will be 4), but all the others will have to be multiplied by sizeof(int) (which is likely equal to 4).
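
To see those sizes for yourself, here is a small sketch in C. The struct and function names are made up for illustration, and the exact-fit string4 line compiles as C but not as C++:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the five sizeof results. */
struct doggie_sizes {
    size_t s1, s2, s3, s4, s5;
};

struct doggie_sizes measure_doggies(void)
{
    const char *string1 = "Doggie";  /* pointer to the literal */
    char string2[7]  = "Doggie";     /* array, room for the '\0' */
    char string3[]   = "Doggie";     /* size deduced: 6 letters + '\0' */
    char string4[6]  = "Doggie";     /* C only: no room for the '\0' */
    char string5[10] = "Doggie";     /* rest of the array is zeroed */

    assert(sizeof string3 == sizeof string2);

    struct doggie_sizes r = {
        sizeof string1, sizeof string2, sizeof string3,
        sizeof string4, sizeof string5
    };
    return r;
}
```

Only s1 depends on the platform, since it is the size of a pointer; the others are 7, 7, 6 and 10 everywhere because sizeof(char) is 1.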
If it's like you say
Quote: Way Walker
"I believe string literals are "const char *" in C++ and "char *" in C, but the behavior of modifying a string literal is undefined in both cases. What does this mean? Anything can happen when you try to modify it. In your case, this meant it crashed the program (it was in read only memory)."


...then, char array[] = "doggie" would be a const string literal too!? If so, then why am I able to change a value of a element of a const string ie. array[3] = 'x'; (string is now, dogxie) /*NO CRASH*/ But can't do that with "char *str = "doggie" then, str[3] = 'x'; /*CRASH*/

I've read all of this: http://www.gamedev.net/reference/articles/article1697.asp and either it doesn't explain it or I'm plain dumb. God this is so messed up cause it's very confusing.

BTW Way Walker, I get the whole other stuff you wrote after the quote I quoted above.

p.s: I hope it's what dimensionX said.
one..
Quote:Original post by xeddiex
If it's like you say
Quote: Way Walker
"I believe string literals are "const char *" in C++ and "char *" in C, but the behavior of modifying a string literal is undefined in both cases. What does this mean? Anything can happen when you try to modify it. In your case, this meant it crashed the program (it was in read only memory)."


...then, char array[] = "doggie" would be a const string literal too!? If so, then why am I able to change a value of a element of a const string ie. array[3] = 'x'; (string is now, dogxie) /*NO CRASH*/ But can't do that with "char *str = "doggie" then, str[3] = 'x'; /*CRASH*/


In dimensionX's example, you have str with type "char *" which is pointing to a string literal which you should think of as having type "const char *". Modifying something that is const qualified usually produces undefined behavior, even if it's through a non-const pointer.

And I don't think you understood what I was saying, so my apologies for not being clear.
char myStr[] = "doggie";
is the same as doing
char myStr[7] = "doggie";
which means that myStr is an array of 7 elements in both cases. Doing
myStr[3] = 'x';
will change myStr to "dogxie" in both cases. However,
char *myStr = "doggie"; myStr[3] = 'x';
produces undefined behavior because you're modifying the string literal (in your case, this means it crashes).
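
A short sketch of why the array version is safe, in C (the function name and parameters are made up for illustration): each array initialized from a literal is its own writable copy, so changing one does not touch the literal or any other array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Modifies one array initialized from "doggie" and copies the result
   into out (which must have room for 7 chars). Returns 1 if a second
   array initialized from the same literal was left untouched. */
int patch_array_copy(char *out, size_t out_size)
{
    char myStr[] = "doggie";   /* 7 chars, stored in the array itself */
    char other[] = "doggie";   /* a separate copy */

    assert(out != NULL && out_size >= sizeof myStr);

    myStr[3] = 'x';            /* fine: we own this memory */
    memcpy(out, myStr, sizeof myStr);

    return strcmp(other, "doggie") == 0;
}
```

After the call, out holds "dogxie" and the function returns 1.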

Hmm... perhaps what's confusing you is the difference between [] when declaring a variable and [] in a prototype list? Consider:
void function1(char  arg1[])  {}
void function2(char  arg2[7]) {}
void function3(char *arg3)    {}

int main(void) {
  char  var1[]  = "doggie";
  char  var2[7] = "doggie";
  char *var3    = "doggie";
  return 0;
}

EDIT: arg1 is a pointer to char, arg2 is a pointer to char, arg3 is a pointer to char.
var1 is an array of 7 char, var2 is an array of 7 char, var3 is a pointer to char.
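
One way to convince yourself is to ask sizeof inside each function. A sketch in C with made-up function names; compilers such as GCC and Clang typically warn about sizeof on an array parameter, which is exactly the point being made here.

```c
#include <assert.h>
#include <stddef.h>

/* All three parameters are pointers to char, whatever the brackets say. */
size_t size_seen_by_function1(char arg1[])
{
    assert(arg1 != NULL);
    return sizeof arg1;    /* size of a pointer */
}

size_t size_seen_by_function2(char arg2[7])
{
    assert(arg2 != NULL);
    return sizeof arg2;    /* size of a pointer; the 7 is ignored */
}

size_t size_seen_by_function3(char *arg3)
{
    assert(arg3 != NULL);
    return sizeof arg3;    /* size of a pointer */
}
```

Passing var1 (where sizeof var1 is 7 in main) to any of these returns sizeof(char *) instead.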

[Edited by - Way Walker on December 2, 2004 2:09:29 PM]
As numerous people have already pointed out

a) char name[] = "dog";
b) char *name = "dog";

Are fundamentally different. I won't go into the semantics..

But the short explanation is...

a) guarantees that access to the array will be in read/write memory.

b) guarantees nothing: the string could end up in read-only memory. You may *sometimes* be able to modify the array; other times you will get some form of access violation.


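b) On x86 I think [probably wrong here] that the pointed-to array ends up in a data section of the executable along with all other static string constants. Which section depends on the compiler and object format rather than on the CPU; it is commonly a read-only one, such as .rodata with GCC on ELF or .rdata with MSVC. ** If someone can put me straight on this I would be grateful **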

As for (a) all i can say for certain is that the array will reside in some piece of memory that you can modify.

** AFAIK the best explanation of this and other such topics is by far the C Programming FAQ by Steve Summit



Quote:Original post by Way Walker
arg1 is a pointer to char, arg2 is an array of 7 char, arg3 is a pointer to char.
var1 is an array of 7 char, var2 is an array of 7 char, var3 is a pointer to char.

Actually, in your example, arg2 is a pointer to char as well. The 7 is meaningless.
Quote:Original post by Polymorphic OOP
Quote:Original post by Way Walker
arg1 is a pointer to char, arg2 is an array of 7 char, arg3 is a pointer to char.
var1 is an array of 7 char, var2 is an array of 7 char, var3 is a pointer to char.

Actually, in your example, arg2 is a pointer to char as well. The 7 is meaningless.


Yes yes, many apologies. I was thinking of
void function2(char arg2[][10]);
or
void function2(char (*arg2)[10]);
which are the solutions to another post from someone having trouble passing a multidimensional array to a function.
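
For the multidimensional case, a small sketch in C of a function taking a pointer to rows of 10 char. The name total_length and the row count parameter are made up for illustration; only the first dimension decays, so the row width 10 is part of the parameter's type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* rows points at the first of count rows, each exactly 10 chars wide
   and holding a terminated string. Returns the sum of their lengths. */
size_t total_length(char (*rows)[10], size_t count)
{
    assert(rows != NULL);

    size_t total = 0;
    for (size_t i = 0; i < count; ++i)
        total += strlen(rows[i]);   /* rows[i] is a char[10] */
    return total;
}
```

A char names[2][10] = { "dog", "doggie" } can be passed directly as total_length(names, 2); a char ** would not work here, because names is not an array of pointers.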
EDIT: Yeah, reference to array as well, but I'm used to C, so I don't tend to think of references. Anyway, if you're using C++, why aren't you using a vector or some other container?

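Also, C99 has VLA's
void function2(size_t n, char arg2[n]);
but arg2 is still adjusted to a pointer to char, so sizeof arg2 == sizeof(char *), not n. The n only documents the length the function expects.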
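
A quick check of what sizeof reports for a parameter declared this way, as a sketch in C. It assumes a compiler with VLA support (required in C99, optional from C11 on), and the function name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* arg2 is declared with a variable length, but as a parameter it is
   adjusted to char *, so sizeof reports the size of a pointer. */
size_t size_seen_through_vla_param(size_t n, char arg2[n])
{
    assert(arg2 != NULL && n > 0);
    return sizeof arg2;
}
```

Passing a 32-byte buffer with n set to 32 still returns sizeof(char *), so the length has to travel in n itself.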
char *name = "dog";

That allocates 4 positions, not 3 (so index 0 - 3 is defined): the compiler adds the terminating \0 to every string literal by default. The crash comes from writing into the literal through the pointer, not from a missing terminator.

Hope that helps.
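
For what it's worth, the size of the literal is easy to check. A sketch in C with made-up function names: sizeof on the literal counts the terminator, strlen does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes occupied by the literal "dog", terminator included. */
size_t literal_bytes(void)
{
    return sizeof "dog";
}

/* The character stored right after the three letters. */
char literal_last_char(void)
{
    const char *name = "dog";

    assert(strlen(name) == 3);
    return name[3];   /* reading index 3 is fine: it is the '\0' */
}
```

literal_bytes() gives 4 and literal_last_char() gives '\0', so the terminator is there without anyone assigning it.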

