Memory Allocate new

Started by
22 comments, last by monkeyboi 11 years, 9 months ago

Just because it's outside of the array range doesn't mean it's outside of a memory block allocated to your program as a whole.

I remember that the program automatically checks whether an index is out of bounds, and if it is, there will be an assertion or something. I also tried:
char tc[4] = "123";
// char tc[4] = "1234"; // failed
tc[5] = 0; // why does this line work?


I think this is only because, at the declaration, the compiler KNOWS the array has just 4 bytes to use, and "1234" actually needs 5 bytes because of the zero byte at the end. I'm guessing the compiler basically says "Oh, I see only 4 bytes, not 5: error".

But when you just write tc[5], you are effectively dereferencing a pointer, and the compiler does not check the size of the memory it has 'access' to. In that case it just says "Oh, put this byte at this memory address". It doesn't check whether that is legal.
Always improve, never quit.
Sorry, you are mistaken. In C++, array accesses are not bounds-checked. std::vector has a member function "at" which performs exactly that kind of checked access. Did you learn Java before? Java has checked array accesses.
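To make the checked-access point concrete, here is a minimal sketch. The helper names at_throws and checked_access_demo are made up for this illustration; the only library behaviour relied on is that std::vector::at throws std::out_of_range for an invalid index.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports whether v.at(index) throws std::out_of_range.
bool at_throws(const std::vector<char>& v, std::size_t index) {
    try {
        (void)v.at(index);   // checked access: index is validated against v.size()
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Hypothetical demo: index 3 is valid for 4 elements, index 5 is not.
void checked_access_demo() {
    std::vector<char> v{'1', '2', '3', '\0'};
    assert(!at_throws(v, 3));
    assert(at_throws(v, 5));
}
```

Note that v[5] on the same vector would not throw; like a raw array access, it is simply undefined behaviour.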

Also

char tc[4] = "1234";

does not work, because the literal has an implicit null byte at the end. It is equivalent to

char tc[4] = { '1', '2', '3', '4', '\0' };


This is an error that the compiler can detect. Later on, though, detecting out-of-bounds accesses is hard (and in most cases impossible) for the compiler.

Also, array indices for a 4-element array range from 0 to 3. tc[4] = 0; is already an error, even though it will only sporadically be detected by the program crashing.
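A small sketch of the byte counts involved, in case it helps. The function names are hypothetical; sizeof and std::strlen are the only real facilities used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Bytes occupied by the literal "1234", including the implicit '\0'.
std::size_t literal_bytes() {
    return sizeof("1234");
}

// Leaving the bound out lets the compiler size the array from the literal.
std::size_t deduced_array_bytes() {
    char tc[] = "1234";          // deduced as char[5]
    return sizeof(tc);
}

// "123" plus its terminator fits a 4-element array exactly.
std::size_t length_of_fitting_string() {
    char tc[4] = "123";          // '1', '2', '3', '\0' at indices 0..3
    return std::strlen(tc);
}

void literal_size_demo() {
    assert(literal_bytes() == 5);
    assert(deduced_array_bytes() == 5);
    assert(length_of_fitting_string() == 3);
}
```

Writing char tc[] = "1234"; instead of char tc[4] = "1234"; is one way to avoid the off-by-one in the first place.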

EDIT: arkane7 was quicker.

If you need space for strings, consider using std::string instead.


Knowing and using std::* classes is not a replacement for actually understanding how the language works, and in particular how memory allocation works, particularly with a relatively low-level language such as C++. When he has a good understanding of the fundamentals, a basic understanding of the std::* classes and knowledge that he should use them should come along with it.

To the OP:

The reason strlen is not returning the size of your array is simply that this is not what strlen is for.

strlen returns the length of a C string, that is, a null-terminated character array.

You are allocating 4 bytes of memory. However, that memory is coming out of the heap, and it is likely that there is "valid" memory after it (valid in that it won't crash, but undefined as per C++).

Since it is not initialized, it just has random junk. You are seeing strlen return 16. This is why:

| YOUR ARRAY |
[rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][0]

rnd is any random non-zero value.

strlen has no knowledge of your array or its size, and only has knowledge of the pointer and that it must seek the null terminator. The fact that you have consistently gotten 16 is coincidental.

The memory could just as well look like this:

| YOUR ARRAY |
[rnd][rnd][0][rnd]

In which case, strlen would return 2.
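The second diagram can be reproduced without reading uninitialized memory. This is only a sketch with made-up function names; it fills the buffer explicitly so the result is well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// strlen counts the bytes before the first '\0', wherever that is.
std::size_t length_with_early_zero() {
    char buf[4] = {'x', 'y', '\0', 'z'};   // terminator at index 2
    return std::strlen(buf);               // scanning stops at index 2
}

// "new char[4]()" zero-fills the allocation, so strlen has a terminator to find.
std::size_t length_of_zeroed_allocation() {
    char* p = new char[4]();
    std::size_t n = std::strlen(p);        // the very first byte is '\0'
    delete[] p;
    return n;
}

void strlen_demo() {
    assert(length_with_early_zero() == 2);
    assert(length_of_zeroed_allocation() == 0);
}
```

Neither result is 4: strlen never sees the array size, only the bytes it walks over.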

The reason it could CRASH is that your memory could have been allocated at the edge of a page allocation, for instance. When you go outside the buffer, you are then reading unmapped memory, which is a page fault. There are other reasons too. Simply put: don't read outside of memory you know you own.
Ah, I see. Thanks a lot, everyone.

Knowing and using std::* classes is not a replacement for actually understanding how the language works, and in particular how memory allocation works, particularly with a relatively low-level language such as C++. When he has a good understanding of the fundamentals, a basic understanding of the std::* classes and knowledge that he should use them should come along with it.
thumb up lol

However, that memory is coming out of the heap, and it is likely that there is "valid" memory after it (valid in that it won't crash, but undefined as per C++).

So in fact my array is not in the heap?
Any time you use "new" or malloc (C-style) to allocate, the memory goes onto the heap, in contrast to the stack, where local variables (for example int f = 3) live.

So indeed your array IS on the heap, since you used new. What I think Ameise was getting at is that, since it is on the heap, there is less chance of hitting important data such as return addresses (the place to go back to after a function call), which are stored on the stack. If you overwrite a return address, it may very well cause a page fault and crash the program.

Edit: the point is that you never want to go out of the bounds of an array or touch invalid memory. Even if you overwrite harmless data, you risk corrupting your own program's state (and, on systems without memory protection, the operating system or other programs). This is why std::string is a much better alternative to C-style strings (char[]): you don't deal with pointers directly. It also has its own size() function, as well as a multitude of manipulation functions. All in all, std::string is more flexible, more useful, and easier.
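For comparison with the char[] case, a minimal std::string sketch. The function names are hypothetical; size(), operator+= and c_str() are standard members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// The string tracks its own length and grows its own storage.
std::size_t size_after_append() {
    std::string s = "123";
    s += "4";                 // no manual buffer sizing needed
    return s.size();
}

// c_str() still yields a null-terminated view for C-style APIs.
bool c_str_length_matches_size() {
    std::string s = "1234";
    return std::strlen(s.c_str()) == s.size();
}

void string_demo() {
    assert(size_after_append() == 4);
    assert(c_str_length_matches_size());
}
```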

My other point still stands in that regard, though: he shouldn't be using std::string until he understands C strings. As for the standard library in general, I don't usually recommend using things until you understand how they work.
I first came across std::string before I learned about char arrays. This was way back in HS in intro programming. But I feel you are right: he needs to understand pointers, allocation, and buffer limitations.

Key things that monkeyboi needs to know about C-style strings (aka char arrays):

  1. There must be a zero byte (null byte) that marks the end of the string. '\0' is the character literal for it, but 0 is also fine. strlen counts the number of characters up until it finds this zero byte. This is why he got odd answers: the contents of the char[] were never initialized. When you write char tc[4] = "123"; you are really storing '1', '2', '3', '\0', filling the array up. Appending the zero byte is a feature of string literals in C++ (and C, I believe), so it only happens when you write var = "___". If you fill the array byte by byte (var[0] = 'b'), you MUST put the zero byte in manually (var[1] = '\0').
  2. Even if it has a zero byte, never access or overwrite an element outside of the array's bounds (don't ever use tc[-2] or tc[n] or any larger index; only 0 through n-1).
  3. Keep in mind that the compiler is not required to warn you if you go out of bounds, and usually won't. Only at the declaration, when the initializer is too big, will it reliably tell you something (char tc[4] = "1234" was too big, since '\0' is a fifth byte added on).
  4. When you use new, the memory comes from the heap, a region of your program's memory reserved for dynamic (run-time) allocation.
  5. Always remember that a char[] behaves like a pointer in most expressions, as do all arrays: tc converts to a pointer to the first element of the array. In the case above, *tc would be '1', *(tc+1) is '2', *(tc+2) is '3' and *(tc+3) is '\0'. Writing tc[2] is just a "shorter" way of writing *(tc+2).
  6. char* and char[] are in many ways equivalent; char[] explicitly announces that it is an array, while a char* could point to a single char or to the beginning of a char array.
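Points 5 and 6 above can be sketched roughly like this. The function names are invented for the example, and everything stays inside the array's bounds.

```cpp
#include <cassert>
#include <cstddef>

// Indexing is defined in terms of pointer arithmetic: tc[i] is *(tc + i).
bool index_matches_pointer_arithmetic() {
    char tc[4] = "123";
    const char* p = tc;                  // the array converts to a pointer to tc[0]
    return *tc == '1'
        && tc[2] == *(tc + 2)
        && p[2] == '3'
        && *(p + 3) == '\0';
}

// Unlike a char*, the array itself still carries its size in its type.
std::size_t array_size_in_bytes() {
    char tc[4] = "123";
    return sizeof(tc);                   // size of the whole array, not of a pointer
}

void array_pointer_demo() {
    assert(index_matches_pointer_arithmetic());
    assert(array_size_in_bytes() == 4);
}
```

The sizeof difference is the main place where "an array is a pointer" stops being true; once the array has been passed to a function as char*, that size information is gone.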

I'm wondering why I was downvoted without any comments explaining what was wrong with what I wrote.



Regarding 4: that is only true if he uses the standard, non-placement new operator. If operator new is overloaded, everything's up in the air. Placement new just constructs the object in whatever memory you point it at.
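A rough sketch of both cases, for anyone who has not run into them. The type Tracked and the function names are hypothetical; the class-level operator new here simply counts allocations and forwards to the global allocator, which is just one of many things an overload might do.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical type with a class-level operator new that counts live objects.
struct Tracked {
    static int live;
    int value = 7;

    static void* operator new(std::size_t size) {
        ++live;
        return ::operator new(size);     // forward to the global allocator
    }
    static void operator delete(void* p) noexcept {
        --live;
        ::operator delete(p);
    }
};
int Tracked::live = 0;

// "new Tracked" goes through Tracked::operator new instead of the default.
bool overloaded_new_is_used() {
    Tracked* t = new Tracked;
    bool counted = (Tracked::live == 1);
    delete t;                            // goes through Tracked::operator delete
    return counted && Tracked::live == 0;
}

// Placement new constructs into storage the caller supplies; nothing is allocated.
int placement_new_value() {
    alignas(int) unsigned char storage[sizeof(int)];
    int* p = new (storage) int(42);      // placement form declared in <new>
    return *p;                           // storage is a local array, no delete needed
}

void operator_new_demo() {
    assert(overloaded_new_is_used());
    assert(placement_new_value() == 42);
}
```

In placement_new_value the object lives in a local buffer, so in that case "new" has nothing to do with the heap at all.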

Regarding 5, 2[tc] is also '3'. I absolutely hate that syntax, though.
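The reversed form works because i[a] and a[i] both mean *(a + i). A tiny sketch, with a made-up function name:

```cpp
#include <cassert>

// 2[tc] and tc[2] name the same element, since both expand to *(tc + 2).
bool reversed_index_matches() {
    char tc[4] = "123";
    return 2[tc] == tc[2] && 2[tc] == '3';
}

void reversed_index_demo() {
    assert(reversed_index_matches());
}
```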
Why were you downvoted??? You were spot on in everything you mentioned.

Oh, and yeah, I completely forgot about the 2[var] syntax. I never use it since it's confusing.

What do you mean by overloading new? I'm not sure I've come across that use of new.



Also, monkeyboi, some other things about memory management:

  • When using new, as stated before, the memory comes from the heap. If you keep allocating without deallocating, you will eventually run out of memory. Continually allocating memory but never freeing it is called a memory leak.
  • To deallocate, you use the delete keyword.
    Say I write SomeClass* x = new SomeClass(); to deallocate it, just use delete x; (keep in mind this calls the destructor for SomeClass and frees the memory that x was pointing to, allowing later allocations to use that memory).
    But if you allocate an array, you have to do something else: in the case of char* ca = new char[3]; you will have to use delete [] ca;
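Here is a small sketch of the two delete forms. Counted is a hypothetical type that tracks how many instances are alive, so the effect of each delete is visible.

```cpp
#include <cassert>

// Hypothetical type that counts how many instances currently exist.
struct Counted {
    static int alive;
    Counted() { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

// new pairs with delete: one constructor call, one destructor call.
bool single_delete_runs_destructor() {
    Counted* x = new Counted();
    bool constructed = (Counted::alive == 1);
    delete x;
    return constructed && Counted::alive == 0;
}

// new[] pairs with delete[]: every element is destroyed before the memory is freed.
bool array_delete_runs_all_destructors() {
    Counted* xs = new Counted[3];
    bool constructed = (Counted::alive == 3);
    delete[] xs;
    return constructed && Counted::alive == 0;
}

void delete_demo() {
    assert(single_delete_runs_destructor());
    assert(array_delete_runs_all_destructors());
}
```

Mixing the forms (new[] with plain delete, or new with delete[]) is undefined behaviour, so the sketch deliberately does not show it.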

This topic is closed to new replies.
