Memory Allocate new

monkeyboi · 2012-07-15T13:06:40

char* ta = new char[4];
int size = strlen(ta);
char* tb = new char[4];

Set a breakpoint and monitor the memory addresses and contents:

+ ta 0x02059318 "????????" char *
+ tb 0x02059358 "????????" char *
  size 16 int

That is not what I was expecting. I only allocated space for 4 chars (4 bytes) for ta to point to, but it appears to have 8 or more bytes. The interval between ta and tb is 40 in hexadecimal, which is 64 in decimal. How can I calculate this in bytes? And why do I get 16 from strlen? I am using a 64-bit system.

If you don't mind, please explain it step by step.

Thanks in advance,
Jerry
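On the "how do I calculate this in bytes" part: addresses count bytes, so subtracting the two values shown in the debugger gives the distance directly. A minimal sketch using the addresses from the watch window above (the helper name gap_in_bytes is made up for illustration):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: distance in bytes between two addresses.
constexpr std::uintptr_t gap_in_bytes(std::uintptr_t lower, std::uintptr_t higher) {
    return higher - lower;
}

// Addresses copied from the debugger output in the question.
constexpr std::uintptr_t ta_addr = 0x02059318;
constexpr std::uintptr_t tb_addr = 0x02059358;

// 0x58 - 0x18 = 0x40, and 0x40 = 4 * 16 = 64 bytes.
static_assert(gap_in_bytes(ta_addr, tb_addr) == 0x40, "gap in hexadecimal");
static_assert(gap_in_bytes(ta_addr, tb_addr) == 64, "gap in decimal");
```

Why the gap is 64 bytes rather than 4 is up to the allocator (rounding, bookkeeping, and padding in debug builds are typical reasons). The C++ standard does not specify it, so it should not be relied on.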

For Beginners

Started by monkeyboi July 12, 2012 02:45 PM

22 comments, last by monkeyboi 11 years, 9 months ago

monkeyboi

188

Author

July 12, 2012 05:18 PM

Just because it's outside of the array range doesn't mean it's outside of a memory block allocated to your program as a whole.

I remember the program automatically checking whether an index is out of bounds, and if it is, there is an assertion or something. I also tried:

char tc[4] = "123";

//char tc[4] = "1234"; // failed

tc[5] = 0; // why does this line work??

arkane7

213

July 12, 2012 05:37 PM


I think this is only because, at the declaration, the compiler KNOWS the array has only 4 bytes to use, and "1234" actually takes 5 bytes because of the zero byte at the end. I'm guessing the compiler basically says "Oh, I see only 4 bytes, not 5: error".

But when you write tc[5], you are just indexing off an address, and the compiler does not check the size of the memory it has 'access' to. In that case it just says "Oh, put this byte at this memory address". It doesn't check to see if that is legal.

Always improve, never quit.

rnlf_in_space

1,918

July 12, 2012 05:39 PM

Sorry, you are mistaken. In C++, there are no checked array accesses. std::vector has a member function "at" which does just that. Did you learn Java before? It has checked array accesses.

Also



char tc[4] = "1234";

does not work because the string literal has an implicit null byte at the end; it is equivalent to



char tc[4] = { '1', '2', '3', '4', '\0' };

This is an error that the compiler can detect. But later on, detection of out-of-bounds accesses is hard (and in most cases impossible) for the compiler.

Also array indices for a 4-element array range from 0 to 3. tc[4] = 0; is already an error, even though it will only sporadically be detected by crashing the program.
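To illustrate the checked access mentioned above, here is a small sketch built on std::vector::at, which throws std::out_of_range for a bad index. The function name access_is_rejected is only an illustration, not something from this thread:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if reading index i through at() was rejected by the bounds check.
bool access_is_rejected(const std::vector<char>& v, std::size_t i) {
    try {
        (void)v.at(i);   // at() compares i against v.size() before reading
        return false;    // index was in range
    } catch (const std::out_of_range&) {
        return true;     // i >= v.size()
    }
}
```

With a four-element vector holding '1', '2', '3', '\0', index 3 is accepted while 4 and 5 are rejected. Plain operator[] on the same vector performs no such check.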

EDIT: arkane7 was quicker

Ameise

1,148

July 12, 2012 05:41 PM

If you need space for strings, consider using std::string instead.

Knowing and using std::* classes is not a replacement for actually understanding how the language works, and in particular how memory allocation works, particularly with a relatively low-level language such as C++. When he has a good understanding of the fundamentals, a basic understanding of the std::* classes and knowledge that he should use them should come along with it.

To the OP:

The reason that strlen is not returning the size of your array is simply due to the fact that that is not the purpose of strlen.

strlen returns the length of a C String, that is, a null-terminated character array.

You are allocating 4 bytes of memory. However, that memory is coming out of the heap, and it is likely that there is "valid" memory after it (valid in that it won't crash, but undefined as per C++).

Since it is not initialized, it just has random junk. You are seeing strlen return 16. This is why:

| YOUR ARRAY |
[rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][0]

rnd is any random non-zero value.

strlen has no knowledge of your array or its size, and only has knowledge of the pointer and that it must seek the null terminator. The fact that you have consistently gotten 16 is coincidental.

It could very well return this:

| YOUR ARRAY |
[rnd][rnd][0][rnd]

In which case, strlen would return 2, because it counts only the non-zero bytes before the first 0.

The reason it could CRASH is that it could have allocated your memory at the edge of a page allocation, for instance. When you go outside the buffer, you are now reading unmapped memory, which is a page fault. There are other reasons too, simply put - don't read outside of memory you are aware of.
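As a rough sketch of the well-defined version of the OP's experiment: if the buffer is zero-initialized and a terminated string is copied in, strlen reports the string length, never the allocation size. The helper name length_after_copy is an assumption made for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: allocates `capacity` zeroed chars, copies `text` in,
// and reports what strlen sees. Requires strlen(text) < capacity.
std::size_t length_after_copy(const char* text, std::size_t capacity) {
    assert(std::strlen(text) < capacity);
    char* buf = new char[capacity]();            // () zero-initializes every byte
    std::memcpy(buf, text, std::strlen(text));   // terminator is already in place
    std::size_t len = std::strlen(buf);          // counts bytes before the first 0
    delete[] buf;
    return len;
}
```

For a capacity of 4 this gives 0 for "", 2 for "ab", and 3 for "abc". The capacity itself has to be tracked separately, since strlen cannot recover it.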

monkeyboi

188

Author

July 12, 2012 07:24 PM

Ah, I see. Thanks a lot, everyone.

Knowing and using std::* classes is not a replacement for actually understanding how the language works, and in particular how memory allocation works, particularly with a relatively low-level language such as C++. When he has a good understanding of the fundamentals, a basic understanding of the std::* classes and knowledge that he should use them should come along with it.

thumb up lol

However, that memory is coming out of the heap, and it is likely that there is "valid" memory after it (valid in that it won't crash, but undefined as per C++).

So in fact my array is not in the heap?

arkane7

213

July 12, 2012 08:27 PM

Any time you allocate with "new" or malloc (C-style), the memory comes from the heap, in contrast to the stack, where local variables (for example, int f = 3) live.

So indeed your array IS on the heap, since you used new. What I think Ameise was getting at is that on the heap there is less chance of hitting important data such as return addresses (the place to go back to after a function call), which are stored on the stack. (If you overwrite a return address, it may very well cause a page fault and crash the program.)

Edit: the point is that you never want to go out of the bounds of any array or touch invalid memory. Even if you only overwrite harmless data, you risk corrupting your own running program (and, on systems without memory protection, the operating system or other programs). This is why std::string is a much better alternative to C-style strings (char[]): you don't deal with pointers directly. It also has its own size() function, as well as a multitude of manipulation functions. All in all, std::string is more flexible, useful, and easier.
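A tiny sketch of the std::string point, assuming nothing beyond the standard library (the helper name append_text is made up for the example):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: appends a suffix and returns the resulting string.
std::string append_text(std::string base, const std::string& suffix) {
    base += suffix;   // the string grows its own storage as needed
    return base;
}
```

Appending "4" to "123" yields a string whose size() is 4, with no buffer size to get wrong. The terminating zero byte is still there behind c_str(), but the string manages it for you.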

Always improve, never quit.

Ameise

1,148

July 12, 2012 08:32 PM


My other point still stands in that regard, though: he shouldn't be using std::string until he understands C strings. As for the standard library, I don't usually recommend using things until you understand how they work.

arkane7

213

July 12, 2012 08:51 PM

I first came across std::string before I learned about char arrays. This was way back in HS in intro programming. But I feel you are right: he needs to understand pointers, allocation, and buffer limitations.

Key things that monkeyboi needs to know about C-style strings (aka char arrays):

1. There must be a zero byte (null byte) that marks the end of the string. '\0' is the character literal for it, but 0 is also fine. strlen counts the characters up to, but not including, this zero byte. This is why he got odd answers: the char array was never initialized. When you write char tc[4] = "123"; you are really storing '1', '2', '3', '\0', which fills the array. Appending the zero byte is a feature of string literals in C++ (and C), so it only happens when you write var = "___". If you fill the array byte by byte (tc[0] = 'b'), you MUST put the zero byte in manually (tc[3] = '\0').
2. Even if it has a zero byte, never access or overwrite an element outside the array's bounds (never use tc[-2], tc[n], or any larger index; only 0 to n-1).
3. Keep in mind that the compiler will generally not warn you if you go out of bounds, since it does not check indices against the size of the array. Only at the declaration, when the initializer is too big, will it tell you something (char tc[4] = "1234"; was too big since '\0' is a fifth byte added on).
4. When you use new, the memory comes from the heap, a region of your program's memory set aside for dynamic (run-time) allocation.
5. Always remember that an array such as char[] converts to a pointer to its first element when you use its name in most expressions. In the case above, *tc is '1', *(tc+1) is '2', *(tc+2) is '3', and *(tc+3) is '\0'. Writing tc[2] is just a "shorter" way of writing *(tc+2).
6. char* and char[] are in many ways interchangeable; char[] explicitly announces its array property, while char* could point to a single char or to the beginning of a char array.
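The indexing and pointer points in the list above can be checked with a short sketch. The function names are invented for illustration, and the array is the same char tc[4] = "123" used in the thread:

```cpp
#include <cassert>
#include <cstddef>

// The three spellings below all read the same element.
bool same_element(const char* p, std::size_t i) {
    return p[i] == *(p + i) && *(p + i) == i[p];
}

// One place where an array and a pointer differ: sizeof on the array
// itself gives the size of the whole array, not of a pointer.
std::size_t array_bytes() {
    char tc[4] = "123";   // stores '1', '2', '3', '\0'
    return sizeof(tc);    // 4
}
```

Passing tc to same_element converts the array to a pointer to its first element, which is why the function only ever sees a const char* and has no way to learn the array's size.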

Edit: I put this in a post below, but it needs to be mentioned along with the above.
----Other things about memory management:

When using new, as stated before, the memory comes from the heap. If you keep allocating without deallocating, you will eventually run out of memory. Continually allocating memory but never freeing it is called a memory leak.
To deallocate, you use the delete keyword.
Say I write SomeClass* x = new SomeClass(); to deallocate it, just use delete x; (keep in mind this calls the destructor for SomeClass and frees the memory that x was pointing to, allowing later allocations to reuse that memory).
But if you allocate an array, you have to do something else: in the case of char* ca = new char[3]; you have to use delete [] ca;
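A minimal sketch of the new/delete pairing described above. SomeClass here is a stand-in with an instance counter added purely so the effect of the destructor calls is visible; the counter and the function peak_alive are assumptions of this example:

```cpp
#include <cassert>

// Stand-in class that counts how many instances are currently alive.
struct SomeClass {
    static int alive;
    SomeClass()  { ++alive; }
    ~SomeClass() { --alive; }
};
int SomeClass::alive = 0;

// Allocates and frees one object and one array, returning how many
// objects were alive at the peak.
int peak_alive() {
    SomeClass* x  = new SomeClass();    // single object
    SomeClass* xs = new SomeClass[3];   // array of three objects
    int peak = SomeClass::alive;        // 1 + 3
    delete x;                           // pairs with new
    delete[] xs;                        // pairs with new[]
    return peak;
}
```

After peak_alive() returns, the counter is back to zero because every destructor ran. Mixing the forms (delete on an array, or delete[] on a single object) is undefined behavior.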

Always improve, never quit.

Ameise

1,148

July 12, 2012 09:08 PM

I'm wondering why I was downvoted without any comments explaining what was wrong with what I wrote


Regarding 4, only if he uses the standard non-placement new operator. If it is overloaded, everything's up in the air. Placement new just uses whatever memory you're pointing to.

Regarding 5, 2[tc] is also '3'

I absolutely hate that syntax, though.
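For anyone unfamiliar with placement new, here is a rough sketch of what "uses whatever memory you're pointing to" means. The function name make_int_in is hypothetical, and the caller is assumed to provide storage that is large enough and correctly aligned:

```cpp
#include <cassert>
#include <new>   // declares the placement form of operator new

// Constructs an int inside caller-provided storage instead of the heap.
// `storage` must be suitably sized and aligned for an int.
int* make_int_in(void* storage, int value) {
    return new (storage) int(value);   // no allocation happens here
}
```

Called with a local buffer declared as alignas(int) unsigned char buf[sizeof(int)], the returned pointer refers to buf itself, so the int lives on the stack even though a new expression created it. Nothing is deleted afterwards, since nothing was allocated.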

arkane7

213

July 12, 2012 09:20 PM

Why were you downvoted??? You were spot on in everything you mentioned.

Oh, and yeah, I completely forgot about the 2[var] syntax. I never use it since it's confusing.

What do you mean by overloading new? I'm not sure I've come across that use of new.

Also, monkeyboi, other things about memory management:

When using new, as stated before, the memory comes from the heap. If you keep allocating without deallocating, you will eventually run out of memory. Continually allocating memory but never freeing it is called a memory leak.
To deallocate, you use the delete keyword.
Say I write SomeClass* x = new SomeClass(); to deallocate it, just use delete x; (keep in mind this calls the destructor for SomeClass and frees the memory that x was pointing to, allowing later allocations to reuse that memory).
But if you allocate an array, you have to do something else: in the case of char* ca = new char[3]; you have to use delete [] ca;

Always improve, never quit.
