Memory Allocate new


[CODE]
char* ta = new char[4];
int size = strlen(ta);
char* tb = new char[4];
[/CODE]
I set a breakpoint and watched the memory addresses and contents.

+ ta 0x02059318 "????????" char *


+ tb 0x02059358 "????????" char *


size 16 int

That is not what I expected. I only allocated space for 4 chars (4 bytes) for ta to point to, but it seems to hold 8 or more bytes. The interval between ta and tb is 40 in hexadecimal, which is 64 in decimal. How do I work that out in bytes? And why did I get 16 from strlen? I am using a 64-bit system.

If you don't mind please explain it step by step.

Thanks in advance

Jerry

When you allocate memory it is (usually) filled with random garbage. Accordingly, calling strlen on a freshly allocated, uninitialized block is undefined behavior. It might return any value. It might crash.

When allocating memory there is also the need to store memory management information somewhere. Where and how much is implementation defined, but placing it in front of the allocated block is one way to do it. That aside, there is no guarantee that two successive calls to new will return memory that is anywhere close to each other. Even assuming new usually allocates memory in a linear fashion (by no means guaranteed), the runtime library may already have performed allocations and deallocations right after program startup, so the first new could be served by reusing a hole.

Edit: In summary, strlen is not a viable method to check the length of a memory block. It does something completely different: it returns the length of a C string. Details of memory management are impossible to answer without talking about a specific compiler and build settings.
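To make the difference between "size of the block" and "length of the string" concrete, here is a minimal sketch. The names CharBuffer, make_buffer, text_length and release are made up for illustration and are not from any library; the only point is that the capacity has to be remembered by you, because neither new[] nor strlen will report it later.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: a buffer that remembers its own capacity.
struct CharBuffer {
    char*       data;
    std::size_t capacity;  // number of bytes requested from new[]
};

CharBuffer make_buffer(std::size_t capacity) {
    // The trailing () value-initializes the array, so every byte starts as 0.
    return CharBuffer{ new char[capacity](), capacity };
}

std::size_t text_length(const CharBuffer& b) {
    assert(b.data != nullptr);
    // Safe only because make_buffer zero-filled the block and the caller
    // keeps at least one terminating 0 inside it.
    return std::strlen(b.data);
}

void release(CharBuffer& b) {
    delete[] b.data;   // new[] must be paired with delete[]
    b.data = nullptr;
    b.capacity = 0;
}
```

With a 4-byte buffer, capacity stays 4 the whole time, while text_length starts at 0 and becomes 3 after copying "abc" (plus its terminator) into it.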

Share on other sites
Your array variable is a char*, not strictly 4 chars. Pointers on a 64-bit system are 8 bytes wide and 8-byte aligned (according to [url="http://en.wikipedia.org/wiki/Data_structure_alignment"]wikipedia[/url]).

Share on other sites
[quote name='BitMaster' timestamp='1342105138' post='4958424']
When you allocate memory it is (usually) filled with random garbage
[/quote]
Ok, but shouldn't the garbage be confined to the space that was just allocated? In the example I posted earlier (ta 0x02059318 "????????" char *) there are 8 random characters in ta, but I only allocated 4 chars using new char[4].[quote name='BCullis' timestamp='1342105331' post='4958425']
Your array variable is a char*, not strictly 4 chars.
[/quote]
I know that on a 64-bit system the char* pointer itself is 8 bytes. I want to use it to point to a block of 4 chars.

Share on other sites
[quote name='BitMaster' timestamp='1342105138' post='4958424']
there is no guarantee two successive news will allocate memory that is anywhere close to each other.
[/quote]
Hm, this is quite right. But I have tested it several times and every result shows an interval of 40 in hexadecimal. I know this does not guarantee anything, but it should still tell us something.

Share on other sites
It was pure coincidence that strlen returned 16. In this case it could have returned zero or a million, or even crashed your program. strlen reads memory starting at the address you pass as its argument until it finds the first null byte, and returns the number of bytes before it. You allocated ta and put nothing in. strlen does not know how many characters you allocated. It's up to you to make sure you never use more memory than you allocated. You see 8 random characters (more likely 16 random chars, matching what strlen returned, with most of them unprintable) because the first null byte in your memory is found after 8 (or 16) bytes.

If you write a null byte to the address pointed to by ta, strlen will return 0, and the debugger will also display an empty string. Try it:

[CODE]
char* ta = new char[4];
ta[0] = 0;
int size = strlen(ta);
[/CODE]
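To see why strlen cannot know anything about your allocation, it may help to look at roughly what it does. The function below is only an illustrative sketch (naive_strlen is a made-up name, and real library versions are heavily optimized), but the logic is the same: walk forward from the pointer until a zero byte turns up.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative only: counts bytes until the first zero byte.
// It receives nothing but an address, so it has no idea where
// the allocation behind that address ends.
std::size_t naive_strlen(const char* s) {
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != '\0') {  // keeps walking until it meets a zero byte
        ++n;
    }
    return n;
}
```

If the caller never stored a zero byte inside the block, the loop simply runs off the end of it, which is exactly the undefined behavior discussed above.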

Share on other sites
[quote name='rnlf' timestamp='1342107235' post='4958442']
If you write a null byte to the address pointed to by ta, strlen will return 0, and you the debugger will as well display an empty string.
[/quote]
OK, now I know strlen is very unreliable. I then tried code like this:
[CODE]
char* ta = new char[4];
ta[5] = 0;
int size = strlen(ta);
[/CODE]
ta[5] is definitely beyond the original bounds, but it still works and gives a result like

+ ta 0x021b9318 "???" char *

Share on other sites
Just because it's outside of the array range doesn't mean it's outside of a memory block allocated to your program as a whole. You've probably overwritten memory that was allocated to your program by the OS.

Share on other sites
strlen is not unreliable, it's just not meant for what you are trying to use it for.

But just as boogyman says, try to use standard library classes as much as possible.

Avoid "new" like the devil, and if you know you need it, read up on smart pointers first.
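As a rough idea of what the smart pointer route looks like for a char buffer, here is a small sketch. It assumes a C++14 or later compiler (for std::make_unique), and make_zeroed is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Returns an owning buffer of `size` chars, all set to 0.
// The array form of make_unique value-initializes its elements.
std::unique_ptr<char[]> make_zeroed(std::size_t size) {
    assert(size > 0);
    return std::make_unique<char[]>(size);
}

void demo_unique_buffer() {
    std::unique_ptr<char[]> ta = make_zeroed(4);
    assert(std::strlen(ta.get()) == 0);  // first byte is 0, so the string is empty
    ta[0] = 'h';
    ta[1] = 'i';                         // ta[2] and ta[3] are still 0
    assert(std::strlen(ta.get()) == 2);
}   // delete[] happens automatically here
```

The buffer is still only 4 bytes and still has no bounds checking, but it starts in a known state and cannot be leaked by forgetting delete[].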

Share on other sites
[quote name='boogyman19946' timestamp='1342110453' post='4958460']
Just because it's outside of the array range doesn't mean it's outside of a memory block allocated to your program as a whole.
[/quote]
I remember the program automatically checking whether an access is out of bounds, and raising an assertion or something if it is. I also tried [CODE]
char tc[4] = "123";
// char tc[4] = "1234"; // this one failed to compile
tc[5] = 0; // why does this line work??
[/CODE]

Share on other sites


I think this is only because at the declaration the compiler KNOWS the array has only 4 bytes to use, and "1234" actually needs 5 bytes because of the zero byte at the end. I'm guessing the compiler basically says "Oh, I see only 4 bytes, not 5: error".

But when you write tc[5], you are effectively dereferencing a pointer, and the compiler does not check that index against the size of the memory it has 'access' to. In that case it just says "Oh, put this byte at this memory address". It doesn't check whether that is legal.

Share on other sites
Sorry, you are mistaken. In C++, built-in array accesses are not checked. std::vector has a member function "at" that does perform the check. Did you learn Java before? It has checked array accesses.

Also
[code]
char tc[4] = "1234";
[/code]
does not work, because the string literal has an implicit null byte at the end; it is equivalent to
[code]
char tc[4] = { '1', '2', '3', '4', '\0' };
[/code]

This is an error the compiler can detect. But later on, detection of out-of-bounds accesses is hard (and in most cases impossible) for the compiler.

Also, array indices for a 4-element array range from 0 to 3. tc[4] = 0; is already an error, even though it will only sporadically be detected by crashing the program.

EDIT: arkane7 was quicker [img]http://public.gamedev.net//public/style_emoticons/default/dry.png[/img]
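For comparison, here is a small sketch of the checked access that std::vector offers through at(). The helper name checked_write is made up for this example; at() itself throws std::out_of_range when the index is not smaller than size().

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if the write was accepted, false if at() rejected the index.
bool checked_write(std::vector<char>& v, std::size_t i, char value) {
    try {
        v.at(i) = value;   // throws std::out_of_range when i >= v.size()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

void demo_checked_access() {
    std::vector<char> tc{ '1', '2', '3', '\0' };
    bool inside  = checked_write(tc, 3, 'x');  // last valid index
    bool outside = checked_write(tc, 5, 0);    // same index as tc[5] = 0 above
    assert(inside);
    assert(!outside);
}
```

Note that operator[] on a vector is unchecked just like a raw array, so the check only happens when you ask for it with at().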

Share on other sites
[quote name='larspensjo' timestamp='1342108078' post='4958446']
If you need space for strings, consider using std::string instead.
[/quote]

Knowing and using std::* classes is [i]not[/i] a replacement for actually understanding how the language works, and in particular how memory allocation works, particularly with a relatively low-level language such as C++. When he has a good understanding of the fundamentals, a basic understanding of the std::* classes and knowledge that he should use them should come along with it.

To the OP:

The reason that [i]strlen [/i]is not returning the size of your array is simply due to the fact that that is not the purpose of [i]strlen[/i].

[i]strlen [/i]returns the length of a C String, that is, a null-terminated character array.

You are allocating 4 bytes of memory. However, that memory is coming out of the heap, and it is likely that there is "valid" memory after it (valid in that it won't crash, but undefined as per C++).

Since it is not initialized, it just has random junk. You are seeing strlen return 16. This is why:

| YOUR ARRAY |
[rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][rnd][0]

rnd is any random non-zero value.

strlen has no knowledge of your array or its size, and only has knowledge of the pointer and that it must seek the null terminator. The fact that you have consistently gotten 16 is coincidental.

It could very well return this:

| YOUR ARRAY |
[rnd][rnd][0][rnd]

In which case, strlen would return 2, since only two non-zero bytes come before the null.

The reason it could CRASH is that it could have allocated your memory at the edge of a page allocation, for instance. When you go outside the buffer, you are now reading unmapped memory, which is a page fault. There are other reasons too, simply put - don't read outside of memory you are aware of.
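The two layouts can be reproduced safely by building them on purpose inside memory we own, so nothing is read out of bounds. This is only a sketch; the function names are made up and 'x' stands in for "any non-zero junk".

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// First layout: 16 non-zero bytes followed by a zero byte.
std::size_t length_with_late_terminator() {
    char block[17];
    std::memset(block, 'x', 16);
    block[16] = '\0';
    return std::strlen(block);   // counts the 16 bytes before the 0
}

// Second layout: the third byte already happens to be zero.
std::size_t length_with_early_terminator() {
    char block[4] = { 'x', 'x', '\0', 'x' };
    return std::strlen(block);   // stops at index 2
}

void demo_terminator_position() {
    assert(length_with_late_terminator() == 16);
    assert(length_with_early_terminator() == 2);
}
```

In both cases the result depends only on where the first zero byte sits, never on how many bytes were requested.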

Share on other sites
Ah, I see. Thanks a lot, everyone.[quote name='Ameise' timestamp='1342114884' post='4958486']
Knowing and using std::* classes is not a replacement for actually understanding how the language works, and in particular how memory allocation works, particularly with a relatively low-level language such as C++. When he has a good understanding of the fundamentals, a basic understanding of the std::* classes and knowledge that he should use them should come along with it.
[/quote] thumb up lol
[quote name='Ameise' timestamp='1342114884' post='4958486']
However, that memory is coming out of the heap, and it is likely that there is "valid" memory after it (valid in that it won't crash, but undefined as per C++).
[/quote]
So in fact my array is not in the heap?

Share on other sites
Any time you use the "new" or malloc (C-style) allocation methods, the memory goes onto the heap, in contrast to putting it on the stack (any local variables, for example, like int f = 3).

So indeed your array IS on the heap, since you used new. What I think Ameise was getting at is that on the heap there is less chance of hitting important data such as return addresses (the place to go back to after a function call), which are stored on the stack. If you overwrite a return address it may very well cause a page fault and crash the program.

Edit: the point is that you do not want to go out of bounds of any array or touch invalid memory. Even if you only overwrite harmless data, you risk corrupting your own running program (and, on systems without memory protection, the operating system or other programs too). This is why std::string is a much better alternative to C-style strings (char[]): you don't deal with pointers directly. It also has its own size() function, as well as a multitude of manipulation functions. All in all std::string is more flexible, useful, and easier.
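A short sketch of what that looks like in practice follows. The helper with_suffix is a made-up name; size() and at() are the real std::string members.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends a suffix; std::string grows its own storage as needed.
std::string with_suffix(const std::string& base, const std::string& suffix) {
    return base + suffix;
}

void demo_string_size() {
    std::string s = "123";
    assert(s.size() == 3);       // the length is stored, not searched for
    s = with_suffix(s, "4567");  // no manual reallocation, no overflow
    assert(s.size() == 7);
    assert(s.at(0) == '1');      // at() is bounds-checked and throws on a bad index
}
```

There is no terminator to forget and no fixed 4-byte limit to overrun, which is why it is the usual recommendation once the underlying mechanics are understood.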

Share on other sites

My other point still stands in that regard, though: he shouldn't be using std::string until he understands C strings. With regard to the standard library, I don't usually recommend using things until you understand how they work.

Share on other sites
I first came across std::string before I learned about char arrays. This was way back in HS in intro programming. But I feel you are right: he needs to understand pointers, allocation, and buffer limitations.

Key things that monkeyboi needs to know about C-style strings (aka char arrays):
[list=1]
[*]There must be a zero byte/null byte inside the array that marks the end of the string. '\0' is the character literal for it, but 0 is also fine. strlen counts the characters up until it finds this zero byte. This is why he got odd answers: the contents of the char[] were [b]uninitialized[/b]. When you write char tc[4] = "123"; you are really storing '1', '2', '3', '\0', filling the array up. Appending the zero byte is a feature of string literals in C++ (and C), so it only happens when you write var = "___". If you fill the array byte by byte (tc[2] = 'b') you MUST write the zero byte yourself (tc[3] = '\0').
[*]Even if it has a zero byte,[b] never [/b]access/overwrite an element outside of its bounds (don't ever use tc[-2] or tc[n] or any larger index, only 0 to n-1).
[*]Keep in mind the compiler will generally not warn you if you go out of bounds, since it does not check indices against the size of the array. Only at the declaration, when the initializer is too big, will it tell you something (char tc[4] = "1234" was too big since '\0' is a fifth byte added on).
[*]When you use new, the memory comes from the heap (also called the free store), a region of main memory set aside for dynamic, i.e. run-time, allocation.
[*]Always remember that an array such as char[] converts ("decays") to a pointer to its first element in most expressions. tc then acts as the pointer to the first element of the array, so in the case above *tc would be '1', *(tc+1) is '2', *(tc+2) is '3' and *(tc+3) is '\0'. Writing tc[2] is just a shorter way of writing *(tc+2).
[*]char* and char[] are in many ways interchangeable, but they are not the same type: char[] explicitly declares an array with a known size, while char* could point to a single char or to the beginning of a char array.
[/list]
Edit: I put this in a post below, but it needs to be mentioned along with this.
----Other things about memory management:[list]
[*]When using new, as stated before, the memory comes from the heap. If you allocate too many items, you will run out of memory unless you deallocate them. Continually allocating memory without ever freeing it is called a memory leak.
[*]To deallocate, you use the delete keyword.
Say I write SomeClass* x = new SomeClass(); to deallocate it just use [u]delete x;[/u] (keep in mind this calls the destructor of SomeClass and frees the memory that x was pointing to, allowing later allocations to reuse it).
But if you allocate an array you have to do something else: in the case of char* ca = new char[3]; you have to use [u]delete [] ca;[/u]
[/list]
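The pairing of new with delete and new[] with delete[] can be observed with a class that counts its live instances. This is a sketch only; Counted and the two function names are invented for the example.

```cpp
#include <cassert>

// Hypothetical class that counts how many instances are currently alive.
struct Counted {
    static int alive;
    Counted()  { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

int alive_after_single() {
    Counted* x = new Counted();
    delete x;                     // runs ~Counted once, then frees the memory
    return Counted::alive;
}

int alive_after_array() {
    Counted* xs = new Counted[3];
    int during = Counted::alive;  // 3 objects were constructed
    delete[] xs;                  // runs ~Counted for all 3 elements
    assert(during == 3);
    return Counted::alive;
}
```

Both functions should leave the counter at 0. Using plain delete on the array, or delete[] on the single object, is undefined behavior, so the forms must match.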

Share on other sites
I'm wondering why I was downvoted without any comments explaining what was wrong with what I wrote [img]http://public.gamedev.net//public/style_emoticons/default/sad.png[/img]


Regarding 4, that holds only if he uses the standard, non-placement new operator. If operator new is overloaded, everything's up in the air. Placement new just uses whatever memory you point it at.

Regarding 5, 2[tc] is also '3' [img]http://public.gamedev.net//public/style_emoticons/default/smile.png[/img] I absolutely hate that syntax, though.
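The equivalence of the indexing forms, and the fact that an array is still not the same thing as a pointer, can be checked directly. A small sketch (same_element and demo_index_forms are made-up names):

```cpp
#include <cassert>

// p[i], *(p + i) and i[p] all name the same element.
bool same_element(const char* p, int i) {
    return &p[i] == p + i && &i[p] == p + i;
}

void demo_index_forms() {
    char tc[4] = "123";
    assert(tc[2] == '3');
    assert(*(tc + 2) == '3');
    assert(2[tc] == '3');      // legal but confusing; best avoided
    assert(sizeof(tc) == 4);   // tc is an array of 4 chars, not a pointer
    const char* p = tc;        // array-to-pointer conversion ("decay")
    assert(same_element(p, 2));
}
```

The sizeof check is the part that shows tc is a real array: once it has decayed to p, that size information is gone.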

Share on other sites
Why were you downvoted??? You were spot on in everything you mentioned.

Oh, and yeah, I completely forgot about the 2[var] syntax. I never use it since it's confusing.

What do you mean by overloading new? I'm not sure I've come across that use of new.



Also, monkeyboi, other things about memory management:[list]
[*]When using new, as stated before, the memory comes from the heap. If you allocate too many items, you will run out of memory unless you deallocate them. Continually allocating memory without ever freeing it is called a memory leak.
[*]To deallocate, you use the delete keyword.
Say I write SomeClass* x = new SomeClass(); to deallocate it just use [u]delete x;[/u] (keep in mind this calls the destructor of SomeClass and frees the memory that x was pointing to, allowing later allocations to reuse it).
But if you allocate an array you have to do something else: in the case of char* ca = new char[3]; you have to use [u]delete [] ca;[/u]
[/list]

Share on other sites

Overloading new... I don't recommend what I'm doing here (since it's stupid) but it's a quick example:
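[source lang="cpp"]#include <cstddef>

static char sBuffer[512];

// Replaces the global operator new: every allocation gets the same buffer.
void * operator new (std::size_t sz)
{
    return (void *)sBuffer;
}[/source]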
That won't return the pointer from the heap, but rather will return [i]sBuffer[/i].

In regards to placement new:
[source lang="cpp"]alignas(Object) char buffer[512]; // storage must be suitably aligned for Object
new (buffer) Object; // needs #include <new>[/source]

That also will not use the heap, but will construct the object within [i]buffer[/i].

For placement new, do [i]not ever call delete[/i]. Call the destructor manually instead. Calling delete would crash/cause a black hole, as the pointer does not refer to memory obtained from the heap.

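[source lang="cpp"]alignas(Object) char buffer[512];
Object *obj = new (buffer) Object; // construct in place (needs #include <new>)
obj->~Object(); // destroy manually; never call delete on obj[/source]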
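Here is a slightly fuller, compilable sketch of the same placement new round trip. The Object type below is a stand-in invented for the example (with a live-instance counter added purely for illustration), and the buffer is sized and aligned for exactly one Object.

```cpp
#include <cassert>
#include <new>      // placement new

// Stand-in type for the example; `alive` is hypothetical instrumentation.
struct Object {
    static int alive;
    int value;
    Object() : value(42) { ++alive; }
    ~Object() { --alive; }
};
int Object::alive = 0;

int placement_round_trip() {
    alignas(Object) unsigned char buffer[sizeof(Object)];
    Object* obj = new (buffer) Object;  // constructs inside buffer, no heap involved
    int seen = obj->value;
    assert(Object::alive == 1);
    obj->~Object();                     // manual destructor call; no delete
    return seen;
}
```

Keeping the pointer returned by placement new, rather than casting the buffer again, is the safer habit, and the counter confirms the destructor ran exactly once.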

Although technically beyond the scope of mid-level C++, he should also have familiarity with the C allocators ([i]malloc[/i], [i]calloc[/i], [i]realloc[/i], ...) and deallocator ([i]free[/i]).

They function relatively the same as new (with a few caveats) except that they do not call constructors/destructors. The pointers may also not be interchanged between [i]malloc [/i]and [i]delete[/i], for instance.

[i]alloca[/i], if it's present, will allocate the memory on the stack (by decrementing the stack pointer, generally). I don't [i]usually[/i] recommend it as it's not safe - a failure for alloca to allocate results in a stack overflow.

Share on other sites
Also don't forget that with malloc/calloc etc. you must actually state [i]how many bytes[/i] you want to allocate.

int* c = (int *) malloc( sizeof(int) * 30); // room for 30 ints; release it with free(c)
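A minimal sketch of the full malloc/free cycle from C++ might look like the following. The function names are made up; std::malloc and std::free are the standard ones from <cstdlib>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Allocates room for `count` ints with malloc; returns nullptr on failure.
int* alloc_ints(std::size_t count) {
    assert(count > 0);
    void* raw = std::malloc(sizeof(int) * count);  // size is given in bytes
    return static_cast<int*>(raw);                 // C++ needs the explicit cast
}

int sum_first(std::size_t count) {
    int* c = alloc_ints(count);
    if (c == nullptr) return -1;       // always check the result
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        c[i] = static_cast<int>(i);    // malloc memory starts uninitialized
        total += c[i];
    }
    std::free(c);                      // pair malloc with free, never delete
    return total;
}
```

For 30 elements filled with 0 through 29 the sum is 435, and nothing is leaked because the single malloc is matched by a single free.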

Share on other sites
Firstly, use 'sizeof( )' instead of 'strlen( )' to work out how many bytes you are asking for:

int size = sizeof( char ) * 4;

Also, if you declare a fixed-size array like:
char myChar[4];
sizeof( myChar ) == 4
int myInt[4];
sizeof( myInt ) == 16 (with 4-byte ints)
the compiler is able to determine the size at compilation time.

It cannot do this for a run-time allocated array created with new or malloc, because sizeof then reports the size of the variable you pass in, which is a pointer.
int* myInt = new int[4];
sizeof( myInt ) == 4 on a typical 32-bit build

*In this case the size is '8' because we're on a 64-bit compiler, where pointers are typically 8 bytes in size (the standard leaves the exact size to the implementation).
sizeof( char* ) == 8
sizeof( short* ) == 8
sizeof( int* ) == 8
....etc etc
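These sizeof relationships can be stated in a way that holds on any platform, without assuming 4-byte ints or 8-byte pointers. A small sketch (element_count and demo_sizeof are names invented here):

```cpp
#include <cassert>
#include <cstddef>

// Deduces the element count of a real array at compile time.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) {
    return N;
}

void demo_sizeof() {
    char myChar[4] = {};
    int  myInt[4]  = {};
    static_assert(sizeof(myChar) == 4, "4 chars are always 4 bytes");
    static_assert(sizeof(myInt) == 4 * sizeof(int), "count times element size");
    assert(element_count(myInt) == 4);

    int* heapInts = new int[4];
    // sizeof sees only the pointer type; the 4 cannot be recovered from it.
    static_assert(sizeof(heapInts) == sizeof(int*), "pointer size, not array size");
    delete[] heapInts;
}
```

Passing heapInts to element_count would not even compile, which is a handy way to catch the array/pointer mix-up early.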

Secondly, in C/C++ you don't have array access checks to warn you if you're going out of bounds, which leads to the typical problem of memory overwrites that every C/C++ developer has had fun with. With the array defined as:
char myArray[4] = "123";

it is possible to do this without causing any compiler errors:
char value1 = myArray[5];
char value2 = myArray[-1];

Both reads are undefined behavior. In practice the values in value1 and value2 could be anything: valid, invalid, garbage, another object's data... What you typically get is whatever is stored at the address of 'myArray[0] plus 5 bytes' (value1) and 'myArray[0] minus 1 byte' (value2), if that makes sense :S

Finally, to give some further detail on the spacing you got between ta and tb (40 hex, 64 dec): this is also compiler and OS specific. For instance on Windows, when the debug libraries are in use, the allocator can reserve more memory than you actually asked for, so if you asked for 4 bytes of data you might actually get 8 bytes. The additional 4 bytes can be split into 2 bytes on either side of your memory as padding. That padding can be used to detect memory overwrites or corruption, or perhaps even to absorb minor overwrites (i.e. if only the padding is corrupted, your game data is left untouched).

However, still don't expect several calls to 'new' to return allocations that sit next to each other in memory. If you need that, use a single 'new' call for the total number of chars you need and hand out pieces of that block, e.g.

char* myCharArray = new char[8];
char* ta = &myCharArray[0];
char* tb = &myCharArray[4];

Memory Address:
ta = 0x000a5fc0
tb = 0x000a5fc4

(tb)0x000a5fc4 - (ta)0x000a5fc0 = 4
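The same subtraction can be done in code rather than by hand. This is a sketch with made-up function names; it relies on the fact that subtracting two pointers into the same array is well defined and, for char pointers, yields a distance in bytes.

```cpp
#include <cassert>
#include <cstddef>

// Distance in bytes between two char pointers into the same block.
std::ptrdiff_t byte_gap(const char* first, const char* second) {
    return second - first;
}

std::ptrdiff_t gap_within_one_block() {
    char* block = new char[8]();
    char* ta = &block[0];
    char* tb = &block[4];
    std::ptrdiff_t gap = byte_gap(ta, tb);  // well-defined: same array
    delete[] block;                         // one delete[] for the whole block
    assert(gap == 4);
    return gap;
}
```

Subtracting pointers that come from two separate new calls is not well defined in C++, which is one more reason the 0x40 spacing seen in the debugger should be treated as an observation about one particular run and not as something to rely on.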

Hope this helps explain a few things.
