How does C++ handle global dynamically allocated memory?

9 comments, last by ordered_disorder 17 years, 8 months ago
I was curious as to when and where my MSVC++ compiler puts the code for *edit* GLOBAL dynamically allocated memory. For example:

char *str = new char[50];

int main()
{
    delete[] str;  // array form, to match new[]
}



Where in the source is this happening? Right before main in some start code that allocates all global dynamic memory? This doesn't eat up 50 bytes of memory in the data section of my PE, does it? *PS* My reasoning for doing this IS keeping the executable size as small as possible. I am working with objects that are many kilobytes.
Quote:Where in the source is this happening? Right before main in some start code that allocates all global dynamic memory? This doesn't eat up 50 bytes of memory in the data section of my PE, does it?


Before the entry point is reached, the global variables are allocated and initialized. The above code, in the final executable, will allocate a 4-byte pointer to a char, allocate a 50-char buffer on the heap, and set the value of the pointer to the address of the allocated buffer.

Global variables in a given file are initialized in the order they appear. Global variables in different files are initialized in an unspecified order relative to one another.
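To make the ordering rule concrete, here is a minimal sketch for a single source file. The names init_log and record are hypothetical helpers invented for the illustration, not anything from the original code.

```cpp
#include <string>
#include <vector>

// Hypothetical log of the order in which initializers ran.
// It is defined first so that it exists before the globals below use it.
std::vector<std::string> init_log;

// Hypothetical helper: records a name, then returns the value to store.
int record(const char* name, int value) {
    init_log.push_back(name);
    return value;
}

// Dynamic initialization within one file follows the order of definition.
int first  = record("first", 1);
int second = record("second", first + 1);  // safe: `first` is already set
```

If first and second lived in different source files, nothing would guarantee that first was set before second read it.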

Quote:My reasoning for doing this IS keeping the executable size as small as possible. I am working with objects that are many kilobytes.


Global variables don't appear as part of the executable. Their memory is allocated when the program is executed. The executable only knows how large they are, and how they are initialized.
You are awesome ToohrVyk, all your posts are gold. Thank you for so thoroughly answering my question.

Quote:Original post by ToohrVyk
Before the entry point is reached, the global variables are allocated and initialized. The above code, in the final executable, will allocate a 4-byte pointer to a char, allocate a 50-char buffer on the heap, and set the value of the pointer to the address of the allocated buffer.

Global variables with dynamic initialization are zero-initialized, and dynamic initialization occurs sometime before use. It's not guaranteed that the dynamic initialization will take place before main().
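A small sketch of the two phases, reusing the str pointer from the original question. The flag str_was_null_first is a hypothetical name added for the illustration; it relies on initializers in one file running in order of definition.

```cpp
// Declared here so the flag below can refer to it; defined two lines down.
extern char* str;

// Runs first: at this point `str` has only been zero-initialized,
// so it still holds a null pointer.
bool str_was_null_first = (str == nullptr);

// Runs second: the dynamic initializer calls new[] and stores the result.
char* str = new char[50];
```

On a typical implementation both initializers run in startup code before main(), but the point stands either way: the pointer is null first and gets its heap address later.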

Quote:
Global variables don't appear as part of the executable. Their memory is allocated when the program is executed. The executable only knows how large they are, and how they are initialized.


Not necessarily true. Global variables often can appear as part of the executable image, especially if they have non-trivial static initialization, such as:
int numbers[] = { 1, 2, 3, 4, 5, 6, 7, 8 };

This will most likely appear as part of the executable image in a writable data segment.
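As a rough sketch of the difference, assuming a typical toolchain (the zeros array and numbers_bytes are hypothetical additions for comparison, and exact section placement is up to the compiler and linker):

```cpp
#include <cstddef>

// Statically initialized: a typical toolchain stores these eight values
// directly in a writable data section of the executable image.
int numbers[] = { 1, 2, 3, 4, 5, 6, 7, 8 };

// Zero-initialized (hypothetical comparison array): usually only its size
// is recorded, so it adds little or nothing to the file on disk.
int zeros[1024];

// Bytes of initializer data the image has to carry for `numbers`.
constexpr std::size_t numbers_bytes = sizeof(numbers);
static_assert(numbers_bytes == 8 * sizeof(int), "one int per initializer");
```

Adding more initializers to numbers grows numbers_bytes, and with it the data the image has to hold.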
Quote:Original post by SiCrane
Global variables with dynamic initialization are zero-initialized, and dynamic initialization occurs sometime before use. It's not guaranteed that the dynamic initialization will take place before main().


I agree, I was wrong on this part.

Quote:
Not necessarily true. Global variables often can appear as part of the executable image, especially if they have non-trivial static initialization, such as:
int numbers[] = { 1, 2, 3, 4, 5, 6, 7, 8 };

This will most likely appear as part of the executable image in a writable data segment.


And I would argue that the program stores only how to initialize the data. This storage happens to be the data itself, placed in the correct memory area (so optimal initialization is simply doing nothing), as would be expected when initialization consists simply of copying data.
I fail to see how that argument makes the statement "Global variables don't appear as part of the executable." true. The global variable is there as part of the executable image. If you increase the size of the global, the executable size will increase. If the variable is externed properly, you can get an object dump of the executable that lists the location of the global in the executable image.
I really appreciate the clarifications, SiCrane. I didn't argue with "globals aren't part of the executable" because I figured globals that are initialized with non-trivial data aren't called globals. I am very sleepy :] But they're definitely part of the image, especially if they are strings or contain non-trivial data.

After some debugging, I noticed that, as you said, SiCrane, globals can indeed be allocated at any time during a program run. I often noticed surreptitious allocation right before a global variable's use. Even more interesting, the full allocation wouldn't happen unless all of that variable's memory was used.

For example, let's say I had a megabyte-sized array: even if I assigned data to a few hundred random addresses in the array, only that much memory, plus a few bytes, would be allocated to my process. The Windows memory manager has some damn interesting tricks.

And this is all compiled with optimizations off, and this kind of surreptitious memory management is still happening.
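The megabyte-array observation can be reproduced with something like the sketch below. The names big, kSize and touch are hypothetical. My assumption is that what shows up in the debugger is the operating system backing pages with physical memory on first write (demand paging), not the C++ runtime allocating anything; the code only sets up the access pattern and does not measure memory use itself.

```cpp
#include <cstddef>

// One megabyte of zero-initialized global storage. The address range is
// reserved for the process, but the OS typically backs each page with
// physical memory only when that page is first written.
constexpr std::size_t kSize = 1024 * 1024;
char big[kSize];

// Hypothetical helper: writes one byte every `stride` bytes and returns
// the number of writes performed.
std::size_t touch(std::size_t stride) {
    std::size_t writes = 0;
    for (std::size_t i = 0; i < kSize; i += stride) {
        big[i] = 1;
        ++writes;
    }
    return writes;
}
```

Calling touch with a large stride and watching the process's working set in a tool such as Task Manager should show only the touched pages being committed.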
Why do you need globals anyways? Normally a sign of really bad design. You can allocate the memory in functions or classes and get the same results.
If you open an exe file in a PE editor, you'll see two fields in the PE header: SizeOfInitializedData, SizeOfUninitializedData. SizeOfUninitializedData seems to be a legacy field, as it is always set to zero by modern compilers as uninitialised data is merged with initialised. SizeOfInitializedData is the sum of all the data sections: .data, .rdata, .idata, .edata, .bss, .rsrc (whichever ones are present). These days, it is uncommon to see .bss, .idata or .edata, since .bss was originally used for uninitialised data and the imports and exports sections are generally merged with .data to allow kernel32.dll to efficiently create the IAT and EAT on top of the existing import and export tables. I digress.

.rdata is reserved for read-only data such as constant strings, whereas writeable global data goes in .data.

const char *InitialisedStatic = "Admiral Was Here";
char *UninitialisedStatic;
char *Dynamic = new char[50];

The addresses of all these strings (double-pointers) are in .data. The strings themselves, upon initialisation (which is done by kernel32.dll at load-time), reside in different places:

InitialisedStatic points to the .rdata section,
UninitialisedStatic points to NULL,
Dynamic points to the heap, above the PE image.

So unless the compiler determines that your initialised string needs write-access the initialised data will not be copied at load time, just referenced from the original PE image.

And just to confirm for the original question: All string pointers take up the same amount of space, no matter where they live (four bytes on a 32 bit machine) and unless you initialise it, no additional space will be used in your PE image.
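A quick way to check the pointer-size claim is the sketch below, which mirrors the three declarations. release_dynamic is a hypothetical helper added so the heap buffer gets freed; the const is there because standard C++ (since C++11) does not let a string literal bind to a plain char*.

```cpp
// The three pointers from the post above.
const char* InitialisedStatic = "Admiral Was Here";  // points at literal data
char* UninitialisedStatic;                           // zero-initialized: null
char* Dynamic = new char[50];                        // heap buffer

// Every pointer occupies the same space, whatever it points to
// (four bytes on a 32-bit build, eight on a 64-bit one).
static_assert(sizeof(InitialisedStatic) == sizeof(Dynamic), "same size");
static_assert(sizeof(UninitialisedStatic) == sizeof(Dynamic), "same size");

// Hypothetical helper: releases the heap buffer with the array form of delete.
void release_dynamic() {
    delete[] Dynamic;
    Dynamic = nullptr;
}
```

Only the 17 bytes of the literal (16 characters plus the terminator) add initialised data to the image; the 50-byte buffer exists only at run time.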

Regards
Admiral
Ring3 Circus - Diary of a programmer, journal of a hacker.
Quote:Original post by Anonymous Poster
Why do you need globals anyways? Normally a sign of really bad design. You can allocate the memory in functions or classes and get the same results.


Different paradigms for different problems. If I were writing a 3D engine, for instance, I would use a lower level of procedural programming and use globals. Messing around with the stack, multiple levels of abstraction, threading through encapsulation, and the general code overhead are really bad when your code is CPU-cycle heavy and gets called several hundred or even several thousand times a second.

Small test programs and specific non-complex programs are cases for globals as well.

For this particular case I am writing low level systems software and I would prefer not wasting my user's cycles and memory with a program that they won't even interact with.

This topic is closed to new replies.
