# DLL heap questions

    class ClassThatReallocatesWhenCHANGEIsCalled
    {
        ...
        void CHANGE ()
        {
            delete [] data_member;

            data_member = new int[WHATEVER];
        }
        ...
        int *data_member;
    };

My application code:

    ClassThatReallocatesWhenCHANGEIsCalled class_instance;

    // my_class_in_dLL is a pointer to a class instance obtained by
    // calling LoadLibrary and then using an exported function to
    // return a new instance
    class_instance.CHANGE ();
    my_class_in_dLL->InsideDLLFunction (class_instance);

My DLL code:

    class MyClassInDLL
    {
        ...
        void InsideDLLFunction (ClassThatReallocatesWhenCHANGEIsCalled &class_instance)
        {
            class_instance.CHANGE ();
        }
        ...
    };

The DLL and the App each have access to the ClassThatReallocatesWhenCHANGEIsCalled implementation, and each statically links to it. When CHANGE () is called in the App, it runs the App's code for CHANGE () and allocates from the App's heap. When InsideDLLFunction () calls CHANGE (), it runs the DLL's code for CHANGE (), which tries to deallocate using the pointer contained within class_instance. But that pointer is only valid for the App's heap, and we're in the DLL now. Oops! Time for Mr. _CrtIsValidHeapPointer to teach you some manners!

Is there some better way to do this, through design or otherwise, that avoids the need for the multithreaded DLL CRT trick? Sorry for the long post, but I think you'd have more to complain about if I asked some ambiguous questions!
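One design-level way around the problem described above, offered as a sketch under assumptions rather than something proposed in the thread: make the reallocating call virtual on a shared interface, so it is dispatched through the object's vtable to the module that constructed the object. The names `IResizableData` and `AppData` are hypothetical, and the sketch assumes both modules are built with the same compiler so their vtable layouts agree.

```cpp
#include <cstddef>

// Hypothetical interface, shared between the App and the DLL through a header.
// Only this declaration is shared; no allocation code is compiled into the DLL.
class IResizableData {
public:
    virtual ~IResizableData() = default;
    virtual void Change(std::size_t new_count) = 0;
    virtual std::size_t Count() const = 0;
};

// Implemented in the App only. Every new[]/delete[] below is compiled into
// the App, so allocation and deallocation always use the App's CRT heap.
class AppData final : public IResizableData {
public:
    AppData() = default;
    AppData(const AppData&) = delete;             // owns a raw buffer
    AppData& operator=(const AppData&) = delete;
    ~AppData() override { delete[] data_; }

    void Change(std::size_t new_count) override {
        delete[] data_;
        data_ = new int[new_count]();             // zero-initialised
        count_ = new_count;
    }
    std::size_t Count() const override { return count_; }

private:
    int* data_ = nullptr;
    std::size_t count_ = 0;
};

// Stands in for MyClassInDLL::InsideDLLFunction. It sees only the interface,
// so the call goes through the vtable that the App's constructor installed.
void InsideDLLFunction(IResizableData& instance) {
    instance.Change(16);
}
```

Because the DLL side never sees the concrete class, it cannot end up with its own copy of the allocation code. This is roughly the same idea COM relies on, at the cost of one indirect call per operation.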
Thanks for any insight! Ro_Akira

##### Share on other sites
So you''re writing an application in C++ and you worry about performance?

##### Share on other sites
Heh! antareus, at first I thought your comment was an implication that I should do it in C, asm! Then I considered the possibility that it was a genuine question!

Yes, it's a C++ application and I'm worried about performance. At least I want to be informed about the performance impact of such design decisions, even if there's not much that can be done about them. Are my presumptions correct throughout my original post?

Thanks,
Ro_Akira

[edited by - Ro_Akira on November 11, 2003 8:25:11 AM]

##### Share on other sites
> - Another solution is to use the multithreaded
> DLL CRT for both the App and the DLL.

Always match the code generation linker parameters for EXEs and DLLs; otherwise you get heap problems.

Yes, but it's not widespread. Anything that deals with common data structures will have critical locks in the CRT (file I/O, memory allocation, ...). Critical sections are concentrated around fast-executing operations, so thread collisions rarely occur in the real world.

There is no difference in performance whatsoever between EXE and DLL when it comes to running the actual code; that is, once the code is loaded in memory. The performance difference comes at DLL loading time. Here are a few tips and tricks for using DLLs that can improve performance.

a) You can delay-load a DLL until it is needed by the application. For example, by putting the victory dance code in a separate DLL, you can accelerate things by putting off use of that code for later (and only if the gamer wins). Check the linker switches for this feature.

b) Use the /BASE switch of the linker to specify where the DLL will land in memory at load time. Why? With the default linker settings, all your compiled DLLs will be built with the same preferred base address. The Win32 loader then has to do an address fixup pass to move a DLL out of the way of a collision, which creates a new private copy of the code instead of relying solely on the memory-map mechanism. So you end up memory-mapping your DLL, modifying each and every 4K block, and swapping it out again to disk... not an efficient use of swapping.

d) Coalesce related code segments into 4K blocks. Why? The memory-map unit has a 4K granularity (8K on MIPS architectures). Putting all your setup UI dialog box code in a single segment allows the OS to page this segment out when it's no longer needed; being the least recently used, its pages will eventually get marked as recyclable by the OS. Check the 'code_seg' pragmas of the compiler. If you scatter the code segments all over the place, then you stand a chance of *maximizing* your app's memory footprint (not a smart thing). By coalescing related code segments you are *minimizing* your memory footprint. Look at MFC's source code as a good usage example of this.
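Tip (d) can be sketched with the MSVC `code_seg` pragma as follows. The section name `SETUP_UI` and the three functions are hypothetical, and the pragma is guarded so that other compilers simply skip it; treat this as an illustration of the grouping idea, not a tuned layout.

```cpp
// MSVC-specific: place the functions that follow in a named code section.
#ifdef _MSC_VER
#pragma code_seg("SETUP_UI")   // hypothetical section name
#endif

// Rarely used setup-dialog code, grouped so it lands on the same pages
// and can be paged out together once setup is finished.
int ValidateSetupOptions(int width, int height) {
    return (width > 0 && height > 0) ? 1 : 0;
}

int DefaultSetupWidth() { return 640; }

#ifdef _MSC_VER
#pragma code_seg()             // back to the default code section
#endif

// Hot per-frame code stays in the default section.
int AdvanceFrame(int frame) { return frame + 1; }
```

Whether the grouping pays off depends on how the code is actually used at run time, so it is worth checking the resulting sections with a map file before relying on it.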

> - Is there a good reason for the lack of a single

Unless you control both the DLL and EXE compilations, you can't make any assumptions about how the client app is compiled or how threads are used, if at all. Microsoft's COM paradigm alleviates this problem somewhat by way of explicit interfaces, explicit memory management contracts, and marked executables (apartment, multithreaded, free-threaded and single-threaded).

Hope this helps.

-cb

##### Share on other sites

quote:

> > - Another solution is to use the multithreaded
> > DLL CRT for both the App and the DLL.
>
> Always match the code generation linker parameters for EXEs and DLLs; otherwise you get heap problems.

Are you saying that I don't really need to specify the multithreaded DLL CRT for both the App and the DLL, and I could just as well use single-threaded, and it just matters that they're the same? Or are you talking about the need to match Debug App with Debug DLL, and Release with Release? Or are you just talking in general about linker parameters?

quote:

> Yes, but it's not widespread. Anything that deals with common data structures will have critical locks in the CRT (file I/O, memory allocation, ...). Critical sections are concentrated around fast-executing operations, so thread collisions rarely occur in the real world.

You're saying that the performance impact from critical sections in the CRT is not something to be worried about, and doesn't need to be avoided like the plague?

quote:

> There is no difference in performance whatsoever between EXE and DLL when it comes to running the actual code; that is, once the code is loaded in memory. The performance difference comes at DLL loading time.

Hmmm. Does linking to a DLL not affect inlining of functions?

Since loading DLLs is not something that will be done every frame in a game, for example, that's an acceptable overhead for the resulting flexibility.

In answer to (a). Isn't this compiler-specific? Explicit linkage can give the same result; the only disadvantage is that you need to keep track of when to load the DLL.

In answer to (c). I agree explicit linking is the most 'fun' and flexible way to use DLLs!

In answer to (b). Do I need to specify a different base address for each DLL the application uses?

e.g :
ogl_graphics.dll 0x64000000
dx9_graphics.dll 0x64000000
oal_sound.dll 0x64100000
dx9_sound.dll 0x64100000

In answer to (d). Sounds like advanced stuff. Is the general gist of it to keep all related code (functions that call or depend on each other) in a small 4K block of memory, so it can be easily paged in and out by the OS? And you pass on hints/directives about these blocks through the use of code_seg? I'll have a look at MFC's source code as you suggested.

Thanks,
Ro_Akira

##### Share on other sites
I couldn't do something mad, like passing the App's heap pointer to the DLL, having it save its own heap pointer, and using the App's heap until it gets a DLL_PROCESS_DETACH, whereupon it goes back to using the DLL's heap pointer? Or something?!

Sounds like an abomination!
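A tamer variant of that idea, sketched here under assumptions: instead of swapping the CRT's heap, the App hands the plugin a pair of allocation callbacks, and the plugin uses only those for memory that crosses the boundary. `HostAllocator`, `AppAlloc`, `AppFree` and `Plugin` are hypothetical names, not part of any real API.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical contract: the App fills this in and passes it to the plugin.
struct HostAllocator {
    void* (*alloc)(std::size_t bytes);
    void  (*release)(void* block);
};

// App side: thin wrappers compiled into the App, so they use the App's CRT.
inline void* AppAlloc(std::size_t bytes) { return std::malloc(bytes); }
inline void  AppFree(void* block)        { std::free(block); }

// DLL side: remembers the host's callbacks and never calls its own
// new/delete on buffers that the App also touches.
class Plugin {
public:
    explicit Plugin(const HostAllocator& host) : host_(host) {}

    // Replaces `data` with a fresh buffer of `count` ints.
    // `data` may be null on entry; it is null afterwards if allocation fails.
    void Resize(int*& data, std::size_t count) const {
        host_.release(data);
        data = static_cast<int*>(host_.alloc(count * sizeof(int)));
    }

private:
    HostAllocator host_;
};
```

The callbacks are plain function pointers, so the contract does not depend on which CRT either module links against.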

Ro_Akira

[edited by - Ro_Akira on November 11, 2003 11:04:20 AM]

##### Share on other sites
> I could just as well use single threaded and it
> just matters that they're the same?

The CRT used by the EXE and DLLs must be the same for correct heap usage in any case. It doesn't matter which model you select as long as it is consistent across all binaries. If you mix & match you will need to devise explicit memory handling contracts across your binaries.
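As a minimal sketch of what such an explicit memory handling contract might look like: the DLL exposes a create/destroy pair, and the App wraps the pair in a smart pointer so the object is always freed by the module that allocated it. `Widget`, `CreateWidget` and `DestroyWidget` are hypothetical; in a real DLL the two functions would be exported, and they are defined locally here only so the sketch is self-contained.

```cpp
#include <memory>

// Hypothetical object owned by the DLL.
struct Widget { int value = 0; };

// Would be exported from the DLL. Both functions are compiled into the same
// module, so new and delete always use that module's CRT heap.
Widget* CreateWidget()           { return new Widget(); }
void    DestroyWidget(Widget* w) { delete w; }

// App side: the deleter is the DLL's own destroy function, so the App never
// calls delete on memory it did not allocate.
using WidgetPtr = std::unique_ptr<Widget, void (*)(Widget*)>;

WidgetPtr MakeWidget() {
    return WidgetPtr(CreateWidget(), &DestroyWidget);
}
```

With this in place the App can mix CRT models freely, as long as nothing else allocated on one side is released on the other.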

> You're saying that the performance impact from
> critical sections in the CRT is not something to
> be worried

::HeapAlloc( X, HEAP_NO_SERIALIZE | HEAP_ZERO_MEMORY, Y ) is probably the worst performing API in a memory-intensive multithreaded app. Apart from this, profiling will give you more insight into where CPU time is spent; in my experience, the CRT calls are far down the list of performance issues.

> Does linking to a DLL not affect inlining of functions?

Inlining involves code duplication, no? I suspect you'll end up with inlined functions duplicated in all your binaries that use it.

> In answer to (a). Isn't this compiler specific.

It's available in VC6 and up. Linux and IRIX have similar features for DSOs (under 'delay-load').

> In answer to (b). Do I need to specify a different
> base address for each DLL the application uses?

Ideally yes, especially if you have quite a few home-made DLLs you need to load. You could run REBASE.EXE from a Perl script to automate extracting the base addresses, computing the relocation information, and changing the base address of your binaries.

-cb

PS: Reading the following will give some insights for your EXE/DLL optimizations:

http://msdn.microsoft.com/msdnmag/issues/02/02/PE/default.aspx

[edited by - cbenoi1 on November 11, 2003 12:16:53 PM]

##### Share on other sites
quote:

> > I could just as well use single threaded and it
> > just matters that they're the same?
>
> The CRT used by the EXE and DLLs must be the same for correct heap usage in any case. It doesn't matter which model you select as long as it is consistent across all binaries. If you mix & match you will need to devise explicit memory handling contracts across your binaries.

I'm not sure if you looked at the example code I gave in my initial post. It only works if I use the multithreaded DLL CRT for both App and DLL. Selecting, for example, single-threaded for both App and DLL results in the aforementioned Mr. _CrtIsValidHeapPointer saying it knows nothing of these 'other heaps'.

quote:

> > You're saying that the performance impact from
> > critical sections in the CRT is not something to
> > be worried
>
> ::HeapAlloc( X, HEAP_NO_SERIALIZE | HEAP_ZERO_MEMORY, Y ) is probably the worst performing API in a memory-intensive multithreaded app. Apart from this, profiling will give you more insight into where CPU time is spent; in my experience, the CRT calls are far down the list of performance issues.
>
> > Does linking to a DLL not affect inlining of functions?
>
> Inlining involves code duplication, no? I suspect you'll end up with inlined functions duplicated in all your binaries that use it.
>
> > In answer to (a). Isn't this compiler specific.
>
> That's a feature of the PE file format and the OS. Linux and IRIX have similar features for DSOs.

Fair enough.

quote:

> > In answer to (b). Do I need to specify a different
> > base address for each DLL the application uses?
>
> Ideally yes, especially if you have quite a few home-made DLLs you need to load.

Are there any recommendations as to how these base addresses should increment? For example, add another 0x00100000 for each one?

Thanks,
Ro_Akira

##### Share on other sites
While linking with the same code generation parameters is always advised, the DLL and the EXE share the same heap only when both are linked to the dynamic (DLL) version of the RTL, as you've noticed. I believe there is an article on MSDN about memory management issues when using DLLs.
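One way to enforce that rule at build time, as a sketch under assumptions: MSVC predefines `_DLL` when a module is compiled with /MD or /MDd, so a small header included by the App and by every DLL can refuse to compile otherwise. The header name and the `MODULE_CRT_MODEL` macro are hypothetical.

```cpp
// crt_check.h (hypothetical shared header, included by the App and each DLL)
#if defined(_MSC_VER)
#  if !defined(_DLL)
#    error "Build this module with /MD or /MDd so every module uses the DLL CRT."
#  endif
#  define MODULE_CRT_MODEL "shared DLL CRT"
#else
#  define MODULE_CRT_MODEL "not MSVC; check not applied"
#endif

// Small helper so the chosen model can be logged at startup.
inline const char* CrtModel() { return MODULE_CRT_MODEL; }
```

This does not catch a Debug module mixed with a Release one: /MD and /MDd select different CRT DLLs, so those configurations still need to match across modules.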

Matt

##### Share on other sites
quote:

> ...only when you link both DLL and EXE to the dynamic DLL version of the RTL do both the DLL and the EXE share the same heap...

And no amount of trickery will get around that, I guess.

I suppose I'll just have to bite the bullet, and accept a multithreaded DLL CRT as a part of the harsh and unrelenting strife/life I endure. Attempting to ban dynamic allocation isn't going to be practical.

Is there no modification to my design that would allow a similar result, i.e. one where my DLL plugin can operate on the ClassThatReallocatesWhenCHANGEIsCalled instance passed to it?

Ro_Akira

##### Share on other sites
> Are there any recommendations as to how
> these base addresses should increment?

Leave at least 64K between modules. The OS will interpret this as a no-man's land and catch pointer errors, if any.
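The arithmetic behind that advice can be sketched as below. The sketch assumes each base should sit on a 64K boundary and that each image is followed by one 64K gap; `AssignBases` and the image sizes in the usage note are hypothetical.

```cpp
#include <cstdint>
#include <vector>

constexpr std::uint32_t kGranularity = 0x10000;  // 64K

// Rounds an image size up to the next multiple of 64K.
constexpr std::uint32_t RoundUp64K(std::uint32_t size) {
    return (size + kGranularity - 1) / kGranularity * kGranularity;
}

// Assigns each image a base address: the previous base, plus the previous
// image's rounded-up size, plus one 64K gap.
std::vector<std::uint32_t> AssignBases(
        std::uint32_t first_base,
        const std::vector<std::uint32_t>& image_sizes) {
    std::vector<std::uint32_t> bases;
    std::uint32_t next = first_base;
    for (std::uint32_t size : image_sizes) {
        bases.push_back(next);
        next += RoundUp64K(size) + kGranularity;
    }
    return bases;
}
```

For example, starting at 0x64000000 with images of 0x12345, 0x8000 and 0x10000 bytes, this yields the bases 0x64000000, 0x64030000 and 0x64050000, which could then be passed to /BASE.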

-cb

##### Share on other sites
Well, this is just my opinion, but I don't think using the Multithreaded DLL CRT is "biting the bullet". I believe the synchronization overhead is limited to only a few areas.

##### Share on other sites
quote:

> Well, this is just my opinion, but I don't think using the Multithreaded DLL CRT is "biting the bullet". I believe the synchronization overhead is limited to only a few areas.

Very well. I will proceed with the Multithreaded DLL CRT plan. But I shall insist on wearing heavy, cumbersome, and ultimately ineffective body armour. To stop the bullet. :\

quote:

> > Are there any recommendations as to how
> > these base addresses should increment?
>
> Leave at least 64K between modules. The OS will interpret this as a no-man's land and catch pointer errors, if any.

At least 64K, you say. Sounds good to me.

Thanks everyone. I believe the matter is closed. A satisfactory solution has been found, and good times were had by all. :/

Ro_Akira

##### Share on other sites
Using a single heap in a highly modular application that depends upon plugins seems like a no-brainer to me. Heaps IMO should be process-local, not local to each DLL as Win32 seems to suggest.

I use the MT DLL CRT (wee, acronyms) in my application and I resent the dependency more than anything else, but everything has tradeoffs. I'll take extensibility and modularity over Win32's asinine memory boundaries with shared libraries. Not sure about the performance loss; as with anything else, use a profiler to determine if it's even a problem.

Also, critical sections are process-specific and not as costly as a full-blown mutex. I wouldn't fret much over these things.

Cheers,
ant.

[edited by - antareus on November 11, 2003 2:16:04 PM]
