Member function definitions and memory allocation

13 comments, last by NightCreature83 12 years, 8 months ago
Hey Everyone,

I'm trying to slowly integrate better programming techniques into my coding, and I just learned about the option of defining member functions outside of a class declaration. As long as the function definition is not considered 'inline', does that mean functions (and all of their variables) defined outside of the class declaration are not allocated in memory until they are called by the program? In other words, does this technique make for a more memory-efficient program?

Thanks :)
Yes and no.

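Yes, in that the definition of a function has nothing to do with instances of a class: the function's code exists once in the program, no matter how many objects you create or where you wrote the definition. No, defining it outside the class is not more memory-efficient.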
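A minimal sketch of that point, using two made-up structs (InClass and OutOfClass are hypothetical names, not from the thread). They hold the same data and differ only in where the member function is defined. On mainstream compilers the objects come out the same size, because a non-virtual member function is not stored in the object:

```cpp
// Hypothetical types for illustration: same data, different placement
// of the member function definition.
struct InClass {
    int value;
    // Defined inside the class, so it is implicitly an inline function.
    int doubled() const { return value * 2; }
};

struct OutOfClass {
    int value;
    int doubled() const;  // declared here, defined below
};

// Defined outside the class, as it would be in a .cpp file.
int OutOfClass::doubled() const { return value * 2; }

// Assumption: typical compilers lay both out as a single int.
// The function code lives once in the executable, not in each object.
static_assert(sizeof(InClass) == sizeof(OutOfClass),
              "where a member function is defined does not change object size");
```

Local variables inside either function are created on the stack when the function is called and go away when it returns, regardless of where the definition is written.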

Stephen M. Webb
Professional Free Software Developer

Well, darn. I hear inline functions are the way to go as far as performance, but they can cause the program to become a memory hog if there are too many instances. Should I only use inline functions if I know that there will only be one or a few instances of the class?
Your compiler should be smart enough to be able to inline things when it needs to.

It is not necessarily the case that inlining functions causes bloat -- in some cases it can actually shorten code by saving on register save/restore around the call.

The compiler almost certainly is better at this decision than you are though.
So does that mean that in this day and age it isn't necessary for me to worry about coding inline functions at all? I'm using the VS2010 compiler.

Taking the definitions out of the header file has advantages like reducing compilation times when you need to change the implementation, and generally being cleaner. So you should do that by default.

There will be situations where a function is called many, many times, and the cost of the calls themselves can make your program slow. In those cases, you may want to move the implementation to the header file to give the compiler the opportunity to inline the function.

So it's not quite true that you never have to worry about inlining these days, but it's close.
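As a rough sketch of that default, here is a made-up Widget class (the class, its members, and the file names in the comments are all hypothetical). The two parts are shown in one block so it compiles on its own; in a real project they would be separate files:

```cpp
// ---- widget.h (hypothetical header) ----
class Widget {
public:
    explicit Widget(int size) : size_(size) {}

    // Tiny and called often: kept in the header so the compiler can
    // see the body at every call site and inline it if it wants to.
    int size() const { return size_; }

    // Larger function: only declared here. Changing its body later
    // does not force files that include the header to recompile.
    int checksum() const;

private:
    int size_;
};

// ---- widget.cpp (hypothetical source file) ----
// Sums the squares of 0 .. size_-1; the actual work is arbitrary.
int Widget::checksum() const {
    int sum = 0;
    for (int i = 0; i < size_; ++i) {
        sum += i * i;
    }
    return sum;
}
```

Whether size() actually gets inlined is still up to the compiler; putting it in the header only makes inlining possible.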
Your understanding of what an "instance" of an inline function is isn't correct. When someone warns of too many instances of inline functions causing code bloat, they do not mean that there is an instance of every inline function for each instance of the class that has been created. Instead, an "instance" of an inline function occurs each time the compiler chooses to inline the function at a call site (the inline keyword is only a hint; the compiler may inline functions not marked "inline" and may decline to inline functions marked "inline", as it pleases). Whether you have one object of the class the inline function comes from, or a million of them, makes no difference.

As far as best practices WRT inline are concerned, go ahead and mark small, fast methods as inline and keep putting them inside the class. The compiler will likely make whatever decision it feels is best in light of other optimization settings (optimize for speed vs optimize for size, aggressiveness of optimization, etc), but giving the compiler some clue about what *you* think is a good inline candidate won't hurt anything. Back when compilers took the inline keyword as if it were from the mouth of Zeus himself, a programmer could really muck things up by overusing it, but those days are long past. Some compilers have an extension that allows a programmer to force inlining of a certain method, and that's the one that's dangerous to abuse nowadays.
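For reference, a sketch of what the hint and the "force" extensions look like side by side. The macro name FORCE_INLINE and the two functions are made up for illustration; __forceinline (MSVC) and __attribute__((always_inline)) (GCC/Clang) are the compiler-specific extensions being wrapped:

```cpp
// FORCE_INLINE is a hypothetical macro name chosen for this example.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // fallback: plain hint only
#endif

// Ordinary hint: the compiler is free to ignore it either way.
inline int add_hint(int a, int b) { return a + b; }

// Compiler extension: overrides the compiler's own cost/benefit
// judgment, so overusing it can bloat code.
FORCE_INLINE int add_forced(int a, int b) { return a + b; }
```

Both functions behave identically; the only difference is how much say the compiler keeps in the inlining decision.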

Just don't assume that "inline" is a magic keyword to make things go faster, because it's not. Code which is too aggressively inlined puts pressure on the code cache, which increases the likelihood of having to go up the cache hierarchy or out to main memory to get the next instruction. When this happens, inlining can effectively cause the body of code, as a whole, to run much slower. On most modern platforms, call overhead is a negligible cost for 98% of functions. You gain much more by being cache-friendly. The big optimization wins today are about choosing the right algorithm or how your data is arranged, not the optimizations that were popular (necessary, even) in the past.


I see. What exactly do you mean by "call overhead"?

Calling a non-inlined function involves several steps that take CPU time:
* pushing parameters onto the stack
* pushing the return address onto the stack
* saving the value of some registers
* changing the stack frame
* restoring the registers
* popping the return address from the stack
(There might be some costs associated with messing with the processor's instruction decoding pipeline, but I am not very certain of the details, so I wouldn't know how to describe it well.)

The cost of these steps is the call overhead. If you don't want to pay it, you could replace the function call with the body of the function, which is tremendously ugly and requires you to repeat the code at every point that uses the function. When a compiler inlines a function, it does that substitution for you, so your code stays clean and you still save yourself the call overhead.
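To make the substitution concrete, here is a small sketch with made-up functions (clamp_to_byte and both callers are hypothetical). The second caller is roughly what you would have to write to avoid the calls by hand, and it is a source-level picture of what inlining does for you:

```cpp
// Hypothetical helper: limits a value to the range 0..255.
int clamp_to_byte(int v) {
    if (v < 0) return 0;
    if (v > 255) return 255;
    return v;
}

// Version 1: two ordinary calls. Clean, but each call may pay the
// overhead listed above if the compiler does not inline it.
int sum_clamped_with_calls(int a, int b) {
    return clamp_to_byte(a) + clamp_to_byte(b);
}

// Version 2: the body pasted in by hand at each use. No calls,
// but the logic is repeated and must be kept in sync manually.
int sum_clamped_by_hand(int a, int b) {
    if (a < 0) a = 0;
    if (a > 255) a = 255;
    if (b < 0) b = 0;
    if (b > 255) b = 255;
    return a + b;
}
```

Both versions return the same result for every input; the compiler's actual output for an inlined call will generally be tuned further than this hand-written version.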

Do things make more sense now?
The idea of inlining code is to eliminate the need to push values onto the stack, or load them into registers (which usually means first pushing some other useful value onto the stack), and then to retrieve a return value from a register afterwards. Without inlining, this is what is generally done when calling and returning from a function. This is what they call "call overhead" -- it's the stuff you have to do to call a function, but which isn't directly computing the value of the function. Usually, this overhead amounts to a few instructions and a few processor cycles. If you have a big function which takes 100 cycles to do the "real work", then a few cycles of overhead per call is a fairly small percentage of the overall time that the function takes. But if the "real work" is only a handful of cycles itself, then the overhead might become 50% or more of the total execution time in extreme cases. That's why it *can* be a good idea to inline small functions, but there's virtually no gain to be had by inlining larger ones. Inlining, when applied, removes the overhead by using values directly from the caller's stack frame and registers, rather than transferring them to a non-inline function in its own registers or on the stack.
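The arithmetic behind those percentages can be sketched like this. The function name is made up, and the cycle counts in the note below are illustrative assumptions, not measurements:

```cpp
// Fraction of a call's total time that is spent on call overhead
// rather than on the function's real work. Both arguments are in cycles.
double overhead_fraction(double overhead_cycles, double work_cycles) {
    return overhead_cycles / (overhead_cycles + work_cycles);
}
```

Assuming 5 cycles of overhead, overhead_fraction(5, 100) is 5/105, a little under 5%, while overhead_fraction(5, 5) is exactly 0.5, i.e. half the time goes to the call itself.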

Of course, it's not even that simple. Even when inlined, you may have to duplicate a register value or store a copy to the stack to keep it safe, depending on whether the function is destructive with that particular register value, whether it will be overwritten due to register pressure in the rest of the calculation, or a myriad of other reasons. When you inline a function you have no idea how the compiler will adapt the function, or the code surrounding the call site, to be optimal -- and it will be tuned to each site (though the result will be the same). This is the reason why "inline" today is just a hint: the programmer cannot even remotely hope to understand how the code will be transformed in order to be inlined, much less at each individual call site, and still less whether the transformation at each site performs better or worse.
