Archived

This topic is now archived and is closed to further replies.

A. Buza

virtual funcs... performance penalty?

Yes, there is a performance penalty (the call goes through a function pointer).
Yes, it is negligible in almost every circumstance.

Basically, unless you're doing a virtual function call in one of your innermost loops, you shouldn't worry about it.

If you do still worry about it, there are ways to get some compilers to inline virtual functions. There was a Tip of the Day describing the basics of this on flipcode (www.flipcode.com) not too long ago...

You can also find information here:

http://msdn.microsoft.com/msdnmag/issues/0600/c/c0600.asp

Don't worry at all. I've looked at the code produced by my compiler, and here's the difference:

Non-virtual:
call ABCD;

Virtual:
call [0123];

That's it! A memory reference, which (on a 386) takes the clock count up to 10 instead of 7. Oh no.



David Owen

Actually, there are a few more penalties than indicated.

First off, every instance of a class that has virtual functions carries an implicit member variable: a pointer to the vtable for the class. So on an x86, that would double the memory required for a class that has only one pointer-sized variable. This leads to increased cache misses, memory motion costs, etc. It also adds a mov instruction to the constructor for the class. For most classes that you use virtual functions for, this isn't a big deal, but it's a good reason to never make a linked list node with a virtual function.

Second, the function invocation code is slightly more complicated than previously indicated. Let's say I have a class like so:

class Base {
public:
    virtual int Func1(int a);
    virtual int Func2(int a);
    int Func3(int a);
};
Base * b;

Then
b->Func3(5);
would produce the following assembly (assuming eax was already loaded with b):

push 5
push eax
call @@Base@Func3$qi
add esp,8

on the other hand
b->Func2(5);
would produce the following assembly:

push 5
push eax
mov edx,dword ptr [eax]
call dword ptr [edx+4]
add esp,8

There is one additional instruction, and two additional pointer indirections, one of which is indexed. Again not a big deal, but you can see how in a tight loop it could be murder.

For those of you interested, I ran a simple performance test.

Basically I created an instance of a leaf class that was the 5th derivation from a base class. This leaf class contained an empty virtual function and an empty normal function. I created an instance of the class as a pointer, and ran both functions in a loop 1 million times.

Virtual function time: 0.011080 seconds
Normal function time: 0.008243 seconds

FYI, if the object is a normal variable rather than a pointer, then the time is exactly the same.

My compiler: VC6


- Houdini

How do you look at the compiler output/browse it/find certain sections?

By 'virtual' function, do you mean one that has overridden a virtual one, but is no longer declared as virtual? If I override in the first child class down, can I override again in the next one down, even if the one in the first child isn't marked as virtual again? Are all versions 'virtual'...?

Thanks

The big performance penalty is when you use a base class pointer to a derived class and call an overridden function (all pure virtuals, and some virtuals).

Then again, the "big" performance penalty made my code about 200% slower on a 33MHz machine... I suppose it would only be about 6% slower on a 1000MHz machine... but still, that's 6%!

Yes, but do you think that most of the processing power your game takes up is in function calls? It's spent displaying graphics and calculating AI. In my tests (using a base class pointer to a derived class and calling an overridden function) you lose a whole 0.003 seconds (3 milliseconds) for every 1 million virtual function calls.

- Houdini

Guest Anonymous Poster
About the asm code presented above:

I don't know what compiler produced those results, but the this pointer should be passed in the ecx register, not on the stack.

Thus, if the eax register contains 'b', then push eax is replaced by mov ecx,eax.

Of course the compiler would probably have put 'b' in the ecx register in the first place (usually through a lea instruction).

quote:
Original post by Anonymous Poster
About the asm code presented above:

I don't know what compiler produced those results, but the this pointer should be passed in the ecx register, not on the stack.



That's assuming use of the MS thiscall calling convention, which is the MS default calling convention for member functions. The assembly output was created with Borland bcc32 version 5.4, which uses cdecl by default. However, whether arguments are passed on the stack or via registers is irrelevant. Whatever calling convention is used doesn't change the fact that there is an additional instruction and a double indirection over the expense of a non-virtual function call.

And that's just the basal level of overhead. Additionally, each of the indirections has an increased chance of causing a cache miss. The first indirection increases the probability of a cache miss with the number and size of the vtables referenced in a given working set. The second indirection increases the probability of a cache miss as the number of separate called functions and the length of each called function increase.

However, the cache miss penalty of the second indirection can be ignored, as this is the performance penalty for exploiting polymorphism, not a penalty inherent to the calling convention. So it's the first indirection that needs to be examined to determine if the virtual call is worth it. So if your benchmark is currently just performing 10,000 virtual function calls off the same pointer, it will probably look a little more grim if you try performing 10,000 virtual function calls off of twenty pointers, each to a different derived class, even if none of the derived classes override the original base function.

Guest Anonymous Poster
Magmai Kai Holmlor -- it's the same memory & call dereference when you do base->func() as when you do derived->func(), or even use a local copy of the derived: derive.func().

null_pointer - sorry, forgot to login! Yeah, I neglected that because I almost never use inline functions, but you're both right. If you're using a non-dereferenced object, then yeah, most modern compilers can inline that.

derived.func(); // probably inlined
derived->func(); // depends; probably NOT (because it's hard to say if another class is derived from this one)
base->func(); // can't inline, period



David

quote:

Magmai Kai Holmlor -- it's the same memory & call dereference when you do base->func() as when you do derived->func(), or even use a local copy of the derived: derive.func().



They're different when it's virtual.
And it's obviously not the same when you use -> as opposed to .
-> means dereference!
The difference between . and -> is the difference between:
mov ecx, ObjectOnStack
-and-
mov ecx, ObjectOnHeap
mov ecx, [ecx]


The difference when it's virtual:
Program Output:
B
B
S

    
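class CPureBase
{
public:
    virtual char foo() = 0;
};

class CBase
{
public:
    char foo() { return 'S'; }  // non-virtual
};


class B : public CBase, public CPureBase
{
public:
    virtual char foo() { return 'B'; }  // overrides CPureBase::foo, hides CBase::foo
};



#include <iostream>

int main()
{
    B b;
    CPureBase* pure = &b;
    CBase* base = &b;

    std::cout << b.foo() << std::endl;      // B::foo, resolved statically
    std::cout << pure->foo() << std::endl;  // virtual dispatch to B::foo
    std::cout << base->foo() << std::endl;  // CBase::foo is not virtual

    std::cin.get();  // wait for Enter (replaces the non-standard getche())
    return 0;
}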
