This topic is now archived and is closed to further replies.

genne

optimization of virtual functions

Hi! Do you think it's worth using virtual functions? They seem to be slower to call than non-virtual functions, but they are sometimes necessary to give your code a clean structure. For example:
class Renderer {
public:
    virtual void drawTriangle( const Triangle &t ) = 0;
    // ...

};

class GLRenderer : public Renderer {
public:
    void drawTriangle( const Triangle &t ) {...}
    // ...

};

class Renderable {
public:
    virtual void render( Renderer *r ) = 0;
};

class Some3dObject : public Renderable {
public:
    void render( Renderer *r ) {...}
};

int main() {
    Renderer *r = new GLRenderer();
    Some3dObject o;
    o.render( r );
}
If you call o.render( r ) every frame, you would lose a lot of speed, wouldn't you? If you instead skipped all the base classes and virtual functions, your code would run faster, but it would also get uglier. So, how would you solve this?

It's usually a lot better to make an interface that represents an entire strip, fan, or list of triangles. Both GL and D3D are good at drawing a whole bunch of polygons all at once.

I'm hip because I say "M$" instead of "MS".

A virtual function call costs one extra pointer dereference per call. That doesn't need to be optimized.

So the speed loss from calling virtual functions should be insignificant? Well, I think I could live with one extra pointer dereference per call if it lets me give the code a nicer structure. Thanks!

DirectX uses virtual functions for everything, and it isn't exactly considered slow. Consider the alternative:

if (opengl)
    draw_gl_triangles();
if (dx)
    draw_dx_triangles();
...

--------------------------------

"I'm a half time coder, full time Britney Spears addict"
Targhan

The speed loss occurs because your program needs to do a runtime type check whenever you use a virtual function.

So every time you call a virtual function, your program has to figure out which function to call using this runtime type check.

The quickest way to solve this would be to use function pointers. Is the speed loss due to virtual function calls important to you? That depends on your speed vs. code complexity tradeoff. Maybe in your case it doesn't matter. Maybe it does. It depends on the situation.

Well, virtual functions are not as expensive as they sound. For most compilers a call shouldn't take more than three memory fetches: one to get the address of the vtable, one to get the address of the function in the table, and one to fetch the object's offset in the enclosing object.

so if you have a function like
int buffer::operator[](int n) {return innerbufferdata[n];}
making the function virtual will effectively double its execution time (assuming all memory fetches take the same time). For example, if this is a pixel buffer that gets accessed thousands of times each frame, it is bad to have the function virtual. But in your case, where the drawTriangle etc. functions will themselves probably do hundreds or thousands of operations, the virtual function call overhead can be neglected, so there's nothing wrong with your design.



Well, in my case speed does matter, but I still want the code to be well structured. I'm coding a 3D engine, and every extra frame per second would be appreciated. Still, I'm not sure I would want to sacrifice my "nice looking code" if it only gave me one or two extra frames per second. Perhaps I should follow your advice using function pointers instead, might be a possible option!

quote:
Original post by Burning_Ice
so if you have a function like
int buffer::operator[](int n) {return innerbufferdata[n];}
making the function virtual will effectively double its execution time (assuming all memory fetches take the same time). For example, if this is a pixel buffer that gets accessed thousands of times each frame, it is bad to have the function virtual. But in your case, where the drawTriangle etc. functions will themselves probably do hundreds or thousands of operations, the virtual function call overhead can be neglected, so there's nothing wrong with your design.


Uhm, that's actually one of the virtual functions I'm using. It only retrieves an element in a private array. I'm using virtual functions almost everywhere in my code!

Well, in your case it probably wouldn't give you an extra frame per second, but perhaps an extra frame every couple of minutes, depending on how many triangles you render per second.

also, your part
Some3dObject o;
o.render( r );
probably won't have any overhead at all because the static type is known, so your compiler can drop the vtable lookup altogether. The lookup is only necessary when calling through a base pointer to a child object (Base* p = new child; ... ).
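To make the distinction concrete, here is a minimal sketch. The names Base, Derived, callOnObject, callThroughBase and callQualified are made up for illustration and are not from the code earlier in the thread. Whether a particular compiler really drops the lookup in the first case depends on the compiler and its optimization settings; the sketch only shows which calls are allowed to be resolved statically.

```cpp
// Hypothetical two-class hierarchy, only for illustration.
struct Base {
    virtual ~Base() {}
    virtual int value() const { return 1; }
};

struct Derived : Base {
    int value() const override { return 2; }
};

// The object is declared by value, so its dynamic type is known to be
// Derived. The compiler is allowed to call Derived::value directly.
int callOnObject() {
    Derived d;
    return d.value();
}

// Through a base reference the dynamic type is generally unknown, so this
// normally goes through the vtable.
int callThroughBase(const Base &b) {
    return b.value();
}

// A qualified call names one implementation explicitly and is never
// dispatched virtually.
int callQualified(const Derived &d) {
    return d.Base::value();
}
```

Calling callThroughBase with a Derived object still reaches Derived::value, while callQualified deliberately reaches Base::value on the same object.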


Edit: just read your last post. OK, then it is expensive, but really only if you call it many thousands of times each frame.





[edited by - Burning_Ice on July 6, 2003 11:37:35 AM]

Guest Anonymous Poster
"Perhaps should follow your advice using function pointers instead, might be a possible option!"

The whole point is that virtual functions pretty much are function pointers. If your object does not have any virtual inheritance (i.e., class CFoo : public virtual CBar), then a virtual function costs one extra indirection per function call.
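As a rough illustration of "pretty much are function pointers", this is a hand-written approximation of what a compiler typically generates for a class with one virtual function. It is only a sketch to show the mechanism, not a suggested replacement for virtual functions, and every name in it (Shape, ShapeVTable, callArea and so on) is hypothetical. The real layout is up to the compiler.

```cpp
struct Shape;  // forward declaration so the table can mention it

// One table per class, holding one function pointer per virtual function.
struct ShapeVTable {
    int (*area)(const Shape *self);
};

// Every object carries a hidden pointer to the table of its class.
struct Shape {
    const ShapeVTable *vptr;
    int w, h;
};

static int rectArea(const Shape *s)     { return s->w * s->h; }
static int triangleArea(const Shape *s) { return s->w * s->h / 2; }

static const ShapeVTable kRectVTable     = { rectArea };
static const ShapeVTable kTriangleVTable = { triangleArea };

// Roughly what "shape->area()" turns into: load the table pointer from the
// object, load the function pointer from the table, then call indirectly.
int callArea(const Shape *s) {
    return s->vptr->area(s);
}
```

The extra cost over a direct call is the pair of loads in callArea plus the indirect call itself.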

If you are using DirectX (or any other COM interfaces), every single method invocation is a virtual function call. The performance is relatively insignificant.

The most important thing to keep in mind: if you try to "emulate" virtual function calls with a switch block, if/else blocks, function pointers, etc., you will not be able to beat the virtual function support built into the compiler. Optimizing compilers still have restrictions they must take into account when dealing with user code (e.g., pointer aliasing). But since they are in complete control of all aspects of memory layout and execution in the case of virtual functions, they don't have to play conservatively with respect to this sort of stuff.

So use them.

Optimize your algorithm first. Virtual functions aren't bad, depending on where they are. Putting them inside an inner loop which will get pounded (like DrawTriangle) may cause a fair amount of overhead. The only way to know is to profile it.

I would suggest having a much higher abstraction for your renderer. Maybe a DrawObject() call that handles drawing the polys internally. This could cut the virtual function calls from hundreds of thousands per frame to a handful.
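A minimal sketch of that idea, assuming a hypothetical Triangle struct and a stand-in CountingRenderer that only counts what it is given. A real backend would pass the array on to GL or D3D instead. The point is that the number of virtual calls depends on the number of batches, not on the number of triangles.

```cpp
#include <cassert>
#include <cstddef>

struct Triangle { float v[9]; };  // hypothetical: three xyz vertices

class Renderer {
public:
    virtual ~Renderer() {}
    // One virtual call per batch instead of one per triangle.
    virtual void drawTriangles(const Triangle *tris, std::size_t count) = 0;
};

// Stand-in backend used only to make the call counts visible.
class CountingRenderer : public Renderer {
public:
    std::size_t calls = 0;
    std::size_t triangles = 0;

    void drawTriangles(const Triangle *tris, std::size_t count) override {
        assert(count == 0 || tris != nullptr);
        ++calls;
        triangles += count;
    }
};

// Hypothetical object-level draw: the whole mesh goes out in a single call.
void drawMesh(Renderer &r, const Triangle *tris, std::size_t count) {
    r.drawTriangles(tris, count);
}
```

Drawing a 1000-triangle mesh this way costs one virtual call, where a per-triangle interface would cost 1000.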

There should be no reason for the user to switch renderers during play, so why not just build different versions of the executable for different renderers? That way you don't need function pointers or if statements, because there won't be any dynamic functions to resolve.

e.g.


#if GLRENDERER
class Renderable {
public:
    void render( Renderer *r )
    {
        // GL renderer code
    }
};
#else
class Renderable {
public:
    void render( Renderer *r )
    {
        // DirectX renderer code
    }
};
#endif

Or have each renderer in a DLL and load the right DLL on startup. This would be more elegant than having different .exes (and is used by quite a few games).


quote:
Original post by Burning_Ice
also, your part
Some3dObject o;
o.render( r );
probably won't have any overhead at all because the static type is known, so your compiler can drop the vtable lookup altogether. The lookup is only necessary when calling through a base pointer to a child object (Base* p = new child; ... ).

Edit: just read your last post. OK, then it is expensive, but really only if you call it many thousands of times each frame.


The thing is that at least my compiler (VC++ .NET) seems to do a lookup in the vtable even if the static type is known. I profiled the following code using DevPartner and realized, to my disappointment, that functions overriding virtual functions in a base class are slower than functions that don't override anything:

class Base {
public:
    virtual int &operator[]( unsigned int i ) = 0;
};
class A : public Base {
    int data[100];
public:
    int &operator[]( unsigned int i ) { return data[i]; }
};
class B {
    int data[100];
public:
    int &operator[]( unsigned int i ) { return data[i]; }
};
int main() {
    A a; B b;
    int x;
    x = a[1]; // overrides a virtual function -> slow
    x = b[1]; // no override -> fast
}

According to DevPartner, x = a[1]; is approximately six times slower than x = b[1];. Strange, isn't it?

AP: I believe I'm convinced to use them, yes.

BrianL: The Renderer code I gave in my first post was just a bad example.

seanw: That's another possible solution, but not a very nice looking one, is it? If virtual functions don't take more time than a pointer dereference, I think I'll stick with the virtual functions!

quote:
Original post by Burning_Ice
Or have each renderer in a DLL and load the right DLL on startup. This would be more elegant than having different .exes (and is used by quite a few games).





Aah, a great idea

Guest Anonymous Poster
The fetch for the object base address offset can be optimized out with compiler smarts and static knowledge at the call site (assuming you don't use virtual inheritance). MSVC does that. However, GCC does not; it uses the third fetch.

However, the third fetch is pretty much free, because the offset lives in the same cache line as the function address that you just fetched. Also, the address of the vtable is likely to already be in cache, or at least to be needed in cache, because it lives among the member variables of the object, which you'll presumably use in the function.

Thus, the amortized cost of a virtual function call is likely one cache miss, and possibly a TLB miss to go with it, compared to a straight non-virtual call. There are some prediction problems with indirect calls, too (function pointers and virtual calls), but if your code calls a function at a point where scheduling/prediction is a problem, you're using the wrong tool for the job.

Basically, you should use virtuals to structure your code nicely. You should design your API so that each call does as much work as is useful and necessary, and avoid unnecessary extra calls. No virtual calls in the inner loop! For example: it's perfectly fine to have a virtual call to "lock" a vertex array and get a pointer you can write N vertices into. It's not a great idea to have a virtual call to add a single vertex to the array. Most renderers have a virtual call saying "draw this vertex array/triangle list with this shader", but it would be really slow to have a call saying "draw this individual triangle" (or, worse, "issue this individual vertex").
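The "lock" pattern could look something like the sketch below. VertexArray, MemoryVertexArray and fillLine are hypothetical names, and the backend here simply keeps the vertices in a std::vector in system memory; a real one would presumably map a GL or D3D buffer instead. The inner loop contains only plain stores, with one virtual call before it and one after.

```cpp
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };

class VertexArray {
public:
    virtual ~VertexArray() {}
    // One virtual call hands out room for `count` vertices...
    virtual Vertex *lock(std::size_t count) = 0;
    // ...and one more tells the backend that writing is finished.
    virtual void unlock() = 0;
};

// Stand-in backend that stores the vertices in system memory.
class MemoryVertexArray : public VertexArray {
public:
    Vertex *lock(std::size_t count) override {
        data_.resize(count);
        return data_.data();
    }
    void unlock() override {}

    std::size_t size() const { return data_.size(); }
    const Vertex &at(std::size_t i) const { return data_.at(i); }

private:
    std::vector<Vertex> data_;
};

// Writes n vertices along the x axis using exactly two virtual calls.
void fillLine(VertexArray &va, std::size_t n) {
    Vertex *v = va.lock(n);
    for (std::size_t i = 0; i < n; ++i) {
        v[i].x = static_cast<float>(i);
        v[i].y = 0.0f;
        v[i].z = 0.0f;
    }
    va.unlock();
}
```

An interface with a virtual addVertex() would instead pay the indirect call once per vertex.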

Let's say I have a render interface like this one:


// ==================================

class CRasterizerInterface
{
public:
CRasterizerInterface() : m_StateChanges(0) {}
virtual ~CRasterizerInterface() {}

// get/set current renderstate

finline RenderState GetCurrentRenderState() { return m_RenderState; }
finline void SetRenderState(RenderState state) { m_RenderState = state; }
finline int GetStateChanges() { return m_StateChanges; }

// || virtual functions ||

// -----------------------


// start/stop rendering

virtual void BeginnRendering() = 0;
virtual void EndRendering() = 0;

// en/disable render stage/function

virtual void Enable(PiplineStage stage) = 0;
virtual void Disable(PiplineStage stage) = 0;

// matrix operations

virtual void SetIdentityMatrix() = 0;
virtual void SetWorldMatrix(const Matrix4 &m) = 0;
virtual void SetViewMatrix(const Matrix4 &m) = 0;
virtual void SetProjectionMatrix(const Matrix4 &m) = 0;

virtual void PopMatrix() = 0;
virtual void PushMatrix() = 0;

// basic matrix operations

virtual void Scale(float x, float y, float z) = 0;
virtual void Translation(float x, float y, float z) = 0;
virtual void Rotation(float angle, float x, float y, float z) = 0;
virtual void MultiplyMatrix(const Matrix4x4& m) = 0;

virtual Matrix4x4 GetCurrentMatrix() = 0;

// basic stuff

virtual void Clear(bool Screen = true, bool DepthBuffer = true, bool StencilBuffer = false) = 0;
virtual void SetClearColor(float r, float g, float b, float a) = 0;
virtual void SetClearColor(Color c) = 0;
virtual void SetColor(float r, float g, float b, float a) = 0;
virtual void SetColor(Color c) = 0;

// blending

virtual void SetBlendingFunction(Blendfactor sfactor, Blendfactor dfactor) = 0;

// shade mode

virtual void SetShadeMode(ShadeMode mode) = 0;

// fill mode

virtual void SetFillMode(FillMode mode) = 0;

// texture filter

virtual void SetTextureFilter(TextureFilter filter) = 0;

// depth buffer

virtual void SetDepthBuffer(DepthComparison func, bool WriteEnabled = true, float ClearDepth = 1.0f) = 0;

// set each depth buffer option individually

virtual void SetDepthBufferFunction(DepthComparison func) = 0;
virtual void SetDepthBufferWriteEnabled(bool enabled = true) = 0;
virtual void SetDepthBufferClearDepth(float value) = 0;

// stencil buffer

virtual void SetStencilBuffer(StencilComparison func, int refvalue, int maksvalue,
StencilOperation fail, StencilOperation zfail, StencilOperation zpass) = 0;

// set each stencil buffer option individually

virtual void SetStencilBufferFunction(StencilComparison func) = 0;
virtual void SetStencilBufferReferenceValue(int value) = 0;
virtual void SetStencilBufferMask(int value) = 0;
virtual void SetStencilBufferOperations(StencilOperation fail, StencilOperation zfail, StencilOperation zpass) = 0;
virtual void SetStencilBufferFailOperation(StencilOperation fail) = 0;
virtual void SetStencilBufferDepthFailOperation(StencilOperation zfail) = 0;
virtual void SetStencilBufferPassOperation(StencilOperation pass) = 0;

// culling

virtual void SetCullingMode(CullingMode mode) = 0;

// rendering

virtual void Render(PrimitiveType type, VertexBuffer vb, unsigned int start, unsigned int count) = 0;

protected:
RenderState m_RenderState;
int m_StateChanges;
};


And on top of this I build an OpenGL renderer and a DirectX renderer. Would this be optimal? Is it a good way to do it?
