optimization of virtual functions

Started by
16 comments, last by genne 20 years, 9 months ago
""" Perhaps should follow your advice using function pointers instead, might be a possible option! """

The whole point is that virtual functions pretty much are function pointers. If your object does not use virtual inheritance (i.e., class CFoo : public virtual CBar), then a virtual function costs one extra indirection per call.
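To illustrate the "virtual functions pretty much are function pointers" point, here is a minimal sketch that puts a compiler-generated virtual call next to a hand-written equivalent. All names (Shape, ManualShape, ManualVTable and so on) are hypothetical and only for illustration; real compilers lay out their tables in implementation-specific ways, so treat this as a rough model rather than what any particular compiler emits.

```cpp
// Sketch only: every name in this block is hypothetical.

// Version 1: let the compiler build the dispatch table.
struct Shape {
    virtual ~Shape() {}
    virtual int sides() const = 0;
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

// Version 2: roughly the same mechanism, written by hand.
struct ManualShape;

struct ManualVTable {
    int (*sides)(const ManualShape*);  // one function pointer per "virtual" function
};

struct ManualShape {
    const ManualVTable* vtable;        // the pointer a compiler would hide inside the object
};

inline int manualTriangleSides(const ManualShape*) { return 3; }

const ManualVTable kManualTriangleVTable = { &manualTriangleSides };

// Both calls load a table pointer from the object, load a function
// pointer from the table, and then call through it.
inline int callVirtual(const Shape& s) { return s.sides(); }
inline int callManual(const ManualShape& s) { return s.vtable->sides(&s); }
```

The hand-written version does the same loads as the built-in one, which is why emulating virtual dispatch by hand is unlikely to gain anything.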

If you are using DirectX (or any other COM interfaces), every single method invocation is a virtual function call. The performance cost is relatively insignificant.

The most important thing to keep in mind: if you try to "emulate" virtual function calls with a switch block, if/else blocks, function pointers, etc., you will not be able to beat the virtual function support built into the compiler. Optimizing compilers still have restrictions they must take into account when dealing with user code (e.g., pointer aliasing). But since they are in complete control of all aspects of memory layout and execution in the case of virtual functions, they don't have to play conservatively with respect to this sort of stuff.

So use them.
Optimize your algorithm first. Virtual functions aren't bad, depending on where they are. Putting them inside an inner loop which will get pounded (like DrawTriangle) may cause a fair amount of overhead. The only way to know is to profile it.

I would suggest having a much higher abstraction for your renderer. Maybe a DrawObject() call, with it handling drawing the polys internally. This could cut the virtual function calls from hundreds of thousands per frame to a handful.

There should be no reason for the user to switch renderers during play, so why not just make different versions of the executable file for different renderers? That way you don't need to use function pointers or if statements, because there won't be any dynamic functions to resolve.

e.g.

#if GLRENDERER
class Renderable {
public:
    void render( Renderer *r )
    {
        // GL renderer code
    }
};
#else
class Renderable {
public:
    void render( Renderer *r )
    {
        // DirectX renderer code
    }
};
#endif
Or have each renderer in a DLL and load the right DLL on startup. This would be more elegant than having different .exes (and is used by quite a few games).


quote:Original post by Burning_Ice
also, your part
Some3dObject o;
o.render( r );
probably won't have any overhead at all, because the static type is known, so your compiler can drop the vtable lookup altogether. That lookup is only necessary when calling through a base pointer to a child object (Base* p = new child; ... )
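A minimal sketch of the two call shapes being contrasted here. Renderable and Some3dObject are hypothetical stand-ins, not the original poster's classes. Whether a given compiler really skips the vtable lookup in the first case depends on the compiler and its optimization settings, so this shows where the opportunity exists, not a guarantee.

```cpp
// Sketch only: hypothetical classes for illustration.
struct Renderable {
    virtual ~Renderable() {}
    virtual int render() { return 0; }
};

struct Some3dObject : Renderable {
    int render() override { return 1; }
};

// The object is declared right here, so its exact type is known at
// compile time. The compiler is allowed to call Some3dObject::render
// directly, without going through the vtable.
inline int renderByValue() {
    Some3dObject o;
    return o.render();
}

// Through a base pointer the dynamic type is unknown in general,
// so this call normally goes through the vtable.
inline int renderThroughBase(Renderable* p) {
    return p->render();
}
```

Both calls give the same result; only the dispatch mechanism the compiler may choose differs.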


Edit: just read your last post: ok, then it is expensive, but really only if you call it many thousand times each frame





[edited by - Burning_Ice on July 6, 2003 11:37:35 AM]


The thing is that at least my compiler (VC++ .NET) seems to do a lookup in the vtable even if the static type is known. I profiled the following code using DevPartner, and realized to my disappointment that functions overriding virtual functions in a base class are slower than functions not overriding virtual functions:
class Base {
public:
    virtual int &operator[]( unsigned int i ) = 0;
};

class A : public Base {
    int data[100];
public:
    int &operator[]( unsigned int i ) { return data[i]; }
};

class B {
    int data[100];
public:
    int &operator[]( unsigned int i ) { return data[i]; }
};

int main() {
    A a; B b;
    int x;
    x = a[1]; // overrides virtual function -> slow
    x = b[1]; // no overriding -> fast
}

According to DevPartner, x = a[1]; is approximately 6 times slower than x = b[1];. Strange, isn't it?

AP: I believe I'm convinced to use them, yes

BrianL: The Renderer code I gave in my first post, was just a bad example

seanw: That's another possible solution, but not very nice looking, is it? If the virtual functions don't take up more time than a pointer dereference, I think I'll stick with the virtual functions!
quote:Original post by Burning_Ice
or having each renderer in a DLL and loading the right DLL on startup, this would be more elegant than having different .exes (and is used by quite a few games)




Aah, a great idea
The fetch for the object base address offset can be optimized out with compiler smarts and static knowledge at the call site (assuming you don't use virtual inheritance). MSVC does that. However, GCC does not; it uses the third fetch.

However, the third fetch is pretty much free, because the offset lives in the same cache line as the function address that you just fetched. Also, the address of the vtable is likely to already be in cache, or at least need to be in cache, because it lives among member variables of the object which you'll presumably use in the function.

Thus, the amortized cost of a virtual function call is likely one cache miss, and possibly a TLB miss to go with it, compared to a straight non-virtual call. There are some prediction problems with indirect calls, too (function pointers and virtual calls), but if your code calls a function at a point where scheduling/prediction is a problem, you're using the wrong tool for the job.

Basically, you should use virtuals to structure your code nicely. You should design your API so that each call does as much work as is useful and necessary, and avoid unnecessary extra calls. No virtual calls in the inner loop! For example: it's perfectly fine to have a virtual call to "lock" a vertex array and get a pointer you can write N vertices into. It's not a great idea to have a virtual call to add a single vertex to the array. Most renderers have a virtual call saying "draw this vertex array/triangle list with this shader", but it would be really slow to have a call saying "draw this individual triangle" (or, worse, "issue this individual vertex").
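As a rough sketch of the "lock a vertex array" advice, the interface below offers both a per-vertex virtual call and a coarse-grained lock/unlock pair, and counts how many virtual calls each style makes. VertexSink, CountingSink and the helper functions are hypothetical names invented for this example; a real renderer would write into GPU-visible memory rather than a std::vector.

```cpp
#include <cstddef>
#include <vector>

// Sketch only: all names here are hypothetical.
struct Vertex { float x, y, z; };

struct VertexSink {
    virtual ~VertexSink() {}
    // Fine-grained: one virtual call per vertex.
    virtual void addVertex(const Vertex& v) = 0;
    // Coarse-grained: one virtual call hands out room for n vertices.
    virtual Vertex* lock(std::size_t n) = 0;
    virtual void unlock() = 0;
};

// Test implementation that counts how often it is entered virtually.
struct CountingSink : VertexSink {
    std::vector<Vertex> data;
    int virtualCalls = 0;

    void addVertex(const Vertex& v) override {
        ++virtualCalls;
        data.push_back(v);
    }
    Vertex* lock(std::size_t n) override {
        ++virtualCalls;
        data.resize(data.size() + n);
        return data.data() + (data.size() - n);
    }
    void unlock() override { ++virtualCalls; }
};

// n virtual calls for n vertices.
inline void fillOneByOne(VertexSink& sink, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        Vertex v = { static_cast<float>(i), 0.0f, 0.0f };
        sink.addVertex(v);
    }
}

// Two virtual calls in total; the inner loop is plain memory writes.
inline void fillLocked(VertexSink& sink, std::size_t n) {
    Vertex* out = sink.lock(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i].x = static_cast<float>(i);
        out[i].y = 0.0f;
        out[i].z = 0.0f;
    }
    sink.unlock();
}
```

With 1000 vertices, fillOneByOne goes through the virtual interface 1000 times while fillLocked goes through it twice, which is the difference the post is pointing at.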
Let's say I have a render interface like this one:

// ==================================
class CRasterizerInterface
{
public:
    CRasterizerInterface() : m_StateChanges(0) {}
    virtual ~CRasterizerInterface() {}

    // get/set current renderstate
    finline RenderState GetCurrentRenderState() { return m_RenderState; }
    finline void SetRenderState(RenderState state) { m_RenderState = state; }
    finline int GetStateChanges() { return m_StateChanges; }

    // || virtual functions ||
    // -----------------------

    // start/stop rendering
    virtual void BeginnRendering() = 0;
    virtual void EndRendering() = 0;

    // en/disable render stage/function
    virtual void Enable(PiplineStage stage) = 0;
    virtual void Disable(PiplineStage stage) = 0;

    // matrix operations
    virtual void SetIdentityMatrix() = 0;
    virtual void SetWorldMatrix(const Matrix4 &m) = 0;
    virtual void SetViewMatrix(const Matrix4 &m) = 0;
    virtual void SetProjectionMatrix(const Matrix4 &m) = 0;
    virtual void PopMatrix() = 0;
    virtual void PushMatrix() = 0;

    // basic matrix operations
    virtual void Scale(float x, float y, float z) = 0;
    virtual void Translation(float x, float y, float z) = 0;
    virtual void Rotation(float angle, float x, float y, float z) = 0;
    virtual void MultiplyMatrix(const Matrix4x4& m) = 0;
    virtual Matrix4x4 GetCurrentMatrix() = 0;

    // basic stuff
    virtual void Clear(bool Screen = true, bool DepthBuffer = true, bool StencilBuffer = false) = 0;
    virtual void SetClearColor(float r, float g, float b, float a) = 0;
    virtual void SetClearColor(Color c) = 0;
    virtual void SetColor(float r, float g, float b, float a) = 0;
    virtual void SetColor(Color c) = 0;

    // blending
    virtual void SetBlendingFunction(Blendfactor sfactor, Blendfactor dfactor) = 0;

    // shade mode
    virtual void SetShadeMode(ShadeMode mode) = 0;

    // fill mode
    virtual void SetFillMode(FillMode mode) = 0;

    // texture filter
    virtual void SetTextureFilter(TextureFilter filter) = 0;

    // depth buffer
    virtual void SetDepthBuffer(DepthComparison func, bool WriteEnabled = true,
                                float ClearDepth = 1.0f) = 0;

    // set each depth buffer option alone
    virtual void SetDepthBufferFunction(DepthComparison func) = 0;
    virtual void SetDepthBufferWriteEnabled(bool enabled = true) = 0;
    virtual void SetDepthBufferClearDepth(float value) = 0;

    // stencil buffer
    virtual void SetStencilBuffer(StencilComparison func, int refvalue, int maksvalue,
                                  StencilOperation fail, StencilOperation zfail, StencilOperation zpass) = 0;

    // set each stencil buffer option alone
    virtual void SetStencilBufferFunction(StencilComparison func) = 0;
    virtual void SetStencilBufferReferenceValue(int value) = 0;
    virtual void SetStencilBufferMask(int value) = 0;
    virtual void SetStencilBufferOperations(StencilOperation fail, StencilOperation zfail, StencilOperation zpass) = 0;
    virtual void SetStencilBufferFailOperation(StencilOperation fail) = 0;
    virtual void SetStencilBufferDepthFailOperation(StencilOperation zfail) = 0;
    virtual void SetStencilBufferPassOperation(StencilOperation pass) = 0;

    // culling
    virtual void SetCullingMode(CullingMode mode) = 0;

    // rendering
    virtual void Render(PrimitiveType type, VertexBuffer vb, unsigned int start, unsigned int count) = 0;

protected:
    RenderState m_RenderState;
    int         m_StateChanges;
};


And on top of this I make an OpenGL and a DirectX renderer. Would this be optimal? A good way?

This topic is closed to new replies.
