Quat

Instancing Performance

This topic is 1886 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


With D3D11 instancing, you stream the instance data into the vertex declaration via a secondary buffer. This can make the vertex structure look pretty big byte-wise (for example, four extra float4s just for the world matrix). I've read that you want to keep the byte size of vertex structures as low as possible to reduce memory bandwidth.

 

Is using a large vertex structure with instancing equivalent, in terms of memory bandwidth, to using the same large vertex structure without instancing? Or are GPUs efficient at assembling the vertices from the non-instanced and instanced data, so that I do not need to worry about this? I'm assuming it is efficient, since instancing is a recommended optimization, but I wanted to check.

 

The reason I'm a bit concerned is that I want to add generic instancing support to my engine but I'm wondering if there is too much overhead if the instance count is small. 
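
One rough way to frame the size question is to count the unique bytes that live in the buffers in each case. The C++ sketch below does only that arithmetic. The `MeshVertex` and `InstanceData` layouts and both function names are assumptions made up for illustration, and the numbers describe stored data, not measured bandwidth: actual fetch traffic depends on how the hardware caches the per-instance stream.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-vertex layout (stream 0). Field choice is an assumption.
struct MeshVertex {
    float position[3];
    float normal[3];
    float texcoord[2];
};  // 8 floats = 32 bytes

// Hypothetical per-instance layout (stream 1): the four float4 rows
// of the world matrix mentioned in the post.
struct InstanceData {
    float world[16];
};  // 16 floats = 64 bytes

// Bytes stored when the mesh is kept once and each instance adds one
// InstanceData record to a secondary buffer.
std::size_t instancedBytes(std::size_t vertexCount, std::size_t instanceCount) {
    assert(vertexCount > 0);
    return vertexCount * sizeof(MeshVertex) + instanceCount * sizeof(InstanceData);
}

// Bytes stored if the same "large" vertex (mesh data plus world matrix)
// were physically written out for every vertex of every copy.
std::size_t flattenedBytes(std::size_t vertexCount, std::size_t instanceCount) {
    assert(vertexCount > 0);
    return vertexCount * instanceCount * (sizeof(MeshVertex) + sizeof(InstanceData));
}
```

For a 1000-vertex mesh drawn 10 times, `instancedBytes(1000, 10)` is 32,640 bytes while `flattenedBytes(1000, 10)` is 960,000 bytes. With a single instance the instanced form adds only one 64-byte record on top of the mesh, so by this measure a small instance count does not inflate the stored data.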


You can try using large constant buffers, something like:

 

#define MAX_INSTANCE_CNT ????

cbuffer PerInstanceData
{
    matrix world[MAX_INSTANCE_CNT];
    float4 somethingDifferent[MAX_INSTANCE_CNT];
};

 

and later index into these arrays with SV_InstanceID.
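
The post leaves MAX_INSTANCE_CNT unspecified. As a minimal sketch of how an upper bound could be worked out, the C++ below mirrors the cbuffer on the CPU side and applies the D3D11 limit of 4096 float4 constants (64 KB) visible per constant buffer. The `Float4`, `Float4x4`, `kMaxInstanceCount`, and `batchCount` names are hypothetical, and the bound assumes the buffer holds nothing except the two arrays shown above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical CPU-side mirrors of HLSL float4 and matrix (float4x4).
struct Float4   { float x, y, z, w; };   // 16 bytes
struct Float4x4 { Float4 rows[4]; };     // 64 bytes

// D3D11 exposes at most 4096 16-byte constants per constant buffer.
constexpr std::size_t kMaxCBufferBytes = 4096 * 16;

// One world matrix plus one float4 per instance, as in the cbuffer above.
constexpr std::size_t kBytesPerInstance = sizeof(Float4x4) + sizeof(Float4);

// Largest instance count that fits, assuming no other cbuffer members.
constexpr std::size_t kMaxInstanceCount = kMaxCBufferBytes / kBytesPerInstance;

// Same member order as the HLSL cbuffer, so it can be copied as one block.
struct PerInstanceData {
    Float4x4 world[kMaxInstanceCount];
    Float4   somethingDifferent[kMaxInstanceCount];
};

static_assert(sizeof(PerInstanceData) <= kMaxCBufferBytes,
              "PerInstanceData must fit in one constant buffer");

// Number of draw calls needed when each call can carry at most
// kMaxInstanceCount instances in the constant buffer.
std::size_t batchCount(std::size_t instanceCount) {
    assert(kMaxInstanceCount > 0);
    return (instanceCount + kMaxInstanceCount - 1) / kMaxInstanceCount;
}
```

Under these assumptions each instance costs 80 bytes, so at most 819 instances fit in one buffer, and larger sets have to be split across several draws. An engine with more per-instance fields would get a smaller bound.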
