daedalic

Instancing

Is shader instancing best, or is using two streams the best idea?

Also, how do you access data from other streams in HLSL?

Thanks!

Quote:
Original post by daedalic
Is shader instancing best, or is using two streams the best idea?

The term "shader instancing" is a bit counter-intuitive to me (it is far from being a "true" instancing technique), so I want to point to two references: this from gamedev and this from MSDN.

Now, to summarize the two briefly:

  • True instancing (in D3D9) reduces the batch count through a dedicated mechanism that fetches the vertex shader's input attributes at a different rate for each stream. Each index is fetched normally and resolves to a set of vertex attributes according to the "stream divider" or "stream frequency".
    This makes it possible to draw a variable number of instances with limited effort.
    Because stream frequency is a per-stream property, this requires one extra stream.

  • Shader instancing is a data trick that really produces one long batch. The vertex attributes are resolved normally.
    This implies everything could be put in a single stream, as the "instance" data has to be replicated manually. Drawing a variable number of instances requires more work and more bandwidth, but it's still a very good deal as the vertex count is low to start with.
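
One way to picture the stream-frequency fetch described in the first bullet is to emulate it on the CPU. The sketch below is not D3D9 code: the `Vertex`, `Instance`, `ShaderInput`, and `emulateInstancedFetch` names are made up for illustration, and it assumes stream 0 holds indexed geometry while stream 1 holds one element per instance. In real D3D9 this setup is configured with `IDirect3DDevice9::SetStreamSourceFreq` and the `D3DSTREAMSOURCE_INDEXEDDATA` / `D3DSTREAMSOURCE_INSTANCEDATA` flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layouts: per-vertex geometry (stream 0) and per-instance data (stream 1).
struct Vertex   { float x, y, z; };
struct Instance { float offsetX, offsetY, offsetZ; };

// What the vertex shader would receive for one vertex of one instance.
struct ShaderInput { Vertex v; Instance inst; };

// CPU emulation of the attribute fetch for a single instanced draw call.
// Stream 0 is walked through the index buffer once per instance;
// stream 1 advances by one element per instance (divider of 1).
std::vector<ShaderInput> emulateInstancedFetch(
    const std::vector<Vertex>& geometry,
    const std::vector<unsigned>& indices,
    const std::vector<Instance>& instances,
    std::size_t instanceCount)
{
    assert(instanceCount <= instances.size());
    std::vector<ShaderInput> out;
    out.reserve(instanceCount * indices.size());
    for (std::size_t i = 0; i < instanceCount; ++i) {
        for (unsigned idx : indices) {
            assert(idx < geometry.size());
            // Same geometry for every instance, different instance data.
            out.push_back({geometry[idx], instances[i]});
        }
    }
    return out;
}
```

Note that the geometry and index buffers are stored once; only `instanceCount` changes from draw to draw, which is why a variable number of instances is cheap with this approach.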

Now, if you think about it, you'll see the two are not really comparable. The longer batch produced by shader instancing is really a standard batch, which means vertex processing can work more efficiently.
In theory, the same could be said of true instancing, but that was not the case in practice. As the complexity of the instanced mesh grows, the benefit of true instancing decreases. I think this is the last document talking about D3D9...
You can see there that long batches give better performance than the same number of vertices drawn with "true" instancing.

I personally don't much like the idea of shader instancing (although I have used a variation of it based on uniforms instead of varyings).
Anyway, as you can see, there's a tradeoff between the two.
Quote:
Original post by daedalic
how do you access data from other streams in HLSL?
You don't... need to care. The API fetches the vertex attributes for you according to the stream frequency and the vertex declaration, so in the shader they show up as ordinary vertex inputs.

If we're talking D3D9 here, then "shader instancing" has a few disadvantages:

1. Your batch size is limited by how much instance data you can fit in the 256 constant registers available to the vertex shader

2. You have to extend both your vertex buffer and your index buffer, and also add an instance index to each vertex

3. Indexing into constant registers causes poor vertex shader performance.
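
To make points 1 and 2 concrete, here is a minimal CPU-side sketch of the data preparation they imply. It is only an illustration under stated assumptions: the struct and function names are hypothetical, and the register budget passed in (how many of the 256 registers are free, and how many each instance needs) depends entirely on your own shader.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layouts for illustration only.
struct SourceVertex    { float x, y, z; };
struct InstancedVertex { float x, y, z; float instanceIndex; };

struct Batch {
    std::vector<InstancedVertex> vertices;
    std::vector<unsigned> indices;
};

// Point 1: the constant-register budget caps the number of instances per batch.
std::size_t maxInstancesPerBatch(std::size_t freeRegisters,
                                 std::size_t registersPerInstance)
{
    assert(registersPerInstance > 0);
    return freeRegisters / registersPerInstance;
}

// Point 2: replicate the mesh 'copies' times, tag each vertex with the
// index of the copy it belongs to, and offset the indices accordingly.
Batch buildShaderInstancingBatch(const std::vector<SourceVertex>& mesh,
                                 const std::vector<unsigned>& meshIndices,
                                 std::size_t copies)
{
    Batch b;
    b.vertices.reserve(copies * mesh.size());
    b.indices.reserve(copies * meshIndices.size());
    for (std::size_t i = 0; i < copies; ++i) {
        const unsigned base = static_cast<unsigned>(i * mesh.size());
        for (const SourceVertex& v : mesh)
            b.vertices.push_back({v.x, v.y, v.z, static_cast<float>(i)});
        for (unsigned idx : meshIndices)
            b.indices.push_back(base + idx);
    }
    return b;
}
```

For example, if a shader left 240 registers free and each instance used 4 of them (say, one 4x4 matrix), `maxInstancesPerBatch(240, 4)` would give 60 instances per draw call. Those two numbers are assumptions chosen for the example, not figures from this thread. The vertex shader would then use `instanceIndex` to look up its constants, which is the indexing that point 3 refers to.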

You can mostly get around 1 and 3 by using a texture instead of shader constants, but only for hardware that has decent vertex texturing performance (DX10 class or better). And any hardware that has good vertex texturing performance supports hardware instancing, so you might as well use that instead.

another consideration is re-use of instance data

if you need to re-use the same instance data for other things, different draw calls / effects etc.

a vertex texture is accessible at no extra cost and only requires incremental updates (via dup calls)

shader constants have to be re-sent for every effect change and all data has to be sent every time

in one game I use the same instance data to draw: the alien, its shadows, two hud symbols, floating score, status information and two particle systems
with up to 200 aliens visible that would get expensive any other way

using a dynamic vertex buffer for instance data is possible but suffers from issues related to locking
great for circular systems like explosions where you can wait 3 frames before updating etc.

I have found vertex texture based instancing to be universally useful and never use anything else

for many types of object the bandwidth savings enabled by incremental updates and data re-use far outweigh any cost of VTF
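
a back-of-the-envelope sketch of that bandwidth argument, with made-up numbers: the function and struct names are hypothetical, and the bytes per instance and the number of instances that change per frame are assumptions you would replace with your own measurements. it models shader constants as being re-sent in full for every pass, and a vertex texture as only receiving the instances that changed.

```cpp
#include <cassert>
#include <cstddef>

// Estimated bytes uploaded per frame under each scheme (rough model only).
struct FrameCost {
    std::size_t constantsBytes; // all instance data re-sent for every pass
    std::size_t textureBytes;   // only changed instances written once
};

FrameCost estimateUploadBytes(std::size_t instances,
                              std::size_t bytesPerInstance,
                              std::size_t passes,
                              std::size_t changedInstances)
{
    assert(changedInstances <= instances);
    FrameCost c;
    c.constantsBytes = instances * bytesPerInstance * passes;
    c.textureBytes   = changedInstances * bytesPerInstance;
    return c;
}
```

with 200 instances, an assumed 64 bytes each, 8 passes (one way to count the uses listed above), and an assumed 50 instances changing per frame, the model gives 200 × 64 × 8 = 102,400 bytes for constants versus 50 × 64 = 3,200 bytes for the texture. it ignores the per-fetch cost of VTF in the shader, so treat it as an upload estimate only.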
