turanszkij

DX11 Why are we still using index/vertex/instance buffers?


There are a number of options now in each API for storing data on the GPU. Specifically in DX11, we have buffers that can be bound as vertex buffers, index buffers, constant buffers or shader resources (structured buffers, for example). Constant buffers have the most limitations, and I think that's because they are heavily optimized for non-random access (an optimization we have no control over). Vertex buffers and index buffers, however, have so few limitations compared to shader resource buffers that I question their value.

For example, the common way of drawing geometry is to provide a vertex buffer (and maybe an instance buffer) through a specific call to IASetVertexBuffers, and an index buffer through another specific call. At that point we also have to provide an input layout. That is significantly more management overhead than providing the vertex and index data through shader resources and indexing them with system values (e.g. SV_VertexID) in the shader.
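A minimal sketch of what I have in mind, in HLSL (the vertex struct layout, register slots and names here are just placeholders):

// Vertex and instance data bound as shader resources; no input layout,
// no IASetVertexBuffers. Layout and names are hypothetical.
struct MeshVertex
{
    float3 position;
    float3 normal;
    float2 uv;
};

StructuredBuffer<MeshVertex> vertexBuffer   : register(t0); // one element per vertex
StructuredBuffer<float4x4>   instanceBuffer : register(t1); // one world matrix per instance

cbuffer Camera : register(b0)
{
    float4x4 viewProjection;
};

struct VSOutput
{
    float4 position : SV_Position;
    float3 normal   : NORMAL;
    float2 uv       : TEXCOORD0;
};

VSOutput main(uint vertexID : SV_VertexID, uint instanceID : SV_InstanceID)
{
    MeshVertex v   = vertexBuffer[vertexID];      // manual vertex fetch
    float4x4 world = instanceBuffer[instanceID];  // manual instance fetch

    VSOutput o;
    o.position = mul(mul(float4(v.position, 1.0f), world), viewProjection);
    o.normal   = mul(v.normal, (float3x3)world);
    o.uv       = v.uv;
    return o;
}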

Now, I haven't actually tried doing vertex buffer management this way, but I am looking forward to it if no one points out the faults in my way of thinking.


Thank you! I actually didn't think of keeping a dedicated index buffer, but I see now that it still has value. What also interests me is that this way you can easily handle hard-edged normals and UV discontinuities without duplicating position data. I am already using deinterleaved vertex buffers (for more efficient shadow rendering/z-prepass), so implementing that should not be very hard.
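A rough sketch of what I mean, with per-attribute index streams so a position can be shared across a hard edge or UV seam (the buffer names and layout are made up):

// Deinterleaved attribute buffers, each addressed by its own index per corner.
StructuredBuffer<float3> positions  : register(t0); // unique positions only
StructuredBuffer<float3> normals    : register(t1); // unique normals
StructuredBuffer<float2> texcoords  : register(t2); // unique UVs
StructuredBuffer<uint3>  cornerRefs : register(t3); // x = position index, y = normal index, z = uv index

cbuffer Camera : register(b0)
{
    float4x4 worldViewProjection;
};

struct VSOutput
{
    float4 position : SV_Position;
    float3 normal   : NORMAL;
    float2 uv       : TEXCOORD0;
};

VSOutput main(uint vertexID : SV_VertexID)
{
    uint3 refs = cornerRefs[vertexID];

    VSOutput o;
    o.position = mul(float4(positions[refs.x], 1.0f), worldViewProjection);
    o.normal   = normals[refs.y];
    o.uv       = texcoords[refs.z];
    return o;
}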

Oh, and something to keep in mind: graphics debuggers (at least Nsight) cannot visualize geometry without an input layout, which is certainly a downside of this approach.


BTW, since it wasn't explicitly stated: what you do is bind a null vertex buffer. This allows you to generate vertices procedurally, or fetch them manually using the SV_VertexID and SV_InstanceID system values. It is documented here:

https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau

Starting at page seven.
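The classic illustration of the idea is a fullscreen triangle generated entirely from SV_VertexID, drawn with Draw(3, 0) and nothing bound to the input assembler (this particular snippet is just a common example of the technique, not taken from the slides):

// Fullscreen triangle with no vertex buffer, index buffer or input layout bound.
struct VSOutput
{
    float4 position : SV_Position;
    float2 uv       : TEXCOORD0;
};

VSOutput main(uint vertexID : SV_VertexID)
{
    VSOutput o;
    // Produces UVs (0,0), (2,0), (0,2), i.e. clip-space positions (-1,1), (3,1), (-1,-3),
    // which cover the whole screen with a single clockwise triangle.
    o.uv = float2((vertexID << 1) & 2, vertexID & 2);
    o.position = float4(o.uv * float2(2.0f, -2.0f) + float2(-1.0f, 1.0f), 0.0f, 1.0f);
    return o;
}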

At the end of the day, a vertex buffer or index buffer is just another buffer with semantics attached. As pointed out above, I think vertex caching is one of the biggest reasons for still having this distinction; without it, the API would have to be able to flag a generic buffer as being cacheable.


MJP's point 2 is the most important: you need an index buffer in order to benefit from the post-transform vertex cache. Besides that, you could, if you only target recent hardware, go SoA (i.e. not interleave your vertex data) and fetch manually; that's what happens on any GCN hardware anyway.

As also mentioned by MJP, NVIDIA hardware works differently (not sure about the latest generation); since all consoles are GCN, we tend to optimise for it...


Ugh, I implemented it in my engine for every scene mesh render pass, and it performs significantly worse on my GTX 1070 than using regular vertex buffers. I was rendering shadows for 6 point lights on the Sponza scene in 2 ms, and the custom vertex fetch moves that up to 11 ms (which is insane). The Z-prepass went from 0.2 ms up to 0.4 ms. These passes use deinterleaved position, (sometimes) texcoord, and instance buffers.

The vertex buffers are float4 buffers that I create as shader resources with DXGI_FORMAT_R32G32B32A32_FLOAT views. In the shader I declare them as Buffer<float4>. The instance buffers are structured buffers holding 4x4 float matrices.
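The depth-only vertex shader for these passes looks roughly like this (the register slots and the camera constant buffer are simplified placeholders):

// Position stream as a typed float4 buffer, instance transforms as a
// structured buffer; texcoords are only bound where the pass needs them.
Buffer<float4>             positionBuffer : register(t0); // DXGI_FORMAT_R32G32B32A32_FLOAT view
StructuredBuffer<float4x4> instanceBuffer : register(t1); // per-instance world matrix

cbuffer Camera : register(b0) // placeholder constant buffer
{
    float4x4 viewProjection;
};

float4 main(uint vertexID : SV_VertexID, uint instanceID : SV_InstanceID) : SV_Position
{
    float4 position = positionBuffer[vertexID];    // manual vertex fetch
    float4x4 world  = instanceBuffer[instanceID];  // manual instance fetch
    return mul(mul(float4(position.xyz, 1.0f), world), viewProjection);
}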

I don't understand what could be going on, but it is very fishy; I expected a very minor performance difference.


I haven't implemented this myself, but you could try eliminating the overhead of the automatic type conversion that typed buffers have. i.e. the buffer SRV contains a format field specifying what format the data is stored in, and the HLSL code says that it wants it converted to a float4 -- this general-purpose conversion ability might have an overhead on NV?

To avoid that, you could try using a ByteAddressBuffer and something like asfloat(buffer.Load4(vertexId * 16)), which hard-codes the expectation that the buffer contains tightly packed 32-bit floats (the equivalent of DXGI_FORMAT_R32G32B32A32_FLOAT).

Alternatively you could try using a StructuredBuffer<float4>.

I'd be very interested to know if these three types of buffers have any performance differences... :wink:
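To make the comparison concrete, the three fetch variants for the same float4 data might look like this (the names are placeholders):

// 1) Typed buffer: the SRV is created with DXGI_FORMAT_R32G32B32A32_FLOAT
//    and the load goes through the general format-conversion path.
Buffer<float4> typedPositions : register(t0);

// 2) Raw buffer: the SRV is created with DXGI_FORMAT_R32_TYPELESS and
//    D3D11_BUFFEREX_SRV_FLAG_RAW; the 16-byte float4 layout is hard-coded in the shader.
ByteAddressBuffer rawPositions : register(t1);

// 3) Structured buffer: the stride is declared at buffer creation time,
//    no format conversion is involved.
StructuredBuffer<float4> structuredPositions : register(t2);

float4 FetchTyped(uint vertexID)      { return typedPositions[vertexID]; }
float4 FetchRaw(uint vertexID)        { return asfloat(rawPositions.Load4(vertexID * 16)); }
float4 FetchStructured(uint vertexID) { return structuredPositions[vertexID]; }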


Yeah, I will check with the other buffer types too and post my findings, and double-check my implementation as well; maybe I missed something more obvious. I am using a hardware index buffer, by the way.

