transforming vertices using compute shader?

3 comments, last by Jason Z 12 years, 4 months ago
I've been having a discussion with someone writing a fairly decent planet sim, with pretty terrain generation and water, both fairly high-poly.
We got to talking about graphics, and I explained the merits of forward rendering with a Z-pre-pass, which would let them keep using materials, as opposed to the deferred approach they were planning to switch to. Everything they've done so far is forward rendering, so it would be better to build on that than to start from scratch.
The main problem with a Z-pre-pass is that, at this poly count, transforming the vertices twice would be painful, so we got to talking about double-buffering the transformed vertices.

To cut a long story short:
Would it work if the on-screen vertices were transformed in a compute shader and written to their own vertex buffer, which is then used for the Z-pre-pass and again for the shading passes?
So, in outline (a rough sketch follows the list):
Transform the vertices in a compute shader and save the results in a TransformBuffer.
Use the TransformBuffer for a Z-pre-pass to get occlusion culling.
Use the TransformBuffer again for forward rendering the scene.
Clear the TransformBuffer for the next frame.
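
A minimal sketch of what that compute pass might look like. This is only my own illustration of the idea: the vertex layout, buffer names, register slots, and the 64-thread group size are assumptions, not code from the actual project.

// Hedged sketch only: layout, names, registers and group size are assumptions.
struct InVertex
{
    float3 position;    // object-space position
    float3 normal;      // object-space normal
    float2 texcoord;
};

struct OutVertex
{
    float4 positionCS;  // clip-space position, ready for SV_Position
    float3 normalWS;    // world-space normal
    float2 texcoord;
};

cbuffer TransformCB : register(b0)
{
    float4x4 World;
    float4x4 WorldViewProj;
    uint     VertexCount;
};

StructuredBuffer<InVertex>    SourceVertices  : register(t0);
RWStructuredBuffer<OutVertex> TransformBuffer : register(u0);

[numthreads(64, 1, 1)]
void TransformCS(uint3 dtid : SV_DispatchThreadID)
{
    // One thread per vertex; dispatch ceil(VertexCount / 64) groups.
    if (dtid.x >= VertexCount)
        return;

    InVertex v = SourceVertices[dtid.x];

    OutVertex o;
    o.positionCS = mul(float4(v.position, 1.0f), WorldViewProj);
    o.normalWS   = mul(float4(v.normal, 0.0f), World).xyz;
    o.texcoord   = v.texcoord;

    TransformBuffer[dtid.x] = o;
}

One catch: a buffer created with a UAV flag can't also be bound as a regular vertex buffer, so the Z-pre-pass and forward pass would have to fetch from the TransformBuffer through an SRV and SV_VertexID instead (this comes up again further down the thread).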

Would it be worth doing for a fairly high poly scene?

Any comments would be appreciated,
Thanks for reading,
Bombshell
I doubt it would be worth it. Sure, it is possible to construct a scene where the computation per vertex is high enough for this to be a win, but most scenes really won't get close. Remember that all you need in your Z pass is the position and possibly an alpha texture coordinate. Unless you have ray-cast per-vertex lighting, procedural displacement, or a gazillion bones, the bandwidth and storage trade-offs are likely to dominate your run time (and all of these have better solutions anyway).

That said, if you have a few very expensive objects (in terms of computation per byte of geometry output), it may make sense to do it just for those. BTW, IIRC, isn't it possible to write out the vertex data during the Z pass rather than in an additional compute pass?

Also, when doing a Z pass you can greatly reduce the computation by creating a position+alpha-only shader. You can also be conservative and simply skip the expensive geometry (characters, trees, complex alpha maps, etc.) if you don't expect it to occlude much.
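
For reference, a position+alpha-only depth pre-pass shader can be as small as the sketch below. The cbuffer, texture, and sampler names are assumptions, and the 0.5 alpha threshold is arbitrary.

// Hedged sketch of a minimal depth-only pass with alpha test.
cbuffer PerObject : register(b0)
{
    float4x4 WorldViewProj;
};

Texture2D    AlphaMap      : register(t0);
SamplerState LinearSampler : register(s0);

struct VSIn
{
    float3 position : POSITION;
    float2 texcoord : TEXCOORD0;
};

struct VSOut
{
    float4 positionCS : SV_Position;
    float2 texcoord   : TEXCOORD0;
};

VSOut DepthOnlyVS(VSIn input)
{
    VSOut o;
    o.positionCS = mul(float4(input.position, 1.0f), WorldViewProj);
    o.texcoord   = input.texcoord;
    return o;
}

// Only alpha-tested geometry needs a pixel shader here; opaque geometry
// can bind a null pixel shader and rely on depth writes alone.
void DepthOnlyPS(VSOut input)
{
    float alpha = AlphaMap.Sample(LinearSampler, input.texcoord).a;
    clip(alpha - 0.5f);
}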
Would it be worth doing for a fairly high poly scene?
Define 'high poly'. A few hundred thousand triangles per frame? I doubt it. A couple of million? Depends on vertex shader complexity. Several million? Maybe. But I'd look into stream-out first.

Previously "Krohm"

I've been doing a lot of experiments recently with stream-out optimizations for skinned characters. I implemented it with vertex shader stream-out, but the idea is similar to what you're proposing. Here are some points to consider:

  1. There are restrictions on buffer usages, which means you can't just throw every flag you want (structured buffer, UAV, SRV, etc.) on a vertex buffer. I can't remember offhand what the rules are since I can't find them in the docs anywhere; I just know that when I've used compute shaders with vertex buffers in the past, the debug runtimes would yell at me.
  2. Since compute shaders don't use the input assembler, you'll have to handle vertex unpacking in your shader directly.
  3. A dedicated stream-out pass is only going to be worth it if you have an expensive vertex shader, which usually means skinning with a lot of bones. And even then it was only worth it for me if I could re-use the skinned data at least 3 times (a rough sketch of such a pass follows this list).
  4. The best way to do it would be to have a vertex shader that transforms/skins and simultaneously writes out the transformed data to a buffer. However, D3D11 doesn't support UAVs in vertex shaders, so you can't do this (D3D11.1 does).
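
A hedged sketch of the kind of vertex-shader stream-out skinning pass described in point 3. The bone count, semantics, and 4-bone weighting are my own assumptions, not the poster's actual implementation.

// Hedged sketch: bone count, semantics and weighting scheme are assumptions.
cbuffer BoneCB : register(b0)
{
    float4x4 Bones[128];
};

struct SkinVSIn
{
    float3 position    : POSITION;
    float3 normal      : NORMAL;
    uint4  boneIndices : BONEINDICES;
    float4 boneWeights : BONEWEIGHTS;
};

struct SkinVSOut
{
    float3 position : POSITION;   // post-skinning position
    float3 normal   : NORMAL;     // post-skinning normal
};

SkinVSOut SkinSOVS(SkinVSIn input)
{
    // Blend the four bone matrices by their weights.
    float4x4 skin =
        Bones[input.boneIndices.x] * input.boneWeights.x +
        Bones[input.boneIndices.y] * input.boneWeights.y +
        Bones[input.boneIndices.z] * input.boneWeights.z +
        Bones[input.boneIndices.w] * input.boneWeights.w;

    SkinVSOut o;
    o.position = mul(float4(input.position, 1.0f), skin).xyz;
    o.normal   = normalize(mul(float4(input.normal, 0.0f), skin).xyz);
    return o;
}

On the API side, such a shader would typically be created with CreateGeometryShaderWithStreamOutput (passing the vertex shader bytecode and a stream-output declaration matching SkinVSOut), with the target buffer bound via SOSetTargets and the pixel shader left null.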

To expand on #1, it isn't possible to create a vertex buffer that also has a UAV flag. However, you can circumvent this restriction by passing the vertex data into the vertex shader as a shader resource view and then executing the pipeline with no vertex buffers bound to the input assembler. This generates vertices whose only input is the SV_VertexID system value, which you can then use as an index to grab the relevant data out of the SRV.
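
A sketch of what that looks like on the shader side. The struct layout and buffer name are assumptions; the key point is indexing the SRV with SV_VertexID.

// Hedged sketch: struct layout and buffer name are assumptions.
struct TransformedVertex
{
    float4 positionCS;
    float3 normalWS;
    float2 texcoord;
};

StructuredBuffer<TransformedVertex> TransformBuffer : register(t0);

struct VSOut
{
    float4 positionCS : SV_Position;
    float3 normalWS   : NORMAL;
    float2 texcoord   : TEXCOORD0;
};

VSOut FetchVS(uint vertexID : SV_VertexID)
{
    // Fetch the pre-transformed vertex using the system-generated index.
    TransformedVertex v = TransformBuffer[vertexID];

    VSOut o;
    o.positionCS = v.positionCS;
    o.normalWS   = v.normalWS;
    o.texcoord   = v.texcoord;
    return o;
}

On the API side you would bind the buffer's SRV to the vertex shader stage, set a null input layout, leave the vertex buffer slots empty, and issue a plain Draw with the vertex count.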

You can take a look at the ParticleStorm demo in the Hieroglyph 3 engine for an example of doing this.
