Best way to extract non zero elements from a sparse matrix

0 comments, last by MJP 11 years, 12 months ago
I'm trying to find the most efficient way to extract all non-zero elements stored in a texture using D3D10 hardware. The resulting array does not need to be accessed by the CPU. I can think of three ways, but I'm not sure which one is the most efficient or whether there are better ways.

1. The naive way - copy the texture to a staging resource and go through the buffer on the CPU.
2. Run a geometry shader with an n x m vertex buffer (1 vertex per pixel) as input and stream out only the non-zero vertices. This requires a texture lookup and a conditional statement in the geometry shader. It sounds slower than (1) and also more wasteful in most circumstances.
3. Use a compute shader to do a scattered write. This seems like the best of the three, but I can't find a way to do it using the cs_4_0 profile, as it doesn't have the atomic operations required to implement a shared output buffer index.
Yeah, the easy way to do this is with global atomics, but you don't have access to those on feature level 10 hardware. With cs_5_0 you get AppendStructuredBuffer, which conveniently wraps global atomics in a simple interface.
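To make the "buffer index" idea concrete, here is a minimal CPU-side sketch in C++ of what an append buffer does conceptually: every worker that finds a non-zero texel reserves an output slot with an atomic increment and writes there. This is an illustration only, not D3D or HLSL code. The names `NonZeroTexel` and `append_nonzero_range` are hypothetical, and the texture is assumed to be a row-major array of floats.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record for one extracted element (assumption, not from the thread).
struct NonZeroTexel {
    std::uint32_t x;
    std::uint32_t y;
    float value;
};

// Scans texels[begin, end) and appends every non-zero element to `out`.
// `count` plays the role of the hidden counter behind an append buffer:
// fetch_add hands each writer a unique slot, so several threads could call
// this on disjoint ranges at the same time without overwriting each other.
// `out` must already be sized for the worst case (one slot per texel).
void append_nonzero_range(const std::vector<float>& texels,
                          std::uint32_t width,
                          std::size_t begin,
                          std::size_t end,
                          std::vector<NonZeroTexel>& out,
                          std::atomic<std::size_t>& count)
{
    for (std::size_t i = begin; i < end; ++i) {
        if (texels[i] != 0.0f) {
            // Reserve one output slot atomically.
            const std::size_t slot = count.fetch_add(1, std::memory_order_relaxed);
            out[slot] = NonZeroTexel{
                static_cast<std::uint32_t>(i % width),
                static_cast<std::uint32_t>(i / width),
                texels[i]};
        }
    }
}
```

When the ranges are processed concurrently, the order of the output elements depends on scheduling, which is also true of an append buffer on the GPU. After all workers finish, `count` holds the number of valid entries in `out`.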

Using a geometry shader + stream out would probably work, but I doubt it would be very fast. Geometry shaders are slow in general, and even slower with stream out. However, it would let you avoid the expensive CPU/GPU sync required to read the data back on the CPU, so it may well end up being faster than the staging-resource approach.

You might want to try looking around to see if you can find any stream compaction algorithms suitable for a GPU. This older article might give you a place to start.
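For reference, stream compaction is commonly built from two steps that need no atomics: an exclusive prefix sum over "is this element non-zero" flags, followed by a scatter that uses each prefix sum as the output position. The sketch below shows that structure sequentially in C++ for clarity. It is not the algorithm from the linked article, the function name `compact_nonzero_indices` is hypothetical, and a GPU version would compute the prefix sum with a parallel scan over several passes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the linear indices of all non-zero elements, in their original order.
std::vector<std::uint32_t> compact_nonzero_indices(const std::vector<float>& texels)
{
    const std::size_t n = texels.size();

    // Step 1: exclusive prefix sum of the non-zero flags.
    // offset[i] = number of non-zero elements before position i,
    // which is exactly where element i belongs in the compacted output.
    std::vector<std::uint32_t> offset(n);
    std::uint32_t running = 0;
    for (std::size_t i = 0; i < n; ++i) {
        offset[i] = running;
        running += (texels[i] != 0.0f) ? 1u : 0u;
    }

    // Step 2: scatter. Each surviving element writes to its own slot,
    // so no two writes collide and no atomic counter is needed.
    std::vector<std::uint32_t> out(running);
    for (std::size_t i = 0; i < n; ++i) {
        if (texels[i] != 0.0f) {
            out[offset[i]] = static_cast<std::uint32_t>(i);
        }
    }
    return out;
}
```

Unlike an atomic append, this keeps the output in input order, and the scatter step is independent per element, which is why the approach can be mapped onto hardware without atomics.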

