[SOLVED][DX10] Point sprites and the z-buffer

3 comments, last by Key_46 12 years, 9 months ago
I am implementing the ParticlesGS sample in my engine and I've run into a problem: the rendering order is wrong. The only case that looks right is additive blending, but I also have some smoke particles that don't look good. Is there a workaround using z-buffering and blend states, or do I need to hardcode sorting on the CPU? Because if I have to implement a CPU solution, then the whole idea of using the GS is worthless, right?

I am implementing the ParticlesGS sample in my engine and I've run into a problem: the rendering order is wrong. The only case that looks right is additive blending, but I also have some smoke particles that don't look good. Is there a workaround using z-buffering and blend states, or do I need to hardcode sorting on the CPU?


Your basic alpha-blended transparency is order-dependent, and there's no good way around it. Methods for order-independent transparency exist, but they are generally way too slow to be of practical use (especially for particles). I'm pretty sure the sample you're using sorted the particles on the CPU.
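
To make the order dependence concrete, here is a minimal C++ sketch of the two blend equations evaluated on the CPU. It is not taken from the ParticlesGS sample; `Color`, `blendAlpha`, and `blendAdditive` are hypothetical names chosen for illustration, and the comments only assume the usual source-alpha / inverse-source-alpha and additive blend setups.

```cpp
#include <array>
#include <cstddef>

// RGB color with components in [0, 1]. (Hypothetical helper type.)
using Color = std::array<float, 3>;

// Standard alpha blend: result = src * a + dst * (1 - a).
// Roughly what SrcBlend = SRC_ALPHA, DestBlend = INV_SRC_ALPHA computes.
Color blendAlpha(const Color& dst, const Color& src, float a) {
    Color out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = src[i] * a + dst[i] * (1.0f - a);
    }
    return out;
}

// Additive blend: result = src * a + dst.
// Roughly what SrcBlend = SRC_ALPHA, DestBlend = ONE computes.
Color blendAdditive(const Color& dst, const Color& src, float a) {
    Color out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = src[i] * a + dst[i];
    }
    return out;
}
```

Drawing a half-transparent red particle and then a blue one over a black background with `blendAlpha` gives a different color than drawing blue first and then red, because each new layer scales down whatever is already in the render target. With `blendAdditive` the contributions are simply summed, so the order does not matter, which is why the additive case looked fine.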



Because if I have to implement a CPU solution, then the whole idea of using the GS is worthless, right?


No, why would you say that? The point of using the GS for point sprite expansion is so that you only have to feed the GPU a single point per particle, rather than a list of 4 vertices per particle + an index buffer. You still get that benefit if you sort your point list on the CPU.
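
A CPU sort of the point list could look roughly like the sketch below. The `Particle` layout, the `Vec3` type, and the `sortBackToFront` function are assumptions made for illustration, not the sample's actual data structures; the only idea taken from the discussion is sorting one point per particle so that the farthest particle is drawn first.

```cpp
#include <algorithm>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical per-particle point vertex; a real engine's layout will differ.
struct Particle {
    Vec3 position;
    float size;
};

// Sorts particles back to front along the camera's forward axis,
// so the farthest particle ends up first in the draw order.
void sortBackToFront(std::vector<Particle>& particles,
                     const Vec3& cameraPos,
                     const Vec3& cameraForward) {
    // View-space depth: projection of (particle - camera) onto the forward axis.
    auto depth = [&](const Particle& p) {
        const float dx = p.position.x - cameraPos.x;
        const float dy = p.position.y - cameraPos.y;
        const float dz = p.position.z - cameraPos.z;
        return dx * cameraForward.x + dy * cameraForward.y + dz * cameraForward.z;
    };
    std::sort(particles.begin(), particles.end(),
              [&](const Particle& a, const Particle& b) {
                  return depth(a) > depth(b);  // larger depth = farther = drawn first
              });
}
```

The sorted vector would then be copied into a dynamic vertex buffer each frame and drawn as a point list, with the GS still doing the expansion.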
Thanks for the response, MJP. The problem is that I am also using stream output so I can do all the particle computations on the GPU side without sending data over the bus. But I don't know how much better this is compared with doing the calculations on the CPU. I am implementing the sorting on the CPU and I'm still getting pretty neat results :)

I am also questioning the use of the Geometry Shader. It makes the coding simpler, but I read (in 2007) that the Geometry Shader adds noticeable overhead, and since I'm only creating a simple quad, isn't it better to use instancing instead?



Ahh, I see. Yeah, in that case sorting is a little tricky. It is possible to sort on the GPU, but it's probably not practical unless you use a compute shader. If you do ever go down that route, Nvidia has whitepapers and tutorials for various sorting algorithms in their CUDA developer zone.

As for the point expansion, the older DX10-era presentations from the IHVs used to say that geometry shaders were okay as long as you didn't emit more than 4 vertices and you kept your vertex structure pretty lightweight. So for particles that should be fine. But you can always profile if you need to know for sure for a particular GPU.
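
For reference, the expansion itself is only a few vector operations per particle. The sketch below does it in C++ rather than HLSL, just to show the arithmetic; `expandPointToQuad`, `Vec3`, and the parameter names are hypothetical, and the corner order assumes the quad is emitted as a four-vertex triangle strip.

```cpp
#include <array>

struct Vec3 { float x, y, z; };

// Expands one particle center into the four corners of a camera-facing quad.
// camRight and camUp are assumed to be unit-length camera axes in world space.
// Corners are returned in triangle-strip order:
// top-left, top-right, bottom-left, bottom-right.
std::array<Vec3, 4> expandPointToQuad(const Vec3& center,
                                      const Vec3& camRight,
                                      const Vec3& camUp,
                                      float halfSize) {
    // Offsets along the camera's right (sx) and up (sy) axes for each corner.
    auto corner = [&](float sx, float sy) {
        return Vec3{
            center.x + (camRight.x * sx + camUp.x * sy) * halfSize,
            center.y + (camRight.y * sx + camUp.y * sy) * halfSize,
            center.z + (camRight.z * sx + camUp.z * sy) * halfSize
        };
    };
    return { corner(-1.0f,  1.0f), corner(1.0f,  1.0f),
             corner(-1.0f, -1.0f), corner(1.0f, -1.0f) };
}
```

That is four output vertices per input point, which stays within the "no more than 4 vertices" guideline above. An instanced version would do the same math in the vertex shader using a per-instance center and a per-vertex corner offset, so the two approaches are straightforward to compare in a profiler.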



Thanks again. I will keep both the full-GPU and the CPU-sorting solutions, then, and compare their performance with instancing once I finish these two.

This topic is closed to new replies.
