Compute Shader vs Transform Feedback

4 comments, last by Syntac_ 8 years, 7 months ago

Recently I've stumbled across a few things in the particle systems in my own engine. Previously a lot of my particle simulation was done on the CPU (not that it was too slow, but hey, I need my CPU free for other computations), so I rewrote it to work on the GPU. I've actually implemented particles in 2 ways: transform feedback (where a geometry shader takes care of the particle system) and a compute shader. As for performance, I haven't seen any major difference between the two techniques (assuming they are doing the same calculations).

Now, both implementations are just a basic particle system (a lot simpler than what I currently have on the CPU). If I'd like to add things like collision detection and response for particles (I have collision geometry stored as shapes inside BVHs), applying 'wind' to them (a precomputed velocity voxel grid), and animations where particles form some sort of geometry surface, which is the better way to go?

From my point of view, a compute-shader-based simulation might be better, but I guess you can do all of this with either solution. If you have tried both, please share your experience.
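
To make the 'wind' step above concrete, here is a minimal CPU-side sketch in C++ of what one per-particle update could look like. Everything in it is an assumption for illustration, not the poster's engine code: the `WindGrid` layout, the nearest-voxel lookup, the simple drag model that pulls a particle's velocity toward the local wind velocity, and all of the names. Each loop iteration in `simulate` stands in for one compute-shader invocation (or one vertex passing through transform feedback).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

struct Particle {
    Vec3 position;
    Vec3 velocity;
};

// Hypothetical wind field: one precomputed velocity per voxel of a regular grid.
struct WindGrid {
    int nx, ny, nz;          // voxel counts per axis
    float cellSize;          // edge length of one voxel
    Vec3 origin;             // world-space position of the grid's minimum corner
    std::vector<Vec3> cells; // nx * ny * nz velocities, x varying fastest
};

// Map a world-space coordinate to a voxel index, clamped to the grid bounds.
static int voxelIndex(float coord, float origin, float cellSize, int count) {
    int i = static_cast<int>(std::floor((coord - origin) / cellSize));
    return std::max(0, std::min(i, count - 1));
}

// Nearest-voxel lookup (no interpolation, to keep the sketch short).
Vec3 sampleWind(const WindGrid& g, const Vec3& p) {
    assert(g.cells.size() == static_cast<std::size_t>(g.nx) * g.ny * g.nz);
    const int ix = voxelIndex(p.x, g.origin.x, g.cellSize, g.nx);
    const int iy = voxelIndex(p.y, g.origin.y, g.cellSize, g.ny);
    const int iz = voxelIndex(p.z, g.origin.z, g.cellSize, g.nz);
    const std::size_t idx = (static_cast<std::size_t>(iz) * g.ny + iy) * g.nx + ix;
    return g.cells[idx];
}

// Assumed drag model: velocity relaxes toward the wind velocity, then the
// position is advanced with the updated velocity (semi-implicit Euler).
void updateParticle(Particle& p, const WindGrid& wind, float drag, float dt) {
    const Vec3 w = sampleWind(wind, p.position);
    p.velocity.x += (w.x - p.velocity.x) * drag * dt;
    p.velocity.y += (w.y - p.velocity.y) * drag * dt;
    p.velocity.z += (w.z - p.velocity.z) * drag * dt;
    p.position.x += p.velocity.x * dt;
    p.position.y += p.velocity.y * dt;
    p.position.z += p.velocity.z * dt;
}

// Each iteration is independent of the others, which is what makes the
// update map naturally onto one GPU thread per particle.
void simulate(std::vector<Particle>& particles, const WindGrid& wind,
              float drag, float dt) {
    for (Particle& p : particles) {
        updateParticle(p, wind, drag, dt);
    }
}
```

Because the update only reads the grid and writes its own particle, it could in principle be expressed with either technique; collision against a BVH needs more irregular memory access, and it is not covered by this sketch.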

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

Basically, CS gives you a lot more flexibility with how you choose to assign work to GPU threads, and how you read/write memory, but TF/SO support goes back a few extra generations of HW.

TF for low-end, CS for high-end (and PS+VTF/R2VB if you still care about DX9/GL2).
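
As a rough illustration of "assigning work to GPU threads" with a compute shader, the following C++ sketch shows the host-side arithmetic for a one-thread-per-particle dispatch. The group size of 256 and the function names are assumptions chosen for the example; in GLSL the group size would be declared with `local_size_x`, and the resulting group count is the kind of value passed to `glDispatchCompute`.

```cpp
#include <cassert>
#include <cstdint>

// Assumed number of threads per work group (hypothetical; the real value
// is whatever the shader declares).
constexpr std::uint32_t kGroupSize = 256;

// Number of work groups needed so that every particle gets one thread.
// Written as divide-plus-remainder to avoid overflow near UINT32_MAX.
std::uint32_t groupCountFor(std::uint32_t particleCount,
                            std::uint32_t groupSize = kGroupSize) {
    assert(groupSize > 0);
    return particleCount / groupSize + (particleCount % groupSize != 0 ? 1u : 0u);
}

// Flat index of a thread, analogous to gl_GlobalInvocationID.x in GLSL.
std::uint32_t globalIndex(std::uint32_t group, std::uint32_t local,
                          std::uint32_t groupSize = kGroupSize) {
    return group * groupSize + local;
}

// The last group is usually only partly filled, so each thread checks
// whether its index refers to a real particle before doing any work.
bool isActive(std::uint32_t index, std::uint32_t particleCount) {
    return index < particleCount;
}
```

With transform feedback the mapping is fixed at one input primitive per shader invocation, so this kind of choice (group size, several particles per thread, shared memory within a group) is not available there.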
I'll echo Hodgman here: you really want to use compute shaders if they're available. They're not only more powerful, but they're also easier to work with compared to using geometry shaders.

If you go the compute shader route as suggested this might be of interest to you: http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/Holy-Smoke-Faster-Particle-Rendering-using-DirectCompute-Gareth-Thomas.ppsx

-potential energy is easily made kinetic-

This is slightly off-topic, so sorry if I'm hijacking the thread...

What would you guys recommend for DX10/OGL3 level hardware for OIT rendering? I am currently using a home-made Fourier opacity mapped OIT algorithm, but it's far from perfect. Adaptive OIT from Intel requires linked lists, so that won't work. This'd require compute shaders to do the culling on the GPU, and it can't handle arbitrary geometry even if the tile culling is done on the CPU. I just can't stand badly coded transparent effects that aren't sorted correctly, but I just can't seem to find a universal solution.

I did some testing a while back for my scene with 1 emitter with lots of particles (100k) and I gained up to 10x perf switching to CS. YMMV of course.

I also preferred to work with the CS over the GS.

