B_old

GPGPU vs classic shader pipeline: rendering performance


I recently stumbled into a discussion about the performance of GPGPU programs versus that of the "classic" shader pipeline. I have no experience with GPGPU, but the argument put forward was that GPGPU will always be faster because the API layer is thinner while accessing the exact same hardware, and also because of shared memory. That made some sense to me for filters with large kernels, for instance, but I assumed that the normal shader pipeline would be faster for rasterizing and shading triangles because of dedicated hardware.
So, in case I get asked why I use vertex and pixel shaders etc. for rendering, what should I say? :)

> So, in case I get asked why I use vertex and pixel shaders etc. for rendering, what should I say? :)

You should say that this is how the internals of modern video cards work: everything runs as shaders. Other features, like the fixed-function pipeline (FFP), are emulated internally by the driver using shaders.
Here's more info on how modern video cards work: http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/

Don't confuse GPGPU with rendering using shaders. The term GPGPU also covers computing things other than rendering.
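To make the point above concrete, here is a minimal sketch (in Python, with names of my own invention, not any real driver's code) of what a driver-generated "FFP vertex shader" effectively does: the fixed-function transform-and-lighting stage is just a small program, which is why the driver can express it as a shader.

```python
# Sketch: what a driver-generated "FFP vertex shader" effectively does.
# Fixed-function transform plus one directional Lambert light, written as
# an ordinary function -- the point being that FFP is just shader code.

def mat4_mul_vec4(m, v):
    """Multiply a 4x4 matrix (row-major list of rows) by a 4-vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def ffp_vertex_shader(position, normal, mvp, light_dir, diffuse):
    """Emulate fixed-function transform & lighting for one vertex."""
    clip_pos = mat4_mul_vec4(mvp, position + [1.0])
    # Lambert term: N . L, clamped to zero (assumes unit-length vectors)
    n_dot_l = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    color = [c * n_dot_l for c in diffuse]
    return clip_pos, color

# Identity MVP, light pointing along +Z, white diffuse material:
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
pos, col = ffp_vertex_shader([1.0, 2.0, 3.0], [0.0, 0.0, 1.0],
                             identity, [0.0, 0.0, 1.0], [1.0, 1.0, 1.0])
```

A real driver compiles the equivalent of this to GPU shader code once and binds it whenever FFP state is active.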

GPGPU techniques are most often used to perform generic (non-rendering) computation.

That said, there has been some research into implementing rendering with GPGPU methods. For a very recent example, see the paper by Laine and Karras, High-Performance Software Rasterization on GPUs (High-Performance Graphics 2011). They measure that the traditional rendering APIs beat a CUDA-implemented renderer by a factor of 2-8x.

Other sources exist as well. Perhaps a more interesting one is Alternative Rendering Pipelines in CUDA. The motivation for implementing a renderer in CUDA might be to solve the order-independent-transparency problem (see e.g. this), or to enable real-time ray tracing.


> I have no experience with GPGPU, but the argument put forward was that GPGPU will always be faster because the API layer is thinner while accessing the exact same hardware, and also because of shared memory.


In light of what I linked above, I don't think this statement is true at all.

If someone knows of published papers similar to Laine and Karras's, please share a link. This is an interesting topic for me as well.

The problem with trying to emulate the graphics pipeline via GPGPU is that certain stages of the pipeline are still handled by dedicated hardware, and burning shader time to emulate them takes longer. Triangle setup is one such example.
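As a rough illustration of what "triangle setup" means, here is a minimal edge-function setup sketch (in Python, function names my own; real hardware and real software rasterizers do considerably more, e.g. clipping, fill rules, and attribute interpolation setup). This is per-triangle work that the fixed-function rasterizer performs essentially for free, while a GPGPU rasterizer must spend shader instructions on it.

```python
# Sketch of edge-function triangle setup: the per-triangle coefficients
# that dedicated rasterizer hardware computes, and that a GPGPU software
# rasterizer has to compute with shader instructions instead.

def edge_coeffs(p0, p1):
    """Coefficients (A, B, C) of the edge function A*x + B*y + C for edge p0->p1."""
    a = p0[1] - p1[1]
    b = p1[0] - p0[0]
    c = p0[0] * p1[1] - p0[1] * p1[0]
    return (a, b, c)

def setup_triangle(v0, v1, v2):
    """Per-triangle setup: three edge functions for point-in-triangle tests."""
    return [edge_coeffs(v0, v1), edge_coeffs(v1, v2), edge_coeffs(v2, v0)]

def inside(edges, x, y):
    """A point is inside a counter-clockwise triangle iff all edge functions are >= 0."""
    return all(a * x + b * y + c >= 0 for (a, b, c) in edges)

# Counter-clockwise right triangle with legs of length 4:
edges = setup_triangle((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
```

Once the coefficients are set up, testing a pixel against the triangle is three multiply-adds and three comparisons, which is what the rasterizer evaluates for every candidate pixel.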

Also, current hardware dispatches the 'classic' shaders better than compute shaders. Take post-processing: IHVs still recommend using pixel shaders over compute shaders for the full-screen pass, assuming the workload is the same (if you need group local storage, all bets are off, of course). Tessellation is another area with dedicated hardware that would hurt to emulate.

This might well change as hardware progresses, but right now the traditional rendering pipeline is going to beat any GPGPU emulation if you want the full feature set. (You might be able to make a very simple renderer faster by making lots of assumptions, but then you'd be using powerful hardware for sub-optimal results, so about the only reason to do this is 'because I can' :D)


> I recently stumbled into a discussion about the performance of GPGPU programs versus that of the "classic" shader pipeline. I have no experience with GPGPU, but the argument put forward was that GPGPU will always be faster because the API layer is thinner while accessing the exact same hardware, and also because of shared memory. That made some sense to me for filters with large kernels, for instance, but I assumed that the normal shader pipeline would be faster for rasterizing and shading triangles because of dedicated hardware.
> So, in case I get asked why I use vertex and pixel shaders etc. for rendering, what should I say? :)


The classic pipeline is faster at the moment. Companies like NVIDIA put a lot of effort into optimizing it, which is why you usually see around 10% better performance from the old pipeline when doing the same thing (e.g. a full-screen post effect). That's also why switching between "classic mode" and GPGPU (CUDA/compute shaders) is quite expensive, and it's recommended to keep such switches to a minimum, ideally just once per frame; the hardware has to reconfigure quite a few units, which need to be idle during the switch.


GPGPU gives you benefits if you don't do exactly the same work, e.g. if you share some computation between pixels. Take a bloom as an example: usually every pixel samples all of its neighbours, which is quite redundant. You could instead run one thread per line and reuse the already-sampled values kept in registers, weighting them per pixel; you'd save a big bunch of texture fetches.
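The fetch-count argument above can be sketched with a small model (Python, a box filter for brevity instead of a real bloom weight curve, and "fetch" counting standing in for actual texture reads): the naive version fetches every tap for every pixel, while the one-thread-per-row version fetches each texel once and reuses it.

```python
# Model of the blur example: a naive 1-D blur fetches the texture
# 2*radius+1 times per pixel, while one thread walking a whole row can
# fetch each texel once and reuse it from registers/local storage.

def naive_blur(row, radius):
    """Every output pixel fetches all 2*radius+1 neighbours (clamped at edges)."""
    fetches = 0
    out = []
    for x in range(len(row)):
        acc, n = 0.0, 0
        for dx in range(-radius, radius + 1):
            i = min(max(x + dx, 0), len(row) - 1)
            acc += row[i]          # one "texture fetch" per tap
            fetches += 1
            n += 1
        out.append(acc / n)
    return out, fetches

def shared_blur(row, radius):
    """One 'thread' per row: each texel is fetched exactly once, then reused."""
    fetches = len(row)             # model: one fetch per texel, total
    out = []
    for x in range(len(row)):
        taps = [row[min(max(x + dx, 0), len(row) - 1)]
                for dx in range(-radius, radius + 1)]  # reuse, no new fetches
        out.append(sum(taps) / len(taps))
    return out, fetches

row = [float(i) for i in range(256)]
a, naive_cost = naive_blur(row, 8)     # 256 pixels * 17 taps = 4352 fetches
b, shared_cost = shared_blur(row, 8)   # 256 fetches
```

Both versions produce the same image; only the memory traffic differs, which is exactly the kind of saving group shared memory or per-thread registers buy you in a compute shader.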
