Compute Shaders

I was wondering if anyone has experience with compute shaders.

I want to write a custom renderer and using compute shaders seems like it would be the way to go.

What can you tell me about compute shaders? Are they flexible? Can they be used for anything?
Is there some reason you feel you need generic compute shaders (I'm assuming you mean something like CUDA or OpenCL) instead of the typical DirectX/OpenGL graphics rendering shaders? Unless you really want to write the whole rendering path yourself (which will be a lot slower than using a graphics API), I'm not sure what you'd need them for.

I want to write a custom renderer and using compute shaders seems like it would be the way to go.
They're the way to go for some things, on the latest hardware.

If you want to support current/older hardware, then you'll also need a fall-back code-path for anything you do with them.

e.g. Pixel shader bloom (for old cards) PLUS Compute shader bloom (for new cards), etc...
Are they flexible? Can they be used for anything?
Anything that you can model as Input -> Process -> Output. They're used a bit for post-processing effects, as the memory access patterns don't always quite fit with what pixel shaders provide.
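
To make that concrete, here's a minimal sketch of the Input -> Process -> Output model: a bright-pass filter (the extraction step of a bloom effect) written as a DX11 compute shader. The resource names and the 0.8 threshold are just placeholders.
[code]
// Minimal post-process compute shader: read a texture, write a UAV.
Texture2D<float4>   InputTex  : register(t0);
RWTexture2D<float4> OutputTex : register(u0); // unordered access view

[numthreads(8, 8, 1)]
void BrightPassCS(uint3 id : SV_DispatchThreadID)
{
    float4 color = InputTex[id.xy];
    float  luma  = dot(color.rgb, float3(0.299, 0.587, 0.114));

    // Keep only the bright pixels; everything else goes to black.
    OutputTex[id.xy] = (luma > 0.8) ? color : float4(0, 0, 0, 0);
}
[/code]
On the CPU side you bind the UAV and call Dispatch(ceil(width / 8), ceil(height / 8), 1); threads that fall past the edge of the texture write out of bounds, which D3D11 silently discards.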
The main reason I want to use compute shaders (and I mean DX11 compute shaders, not CUDA/OpenCL) is that I'm not making a triangle-based renderer. I'm making a point-based renderer, and from some testing, rasterizing the points directly through DirectX is pretty slow. I have some ideas for approaches that would be faster even if just done on the CPU, but I'd like the added performance of utilizing the GPU.
Here are some of the main reasons for using compute shaders, off the top of my head:

1. You don't need to use the traditional triangle rasterization pipeline.

2. The rasterization pipeline would slow down what you're doing.

3. You want inter-thread communication or data sharing, which you can do via shared memory.

4. You want to be able to re-schedule or re-purpose threads on the fly.

On a basic level, authoring a compute shader really isn't that different from authoring a vertex or pixel shader in HLSL. You use the same basic language constructs, and you have access to the same texture and buffer resources (with the addition of full access to writeable resources via unordered access views, which you also have in pixel shaders). The main difference, in my opinion, is dealing with inter-thread communication, which is usually done with shared memory, sync functions, and atomic functions. With traditional rendering shaders the different invocations of a shader are strongly segregated and can't share any data (which is also what made them probably the easiest way to author massively parallel code), so if you're coming from that world you'll probably have to spend some time getting used to it.
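
To give a rough feel for those features, here's a hedged sketch (all names made up) in which each 8x8 thread group sums the luminance of its tile in shared memory and then adds the group total to a global counter with an atomic:
[code]
Texture2D<float4>        SceneTex  : register(t0);
RWStructuredBuffer<uint> TotalLuma : register(u0); // element 0 holds the running total

groupshared float gs_luma[64]; // one slot per thread in the group

[numthreads(8, 8, 1)]
void SumLumaCS(uint3 id : SV_DispatchThreadID, uint gtid : SV_GroupIndex)
{
    // Each thread writes its pixel's luminance into shared memory.
    float4 c = SceneTex[id.xy];
    gs_luma[gtid] = dot(c.rgb, float3(0.299, 0.587, 0.114));
    GroupMemoryBarrierWithGroupSync();

    // Parallel reduction within the group.
    for (uint s = 32; s > 0; s >>= 1)
    {
        if (gtid < s)
            gs_luma[gtid] += gs_luma[gtid + s];
        GroupMemoryBarrierWithGroupSync();
    }

    // One thread per group publishes the result. Atomics only work on
    // integers, so the float total is stored as 24.8 fixed point.
    if (gtid == 0)
        InterlockedAdd(TotalLuma[0], (uint)(gs_luma[0] * 256.0f));
}
[/code]
Clear the buffer to zero before dispatching; dividing the total by the pixel count afterwards gives an average luminance you could use for eye adaptation, which is the sort of thing a pixel shader can only do with extra passes.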

In terms of performance: according to my own experience and IHV recommendations, you should be careful about using them to replace something that would traditionally be done in a pixel shader. Pixel shaders will generally get better optimization and scheduling when it comes to raw ALU throughput, so compute shaders will have a very hard time beating them at lots of per-pixel math. To make a compute shader win out, you'll have to make good use of your raw access to the hardware, and use shared memory or the lack of a constricting pipeline to reduce memory bandwidth usage or the number of passes.
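
As a concrete (and simplified) illustration of the bandwidth point, here's a horizontal box blur where each group loads its strip of texels into shared memory once, so all the taps read from the cache rather than issuing a texture fetch per tap. Names and sizes are arbitrary:
[code]
Texture2D<float4>   Src : register(t0);
RWTexture2D<float4> Dst : register(u0);

#define RADIUS      4
#define GROUP_WIDTH 64

// The tile plus an apron of RADIUS texels on each side.
groupshared float4 gs_cache[GROUP_WIDTH + 2 * RADIUS];

[numthreads(GROUP_WIDTH, 1, 1)]
void BlurHCS(uint3 id : SV_DispatchThreadID, uint3 gtid : SV_GroupThreadID)
{
    int2 coord = int2(id.xy);

    // Each thread loads its own texel; the first RADIUS threads also load
    // the left and right aprons. Out-of-bounds loads return 0 in D3D11,
    // so the screen edges just darken slightly in this sketch.
    gs_cache[gtid.x + RADIUS] = Src[coord];
    if (gtid.x < RADIUS)
    {
        gs_cache[gtid.x] = Src[int2(coord.x - RADIUS, coord.y)];
        gs_cache[gtid.x + GROUP_WIDTH + RADIUS] = Src[int2(coord.x + GROUP_WIDTH, coord.y)];
    }
    GroupMemoryBarrierWithGroupSync();

    // All taps hit shared memory instead of the texture.
    float4 sum = 0;
    int center = (int)gtid.x + RADIUS;
    for (int i = -RADIUS; i <= RADIUS; ++i)
        sum += gs_cache[center + i];

    Dst[coord] = sum / (2 * RADIUS + 1);
}
[/code]
Roughly speaking the texture is read about once per pixel here instead of nine times, and the saving grows with the kernel radius; that's the kind of win a compute shader needs in order to justify itself over a plain pixel-shader blur.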

For your case, it sounds like it could potentially be a win if done with compute shaders. It would be great if you could share your results once you get something up and running!
I'm not entirely sure it would work, but here's what I was planning on doing...

There would be a two-dimensional array of point positions that encompass a cube (or maybe a sphere instead) around 0,0,0, scaled out to a distance that fits the screen resolution.

At initialization, the normal of each point in the array is calculated.

The normal is used as a key into a hash map, with the array index as the value.

Then, to map the points that make up objects onto the array points, the camera position is subtracted from each point's position and the normal of the result is calculated.

Then, to match the point up to the array, the calculated normal is looked up in the hash map.


Then there would be the screen/pixel buffer; the view-array points would each hold a pointer recording which pixel is matched up to which view-array point.


I was planning on using an array of pre-calculated distances for normalizing, but if I can use compute shaders I might not have to.


This probably sounds confusing because I'm not the best at explaining things, and there might be a couple of details left out, but that's basically how I'm planning on rendering points.
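
I can't comment on the hash-map details, but purely as a hypothetical sketch, the scatter step could look something like this as a DX11 compute shader with one thread per point. LookupPixel() is a crude stand-in for your normal-to-pixel lookup (not your actual scheme), and depth is packed with the colour into one uint so a single InterlockedMin keeps the nearest point per pixel:
[code]
// Hypothetical scatter pass: one thread per point. The struct layout,
// resource names and the 1000-unit max depth are all made up.
struct PointData
{
    float3 pos;
    uint   color565; // 16-bit packed colour
};

StructuredBuffer<PointData> Points      : register(t0);
RWStructuredBuffer<uint>    FrameBuffer : register(u0); // one uint per pixel

cbuffer Camera : register(b0)
{
    float3 CamPos;
    uint   NumPoints;
    uint   ScreenWidth;
    uint   ScreenHeight;
};

// Crude stand-in for the normal -> pixel hash-map lookup described above.
uint LookupPixel(float3 dir)
{
    uint x = min((uint)((dir.x * 0.5f + 0.5f) * ScreenWidth),  ScreenWidth  - 1);
    uint y = min((uint)((dir.y * 0.5f + 0.5f) * ScreenHeight), ScreenHeight - 1);
    return y * ScreenWidth + x;
}

[numthreads(64, 1, 1)]
void ScatterPointsCS(uint3 id : SV_DispatchThreadID)
{
    if (id.x >= NumPoints)
        return;

    PointData p = Points[id.x];
    float3 v = p.pos - CamPos;        // point relative to the camera
    float  d = length(v);
    uint pixel = LookupPixel(v / d);  // normalised direction -> pixel

    // Depth in the high 16 bits, colour in the low 16 bits, so a single
    // InterlockedMin keeps the closest point that lands on each pixel.
    uint depth16 = (uint)(saturate(d / 1000.0f) * 65535.0f);
    uint packed  = (depth16 << 16) | (p.color565 & 0xFFFF);

    InterlockedMin(FrameBuffer[pixel], packed);
}
[/code]
You'd clear FrameBuffer to 0xFFFFFFFF at the start of each frame and unpack the low 16 bits into the final colour in a second pass. Whether this actually beats the rasterizer is exactly the thing to measure.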

