Hey Guys,
I am currently working on a project that needs to run a bilateral filter over a depth frame on the GPU, and it needs to be as fast as possible. Basically, I will read a depth buffer into the GPU and run a separable bilateral filter on it (filtering first horizontally, then vertically).
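To make the two-pass structure concrete, a minimal CPU-side sketch in Python might look like the following. It is only a reference illustration, not a GPU implementation: the function names, the parameter defaults (`radius`, `sigma_s`, `sigma_r`), and the clamp-to-edge border handling are all assumptions made for the example. Note that a separable bilateral filter only approximates the full 2D one.

```python
import math

def bilateral_1d(line, radius, sigma_s, sigma_r):
    """Bilateral-filter a single row or column of depth values."""
    n = len(line)
    out = []
    for i in range(n):
        center = line[i]
        acc = 0.0
        wsum = 0.0
        for k in range(-radius, radius + 1):
            j = min(max(i + k, 0), n - 1)          # clamp to edge (assumed border mode)
            diff = line[j] - center
            w_space = math.exp(-(k * k) / (2.0 * sigma_s ** 2))       # spatial weight
            w_range = math.exp(-(diff * diff) / (2.0 * sigma_r ** 2)) # depth-similarity weight
            w = w_space * w_range
            acc += w * line[j]
            wsum += w
        out.append(acc / wsum)  # wsum >= 1 because the center tap has weight 1
    return out

def separable_bilateral(depth, radius=2, sigma_s=1.5, sigma_r=0.1):
    """Horizontal pass over every row, then vertical pass over every column."""
    rows = [bilateral_1d(list(row), radius, sigma_s, sigma_r) for row in depth]
    # zip(*rows) transposes, so each column can be filtered as a 1D line.
    cols = [bilateral_1d(list(col), radius, sigma_s, sigma_r) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major
```

On the GPU, each pass would map to one PS or CS dispatch; the vertical pass is the one whose reads stride across rows in memory.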
So here are some decisions I have to make:
1. Should I make the depth frame a Texture2D or a Buffer?
I have read some articles that say a Texture is good for random access (Morton-pattern memory layout), while a Buffer is good for linear access. In my case, during the first pass my PS or CS will read the data more or less linearly, since it is a horizontal pass. But during the vertical pass the memory access looks more like random access, since I will be reading the data column by column... I am not sure whether there are other differences between Buffer and Texture, so I would appreciate advice and explanations on this.
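As a rough illustration of why the layout matters for the two passes, the sketch below compares how many cache lines a 16-texel walk touches under a linear layout and under a pure Morton (Z-order) layout. Everything here is an assumption for the sake of the example: real GPU texture tiling is vendor-specific and not necessarily pure Morton order, and the cache line size of 16 texels and the helper names are hypothetical.

```python
def part1by1(v):
    """Spread the low 16 bits of v so a zero bit sits between each original bit."""
    v &= 0xFFFF
    v = (v | (v << 8)) & 0x00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F
    v = (v | (v << 2)) & 0x33333333
    v = (v | (v << 1)) & 0x55555555
    return v

def morton_index(x, y):
    """Interleave the bits of x (even positions) and y (odd positions)."""
    return part1by1(x) | (part1by1(y) << 1)

def linear_index(x, y, width):
    """Plain row-major address, as in a linear buffer."""
    return y * width + x

def cache_lines_touched(indices, texels_per_line=16):
    """Count distinct cache lines hit, assuming 16 texels per line."""
    return len({i // texels_per_line for i in indices})

WIDTH = 1024
row_walk = [(x, 0) for x in range(16)]  # horizontal pass: 16 texels along a row
col_walk = [(0, y) for y in range(16)]  # vertical pass: 16 texels along a column

linear_row = cache_lines_touched([linear_index(x, y, WIDTH) for x, y in row_walk])
linear_col = cache_lines_touched([linear_index(x, y, WIDTH) for x, y in col_walk])
morton_row = cache_lines_touched([morton_index(x, y) for x, y in row_walk])
morton_col = cache_lines_touched([morton_index(x, y) for x, y in col_walk])
```

Under these assumptions the linear layout touches 1 line for the row walk but 16 for the column walk, while the Morton layout touches 4 lines in both directions. That is the trade-off in miniature: the tiled layout gives up some efficiency on the horizontal pass in exchange for a much cheaper vertical pass.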
2. Should I use a pixel shader or a compute shader?
I have seen the constant-time CS filtering algorithm, which is amazing. But I have also profiled some of my image filtering algorithms in both PS and CS versions, and found that PS can run much faster than CS when the filter kernel is small. I was also told that PS has access to some special hardware, not exposed to CS, that makes texture-related operations faster (which makes question 1 more interesting). So I think I need to know which kinds of tasks suit PS and which suit CS.
I know I should profile it myself and pick the faster one based on the results, but getting more advice before I start is always better :-) Also, my project will be targeting future GPUs, so I think these decisions should be based on an understanding of the advantages of Texture, Buffer, CS, and PS, rather than on blindly trying these combinations on current GPUs.
Thanks in advance.
Peng