So I’m hoping (lots of hand waving, I don’t really know yet...) that with the coherency of nearby pixels traversing similar paths through the BVH, the bandwidth consumption and cache misses will not be too much worse than with clustered shading.
I've done some testing with a quick BVH implementation to check the number of nodes read. I won't present my results, because the implementation is quick and dirty and would most likely not be representative of your work. Nevertheless, the question that arose was how many nodes you need to touch per pixel on average. E.g. the spheres in a sphere tree get quite big really fast, and all the spheres unrelated to a pixel's lights add overhead to every pixel, even unlit ones. So if a pixel is lit by 100 lights and you have an overhead of X nodes, that's quite cheap, but if a pixel is lit by none or only one light, X nodes is really expensive. A clustered/cell-based shader would fit more tightly around the variable number of lights influencing each pixel area. The question is: how high is X under different setups?
Just to measure this number, I would implement the same search algorithm from the shader on the CPU and do some rendering simulation to get quick statistics.
Ashaman73: I don't know how to post an article on here, but maybe I'll do that at some point, if they're interested in me doing that.
Check this out to get some info