[quote name='Jason Z' timestamp='1324146611' post='4894834']
[quote name='360GAMZ' timestamp='1324090742' post='4894682']
I'm forging ahead on my implementation of tile-based CS lighting. One thing I ran into is that since the mini-frustum vs. light culling that the threads do is in view space, my light data (position and direction) needs to be in view space, too. In my game, all lights are stored in world space, so I could simply transform them to view space on the CPU as they're being written to the StructuredBuffer. I'm not too excited about doing this, since our games tend to be CPU limited.
Why not convert the mini-frustums to world space instead? This would effectively require you to get the world space position and orientation of the camera, then you can generate your mini-frustums from that. That way your lights stay in world space, your mini-frustums are in world space, and no transformation is required on the CPU or GPU.
Would that work in your use case?
[/quote]
I think that should definitely work. It would require 6 transformations, though, instead of the 2 I'm currently doing: light to view space for culling, and pixel position to world space for the lighting calc. Alternatively, I could do the lighting calc in view space, but then I would have to transform the light to view space a second time, so it's a wash. I could store the transformed light for reuse in the lighting calc, but I believe 3 dot products are faster than a resource store + load.
[/quote]
Maybe I am not really understanding (sorry for beating a dead horse...) but if all of these are on your CPU side:
- Light data is in world space
- Frustum data is in view space
- Pixel position (in view space?)
- Lighting is carried out in view space
If all of that is true, then you should be able to convert the frustums to world space, reconstruct the world-space pixel position instead of the view-space position, and then carry out the lighting in world space. That would reduce the overall work needed on the GPU while minimizing the work needed on the CPU (the frustum data has to be generated on the CPU anyway). Am I seeing this correctly?
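
To make the idea concrete, here is a rough sketch (Python for readability, not HLSL) of moving a view-space frustum plane into world space and then testing a world-space light sphere against it. It assumes planes are stored as (normal, d) with dot(normal, p) + d = 0 and inward-facing normals, and that cam_rot is the camera's view-to-world rotation given as three rows. All names here are made up for illustration and are not from your code.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def rotate(rows, v):
    """Multiply a 3x3 matrix (given as three rows) by a vector."""
    return tuple(dot(r, v) for r in rows)


def plane_view_to_world(normal_v, d_v, cam_rot, cam_pos):
    """Re-express a view-space plane in world space.

    Assumes p_world = cam_rot * p_view + cam_pos, so the normal is rotated
    and the offset is shifted by the camera position along that normal.
    """
    normal_w = rotate(cam_rot, normal_v)
    d_w = d_v - dot(normal_w, cam_pos)
    return normal_w, d_w


def sphere_intersects_frustum(planes, center, radius):
    """Conservative test: reject only if the sphere is fully behind a plane."""
    return all(dot(n, center) + d >= -radius for n, d in planes)


# Camera at (2, 0, 0), rotated so that view-space +z points along world +x.
cam_rot = ((0.0, 0.0, 1.0),
           (0.0, 1.0, 0.0),
           (-1.0, 0.0, 0.0))
cam_pos = (2.0, 0.0, 0.0)

# View-space near plane at z = 1, normal pointing into the frustum.
near_w = plane_view_to_world((0.0, 0.0, 1.0), -1.0, cam_rot, cam_pos)

# World-space light spheres can now be tested directly, with no per-light transform.
light_kept = sphere_intersects_frustum([near_w], (2.5, 0.0, 0.0), 1.0)
light_culled = not sphere_intersects_frustum([near_w], (1.0, 0.0, 0.0), 1.0)
```

The transform is one rotation and one dot product per plane, so the cost scales with the number of tile planes rather than the number of lights. Whether that actually beats transforming the lights depends on your light and tile counts.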