I ran NVPerfHUD on the engine tonight, because I suspected some frame-rate dips I was seeing were due to poor frustum culling during shadow passes. Turns out I was right.
This showed up in the indoor dungeon level that I created this morning. It has no directional light, but around 8 point lights, and if shadows were being handled even for off-screen objects, that could be quite a hit.
At one point, just turning a bit in the first room, with only 2 point lights clearly visible, caused the frame rate to drop from 200 to 150 fps, with no visible change to the lighting on screen.
NVPerfHUD, btw, is a great free graphics performance and debugging tool, made by my good friend Raul Aguaviva at NVIDIA.
Here is a shot of perfhud, during the ambient/emissive pass, after normal map compositing has been performed :
Here is a shot showing a shadow render target for the main character from one of the point lights :
The single-step mode of perfhud lets you step through your draw calls one at a time, and shows you the current render target after each step. By running this, I was able to confirm that shadows were indeed being created for clearly off-screen objects.
Turns out that the code that was collecting shadows wasn't culling aggressively enough. Each chunk of my world is about 20x20 meters, and has ~1500 polygons in it. These chunks are drawn in order to receive shadows.
Each light has a bounding box representing its range, and each material chunk has a bounding box for each light touching it, containing tight bounds around all geometry lit by that light in the chunk.
The missing piece was a tight geometry bounds stored with the light. So each light now has an overall bounds, based on its position & range, as well as a tight geometry bounds, which is used for culling geometry for lighting and shadows. After adding that, converting my levels to understand the new light structure, then adding some code to cull world chunks before rendering shadowed objects, I eliminated the frame hitch, and brought the average frame rate up to 220 fps in that area.
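As a rough illustration of the culling described above, here is a minimal C++ sketch. Every name in it (Vec3, AABB, Frustum, Light, Chunk, CollectShadowChunks) is a hypothetical stand-in rather than the engine's actual code, and the frustum is assumed to be six inward-facing planes. The idea is to test the light's tight geometry bounds against the view first, and only then test individual chunks.

```cpp
#include <cstddef>
#include <vector>

// All names here are hypothetical stand-ins, not the engine's real types.
struct Vec3 { float x, y, z; };
struct AABB { Vec3 min, max; };

// Plane stored so that points on the inside satisfy dot(n, p) + d >= 0.
struct Plane { Vec3 n; float d; };
struct Frustum { Plane planes[6]; };

struct Light {
    AABB rangeBounds;     // loose box built from position and range
    AABB geometryBounds;  // tight box around the geometry the light actually touches
};

struct Chunk { AABB bounds; };

// True when the two boxes overlap on all three axes.
bool Intersects(const AABB& a, const AABB& b) {
    return a.min.x <= b.max.x && a.max.x >= b.min.x &&
           a.min.y <= b.max.y && a.max.y >= b.min.y &&
           a.min.z <= b.max.z && a.max.z >= b.min.z;
}

// Conservative test: the box is rejected only if it lies fully outside one plane.
bool BoxInFrustum(const Frustum& f, const AABB& b) {
    for (const Plane& p : f.planes) {
        // Corner of the box that is farthest along the plane normal.
        const Vec3 v = { p.n.x >= 0.0f ? b.max.x : b.min.x,
                         p.n.y >= 0.0f ? b.max.y : b.min.y,
                         p.n.z >= 0.0f ? b.max.z : b.min.z };
        if (p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d < 0.0f) return false;
    }
    return true;
}

// Returns indices of the chunks that need this light's shadows drawn on them.
std::vector<std::size_t> CollectShadowChunks(const Frustum& view,
                                             const Light& light,
                                             const std::vector<Chunk>& chunks) {
    std::vector<std::size_t> visible;
    // Early out: nothing this light touches is on screen, even if its
    // loose range box still overlaps the frustum.
    if (!BoxInFrustum(view, light.geometryBounds)) return visible;
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        if (Intersects(light.geometryBounds, chunks[i].bounds) &&
            BoxInFrustum(view, chunks[i].bounds)) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

The early out is what the loose range box can't give you: a light whose range pokes into the frustum, but whose lit geometry is all off screen, gets skipped entirely.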
There are still sections of the level where the frame rate dips to ~155, when there really are many close-by lights. Some of these aren't visible, but they definitely intersect the view frustum. Since I've been working with a top-down engine for so long, I never bothered with occlusion culling, so there's not an obvious way to fix this.
One approach would be to have the designers add anti-portals or occlusion triggers to the level, that could be tested against to cull more lights based on the camera's position & orientation.
Another approach would have the designers place a box around the light to indicate its potentially visible area. If the player was not in this special box, then the light would not be drawn. This way you could put the box on one side of a wall where you know the player can't see beyond it, even though the light would be in the view frustum.
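The visibility-box idea could look something like the following C++ sketch. This is just one possible shape for it; the type and function names (Vec3, Box, PlacedLight, ShouldDrawLight) are invented for illustration, and it assumes the designer-placed box is axis-aligned.

```cpp
// Hypothetical types for illustration only.
struct Vec3 { float x, y, z; };
struct Box  { Vec3 min, max; };

struct PlacedLight {
    Vec3  position;
    float range;
    Box   visibleFrom;  // designer-placed box: the light is drawn only while the player is inside
};

// True when the point lies inside the axis-aligned box (edges inclusive).
bool Contains(const Box& b, const Vec3& p) {
    return p.x >= b.min.x && p.x <= b.max.x &&
           p.y >= b.min.y && p.y <= b.max.y &&
           p.z >= b.min.z && p.z <= b.max.z;
}

// Skip the light whenever the player is outside its visibility box,
// regardless of whether the light intersects the view frustum.
bool ShouldDrawLight(const PlacedLight& light, const Vec3& playerPos) {
    return Contains(light.visibleFrom, playerPos);
}
```

This would run before the frustum test, so a light on the far side of a wall costs one point-in-box check per frame instead of a shadow pass.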
In other news, I adjusted the post-processing a bit, and re-enabled it. I found that squaring the bloom buffer before adding it back made things too blinky when the camera panned, and not thresholding at all didn't add enough hotness to the bright areas, so I did both and averaged them like so :
mad_d2 r0, t0, t0, t0
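For anyone who doesn't read pixel shader assembly: mad computes src0 * src1 + src2, and the _d2 modifier halves the result, so the line above works out to r0 = (t0 * t0 + t0) / 2. Here is the same arithmetic for a single channel in C++, assuming t0 holds the bloom sample in the 0 to 1 range (the function name is made up for illustration):

```cpp
// Average of the squared bloom value and the raw bloom value:
// (b * b + b) / 2, matching mad_d2 r0, t0, t0, t0 for one channel.
float BlendBloom(float b) {
    return (b * b + b) * 0.5f;
}
```

A mid-gray value of 0.5 comes out as 0.375, while full white stays at 1.0, so dim areas are pulled down somewhat without the hard falloff that squaring alone would give.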
Here is a shot of the post-processing in action :