SPU based Deferred Shading in Battlefield 3
* Previous DICE games were forward rendered, just a few lights. Mirror's Edge, which they SHOULD be making a sequel to, used pre-rendered radiosity lighting. Goal of Battlefield 3 was to really expand the lighting and materials, and DICE picked deferred.
* Their PS3 implementation uses 5-6 SPUs in parallel with the GPU and CPU.
* GPU does the initial G-buffer fill, and then passes it off for shading by the SPU. Still waiting to see what that means. Are they using SPUs as fragment shaders?
* The framebuffer is sliced into 64x64 tiles, use a simple shared incrementing counter for sync.
* SPUs compute 16 pixels at a time, in SoA format at full float precision
* The GPU is still busy on other shading while the SPUs eat through the deferred shading
* About 8 millis real time on 5 SPUs for deferred shading while GPU does other stuff, equals 40ms total max compute time contributed by the SPUs.
* Light culling is done in two stage tile hierarchy, after whole-camera and coarse Z in light volumes
* A branch is used to skip all 16 pixels if the attenuation makes them unlit. This is a net win despite the branching cost.
* The SPUs are very even pipe heavy due to all of the FPU action, so complex functions can be replaced with a look-up table using strictly odd-pipe instructions, can decrease 21 cycle functions to 4 in case of complex functions.
* There is still a pure GPU based implementation of the shading pipeline that maintain visual parity, in order to facilitate debugging and validation.
* SPU job code can be reloaded into the live game, enabling bug fixes with quick iteration.
* My good friend Steven Tovey gets a call out at the end of the talk! He loves SPUs, so if you meet him you might wanna wear a dust mask.