Jump to content

  • Log In with Google      Sign In   
  • Create Account

Interested in a FREE copy of HTML5 game maker Construct 2?

We'll be giving away three Personal Edition licences in next Tuesday's GDNet Direct email newsletter!

Sign up from the right-hand sidebar on our homepage and read Tuesday's newsletter for details!


We're also offering banner ads on our site from just $5! 1. Details HERE. 2. GDNet+ Subscriptions HERE. 3. Ad upload HERE.






SPU based Deferred Shading in Battlefield 3

Posted by Promit, in Programming 02 March 2011 · 956 views

Waiting right now for Christina Coffin's talk on SPU based deferred shading in Battlefield 3. I hear they use DX11 compute shader heavily on the PC side, so it'll be interesting to see how they leverage the SPUs.

* Previous DICE games were forward rendered, just a few lights. Mirror's Edge, which they SHOULD be making a sequel to, used pre-rendered radiosity lighting. Goal of Battlefield 3 was to really expand the lighting and materials, and DICE picked deferred.
* Their PS3 implementation uses 5-6 SPUs in parallel with the GPU and CPU.
* GPU does the initial G-buffer fill, and then passes it off for shading by the SPU. Still waiting to see what that means. Are they using SPUs as fragment shaders?
* The framebuffer is sliced into 64x64 tiles, use a simple shared incrementing counter for sync.
* SPUs compute 16 pixels at a time, in SoA format at full float precision
* The GPU is still busy on other shading while the SPUs eat through the deferred shading
* About 8 millis real time on 5 SPUs for deferred shading while GPU does other stuff, equals 40ms total max compute time contributed by the SPUs.
* Light culling is done in two stage tile hierarchy, after whole-camera and coarse Z in light volumes
* A branch is used to skip all 16 pixels if the attenuation makes them unlit. This is a net win despite the branching cost.
* The SPUs are very even pipe heavy due to all of the FPU action, so complex functions can be replaced with a look-up table using strictly odd-pipe instructions, can decrease 21 cycle functions to 4 in case of complex functions.
* There is still a pure GPU based implementation of the shading pipeline that maintain visual parity, in order to facilitate debugging and validation.
* SPU job code can be reloaded into the live game, enabling bug fixes with quick iteration.
* My good friend Steven Tovey gets a call out at the end of the talk! He loves SPUs, so if you meet him you might wanna wear a dust mask.




PARTNERS