Halifax2

Question about "SPU Usage" from "Deferred Rendering in Killzone 2" Presentation


I was reading through the Guerrilla Games presentation, Deferred Rendering in Killzone 2. Most of it is pretty easy to understand, but I can't work out what they are doing with their object rendering system. I'm just going to quote the specific slide (44):
Quote:
* Everything is data driven
  - No "virtual void Draw()" calls on objects
  - Objects store a decision-tree with DrawParts
  - DrawParts link shader, geometry, and flags
  - Decision tree used for LODs, etc.
* SPUs pull rendering data directly from objects
  - Traverse scenegraph to find objects
  - Process object's decision-tree to find DrawParts
  - Create displaylist from DrawParts
Little bits of this make sense to me, such as iterating over the list of registered objects, grabbing their data, evaluating it and finally adding it to a 'displaylist' to be processed later. But what exactly is the data that the decision-tree holds? Can anyone provide some pseudocode that implements what they speak of?

Maybe something like this:
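// Placeholder for whatever a renderable is (shader + geometry + flags).
struct RenderNode {};

class DecisionNode
{
public:
    // Stores a reference, so the referenced bool must outlive this node.
    DecisionNode( const bool& c, RenderNode* t, RenderNode* f )
        : pFalse(f), pTrue(t), condition(c) {}
    virtual ~DecisionNode() {}

    // Picks a child based on the current value of the condition.
    virtual RenderNode* GetChildNode()
    {
        return condition ? pTrue : pFalse;
    }
private:
    RenderNode* pFalse;
    RenderNode* pTrue;
    const bool& condition;
};

class Player
{
public:
    // lowLodMode is declared before node, so it is initialized first.
    Player() : lowLodMode(false), node(lowLodMode, &lowDetail, &highDetail) {}
    virtual ~Player() {}

    virtual DecisionNode& GetRenderDecisionTree() { return node; }
private:
    RenderNode lowDetail;
    RenderNode highDetail;

    bool lowLodMode;

    DecisionNode node;
};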

You create some kind of data-structures, probably with virtual functions (like my GetChildNode func above) that can choose between different renderables. Decision nodes could be used to select the right LOD model, decide whether to render a type of special-effect, etc...

Instead of each object having its own draw function where you make these decisions in code, you use these 'decision nodes' to describe the conditions under which a renderable should be drawn.

Ah, okay, thanks for the input Hodgman. That definitely makes sense, and is certainly very data-driven. I guess I overcomplicated the situation because I was imagining something like the decision trees used for AI. :D

Well, once again, thanks. I'm going to try to implement this in a little test tech-demo of sorts I guess.

It's worth pointing out that they *probably* aren't making much (any?) use of virtuals, and it's highly doubtful that they'd do anything clever like storing references to external bools. All of the data needed to evaluate decisions is almost certainly in a tight block of data coupled to the object itself. This is meant to run on SPU, after all...
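To make the "tight block of data" idea concrete, here is a minimal sketch of a decision tree stored as plain data with no virtuals. Everything in it (DrawPart, DecisionNode, ObjectRenderData, AppendDrawParts, MakeExampleObject, the distance-based LOD test, the array sizes) is an assumption made for illustration; the slides don't show Guerrilla's actual layout.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <vector>

// Every name here is hypothetical; the slides do not show the real layout.

// One drawable piece: ids into shader and geometry tables, plus flags.
struct DrawPart {
    std::uint16_t shaderId;
    std::uint16_t geometryId;
    std::uint32_t flags;
};

// Plain-data node. Children are array indices rather than pointers, so the
// tree stays valid when the whole block is copied somewhere else.
struct DecisionNode {
    float         maxDistance; // inner node: go near if distance < maxDistance
    std::uint16_t nearChild;   // inner node: index of the near child
    std::uint16_t farChild;    // inner node: index of the far child
    std::uint16_t firstPart;   // leaf: index of its first DrawPart
    std::uint16_t partCount;   // leaf: number of DrawParts to emit
    bool          isLeaf;
};

// Everything needed to evaluate one object's decisions, in one block.
struct ObjectRenderData {
    DecisionNode  nodes[4];    // nodes[0] is the root
    DrawPart      parts[8];
    std::uint16_t nodeCount;
};
static_assert(std::is_trivially_copyable<ObjectRenderData>::value,
              "must be copyable as raw bytes");

// Walks from the root to a leaf and appends that leaf's DrawParts.
void AppendDrawParts(const ObjectRenderData& obj, float distance,
                     std::vector<DrawPart>& displayList)
{
    assert(obj.nodeCount > 0);
    std::uint16_t i = 0;
    while (!obj.nodes[i].isLeaf) {
        const DecisionNode& n = obj.nodes[i];
        i = (distance < n.maxDistance) ? n.nearChild : n.farChild;
        assert(i < obj.nodeCount);
    }
    const DecisionNode& leaf = obj.nodes[i];
    for (std::uint16_t p = 0; p < leaf.partCount; ++p)
        displayList.push_back(obj.parts[leaf.firstPart + p]);
}

// Example object: two DrawParts up close, one low-detail part far away.
ObjectRenderData MakeExampleObject()
{
    ObjectRenderData obj = {};
    obj.nodeCount = 3;
    obj.nodes[0] = {50.0f, 1, 2, 0, 0, false}; // root: LOD switch at 50 units
    obj.nodes[1] = {0.0f, 0, 0, 0, 2, true};   // near leaf: parts 0 and 1
    obj.nodes[2] = {0.0f, 0, 0, 2, 1, true};   // far leaf: part 2
    obj.parts[0] = {1, 100, 0};
    obj.parts[1] = {2, 101, 0};
    obj.parts[2] = {1, 300, 0};
    return obj;
}
```

Calling AppendDrawParts(obj, 10.0f, list) on the example object appends the two near parts, and a distance of 200.0f appends the single far part. Because the struct holds no pointers or vtable, it could in principle be copied to another memory space as raw bytes, which is the property the post is getting at.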

Quote:
Original post by osmanb
It's worth pointing out that they *probably* aren't making much (any?) use of virtuals, and it's highly doubtful that they'd do anything clever like storing references to external bools.

Yeah, I have to agree, they are probably bagging the vtable.

Interesting link.

Although it's worth pointing out that SPUs have very little scratchpad memory (256 KB of local store each) and won't be doing things like large tree traversal within the memory they have (they are also not very suited to random access memory pattern tasks). It's likely that the PPU does the actual traversal, or that they have a nice DMA-friendly tree structure that they can process linearly without needing to perform random memory accesses.
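One way a tree can be made friendly to linear processing is to store it in depth-first order with a per-node subtree size, so a rejected subtree is skipped by jumping forward in the array. The sketch below only illustrates that general technique; FlatNode, its fields, and CollectVisible are invented names, and the `visible` flag stands in for a real culling test. Nothing here is taken from the Killzone 2 code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout: the scenegraph stored in depth-first (pre-order)
// order, so every subtree is one contiguous run of array entries.
struct FlatNode {
    std::uint32_t objectId;    // which object this node refers to
    std::uint32_t subtreeSize; // entries in this subtree, including this one
    bool          visible;     // stand-in for a real culling test
};

// Collects the ids of visible objects in a single forward pass. A hidden
// node hides its whole subtree, which is skipped by advancing the index.
std::vector<std::uint32_t> CollectVisible(const std::vector<FlatNode>& nodes)
{
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < nodes.size()) {
        const FlatNode& n = nodes[i];
        assert(n.subtreeSize >= 1 && i + n.subtreeSize <= nodes.size());
        if (n.visible) {
            out.push_back(n.objectId);
            i += 1;             // descend: the first child is the next entry
        } else {
            i += n.subtreeSize; // skip this node and all its descendants
        }
    }
    return out;
}
```

The index only ever moves forward, so on an SPU the array could presumably be streamed through local store in fixed-size chunks rather than chased through pointers.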

They won't be using virtual functions or playing with vtables; the code will be compact, simple, and have as few branches as possible. Think of SPUs as sort of like shaders with local memory, bidirectional DMA, and seriously kickass math performance.

Cache and DMA are critical to SPUs. They are very, very powerful, but it's tricky to keep them fed with data to process.

Anyway, it's mostly academic unless you actually have a PS3 to play with. The things to take away from the article are that they hand most of the processing for render list generation off from the main CPU (a good thing to copy, especially on multicore CPUs) and that they are doing a lot of processing, both in shaders and on the CPU. It'd be interesting to see actual metrics on time and performance of such a solution. It sure does look good when it's all said and done, but damn, that's one complex renderer :)



Quote:
Original post by Matt_D
they are also not very suited to random access memory pattern tasks
True in general, but DMA lists are available and do allow it when it's needed. I implemented a radix sort that works on large data sets with random access and can keep it going. I had to pad the individual DMA transfers to 16 bytes, though.
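For readers who haven't written one, here is a portable sketch of an LSD radix sort over records padded to 16 bytes. It is not the SPU implementation described above and uses no DMA at all; SortRecord, its fields, and RadixSort are hypothetical names. It is only meant to show where the random access comes from: the scatter step writes each record to a data-dependent position.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record, padded so each element is exactly 16 bytes.
struct SortRecord {
    std::uint32_t key;
    std::uint32_t payload;
    std::uint8_t  pad[8];
};
static_assert(sizeof(SortRecord) == 16, "record must be 16 bytes");

// LSD radix sort on 32-bit keys, 8 bits per pass (4 passes, stable).
void RadixSort(std::vector<SortRecord>& data)
{
    std::vector<SortRecord> temp(data.size());
    for (int shift = 0; shift < 32; shift += 8) {
        // Histogram, offset by one slot so the prefix sum below turns
        // count[d] into the start position for digit d.
        std::size_t count[257] = {0};
        for (const SortRecord& r : data)
            ++count[((r.key >> shift) & 0xFF) + 1];
        for (int b = 0; b < 256; ++b)
            count[b + 1] += count[b];
        // Scatter: each record goes to a data-dependent slot. This is the
        // random-access part of the algorithm.
        for (const SortRecord& r : data)
            temp[count[(r.key >> shift) & 0xFF]++] = r;
        data.swap(temp);
    }
#ifndef NDEBUG
    // Debug-only postcondition: keys are in non-decreasing order.
    for (std::size_t i = 1; i < data.size(); ++i)
        assert(data[i - 1].key <= data[i].key);
#endif
}
```

On an SPU the scatter writes would have to go out to main memory through DMA, which is presumably where the DMA lists and the 16-byte padding mentioned above come in.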

Quote:
Anyway, it's mostly academic unless you actually have a PS3 to play with. The things to take away from the article are that they hand most of the processing for render list generation off from the main CPU (a good thing to copy, especially on multicore CPUs) and that they are doing a lot of processing, both in shaders and on the CPU. It'd be interesting to see actual metrics on time and performance of such a solution. It sure does look good when it's all said and done, but damn, that's one complex renderer :)


I think the evolution of every PS3 engine goes like this:

1) Implement renderer on PS3 really quickly to show producers that you got it working
2) Profile - RSX is too slow
3) Move subset of render pipeline to SPU
4) Profile - RSX is too slow
5) Move subset of render pipeline to SPU
..
..
..
9) PPU is too slow
10) Move subset of simulation to SPU
..
..

A few years later you have a complicated ping-pong of data structures between 5 SPUs, a PPU, the RSX, and probably some random middleware on that 6th SPU.

-= Dave

Yeah, Matt_D, I'm actually working with a PS3 Linux setup and I was interested in trying to implement something similar to what Guerrilla Games did, as a little technical demonstration.

Quote:
Original post by Halifax2
Yeah, Matt_D, I'm actually working with a PS3 Linux setup and I was interested in trying to implement something similar to what Guerrilla Games did, as a little technical demonstration.


Except the PS3 Linux side of things doesn't allow for shaders, so you can only do half the job.

Quote:
Original post by David Neubelt
A few years later you have a complicated ping-pong of data structures between 5 SPUs, a PPU, the RSX, and probably some random middleware on that 6th SPU.

-= Dave


I know exactly what you mean :) I've seen it happen a few times.

Luckily I've also seen stuff that was well designed from the get-go. It tends to be a lot more stable, has fewer bugs, and is far easier to maintain than some overly complex SPU ping-pong :)

Nice work on the radix sort :) What kind of throughput were you getting?
