Pushing or pulling renderables?

Started by
7 comments, last by Richy2k 15 years, 4 months ago
I'm considering how to refactor my renderer, and I'm wondering which of the following options to go with. Assume all textures, materials and meshes are loaded and that the renderer can access them through a Renderable object.

1. Pulling: Whenever a game object is created or destroyed I modify a global list controlled by the renderer. Each frame the renderer goes through this list (which will be sorted by state, etc.) and queries each object asking if it needs to be rendered (is visible), and if so, renders it. The only mutations to the list happen when objects are created and destroyed, but the renderer will need to query each object for visibility. Basically a sorted rendergraph is maintained alongside the scenegraph.

2. Pushing: Each frame the program goes through the scenegraph and, for each object that is visible, creates a Renderable and submits it to the renderer. The renderer will of course sort these as they come in, and can then draw the entire list without needing to check anything. At the end of the frame the list is discarded.

I feel the second option is much cleaner, but I dislike the fact that the sorting by state will happen each frame, rather than when objects are created. Is anyone using option 2, or would anyone like to comment on this?
Definitely 2, at least assuming moderately complex 3D scenes with cameras that can significantly change what gets rendered each frame - because what you render each frame will change, it makes sense that you'll have to re-sort your render queues each frame as well. For example, transparent objects will certainly have to be re-sorted every frame no matter what. Depending on if/how you do batching/instancing, this will also be affected by which objects are actually visible.
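To illustrate the point about transparent objects, here is a minimal sketch of a per-frame back-to-front sort. The `Vec3` and `TransparentItem` types and the function names are made up for this example, and sorting by squared distance to the camera position is only one common approximation, not necessarily what any particular engine does.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical minimal vector type for the example.
struct Vec3 { float x, y, z; };

// Hypothetical transparent renderable: an id plus a world-space position.
struct TransparentItem {
    int id;
    Vec3 position;
};

inline float distanceSquared(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Orders items farthest-first relative to the camera. The result depends on
// the camera position, which is why it has to be redone when the camera moves.
inline void sortBackToFront(std::vector<TransparentItem>& items, const Vec3& camera) {
    std::sort(items.begin(), items.end(),
              [&camera](const TransparentItem& a, const TransparentItem& b) {
                  return distanceSquared(a.position, camera) >
                         distanceSquared(b.position, camera);
              });
}
```

The same three items come out in a different order as soon as the camera moves past them, so an order computed at object creation time would not stay valid.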

Now, if you had a fixed camera and everything you ever wanted to render was already in view, that'd be a different story, and I'd probably go with 1. Or, if your object count wasn't that high (e.g., small scenes), then 1 could also be feasible.

Ultimately, I'm assuming the operation of "going through the scene graph" doesn't actually involve touching each object in the world; this should be traversal of some kind of spatial partitioning structure. If it does involve simple list traversal, then I'd maybe go back to option 1.
The second option is more commonplace, but the first option, and pulling operations in general, are not without their merits. Chiefly, they lend themselves better to multi-threading, since such designs promote loose coupling. I'd love to hear other people's opinions on this design decision, especially with regard to its impact on threading.
I use the first approach for my GUI renderer. I have a scenegraph that contains all GUI nodes, but it is only updated if you destroy/create or move a node (e.g. setting a new parent). For rendering I use an alpha/depth sorted renderstack. Thus if you modify the properties (depth, alpha) of a GUI element, only the renderstack is re-sorted. I chose this approach for my GUI renderer because I wanted to optimize for rendering and not for state changes / node operations. A GUI is quite static most of the time; the only properties that change are visibility and z order. Creating/deleting nodes takes a bit longer though, as you have to update two data structures.
I guess you have to find out what scenario fits your needs best.
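A rough sketch of the "re-sort only when a sort property changes" idea, assuming a dirty flag. `GuiElement`, `GuiRenderStack` and the `sortCount` counter are names invented for this example (the counter only exists so the behaviour can be observed), and only depth is used as the sort key here.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical GUI element; only the fields needed for ordering are shown.
struct GuiElement {
    int id;
    int depth;     // lower depth is drawn first
    bool visible;
};

class GuiRenderStack {
public:
    void add(const GuiElement& e) {
        elements_.push_back(e);
        dirty_ = true;  // membership changed, so the order must be rebuilt
    }

    void setDepth(int id, int depth) {
        for (GuiElement& e : elements_) {
            if (e.id == id && e.depth != depth) {
                e.depth = depth;
                dirty_ = true;  // a sort key changed
            }
        }
    }

    // Visibility does not affect ordering, so no re-sort is requested.
    void setVisible(int id, bool visible) {
        for (GuiElement& e : elements_) {
            if (e.id == id) e.visible = visible;
        }
    }

    // Returns the ids of visible elements in draw order, sorting only if needed.
    std::vector<int> drawOrder() {
        if (dirty_) {
            std::stable_sort(elements_.begin(), elements_.end(),
                             [](const GuiElement& a, const GuiElement& b) {
                                 return a.depth < b.depth;
                             });
            dirty_ = false;
            ++sortCount_;
        }
        std::vector<int> ids;
        for (const GuiElement& e : elements_) {
            if (e.visible) ids.push_back(e.id);
        }
        return ids;
    }

    int sortCount() const { return sortCount_; }

private:
    std::vector<GuiElement> elements_;
    bool dirty_ = false;
    int sortCount_ = 0;
};
```

Frames in which nothing relevant changed just walk the already sorted stack, which is the case a mostly static GUI hits nearly all the time.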

Edit:
I almost forgot to mention that I use the second approach for my octree renderer. The simple reason is that the octree is spatially sorted, hence I need to walk the graph every frame anyway to find out what to draw. The GUI renderer on the other hand has no spatial sorting, thus approach 1 is much faster.
Quote:Original post by Ashkan
Chiefly, they lend themselves better to multi-threading, since such designs promote loose coupling. I'd love to hear other people's opinions on this design decision, especially with regard to its impact on threading.

I can't understand why keeping a global list should be better for multithreading.

A globally accessible list looks like a dead end to me. Since it represents a unique endpoint, concurrency problems may arise.
Quote:Original post by undead
Quote:Original post by Ashkan
Chiefly, they lend themselves better to multi-threading, since such designs promote loose coupling. I'd love to hear other people's opinions on this design decision, especially with regard to its impact on threading.

I can't understand why keeping a global list should be better for multithreading.

A globally accessible list looks like a dead end to me. Since it represents a unique endpoint, concurrency problems may arise.


Not to mention that the OP's question, IMO, isn't really about "pushing" or "pulling", despite the use of those words to describe each operation. Pushing/pulling is just a question of who's doing what; if the renderer were the one querying the scene graph, then it would be pulling and not pushing.

The more fundamental issue here is whether to traverse a single list every frame containing every single potential renderable, or whether to try to apply some sort of optimization structure.

Either approach could be multi-threaded just as easily, and both will be forced to deal with how to handle shared access to the device. The fact that a list is global isn't necessarily an issue, it's just a matter of ensuring no one else is writing to that structure while it's being read from. But this issue would arise for a shared list as well as a shared scene-graph of some sort.
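As a minimal sketch of "no one writes while it's being read", here is a shared list guarded by a single mutex, where the render side takes a copy under the lock and then works on that copy. `SharedRenderList` and its methods are invented for the example, and a plain `std::mutex` is just the simplest choice; a real engine might prefer double buffering or a reader/writer lock.

```cpp
#include <algorithm>
#include <mutex>
#include <vector>

// Hypothetical shared list of renderable ids, guarded by one mutex.
class SharedRenderList {
public:
    void add(int id) {
        std::lock_guard<std::mutex> lock(mutex_);
        ids_.push_back(id);
    }

    void remove(int id) {
        std::lock_guard<std::mutex> lock(mutex_);
        ids_.erase(std::remove(ids_.begin(), ids_.end(), id), ids_.end());
    }

    // The render thread copies the list while holding the lock, then draws
    // from the copy, so writers are only blocked for the duration of the copy.
    std::vector<int> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return ids_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<int> ids_;
};
```

The same guard would be needed around a shared scene graph, which is the point being made above: the locking problem does not depend on whether the shared structure is a list or a graph.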
Quote:Original post by emeyex
Not to mention that the OP's question, IMO, isn't really about "pushing" or "pulling", despite the use of those words to describe each operation. Pushing/pulling is just a question of who's doing what; if the renderer were the one querying the scene graph, then it would be pulling and not pushing.

Correct me if I'm wrong, as I'm not that competent on multithreading issues, but there's a coherency problem with a global (I mean unique) pulled list.

The OP states that when creating/deleting objects he invokes add/remove on this unique list. In general you need to read data from a node to correctly render an entity, otherwise there's no need to have a scene graph, because all your entities can be stored in a list. IMHO the scene graph should be responsible for invoking add/remove (i.e. you add an entity to the scene graph, and the scene graph invokes renderer->add(pEntity);). Otherwise nobody's going to guarantee the coherence between the scene graph and the renderable list. This makes pulling closer to pushing.
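A small sketch of the scene graph owning the add/remove calls, so the two structures cannot drift apart. Only `renderer->add(pEntity)` comes from the post; `Entity`, `SceneGraph`, `addEntity`, `removeEntity`, `count` and `contains` are names assumed for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical entity; a real one would carry transform, mesh, material, etc.
struct Entity { int id; };

class Renderer {
public:
    void add(Entity* e) { list_.push_back(e); }
    void remove(Entity* e) {
        list_.erase(std::remove(list_.begin(), list_.end(), e), list_.end());
    }
    bool contains(const Entity* e) const {
        return std::find(list_.begin(), list_.end(), e) != list_.end();
    }
    std::size_t count() const { return list_.size(); }

private:
    std::vector<Entity*> list_;  // non-owning; the caller keeps entities alive
};

// The scene graph is the only code that touches the renderer's list, so every
// graph mutation is mirrored there.
class SceneGraph {
public:
    explicit SceneGraph(Renderer& renderer) : renderer_(renderer) {}

    void addEntity(Entity* e) {
        nodes_.push_back(e);
        renderer_.add(e);
    }

    void removeEntity(Entity* e) {
        nodes_.erase(std::remove(nodes_.begin(), nodes_.end(), e), nodes_.end());
        renderer_.remove(e);
    }

    std::size_t count() const { return nodes_.size(); }

private:
    Renderer& renderer_;
    std::vector<Entity*> nodes_;  // flat list standing in for a real hierarchy
};
```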

Quote:
The more fundamental issue here is whether to traverse a single list every frame containing every single potential renderable, or whether to try to apply some sort of optimization structure.

I agree, but the OP's design is different. He states "the renderer queries each object asking if it needs to be rendered (is visible)". Of course, this is closely linked with my previous point: since the objects need to know if they are visible, it is required to maintain coherency between the graph and the unique list. Even if the OP believed coherency isn't needed, he would have to inform objects about their visibility, regardless of the fact that they could be part of the render list.

The need for the scene graph to guarantee coherence and to perform operations on every renderable makes me wonder why the OP should keep a unique list, since he can just add renderables to the renderer every frame.

Quote:
Either approach could be multi-threaded just as easily, and both will be forced to deal with how to handle shared access to the device. The fact that a list is global isn't necessarily an issue, it's just a matter of ensuring no one else is writing to that structure while it's being read from. But this issue would arise for a shared list as well as a shared scene-graph of some sort.

The fact that it's global isn't an issue per se, but if by "global list" the OP means that it's unique, this makes the entire mechanism less flexible than a pushing one.

This is not the DirectX forum, but I wonder if multiple lists could help take advantage of DX11 deferred contexts.
Thank you all for your replies, they've made me consider exactly what I was thinking when I wrote the post. I'll expound on that first.

I've never been a fan of a single hierarchy to represent the world for all systems. I've always felt that each system (render, physics, logic) should have its own hierarchy of the data, such that the data is sorted in an optimal way for each subsystem. A component system works well with this (although it isn't really relevant to my original question).

Thus, in the first option I was thinking of having two tree structures: one is the regular scenegraph with nodes and accumulation of transformations, and the second is a rendergraph, which contains the same objects but sorted by state and material, such that the renderer doesn't need to do any sorting when it comes to rendering and only needs to traverse the rendergraph. Since all the objects in the world would be in this graph (they are added and removed at construction and destruction), and there is no spatial connection between the nodes, the renderer would have to traverse all the nodes and examine each node to see if it was visible or not (this would be a simple flag which is set from the spatially coherent scenegraph).
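A minimal sketch of that first option, flattened to a sorted list rather than a tree to keep it short. `RenderEntry`, `PersistentRenderList` and the single `materialId` sort key are assumptions for the example; a real rendergraph would sort on more state than this.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical entry: the object, its sort key, and the visibility flag that
// the spatial scenegraph would set each frame.
struct RenderEntry {
    int objectId;
    unsigned materialId;
    bool visible;
};

class PersistentRenderList {
public:
    // Inserts at the sorted position, so no per-frame sort is needed.
    void add(int objectId, unsigned materialId) {
        const RenderEntry entry{objectId, materialId, false};
        auto pos = std::upper_bound(
            entries_.begin(), entries_.end(), entry,
            [](const RenderEntry& a, const RenderEntry& b) {
                return a.materialId < b.materialId;
            });
        entries_.insert(pos, entry);
    }

    void remove(int objectId) {
        entries_.erase(std::remove_if(entries_.begin(), entries_.end(),
                                      [objectId](const RenderEntry& e) {
                                          return e.objectId == objectId;
                                      }),
                       entries_.end());
    }

    void setVisible(int objectId, bool visible) {
        for (RenderEntry& e : entries_) {
            if (e.objectId == objectId) e.visible = visible;
        }
    }

    // Walks every entry in the world and skips the ones flagged invisible.
    // Returns the ids that would be drawn, in draw order.
    std::vector<int> draw() const {
        std::vector<int> drawn;
        for (const RenderEntry& e : entries_) {
            if (e.visible) drawn.push_back(e.objectId);
        }
        return drawn;
    }

private:
    std::vector<RenderEntry> entries_;
};
```

The cost that stands out is in `draw()`: it is linear in the number of objects in the world, not in the number of visible ones.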

In the second option there is no fixed rendergraph; rather, one is created each frame as we traverse the spatial hierarchy to determine visibility. After the entire tree is traversed, the resulting structure would be similar to the rendergraph in option 1, except that only visible nodes are contained in it. Once this hierarchy has been rendered, it is discarded at the end of the frame.
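And a comparable sketch of the second option, again flattened to a list. `Renderable`, `PushRenderer`, `submit` and `flush` are invented names, and this version sorts once at the end of the frame instead of keeping the queue sorted as submissions arrive, which is a simplification.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-draw record; the fields are assumptions for illustration.
struct Renderable {
    std::uint32_t materialId;  // stands in for "state" (shader, textures, ...)
    std::uint32_t meshId;
};

class PushRenderer {
public:
    // Called by the visibility traversal for each object found to be visible.
    void submit(const Renderable& r) { queue_.push_back(r); }

    // Sorts this frame's submissions by state, walks them in order, then
    // discards the list. Returns the mesh ids in draw order for inspection.
    std::vector<std::uint32_t> flush() {
        std::stable_sort(queue_.begin(), queue_.end(),
                         [](const Renderable& a, const Renderable& b) {
                             return a.materialId < b.materialId;
                         });
        std::vector<std::uint32_t> drawn;
        drawn.reserve(queue_.size());
        for (const Renderable& r : queue_) {
            drawn.push_back(r.meshId);  // a real renderer would issue the draw here
        }
        queue_.clear();
        return drawn;
    }

    std::size_t pending() const { return queue_.size(); }

private:
    std::vector<Renderable> queue_;
};
```

Here the sort runs every frame, but only over what was actually submitted as visible.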

Quote:emeyex
Definitely 2, at least assuming moderately complex 3D scenes with cameras that can significantly change what gets rendered each frame - because what you render each frame will change, it makes sense that you'll have to re-sort your render queues each frame as well. For example, transparent objects will certainly have to be re-sorted every frame no matter what. Depending on if/how you do batching/instancing, this will also be affected by which objects are actually visible.

This I feel is quite incorrect. How many objects do you plan on adding each frame? In a regular FPS you really don't add all that many objects. Hence the reason I'm even considering option 1.


Quote:Original post by emeyex
The more fundamental issue here is whether to traverse a single list every frame containing every single potential renderable, or whether to try to apply some sort of optimization structure.

I think this comment gets to the crux of the matter. With option 1 I need to examine each and every object in the world. With option 2 I can discard non-visible objects in groups depending on the spatial hierarchy.
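To make "discard non-visible objects in groups" concrete, here is a rough sketch of hierarchical culling. Everything in it is assumed for the example: a 2D `Aabb` stands in for real bounds, a rectangle stands in for the view frustum, objects are accepted as soon as their node overlaps the view, and `SpatialNode` holding a `std::vector<SpatialNode>` relies on C++17.

```cpp
#include <vector>

// Hypothetical axis-aligned box, 2D for brevity.
struct Aabb { float minX, minY, maxX, maxY; };

inline bool overlaps(const Aabb& a, const Aabb& b) {
    return a.minX <= b.maxX && a.maxX >= b.minX &&
           a.minY <= b.maxY && a.maxY >= b.minY;
}

// Hypothetical node of a spatial hierarchy (quadtree, octree, BVH, ...).
struct SpatialNode {
    Aabb bounds;                        // encloses everything beneath this node
    std::vector<int> objectIds;         // objects stored directly at this node
    std::vector<SpatialNode> children;  // child nodes (requires C++17)
};

// Collects the ids of objects in nodes that overlap the view. A node that
// fails the test is dropped together with its whole subtree, so its objects
// are never examined individually. nodesTested counts the overlap tests made.
inline void collectVisible(const SpatialNode& node, const Aabb& view,
                           std::vector<int>& visibleIds, int& nodesTested) {
    ++nodesTested;
    if (!overlaps(node.bounds, view)) {
        return;
    }
    visibleIds.insert(visibleIds.end(), node.objectIds.begin(), node.objectIds.end());
    for (const SpatialNode& child : node.children) {
        collectVisible(child, view, visibleIds, nodesTested);
    }
}
```

With a five-node tree where one half of the world is off screen, only three nodes get tested, while a flat list would have to check every object.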

Option 2 describes my Renderer. However, Option 1 describes my Scenegraph + UI system.

The Renderer is simply a bunch of addWhateverRenderableType calls; each one copies over the states required for rendering and shoves them into a list to be performed later. It's essentially a thin layer over the underlying API which also deals with batching and sorting, i.e. where possible, multiple 2D renderables will be merged together into one draw call, and fewer states will be set if 3D renderables share some data. When I come to do the rendering multithreaded, I'll simply keep multiple lists of renderables, and render the most recently added list.
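As an illustration of the 2D merging, here is a minimal sketch that folds consecutive renderables sharing a texture into a single draw call. `Sprite`, `DrawCall` and `batch` are names made up for the example, and using the texture as the only merge criterion is an assumption; the actual renderer described above may compare more state.

```cpp
#include <vector>

// Hypothetical 2D renderable as submitted to the renderer.
struct Sprite {
    int textureId;
    int quadCount;
};

// Hypothetical merged draw call.
struct DrawCall {
    int textureId;
    int quadCount;
};

// Merges neighbouring sprites that use the same texture. Only adjacent
// entries are merged, so the submission order (and thus 2D layering) is kept.
inline std::vector<DrawCall> batch(const std::vector<Sprite>& sprites) {
    std::vector<DrawCall> calls;
    for (const Sprite& s : sprites) {
        if (!calls.empty() && calls.back().textureId == s.textureId) {
            calls.back().quadCount += s.quadCount;
        } else {
            calls.push_back({s.textureId, s.quadCount});
        }
    }
    return calls;
}
```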

The Scenegraph has all game objects added to it, and when instructed to perform a render with a specified viewport - it builds up 3 lists - front-back, back-front, and unsorted - all objects in these lists are classed as visible. Next, it calculates what lights affect them, and what projectors affect them. After this process it goes over all objects in the lists and calls their render function.

My UI system is very similar to the Scenegraph: add objects to it, and it does the rest. The widget system is set up to increment the sort order of all 2D renderables to be added to the renderer, so it's automatically sorted correctly.

From the game programming side, it's set up to pull: all you need to do is use the higher-level constructs, and you rarely need to touch the renderer.
Adventures of a Pro & Hobby Games Programmer - http://neilo-gd.blogspot.com/
Twitter - http://twitter.com/neilogd

This topic is closed to new replies.
