Sign in to follow this  
fareed_d

SceneGraph/Transformations question

Recommended Posts

Hi guys, I've been browsing these forums for quite a while now, but registered this account because I need to start asking some questions, if you'd be so kind! A little bit of back-story: I'm implementing a 2D scene graph. At the moment, I have transformation nodes, and geometry nodes descending from a single abstract parent class, Node_Component. Obviously, this will be expanded with more subclasses in the future, but at the moment I am just trying to get the basic structure sorted. Transformation nodes have a vector of child node_components (these can be transformation or geometry nodes). At the moment, rendering is done like this: when a new geometry object is attached to a node to be rendered, it pushes its set of vertices to a static std::list in the Geometry class, and is handed a pointer to its iterator within that list. I am using a list because when geometry is deleted, the pointers to other Geometry objects are not invalidated. When a transformation node is transformed, it updates all its children, so all attached Geometry nodes will get their new vertex co-ordinates from the respective transformation matrices, and use their pointer to update their vertices in the std::list. Once all transformations are complete, the application copies the list into an std::vector so that it is in a format that can be rendered using a vertex array (I'll be using a VBO when i get round to it) by openGL. Now, this is fine. The only nodes that are updated are the ones that need updating, increasing performance. My problems, and therefore my questions, are as follows: First question: How can I increase performance when a lot of objects are being updated? When I have a large amount of objects moving (my test is 1 geometry node for each of 10,000 transformation nodes attached to one rotating node, causing each of 20,000 nodes to be updated onscreen per frame), the performance falters significantly; is there any way to alleviate this? Second question (the important one!): Offscreen transformations take just as long as onscreen transformations! If i have a level in a game with 100,000 transformation nodes, but only 5 are onscreen, the other 999,995 are still updated each frame for the transformations! Is there any way I can alleviate this? Thanks for reading this wall of text, Fareed

Share this post


Link to post
Share on other sites
The method the OP describes is far from being efficient (if I understand it correctly). Never transform the vertices directly if there is not a really good reason for. Instead, use the Transformation nodes to build up a local-to-global transformation matrix, activate the correct matrix (e.g. using glLoadMatrix(...)), and render the geometry then as is. In that way you let OpenGL transform the particular vertices (what it does anyway) and save both performance and memory. Moreover, since the geometry is (often) not touched by you, you can push it as static geometry onto the graphics card.

Share this post


Link to post
Share on other sites
Quote:
Original post by fareed_d
First question: How can I increase performance when a lot of objects are being updated? When I have a large amount of objects moving (my test is 1 geometry node for each of 10,000 transformation nodes attached to one rotating node, causing each of 20,000 nodes to be updated onscreen per frame), the performance falters significantly; is there any way to alleviate this?

Second question (the important one!): Offscreen transformations take just as long as onscreen transformations! If i have a level in a game with 100,000 transformation nodes, but only 5 are onscreen, the other 999,995 are still updated each frame for the transformations! Is there any way I can alleviate this?


The described scenarios seem me somewhat artificial. Why the hell should I have so much transformations for so few geometry? Okay, you said it is "a test", but for what is such a test good for if reality will never looks like?

However, you suffer from the belief that a scene graph is the top of possibilities. You must notice that a scene graph cannot fulfil all senseful tasks at once. Therefore it is the case that nowadays a scene graph is seen just as one management structure besides others, more specialized ones. E.g. a spatial structure like a quadtree (speaking of 2D, right?) may be used to determine faster what geometry nodes are visible at all (keywords "visibility culling", "frustum culling", ...). A dependency graph can be used to manage the transformation nodes, perhaps with "dirty flag" propagation to perform re-calculations only when needed. Other structures may be applied as well.

Furthermore it may be that transformations, although expressed by one node per elemental transformation, can be joined to a more complex one. E.g. using X3D's Transform node instead of having a dozen single nodes in a chain.

Share this post


Link to post
Share on other sites
Thanks for your reply, I'll be looking into the stuff you mentioned!

Quote:
Original post by haegarr
The described scenarios seem me somewhat artificial. Why the hell should I have so much transformations for so few geometry?


I said 1 geometry node PER transformation node; 10,000 geometry nodes, 10,000 transformation nodes.

Quote:
Original post by haegarr
The method the OP describes is far from being efficient (if I understand it correctly). Never transform the vertices directly if there is not a really good reason for. Instead, use the Transformation nodes to build up a local-to-global transformation matrix, activate the correct matrix (e.g. using glLoadMatrix(...)), and render the geometry then as is. In that way you let OpenGL transform the particular vertices (what it does anyway) and save both performance and memory. Moreover, since the geometry is (often) not touched by you, you can push it as static geometry onto the graphics card.


transformation nodes DO build up a local to global matrix. When a geometry node is updated, it uses its parent's matrix and glMultMatrixf to get its world coords, which are then pushed to the vertex array. Rendering the geometry there and then with immediate mode would require every node to be rendered in that way.


The main thing that I wanted to know really was if there was a way to speed up/eliminate the need to update transformations that are happening offscreen.

thanks again.

Share this post


Link to post
Share on other sites
Well, this
Quote:
Original post by fareed_d
When a transformation node is transformed, it updates all its children, so all attached Geometry nodes will get their new vertex co-ordinates from the respective transformation matrices, and use their pointer to update their vertices in the std::list.

Once all transformations are complete, the application copies the list into an std::vector so that it is in a format that can be rendered using a vertex array...
sounds to me that not OpenGL but your application is doing the transformation on vertices! But perhaps I'm wrong.

Quote:
Original post by fareed_d
transformation nodes DO build up a local to global matrix. When a geometry node is updated, it uses its parent's matrix and glMultMatrixf to get its world coords, which are then pushed to the vertex array. Rendering the geometry there and then with immediate mode would require every node to be rendered in that way.
glMultMatrix and similar are done by the driver in software. Hence you have no real advantage using them if parts of the tree are not changing their transformations. You should instead use a dependency graph and calculate the local-to-global matrices manually. Store the respective matrix with the belonging node to implement a kind of cache.

Share this post


Link to post
Share on other sites
Quote:

sounds to me that not OpenGL but your application is doing the transformation on vertices! But perhaps I'm wrong.


What are you suggesting? Is there a way of getting glVertexPointer to take vertices which have not been transformed into global co-ordinates?

Quote:

glMultMatrix and similar are done by the driver in software. Hence you have no real advantage using them if parts of the tree are not changing their transformations. You should instead use a dependency graph and calculate the local-to-global matrices manually. Store the respective matrix with the belonging node to implement a kind of cache.


this is what i mean by "its parent's matrix"; ill paste the code:


void Node::update()
{
glLoadMatrixf(mParent->mMatrix);
glTranslatef(mPosition.x, mPosition.y,0);
glRotatef(mPosition.r,0,0,-1);
glGetFloatv(GL_MODELVIEW_MATRIX,mMatrix);

std::vector<Node_Component*>::iterator it;
for (it = mNodes.begin(); it != mNodes.end(); it++)
(*it)->update();
}


here, mMatrix is the "respective matrix" that is stored with the "belonging node", as you put it. This code is what happens when the position of this node (or a parent of this node) has changed; it gets its parent matrix, multiplies it with its local position, then stores that as its own matrix. It then updates its children.

P.S. I don't intend to load into the modelview matrix and transform like that forever; it's temporary until I write and benchmark faster matrix multiplication stuff

Share this post


Link to post
Share on other sites
Quote:
Original post by fareed_d
What are you suggesting? Is there a way of getting glVertexPointer to take vertices which have not been transformed into global co-ordinates?

For sure. The entire mechanic of the MODELVIEW matrix is just for that case: Pushing geometry defined in a local space onto the GPU, and let the GPU do the local-to-global(-to-view) processing.

Quote:
Original post by fareed_d
this is what i mean by "its parent's matrix"; ill paste the code:


void Node::update()
{
glLoadMatrixf(mParent->mMatrix);
glTranslatef(mPosition.x, mPosition.y,0);
glRotatef(mPosition.r,0,0,-1);
glGetFloatv(GL_MODELVIEW_MATRIX,mMatrix);

std::vector<Node_Component*>::iterator it;
for (it = mNodes.begin(); it != mNodes.end(); it++)
(*it)->update();
}


here, mMatrix is the "respective matrix" that is stored with the "belonging node", as you put it. This code is what happens when the position of this node (or a parent of this node) has changed; it gets its parent matrix, multiplies it with its local position, then stores that as its own matrix. It then updates its children.
So you use the gl routines as a vector math library (that wasn't clear from your previous post). Do you consider the fact that the MODELVIEW matrix contains the VIEW matrix, too? I mean, changing the camera will immediately make the entire tree of transformations invalid if not handled correctly. (Or, maybe, you never change the camera.)

Another thing is the aforementioned "dirty propagation". It is not clear yet _when_ you invoke Node::update() exactly. If you invoke it on every change of a transformation node, then you may re-compute transformations "just for fun", because a milli second later (but still before rendering) another change in the tree somewhere upwards may cause another re-computation. If such situations may occur in your solution, then please don't update() immediately but propagate a dirty state flag to all dependents that are currently not already dirty flagged. Then do the computation of the transformation not in the push model (as is done now) but in the pull model: When a Node is asked for its global matrix, it should look whether it is marked dirty. If not, it returns the previously computed matrix. Otherwise it asks it parent Node for its global matrix and uses it for computing the own global matrix as usual. This way you do the re-computation only for those nodes that are influenced by a change and are used in some way that requires the global matrix.

Share this post


Link to post
Share on other sites
Quote:
Original post by haegarr
Quote:
Original post by fareed_d
What are you suggesting? Is there a way of getting glVertexPointer to take vertices which have not been transformed into global co-ordinates?

For sure. The entire mechanic of the MODELVIEW matrix is just for that case: Pushing geometry defined in a local space onto the GPU, and let the GPU do the local-to-global(-to-view) processing.


Wow! I didn't realise that I could do that! That makes things a lot simpler, thanks.
EDIT: This would mean using several glVertexPointers, correct? There are several Geometry nodes, each with their own width and height; at the moment I am turning their width and height into local vertices, then multiplying them by the modelview matrix and storing them in ONE array which I point to with glVertexPointer.

As for your other point, I've effectively made the transformation methods safe; if someone rotates a node, then rotates it again, the update() is only called once for the combined transformation when the frame is processed.

[Edited by - fareed_d on February 13, 2010 9:21:12 AM]

Share this post


Link to post
Share on other sites
Quote:
Original post by fareed_d
This would mean using several glVertexPointers, correct? There are several Geometry nodes, each with their own width and height; at the moment I am turning their width and height into local vertices, then multiplying them by the modelview matrix and storing them in ONE array which I point to with glVertexPointer.

The way is to store each different geometry (i.e. their vertices and perhaps their indices) in its own VBO or at least in its own section of a VBO. Then set-up the MODELVIEW matrix from a specific Node, address the VBO with the correct glVertexPointer, and invoke its rendering. Then proceed with the next Node that provides geometry.

Notice that here is another possibility of saving runtime. If you have the case that several nodes have the same geometry, then you can re-use the vertex pointer set-up if you render all Nodes with that geometry in sequence. That would avoid some cost hidden in driver set-up. You may also try to use instancing in such a case.

Quote:
Original post by fareed_d
As for your other point, I've effectively made the transformation methods safe; if someone rotates a node, then rotates it again, the update() is only called once for the combined transformation when the frame is processed.
This is okay, although dirty flag propagation along with the pull model goes somewhat further.

Share this post


Link to post
Share on other sites
Quote:
Original post by haegarr
Quote:
Original post by fareed_d
This would mean using several glVertexPointers, correct? There are several Geometry nodes, each with their own width and height; at the moment I am turning their width and height into local vertices, then multiplying them by the modelview matrix and storing them in ONE array which I point to with glVertexPointer.

The way is to store each different geometry (i.e. their vertices and perhaps their indices) in its own VBO or at least in its own section of a VBO. Then set-up the MODELVIEW matrix from a specific Node, address the VBO with the correct glVertexPointer, and invoke its rendering.


If the rendering was invoked from each node, surely this would mean that every node that contained geometry would need to be updated every frame?

Share this post


Link to post
Share on other sites
Quote:
Original post by fareed_d
If the rendering was invoked from each node, surely this would mean that every node that contained geometry would need to be updated every frame?
For every frame each geometry node is required to be tested whether or not it needs to be rendered (keyword "visibility culling", you remember; some ways of speeding up like a suitable spatial management structure and frame coherency, if needed, can be used). This _may_ mean that for each node the global matrix needs to be re-computed (if it is dirty) just for the case of visibility culling. Only geometry that has passed the visibility check will be handled further, i.e. passed through the rendering pipeline.

However, the geometry, if it is static or skinning is done on the GPU or some other trick like this, already is on the GPU and hence need not be transferred again (that is why it is called static geometry). Only geometry where vertices are changed by the CPU need to be transfered again.

Share this post


Link to post
Share on other sites
Quote:
Original post by haegarr
Quote:
Original post by fareed_d
If the rendering was invoked from each node, surely this would mean that every node that contained geometry would need to be updated every frame?
For every frame each geometry node is required to be tested whether or not it needs to be rendered (keyword "visibility culling", you remember; some ways of speeding up like a suitable spatial management structure and frame coherency, if needed, can be used). This _may_ mean that for each node the global matrix needs to be re-computed (if it is dirty) just for the case of visibility culling. Only geometry that has passed the visibility check will be handled further, i.e. passed through the rendering pipeline.

However, the geometry, if it is static or skinning is done on the GPU or some other trick like this, already is on the GPU and hence need not be transferred again (that is why it is called static geometry). Only geometry where vertices are changed by the CPU need to be transfered again.


Well, thats great! The question is, how would I set up a VBO in openGL so that only geometry whose vertices have changed are transferred, and static geometry stays on the GPU?

Thanks a lot for your help, by the way ^^

Share this post


Link to post
Share on other sites
Quote:
Original post by fareed_d
The question is, how would I set up a VBO in openGL so that only geometry whose vertices have changed are transferred, and static geometry stays on the GPU?
OpenGL knows the modes STATIC, STREAM, and DYNAMIC for its VBOs. There are further MapBuffer, UnmapBuffer, and BufferData routines. There is e.g. a White Paper from NVIDIA that explains the differences.

Share this post


Link to post
Share on other sites
Quote:
Original post by haegarr
Quote:
Original post by fareed_d
The question is, how would I set up a VBO in openGL so that only geometry whose vertices have changed are transferred, and static geometry stays on the GPU?
OpenGL knows the modes STATIC, STREAM, and DYNAMIC for its VBOs. There are further MapBuffer, UnmapBuffer, and BufferData routines. There is e.g. a White Paper from NVIDIA that explains the differences.


Thanks for that :>
Reading that whitepaper, I think that I have discovered the difference in our ways of thinking; the projects that I am thinking of developing with my scenegraph are the sort where most objects will be textured quadrilaterals; as such, only 4 vertices per geometry object would be transformed in one go. With complex objects, I am sure that rendering Geometries individually using a VBO for each one is the way to go.

For GL_QUAD objects (I've decided against using openGL 3.x for now), surely the quickest way to render them is to push and substitute their vertices into a single static array, and then use one glDrawArrays call to render them all in one go? That is what my application is doing currently.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this