Update(), Render() Separation

The graphics driver already buffers operations. So interleaving is already happening, as long as your code is interleave-friendly.
I don't know what that code does, but I suppose it is possible. Then each object would need at least three variables per instance to take care of the timing, though, and to keep from rewriting the main-loop code in each object you would have to call a few (virtual) functions (namely tick, frame and render), which makes it rather useless imho. The _only_ positive thing I can see in combining update() and render() would be that an imgui would be somewhat easier to write.

My main loop, btw, is basically Javier's, although my version is slightly modified.
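For reference, here is a minimal sketch of the kind of decoupled fixed-timestep loop I mean (not Javier's exact code, just the general shape; GetTimeSeconds(), Update(), Render() and running are placeholders):

// Fixed-timestep update, render as fast as possible.
const double dt = 1.0 / 60.0;          // fixed update step in seconds
double accumulator = 0.0;
double previous = GetTimeSeconds();

while (running)
{
    double current = GetTimeSeconds();
    accumulator += current - previous;
    previous = current;

    // Run as many fixed updates as the elapsed time requires.
    while (accumulator >= dt)
    {
        Update(dt);
        accumulator -= dt;
    }

    // Render once per loop iteration, independent of the update count.
    Render();
}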

Regarding game speed: unless I tell my game to do 5 times as many updates, my updates really aren't a problem. Generally it's my rendering that is too slow, because I use the wrong container (a vector instead of a list) or I do a lot of allocating.
(*disclaimer* I'm pretty new to all of this, which the post itself probably makes clear even without this disclaimer, hah!)

If you mix update and render, isn't it also possible to end up rendering incorrectly? I mean, do we really know what to render until the game world is updated? If we don't know that object 1 is supposed to explode until after we process a missile, which is, say, object 11, then we would have started rendering too soon. Is it possible to get a problem like that? It could be taken care of in the next frame, and at 60 fps it wouldn't make much of a difference, but if a situation like that were possible, it would just feel wrong.
The update and render separation has one big problem if you want to decrease your memory usage. Imagine you have 100 characters that share the same meshes etc., so they are identical but animated independently.

Now if you first want to call an Update for each character, this means you have to give each character its own transformation buffers, which hold all the matrices you will later pass to the vertex shader to perform skinning.
So if you have 100 characters, you need to store all the transformation matrices 100 times. This is needed because otherwise the update of the second character would overwrite the buffers of the first one. As you can imagine, that goes wrong when rendering all the characters.

What I do is use a scheduler that figures out the update and render order in such a way that there are no conflicts with shared memory. It also manages multithreading: if there are 4 CPUs, say, it would duplicate some buffers 4 times so that it can still update 4 characters at the same time, even if they share data.

In a single-threaded environment it would look like:
UpdateObject1()
RenderObject1()
UpdateObject2()
RenderObject2()

With the multithreaded scheduler enabled it would look like:
UpdateObject1() UpdateObject2() UpdateObject3()
RenderObject1()
RenderObject2()
RenderObject3()

It's also possible to perform multithreaded rendering, but as you know that's not likely to be used very often. Also, the object numbers could be different, as objects 1, 2 and 3 might share the same data, etc.

However, that means that instead of 100 times the amount of memory used for the transform matrices, I would only use 1x the amount (or 4x, or whatever the number of processors/cores/threads is).
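As a rough illustration of that idea (not our actual scheduler code; the Character, MatrixPalette, UpdateCharacter and RenderCharacter names are made up for the example), the transform buffers can be duplicated once per worker thread instead of once per character:

#include <cstddef>
#include <vector>

// Hypothetical types, for illustration only.
struct Matrix4x4 { float m[16]; };
using MatrixPalette = std::vector<Matrix4x4>;   // one matrix per bone

struct Character
{
    // Per-instance animation state; mesh and skeleton data are shared.
    float animTime = 0.0f;
};

// Stand-ins for the real engine calls.
void UpdateCharacter(Character& c, MatrixPalette& scratch)
{
    // ...sample the animation and write the skinning matrices into scratch...
    c.animTime += 1.0f / 60.0f;
    (void)scratch;
}

void RenderCharacter(const Character& c, const MatrixPalette& scratch)
{
    // ...upload scratch to the vertex shader and issue the draw call...
    (void)c;
    (void)scratch;
}

// One scratch palette per worker thread instead of one per character:
// memory is numThreads * numBones matrices, not numCharacters * numBones.
void ProcessCharacters(std::vector<Character>& characters,
                       std::size_t numThreads, std::size_t numBones)
{
    std::vector<MatrixPalette> scratch(numThreads, MatrixPalette(numBones));

    // Single-threaded case: update then immediately render each character,
    // reusing scratch[0] every time.
    for (Character& c : characters)
    {
        UpdateCharacter(c, scratch[0]);
        RenderCharacter(c, scratch[0]);
    }

    // A multithreaded scheduler would instead hand scratch[threadIndex] to
    // each worker, update up to numThreads characters in parallel, then
    // render them before the palettes are reused.
}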

The scheduler uses callbacks to notify the engine when to update and render the objects. The engine does allow you to clone all the data, though, so that you could still use the plain Update/Render architecture. But then managing multithreading becomes a lot harder and you would use a lot more memory.

It's more tricky to manage, but it's worth the memory savings, especially when dealing with (character) instances.
Quote:Original post by Buckshag
The update and render separation has one big problem if you want to decrease your memory usage. Imagine you have 100 characters that share the same meshes etc., so they are identical but animated independently.

Now if you first want to call an Update for each character, this means you have to give each character its own transformation buffers, which hold all the matrices you will later pass to the vertex shader to perform skinning.
So if you have 100 characters, you need to store all the transformation matrices 100 times. This is needed because otherwise the update of the second character would overwrite the buffers of the first one. As you can imagine, that goes wrong when rendering all the characters.

100 3x4 float matrices take about 4.8 kilobytes. If each of the 100 characters needs to store 24 bone transformations, that's still only about 115 KB, which can be allocated once and reused each frame. It's probably not worth worrying about too much.
In more complicated character animation systems you will most likely have stored:

local space matrices
world space matrices
inverse bind pose matrices
local space transforms (separated pos/rot/scale/scale rotation)
bind pose transforms (separated pos/rot/scale/scale rotation)

Most likely you also store the matrices in an aligned way, so using 4x4 matrices.
A decomposed transform is 2 vector3's + 2 quaternions.
That is 14 floats, which is 56 bytes (64 if aligned, but let's say 56).
For a 100 bone character that is 6400 + 6400 + 6400 + 5600 + 5600 = 30400 bytes.

That's ~30 KB per character. If there is a crowd of 1000 characters, that's 30 MB we're talking about. That's a lot of memory, especially on the Wii or GameCube, for example.
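For what it's worth, the per-character figure is easy to check with a small compile-time calculation (the constant names just mirror the buffer list above):

#include <cstddef>

constexpr std::size_t bones         = 100;
constexpr std::size_t matrixSize    = 16 * sizeof(float); // aligned 4x4 = 64 bytes
constexpr std::size_t transformSize = 14 * sizeof(float); // 2 vec3 + 2 quat = 56 bytes

// local matrices + world matrices + inverse bind pose matrices
// + local transforms + bind pose transforms
constexpr std::size_t perCharacter =
    3 * bones * matrixSize + 2 * bones * transformSize;   // = 30400 bytes

static_assert(perCharacter == 30400, "~30 KB per 100-bone character");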

On top of that you have some blending buffers as well, so there is more memory that can be shared. And meshes as well...
You don't want to store the same meshes multiple times for the same character if it's not needed :) (think of cases where you do CPU deforms).

Besides that, sharing the memory will make your updates about twice as fast because of better cache efficiency. Our API allows you to choose which parts to share and which parts to make unique; if we share everything, the framerate doubles.
I don't see why you have to store per-frame matrices anyway... If you're creating them right before rendering of each character and throwing them away right after that (which is where your memory savings come from), I'd consider that to be just a part of the rendering process. This doesn't really have anything to do with whether or not you're "micro-interleaving" rendering with game logic or physics.
A couple of reasons why I have them separate:

-My physics engine updates everyone at once.
-It fixes some 'off-by-one' errors (well, not precisely), such as the camera being moved after drawing some objects but before drawing others.
-I had a lot of unnecessary passing of data to places that didn't really need it (e.g. game data to rendering routines and rendering data to update routines), partially caused by this (though admittedly I could have changed that).
-It makes it easier (possible?) to do some rendering things (particle sorting, for instance; see the sketch below).
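To illustrate the particle-sorting point: back-to-front sorting needs the final camera position, so every update (including the camera's) has to finish before any particle is drawn. A rough sketch, with made-up Vec3/Particle types:

#include <algorithm>
#include <vector>

// Made-up types for the example.
struct Vec3 { float x, y, z; };
struct Particle { Vec3 position; /* color, size, ... */ };

float DistanceSquared(const Vec3& a, const Vec3& b)
{
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Called after *all* updates, so cameraPos is final for this frame.
void SortParticlesBackToFront(std::vector<Particle>& particles, const Vec3& cameraPos)
{
    std::sort(particles.begin(), particles.end(),
              [&](const Particle& a, const Particle& b)
              {
                  // Farther particles are drawn first for correct alpha blending.
                  return DistanceSquared(a.position, cameraPos) >
                         DistanceSquared(b.position, cameraPos);
              });
}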

[edit]
Quote:Original post by Sneftel
The graphics driver already buffers operations. So interleaving is already happening, as long as your code is interleave-friendly.


What makes code interleave-friendly?
Most code is interleave-friendly. The two big things you can do to screw over interleaving:

1. Use dynamic meshes which are written to and drawn multiple times per frame.

2. Use queries or texture readback from the same frame or the last two frames.

If you're not doing either of those things, the driver will generally be able to buffer 2-3 frames ahead, making interleaving moot. IIRC, NVidia's profiling tool will tell you if this is happening.
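On point 2, the usual workaround is to let query results age a couple of frames before reading them, so the driver never has to stall. A rough sketch using OpenGL occlusion queries (the ring size and names are just for illustration; the same idea applies to D3D queries):

#include <GL/gl.h>   // plus an extension loader (e.g. GLEW) in practice

// A small ring of query objects so we only read results that are a couple
// of frames old, which keeps the CPU from stalling on the GPU.
const int kFramesInFlight = 3;
GLuint queries[kFramesInFlight];
int framesIssued = 0;

void InitQueries()
{
    glGenQueries(kFramesInFlight, queries);
}

void DrawWithOcclusionQuery()
{
    const int current = framesIssued % kFramesInFlight;

    // Issue this frame's query around the draw call.
    glBeginQuery(GL_SAMPLES_PASSED, queries[current]);
    // ...draw the occludee...
    glEndQuery(GL_SAMPLES_PASSED);

    // Only look at a query issued kFramesInFlight - 1 frames ago; by then
    // the result should be ready and reading it won't force a pipeline flush.
    if (framesIssued >= kFramesInFlight - 1)
    {
        const int oldest = (framesIssued + 1) % kFramesInFlight;
        GLuint available = 0;
        glGetQueryObjectuiv(queries[oldest], GL_QUERY_RESULT_AVAILABLE, &available);
        if (available)
        {
            GLuint samplesPassed = 0;
            glGetQueryObjectuiv(queries[oldest], GL_QUERY_RESULT, &samplesPassed);
            // ...use samplesPassed to decide visibility in a later frame...
        }
    }

    ++framesIssued;
}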
Quote:Original post by Fingers_
I don't see why you have to store per-frame matrices anyway... If you're creating them right before rendering of each character and throwing them away right after that (which is where your memory savings come from), I'd consider that to be just a part of the rendering process. This doesn't really have anything to do with whether or not you're "micro-interleaving" rendering with game logic or physics.


If you create them right before rendering, it means you can't do multithreaded updates unless you do multithreaded rendering as well. Well, it's possible of course by using the same trick as I do; when I share memory I basically do the same thing.

But one reason some engines need them per character is that other objects might need to know about another object's transforms. Other places where you might need per-character transforms are character skeletal customization with motion retargeting, or some animation/skeletal LOD techniques.

