Thrawn80

A Render Cycle


Hi all,

First, I'd like to thank everyone for the resources and assistance in my previous threads. It has boosted my understanding of D3D tremendously. So I'd like to seek help again (I'm stuck here).

This time, it's exclusively about the render loop. Do I do the following per loop?

1) Clear the vertex and index cache (so that I can put in the vertices that are supposed to be rendered)
2) Loop through all my objects to be rendered (sorted or not)
3) Put the vertices and indices in the buffers. My buffers are separated into bins that hold different FVF formats and textures. Duplicate buffers with the same format and texture can reappear.
4) Set identity matrix to world
5) Multiply it by the object's internal matrix (or some would prefer quaternions)
6) Set the view and projection matrices (from a stored set of matrices, so that I don't recalculate every frame)
7) SetTransform into it ... here's where I'm stuck ... if each object is using its individual matrix ... then how can I perform DrawPrimitive or DrawIndexedPrimitive?

Next, am I right that every time I call
- DrawPrimitive()
- DrawIndexedPrimitive()
D3D then sends the number of vertices or indices based on the parameters set into the pipeline? Is it true that only a small bunch of triangles is being sent? Which means it's inefficient?

8) Restart the whole thingy again on a per-object basis.

OK ... I know I could be gravely wrong ... please correct me if I am. I've thought about it for a very long time and haven't really come up with an answer because my understanding of D3D isn't solid enough yet.

Thanks for your time reading this post.

Regards,
thrawn80

Bear in mind that there are lots of options within the render loop and that there isn't strictly a single "perfect" solution.

Before I get started with your points, think of the drawing process in terms of 2 phases:

Firstly you configure the pipeline - pretty much any call to IDirect3DDevice9 that begins with "Set". This stage tells Direct3D what it should do when it receives data.

Secondly you actually send the data to Direct3D and are effectively saying "draw this geometry based on the configuration I just gave you".

A lot of optimization goes into two areas - reducing the number of steps needed to configure the pipeline, and reducing the number of times you tell Direct3D to draw. These are often at odds with each other - reducing draw calls often makes the configuration more complicated, for example.

Quote:
Original post by Thrawn80
1) Clear the vertex and index cache (so that I can put in the vertices that are supposed to be rendered)
No need to clear the buffer; just overwrite data or simply dispose of it.

In general, though, you DO NOT want to create, modify or release resources in the main loop. Sometimes it's unavoidable, but don't do it unless you have to.

Quote:
Original post by Thrawn80
2) Loop through all my objects to be rendered (sorted or not)
Correct, they should definitely be sorted (the primary way for reducing the number of steps required for pipeline configuration) and ideally culled according to current view (why waste time drawing objects that aren't visible?).
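
As a rough sketch of the cull-then-sort idea (plain C++, not D3D code): the `RenderItem` struct, its fields and both function names are made up for illustration, and the `visible` flag is assumed to come from whatever culling test you already have.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical per-object record; the field names are illustrative only.
struct RenderItem {
    std::uint32_t fvf;        // vertex format identifier
    std::uint32_t textureId;  // texture identifier
    bool visible;             // result of a culling test done elsewhere
};

// Drop culled items, then order the rest so items with equal state sit together.
std::vector<RenderItem> buildDrawList(std::vector<RenderItem> items) {
    items.erase(std::remove_if(items.begin(), items.end(),
                               [](const RenderItem& r) { return !r.visible; }),
                items.end());
    std::sort(items.begin(), items.end(),
              [](const RenderItem& a, const RenderItem& b) {
                  return std::tie(a.fvf, a.textureId) <
                         std::tie(b.fvf, b.textureId);
              });
    return items;
}

// Count how often the (fvf, texture) pair changes while walking the list in order.
int countStateChanges(const std::vector<RenderItem>& items) {
    int changes = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].fvf != items[i - 1].fvf ||
            items[i].textureId != items[i - 1].textureId) {
            ++changes;
        }
    }
    return changes;
}
```

Comparing `countStateChanges` on the unsorted list and on the result of `buildDrawList` gives a feel for how many configuration steps the sort saves.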

Quote:
Original post by Thrawn80
3) Put the vertices and indices in the buffers. My buffers are separated into bins that hold different FVF formats and textures. Duplicate buffers with the same format and texture can reappear.
Back to my previous point - don't modify the contents of any resources unless you absolutely have to. Lots of duplicated/static data is better than dynamically creating it every frame.

Quote:
Original post by Thrawn80
4) Set identity matrix to world
Unnecessary. A 'Set*()' call replaces whatever is in the slot already, so no point in setting something if you're going to overwrite it later.

Quote:
Original post by Thrawn80
5) Multiply it by the object's internal matrix (or some would prefer quaternions)
Sounds like you're taking an OpenGL approach with a matrix stack. Just do the relevant mathematics with D3DX to generate a single matrix and go from there - don't overcomplicate things.
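
To show what "a single matrix" means here, a small self-contained sketch follows. It deliberately avoids D3DX so it compiles anywhere; `Mat4`, `Vec3` and the helper names are made up, standing in for what D3DXMatrixScaling, D3DXMatrixTranslation and D3DXMatrixMultiply would do. It assumes the row-vector convention that D3D uses (point * matrix, translation in the fourth row).

```cpp
#include <cassert>

struct Mat4 { float m[4][4]; };
struct Vec3 { float x, y, z; };

Mat4 identity() {
    Mat4 r{};  // zero-initialised
    for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0f;
    return r;
}

Mat4 scaling(float sx, float sy, float sz) {
    Mat4 r = identity();
    r.m[0][0] = sx; r.m[1][1] = sy; r.m[2][2] = sz;
    return r;
}

Mat4 translation(float tx, float ty, float tz) {
    Mat4 r = identity();
    r.m[3][0] = tx; r.m[3][1] = ty; r.m[3][2] = tz;  // fourth row holds the offset
    return r;
}

// Standard 4x4 product; with row vectors, a * b applies a first, then b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Transform a point (w = 1) by the matrix, row-vector style.
Vec3 transformPoint(const Vec3& p, const Mat4& w) {
    return {
        p.x * w.m[0][0] + p.y * w.m[1][0] + p.z * w.m[2][0] + w.m[3][0],
        p.x * w.m[0][1] + p.y * w.m[1][1] + p.z * w.m[2][1] + w.m[3][1],
        p.x * w.m[0][2] + p.y * w.m[1][2] + p.z * w.m[2][2] + w.m[3][2]
    };
}
```

An object's world matrix is then built once, e.g. `multiply(scaling(2, 2, 2), translation(10, 0, 0))`, and that one result is what gets handed to SetTransform; there is no need to start from identity on the device first.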

Quote:
Original post by Thrawn80
6) Set the view and projection matrices (from a stored set of matrices, so that I don't recalculate every frame)
Only set them on the device if they have changed since the previous frame. Redundant state changes are relatively cheap because they'll get caught by the runtime/drivers, but why waste time crossing the API boundary?
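
One way to avoid redundant sets on the application side is a tiny cache in front of the device. This is only a sketch: `FakeDevice` is a stand-in that counts calls (it is not the D3D interface), and the integer `slot`/`value` pair is a simplification of whatever state you are tracking.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for a device; it only counts the calls that reach it.
struct FakeDevice {
    int setCalls = 0;
    void setState(int /*slot*/, int /*value*/) { ++setCalls; }
};

class StateCache {
public:
    explicit StateCache(FakeDevice& device) : device_(device) {}

    // Forwards the value only when it differs from what the slot last received.
    void set(int slot, int value) {
        auto it = last_.find(slot);
        if (it != last_.end() && it->second == value) {
            return;  // redundant: skip the call entirely
        }
        last_[slot] = value;
        device_.setState(slot, value);
    }

private:
    FakeDevice& device_;
    std::map<int, int> last_;  // last value sent per slot
};
```

For matrices the same idea works with a "dirty" flag that is raised whenever the camera changes and cleared after the transform has been set.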

Quote:
Original post by Thrawn80
7) SetTransform into it ...
Yup, you must ensure the transform matrices are configured before you render.

Quote:
Original post by Thrawn80
here's where I'm stuck ... if each object is using its individual matrix ... then how can I perform DrawPrimitive or DrawIndexedPrimitive?
Back to my original point - the first phase is configuring the pipeline. You can only submit multiple sets of data in a single call if they share the same pipeline configuration. If two sets of data require different transforms then that is a different pipeline configuration, so you must separate them into two draw calls and place the appropriate configuration code before each of them.
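
In loop form, that configure-then-draw pattern looks roughly like this. The sketch uses a logging stand-in instead of the real device; `FakeDevice`, `Object` and `renderAll` are made-up names, and each object is assumed to occupy its own vertex range in one shared, static buffer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the device: it just records the order of calls.
struct FakeDevice {
    std::vector<std::string> log;
    void setWorldTransform(int objectId) {
        log.push_back("set " + std::to_string(objectId));
    }
    void drawRange(int firstVertex, int vertexCount) {
        log.push_back("draw " + std::to_string(firstVertex) + "+" +
                      std::to_string(vertexCount));
    }
};

// Each object owns a world matrix (represented by its id here) and a
// range of vertices inside a shared buffer that is never refilled.
struct Object {
    int id;
    int firstVertex;
    int vertexCount;
};

// Phase 1 (configure) and phase 2 (draw) repeat once per object,
// because each object needs a different world transform.
void renderAll(FakeDevice& device, const std::vector<Object>& objects) {
    for (const Object& o : objects) {
        device.setWorldTransform(o.id);                // configure
        device.drawRange(o.firstVertex, o.vertexCount); // draw
    }
}
```

Note that the vertex data itself is untouched between frames; only the transform set before each draw differs.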

Quote:
Original post by Thrawn80
Next, am I right that every time I call
- DrawPrimitive()
- DrawIndexedPrimitive()
D3D then sends the number of vertices or indices based on the parameters set into the pipeline? Is it true that only a small bunch of triangles is being sent? Which means it's inefficient?
With D3D9 the draw-call overhead can be very bad for performance, so a lot of work goes into "batching" to reduce the number of draw calls issued each frame. Simple logic really - each call carries roughly the same fixed overhead, so packing as much data as possible into a single call spreads that overhead thinner - 10ms shared over 100 triangles is worse than 10ms shared over 10000 triangles...
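
To put numbers on it, here is the arithmetic as a tiny sketch. The 10ms per call is just the illustrative figure from above, not a measurement, and the function names are made up.

```cpp
#include <cassert>
#include <cmath>

// Number of draw calls needed when each call carries at most trianglesPerCall.
int drawCallsNeeded(int totalTriangles, int trianglesPerCall) {
    assert(trianglesPerCall > 0);
    return (totalTriangles + trianglesPerCall - 1) / trianglesPerCall;  // round up
}

// Total fixed overhead for the frame, assuming every call costs perCallMs.
double totalOverheadMs(int totalTriangles, int trianglesPerCall, double perCallMs) {
    return drawCallsNeeded(totalTriangles, trianglesPerCall) * perCallMs;
}
```

Drawing 10000 triangles in batches of 100 needs 100 calls, so 100 x 10ms of overhead; drawing them in one batch needs a single call and pays the 10ms once.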

Quote:
Original post by Thrawn80
8) Restart the whole thingy again on a per-object basis.
Yes, but try to take advantage of sorting to avoid any unnecessary pipeline configuration.



Get familiar with PIX and the D3DPERF_BeginEvent() and D3DPERF_EndEvent() calls. You can then use the single-frame event capture to see how much work your application REALLY does on any given frame. You then have a goldmine of information for identifying where your code is doing too much work - go back, change something and see if it improves the number of events per frame...
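
A common way to keep Begin/End calls paired is a small scope guard. The sketch below is only an illustration: `fakeBeginEvent` and `fakeEndEvent` are stand-ins so it compiles without d3d9.h, and in a real build they would be swapped for D3DPERF_BeginEvent() and D3DPERF_EndEvent(). `ScopedEvent`, `g_openEvents` and `depthInsideOpaquePass` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical bookkeeping so the sketch can be checked without PIX.
std::vector<std::wstring> g_openEvents;

// Stand-ins for D3DPERF_BeginEvent / D3DPERF_EndEvent.
void fakeBeginEvent(unsigned long /*color*/, const wchar_t* name) {
    g_openEvents.push_back(name);
}
void fakeEndEvent() {
    g_openEvents.pop_back();
}

// Opens an event on construction and closes it when the scope ends,
// so every Begin is matched by exactly one End, even on early return.
class ScopedEvent {
public:
    ScopedEvent(unsigned long color, const wchar_t* name) {
        fakeBeginEvent(color, name);
    }
    ~ScopedEvent() { fakeEndEvent(); }
    ScopedEvent(const ScopedEvent&) = delete;
    ScopedEvent& operator=(const ScopedEvent&) = delete;
};

// Hypothetical frame function with two nested events.
std::size_t depthInsideOpaquePass() {
    ScopedEvent frame(0xFF00FF00UL, L"Frame");
    ScopedEvent opaque(0xFFFF0000UL, L"Opaque pass");
    // ... state changes and draw calls for this pass would go here ...
    return g_openEvents.size();  // both events are open at this point
}
```

Wrapping each pass or object group this way makes the event capture readable as a tree rather than one flat list of calls.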

hth
Jack

You should only be filling index/vertex buffers each frame if the data changes from frame to frame (animation, for example) and the work can't be done on the graphics hardware itself.

Hi,

Thanks for replying...

What I really don't understand is this:

If I sort my vertices and indices into buffers with the same configuration,
logically, it'd mean that I have to discard my buffers (index or vertex) every frame, wouldn't it?

For example, say I have 10 cubes, each situated at a different point in 3D space and each with its own color. If I remove a single cube, which is the better way?
1) Discard (not release and create) the buffer and refill everything
2) Lock with a 0 flag and replace its contents, then when rendering primitives, pass in the vertex count and primitive count accordingly. Since the data is linear, there won't be any problems if the count values are correct.
Note: Both buffers are created with room for 10,000 vertices and 40,000 indices. When either buffer hits the limit, I'll render it straight away onto the screen, reset the count and start overwriting the buffer. (Is that the way to batch?)

On the other hand, I've read other threads which mentioned that each individual object, in this case, our cube object, should have a matrix or quaternion on its own to multiply against the current transformation matrix.

If that's true, that means I have to
1) Check if the current object matrix has been changed (via rotation, scaling or translation)
2) If yes, get the current transformation matrix and multiply it. (Hey, it's actually only the WORLD transformation, right? The view and projection don't really matter since they affect the final output, not the actual object position...)
3) Then update the vertex and index buffer accordingly by filling it with updated vertices and indices ...
4) The object's duties end here... while other objects will repeat steps 1 to 3

After the buffers are all set, give a single call to render (the buffer manager would have rendered already if the limit was hit; this call is to render the remaining vertices and indices).
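
A rough sketch of this fill-and-flush bookkeeping, for reference only: `Batcher` and its members are made-up names, the flush just records how many vertices would have been drawn instead of calling D3D, and it assumes no single object is larger than the buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Batcher {
public:
    Batcher(std::size_t maxVertices, std::size_t maxIndices)
        : maxVertices_(maxVertices), maxIndices_(maxIndices) {}

    // Queue one object's geometry; flush first if it would not fit.
    void add(std::size_t vertexCount, std::size_t indexCount) {
        assert(vertexCount <= maxVertices_ && indexCount <= maxIndices_);
        if (vertices_ + vertexCount > maxVertices_ ||
            indices_ + indexCount > maxIndices_) {
            flush();
        }
        vertices_ += vertexCount;
        indices_ += indexCount;
    }

    // Stand-in for the draw call: record the batch size and reset the counts.
    void flush() {
        if (vertices_ == 0 && indices_ == 0) return;  // nothing pending
        drawnVertexCounts.push_back(vertices_);
        vertices_ = 0;
        indices_ = 0;
    }

    std::vector<std::size_t> drawnVertexCounts;  // one entry per simulated draw

private:
    std::size_t maxVertices_;
    std::size_t maxIndices_;
    std::size_t vertices_ = 0;  // vertices currently queued
    std::size_t indices_ = 0;   // indices currently queued
};
```

With buffers of 10,000 vertices and 40,000 indices, 500 cubes of 24 vertices and 36 indices each end up as two draws: one when the vertex limit is reached and one for the remainder at the final flush. Everything inside one batch has to share a single pipeline configuration, world transform included.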

Am I correct in this order? I know there is no one perfect solution ... but is this approach reasonably efficient?

Thanks,
thrawn80
