Archived

This topic is now archived and is closed to further replies.

Dwiel

How to deal with many objects, each with a different texture/matrix

Dwiel    365
Hello,

I've run into a problem finding the fastest way to draw all of the units in my 3D RTS game. I've come up with two methods and was about to code up both to see which is faster, but I figured I'd ask the experienced people here in case one of them is definitely a waste of my time.

Method 1: Objects are sorted by texture. This allows many objects to be copied into one vertex buffer (VB) and rendered with one DrawIndexedPrimitive (DIP) call. It means, though, that the vertices must be transformed on the CPU, because there is no way to have the GPU use one matrix for part of the VB and a different one for another part. The good thing is that I can minimize texture changes and DIP calls. The bad thing is that I must transform the verts on the CPU (maybe someone knows of an alternative).

Method 2: Objects are again sorted by texture to minimize texture changes, but this time each object gets its own DIP call, so that the transformation matrix can be changed and the transforming is done on the GPU. The good thing here is that the transforming is done on the GPU, but there will be many DIP calls, which is very bad.

One optimization that would benefit both methods would be to have all of my static objects pre-transformed, so that they only need to be copied to the VB, several at a time, and rendered with fewer DIP calls.

Which one is most likely to get the best frame rates? I think my choice will also depend on how taxed the CPU becomes after the AI is implemented. If the CPU is already at 100%, I will want to stay away from transforming on the CPU.

Any suggestions would be much appreciated, especially if they point me to a way of using Method 1 while transforming on the GPU (best of both worlds).

Thanks,
Dwiel
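For reference, Method 1 can be sketched without any graphics API at all: transform each object's vertices by its own world matrix on the CPU and append the results to one array, which would then be copied into the VB and drawn with a single DIP call. This is only a minimal sketch. The types `Vec3`, `Mat4`, `Object` and the helpers `translation`, `transformPoint` and `buildBatch` are hypothetical names made up for illustration, and the row-vector matrix layout is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; not from the thread or from Direct3D.
struct Vec3 { float x, y, z; };

// Assumed layout: row-major 4x4, row-vector convention (v * M),
// with the translation stored in elements 12..14.
using Mat4 = std::array<float, 16>;

struct Object {
    std::vector<Vec3> localVerts; // vertices in object space
    Mat4 world;                   // per-object world transform
};

// Builds a pure translation matrix in the assumed layout.
Mat4 translation(float x, float y, float z) {
    return Mat4{1.0f, 0.0f, 0.0f, 0.0f,
                0.0f, 1.0f, 0.0f, 0.0f,
                0.0f, 0.0f, 1.0f, 0.0f,
                x,    y,    z,    1.0f};
}

// Transforms a point (w = 1) by the matrix.
Vec3 transformPoint(const Vec3& v, const Mat4& m) {
    return Vec3{
        v.x * m[0] + v.y * m[4] + v.z * m[8]  + m[12],
        v.x * m[1] + v.y * m[5] + v.z * m[9]  + m[13],
        v.x * m[2] + v.y * m[6] + v.z * m[10] + m[14]};
}

// Method 1: pre-transform every object on the CPU into one contiguous batch.
// The caller would copy the result into a vertex buffer and issue one draw call.
std::vector<Vec3> buildBatch(const std::vector<Object>& objects) {
    std::size_t total = 0;
    for (const Object& o : objects) total += o.localVerts.size();

    std::vector<Vec3> batch;
    batch.reserve(total);
    for (const Object& o : objects) {
        for (const Vec3& v : o.localVerts) {
            batch.push_back(transformPoint(v, o.world));
        }
    }
    assert(batch.size() == total);
    return batch;
}
```

The CPU cost here grows with the total vertex count, which is why this approach is usually discussed for small objects only.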

Static data/static location put into one VB, one DIP per texture.

All small objects (<20 polys, i.e. textured quads and such), sorted by texture, transformed on the CPU and drawn with a single DIP.

Other data (Dynamic data, static data/dynamic position) sorted by texture and buffer (texture has precedence) each rendered with a [setstream if needed], set world transform, and DIP.
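The three categories above can be sketched as a classification step followed by a sort on (texture, buffer), with texture taking precedence. This is only an illustration of the idea: `RenderItem`, `Bucket`, `classify`, `sortForDrawing` and `countTextureSwitches` are hypothetical names, and the threshold of 20 polys is the rough figure from the post, not a measured value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical render-queue entry; all field names are assumptions.
struct RenderItem {
    int textureId;
    int bufferId;
    std::size_t polyCount;
    bool staticGeometry; // vertex data never changes
    bool staticPosition; // world transform never changes
};

enum class Bucket {
    StaticBatch, // pre-transformed once, one DIP per texture
    CpuBatch,    // small objects, transformed on the CPU each frame
    PerObject    // set world transform + one DIP per object
};

// Rough threshold taken from the post; it should be tuned by measurement.
constexpr std::size_t kSmallObjectPolys = 20;

Bucket classify(const RenderItem& item) {
    if (item.staticGeometry && item.staticPosition) return Bucket::StaticBatch;
    if (item.polyCount < kSmallObjectPolys) return Bucket::CpuBatch;
    return Bucket::PerObject;
}

// Sorts by texture first, then by vertex buffer, so that texture changes
// are minimized and stream changes are only needed within a texture group.
void sortForDrawing(std::vector<RenderItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const RenderItem& a, const RenderItem& b) {
                  return std::tie(a.textureId, a.bufferId) <
                         std::tie(b.textureId, b.bufferId);
              });
}

// Counts how many times the texture would have to be switched
// when drawing the items in their current order.
std::size_t countTextureSwitches(const std::vector<RenderItem>& items) {
    std::size_t switches = 0;
    for (std::size_t i = 1; i < items.size(); ++i) {
        if (items[i].textureId != items[i - 1].textureId) ++switches;
    }
    return switches;
}
```

Checking the static case first is a design choice made for this sketch, so that a small static object is baked once rather than re-transformed every frame.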

Dwiel    365
Hey, thanks for the info! I'll probably code the sort-by-texture/transform-on-CPU method like you said, because most of my objects have fewer than twenty polys. If I end up with many over 20, I'll code up the other method too and have the engine quickly decide based on the previous frame or on the objects that need rendering this frame.

Thanks a lot!

Any other ideas, or is this implementation the norm?

Dwiel

Guest Anonymous Poster
Just so you know, the <20 poly rule is not a fixed number. It's an approximate number I picked at random. Basically, what I wanted to stress was that the cost of transforming small objects on the CPU will likely be less than the cost of many SetTransform/DIP pairs.

As always, the performance depends on your exact situation, and sweeping generalizations like the one I made aren't always accurate.

My rule is based on what nVidia claims, not on any real-world benchmark, as I haven't written any such benchmark.

Try making a simple test app, placing a few hundred objects randomly, but still somewhat like your actual game situation. Benchmark both CPU transforms and GPU transforms, and see which is best. It WILL depend on the graphics card, and it WILL depend on CPU speed, bus speed, etc., but I have no idea where you cross the threshold between "CPU is better" and "GPU is better".
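A test app like this mostly needs a small timing helper that runs one render path for a fixed number of frames and reports the average cost. The sketch below uses only the C++ standard library. `averageMillis` is a hypothetical helper name, and the callable passed in is assumed to submit one complete frame, including whatever present or flush step is needed so that queued GPU work is actually counted.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `renderFrame` `frames` times and returns the average wall-clock
// time per frame in milliseconds. Because the GPU runs asynchronously,
// the callable should include the present/flush step of the frame;
// otherwise only the CPU-side submission cost is measured.
double averageMillis(const std::function<void()>& renderFrame, int frames) {
    assert(frames > 0);
    using Clock = std::chrono::steady_clock;

    const Clock::time_point start = Clock::now();
    for (int i = 0; i < frames; ++i) {
        renderFrame();
    }
    const std::chrono::duration<double, std::milli> elapsed =
        Clock::now() - start;
    return elapsed.count() / frames;
}
```

Calling it once with the CPU-transform path and once with the per-object SetTransform path, on the same scene, gives two numbers that can be compared directly on the target machine.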

blue_knight    194
Why not put multiple dynamic objects (with the same texture) in one buffer and use 1-bone skinning ("stitching"?), i.e. 1 matrix per object? You can render 20+ objects, each with its own transform, in one call this way. You'd just have to give each vertex an 8-bit index for the matrix to use.
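To make the data layout concrete, here is a CPU-side reference of what the GPU would do per vertex with this scheme: look up one matrix in a palette using the vertex's 8-bit index and transform the position by it. On real hardware this lookup would happen in a vertex shader or through indexed vertex blending; the code below is only an emulation for illustration. `IndexedVertex`, `applyPalette` and the row-vector matrix layout are assumptions, not names from the thread.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only.
struct Vec3 { float x, y, z; };

// Assumed layout: row-major 4x4, row-vector convention (v * M),
// with the translation stored in elements 12..14.
using Mat4 = std::array<float, 16>;

// Each vertex carries an 8-bit index selecting its object's matrix.
struct IndexedVertex {
    Vec3 pos;
    std::uint8_t matrixIndex;
};

// Builds a pure translation matrix in the assumed layout.
Mat4 translation(float x, float y, float z) {
    return Mat4{1.0f, 0.0f, 0.0f, 0.0f,
                0.0f, 1.0f, 0.0f, 0.0f,
                0.0f, 0.0f, 1.0f, 0.0f,
                x,    y,    z,    1.0f};
}

// Transforms a point (w = 1) by the matrix.
Vec3 transformPoint(const Vec3& v, const Mat4& m) {
    return Vec3{
        v.x * m[0] + v.y * m[4] + v.z * m[8]  + m[12],
        v.x * m[1] + v.y * m[5] + v.z * m[9]  + m[13],
        v.x * m[2] + v.y * m[6] + v.z * m[10] + m[14]};
}

// CPU emulation of 1-bone skinning: every vertex is transformed by
// exactly one palette matrix, chosen by its own index.
std::vector<Vec3> applyPalette(const std::vector<IndexedVertex>& verts,
                               const std::vector<Mat4>& palette) {
    std::vector<Vec3> out;
    out.reserve(verts.size());
    for (const IndexedVertex& v : verts) {
        assert(v.matrixIndex < palette.size());
        out.push_back(transformPoint(v.pos, palette[v.matrixIndex]));
    }
    return out;
}
```

The number of objects per draw call is then limited by how many matrices the palette can hold, which depends on the hardware.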

Guest Anonymous Poster
Blue Knight: it's a valid solution (one that we use on our Xbox titles) if your game will require shader-capable hardware. If you are targeting older hardware, this isn't really an option. I suppose the software shader emulation would work with this too, but I haven't gotten around to testing that technique.

Dwiel    365
Sounds like a very good idea. I think I should probably come up with a way so that, the first time the user runs my game, it checks which method is faster and finds the point where the two methods are equally efficient. I could also incorporate the shader idea this way. I'll definitely check it out.

Thanks for all of the help. I'm going to try to implement one of the methods tonight, if I don't end up spending all of my time studying for my AP tests... *sigh*

Dwiel
