D3D performance

i am currently getting into 3d programming, and the api of my choice is d3d. now that i have figured out a few things, i am drawing a terrain with 20000 polys that looks quite neat. it runs very smoothly on my athlon 700 with a geforce (60 frames), but when i tried it out on my brother's pc (a celeron 466 with tnt2), i was thoroughly disappointed: 6 frames... (all framerates measured while looking at all 20000 polys from above with one light in 640x480x16)

i don't believe that my brother's pc is THAT slow, as almost all new games run quite well on it, and T&L shouldn't do that much for performance (i think). so am i doing something completely wrong, or is D3D not built for such high(?) polycounts?

some hints on what i am doing: set up a vertex buffer with 20000 D3DFVF_VERTEX vertices (untransformed). draw every frame with DrawPrimitiveVB(), using a 16-bit zbuffer. (any more specs required?)

thanks for any answers
rid

Try checking the obvious: make sure you're not using software rendering on the TNT2, or putting your untransformed vertex buffers in video or AGP memory (be sure to have SYSTEMMEMORY specified when you create the VB on a device with software TnL).

You can also try using the Nvidia Statistics Driver to see exactly what areas you're being hit on (I'm unsure if this is publicly available).

20,000 is a ton of polygons to send to the graphics card at once. For most systems, you shouldn't send more than 3000-4000 polygons per frame (depending on the features you have, of course). You might look into implementing octrees or quadtrees to help limit the amount of data going to the card.

----------------------------------------
"Before criticizing someone, walk a mile in their shoes.
Then, when you do criticize them, you will be a mile away and have their shoes." -- Deep Thoughts

quote:

T&L shouldn't do that much for performance

um, yeah - that's probably it - hw T&L really shines in high poly situations.

try putting the geforce in your brother's comp - see what happens

mhkrause: yeah, putting the vertex buffer in system memory gave a boost of almost 1 frame, for 7 frames total now! hell of a speedup


The Senshi: do those trees mean that i do some visibility checks before giving the geometry to the device? something like bsp?


Yorvik: i will try disabling the TnL functions of my card, but the point is that i don't want to develop software only for highend computers!
why shouldn't it also run on a celeron with tnt2, which of course isn't a very fast system nowadays, but still not the typical lowend system?
am i really expecting too much of d3d's performance? don't normal games use up to 20,000 polys?
at least normal character models should have something like 1,500 polys, so with some of those and the level geometry it will easily add up to that amount, am i right?

thanks again
rid

With today's technology (even with HW T&L) 20000 polygons is a lot. You need to be clipping polys on your own (don't blame the API for your lack of effort), especially if you're using a terrain engine: there's LOTS to be clipped! Look into BSPs...

Vyvyan

an octree or quadtree is used to subdivide a space into manageable chunks (octree for 3d, quadtree for 2d).
basically, you take your entire "world" and enclose it in a box (or square). this is the root of the tree. then you divide it in half along each axis, which gives you eight smaller boxes (four squares for a quadtree), each containing a smaller part of the world.
continue this until you reach some threshold number of polys in each box, or a certain depth.
once the tree is built, you can start at the top, and any box that is not even partially in view gets thrown out. if it is completely in view, draw everything inside. if it is partially in view, keep going down the levels of the tree, checking whether the children are in or out of the view.
just a quick explanation, but pretty good overall, i think.
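
The build and traversal steps described above could be sketched roughly as follows. This is only an illustration under several assumptions that are not from the thread: every name in it (Rect, Centroid, Node, build, cull) is hypothetical, each polygon is filed under its centroid only, and a plain rectangle stands in for the view volume where a real engine would test against the view frustum.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical 2D bounds in the ground plane (x/z). Also used as a stand-in
// for the view volume; a real engine would test against the view frustum.
struct Rect {
    float minX, minZ, maxX, maxZ;
    bool intersects(const Rect& o) const {
        return minX <= o.maxX && maxX >= o.minX && minZ <= o.maxZ && maxZ >= o.minZ;
    }
    bool contains(const Rect& o) const {
        return minX <= o.minX && maxX >= o.maxX && minZ <= o.minZ && maxZ >= o.maxZ;
    }
};

// Assumption: each polygon is filed under its centroid only.
struct Centroid { float x, z; };

struct Node {
    Rect bounds{};
    std::vector<int> polys;          // polygon indices, filled in leaves only
    std::unique_ptr<Node> child[4];  // all null for a leaf
    bool isLeaf() const { return !child[0]; }
};

// Split until a box holds at most maxPolys polygons or maxDepth runs out.
std::unique_ptr<Node> build(const Rect& b, const std::vector<Centroid>& pts,
                            std::vector<int> ids, std::size_t maxPolys, int maxDepth) {
    assert(maxDepth >= 0);
    auto node = std::make_unique<Node>();
    node->bounds = b;
    if (ids.size() <= maxPolys || maxDepth == 0) {
        node->polys = std::move(ids);
        return node;
    }
    const float midX = 0.5f * (b.minX + b.maxX);
    const float midZ = 0.5f * (b.minZ + b.maxZ);
    std::vector<int> part[4];
    for (int id : ids) {
        const int q = (pts[id].x >= midX ? 1 : 0) + (pts[id].z >= midZ ? 2 : 0);
        part[q].push_back(id);
    }
    const Rect quad[4] = {{b.minX, b.minZ, midX, midZ}, {midX, b.minZ, b.maxX, midZ},
                          {b.minX, midZ, midX, b.maxZ}, {midX, midZ, b.maxX, b.maxZ}};
    for (int q = 0; q < 4; ++q)
        node->child[q] = build(quad[q], pts, std::move(part[q]), maxPolys, maxDepth - 1);
    return node;
}

// Box completely in view: take everything below it without further tests.
void collectAll(const Node& n, std::vector<int>& out) {
    if (n.isLeaf()) { out.insert(out.end(), n.polys.begin(), n.polys.end()); return; }
    for (const auto& c : n.child) collectAll(*c, out);
}

void cull(const Node& n, const Rect& view, std::vector<int>& out) {
    if (!n.bounds.intersects(view)) return;                       // not in view at all
    if (view.contains(n.bounds)) { collectAll(n, out); return; }  // completely in view
    if (n.isLeaf()) {                                             // partial leaf: keep it all
        out.insert(out.end(), n.polys.begin(), n.polys.end());
        return;
    }
    for (const auto& c : n.child) cull(*c, view, out);            // partial: descend
}
```

The indices collected by cull would then be the only polygons handed to the device for that frame. A partially visible leaf is kept whole here, so the result is conservative: it may include a few polygons that are off screen, but never drops a visible one that is filed in the tree.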

crazy166
some people think i'm crazy, some people know i am

i agree 20000 is a lot of polygons to render without a hardware accelerated card. for the record i've got a worse computer (celeron 433 + vanta) but can comfortably do 10000 multitextured + lit tris (on screen) at 20-25fps

Another little trick: try to minimize calls to SetTexture. I have written an isometric engine (for the moment just with the base tiles).
On the first try I was doing a SetTexture for each tile -> 42 fps. On the second try I grouped the tiles by texture, then looped over all textures and drew every tile that uses the current texture -> 85 fps.
Of course you cannot do that with every type of 3D engine.
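
The grouping idea can be sketched without any D3D calls by counting how often a draw loop would have to switch textures. Everything here is hypothetical (the Tile struct, the function names, non-negative integer texture IDs); the marked line is where a real engine would make its SetTexture call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile record: which texture it uses and where it sits on the map.
struct Tile { int textureId; int x, y; };

// Counts the texture switches a draw loop would make for tiles in this order.
// Assumption: texture IDs are non-negative, so -1 can mean "nothing bound yet".
std::size_t countTextureSwitches(const std::vector<Tile>& tiles) {
    std::size_t switches = 0;
    int bound = -1;
    for (const Tile& t : tiles) {
        assert(t.textureId >= 0);
        if (t.textureId != bound) {
            bound = t.textureId;  // a real engine would call SetTexture here
            ++switches;
        }
        // ... draw the tile with the currently bound texture ...
    }
    return switches;
}

// Groups tiles by texture; stable_sort keeps the original order within a group.
void sortByTexture(std::vector<Tile>& tiles) {
    std::stable_sort(tiles.begin(), tiles.end(),
                     [](const Tile& a, const Tile& b) { return a.textureId < b.textureId; });
}
```

Six tiles that alternate between two textures need six switches when drawn in map order and two once sorted. Sorting changes the draw order, which is one reason the trick does not fit every engine.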

20000 is not that much. My P2 266MHz with 128MB memory and G400 MAX ran a 10000 polygon "star system" (a star, eight planets and a spaceship) at 50fps. The scene also had six lights and the polygons were textured. And the engine was not optimized very well.

-Jussi

Well, the good thing about quad/octrees is that if one square is not visible, all the other squares below it in the tree are invisible as well, so you can eliminate some visibility checking. You might also want to skip any polys which are directly behind the viewer, and therefore invisible, as these will not be removed by back face culling if they have CCW winding order.
Are you using ComputeSphereVisibility? It is basically a clipping function for getting rid of polys before culling them. However, if you use quad/octrees or check whether polys are behind you, you could possibly eliminate them before the world-to-camera transformation, and so multiply fewer vectors by matrices.
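
A hand-rolled version of the "is it behind me" check might look like the sketch below. It is not how ComputeSphereVisibility works internally, only an illustration of rejecting a group of polys by its bounding sphere before any transformation. The names (Vec3, sphereBehindViewer) are hypothetical, and the view direction is assumed to be unit length.

```cpp
#include <cassert>

// Minimal hypothetical vector type; a real project would use its math library.
struct Vec3 { float x, y, z; };

inline Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// True when the whole bounding sphere lies behind the plane through the eye
// that is perpendicular to the view direction. Assumes |forward| == 1.
bool sphereBehindViewer(const Vec3& eye, const Vec3& forward,
                        const Vec3& center, float radius) {
    assert(radius >= 0.0f);
    // Signed distance of the sphere center along the view direction.
    const float depth = dot(sub(center, eye), forward);
    return depth < -radius;
}
```

A chunk of geometry whose sphere passes this test can be skipped without touching its vertices. A sphere that straddles the plane returns false and still has to go through the normal pipeline.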

------------------------------
#pragma twice


sharewaregames.20m.com

thanks for your suggestions on how to optimize with visibility checking.
i am not doing any clipping yet, i just wanted to see how fast d3d is with pure brute force poly drawing. btw it doesn't matter much whether i use a texture, use speculars, or enable/disable other rendering options.
so what i basically take from this thread is that i can't send 20000 polys to the device and expect it to run fast on "not so highend" machines, right?
but i don't fully understand why the computational effort of implementing quadtrees or other clipping methods is lower than just sending the whole scene to d3d and letting it determine if something is visible and not send it to the card if it isn't.
but well, i guess that is why it is called "immediate" mode
