Marc aka Foddex

Mysterious FPS drop with DrawIndexedPrimitive


[DirectX 7 immediate mode] [Windows 2000]

Hi all! I'm experiencing a strange FPS drop with DrawIndexedPrimitive (DIP from now on). I'm working on an engine that's supposed to do both OpenGL and DirectX. In my test program I'm rendering a small model (about 2000 faces) using Cal3D and a small world (291 faces). The model is rendered with about 20 calls to DIP. The 291 faces all have different textures (at least they can have), so I'm rendering those with 291 calls to DIP.

These are the results on a GeForce 440 Go:

Rendering model only:
  OpenGL: 200 FPS
  Direct3D: 190 FPS
Rendering both model and world:
  OpenGL: 180 FPS
  Direct3D: 20 FPS

On my Radeon 9800 the results are:

Rendering model only:
  OpenGL: 540 FPS
  Direct3D: 490 FPS
Rendering both model and world:
  OpenGL: 480 FPS
  Direct3D: 30 FPS

What on earth could this be? In Direct3D, DIP is called 291 extra times each frame, but in OpenGL a similar call to glDrawElements is made as well, and there the FPS doesn't drop so heavily.

Any help or suggestions are greatly appreciated!

Marc

When calling DIP, make sure you specify the right vertex limits; otherwise, the hardware will retransform ALL the vertices again and again. 2000 vertices * 300 batches = 600,000 transforms.

However, this error alone shouldn't be enough to drop to 30 FPS. Is hardware vertex processing enabled?

EDIT:
Oh, and unless this particular model is the only one in the scene, 291 DrawPrimitive* calls for a single model is WAY too many.

[edited by - Coincoin on April 14, 2004 11:17:41 AM]
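
As a rough illustration of the "vertex limits" advice above, the sketch below computes the smallest contiguous vertex range that covers one batch's indices. It is plain C++ rather than Direct3D code; `VertexRange` and `tightVertexRange` are hypothetical names, and how the range maps onto the actual DIP arguments depends on which draw call is used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Smallest contiguous vertex range [first, first + count) covering a batch.
struct VertexRange {
    std::uint16_t first;  // lowest vertex index the batch refers to
    std::size_t count;    // vertices from `first` up to the highest index used
};

// Scan the batch's indices once and return the range they span.
VertexRange tightVertexRange(const std::vector<std::uint16_t>& indices) {
    assert(!indices.empty());  // an empty batch has no meaningful range
    const auto range = std::minmax_element(indices.begin(), indices.end());
    const std::uint16_t lo = *range.first;
    const std::uint16_t hi = *range.second;
    return VertexRange{lo, static_cast<std::size_t>(hi - lo) + 1};
}
```

For a triangle fan that touches vertices 5 through 9 of a 2000-vertex buffer, this gives a range of 5 vertices starting at 5 instead of all 2000.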

Guest Anonymous Poster
Instead of making 291 individual draw calls, you should be batching your geometry by material (i.e. texture). Try taking a look at this paper from NVIDIA:

http://developer.nvidia.com/attach/3442
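
A minimal sketch of what batching by texture could look like, in plain C++ with no Direct3D calls. `Face`, `fanToTriangleList` and `batchByTexture` are hypothetical names. It assumes all faces share one vertex array, and that triangle fans are first converted to triangle lists, since separate fans cannot simply be concatenated into one draw call.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

struct Face {
    int textureId;                       // hypothetical texture handle
    std::vector<std::uint16_t> indices;  // triangle-list indices into the shared vertex array
};

// Convert one triangle fan (v0, v1, ..., vn) into triangles (v0, vi, vi+1).
std::vector<std::uint16_t> fanToTriangleList(const std::vector<std::uint16_t>& fan) {
    std::vector<std::uint16_t> list;
    for (std::size_t i = 1; i + 1 < fan.size(); ++i) {
        list.push_back(fan[0]);
        list.push_back(fan[i]);
        list.push_back(fan[i + 1]);
    }
    return list;
}

// Concatenate the index lists of all faces that use the same texture,
// so each texture needs one draw call instead of one call per face.
std::map<int, std::vector<std::uint16_t>> batchByTexture(const std::vector<Face>& faces) {
    std::map<int, std::vector<std::uint16_t>> batches;
    for (const Face& face : faces) {
        std::vector<std::uint16_t>& out = batches[face.textureId];
        out.insert(out.end(), face.indices.begin(), face.indices.end());
    }
    return batches;
}
```

Each entry of the returned map would then be drawn with one texture change and one indexed draw call. This only preserves correctness when draw order within the scene does not matter.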

Thank you guys for replying!


The model of 291 faces is my 'world' (291 is of course arbitrary; another world file contains a different number of faces). It's rendered by traversing a BSP tree. There are also partially transparent faces, so the order in which the faces are rendered is important.

That's why I *think* that batch rendering for these faces is not an option (I don't see how, but I haven't yet read that NVIDIA document). Once again, it shouldn't be that big a problem, since OpenGL renders it just fine ('fine' as in 'fast').

What I do is the following (in pseudo code):
- prepare a strided data struct for all the vertices, normals and texture coordinates in the world model (I use a vertex buffer in 'user' space, not in video memory; is this a problem?)

Then, 291 times:
- call DrawIndexedPrimitive, supplying that data struct, each time with a different index array.

It works just fine, but it's not fast. Any suggestions?

Marc

I think I know what the problem is, because I've been able to improve the performance of my application greatly.

I have a buffer of about 1200 vertices, and supply this buffer to DIP. Of these 1200 vertices, I only use 4 to 5 in every call (I use triangle fans). But I think DirectX transforms them all in every call I make to DIP!! So whenever I render a single triangle fan, about 1200-5=1195 vertices get transformed without any purpose (causing a great loss of speed).

I'm now using the IDirect3DVertexBuffer7 interface: it has the ability to process (i.e. transform) all vertices once per frame, after which they can be used over and over again. This speeds up my Direct3D renderer from 20 FPS to 150 FPS, though this is still a lot slower than OpenGL (which is running at about 180 FPS).

Does anyone have any suggestions or clues about this?

Marc
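
The arithmetic behind this observation, as a small sketch. The figures (291 calls, about 1200 vertices) come from the posts above; the function names are made up for illustration, and the model assumes a software pipeline that transforms the whole supplied vertex range on every call.

```cpp
#include <cstddef>

// Vertices transformed per frame when every draw call re-transforms
// the whole buffer it is given.
constexpr std::size_t perCallCost(std::size_t drawCalls, std::size_t bufferSize) {
    return drawCalls * bufferSize;
}

// Vertices transformed per frame when the buffer is processed once up front
// and the draw calls only reuse the results.
constexpr std::size_t oncePerFrameCost(std::size_t bufferSize) {
    return bufferSize;
}
```

With 291 calls and a 1200-vertex buffer, that is 349,200 transforms per frame against 1,200 when the buffer is processed once.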

Your problem may be with the AGP bus. Since your buffers are not in video memory and you are making a large number of calls to DIP, you may run into a lot of latency and bottlenecking over the AGP bus, especially if you are not specifying a tight vertex spread. 291 calls to DIP, each sending the entire vertex buffer over AGP, is probably going to hurt.
I suggest that you move the buffers to video memory or use managed memory. Also try to batch wherever possible, and try to ensure that you have a tight index spread so that DrawPrimitive only has to send a small portion of the buffer for each call.
You say that you are using a BSP tree for scene management? Are you culling non-visible tris? That will help greatly in situations such as this (where a cull can cut down on extra DIP calls over AGP).

Cheers.

quote:
Original post by Marc aka Foddex
I'm now using the IDirect3DVertexBuffer7 interface:


If you switched to DX8 or DX9, the data could be transformed by the graphics card (assuming you create your device with hardware processing enabled). You send it to the card once, and all transformations and lighting calculations happen on the GPU. Also, when using hardware processing, the entire vertex array isn't transformed, only the vertices used. When using software processing you have to be careful about it (as you've learned).

As a bonus, D3D8 is supposed to be much cleaner and easier to use than D3D7. By investing the time in upgrading you get a few things: experience with the current D3D API (which could be good for employment purposes), a simpler app, a faster app, the ability to use new features such as shaders, and more community help (most people won't be able to help you with D3D7-specific code).

@SoulSpectre:

According to the help file, when I call "ProcessVertices" on a vertex buffer, the vertices are supposed to be transformed only *once*, so that can't be the problem anymore. I have also tried to get them into video memory instead of system memory (I'm not sure whether or not I've achieved this though =]), so I'm guessing that the VB is not transferred entirely every time I call DIP.
About managed memory: can I do that in DX7 as well? I couldn't find anything about it in the help ...

I'm not culling faces yet, by the way, but that's extremely easy to implement, so that's a matter of time. But my main concern is the large difference in performance between OpenGL and Direct3D. Once I have their performance at an equal level, I know (or assume) that I have the 'best code' for each library. After that I can start implementing other performance-boosting techniques, like culling faces.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

@Namethatnobodyelsetook:
Yeah... DirectX 8 and 9... I wish I could use them, as almost *all* the tutorials these days are in DX8 or 9. But unfortunately, I have to use DX7, because the company I work for has customers with old hardware... =[

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

About BSP trees: does anyone know if it's possible to render BSP trees faster than simply traversing the tree and rendering each face with a separate call to DIP? Since it's important to render in back-to-front order, I can't batch faces with the same texture. I also *have* to render in back-to-front order (using depth testing is not an option), because otherwise semi-transparent faces get rendered incorrectly.

Thanks in advance everyone!
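
The traversal described here can be sketched as follows. This is a generic back-to-front BSP walk, not the poster's actual code: `BspNode`, `Plane`, `Vec3`, `Run` and both functions are hypothetical. It collects face IDs into a list instead of drawing immediately, so that consecutive faces that happen to share a texture can afterwards be merged into a single call without changing the draw order.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Vec3 { float x, y, z; };
struct Plane { float nx, ny, nz, d; };  // points p on the plane satisfy n.p + d = 0

struct BspNode {
    Plane plane;
    std::vector<std::size_t> faceIds;  // faces lying in this node's plane
    std::unique_ptr<BspNode> front;    // subtree on the side where n.p + d > 0
    std::unique_ptr<BspNode> back;     // subtree on the other side
};

// Append face IDs to `out` so that faces farther from `eye` come first.
void collectBackToFront(const BspNode* node, const Vec3& eye, std::vector<std::size_t>& out) {
    if (!node) return;
    const Plane& p = node->plane;
    const float side = p.nx * eye.x + p.ny * eye.y + p.nz * eye.z + p.d;
    const BspNode* nearChild = side >= 0.0f ? node->front.get() : node->back.get();
    const BspNode* farChild = side >= 0.0f ? node->back.get() : node->front.get();
    collectBackToFront(farChild, eye, out);  // far side of the plane first
    out.insert(out.end(), node->faceIds.begin(), node->faceIds.end());
    collectBackToFront(nearChild, eye, out);  // near side last
}

// A run of consecutive faces in draw order that share one texture.
struct Run { int textureId; std::size_t first; std::size_t count; };

// Merge neighbours in the ordered list that use the same texture.
std::vector<Run> mergeRuns(const std::vector<std::size_t>& orderedFaces,
                           const std::vector<int>& textureOfFace) {
    std::vector<Run> runs;
    for (std::size_t i = 0; i < orderedFaces.size(); ++i) {
        const int tex = textureOfFace[orderedFaces[i]];
        if (!runs.empty() && runs.back().textureId == tex) {
            ++runs.back().count;  // extend the current run
        } else {
            runs.push_back(Run{tex, i, 1});  // texture changed: start a new run
        }
    }
    return runs;
}
```

How much this helps depends entirely on how often neighbouring faces in the traversal order share a texture; in the worst case every run has length one and nothing is gained.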

If you have processed the vertices, then your problem may lie in your index spread. If you are sending the entire vertex buffer over the AGP bus for every DIP call, you will take a performance hit because of the restricted bandwidth, whereas the DrawPrimitive call only sends the portion of the buffer that you specify in the call. You should always try to ensure that your buffers reside in video memory wherever possible, especially for textures; sending textures continuously over AGP will have a noticeable impact on FPS. Also, if you move your buffers to video memory, never read from them, only write to them; the impact of reading is huge.

Because you are using an index buffer, I am assuming that you are sharing vertices. For a sphere you will have a lot of shared vertices (6 triangles could share one vertex), and this will generate a wide index spread unless you ensure that the vertices are properly ordered.
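
One way to get the "properly ordered" vertices mentioned above is to renumber them in order of first use, so that indices drawn together end up close together in the buffer. A minimal sketch, assuming 16-bit indices; `Remap` and `reorderByFirstUse` are hypothetical names, and vertices that no index refers to are simply dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Remap {
    std::vector<std::uint16_t> newToOld;  // newToOld[n] = old index of the vertex now in slot n
    std::vector<std::uint16_t> indices;   // the input indices rewritten to the new numbering
};

// Walk the indices in draw order and give each vertex the next free slot
// the first time it is seen.
Remap reorderByFirstUse(const std::vector<std::uint16_t>& indices, std::size_t vertexCount) {
    const std::uint16_t unassigned = 0xFFFF;  // sentinel: vertex not seen yet
    assert(vertexCount <= 0xFFFF);            // keeps the sentinel out of the valid range
    std::vector<std::uint16_t> oldToNew(vertexCount, unassigned);
    Remap result;
    result.indices.reserve(indices.size());
    for (std::uint16_t oldIndex : indices) {
        assert(oldIndex < vertexCount);
        if (oldToNew[oldIndex] == unassigned) {
            oldToNew[oldIndex] = static_cast<std::uint16_t>(result.newToOld.size());
            result.newToOld.push_back(oldIndex);
        }
        result.indices.push_back(oldToNew[oldIndex]);
    }
    return result;
}
```

The vertex array then has to be rebuilt by copying old vertex `newToOld[n]` into slot `n` before the rewritten indices are used.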

Why are you not using the depth buffer? It makes life so much easier. You could then split your tris into opaque and transparent groups, easily batch each group by texture, and render them separately: the opaque ones first, the transparent ones after. Having no depth buffer is going to destroy any decent chance of batching, and this will affect performance, especially over AGP.
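
The split suggested here can be sketched as a sort over draw items. All names are hypothetical, and `viewDepth` is assumed to be a distance from the camera where larger means farther away. Opaque items are grouped by texture so they can be batched, and transparent items are ordered far to near and drawn afterwards.

```cpp
#include <algorithm>
#include <vector>

struct DrawItem {
    int textureId;     // hypothetical texture handle
    bool transparent;  // true if the face needs blending
    float viewDepth;   // assumed: distance from the camera, larger = farther
};

// Return the items in the order they should be drawn.
std::vector<DrawItem> buildDrawOrder(std::vector<DrawItem> items) {
    // Move all opaque items in front of all transparent ones.
    const auto firstTransparent = std::stable_partition(
        items.begin(), items.end(),
        [](const DrawItem& d) { return !d.transparent; });

    // Opaque group: the depth buffer resolves visibility, so order by
    // texture to make equal textures adjacent.
    std::stable_sort(items.begin(), firstTransparent,
        [](const DrawItem& a, const DrawItem& b) { return a.textureId < b.textureId; });

    // Transparent group: blending depends on order, so draw far to near.
    std::stable_sort(firstTransparent, items.end(),
        [](const DrawItem& a, const DrawItem& b) { return a.viewDepth > b.viewDepth; });

    return items;
}
```

A common arrangement, stated here as an assumption rather than something the thread spells out, is to keep depth testing on for both groups and turn depth writes off while drawing the transparent group.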


quote:
Original post by SoulSpectre
Why are you not using the depth buffer? It makes life so much easier. You could then split your tris into opaque/transparent groups and easily batch the two groups into texture batches etc., then render them separately, the opaque first, the transparent lot after. Having no depth buffer is going to destroy any decent chance of batching and this will affect performance, especially over the AGP.



That's a very good point actually... could have thought of that myself. It has one major implementation issue though: my lightmaps are batched together on a few large textures. I can batch the tris that share one and the same normal texture, but if their lightmaps are all on different textures, then I'm still getting nowhere (since I'm using multitexturing, not multi-pass rendering).

Guess I'll have to update my lighting compiler then. Or does someone have some comments on this lightmap problem?
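
With multitexturing, faces can only share a batch when both the base texture and the lightmap match, so the grouping key becomes the pair of the two. A rough sketch with hypothetical names (`LitFace`, `TexturePair`, `groupByTexturePair`); the number of groups it returns gives a feel for how many draw calls the opaque pass would need with a given lightmap layout.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

struct LitFace {
    int textureId;   // base texture (first texture stage)
    int lightmapId;  // lightmap page (second texture stage)
};

using TexturePair = std::pair<int, int>;

// Group face indices by (base texture, lightmap page).
std::map<TexturePair, std::vector<std::size_t>> groupByTexturePair(
        const std::vector<LitFace>& faces) {
    std::map<TexturePair, std::vector<std::size_t>> groups;
    for (std::size_t i = 0; i < faces.size(); ++i) {
        groups[TexturePair(faces[i].textureId, faces[i].lightmapId)].push_back(i);
    }
    return groups;
}
```

If the lighting compiler packs the lightmaps of faces that share a base texture onto the same page, the number of groups shrinks toward the number of base textures.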
