A plethora of DXMeshs : best practice

11 comments, last by The Frugal Gourmet 18 years, 10 months ago
Hi guys, I know this has been asked before in various ways, but I'm having difficulty compiling useful information with the search feature. ... I have around 50-100 "space fighter" models all loaded from the same .x resource. So, all 50-100 items are identical in every way except they each maintain different locations (angle, scale, position) on the screen. Currently, I am rendering using the Mesh class (in C#, but similar to the ID3DXMesh interface in C++). In other words, for every item, every frame, I call Mesh->DrawSubset(). The mesh complexity is not trivial, and with all those space fighters (and the rest of my game chugging along) I get barely acceptable frame rates on a mid-range video card. Is there a good way to optimize this practice? Something worth refactoring for? Using the Mesh class is the simplest option for me, but I would be willing to optimize if the speed gains are worth it.
Co-creator of Star Bandits -- a graphical Science Fiction multiplayer online game, in the style of "Trade Wars".
Are you just using Mesh, or are you using an animated mesh with AnimationRootFrame and MeshContainers?

Just Mesh.
Quote:Original post by The Frugal Gourmet
I have around 50-100 "space fighter" models all loaded from the same .x resource. So, all 50-100 items are identical in every way except they each maintain different locations (angle, scale, position) on the screen.

One thing you can do on shader 2.0/3.0 cards is use instancing. Compared to rendering all those identical items separately, it lets you draw all 50-100 in a single draw call, which is usually very effective.
Sweet.

All right, I have never actually used instancing before, so I will give it a shot. Researching it now. I assume there is no integrated instancing with the Mesh class, and that I have to get and handle my buffers separately.
Instancing is a great feature, I like it a lot... But it's worth bearing in mind that instancing is a fairly advanced feature; at the time of writing, only the GeForce 6x00 series supports it (to my knowledge!). This doesn't really help you when you specifically mention mid-range hardware [sad]

Have you got a culling algorithm implemented? Or are all 50-100 ships always visible on-screen each frame? Often the best improvements for rendering can come from a more aggressive/efficient culling system. Why render something that's not actually visible? [smile]

Another thing to consider: are you re-calculating all world matrices for each render call? Whilst composing a full matrix is a relatively quick operation, you could probably shave some time off here. I wrote my own Scale*Rotation*Translate function that was 2.5x faster than D3DX, for the simple reason that I could use mathematical elimination to drop the terms that are always zero or one...
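To make the "mathematical elimination" idea concrete, here is a minimal sketch, not Jack's actual function. It assumes a rotation about the Y axis only and the row-vector convention (v' = v * M); the names Mat4, composeGeneric and composeDirect are made up for illustration. The generic path builds three matrices and multiplies them, while the direct path writes the known non-zero entries of S*R*T straight into the result.

```cpp
#include <cmath>

// Row-major 4x4 matrix, row-vector convention (v' = v * M).
// Mat4 and the function names below are hypothetical, for illustration only.
struct Mat4 {
    float m[4][4];
};

// Generic 4x4 multiply: 64 multiplications per call.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};  // zero-initialised
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

Mat4 identity() {
    Mat4 r{};
    for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0f;
    return r;
}

// Reference path: build S, R and T separately, then multiply S * R * T.
Mat4 composeGeneric(float sx, float sy, float sz, float yaw,
                    float tx, float ty, float tz) {
    Mat4 s = identity();
    s.m[0][0] = sx; s.m[1][1] = sy; s.m[2][2] = sz;

    Mat4 r = identity();
    const float c = std::cos(yaw), sn = std::sin(yaw);
    r.m[0][0] = c;  r.m[0][2] = -sn;
    r.m[2][0] = sn; r.m[2][2] = c;

    Mat4 t = identity();
    t.m[3][0] = tx; t.m[3][1] = ty; t.m[3][2] = tz;

    return multiply(multiply(s, r), t);
}

// Direct path: S * R scales the rows of R, and multiplying by T only
// fills in the last row, so the result can be written down directly
// with four multiplications and no temporaries.
Mat4 composeDirect(float sx, float sy, float sz, float yaw,
                   float tx, float ty, float tz) {
    const float c = std::cos(yaw), sn = std::sin(yaw);
    Mat4 w{};
    w.m[0][0] = sx * c;  w.m[0][2] = -sx * sn;
    w.m[1][1] = sy;
    w.m[2][0] = sz * sn; w.m[2][2] = sz * c;
    w.m[3][0] = tx; w.m[3][1] = ty; w.m[3][2] = tz; w.m[3][3] = 1.0f;
    return w;
}
```

Both functions should produce the same matrix for the same inputs; how much time the direct version actually saves depends on the compiler and on how general your rotations need to be, so measure before refactoring.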

Do all of the fighters move on each and every frame? You might well be able to take advantage of temporal consistencies by caching recently used configurations.
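One simple way to exploit that kind of frame-to-frame coherence is a dirty flag: the world matrix is rebuilt only when a fighter's transform has actually changed. The sketch below is an assumption about how this could look, not code from the thread; the Fighter class, its setters, and the Y-axis-only rotation are all hypothetical.

```cpp
#include <array>
#include <cmath>

// Row-major 4x4 matrix stored flat, row-vector convention.
using Mat4 = std::array<float, 16>;

// Hypothetical fighter that caches its world matrix between frames.
class Fighter {
public:
    void setPosition(float x, float y, float z) {
        x_ = x; y_ = y; z_ = z;
        dirty_ = true;
    }
    void setYaw(float yaw) { yaw_ = yaw; dirty_ = true; }
    void setScale(float s) { scale_ = s; dirty_ = true; }

    // Returns the cached matrix, rebuilding it only if something changed
    // since the last call.
    const Mat4& world() {
        if (dirty_) {
            rebuild();
            dirty_ = false;
            ++rebuildCount_;
        }
        return world_;
    }

    // Exposed only so the effect of the cache can be observed.
    int rebuildCount() const { return rebuildCount_; }

private:
    // Uniform scale, then rotation about Y, then translation.
    void rebuild() {
        const float c = std::cos(yaw_), s = std::sin(yaw_);
        world_ = {scale_ * c, 0.0f,   -scale_ * s, 0.0f,
                  0.0f,       scale_, 0.0f,        0.0f,
                  scale_ * s, 0.0f,   scale_ * c,  0.0f,
                  x_,         y_,     z_,          1.0f};
    }

    float x_ = 0.0f, y_ = 0.0f, z_ = 0.0f;
    float yaw_ = 0.0f, scale_ = 1.0f;
    Mat4 world_{};
    bool dirty_ = true;
    int rebuildCount_ = 0;
};
```

If every fighter really does move every frame, this buys nothing, so it is worth checking how often the matrices change before adding the bookkeeping.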

hth
Jack

Jack Hoxley [ Forum FAQ | Revised FAQ | MVP Profile | Developer Journal ]

Quote:Original post by jollyjeffers
Instancing is a great feature, I like it a lot... But it's worth bearing in mind that instancing is a fairly advanced feature; at the time of writing, only the GeForce 6x00 series supports it (to my knowledge!). This doesn't really help you when you specifically mention mid-range hardware [sad]


Yes, I studied the subject for a few hours today but so far I have come to the conclusion it wouldn't necessarily apply that well to my situation and might not be worth the substantial effort for my Indie project.

Quote:
Have you got a culling algorithm implemented? Or are all 50-100 ships always visible on-screen each frame? Often the best improvements for rendering can come from a more aggressive/efficient culling system. Why render something that's not actually visible? [smile]


In this case, all 50-100 are on the screen at a time and all are visible.

Quote:

Another thing to consider: are you re-calculating all world matrices for each render call? Whilst composing a full matrix is a relatively quick operation, you could probably shave some time off here. I wrote my own Scale*Rotation*Translate function that was 2.5x faster than D3DX, for the simple reason that I could use mathematical elimination to drop the terms that are always zero or one...

Do all of the fighters move on each and every frame? You might well be able to take advantage of temporal consistencies by caching recently used configurations.


Thanks for the insight. I am -- in fact -- computing world matrices for each fighter every single frame, and I might be able to shave off some time here. I think this may be a potential speed boost if I refactored some things.

Also: Are there any speed boosts to be obtained from NOT using the Mesh class that comes with DX?
Quote:Original post by jollyjeffers
Instancing is a great feature, I like it a lot... But it's worth bearing in mind that instancing is a fairly advanced feature; at the time of writing, only the GeForce 6x00 series supports it (to my knowledge!). This doesn't really help you when you specifically mention mid-range hardware [sad]

Actually, instancing can be done on Radeon 9500 hardware and up, and is not restricted to shader model 3.0 cards. For pure hardware instancing, yes, you need an SM 3.0 card, but check out shader instancing, constant instancing, etc. in the DX SDK and you will find that with an SM 2.0 card you can achieve almost the same results. It is well worth it in the end. At least it was in my case.

Also, for the OP: have you thought about using imposters for things further away, where you don't need as much detail?
Quote:Original post by Saruman
Quote:Original post by jollyjeffers
Instancing is a great feature, I like it a lot... But it's worth bearing in mind that instancing is a fairly advanced feature; at the time of writing, only the GeForce 6x00 series supports it (to my knowledge!). This doesn't really help you when you specifically mention mid-range hardware [sad]

Actually, instancing can be done on Radeon 9500 hardware and up, and is not restricted to shader model 3.0 cards.

Interesting. Now that you mention it, I do vaguely remember reading about this in one of those many conference/technical slides.

Quote:Are there any speed boosts to be obtained from NOT using the Mesh class that comes with DX?

Yes, it is quite possible that you can get some speed here. Bottom line is that the D3DX functionality is good for the vast majority of things, but there are always going to be some exceptions where you can do it better - but just remember that you have to write and maintain it [smile]

I'm not too sure what you could do by re-writing a Mesh class here though, probably something about grouping stuff together in the same buffers (say 2-4 huge buffers instead of 50-100 small ones)...


To try a different 'angle': how much profiling have you done, both of the "pure" code that you wrote and of the D3D side of things using PIX? It would be very useful to know whether you are CPU bound or GPU bound. If your mid-range hardware quite simply doesn't have the grunt to display so much geometry, use imposters as Saruman suggested, or maybe LOD. Alternatively, it might well be that the organization of your rendering is stalling the CPU, in which case look at batching or re-arranging your render process.

hth
Jack


All of the fighters are identical? Skipping the D3DX functionality could save you a lot. You can use a single vertex buffer and index buffer to render every one of them: set the stream source, textures, and render states once, then enter a tight loop that uploads a single world matrix and calls Draw(Indexed)Primitive for each fighter.
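The shape of that loop can be sketched without any Direct3D at all. CountingDevice below is a hypothetical stand-in that only counts calls; the comments name the IDirect3DDevice9 methods that would play each role in a real renderer, and Mat4 and drawFighters are made-up names as well.

```cpp
#include <vector>

struct Mat4 {
    float m[16];
};

// Hypothetical stand-in for a rendering device. It does no rendering and
// only records how often each kind of call is made.
struct CountingDevice {
    int bufferBinds = 0;
    int textureBinds = 0;
    int matrixUploads = 0;
    int drawCalls = 0;

    void bindMeshBuffers() { ++bufferBinds; }    // D3D9: SetStreamSource + SetIndices
    void bindTexture() { ++textureBinds; }       // D3D9: SetTexture
    void setWorldMatrix(const Mat4&) { ++matrixUploads; }  // D3D9: SetTransform(D3DTS_WORLD, ...)
    void drawIndexed() { ++drawCalls; }          // D3D9: DrawIndexedPrimitive
};

// Shared state is set once, outside the loop. Per fighter, only the
// world matrix changes before the draw call.
void drawFighters(CountingDevice& dev, const std::vector<Mat4>& worlds) {
    dev.bindMeshBuffers();
    dev.bindTexture();
    for (const Mat4& world : worlds) {
        dev.setWorldMatrix(world);
        dev.drawIndexed();
    }
}
```

For 100 fighters this gives one buffer bind, one texture bind, 100 matrix uploads and 100 draw calls. The draw call count is unchanged; what goes away is the redundant per-fighter state setup.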

Another method is the skinning-instance technique. You would have to keep all of your fighters in the same buffer, and store an index in every vertex that identifies which fighter the vertex belongs to. When you render, you upload a set of matrices and then call DrawPrimitive only 1 to 5 times. For example, you could upload 25 matrices at a time, and your vertex shader uses the per-vertex index to pick the world matrix for that vertex. In short, it lets you apply around 25 matrices to a single mesh so that its sub-sections move independently of each other. Also keep in mind that cull checks will be a pain.
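The CPU-side setup for this could look roughly like the following sketch. It assumes a batch size of 25 as in the example above; Vertex, Batch, replicateMesh and makeBatches are hypothetical names, and the vertex shader that indexes into the matrix array is not shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout: a position plus the index of the fighter
// (within the current batch) that this vertex belongs to.
struct Vertex {
    float x, y, z;
    int instance;
};

// Build one big vertex list holding `copies` copies of the mesh, tagging
// every vertex with the copy it came from. The vertex shader would use
// this tag to select a world matrix from the uploaded array.
std::vector<Vertex> replicateMesh(const std::vector<Vertex>& mesh, int copies) {
    std::vector<Vertex> out;
    if (copies <= 0) return out;
    out.reserve(mesh.size() * static_cast<std::size_t>(copies));
    for (int i = 0; i < copies; ++i) {
        for (Vertex v : mesh) {
            v.instance = i;
            out.push_back(v);
        }
    }
    return out;
}

// One batch corresponds to one matrix upload followed by one draw call.
struct Batch {
    std::size_t firstFighter;
    std::size_t count;
};

// Split the fighters into groups of at most `batchSize`.
std::vector<Batch> makeBatches(std::size_t fighterCount, std::size_t batchSize) {
    std::vector<Batch> batches;
    if (batchSize == 0) return batches;  // nothing sensible to do
    for (std::size_t first = 0; first < fighterCount; first += batchSize) {
        batches.push_back({first, std::min(batchSize, fighterCount - first)});
    }
    return batches;
}
```

With 25 matrices per batch, 100 fighters need 4 draw calls and 60 fighters need 3, where the last batch draws only its first 10 copies. The batch size is limited by how many vertex shader constant registers are available for the matrix array.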
