DirectX 9: General Optimisation Tips ?


Started by rubicondev May 26, 2007 01:59 PM

10 comments, last by rubicondev 16 years, 11 months ago

296

Author

May 26, 2007 01:59 PM

Let's forget for a moment about demos and ideal-world applications that a tech guy from nVidia would rustle up to show off; I'm talking about actual real-world engines. How badly do drawprim counts count against everything else, for example? My batches are fairly small, but they have to be - there's not much point putting a trillion faces into a telegraph pole, yet all those poles need rendering separately so they can be positioned.

I've sort of reached a point where I feel that I'm missing something major, as my engine feels kinda slow. It does all lighting per-pixel, but using ubershaders so it's all single pass. In a scene I have with around 250 DPs (and no lighting atm) I'm getting a framerate from my 1950 that feels more in line with a PlayStation. I've commented out chunks of stuff and I never seem to get much of a speed-up.

I guess I'm looking for general optimising advice. It's no good suggesting playing with PIX - I'm at a stage that's way before that level of scrutiny. I'm considering an overhaul of the pipeline at a gross scale, but I'm not quite sure which way to go because I don't really know what's wrong with what I have now. It's driving me crazy.

Is there a decent *current* paper on general performance guidelines for real-world apps?

------------------------------Great Little War Game

Promit

13,404

May 26, 2007 02:21 PM

This presentation is practically legendary when it comes to this stuff. And there's a lot more that is worth reading.

SlimDX | Ventspace Blog | Twitter | Diverse teams make better games. I am currently hiring capable C++ engine developers in Baltimore, MD.

rubicondev

296

Author

May 26, 2007 03:15 PM

Direct hit. Many thanks...

------------------------------Great Little War Game

Harry Hunt

543

May 26, 2007 03:25 PM

From my own (recent) experience: shader switches are the devil. Try to switch only when you absolutely have to. I got a 300% increase in FPS just from reducing shader switches. Also, I noticed that while a lot of people tell you that you should sort by shader, grouping by shader is really what you need. The difference is that grouping can be done in O(n), while a comparison sort is at best an O(n log n) operation.
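To make the grouping-versus-sorting point concrete, here is a minimal sketch of grouping in linear time with a counting pass. It assumes shaders are identified by small integer ids; the `DrawCall` struct, its fields, and both function names are made up for illustration and are not from any real engine or from Direct3D.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical draw-call record; the field names are illustrative only.
struct DrawCall {
    std::uint32_t shaderId;  // assumed: index into the engine's shader table
    std::uint32_t meshId;    // assumed: whatever identifies the geometry
};

// Groups draw calls by shader in O(n + shaderCount) using a counting pass,
// rather than an O(n log n) comparison sort. Calls that share a shader end
// up contiguous, and their relative order within a group is preserved.
std::vector<DrawCall> groupByShader(const std::vector<DrawCall>& calls,
                                    std::size_t shaderCount) {
    // Pass 1: count how many calls use each shader.
    std::vector<std::size_t> offsets(shaderCount + 1, 0);
    for (const DrawCall& c : calls) {
        assert(c.shaderId < shaderCount);
        ++offsets[c.shaderId + 1];
    }
    // A prefix sum turns the counts into the start offset of each group.
    for (std::size_t i = 1; i <= shaderCount; ++i) {
        offsets[i] += offsets[i - 1];
    }
    // Pass 2: scatter each call into the next free slot of its group.
    std::vector<DrawCall> grouped(calls.size());
    for (const DrawCall& c : calls) {
        grouped[offsets[c.shaderId]++] = c;
    }
    return grouped;
}

// Counts the shader changes a renderer would make walking the list in order.
std::size_t countShaderSwitches(const std::vector<DrawCall>& calls) {
    std::size_t switches = 0;
    for (std::size_t i = 1; i < calls.size(); ++i) {
        if (calls[i].shaderId != calls[i - 1].shaderId) ++switches;
    }
    return switches;
}
```

The order of the groups themselves does not matter for avoiding switches, which is why a full sort is more work than needed. If shader ids are sparse or are pointers, a hash map of buckets would play the same role.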

rubicondev

296

Author

May 26, 2007 03:31 PM

I'm already doing that (the sorting part anyway). In fact I'm doing quite a lot of things like that, which is why I don't get why my engine feels so slow. If I turn off the rendering completely then it rips, so I know the problem's not in the rest of my game.

Keep the tips coming though, handy thread already. :)

------------------------------Great Little War Game

Ashkan

451

May 27, 2007 04:29 AM

In a non-pipelined single processor architecture, optimizing the wrong place would gain you little extra performance. The beauty of programming pipelined multiple processor architectures is that optimizing the wrong place would gain ZERO extra performance! A chain is only as strong as its weakest link. A pipeline is as fast as its slowest stage.

If the part you're currently optimizing is not where the bottleneck is located, all your efforts will be in vain. If your application is not CPU-bound, for instance, reducing the number of batches per frame won't help at all.

So, the key to the wonderland is:

Quote:
FIRST locate the bottleneck, THEN try to optimize it. This won't totally remove the bottleneck. It simply moves the bottleneck to another part of your application and boosts your performance to some extent. Relocate the bottleneck, then optimize it again. Do this as many times as you need until you reach the sweet spot. You may then exploit the idle time in the other stages and make those stages more complex for free!

PROFILE and OPTIMIZE
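A toy model of the argument above, assuming an idealised pipeline whose stages overlap perfectly so that the frame time equals the time of the slowest stage. The stage timings used with it are invented numbers, not measurements, and real hardware is messier than this.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Frame time of an idealised, fully overlapped pipeline: the slowest stage
// sets the pace, so the result is simply the maximum stage time.
double frameTimeMs(const std::vector<double>& stageMs) {
    assert(!stageMs.empty());
    return *std::max_element(stageMs.begin(), stageMs.end());
}

// Index of the stage that currently limits the frame rate.
std::size_t bottleneckStage(const std::vector<double>& stageMs) {
    assert(!stageMs.empty());
    return static_cast<std::size_t>(
        std::max_element(stageMs.begin(), stageMs.end()) - stageMs.begin());
}
```

For example, with made-up stage times of 20 ms (CPU submission), 5 ms (vertex work) and 12 ms (pixel work), halving the vertex stage leaves the frame at 20 ms. Halving the CPU stage to 10 ms brings the frame down to 12 ms and moves the bottleneck to the pixel stage, which is the "relocate, then optimize again" loop described in the quote.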

rubicondev

296

Author

May 27, 2007 04:42 AM

I'm all for general profiling, trouble is I don't know how to do it properly anymore.

My feelings about VTune are expressed quite concisely elsewhere, but I don't know what other products are available that actually work. I used Metrowerks CATS a long time ago but it seems to be discontinued. Dev no longer has its simple one built in.

PIX on 360 is okay, but the bottlenecks on that beast are in totally different places so I've gone as far as I can with that. The PC version of PIX is grim by comparison and doesn't seem to show much about system-wide clashes and bottlenecks.

I'm currently downloading the beta of ATI's PerfHUD equivalent. Hopefully that'll give me some insight, but my download seems to be running at a byte an hour so who knows....

I do strongly suspect I'm CPU bound. I'm using a lot of it for my game (it has a simple fluid dynamics system in it), and turning off rendering completely makes it fly. I have about 250 smallish batches to draw 80K polys, all of which are single pass. I wouldn't expect this to be bound by anything tbh - have we gone backwards?

I'm certainly no newb at this stuff, but I do think I must have a schoolboy error somewhere. I just can't find it!

------------------------------Great Little War Game

RivieraKid

706

May 27, 2007 04:57 AM

Quote:Original post by RubiconMobile
How badly do drawprim counts count against everything else for example ? My batches are fairly small, but they have to be - there's not much point putting a trillion faces into a telegraph pole, yet all those poles need rendering separately so they can be positioned.

Total War doesn't draw each soldier with a single call to dip.

see here
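One way to cut the draw count for repeated static objects such as the telegraph poles is to merge the copies into a single vertex and index buffer on the CPU, so they can go out in one draw call. Note that this is plain static batching, not hardware instancing, and it is only a sketch: the `Vec3` and `Mesh` types and the `mergeInstances` function are hypothetical, only translation is handled (no rotation or scale), and 16-bit indices are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical CPU-side mesh: positions plus 16-bit triangle indices.
struct Mesh {
    std::vector<Vec3> positions;
    std::vector<std::uint16_t> indices;
};

// Builds one mesh containing a translated copy of `source` per entry in
// `offsets`. Each copy's indices are rebased so they address its own
// vertices within the merged vertex list.
Mesh mergeInstances(const Mesh& source, const std::vector<Vec3>& offsets) {
    // 16-bit indices can address at most 65536 vertices.
    assert(source.positions.size() * offsets.size() <= 65536);

    Mesh merged;
    merged.positions.reserve(source.positions.size() * offsets.size());
    merged.indices.reserve(source.indices.size() * offsets.size());

    for (const Vec3& o : offsets) {
        // Index of the first vertex belonging to this copy.
        const std::size_t base = merged.positions.size();
        for (const Vec3& v : source.positions) {
            merged.positions.push_back(Vec3{v.x + o.x, v.y + o.y, v.z + o.z});
        }
        for (std::uint16_t idx : source.indices) {
            merged.indices.push_back(static_cast<std::uint16_t>(base + idx));
        }
    }
    return merged;
}
```

The trade-off is memory and flexibility: the merged copies can no longer be moved or culled individually without rebuilding the buffers, so this suits scenery that never moves. It also only pays off if the application really is limited by draw-call overhead on the CPU.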

Ashkan

451

May 27, 2007 05:24 AM

NVPerfHUD is a great profiling tool to look into, although you already seem to be aware of it. You also need an NVIDIA card if you are to use it to its full extent. I suggest you first test your application with that (or its ATI equivalent) to validate your guess and make sure you are indeed CPU-bound. If you actually are, go for algorithmic optimizations first rather than coding hacks, since they will yield much bigger improvements unless you're making some BIG coding mistake. Go for coding optimizations next.

The documents that come with NVPerfHUD are nevertheless very interesting. They would help a lot if you feel kind of lost. In particular, there was one guide on optimization how-tos, whose name I've unfortunately forgotten, but I'm sure you'll be able to find it. "NVIDIA GPU Programming Guide" is also a good read. You can find that on NVIDIA's developer website.

rubicondev

296

Author

May 27, 2007 06:27 AM

I think I'm currently downloading the entire nVidia website now, thanks :)

Instancing is at the top of my engine wishlist, but it won't help my current projects as they don't have lots of repeats - the telegraph poles were just a classic example of batching problems. (The lead Xbox programmer for Spartan is a colleague of mine btw)

------------------------------Great Little War Game
