Batching and minimizing DIPs and state changes

It has come to my attention that although general rendering performance guidelines suggest minimizing draw call counts and render state changes, most big AAA titles do only very minimal optimization in this area, or none at all. For example, I PIX'ed a few popular games to see what's going on:

Mass Effect - ~1000 draw calls per frame, 20,000(!) render state changes, and ~1000 DrawPrimitiveUP calls per frame
Civilization V (DX11) - 3000+ draw calls, a thousand or so pixel/vertex shader changes, 500+ texture changes, etc. per frame on average

What's up with this nonsense? Why does no one even bother to optimize this?
Really no clue?
Maybe they already optimized what they could...
Maybe they decided that performance was good enough and further optimizing was not worth the effort...
Maybe they were quickly approaching a deadline...
You can only minimise so much before the work involved in doing so becomes overly onerous and you get into diminishing returns. Maybe these titles genuinely do have ~1000 unique object states per frame, and therefore can't go any lower? 1000 draw calls is really quite low for a large complex scene, so I certainly wouldn't call it "very minimal or no optimizations". "No optimizations" in a 1,000,000 triangle scene would be 1,000,000 draw calls, after all.

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.


You can only minimise so much before the work involved in doing so becomes overly onerous and you get into diminishing returns. Maybe these titles genuinely do have ~1000 unique object states per frame, and therefore can't go any lower? 1000 draw calls is really quite low for a large complex scene, so I certainly wouldn't call it "very minimal or no optimizations". "No optimizations" in a 1,000,000 triangle scene would be 1,000,000 draw calls, after all.


Not true.

Look at this (ignore the grey stuff): http://img202.images...112/greybar.jpg

Each yield icon (the green/gold coins on the map) gets its own draw call(!), and each hex grid overlay (the white border around EACH hex) gets a draw call, ignoring the fact that you could do that in one draw call even without instancing - and with simpler code! With all of this, I think that screenshot is producing well over 10k DIPs, and it's NOT fast.

I'm not a pro graphics engineer, but with 10 minutes of thought you could render all the yield icons with two draw calls and one texture change (they are using DX11, FFS!), and, even more obviously, you could overlay the hex grid with one draw call, or none at all if you just sampled it while drawing the terrain...
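To make the idea concrete, here is a minimal sketch of batching all the icons into one texture bind and one draw call. It assumes a D3D9-style device (since DrawPrimitiveUP came up earlier in the thread) and a single texture atlas containing every icon image; the Icon type, AppendQuad helper, and DrawAllIcons function are illustrative names, not anything from Civilization V:

```cpp
#include <d3d9.h>
#include <vector>

// One screen-space icon: where it goes on screen and where its image sits in the atlas.
// These types are illustrative only.
struct IconVertex { float x, y, z, rhw; float u, v; };
struct Icon       { float x, y, w, h;  float u0, v0, u1, v1; };

// Append the two triangles of one icon quad to the batch (hypothetical helper).
static void AppendQuad(std::vector<IconVertex>& out, const Icon& ic)
{
    IconVertex v[4] = {
        { ic.x,        ic.y,        0, 1, ic.u0, ic.v0 },
        { ic.x + ic.w, ic.y,        0, 1, ic.u1, ic.v0 },
        { ic.x + ic.w, ic.y + ic.h, 0, 1, ic.u1, ic.v1 },
        { ic.x,        ic.y + ic.h, 0, 1, ic.u0, ic.v1 },
    };
    const int idx[6] = { 0, 1, 2, 0, 2, 3 };
    for (int i : idx) out.push_back(v[i]);
}

// All yield icons: one SetTexture and one draw call, instead of one of each per icon.
void DrawAllIcons(IDirect3DDevice9* dev, IDirect3DTexture9* atlas,
                  const std::vector<Icon>& icons)
{
    std::vector<IconVertex> verts;
    verts.reserve(icons.size() * 6);                // two triangles per icon quad
    for (const Icon& ic : icons)
        AppendQuad(verts, ic);

    dev->SetFVF(D3DFVF_XYZRHW | D3DFVF_TEX1);       // pre-transformed quads + one UV set
    dev->SetTexture(0, atlas);                      // one texture bind for everything
    dev->DrawPrimitiveUP(D3DPT_TRIANGLELIST,
                         (UINT)verts.size() / 3,    // triangle count
                         verts.data(), sizeof(IconVertex));
}
```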

My thought is that they just don't give a damn... Same thing with my beloved Blizzard - in WoW, when drawing terrain chunks, instead of doing a simple test like "if (stage0Texture != textureToSet)" they just say "what the heck" and do nothing, ignoring the fact that WoW is heavily CPU-bound and targeted at a wide range of hardware, low-to-mid end included (it was released in 2004!) - and I tested it: I built the same thing, and doing this simple optimization gave an 11% performance gain for about 20 lines of extremely simple code.
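For reference, the kind of redundant-state filter being described really is only a few lines. A minimal sketch, assuming a D3D9-style device; the StateCache wrapper and its array size are illustrative, not WoW's actual code:

```cpp
#include <d3d9.h>

// Wraps texture binds and skips the API call when the same texture is already bound.
class StateCache
{
    IDirect3DDevice9*      m_dev;
    IDirect3DBaseTexture9* m_boundTexture[8] = {};   // last texture seen per stage

public:
    explicit StateCache(IDirect3DDevice9* dev) : m_dev(dev) {}

    void SetTexture(DWORD stage, IDirect3DBaseTexture9* tex)
    {
        if (m_boundTexture[stage] != tex)            // only touch the device on a real change
        {
            m_dev->SetTexture(stage, tex);
            m_boundTexture[stage] = tex;
        }
    }
};
```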


So my thought is: if you are an indie developer making your own game, don't even try to optimize this stuff - no one else is doing it and they're getting away with it, even when targeting low-to-mid-end hardware. This kind of optimization has proven to be useless, so don't listen to the 101 texts saying batch, batch, batch, OMG draw calls, etc. - just write your game to be flexible and start worrying once you're reaching over 3k draw calls in an average scene. ;P
I don't think there are noobs working at the large companies. I'm pretty sure they know they could have done better, but when deadlines are approaching, the publisher keeps the pressure up, there are other tasks with higher priority, AND you know that your game will sell shitloads either way...

I hope your advice to indies to not give a f*** was just sarcastic. :)
umadbro?
"Spending your life waiting for the messiah to come save the world is like waiting around for the straight piece to come in Tetris...even if it comes, by that time you've accumulated a mountain of shit so high that you're fucked no matter what you do. "

Each yield icon (the green/gold coins on the map) gets its own draw call(!), and each hex grid overlay (the white border around EACH hex) gets a draw call, ignoring the fact that you could do that in one draw call even without instancing - and with simpler code! With all of this, I think that screenshot is producing well over 10k DIPs, and it's NOT fast.
My thought is that they just don't give a damn... Same thing with my beloved Blizzard - in WoW, when drawing terrain chunks, instead of doing a simple test like "if (stage0Texture != textureToSet)" they just say "what the heck" and do nothing, ignoring the fact that WoW is heavily CPU-bound and targeted at a wide range of hardware, low-to-mid end included (it was released in 2004!) - and I tested it: I built the same thing, and doing this simple optimization gave an 11% performance gain for about 20 lines of extremely simple code.
The simple explanation is that all those coins on the map were implemented by a gameplay programmer, not a graphics programmer. Furthermore, graphics programming was most likely outsourced to an engine developer like Emergent.
So -- designer asks gameplay to put coins on the map, gameplay uses the engine to do so. Performance is "ok", so no one delves deeper into the flaws of the engine. The engine's graphics programmer works in a different building and isn't directly exposed to the horrible abuse his modules are receiving from negligent gameplay code, so he doesn't improve his interfaces.

It's very easy to imagine how perfectly optimised code doesn't get produced in the real world, where time and money are quite limited.

So my thought is: if you are an indie developer making your own game, don't even try to optimize this stuff - no one else is doing it and they're getting away with it, even when targeting low-to-mid-end hardware. This kind of optimization has proven to be useless, so don't listen to the 101 texts saying batch, batch, batch, OMG draw calls, etc. - just write your game to be flexible and start worrying once you're reaching over 3k draw calls in an average scene. ;P
Yeah.... nah.
I'm working on a console game atm where half the CPU time is consumed by rendering tasks. I would love to perform the optimisations you're on about (we could probably reduce draw calls by 10x), but frankly the schedule doesn't allow time for it. On the sequel I'll definitely be doing some of this optimising so that we can do more stuff with our CPU, other than waste it on GPU command buffer generation.
Thanks Hodgman!

This is indeed also an issue when you don't have a strict time limit but are planning to release an indie game in a timely fashion - and maybe that's even worse, because you get the impression that your schedule is OK while you waste tons of time on the least important details instead of focusing on the right parts of the code. This and some other aspects of indie game development are the main reason why so many games get abandoned unfinished.

I have seen articles giving general guidelines for finishing your hobby game in a timely fashion (or at all), but they never seem to tie those priorities to real code. I have not seen a text saying: you're probably better off skipping DIP/state-change optimizations, as they are a huge time sink and will delay your project without giving much in return; instead, there is plenty of material on how DIPs are the root of all evil and such. :D Also, the primary focus for most people when it comes to code is FAST! - and not fast in the sense of development time, but in terms of milliseconds.

Someone really should put together some guidelines on this that are down to earth and based on practical, real-world examples and experience: basically, how to write code that is "fast (development time) and fast (execution time)", and what a good balance looks like in REAL situations. One way of doing this would be to produce some sort of performance-impact index, based on a reference CPU and GPU, saying things like "System to minimize DIPs/state changes - general guide - Avoid" or "Easy-to-use art-to-game pipeline - general guide - High focus".

(I) have not seen a text saying: you're probably better off skipping DIP/state-change optimizations, as they are a huge time sink and will delay your project without giving much in return.


That's probably because it isn't really true. If you are working on a game as an indie developer and you factor in a system from the start to handle this, it isn't a particularly massive time sink at all. My scene manager took about an hour to code up right at the start of my project and I've barely touched it in the last two years.
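For illustration, the kind of up-front system being described usually boils down to a render queue sorted by a state key, so that identical states land next to each other and redundant changes are trivial to skip. A minimal sketch; the sort-key layout, Mesh, and Renderer names are illustrative, not the poster's actual scene manager:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical back end: binds whatever state the key encodes and issues one draw call.
struct Mesh;
struct Renderer
{
    void ApplyState(uint64_t sortKey);
    void Draw(const Mesh& mesh);
};

struct DrawItem
{
    uint64_t    sortKey;   // e.g. shader id in the high bits, texture id below it
    const Mesh* mesh;
};

class RenderQueue
{
    std::vector<DrawItem> m_items;

public:
    void Submit(const Mesh* mesh, uint32_t shaderId, uint32_t textureId)
    {
        uint64_t key = (uint64_t(shaderId) << 32) | textureId;
        m_items.push_back({ key, mesh });
    }

    void Flush(Renderer& renderer)
    {
        // Sort so that draws sharing shader/texture end up adjacent.
        std::sort(m_items.begin(), m_items.end(),
                  [](const DrawItem& a, const DrawItem& b) { return a.sortKey < b.sortKey; });

        uint64_t lastKey = ~0ull;
        for (const DrawItem& item : m_items)
        {
            if (item.sortKey != lastKey)             // only change state when the key changes
            {
                renderer.ApplyState(item.sortKey);
                lastKey = item.sortKey;
            }
            renderer.Draw(*item.mesh);
        }
        m_items.clear();
    }
};
```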

As Hodgman says, when working in a large team and when using a third party library for rendering, the task does then become far more complex. Maybe it was cheaper to buy a binaries-only licence for the rendering engine so such optimisations aren't possible? Maybe several developers have argued passionately at meetings to be allowed to perform this optimising but have been vetoed by project managers more concerned about features in time for deadlines? Maybe the business model is based on X percent sales to above and beyond a certain target hardware, making the optimisation useless in this specific business context?

We simply don't know, but to assume that the developers either 1) didn't know about this or 2) didn't care is unfounded based on the evidence we have.


Also, the primary focus for most people when it comes to code is FAST! - and not fast in the sense of development time, but in terms of milliseconds.


Disagree. The primary focus is 1) does it work acceptably enough to return the investment, and 2) will it be finished in time. Optimising for speed of execution is only necessary, and indeed worthwhile, if either of those two factors is compromised.

