Avoiding redundant state changes

4 comments, last by Ashkan 16 years, 4 months ago
Hey! I was thinking about redundant state changes between batches: would removing them actually gain me anything? After each batch I reset all states to a known baseline to guarantee consistency. I assumed the driver/API would filter out these redundant changes as an optimization, so I wouldn't really gain much by removing them myself. Is this true? Is it a good idea to have a "current state" object in my renderer that always tracks the current state, so I can check against it before setting any state?
You are strongly advised to minimise state changes as far as you comfortably can. DirectX 10's pipeline can handle this kind of trigger-happy approach, but anything earlier will suffer, perhaps badly.

For a small project, it's easiest to set states only when necessary, and live with the difficulties this causes when the rendering order needs to be changed. For larger-scale programs, you're encouraged to keep track of states manually as you describe. Depending on the level and nature of the state-switching, this can give overall performance a considerable boost.
Ring3 Circus - Diary of a programmer, journal of a hacker.
Thanks, that was what I wanted to hear!
Will this affect only CPU performance, or GPU? Or maybe both? Probably depends on the driver... I'm using OpenGL by the way, I don't know how forgiving it is.
Driver and batch overhead is usually associated with CPU performance, since you're making more driver calls for the same amount of GPU work.
NextWar: The Quest for Earth available now for Windows Phone 7.
Quote: Original post by patrrr
Thanks, that was what I wanted to hear!
Will this affect only CPU performance, or GPU? Or maybe both?

Well, both, but that's probably not the best way to think of it.

By manually keeping track of state and avoiding an unnecessary change, a few important situations arise:

1. The need to call the API is removed. While redundant state-changes are often caught by the API (and so very little work is done), there are certain situations where this isn't possible. The result is a transition to kernel-mode (as the driver comes into play), which costs CPU-time.

2. The GPU need not flush its pipeline. Because geometry tends to come in large batches of the same type, GPUs exploit heavy parallelism, with long pipelines that work on different parts of the data simultaneously. Although the cores are largely independent, they share render states, and so they cannot work on data of differing types at the same time. The crux is that a state change forces the video card to wait for all of its pipelines to finish their current work before the next batch can start being processed. Frequent state changes create many such GPU 'pipeline bubbles', which amount to more waiting and less doing.

3. CPU-GPU parallelism benefits. Most current games are either CPU- or fillrate-limited, and most render states pertain to the vertex pipeline, so excessive state-changing generally tips this imbalance even further. It's a broad claim, but state changes hence tend to cost CPU-GPU parallelism.

The relative importance of these three factors is debatable, but it's probably not a worthwhile conversation as any one of these facts is enough to convince a sensible programmer to manage states carefully.

State management is such a common drain on processor resources that DirectX 10 (I can't speak for OpenGL 2, but I assume the situation is similar) has made huge efforts to take the strain off the programmer. Batching and state management are tedious details that the programmer would ideally not have to worry about, and since they look a lot more like manual labour than insightful problem-solving, it's quite right that the effort is moving out of the domain of the programmer and into that of the silicon. Alas, until we can disregard platforms that aren't top-of-the-range, we'll just have to grin and bear it.
Ring3 Circus - Diary of a programmer, journal of a hacker.
This is the definitive guide to batching. It provides great insight into the subject and is a must-read for all graphics programmers.

The usual trend is that GPUs are advancing faster than CPUs, and the gap between the fastest GPU and the fastest CPU is widening. This is why people usually relate batching with CPU bottlenecks, but as TheAdmiral explained, the negative effects on the GPU are important nonetheless.

As the presentation states, OpenGL is a more lightweight API than D3D, so it lets you issue a larger number of batches before you hit the CPU cap, but that doesn't mean it's something you don't need to worry about. It's going to catch up with you sooner or later.

