Why is changing render state expensive?

Started by
6 comments, last by Rand 17 years, 1 month ago
I've been told here and there: don't change your render state, it's expensive. So why? You just transfer a state to the GPU and set a flag. The only reason I can figure out is that the change makes the GPU's parallel computation wait until the setting completes. Any details?
Setting render states (or manipulating devices in general, for that matter) requires that the operating system, in conjunction with the CPU, perform a mode transition between application level (ring 3) and kernel level (ring 0) so that the physical hardware can be commanded. Each function that calls into the driver goes through this mode switch, which takes some time.

In addition, the CPU might need to actually upload or download data from the graphics card in order to realize the commands. While data bus bandwidths have grown significantly in the last few years, the copying can still take some time.

There is also the factor that if your renderstates happen to result in a situation in which the data to be used is not ready, the driver will have to wait until the data dependencies are resolved (by waiting for the drawing operations). This is called "stalling" the hardware and - despite often manifesting itself in the context of graphics commands - is a parallelism issue as opposed to a pure hardware issue.

Niko Suni

Render state changes may also trigger partial specialization or full recompilation and linkage of the shaders, followed by the upload of the binaries. However, such operations are most probably cached and will not cause a large performance hit when only a few states are involved.
A more comprehensive presentation:
http://developer.nvidia.com/docs/IO/8230/BatchBatchBatch.pdf
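
The caching point above can be illustrated with a toy model. This is only a sketch under assumptions: the class `ShaderVariantCache`, the shader name `"lit"`, and the state names are all hypothetical, and a real driver decides for itself which states affect code generation. It shows that alternating between a few state combinations pays the compile cost once per combination, not once per change.

```python
# Toy model of a driver-side shader variant cache.
# All names here are hypothetical; no real driver API is being shown.
class ShaderVariantCache:
    def __init__(self):
        self._variants = {}
        self.compile_count = 0

    def get(self, shader_name, state):
        # Freeze the state dict into a hashable key. A real driver would
        # only key on the states that actually change the generated code.
        key = (shader_name, frozenset(state.items()))
        if key not in self._variants:
            # Stands in for specialization/recompilation plus upload.
            self.compile_count += 1
            self._variants[key] = f"{shader_name}#{self.compile_count}"
        return self._variants[key]


cache = ShaderVariantCache()
fog_on = {"fog": True, "alpha_test": False}
fog_off = {"fog": False, "alpha_test": False}

# Flip between two state combinations across 100 draws.
for i in range(100):
    cache.get("lit", fog_on if i % 2 == 0 else fog_off)

print(cache.compile_count)  # 2: one compile per distinct combination
```

With many distinct combinations the cache stops helping, which matches the "only a few states" caveat in the post.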

LeGreg

Of course, the statements above only apply to DirectX. In OpenGL, changing render states is extremely cheap and not a cause for performance worries.
Quote: Original post by Gagyi
I am really curious why. The drivers? (BTW I only do DirectX)

Pretty much. In OpenGL, card vendors provide a driver which runs at least partially in user-mode. Every state change is delegated to this driver, which can then analyze and reorder stuff and then decide when it wants to make the expensive context switch to kernel mode which is required to actually communicate with the GPU. In contrast, Direct3D vendor-provided drivers are completely in kernel mode, so they can't do clever things to avoid the switch; by the time they're invoked, it's already too late.

In Vista with DirectX 10, this outdated organizational model is changed.
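
The difference described in the post above can be sketched as a toy model. Everything here is an assumption for illustration: `ImmediateDevice` and `BufferedDevice` are hypothetical classes, and "transitions" is just a counter standing in for user-to-kernel mode switches. No real driver is claimed to work exactly this way.

```python
# Toy comparison: one mode transition per call versus buffering in user mode.
# Class and method names are hypothetical, not a real graphics API.
class ImmediateDevice:
    """Every call crosses into kernel mode right away."""
    def __init__(self):
        self.transitions = 0

    def set_state(self, name, value):
        self.transitions += 1

    def draw(self):
        self.transitions += 1

    def flush(self):
        pass  # nothing is buffered


class BufferedDevice:
    """Commands are collected in user mode and submitted together."""
    def __init__(self):
        self.transitions = 0
        self._current = {}
        self._commands = []

    def set_state(self, name, value):
        # Drop redundant changes before they cost anything.
        if name in self._current and self._current[name] == value:
            return
        self._current[name] = value
        self._commands.append(("state", name, value))

    def draw(self):
        self._commands.append(("draw",))

    def flush(self):
        if self._commands:
            self.transitions += 1  # one switch for the whole buffer
            self._commands.clear()


def render_frame(device):
    for i in range(10):
        device.set_state("blend", i % 2 == 0)
        device.set_state("cull", "back")
        device.draw()
    device.flush()
    return device.transitions


immediate_cost = render_frame(ImmediateDevice())
buffered_cost = render_frame(BufferedDevice())
print(immediate_cost, buffered_cost)  # 30 1
```

The buffered version still forwards every non-redundant state change to the hardware; it only saves on the number of mode switches.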
AFAIK, the shader recompilation issue is perfectly valid on OpenGL and on architectures that use JIT-compiled fixed-function OGL pipelines. However, the effect should be less noticeable in the latter case.

Moreover, if no state changes are made between consecutive draw calls of the same primitive type, the OpenGL driver can combine them into one batch on the fly, which means less communication with the GPU.
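
The merging described above can be sketched roughly as follows. This is a minimal illustration, assuming each draw is a hypothetical `(state_key, vertex_count)` pair; the function `merge_draws` and the sample data are made up, and real drivers apply their own rules for when merging is legal.

```python
from itertools import groupby

def merge_draws(draws):
    """Merge runs of consecutive draws that share the same state key.

    draws is a list of (state_key, vertex_count) pairs (hypothetical format).
    """
    batches = []
    for state, group in groupby(draws, key=lambda d: d[0]):
        batches.append((state, sum(count for _, count in group)))
    return batches


draws = [
    ("opaque", 300),
    ("opaque", 120),
    ("blend", 60),
    ("opaque", 90),
    ("opaque", 30),
]

# A state change in the middle splits the opaque draws into two batches.
batches = merge_draws(draws)
print(batches)  # [('opaque', 420), ('blend', 60), ('opaque', 120)]

# Sorting by state first gives fewer batches, but it reorders the draws,
# which is only safe when draw order does not affect the result.
sorted_batches = merge_draws(sorted(draws, key=lambda d: d[0]))
print(sorted_batches)  # [('blend', 60), ('opaque', 540)]
```

The same idea is why applications often sort their own draw calls by state before submitting them.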
Quote: Original post by Eric Lengyel
Of course, the statements above only apply to DirectX. In OpenGL, changing render states is extremely cheap and not a cause for performance worries.


That's what I thought. It's cheap on PS3 too.
