Why does a driver give a gain of FPS?

Hi,
The changelog of a new driver always says it gives an FPS gain in multiple games.
Why does that happen?
Thanks

The changelog of a new driver always says it gives an FPS gain in multiple games.

Why does that happen?

Linky

The changelog of a new driver always says it gives an FPS gain in multiple games.

Large game developers often work directly with graphics-card vendors to get them to create specially optimized drivers for their games (i.e. having them optimize the game's shaders internally, optimizing certain code paths, etc.).


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

It is my understanding that one of the things that happens is that the driver ships custom versions of shaders for said application. From what I've heard, some are hand-optimized versions of the originals.
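
Purely as an illustration of that mechanism (not how any real driver is implemented), here is a toy sketch: recognize a known shader by hashing its source text and substitute a pre-tuned replacement. The table and function names are invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Toy illustration only: map from a hash of the application's original shader
// source to a hand-optimized replacement shipped with the driver profile.
// (Table contents and names are invented for this example.)
static const std::unordered_map<std::size_t, std::string> optimized_replacements = {
    // { hash_of_original_source, hand_optimized_source }
};

std::string pick_shader_source(const std::string& app_provided_source)
{
    const std::size_t key = std::hash<std::string>{}(app_provided_source);
    const auto it = optimized_replacements.find(key);
    return it != optimized_replacements.end() ? it->second : app_provided_source;
}
```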

-potential energy is easily made kinetic-

Special per-game driver profiles also remove code paths that a game doesn't need, which helps on the CPU side.

GPUs usually also have different ways of accessing memory (e.g. cached or uncached, in graphics-card VRAM or in main memory, ...), and that configuration is hardware dependent, which driver programmers figure out during profiling.
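
For example, in OpenGL the application only provides a usage hint and the driver decides where the buffer actually lives; a per-game profile can override that heuristic. A minimal sketch, assuming a current GL context and loaded function pointers (e.g. via glad or GLEW):

```cpp
// The usage hint is only a hint: the driver picks VRAM vs. system memory,
// cached vs. write-combined, based on heuristics or a per-game profile.
const float vertices[] = { -0.5f, -0.5f,  0.5f, -0.5f,  0.0f, 0.5f };

GLuint vbo = 0;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);

// GL_STATIC_DRAW: written once, drawn many times -- likely placed in VRAM.
// GL_STREAM_DRAW would instead suggest CPU-visible memory for per-frame updates.
glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);
```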

Sometimes games do not-so-smart things, e.g. use textures without mipmaps; the driver can then enforce GPU mipmap generation.
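
A well-behaved application would do this itself; here is a sketch of what the driver effectively forces on the game's behalf (current GL context assumed; width, height and pixels stand in for the game's texture data):

```cpp
// Upload the base level and generate the full mip chain. If a game skips
// glGenerateMipmap but samples with a mipmapped filter, a per-game driver
// profile may generate the mips behind the game's back.
GLuint tex = 0;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, pixels);
glGenerateMipmap(GL_TEXTURE_2D);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
```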

Sometimes there are limited resources for special optimizations like fast Z-buffers; assigning these resources to the most time-critical buffers can save milliseconds.

There are different ways of doing the same thing, e.g. a constant buffer update can be done directly, via staging buffers, or via UpdateSubresource... If the driver uses a generic code path, no single path gets 100% speed, so as not to hurt the others. Preparing the driver for the one way a particular game works can give some extra percent in speed and GPU memory.
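
To make "different ways of doing the same thing" concrete, here is a sketch of two of the D3D11 constant-buffer update paths mentioned above. The device context, the two buffers (one created with D3D11_USAGE_DYNAMIC, one with D3D11_USAGE_DEFAULT) and cbData are assumed to exist elsewhere:

```cpp
// Path 1: a D3D11_USAGE_DYNAMIC buffer updated via Map(WRITE_DISCARD).
// The driver hands back fresh memory and can "rename" the buffer so the GPU
// never stalls on the copy still in flight.
D3D11_MAPPED_SUBRESOURCE mapped = {};
if (SUCCEEDED(context->Map(dynamicCB, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped)))
{
    memcpy(mapped.pData, &cbData, sizeof(cbData));
    context->Unmap(dynamicCB, 0);
}

// Path 2: a D3D11_USAGE_DEFAULT buffer updated via UpdateSubresource.
// The runtime/driver copies the data for you, possibly through internal
// staging memory. A generic driver must handle both paths reasonably well;
// a per-game profile can tune for whichever one that game actually uses.
context->UpdateSubresource(defaultCB, 0, nullptr, &cbData, 0, 0);
```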

...
Because D3D11/GL are way too abstract :lol:

Vulkan/D3D12 will remove a lot of this driver magic and make the game engine responsible.

Driver teams will probably still hand-optimize/rewrite the shaders used by big games, though...
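
For instance, a decision that a GL/D3D11 driver makes with heuristics (which memory a resource lives in) becomes explicit application code in Vulkan. A minimal sketch of selecting a memory type; physicalDevice and the memory requirements are assumed to come from the rest of the engine:

```cpp
#include <vulkan/vulkan.h>
#include <cstdint>

// In Vulkan the application, not the driver, picks the memory type a resource
// is allocated from (e.g. HOST_VISIBLE | HOST_COHERENT for a staging buffer).
uint32_t FindMemoryType(VkPhysicalDevice physicalDevice,
                        uint32_t typeBits,            // VkMemoryRequirements::memoryTypeBits
                        VkMemoryPropertyFlags wanted)
{
    VkPhysicalDeviceMemoryProperties props = {};
    vkGetPhysicalDeviceMemoryProperties(physicalDevice, &props);
    for (uint32_t i = 0; i < props.memoryTypeCount; ++i)
    {
        const bool allowed = (typeBits & (1u << i)) != 0;
        const bool matches = (props.memoryTypes[i].propertyFlags & wanted) == wanted;
        if (allowed && matches)
            return i;
    }
    return UINT32_MAX; // caller must handle "no suitable memory type"
}
```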

The changelog of a new driver always says it gives an FPS gain in multiple games.

Why does that happen?

Linky

This post was an eye opener. Thanks for sharing.

They may be doing a lot of correctness optimizations (or avoiding pessimizations) as well.

One particular pessimization that I remember is that OpenGL buffer objects were allowed to be mapped write-only and subsequently read, or mapped read-only (up to 3.2 or 3.3, if I recall correctly) and then written to. For some reason, the standard didn't disallow that, and didn't state that doing so anyway would result in undefined, possibly disastrous behavior (it does now).

Some programmers (or a lot, depending on whom you ask) got the not-very-intuitive buffer flags wrong, even to the extent of mapping buffers write-only and then reading from them, which forced the driver to somehow provide that data. Doing that on the fly is inefficient on some architectures and simply impossible on others (for example, there is no way to make a memory region accessible for writing but not for reading on a typical desktop computer).
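
To illustrate the flags in question, here is a sketch of a write-only mapping as it is meant to be used (GL 3.x, current context assumed; vbo, srcData and dataSize are placeholders):

```cpp
// Map the buffer write-only and stream data into it. Reading back through
// 'ptr' is exactly the misuse described above: with these flags the driver
// may hand back write-combined or otherwise unreadable/uninitialized memory.
glBindBuffer(GL_ARRAY_BUFFER, vbo);
void* ptr = glMapBufferRange(GL_ARRAY_BUFFER, 0, dataSize,
                             GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
if (ptr)
{
    memcpy(ptr, srcData, dataSize);   // write only; never read from ptr
    glUnmapBuffer(GL_ARRAY_BUFFER);
}
```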

The consequence was that driver writers decided to simply always fetch the buffer contents from the GPU, in case some butt-head tries to read from it. (Now don't ask me to provide a source for this story... but I think there was mention in Cozzi/Riccio...)

One obvious optimization for a particular binary would therefore be to not do useless DMA transfers once it is known that the application behaves correctly and uses its buffers as intended and as advertised.

In the same sense, the driver could skip other correctness tests, for example it could shortcut framebuffer or texture completeness checks or shader stage consistency checks if analyzing the binary in the lab has shown that it is always working correctly.
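
For example, here is the completeness check an application normally performs once at FBO creation, which a driver that has already validated a known title could, in principle, avoid re-running at draw time (fbo is a placeholder):

```cpp
// Validate the framebuffer once at creation. Rendering to an incomplete FBO
// is an error, so drivers normally have to guard against it on the hot path.
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
const GLenum status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
if (status != GL_FRAMEBUFFER_COMPLETE)
{
    // report and fix the attachment setup before drawing
}
```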

The rules for the command queue can be relaxed in some cases as well. Usually, the driver will not sync when you swap buffers (although strictly, it should), but will continue pulling commands from the queue until it encounters something that would modify the frame buffer. That allows for somewhat fewer, shorter stalls, without observable difference.

Now, for a known, well-analyzed program for which it is known that it, e.g., first runs some physics simulation or lays down some data in a G-buffer rather than modifying the visible framebuffer, it is possible to continue even with commands that do make modifications (up to some point). So, instead of stalling, the program would continue working on the next frame.

(Yes, the complete truth is even more complicated since drivers will usually pre-render two or three complete frames without you being aware of it. However, the same thing holds true for any N frames... you can continue doing frame N+1 up to some point in a non-destructive way without additional storage if you know exactly what the specific binary is doing)
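
That latency is also something an application can manage explicitly with a sync object instead of relying on driver heuristics; a sketch using GL 3.2+ fences (current context assumed):

```cpp
// Insert a fence after submitting a frame's commands, and wait on it before
// reusing that frame's resources, so the CPU never runs more than a fixed
// number of frames ahead of the GPU.
GLsync frameFence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);

// ... later, before touching that frame's buffers again ...
glClientWaitSync(frameFence, GL_SYNC_FLUSH_COMMANDS_BIT,
                 1000000000);   // timeout in nanoseconds (1 second here)
glDeleteSync(frameFence);
```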


Large game developers often work directly with graphics-card vendors to get them to create specially optimized drivers for their games (i.e. having them optimize the game's shaders internally, optimizing certain code paths, etc.).

IHVs often do this without actually working directly with the developers, too. I learned the truth outlined in that linked post when a game I was working on had a listed perf increase in a driver update; I asked when we had worked with NVidia on that, and the answer was a resounding "Never". At that same job, I came across a value-overflow corruption across members of a float3 in our shaders that was due to those reverse-engineered optimizations. So, it's a fun give and take.

