Steve_Segreto

Prevent redundant constants register writes

4 posts in this topic

Is there a way to prevent API calls like SetPixelShaderConstantF() and SetVertexShaderConstantF() from happening if the value written is already there? Also is this a useful optimization and how expensive are many redundant constants register writes between DIP calls?

I was thinking if the device can't do this automatically, it could be done in software if you move away from the Effects framework and use the ": register (cx)" keyword in your shader generation and then scoreboard the register values yourself with an array of floats in C++.

Before optimizing anything, use a profiler to check whether this is really a problem. Quite often an unjustified optimization just introduces bugs, bloats the code, and can even reduce performance. :)

[quote name='Steve_Segreto' timestamp='1348114538' post='4981914']
Is there a way to prevent API calls like SetPixelShaderConstantF() and SetVertexShaderConstantF() from happening if the value written is already there? Also is this a useful optimization and how expensive are many redundant constants register writes between DIP calls?
[/quote]Yes, it's definitely possible, but some methods for eliminating those redundant calls may be more expensive than the redundancy itself! So whether it's useful depends on how the problem is solved.
Regarding the cost of these redundant calls: as usual with GPU work, it depends on a lot of things. First, there's the simple CPU overhead of making a function call into D3D. This is a small cost, perhaps similar to a virtual function call in your own code. The rest depends on your GPU and driver; there are many ways that shader constants can be implemented.

1) Older SM2/3 cards, e.g. GeForce 7, may not even support shader constants as a hardware feature! These cards implement shader constants by cloning the asm code of your shader and patching the constant values into it as literals. On these cards, it's likely that setting a constant simply writes it into a global buffer and sets a 'dirty' flag. For each draw-call, if the 'dirty' flag is set, new GPU-accessible RAM is allocated to store the new shader code ([i]probably in a ring buffer, and if you fill it with too many constant changes, the CPU will stall until the GPU drains it![/i]), and a new version of the shader is written to that RAM using the cached constants. Whether this patching is carried out by the driver on the CPU, or by the GPU's memory controller on commands from the driver, each draw-call that uses different constants from the last will have a very high CPU or GPU overhead.
N.B. this may apply to vertex shaders only, to pixel shaders only, or to both. To optimize for these GPUs, you would definitely want to sort by shader program, and then by shader constants, as a high priority.
Sorting draw-calls by shader-constant values sounds like a ridiculously hard task, so instead, I'd change your rendering API so that you [url="http://www.gamedev.net/topic/618167-emulating-cbuffers/"]use the abstraction of cbuffers instead[/url] -- this greatly simplifies the process of determining if two draw-calls use the same constants.

2) Cards newer than those support constants in hardware, so the ridiculously huge cost of generating unique shader asm code for different draw-calls disappears. The simplest way for them to send constants to the GPU registers is to put them in the command buffer along with the draw-calls. Each time you set a constant, the driver allocates some space in the command buffer ([i]probably a ring buffer again; if you fill it with too many GPU commands per frame, the CPU may stall[/i]) and writes a packet such as [font=courier new,courier,monospace]{ID_SET_PS_REGISTER, (short)idxFirstReg, (short)regCount, /*floats * regCount * 4*/ }[/font]. Writing these packets on the CPU is fairly cheap; it's basically just [font=courier new,courier,monospace]memcpy[/font]ing your inputs to a different destination. Receiving the packets on the GPU should be almost free, as the GPU front-end that reads the commands runs in parallel with the GPU units that actually do your rasterization and shading, and these latter units should always be the bottleneck.

3) Even newer cards have the option of sending constants through the command buffer or writing them into resident GPU RAM buffers, where appropriate. If we put a lot of faith in the driver, it should be able to figure out the best way of doing this, so again the CPU-side cost should be about that of a [font=courier new,courier,monospace]memcpy[/font], and the GPU cost should be almost nil. However, you should still group as many constants together as possible so that you can set a lot of constants with one call [i](i.e. using the 3rd parameter of [font=courier new,courier,monospace]SetPixelShaderConstantF[/font])[/i].
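To make the "one call for many constants" point concrete, here is a minimal sketch, not a definitive implementation. It accumulates writes in a CPU-side copy and flushes the touched register range with a single call before the draw. [font=courier new,courier,monospace]FakePsDevice[/font] and [font=courier new,courier,monospace]PsConstantBatcher[/font] are hypothetical names; the fake device only mimics the parameter order of [font=courier new,courier,monospace]SetPixelShaderConstantF[/font] so the code builds without the D3D9 SDK, and the register count of 224 is an assumption for ps_3_0.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstring>

// Hypothetical stand-in for IDirect3DDevice9 so the sketch builds without the SDK.
// It only records how it was called.
struct FakePsDevice {
    int calls = 0;
    unsigned lastStart = 0;
    unsigned lastCount = 0;
    void SetPixelShaderConstantF(unsigned startRegister, const float* /*data*/,
                                 unsigned vector4fCount) {
        ++calls;
        lastStart = startRegister;
        lastCount = vector4fCount;
    }
};

// Collects per-register writes and sends the whole touched range in one call.
// Assumption: every constant write goes through this class, so untouched
// registers inside the range are simply resent with their last known values.
class PsConstantBatcher {
public:
    static constexpr unsigned kRegisterCount = 224;  // assumed ps_3_0 float registers

    void Write(unsigned reg, const float (&value)[4]) {
        assert(reg < kRegisterCount);
        std::memcpy(&values_[reg * 4], value, sizeof(value));
        dirtyBegin_ = std::min(dirtyBegin_, reg);
        dirtyEnd_ = std::max(dirtyEnd_, reg + 1);
    }

    // Call once, right before the draw-call.
    void Flush(FakePsDevice& device) {
        if (dirtyBegin_ >= dirtyEnd_) return;  // nothing was written
        device.SetPixelShaderConstantF(dirtyBegin_, &values_[dirtyBegin_ * 4],
                                       dirtyEnd_ - dirtyBegin_);
        dirtyBegin_ = kRegisterCount;  // reset to the empty range
        dirtyEnd_ = 0;
    }

private:
    std::array<float, kRegisterCount * 4> values_{};
    unsigned dirtyBegin_ = kRegisterCount;
    unsigned dirtyEnd_ = 0;
};
```

The trade-off is that a sparse pattern (say c0 and c200) resends everything in between, so this only pays off if related constants are laid out in adjacent registers.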

Freaking awesome link! I was thinking along the exact same lines: if you're organized with the allocation of the constants and force each one to a specific register number using ": register (cx)", then you can shadow the values in your C++ code and stop the engine from calling SetVertexShaderConstantF based on a simple memcmp.
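A rough sketch of that shadowing idea, under stated assumptions: [font=courier new,courier,monospace]FakeVsDevice[/font] and [font=courier new,courier,monospace]VsConstantShadow[/font] are hypothetical names, the fake device stands in for the real D3D9 device so this compiles anywhere, and 256 float registers is an assumed limit. A real engine would forward to [font=courier new,courier,monospace]SetVertexShaderConstantF[/font] on its own device instead.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the D3D9 device; it only counts forwarded calls.
struct FakeVsDevice {
    int calls = 0;
    void SetVertexShaderConstantF(unsigned, const float*, unsigned) { ++calls; }
};

// Keeps a CPU-side copy of the float registers and filters out writes
// that would not change anything.
class VsConstantShadow {
public:
    static constexpr unsigned kRegisterCount = 256;  // assumed register limit

    explicit VsConstantShadow(FakeVsDevice& device) : device_(device) {}

    // Returns true if the write was forwarded to the device.
    bool Set(unsigned startRegister, const float* data, unsigned vector4fCount) {
        assert(vector4fCount > 0 && startRegister + vector4fCount <= kRegisterCount);
        const std::size_t bytes = std::size_t(vector4fCount) * 4 * sizeof(float);
        float* shadow = &values_[std::size_t(startRegister) * 4];

        // A register that was never written has unknown device contents,
        // so it can never be treated as redundant.
        bool known = true;
        for (unsigned i = 0; i < vector4fCount; ++i)
            known = known && written_[startRegister + i];

        // Bitwise compare: cheap, and errs on the side of resending
        // (for example, 0.0f and -0.0f count as different).
        if (known && std::memcmp(shadow, data, bytes) == 0)
            return false;

        std::memcpy(shadow, data, bytes);
        for (unsigned i = 0; i < vector4fCount; ++i)
            written_[startRegister + i] = true;
        device_.SetVertexShaderConstantF(startRegister, data, vector4fCount);
        return true;
    }

private:
    FakeVsDevice& device_;
    std::array<float, kRegisterCount * 4> values_{};
    std::array<bool, kRegisterCount> written_{};
};
```

Whether the memcmp is cheaper than the call it saves is exactly what a profiler should confirm; the shadow would also need invalidating after a device reset or if anything else writes constants behind its back.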

I have profiled my game many times in PIX, and I see the disturbing trend of repeatedly thrashing the constant registers back and forth as well as pedantically setting all "uniform externs" before each DIP.

I didn't think about sorting by constants register usage, but sorting by vertex shader might be a step in the right direction.
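For what it's worth, a minimal sketch of such a sort, assuming each draw item carries a shader id plus the id of an emulated cbuffer (as in the link posted earlier in the thread), so "same constants" becomes a cheap integer compare. [font=courier new,courier,monospace]DrawItem[/font], its fields and both functions are hypothetical names, and it assumes opaque geometry where draw order does not affect the result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical draw-call record; the ids are illustrative handles.
struct DrawItem {
    std::uint32_t shaderId;   // which shader program to bind
    std::uint32_t cbufferId;  // which emulated cbuffer (block of constants) to upload
    std::uint32_t meshId;     // what to draw
};

// Sort by shader first, then by constants, keeping the original order otherwise.
void SortDrawItems(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return std::tie(a.shaderId, a.cbufferId) <
                                std::tie(b.shaderId, b.cbufferId);
                     });
}

// Counts how often the shader or the constants change between consecutive
// items, i.e. how many constant uploads a redundancy filter would let through.
std::size_t CountConstantUploads(const std::vector<DrawItem>& items) {
    std::size_t uploads = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].shaderId != items[i - 1].shaderId ||
            items[i].cbufferId != items[i - 1].cbufferId) {
            ++uploads;
        }
    }
    return uploads;
}
```

Comparing CountConstantUploads before and after sorting gives a quick estimate of how much redundancy the sort actually exposes for a given scene.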

Thanks for the food for thought guys!

My profiling has shown that it always helps to redundancy-check small values such as bool, float, and int, up to vec4.
Matrices are better sent without checking.
Note that this is probably going to be fairly consistent across the current generation of cards in DirectX, and it is not just a hardware issue but very much a driver issue as well: on the same cards, running the same scenes and passing the same matrices, OpenGL always benefits from redundancy checks, even for larger types such as matrices.

Note also that redundancy-checking in general is a good idea, not just for uniforms, but for textures, vertex buffers, index buffers, shaders, depth-test state, etc.
In DirectX 9 you will definitely want to sort by shader first, followed by textures. This allows you to maximize redundancies, which makes it worth your time to actually do redundancy checks.
I disagree with this being any kind of premature optimization. In general, this is a rule in the world of graphics, which is why tools such as gDEBugger and Xcode OpenGL ES Analysis show performance warnings when redundant states are set.
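As a sketch of what such a check can look like for non-uniform state (textures, buffers, shaders), assuming each resource can be identified by a comparable handle: [font=courier new,courier,monospace]BindingCache[/font] is a hypothetical helper, not part of any API, and the caller issues the real bind only when it returns true.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Remembers what is bound to each slot so redundant binds can be skipped.
// Handle is any cheap, equality-comparable id (pointer, integer, ...).
template <typename Handle, std::size_t SlotCount>
class BindingCache {
public:
    // Returns true if the caller needs to issue the real API call.
    bool NeedsBind(std::size_t slot, Handle handle) {
        assert(slot < SlotCount);
        Entry& entry = bound_[slot];
        if (entry.valid && entry.handle == handle)
            return false;  // already bound: skip the call
        entry.handle = handle;
        entry.valid = true;
        return true;
    }

    // Forget everything, e.g. after a device reset or when other code
    // has touched the device state directly.
    void Invalidate() {
        for (Entry& entry : bound_) entry.valid = false;
    }

private:
    struct Entry {
        Handle handle{};
        bool valid = false;
    };
    std::array<Entry, SlotCount> bound_{};
};
```

Usage would be along the lines of checking [font=courier new,courier,monospace]textures.NeedsBind(stage, textureId)[/font] before calling into the device, with one cache per kind of state.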


L. Spiro