
## Constant Buffer usage


### #1Hyunkel  Members


Posted 16 September 2012 - 07:35 AM

I went over some of my code with a colleague yesterday and he was quite surprised by how I manage my constant buffers.
Except for a few rare and very specific situations, I only use 2 "global" constant buffers.

A per-frame buffer, which contains data which only needs to be updated once per frame.
[source lang="cpp"]
cbuffer PerFrameCB : register(b0)
{
    float4x4 CameraView       : packoffset(c0.x);
    float4x4 CameraProjection : packoffset(c4.x);
    float4   CameraPosition   : packoffset(c8.x);
    float4   SunDirection     : packoffset(c9.x);
    float2   ViewportSize     : packoffset(c10.x);
}
[/source]
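
For reference, a CPU-side mirror of the cbuffer above might look like the following. This is only a sketch: the struct name, member names and the explicit padding member are assumptions and do not come from the original post. It relies on each HLSL constant register (`c0`, `c1`, ...) being 16 bytes wide and on constant buffer sizes having to be a multiple of 16 bytes.

```cpp
#include <cstddef>

// Hypothetical CPU-side mirror of PerFrameCB (names are illustrative).
// Each HLSL constant register is 16 bytes, so register cN starts at byte 16 * N.
struct PerFrameData {
    float cameraView[16];        // c0..c3  -> byte offset 0
    float cameraProjection[16];  // c4..c7  -> byte offset 64
    float cameraPosition[4];     // c8      -> byte offset 128
    float sunDirection[4];       // c9      -> byte offset 144
    float viewportSize[2];       // c10.xy  -> byte offset 160
    float padding[2];            // c10.zw, unused: keeps the size a multiple of 16
};

constexpr std::size_t kRegisterBytes = 16;

// Compile-time checks that the C++ layout matches the packoffset annotations.
static_assert(sizeof(float) == 4, "sketch assumes 4-byte floats");
static_assert(offsetof(PerFrameData, cameraProjection) == 4 * kRegisterBytes, "c4");
static_assert(offsetof(PerFrameData, cameraPosition) == 8 * kRegisterBytes, "c8");
static_assert(offsetof(PerFrameData, sunDirection) == 9 * kRegisterBytes, "c9");
static_assert(offsetof(PerFrameData, viewportSize) == 10 * kRegisterBytes, "c10");
static_assert(sizeof(PerFrameData) % kRegisterBytes == 0, "size must be a multiple of 16");
```

With checks like these, a mismatch between the shader and the C++ side shows up at compile time instead of as garbage constants at runtime.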

And another buffer used for everything else.
This buffer is 1kb in size, and is updated whenever new data is needed, which is multiple times per frame.
Both of these buffers are always bound to the registers b0 and b1 for all shader stages.

I've been told that this is the "wrong" way to do it. I'm supposed to split this up into individual constant buffers.

However I don't understand why that is the case.
If I split my current b1 constant buffer into X buffers, not only do I still need to update these buffers, but I'll also need to bind a new constant buffer whenever new data is needed.

I don't see how my method is wrong, but I'm a little paranoid when I hear such claims because I am self-taught.
So I figured it's better to ask than to risk doing something wrong.

Cheers,
Hyu

Edited by Hyunkel, 16 September 2012 - 07:38 AM.

### #2mhagain  Members


Posted 16 September 2012 - 08:24 AM

A per-frame cbuffer is OK to use for constants that really only change once per frame. Using a single large cbuffer for everything else is always advised against - Microsoft advise against it, the hardware vendors advise against it, and I'm going to advise against it.

The reason is that every time you need to update even a single constant, you need to upload the entire cbuffer to the GPU. So if you're drawing a sprite that only needs 4 float3s updated, that entire 1kb cbuffer needs to go up. If you're drawing hundreds of these sprites - ouch!
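
To put rough numbers on that, here is a small back-of-the-envelope sketch. The sprite count of 500 is an assumption standing in for "hundreds", and the function and constant names are made up for illustration. It only counts the bytes the CPU writes per frame and ignores whatever the driver does internally. Under HLSL packing rules two float3s cannot share one 16-byte register, so four float3s take four registers.

```cpp
#include <cstddef>

constexpr std::size_t kRegisterBytes = 16;

// Bytes written per frame when a buffer of the given size is fully
// rewritten once per draw call.
constexpr std::size_t uploadBytes(std::size_t bufferBytes, std::size_t updatesPerFrame) {
    return bufferBytes * updatesPerFrame;
}

constexpr std::size_t kSharedBufferBytes = 1024;                // the 1kb "everything" cbuffer
constexpr std::size_t kSpriteBufferBytes = 4 * kRegisterBytes;  // four float3s, one register each
constexpr std::size_t kSprites = 500;                           // assumed sprite count

// 500 sprites through the shared buffer: 512000 bytes per frame.
static_assert(uploadBytes(kSharedBufferBytes, kSprites) == 512000, "shared buffer cost");
// 500 sprites through a dedicated 64-byte buffer: 32000 bytes per frame.
static_assert(uploadBytes(kSpriteBufferBytes, kSprites) == 32000, "sprite buffer cost");
```

Under these assumptions the dedicated sprite buffer moves 16 times less data for the same draws.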

The general usage is that you have a "per-object" constant buffer, which may be (but doesn't have to be) different for each object type, and which only contains the constants needed to draw each object type. So a mesh cbuffer will contain a matrix and some animation info, a sprite cbuffer will contain info needed to billboard the sprite, etc. Sort your objects by type for drawing, select the cbuffer to use and bind it (add some state filtering to your engine here), then map the cbuffer with discard, update, unmap, draw.
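
The sort / filter / update / draw loop described above could be sketched roughly like this. To keep it runnable anywhere, the D3D calls are replaced by counters; in real D3D11 code the bind would be something like `VSSetConstantBuffers` and the update would be `Map` with `D3D11_MAP_WRITE_DISCARD`, a copy, then `Unmap`. All type and function names here are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object categories; each one has its own small cbuffer.
enum class ObjectType { Mesh, Sprite };

struct DrawItem {
    ObjectType type;
    int id;
};

// Counters standing in for the real API calls.
struct FrameStats {
    std::size_t binds = 0;    // cbuffer bind calls actually issued
    std::size_t updates = 0;  // map-discard / copy / unmap sequences
    std::size_t draws = 0;    // draw calls
};

FrameStats submitFrame(std::vector<DrawItem> items) {
    // Sort by type so objects that share a cbuffer are drawn back to back.
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) { return a.type < b.type; });

    FrameStats stats;
    bool anyBound = false;
    ObjectType bound = ObjectType::Mesh;
    for (const DrawItem& item : items) {
        // State filtering: only bind when a different cbuffer is needed.
        if (!anyBound || bound != item.type) {
            bound = item.type;
            anyBound = true;
            ++stats.binds;
        }
        ++stats.updates;  // per-object constants are rewritten for every draw
        ++stats.draws;
    }
    assert(stats.draws == items.size());
    return stats;
}
```

Submitting two sprites and two meshes in interleaved order results in two binds rather than four, because the sort groups each type together and the filter skips redundant binds.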

Don't worry too much about the overhead of switching constant buffers - this is a very lightweight operation, certainly far more lightweight than uploading a full 1kb buffer multiple times per frame.


### #3Hodgman  Moderators


Posted 16 September 2012 - 08:40 AM

The GPU and CPU aren't synchronised -- usually the GPU is running about a whole frame behind the CPU. This means that when you use D3D to tell the GPU to do something, you're really just putting a command into a queue that it will get to later.
e.g.
CPU:| Draw A | Draw B |
GPU:                  | Draw A | Draw B |
This has a big impact on how GPU-side resources are managed.
Let's say that above, A and B both use the same cbuffer, but it's updated before each draw call.
When you ask to update the buffer for [B], the CPU has to wait until your [A] command has completed, because overwriting the buffer any earlier would make [A] draw with the wrong data. Only after [A] has been drawn can you map the buffer and draw [B], so you end up with a big stall, like:
CPU:| Update | Draw A |            | Update | Draw B |
GPU:                      | Draw A |                 | Draw B |
If your GPU driver is really clever, then instead of stalling, it can actually allocate 2kb space instead of 1kb space, and although it looks like you've got 1 cbuffer, it's actually allocated two different ones internally! This is basically the same as if you'd gone and made multiple cbuffers yourself, but you're relying on the GPU driver to save you instead of writing safe code yourself:
CPU:| Update 1 | Draw A | Update 2 | Draw B |
GPU:                                          | Draw A using 1 | Draw B using 2 |
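
The renaming idea can be modelled in a few lines. This is a toy simulation, not how any particular driver is implemented: the class, its two-copy limit and the single float standing in for the 1kb of constants are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of one logical cbuffer backed by several hidden allocations.
class RenamedBuffer {
public:
    static constexpr std::size_t kCopies = 2;  // assumed number of backing copies

    // Like a discard-style map: switch to the other backing copy so the
    // copy referenced by already queued draws is left untouched.
    // With only two copies, a third update would overwrite the first one,
    // so a real implementation would have to wait or allocate more.
    std::size_t update(float value) {
        current_ = (current_ + 1) % kCopies;
        storage_[current_] = value;
        return current_;
    }

    float read(std::size_t copy) const {
        assert(copy < kCopies);
        return storage_[copy];
    }

private:
    std::array<float, kCopies> storage_{};
    std::size_t current_ = kCopies - 1;
};

// A queued GPU command remembers which backing copy it was recorded with.
struct QueuedDraw {
    char name;
    std::size_t copy;
};

// CPU side of the last diagram: Update 1, Draw A, Update 2, Draw B.
std::vector<QueuedDraw> recordDraws(RenamedBuffer& cb) {
    std::vector<QueuedDraw> queue;
    queue.push_back({'A', cb.update(1.0f)});
    queue.push_back({'B', cb.update(2.0f)});
    return queue;
}
```

When the simulated GPU later consumes the queue, draw A still reads the value written by the first update even though the second update has already happened on the CPU, which is why no stall is needed.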

### #4Hyunkel  Members


Posted 16 September 2012 - 09:32 AM

Thank you, this was extremely helpful!
