ValMan

Render state caching problems

ValMan    466
Hello, I just implemented a render state cache (no compiling or debugging done yet). The general concept seems simple: keep an array of values for all of D3D's render states, texture stage states (for up to 8 stages) and sampler states (for up to 8 samplers). Each time a state is about to be changed, we first compare the incoming value against the cached value. If they differ, we cache the new value and then call the D3D API to set the state. If the state is already set, we return immediately.

However, while implementing all this, I ran into two problems:

1) All cached states must be initialized to the default values given in the D3D documentation. This seems very inelegant, because it's a huge list of assignments to array elements in my Graphics class constructor. I had no problem typing all that, but it made me wonder if that's what I really have to put up with, or if there is a better way. For all of you out there - is that how you did this? My device is created with the pure flag, so I can't loop through all the states and call the API to get the current value of each to initialize the cache with.

2) When capturing state blocks, I have to use D3D API calls directly. Otherwise some important states might end up missing from the state block if they happen to be equal to the current device states at the time of capture. Again, this results in a bunch of inelegant code that has to set each device state individually, check the return value of each API call, throw an error if it failed, etc. As a quick work-around I created "unchecked" versions of the state setter functions, which reduces cache-bypassing state setting to one line of code per state.

An even larger problem comes up when I have to apply those captured state blocks. The device states get changed when I call Apply(), but the internally cached states do not. So I am forced to do a bunch of array element assignments once again after each call to Apply(). There are quite a few places where I have to apply state blocks, so this results in lots of code.

The logical solution is to create my own state block class that holds a pointer to the D3D state block object plus a vector of cached states, which get written into the cache array when the state block is applied. However, all of this together sounds like it will produce a huge overhead: not only are state blocks slow in DirectX as it is, but I will also have to loop through my own vector<> and write a bunch of values to memory each time I apply a state block. Again, is that how you implemented this, or is there a better way?
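
For reference, here is a minimal sketch of the scheme described above: a checked setter, an "unchecked" setter, and a state block wrapper that writes its values back into the cache when applied. It is only an illustration under assumptions. FakeDevice, RenderStateCache, CachedStateBlock and the 256-entry table are hypothetical names and sizes, and FakeDevice stands in for the real device so the code builds without the SDK (a real engine would call IDirect3DDevice9::SetRenderState and IDirect3DStateBlock9::Apply at the marked places).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kStateCount = 256; // assumed upper bound on state IDs

// Hypothetical stand-in for the D3D device so the sketch compiles without the SDK.
struct FakeDevice {
    std::uint32_t values[kStateCount] = {};
    int apiCalls = 0;

    // Plays the role of IDirect3DDevice9::SetRenderState.
    void setRenderState(std::size_t state, std::uint32_t value) {
        values[state] = value;
        ++apiCalls;
    }
};

struct StateValue {
    std::size_t state;
    std::uint32_t value;
};

class RenderStateCache {
public:
    explicit RenderStateCache(FakeDevice& device) : device_(device) {}

    // Checked setter: reaches the device only when the value actually changes.
    void set(std::size_t state, std::uint32_t value) {
        assert(state < kStateCount);
        if (cached_[state] == value) return;
        setUnchecked(state, value);
    }

    // Unchecked setter: always calls the API, e.g. while recording a state block.
    void setUnchecked(std::size_t state, std::uint32_t value) {
        assert(state < kStateCount);
        device_.setRenderState(state, value);
        cached_[state] = value;
    }

    // Write-back after a state block changed the device behind the cache's back.
    void sync(const std::vector<StateValue>& applied) {
        for (const StateValue& s : applied) cached_[s.state] = s.value;
    }

private:
    FakeDevice& device_;
    // Zero only because FakeDevice also starts at zero; a real cache would
    // start from the documented D3D defaults.
    std::uint32_t cached_[kStateCount] = {};
};

// A state block that remembers its contents so applying it can update the cache.
class CachedStateBlock {
public:
    void add(std::size_t state, std::uint32_t value) { states_.push_back({state, value}); }

    void apply(FakeDevice& device, RenderStateCache& cache) const {
        // Stands in for a single IDirect3DStateBlock9::Apply() call.
        for (const StateValue& s : states_) device.values[s.state] = s.value;
        ++device.apiCalls;
        cache.sync(states_);
    }

private:
    std::vector<StateValue> states_;
};
```

The write-back in sync() is one memory write per recorded state, so its cost is likely small next to the Apply() call itself, though that would need measuring in a real engine.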

paic    645
Hi,
I can't really answer your questions, but I can explain how I'm doing things.

I'm not using a "cache" for render/sampler/texture stage states. There are simply too many to handle. What I do instead is use a StateBlock class, which looks something like this:


#include <d3d9.h>

class StateBlock
{
public:
    StateBlock();
    ~StateBlock();

    // Queue up the states that Finalize() will compile into the state block.
    void AddRenderState(D3DRENDERSTATETYPE state, DWORD value);
    void AddSamplerState(D3DSAMPLERSTATETYPE state, DWORD value, DWORD sampler);
    void AddTextureStageState(D3DTEXTURESTAGESTATETYPE state, DWORD value, DWORD stage);

    void Finalize();   // build the D3D state blocks from the queued states

    void Apply();      // set all states of this block on the device
    void Capture();    // back up the current device states
    void Restore();    // put the backed-up states back

private:
    IDirect3DStateBlock9 * m_pStateBlock; ///< the actual set of states
    IDirect3DStateBlock9 * m_pBackup;     ///< the stateblock used to capture states and restore them later
};



I have simplified the class a lot. It stores a list of small structures describing the states, and when I call Finalize() it builds the IDirect3DStateBlock9 objects. The three methods Apply(), Capture() and Restore() are then self-explanatory :)

The point is: I only cache the StateBlock currently in use. I don't care about individual states; I just manipulate the set of states as a whole. It's pretty easy to implement and use, and so far it performs well. My typical scene has ~500k tris and at most ~20 different StateBlocks.
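
A rough sketch of "only cache the current StateBlock" could look like the following. FakeStateBlock and StateBlockBinder are hypothetical names; FakeStateBlock merely counts calls where the real class would forward to IDirect3DStateBlock9::Apply.

```cpp
#include <cassert>

// Hypothetical stand-in for the StateBlock class; Apply() only counts calls here.
struct FakeStateBlock {
    int applyCalls = 0;
    void Apply() { ++applyCalls; }
};

// Remembers which block was applied last and skips redundant Apply() calls.
class StateBlockBinder {
public:
    // Returns true if the block was actually applied to the device.
    bool bind(FakeStateBlock& block) {
        if (&block == current_) return false; // already active, nothing to do
        block.Apply();
        current_ = &block;
        return true;
    }

    // Forget the active block, e.g. after something else touched the device states.
    void invalidate() { current_ = nullptr; }

private:
    FakeStateBlock* current_ = nullptr;
};
```

The comparison is a single pointer check per draw group, which is what keeps this simpler than tracking every individual state.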

Giallanon    1893
Quote:

1) All cached states must be initialized to the default values given in the D3D documentation. This seems very inelegant, because it's a huge list of assignments to array elements in my Graphics class constructor. I had no problem typing all that, but it made me wonder if that's what I really have to put up with, or if there is a better way. For all of you out there - is that how you did this? My device is created with the pure flag, so I can't loop through all the states and call the API to get the current value of each to initialize the cache with.

D3D render states are DWORDs, if I remember right, so you could just memset your cached values to 0xffffffff.
That way, the first time you set a render state you may get a warning (something like "ignoring redundant state change"), but I think you can afford a single redundant state change. From then on, your cache will be in sync with D3D's internal state.
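
A minimal sketch of the memset idea, assuming a hypothetical SentinelCache with 256 entries; the counter stands in for the calls that would reach IDirect3DDevice9::SetRenderState.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kCacheSize = 256; // assumed upper bound on state IDs

struct SentinelCache {
    std::uint32_t values[kCacheSize];
    int apiCalls = 0; // counts the calls that would reach SetRenderState

    // Fill every entry with 0xFFFFFFFF so the first real value always differs.
    SentinelCache() { std::memset(values, 0xFF, sizeof(values)); }

    void set(std::size_t state, std::uint32_t value) {
        assert(state < kCacheSize);
        if (values[state] == value) return; // cached value matches, skip the API
        ++apiCalls;                         // a real cache would call the device here
        values[state] = value;
    }
};
```

One caveat with a sentinel: if the very first value requested for a state is 0xFFFFFFFF itself (a valid color, for example), the call is skipped even though the device may hold something else.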

jollyjeffers    1570
I'd personally opt for something like what Giallanon suggests. That, or some sort of bitfield to signify that a cache element is "dirty" and exempt from checking. The dirty concept can work nicely for the lost-device scenario as well as first-time loading...

hth
Jack
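
The dirty-bitfield variant might be sketched as below. DirtyFlagCache and its size are assumptions for illustration, and the counter again stands in for calls to IDirect3DDevice9::SetRenderState.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumStates = 256; // assumed upper bound on state IDs

class DirtyFlagCache {
public:
    DirtyFlagCache() { dirty_.set(); } // nothing is known about the device yet

    void set(std::size_t state, std::uint32_t value) {
        assert(state < kNumStates);
        // A dirty entry is exempt from the comparison and always goes through.
        if (!dirty_[state] && values_[state] == value) return;
        ++apiCalls_; // a real cache would call the device here
        values_[state] = value;
        dirty_.reset(state);
    }

    // Call after a device reset, or after anything else changes states behind the cache.
    void invalidateAll() { dirty_.set(); }

    int apiCalls() const { return apiCalls_; }

private:
    std::uint32_t values_[kNumStates] = {};
    std::bitset<kNumStates> dirty_;
    int apiCalls_ = 0;
};
```

Unlike a sentinel value, the separate flag means every DWORD value, including 0xFFFFFFFF, is handled correctly on first use.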

paic    645
But that doesn't solve the state block problems. And in DirectX, it's better to apply a state block containing a few redundant state changes than to set each state separately with SetRenderState() and the like, even if the latter avoids every redundant change.

ValMan    466
So it sounds like the best way is not to track every single state and call the API when it changes, but instead to put every state change used throughout the game or engine into D3D state blocks. I've heard that is also how D3D10 handles this.

I am still struggling to understand how render state blocks fit into the rendering pipeline, but so far I have arrived at the following (my engine is 2D):


// Each actor is a sprite, so rendering it would result in a quad added to the render queue.
// Quads in the render queue would need to be sorted by Z Order and Material.

Actor
    Position              // Applied to verts of the quad as POSITION0.
                          // Specified as world units, cached into units used with ortho projection.

    Z Order               // Applied to verts of the quad in POSITION0
                          // but since I don't use Z-buffer, only useful
                          // to renderer for determining when to break a batch

    Color                 // Applied to verts of the quad as COLOR0, if vertex declaration has a slot

    SpriteInstance        // Sprite used, which contains texture coordinates in animation frames
        Animation
            Frame
                TextureCoords[]   // One set for every texture. How many textures depends
                                  // on vertex declaration, which is a part of Material
                                  // specified as float U,V, but can be set from pixel values
                                  // applied to verts of the quad as TEXTUREn

    AnimationInstance     // Specifies which Animation to use in Sprite, and which frame of that animation

    Rotation              // Specified in degrees. When changes,
                          // call D3DXMatrixTransformation2D to update internal SRT matrix

    Pivot                 // Specified in world units, cached into ortho units.
                          // When changes, call D3DXMatrixTransformation2D to update SRT

    Scale                 // Specified as percentage. When changes,
                          // call D3DXMatrixTransformation2D to update SRT

    SRT Matrix            // Not exposed, but used for collision detection and rendering

    MaterialInstance      // Stored in pool of MaterialInstances. Every time some material property gets
                          // changed, a new material instance may have to be created
        Material
            VertexDeclaration   // Used to figure out the contents of quad's vertices

            Effect              // Optional. If not using an effect, use fixed pipeline
                D3DEffect       // Not exposed

        Effect Technique  // Used only when effect is specified

        Properties[]      // Values of Effect's exported constants. If material has no effect, not used.
                          // Exposed, but compiled into internal D3D state block which is not exposed
                          // Some properties with special semantics, like WorldViewProj,
                          // would be set by engine on every frame

        Textures[]        // Exposed, but compiled into internal D3D state block which is not exposed

        States[]          // Exposed, but compiled into internal
                          // D3D state block which is not exposed

        D3DStateBlock     // Not exposed, contains all states
                          // that must be set before object is rendered
                          // this includes API calls resulting from Properties, Textures and States


Does this look like something that can be accomplished? I really want a flexible way to render, but all of this looks like it will take years to implement. There are also blank spots that I don't know how to fill in. For example, I know that Effect code can set render states. But then how do I figure out whether it did so, and which states it changed, so that I can include those in the unexposed D3D state block in the Material and also prevent the Effect from setting them again?

So anyway, before an actor can be rendered, I would have to re-create the material instance's state block.

1. Device.BeginStateBlock

2. Somehow force the Effect, if one is specified, to change all states needed for the selected Technique, so they can be recorded in the state block

3. Loop through all Properties and set an Effect constant for each. This will cause SetShaderConstant calls to be recorded in the state block

4. Loop through all Textures and set all of them, to cause SetTexture calls to be recorded in the state block

5. Loop through all States and set them to cause SetRenderState/SetSamplerState/SetTextureStageState to be recorded in the state block

6. Device.EndStateBlock

Each time an actor's material instance changes because we set a Property, a Texture, a State or the Technique, the internal state block is marked as "dirty". The next time the object is about to be rendered, the steps above are followed to update it.
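
The dirty-flag rebuild described above could be sketched roughly as follows. MaterialInstanceSketch is a hypothetical name, only States[] is modelled, and the rebuild counter marks where the BeginStateBlock ... EndStateBlock recording from steps 1 to 6 would go in a real engine.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

class MaterialInstanceSketch {
public:
    // Changing a state marks the internal state block as dirty.
    void setState(int state, std::uint32_t value) {
        std::map<int, std::uint32_t>::const_iterator it = states_.find(state);
        if (it != states_.end() && it->second == value) return; // no real change
        states_[state] = value;
        dirty_ = true;
    }

    // Called right before the actor is rendered.
    void prepareForRender() {
        if (!dirty_) return; // state block is still up to date
        // Steps 1 to 6 would run here: begin recording, replay the technique
        // states, properties, textures and states_, then end recording.
        ++rebuilds_;
        dirty_ = false;
    }

    int rebuilds() const { return rebuilds_; }

private:
    std::map<int, std::uint32_t> states_;
    bool dirty_ = true; // nothing has been recorded yet
    int rebuilds_ = 0;
};
```

With this shape the state block is re-recorded only on the first render after a change, not on every frame.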

paic    645
A few things, in no particular order:

When using .fx effects, the effect can indeed change states. But when you use ID3DXEffect in your engine, the flags you pass to Begin() control whether the effect saves the device states and restores them when it is done.

About state blocks: they should only be created during initialization. They are not meant to be recreated or modified every frame! The point of a state block is to set many render/sampler/texture stage states with a single API call (IDirect3DStateBlock9::Apply).
It's not as flexible as setting states individually with SetRenderState and friends, but it's a lot faster.
Think about your engine: most of the sprites you draw will use the exact same states (alpha blending, alpha testing, etc.). Then you'll probably have a particle system which uses a shader and probably another set of states, and so on.
So during initialization, create a few state blocks for the different kinds of entities in your engine. Before rendering, group the entities by the state block they use and draw them, changing state block only when needed.

For example, in an extremely simplified game, you'd have 3 state blocks: one for sprites, one for effects and one for the GUI. Rendering would then look something like this:

- Set sprites stateblock
- Render all sprites (set textures, set matrices, etc.)
- Set effect stateblock
- Set shader
- Render all effects
- Set gui stateblock
- Render GUI
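
The grouping step can be sketched like this. DrawItem, its fields and renderGrouped are hypothetical, state block IDs are assumed to be non-negative, and the function only counts how many state block switches a frame would need instead of calling Apply().

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical draw record: which state block an entity needs, and which entity it is.
struct DrawItem {
    int stateBlockId; // assumed to be >= 0
    int entityId;
};

// Groups items by state block and returns the number of state block switches.
int renderGrouped(std::vector<DrawItem> items) {
    // stable_sort keeps the submission order within each group.
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.stateBlockId < b.stateBlockId;
                     });

    int switches = 0;
    int current = -1; // no state block applied yet
    for (const DrawItem& item : items) {
        if (item.stateBlockId != current) {
            current = item.stateBlockId; // a real renderer would call Apply() here
            ++switches;
        }
        // set textures, set matrices, draw the entity ...
    }
    return switches;
}
```

In a 2D engine that relies on Z order instead of a Z-buffer, Z order would presumably have to come first in the sort key, with the state block only breaking ties within a layer.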



But again, that's just the way I'd do it. If you use many different states, using stateblocks might not be the best option.
