# Efficient Render States In 2D (*gulp*)


GroZZleR    820
Hey all, I whipped up a quick particle system structure today. Nothing fancy, just enough to do some prototyping before fleshing out the entire system. I've run into a problem though: texture and state changes are crippling my framerate.

Imagine this scene: Cyan and Blue are people, Brown is a table and Red is some sort of particle system (perhaps an active spell on the table, it doesn't really matter). As you can see, to achieve this sense of depth, the scene has to be rendered from top to bottom. This leads to a state change in the middle of rendering: a switch of texture, then a switch back. That's not so bad. It's when the particles begin moving (perhaps flying to all corners of the room) that you end up with a significant number of switches. If you have more than two particle systems going, this really hurts.

Is this just the price you pay for the pseudo depth of most classic RPGs? I certainly can't think of a more efficient way to do it. For the record, in this situation, I sort by Y, then by texture. I can't move the particles to their own "layer" because that would stop them from rendering either "in front" of Cyan or "blocked" by Blue. Anyone?

[Edited by - GroZZleR on July 1, 2005 3:08:51 AM]

##### Share on other sites
I'm assuming that the state changes are occurring because each different object has to change the state to render itself. Is there a reason you need to sort out the depth manually? Why don't you set up a z-buffer and assign each sprite a higher or lower z coordinate depending on its y coordinate?

EDIT: That way you could set the state once for all objects of type X and render them all, then switch the state for all objects of type Y and render them all, and so on, so that you only have to set the state once for every kind of object.
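To make the saving concrete, here is a minimal sketch that counts texture switches for the same set of sprites under the two draw orders. The sprite tuples and the `texture_switches` helper are made up for illustration, and batching by texture is only valid when the depth buffer can resolve the overlaps for you.

```python
from itertools import groupby

# Hypothetical sprite records: (y, texture_name).
sprites = [
    (10, "people"), (20, "particles"), (30, "table"),
    (40, "particles"), (50, "people"), (60, "particles"),
]

def texture_switches(draw_order):
    """Count how many times the bound texture changes along a draw order."""
    textures = [tex for _, tex in draw_order]
    # groupby collapses runs of equal textures; every run after the first is a switch.
    runs = [key for key, _ in groupby(textures)]
    return max(len(runs) - 1, 0)

# Painter's order: top to bottom by y, as in the original post.
by_y = sorted(sprites, key=lambda s: s[0])
# Batched order: grouped by texture, relying on the depth test for overlap.
by_texture = sorted(sprites, key=lambda s: s[1])

print(texture_switches(by_y))        # 5
print(texture_switches(by_texture))  # 2
```

With more sprites per texture the gap widens, since the batched count depends only on the number of distinct textures.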

##### Share on other sites
Syranide    375
Yup, I agree with the previous post. And really, changing states or making many render calls in 2D doesn't hurt as much as one might think. Sure, it will always impact performance a little, but 2D applications today are rarely limited by raw 2D performance.

(Simple physics samples draw 1000 boxes using 6000 render calls per frame and still maintain a high FPS. Even though doing 6000 is just plain stupid, it proves that one doesn't need to be overly cautious about wasting 5 or 10.)

But as the previous post says, why not do it like in 3D: use the Z-buffer and batch textures and render states together into "optimized" groups? (It can make things simpler, as you don't need to draw in the right order at all times, at the cost of a minor performance hit.)

##### Share on other sites
GroZZleR    820
Thanks guys, helpful as always.

However - I seem to be having some issues implementing the Z-buffer. It doesn't seem to work, heh. A google for "Managed DirectX ZBuffer" just gets me the ZBuffer development site, which ironically enough, doesn't seem to have an article on the ZBuffer. ;)

Here's the various snippets to try to get this going:
First my presentation parameters:

    Direct3D.PresentParameters parameters = new Direct3D.PresentParameters();
    parameters.Windowed = true;
    parameters.SwapEffect = Direct3D.SwapEffect.Discard;
    parameters.BackBufferFormat = Direct3D.Format.Unknown;
    parameters.PresentationInterval = Direct3D.PresentInterval.Immediate;
    // Z-Buffer
    parameters.AutoDepthStencilFormat = Direct3D.DepthFormat.D16;
    parameters.EnableAutoDepthStencil = true;

Then I clear the ZBuffer and enable writing when I draw:

    _device.Clear(Direct3D.ClearFlags.Target | Direct3D.ClearFlags.ZBuffer, Color.Aquamarine, 1.0F, 0);
    _device.BeginScene();
    _device.RenderState.ZBufferEnable = true;
    _device.RenderState.ZBufferWriteEnable = true;

I set up my projection matrix with an orthographic projection:

    ortho = DirectX.Matrix.OrthoOffCenterLH(0.0f, 640.0f, 480.0f, 0.0f, 0.0f, 10000.0f);

Those last 2 numbers are "ZNear" and "ZFar", so I'm pretty sure that means my values have to fall between that (which they do). Are there any sort of invalid values there?

No luck with what I've got now. I'm specifying the Z in my vertices to be the same as my Y, but it doesn't seem to change anything. It draws in the order I add them. If I do Y = 0, Y += 16, it'll draw top to bottom. If I do Y = 160, Y -= 16, it'll draw bottom to top. If I do Y = Random.Next it'll be random.

Any ideas?

EDIT: All of my sprites are drawn with alpha blending, if that makes any difference whatsoever.

##### Share on other sites
Syranide    375
zNear cannot be 0.0 (preferably 1.0), and zFar should be a lower value.

Keep in mind that zNear and zFar scale the precision of the z-buffer. Choosing 1 and 10 would give you an accuracy of a few decimals, while with 1 and 100000 the precision would be very low, meaning that 10.1 and 10.2 could end up at the same depth.

However, for you I don't think it matters that much, as you aren't going to use 10000 different levels, I would say. Setting it to 1 to 100 would give you 100 different levels to draw on without any problem at all.

So drawing on Z:1 and then Z:50 would make the sprite at Z:1 topmost and the sprite at Z:50 sit behind it.
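For the orthographic projection in the original post, the depth mapping is linear, so the available precision can be estimated with a few lines of arithmetic. The sketch below assumes the usual left-handed orthographic mapping `(z - zNear) / (zFar - zNear)` and a plain 16-bit unsigned depth format like `D16`; the helper names are made up, and real hardware rounding may differ slightly.

```python
def ortho_depth(z, z_near, z_far):
    """Depth in [0, 1] produced by a left-handed orthographic projection."""
    return (z - z_near) / (z_far - z_near)

def quantize_d16(depth):
    """Nearest 16-bit depth value (assumes an unsigned-normalized D16 buffer)."""
    return round(depth * 65535)

def smallest_step(z_near, z_far):
    """World-space distance between two adjacent 16-bit depth values."""
    return (z_far - z_near) / 65535

# With a 1..100 range, whole-number z levels land on clearly distinct depths.
print(quantize_d16(ortho_depth(50, 1, 100)) != quantize_d16(ortho_depth(51, 1, 100)))  # True

# Widening the range coarsens the step between representable depths.
print(smallest_step(1, 100))    # about 0.0015 units
print(smallest_step(1, 10000))  # about 0.15 units
```

Under these assumptions, one z level per pixel row of a 480-pixel-high screen fits comfortably in either range.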

However, know that transparent sprites _must_ be drawn back to front (sorted) or you will start seeing strange effects. (Thanks to the z-buffer, though, you can now draw all opaque sprites first, then the transparent ones.)

EDIT: By transparent sprites I mean alpha-blended sprites: their partially transparent pixels still fill the depth buffer, meaning that nothing behind them will be drawn.
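The effect can be reproduced with a toy one-pixel "framebuffer". This is a simplified model, not Direct3D code: `draw_pixel` and `render` are hypothetical helpers, blending is reduced to "visible or not", and the `alpha_test` flag stands in for the alpha test render states discussed further down the thread.

```python
def draw_pixel(pixel, depth, alpha, color, alpha_test=False):
    """Draw one sprite pixel with a LESS depth test and depth writes enabled."""
    if alpha_test and alpha == 0.0:
        return  # rejected before any color or depth write happens
    if depth >= pixel["depth"]:
        return  # something nearer is already recorded at this pixel
    if alpha > 0.0:
        pixel["color"] = color  # simplified: any visible alpha replaces the color
    pixel["depth"] = depth      # depth is written even when alpha is 0

def render(alpha_test):
    pixel = {"color": "background", "depth": 1.0}
    # The near sprite is drawn first and has a fully transparent texel here.
    draw_pixel(pixel, depth=0.1, alpha=0.0, color="near", alpha_test=alpha_test)
    # The sprite behind it is drawn second and should show through.
    draw_pixel(pixel, depth=0.5, alpha=1.0, color="far", alpha_test=alpha_test)
    return pixel["color"]

print(render(alpha_test=False))  # background
print(render(alpha_test=True))   # far
```

Without the alpha test, the invisible texel still claims the depth value and punches a hole in the sprite behind it.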

##### Share on other sites
GroZZleR    820

Got it working changing the 0 to a 1 as you've suggested. Clearly, this isn't going to solve my top-to-bottom problem as there should be many more sprites wedged in there. As you say, the depth buffer is overwriting them with transparent pixels.

Any other suggestions? =P

I don't know if there is another one. The sprites have to go top-to-bottom, and if that means a particle is "lower" than a non-particle, it has to change render states / textures just to draw itself, then change states back.

Am I maybe just overconcerned? Dropping from 2.8K FPS to 600 FPS isn't a fun drop.

##### Share on other sites
Syranide    375
A drop from 2.8K to 600 FPS isn't really a meaningful measurement, because even changing a single render state could have such an effect at those framerates.

However as I said, drawing transparent sprites must be done sorted back-to-front, that is, sorting them all by z-value before rendering them to the screen.

And don't worry too much about the speed: changing textures and render states isn't as costly as one thinks. As you are using such small textures, you would most likely get away easily by creating a palette of all the textures used, so that you only have to set the texture once or twice and only change render states once or twice.
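One reading of the "palette" idea is a texture atlas: all small sprite images packed into one big texture, with each sprite picking its image through texture coordinates. A minimal sketch of that lookup, assuming square tiles laid out row by row (the `atlas_uv` helper and the sizes are made up, and texel-bleeding fixes such as padding are ignored):

```python
def atlas_uv(index, tile_size, atlas_width, atlas_height):
    """Return (u0, v0, u1, v1) for tile number `index`, counted row by row."""
    tiles_per_row = atlas_width // tile_size
    col = index % tiles_per_row
    row = index // tiles_per_row
    u0 = col * tile_size / atlas_width
    v0 = row * tile_size / atlas_height
    u1 = u0 + tile_size / atlas_width
    v1 = v0 + tile_size / atlas_height
    return (u0, v0, u1, v1)

# Tile 17 in a 256x256 atlas of 16x16 tiles sits in column 1, row 1.
print(atlas_uv(17, 16, 256, 256))  # (0.0625, 0.0625, 0.125, 0.125)
```

Since every sprite then samples the same texture, a top-to-bottom draw order no longer forces a texture switch between a person and a particle.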

EDIT: However, when thinking, why would one need a z-buffer if sorting back-to-front anyways, would seem to be a waste...

##### Share on other sites
kosmon_x    205
Using FPS to measure performance is a bad idea, since FPS is non-linear. Use frame time instead.
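Converting the numbers from this thread shows why. A quick sketch of the arithmetic (the `frame_time_ms` helper is just for illustration):

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps

# The drop reported earlier in the thread: 2800 FPS down to 600 FPS.
cost_high_fps = frame_time_ms(600) - frame_time_ms(2800)
# For comparison, a drop that looks much smaller on paper: 60 FPS down to 50 FPS.
cost_low_fps = frame_time_ms(50) - frame_time_ms(60)

print(round(cost_high_fps, 2))  # 1.31
print(round(cost_low_fps, 2))   # 3.33
```

Losing 2200 FPS here costs about 1.3 ms per frame, less than half of what losing 10 FPS costs at 60 FPS.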

##### Share on other sites
rdunlop    158
One thing that you should determine is where your bottleneck really is. In a 2D application where each pixel may be rendered multiple times per frame with alpha blending, often fill rate will be your limiting factor. There's a good paper on tracking down the limiting factors of your Direct3D application here:

http://developer.nvidia.com/docs/IO/8230/GDC2003_PipelinePerformance.pdf

Depending on what the limiting factor is in your application, there may be a few optimizations that you can perform:

* If you have objects that are opaque (or only have fully transparent/opaque pixels, as I'll get into below), these can be rendered unsorted using the depth buffer, prior to rendering any objects with translucent pixels. That means two passes: first render opaque objects unsorted, then render translucent objects sorted back to front.

* When rendering that second pass, you can turn off Z write enable to reduce the overhead of writing to the depth buffer, since they are sorted back to front.

* ...However, leave D3DRS_ZENABLE set to true when rendering the translucent pass, as it will prevent having to render any pixels that are already occluded by an opaque object.

* You can use the alpha test render states to prevent pixels with alpha values below a specified value from being drawn. This reduces the number of pixels that need to be filled.

* Alpha testing also prevents these pixels from writing to the depth buffer, so if you have objects to render that have an alpha mask but have no translucent pixels (i.e. all pixels either opaque or transparent), then you can treat them as if they were opaque objects (i.e. render unsorted using z-buffer) as the transparent pixels won't occlude other unsorted objects.

* If you do have a separate first pass for opaque objects w/ z-buffering, you can reduce the fill rate by rendering objects in this pass front to back. That way pixels that are occluded by closer objects don't need to be rendered.
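The ordering described in the list above can be sketched as a small sort-and-partition step. This only builds the draw list and assumes that a smaller z means nearer to the viewer; the `Sprite` class and `build_draw_list` are hypothetical names, and the render state changes for each pass are left as comments.

```python
from dataclasses import dataclass

@dataclass
class Sprite:
    name: str
    z: float           # smaller z = nearer to the viewer (assumption)
    translucent: bool  # True if any pixel is partially transparent

def build_draw_list(sprites):
    """Return (pass_name, sprite) pairs in the order they should be drawn."""
    opaque = [s for s in sprites if not s.translucent]
    translucent = [s for s in sprites if s.translucent]
    # Pass 1: depth test on, depth write on. Front to back saves fill rate.
    opaque.sort(key=lambda s: s.z)
    # Pass 2: depth test on, depth write off. Back to front for correct blending.
    translucent.sort(key=lambda s: s.z, reverse=True)
    return [("opaque", s) for s in opaque] + [("translucent", s) for s in translucent]

scene = [
    Sprite("spell", 20.0, True),
    Sprite("table", 30.0, False),
    Sprite("cyan", 10.0, False),
    Sprite("smoke", 40.0, True),
]
print([(p, s.name) for p, s in build_draw_list(scene)])
# [('opaque', 'cyan'), ('opaque', 'table'), ('translucent', 'smoke'), ('translucent', 'spell')]
```

Sprites with an alpha mask but no partially transparent pixels can go in the opaque pass, provided alpha testing is enabled for them.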

Hmmm, feel I'm missing a couple of points, but hopefully that may be of some use...
