Stencil clear takes more than 1ms

10 comments, last by Vertex333 15 years, 6 months ago
My stencil clear takes about 1.3 ms. I do thousands of them in one frame :D (since every object needs a fresh stencil buffer). Thx, Vertex
1. How are you profiling that?
2. Why do you need a fresh stencil buffer for every object?
3. Why do you expect that it should be a quick operation?
Thousands of clears per frame? That's crazy! First off, you'll use a ton of bandwidth just writing stencil values out to memory. Second, each Clear is an API call, and API calls are expensive. Plus the effect will be made worse by having to draw each object with its own draw call, which is going to be incredibly slow with thousands of objects.
Note that if this is for stencil shadows, where each new shadow *caster* requires a fresh stencil buffer, consider two things:

1. Aggressive culling - projecting a bounding sphere of a light isn't hard and culling objects against a bounding sphere is easy. You could possibly eliminate a lot of redundant drawing very quickly.

2. Have a think about odd/even stencil rendering. I forget the details now, but you can sometimes get away with alternating between GE and LE tests thus halving the number of clear operations you need to perform.
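
As a rough illustration of point 1, here is a minimal sketch of culling objects against a light's bounding sphere. The `Sphere` type and both function names are hypothetical, and it assumes each object already has a world-space bounding sphere; it is not taken from any particular engine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bounding sphere, for illustration only.
struct Sphere {
    float x, y, z;   // centre in world space
    float radius;
};

// True when the two spheres touch or overlap. Squared distances are
// compared so that no square root is needed.
bool spheresOverlap(const Sphere& a, const Sphere& b) {
    assert(a.radius >= 0.0f && b.radius >= 0.0f);
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    const float reach = a.radius + b.radius;
    return dx * dx + dy * dy + dz * dz <= reach * reach;
}

// Returns the indices of the objects whose bounding sphere reaches into
// the light's sphere of influence. Everything else can be skipped for
// this light, including its stencil work.
std::vector<std::size_t> objectsInLightRange(const Sphere& light,
                                             const std::vector<Sphere>& objects) {
    std::vector<std::size_t> inRange;
    for (std::size_t i = 0; i < objects.size(); ++i) {
        if (spheresOverlap(light, objects[i])) {
            inRange.push_back(i);
        }
    }
    return inRange;
}
```

Only the objects returned here would need shadow volumes (and stencil work) for that light.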


hth
Jack

Jack Hoxley [ Forum FAQ | Revised FAQ | MVP Profile | Developer Journal ]

Quote:Original post by jollyjeffers
Note that if this is for stencil shadows, where each new shadow *caster* requires a fresh stencil buffer, consider two things:

1. Aggressive culling - projecting a bounding sphere of a light isn't hard and culling objects against a bounding sphere is easy. You could possibly eliminate a lot of redundant drawing very quickly.

2. Have a think about odd/even stencil rendering. I forget the details now, but you can sometimes get away with alternating between GE and LE tests thus halving the number of clear operations you need to perform.


hth
Jack


If it did happen to be stencil shadows, I used the Code Sampler version for my shadows which only clears the stencil once per frame and I get good results.
Quote:Original post by stupid_programmer
If it did happen to be stencil shadows, I used the Code Sampler version for my shadows which only clears the stencil once per frame and I get good results.
Does that handle multiple lights? From memory you need at least one stencil clear per light as it can get difficult determining which pixel accepts contributions from which lights - you might be able to do it with some bitwise magic I guess, but the normal incr/decr operations probably won't work...


Jack


Quote:Original post by jollyjeffers
Quote:Original post by stupid_programmer
If it did happen to be stencil shadows, I used the Code Sampler version for my shadows which only clears the stencil once per frame and I get good results.
Does that handle multiple lights? From memory you need at least one stencil clear per light as it can get difficult determining which pixel accepts contributions from which lights - you might be able to do it with some bitwise magic I guess, but the normal incr/decr operations probably won't work...


Jack


No, it doesn't, which isn't a problem for me but might be for others. You wrote the article on this site using shadows? I tried that one before because I liked how it flowed better, but my objects have a lot of transparency going on, which the implementation doesn't handle well (completely transparent pixels prevent stencil writes). I'm sure if I knew how to set the render states a bit better I could fix the problem.

Thx@all

Quote:Original post by Evil Steve
1. How are you profiling that?
2. Why do you need a fresh stencil buffer for every object?
3. Why do you expect that it should be a quick operation?

Sorry, forgot that: Release, D3D Release, PIX draw timing, Intel onboard graphics.

It is not for shadows, and I don't know how to avoid the clear when drawing multiple objects. See this link for more details: http://www.gamedev.net/community/forums/topic.asp?topic_id=509583

I need it for every object, since objects can affect each other. Might clearing only a smaller rectangle be faster than clearing the whole stencil buffer? I expected it to be implemented (at least for one value) as a fast hardware call. Maybe drawing a rectangle that writes 0 to the whole stencil buffer would be faster.

If I can trust PIX, the stencil clear is the most expensive part. I need that many draw calls, since I draw different objects (very small, though) that can't be grouped together.

:( In one case I have about 3000 draw calls (I know it shouldn't be more than 500, but what should I do?) and therefore about 14000 render state (RS) changes. ;) It's not performing that well... about 0.5 FPS with my "ultrafast" Intel GMA.

Would vertex/pixel shaders or D3D10 be worth considering?

I already had the idea of using degenerate triangles between the small objects and drawing them in one draw call, or at least some of them. All polygons that need the stencil have to be drawn individually anyway.
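
For the objects that do not need the stencil, the degenerate-triangle idea could look roughly like the sketch below. It assumes the objects are indexed triangle strips sharing one vertex buffer and render state; `appendStrip` is a hypothetical helper name, not an API call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends the triangle strip `next` to `combined` so that both can be
// drawn with a single strip draw call. The repeated indices form
// zero-area (degenerate) triangles, which produce no pixels.
void appendStrip(std::vector<std::uint16_t>& combined,
                 const std::vector<std::uint16_t>& next) {
    assert(next.size() >= 3);  // a strip needs at least one triangle
    if (!combined.empty()) {
        combined.push_back(combined.back());  // repeat last index of the previous strip
        combined.push_back(next.front());     // repeat first index of the next strip
        // A strip flips winding on every odd triangle, so the next strip
        // must start at an even position to keep its original winding.
        if (combined.size() % 2 != 0) {
            combined.push_back(next.front());
        }
    }
    combined.insert(combined.end(), next.begin(), next.end());
}
```

The extra indices cost a little vertex processing, so this tends to pay off only when the saved draw calls are the bottleneck, as seems to be the case here.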

edit: OK, maybe PIX slows the application down anyway (bad for measuring). Nevertheless, the ratio between clear and draw call shouldn't be that far off.

Thx,
Vertex

[Edited by - Vertex333 on October 14, 2008 1:38:51 AM]
Clear() takes an optional rectangle parameter. You can use that to limit the section of the screen you're clearing. You could also redraw the shape a third time with the appropriate stencil buffer op to clear that bit of the stencil buffer.
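
A minimal sketch of building such a rectangle from an object's screen-space vertices follows. It assumes Direct3D 9, where the rectangle type is `D3DRECT` with `x1, y1, x2, y2` fields; `ClearRect` is a stand-in with the same layout so the snippet compiles without the SDK, `Point` and `boundingClearRect` are hypothetical names, and treating `x2`/`y2` as exclusive is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { long x, y; };                 // hypothetical screen-space vertex
struct ClearRect { long x1, y1, x2, y2; };   // stand-in mirroring D3DRECT's fields

// Smallest rectangle covering all points, clamped to the render target.
// x2 and y2 are treated as exclusive here (an assumption).
ClearRect boundingClearRect(const std::vector<Point>& pts,
                            long targetWidth, long targetHeight) {
    assert(!pts.empty());
    long minX = pts[0].x, maxX = pts[0].x;
    long minY = pts[0].y, maxY = pts[0].y;
    for (std::size_t i = 1; i < pts.size(); ++i) {
        minX = std::min(minX, pts[i].x);
        maxX = std::max(maxX, pts[i].x);
        minY = std::min(minY, pts[i].y);
        maxY = std::max(maxY, pts[i].y);
    }
    ClearRect r;
    r.x1 = std::min(std::max(minX, 0L), targetWidth);
    r.y1 = std::min(std::max(minY, 0L), targetHeight);
    r.x2 = std::max(std::min(maxX + 1, targetWidth), 0L);
    r.y2 = std::max(std::min(maxY + 1, targetHeight), 0L);
    return r;
}

// With a real device the result would be copied into a D3DRECT and passed as:
//   device->Clear(1, &rect, D3DCLEAR_STENCIL, 0, 1.0f, 0);
```

Whether a small rectangle is actually cheaper than a full clear depends on the hardware and driver, so it is worth measuring both.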

You can obviously cut the number of draw calls down significantly by grouping together objects that don't overlap too. If nothing overlaps then you can draw the whole lot in one go.
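
One possible way to do that grouping is a greedy pass over screen-space bounding boxes, sketched below. `Box` and `groupNonOverlapping` are hypothetical names, and the sketch assumes the draw order between objects in different groups does not matter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Box { float minX, minY, maxX, maxY; };  // hypothetical screen-space bounds

// True when the two boxes share any area (touching edges count as overlap).
bool boxesOverlap(const Box& a, const Box& b) {
    assert(a.minX <= a.maxX && a.minY <= a.maxY);
    assert(b.minX <= b.maxX && b.minY <= b.maxY);
    return a.minX <= b.maxX && b.minX <= a.maxX &&
           a.minY <= b.maxY && b.minY <= a.maxY;
}

// Greedy grouping: each object goes into the first group in which it
// overlaps nothing, otherwise it starts a new group. Every group can then
// share one stencil clear instead of needing one clear per object.
std::vector<std::vector<std::size_t>> groupNonOverlapping(const std::vector<Box>& boxes) {
    std::vector<std::vector<std::size_t>> groups;
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        bool placed = false;
        for (std::size_t g = 0; g < groups.size() && !placed; ++g) {
            bool clash = false;
            for (std::size_t k = 0; k < groups[g].size(); ++k) {
                if (boxesOverlap(boxes[i], boxes[groups[g][k]])) {
                    clash = true;
                    break;
                }
            }
            if (!clash) {
                groups[g].push_back(i);
                placed = true;
            }
        }
        if (!placed) {
            groups.push_back(std::vector<std::size_t>(1, i));
        }
    }
    return groups;
}
```

The greedy pass does not guarantee the fewest possible groups, but it is simple and only needs to be rerun when objects move.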

You should also be able to write some code that simply detects if a shape is self intersecting, and only uses the stencil buffer trick on those ones. However, I don't know any better algorithm than the obvious O(N^2) testing of all edges for intersection with each other, so it could get expensive. However as long as the shapes don't change every frame it could still help.
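
The obvious O(N^2) test could be sketched as below. It assumes each shape is a single closed polygon given as a vertex loop, and it reports only proper crossings; touching or collinear edges are not detected. `Vec2` and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec2 { double x, y; };  // hypothetical 2D vertex

// Twice the signed area of triangle (o, a, b); the sign tells which side
// of the line o->a the point b lies on.
double cross(const Vec2& o, const Vec2& a, const Vec2& b) {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// True when segment p1-p2 properly crosses segment q1-q2.
bool segmentsCross(const Vec2& p1, const Vec2& p2, const Vec2& q1, const Vec2& q2) {
    const double d1 = cross(q1, q2, p1);
    const double d2 = cross(q1, q2, p2);
    const double d3 = cross(p1, p2, q1);
    const double d4 = cross(p1, p2, q2);
    return ((d1 > 0 && d2 < 0) || (d1 < 0 && d2 > 0)) &&
           ((d3 > 0 && d4 < 0) || (d3 < 0 && d4 > 0));
}

// Checks every pair of non-adjacent edges of the closed polygon.
bool isSelfIntersecting(const std::vector<Vec2>& poly) {
    const std::size_t n = poly.size();
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            // Skip edge pairs that share a vertex.
            if (j == i + 1 || (i == 0 && j == n - 1)) continue;
            if (segmentsCross(poly[i], poly[(i + 1) % n],
                              poly[j], poly[(j + 1) % n])) {
                return true;
            }
        }
    }
    return false;
}
```

The result can be cached per shape, so the quadratic cost is paid only when a shape's outline changes.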

There's also another option. Instead of using the stencil buffer to get the overlaps right you can modify the shape in code so you can render it normally. However that's going to be slow. See http://www.codeproject.com/KB/recipes/hgrd.aspx for some code for doing it. That code is a bit dodgy though - it compares iterators with null which doesn't compile on modern compilers. You'll want to fix it or find something better.
Quote:Original post by Adam_42
Clear() takes an optional rectangle parameter. You can use that to limit the section of the screen you're clearing. You could also redraw the shape a third time with the appropriate stencil buffer op to clear that bit of the stencil buffer.

You can obviously cut the number of draw calls down significantly by grouping together objects that don't overlap too. If nothing overlaps then you can draw the whole lot in one go.

You should also be able to write some code that simply detects if a shape is self intersecting, and only uses the stencil buffer trick on those ones. However, I don't know any better algorithm than the obvious O(N^2) testing of all edges for intersection with each other, so it could get expensive. However as long as the shapes don't change every frame it could still help.

There's also another option. Instead of using the stencil buffer to get the overlaps right you can modify the shape in code so you can render it normally. However that's going to be slow. See http://www.codeproject.com/KB/recipes/hgrd.aspx for some code for doing it. That code is a bit dodgy though - it compares iterators with null which doesn't compile on modern compilers. You'll want to fix it or find something better.

Thx! Grouping is not that easy. Self-intersection detection... OK, but as you said, maybe expensive. Thx for the sample; I already had a look at it... I don't know if it can help me (maybe too CPU bound)... Clearing with rectangles is an option; however, I have to set the RT not to discard the rest (which may be odd anyway).

What I want to do now is use VS/PS 4.0 to generate it another way... but I have no idea how to do it if the stencil and render-to-texture shouldn't be used. Any ideas?

thx,
Vertex

