Reducing OverDraw...


Recommended Posts

Hi. This is a little idea I picked up from an interview I read with John Carmack (about Doom 3, of course!). I think he might be using this method in D3, but I can't remember. Anyway, overdraw can be a big problem in games these days, particularly when you have shader scripts and lots of multi-layered effects operating on every surface. The interview seemed to suggest using two passes to remove the problem:

First pass: disable textures, lighting and color-buffer writes; enable depth-buffer writes.

Second pass: enable textures, lighting and color-buffer writes; disable z-writes but allow z-tests.

The first pass generates a complete depth-buffer image for the scene. The second pass is a normal render, but the presence of the z-buffer image allows most hidden pixels to be rejected very quickly, so only the final (visible) pixel at each location is actually rendered.

Does this make sense? Do you think this would work well to reduce overdraw? The only downside I can see is that you'd need to transform all the geometry twice, so you'll only gain speed if the geometry load isn't too heavy and the reduction in fill rate compensates. Would this be a good strategy to try? Any comments/thoughts?

Jack;

quote:

Does this make sense?


Yes.

quote:

Do you think this would work well to reduce overdraw?


No.

OK, to be a bit more precise: yes, it would speed your engine up if your scenes contained little geometry but used a huge amount of pixel shaders. The GF3+ has a special optimization (early z rejection) to speed up exactly those cases.

But the use of complex geometry is more and more common, while only a small part of the geometry really needs heavy-duty pixel shaders. And in any case, you would actually double the transformation cost *and* the fillrate (even if it would be faster overall, the z-buffer fill in the first pass is not free).

Perhaps a compromise would be interesting: in the first pass, render an approximated scene into the z-buffer (making sure that the approximated model never has a part in front of the original full-detail one). You could get away with a fraction of the faces needed by your real scene, and early z rejection would still speed you up quite a bit. It would be interesting to do some profiling with that method.

BTW: if you want to reduce overdraw, you should have a look into occlusion culling. It can be very effective in cutting down overdraw.

/ Yann


[edited by - Yann L on June 26, 2002 9:25:59 PM]

It's too bad you can't save the transformed vertices from the first pass to use for the second pass. It actually sounds like something the hardware should do as an optimization.

I wonder if there is a way to do that and actually improve performance over transforming twice.

Value of good ideas: 10 cents per dozen.
Implementation of the good ideas: Priceless.

Proxima Rebellion - A 3D action sim with a hint of strategy

quote:

It's too bad you can't save the transformed vertices from the first pass to use for the second pass. It actually sounds like something the hardware should do as an optimization.


The OpenGL extension 'compiled vertex arrays' (CVAs) can do that. Or at least, it should be able to. Sometimes. Perhaps.

It's not standardized. The idea was to cache transformed vertices in VRAM so that they could be reused in a second pass. But because of the lack of standardization, every manufacturer interprets it differently: sometimes it will actually cache the transformed vertices, sometimes it will only copy the untransformed vertex data into VRAM for fast access. It depends on the manufacturer and even the driver revision, which unfortunately makes it pretty useless.

/ Yann

I'm currently using Direct3D 8 for all this, which doesn't yet have occlusion culling.

I didn't think it would work too well, but it's an interesting idea to think about. Maybe I'll play around with something else next.

thanks for the thoughts/opinions etc...

Jack;

About the geometry throughput that you are worried about: personally, I don't see it as much of a problem. Since you can pretty much run at the graphics card's maximum polygon throughput (because texturing, lighting etc. are disabled), which is extremely high on modern graphics cards, the first pass will be really fast.

Death of one is a tragedy, death of a million is just a statistic.

quote:

About the geometry throughput that you are worried about... Personally, I don't see it as much of a problem


Well, our game is geometry limited on a GF4 Ti4600 (at full detail level). So, if I included that feature, I would actually lose framerate...

It always depends on your specific application, since it basically trades a lot of geometry bandwidth against not so much fillrate (if you draw the full-detail mesh; if you only draw an approximation, things might be different). The early z-rejection is not that early in the pipeline either; you can save some cost, but it won't be for free. Not sending additional geometry, on the other hand, is free.

I would still go with a good occlusion culling system: the fastest geometry is the geometry you don't even send to the 3D card. We have had very good results with a hierarchical occlusion system in our game; in most scenes, we can cull around 90-95% of the potential overdraw. I don't even dare to imagine what would happen to the framerate if that amount was sent to the card, to let it do the culling via early z rejection.

But it might work for your engine, just try it out !

/ Yann

Guest Anonymous Poster
OK, color me pink and call me stupid, but I have a few questions about the terms used here.

I don't really understand what you mean by a "pass". I'm just starting off in OpenGL programming and I've heard this term a lot, but I don't really understand its meaning, implications and implementation when programming a rendering pipeline.

Thanks in advance.

I'll try to answer your question about what a pass is.

You may render the same scene multiple times in a single frame. For example, think of Quake III. The world geometry is rendered once using the base textures. It is then rendered a second time using multiplicatively-blended lightmap textures.

This example isn't entirely accurate, however: on most cards, this is done in a single pass using multitexturing. But when a single "shader" needs more texture stages than the card has, each group of additional stages requires an additional pass - a re-rendering of the same geometry, though perhaps using a different texture or blend mode.

You may render your scene once to the z-buffer, and a second time to the color buffer, as you suggested. You render it twice in one frame; you use two passes.

You may be building a terrain engine. You want parts of the terrain to have a "sand" texture, and other parts to have a "grass" texture, and you want the two to blend seamlessly. So you first render the terrain with the sand texture. Then you render the same terrain again with the grass texture, but with alpha blending to allow parts of the sand texture to show through. You rendered the same terrain twice; you used two-pass rendering. Now let's say you wanted to add snow to your landscape as well: you'd use a third pass.

Multitexturing is done in a single pass: you render the geometry once; you just use multiple hardware texture units. When possible, use multitexturing to avoid additional passes.

Multipass rendering on newer T&L cards generally does not create a geometry bottleneck. A CVA or its D3D equivalent stores the already-transformed vertices, so only fillrate is the problem. As Yann pointed out, though, CVAs are ill-defined, so performance when using multipass techniques may vary significantly from card to card. But, in general, geometry throughput is not the problem.

The problem, as I mentioned, is raw fillrate. This is why Doom 2000 will run at "only" 30 FPS - it uses many, many passes. The card can only draw so many pixels per second!
