zbuffer issue

Started by Absolution
10 comments, last by Absolution
One problem I have noticed with the zbuffer I have implemented is that I have to zero it out every iteration of the main loop. This seems to be the largest bottleneck of my program right now. I have tried a basic loop, a memset, and I even coded it in assembly, but they all take about the same time (I'm guessing my compiler is reducing them all to the same code - i.e. the assembly). Anybody have any ideas on how I can speed this part of my code up? A faster way to zero out an array of doubles would help.

Abs

1- Using a hardware Z-buffer. (I guess it's not an appropriate solution.)

2- Using floats instead of doubles. Floats should be precise enough and are half the size.

3- Using MMX?
1. Yeah, that's not really an option since I am deliberately writing this in software with DirectDraw... however, you gave me an idea: fixed-point math and a DirectDraw hardware surface.

2. I am using floats. I didn't mean to say double.

3. I have never gotten into that, actually. I know assembly well enough to inline my procedures if need be, but that's about it. Is it fairly easy to use/learn?

Edited by - Absolution on March 19, 2001 7:01:25 PM
> however, you gave me an idea: fixed-point math and a
> DirectDraw hardware surface

Remember to store your surface in SYSTEM memory. VIDEO memory is very fast for hardware stuff but very slow if you access it with the processor.

On my PC (P3-500 with a GeForce1), assuming you render in software, writing to a SYSTEM surface is about twice as fast as writing to a VIDEO surface, and reading from a SYSTEM surface is about 20 times faster than reading from a VIDEO surface (video memory is *really* slow to read).
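
Just to illustrate, a minimal sketch of requesting a system-memory offscreen surface (assuming an already-created DirectDraw 7 interface called lpDD; the width and height are arbitrary):

// Ask DirectDraw for an offscreen surface in SYSTEM memory.
DDSURFACEDESC2 ddsd;
ZeroMemory(&ddsd, sizeof(ddsd));
ddsd.dwSize   = sizeof(ddsd);
ddsd.dwFlags  = DDSD_CAPS | DDSD_WIDTH | DDSD_HEIGHT;
ddsd.dwWidth  = 640;
ddsd.dwHeight = 480;
ddsd.ddsCaps.dwCaps = DDSCAPS_OFFSCREENPLAIN | DDSCAPS_SYSTEMMEMORY;

LPDIRECTDRAWSURFACE7 lpSurface = NULL;
if (FAILED(lpDD->CreateSurface(&ddsd, &lpSurface, NULL)))
{
    // handle the error
}

Swap DDSCAPS_SYSTEMMEMORY for DDSCAPS_VIDEOMEMORY and you can measure the difference yourself.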

> I have never gotten into that, actually. I know assembly well
> enough to inline my procedures if need be, but that's about
> it. Is it fairly easy to use/learn?

Neither have I. I've seen some ASM using MMX and it was really cryptic. However, filling a Z-buffer with 0 shouldn't be too hard. A description of the MMX instructions should be available on Intel's web site.

The good thing with MMX is that you can set 64 bits to an appropriate value with only one instruction.
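
For what it's worth, a rough sketch of such a clear loop using the MMX intrinsics (the buffer name is made up, and it is assumed to be 8-byte aligned with an even element count):

#include <mmintrin.h>

// Clear a float z-buffer to 0.0f, 64 bits (two floats) per store.
void clear_zbuffer(float* zbuf, int count)
{
    __m64  zero = _mm_setzero_si64();   // 64 bits of zero (0.0f twice)
    __m64* p    = (__m64*)zbuf;

    for (int i = 0; i < count / 2; ++i)
        p[i] = zero;                    // one 8-byte store per iteration

    _mm_empty();                        // leave MMX state so the FPU works again
}

Whether this actually beats memset depends on your compiler and memory bus; it's the 64-bit stores that help, not the instruction set as such.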
?
Are you a newbie?
Z-buffers are not used in software renderers for a very good reason: they are slow as hell. Use a coverage buffer or something like that.
There are tutorials about that on www.flipcode.com

Z/W buffers are good for hardware.
(And even there they are a big bottleneck, which only a few graphics chips avoid (PowerVR, Gigapixel and Flipper AFAIK).)

-* So many things to do, so little time to spend. *-
quote:Original post by Ingenu

?
Are you a newbie?
Z-buffers are not used in software renderers for a very good reason: they are slow as hell. Use a coverage buffer or something like that.
There are tutorials about that on www.flipcode.com



That's why Carmack used one in Quake 2, right? Oops, guess they're not slow. They're actually pretty damn fast compared to the alternatives, which require depth sorting AND polygon splitting EVERY frame.

The downside of a zbuffer is the large memory requirement, which, on a PC, is irrelevant.

I think you are the newbie here
(edit) argh! stupid html less-than signs (edit)
quote:Original post by Absolution

One problem I have noticed with the zbuffer I have implemented is that I have to zero it out every iteration of the main loop. This seems to be the largest bottleneck of my program right now. I have tried a basic loop, a memset, and I even coded it in assembly, but they all take about the same time (I'm guessing my compiler is reducing them all to the same code - i.e. the assembly). Anybody have any ideas on how I can speed this part of my code up? A faster way to zero out an array of doubles would help.

Abs


I read about this trick somewhere; I think it originated with Michael Abrash/John Carmack/id Software.

Anyway, you cut your z-buffer precision in half, but switch between the high and low ends of the [0,1] range each frame.

Then you flip your depth test each frame as well.

For example, on even frames map z to [0, 0.5] and do the normal less-than depth test. On odd frames map z to [0.5, 1] but do a greater-than test; in essence your modified z is 1 - normal z, where normal z falls between 0 and 0.5 just as on even frames.
Anyway, this should keep you from having to clear your z-buffer. You could also try negating z and swapping the test each frame to preserve full precision (i.e. map z to [0,1] one frame and to [-1,0] the next).
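
Something like this, just to illustrate (the names are made up; you still clear the buffer once at startup, e.g. to 1.0, and you need to touch the whole screen every frame):

int frame_is_even = 1;   // toggle this once per frame

// z is the interpolated depth, already mapped into [0, 0.5].
void depth_test_and_write(float* zbuffer, int offset, float z)
{
    if (frame_is_even)
    {
        if (z < zbuffer[offset]) { zbuffer[offset] = z;        /* plot the pixel */ }
    }
    else
    {
        float zodd = 1.0f - z;   // mirrored into [0.5, 1]
        if (zodd > zbuffer[offset]) { zbuffer[offset] = zodd;  /* plot the pixel */ }
    }
}

Stale values from the previous frame always sit in the "wrong" half of the range, so they never pass the current frame's test.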

Hope some of this helps.

Edited by - sjelkjd on March 20, 2001 3:34:31 PM
That's a great idea, I'm going to try that and see what results I get. If I can avoid zeroing my zbuffer I will see a huge fps increase.

Abs

p.s. As an aside, I have been converting some functions to fixed-point math. A lot of people say it is not worth it anymore since floats are just as fast on modern processors, but I think they are forgetting that you usually have to convert those floats to ints at some point, and that chews up a lot of cycles - particularly if you are doing a few conversions per pixel. With fixed-point math you might save a little on the arithmetic, but you will definitely save on the casting.
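
The sort of thing I mean (16.16 format; the span variables are just for illustration):

typedef long fixed;                       /* 16.16 fixed point */
#define TO_FIXED(f)  ((fixed)((f) * 65536.0f))
#define FIXED_INT(x) ((x) >> 16)          /* integer part - a shift, not a float-to-int */

void draw_span(float u_start, float u_step, int span_length)
{
    fixed u    = TO_FIXED(u_start);       /* two float->fixed conversions per span... */
    fixed step = TO_FIXED(u_step);

    for (int x = 0; x < span_length; ++x)
    {
        int texel = FIXED_INT(u);         /* ...instead of one (int) cast per pixel */
        /* fetch/write the pixel using texel here */
        u += step;
    }
}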
Yep, sign swapping on the Z value is how Quake did it. However, if you are not writing to the whole screen each frame this will not work very well.

If you are not rendering to the whole buffer, or even if you are, another trick is to offset the z values by a bias that you increase every frame by a large integer.

Basically (assuming an integer z-buffer, and a bump value larger than the biggest interpolated z in your scene)...

On your very first pass, init the counter and clear the buffer once:

int z_add = 0;
clear_z_buffer();                     // fill with a large value (e.g. INT_MAX)

Then for each pixel of each frame:

int zval = interpolated_z - z_add;
if (zval < *z_buffer)
{
    *z_buffer = zval;                 // write the colour here too
}

At the end of each frame, bump up z_add:

if (z_add > INT_MAX - 64000)          // about to overflow...
{
    clear_z_buffer();                 // ...so clear once and start over
    z_add = 0;
}
else
{
    z_add += 64000;
}

It could be hours before you end up clearing the zbuffer again.
Depending on the range of z in your scene you could even bump
up by a smaller value.

Edited by - ancientcoder on March 21, 2001 3:43:43 PM
Another good idea I will have to try.

I tried sign swapping with limited success. Like you mentioned, the biggest problem had to do with me not writing the entire buffer every frame, since the program I am writing right now is just a simple model viewer. If I draw a cylinder, for example, and then rotate it, I get a lot of distortion around the edge (where values from the last frame interfere). I also get sparkles within the polygons: the occasional pixel is out of place and alternates between two colours, so it looks sparkly.

I fixed the first problem by zeroing the buffer on even frames only. Just halving the number of zbuffer clears gave me a huge performance gain. I am still puzzled about the sparkling, though. I think it is due to the decrease in precision, but I'm not sure - it could also be in the way I implemented it. The way I did it was similar to what was suggested: on even frames I would map my interpolated z-values (which are in the range near-plane to far-plane) to 0.0-0.5 and do a normal if(zbuffer[offset]>ztest) test. On odd frames I would map to 1.0-0.5 (the reverse mapping) and do the opposite test. Then, like I said, I would zero the buffer on the even frames. It's definitely faster, but the loss of quality isn't worth it.

I will try the other suggestion. For the kind of program I'm doing right now it seems like a better choice. Now if I could only remember what the maximum value of a float was....

Abs


Edited by - Absolution on March 21, 2001 6:04:36 PM
