jaafit

HLSL: float4 clamping/overflow on GeForce 5?

Hi, I wrote a bloom shader based on the one in ShaderX3 and it looks great on most cards. However, I've run into some compatibility issues with older GeForce cards (GeForce 3, 4, and 5). I wonder if anyone could look at this code of mine and tell me what I'm doing wrong. I've narrowed it down to this resampling code:
	PS_OUTPUT ps_Resample( VS_OUTPUT_RESAMPLE inVert, uniform float brightness )
	{
	    PS_OUTPUT Output;

	    // Four taps of the source texture, at offsets computed in the vertex shader
	    float4 sample0 = tex2D(SourceSampler, inVert.uv0);
	    float4 sample1 = tex2D(SourceSampler, inVert.uv1);
	    float4 sample2 = tex2D(SourceSampler, inVert.uv2);
	    float4 sample3 = tex2D(SourceSampler, inVert.uv3);

	    // Average the four taps, then apply the brightness factor
	    Output.RGBColor.rgb = sample0.rgb;
	    Output.RGBColor.rgb += sample1.rgb;
	    Output.RGBColor.rgb += sample2.rgb;
	    Output.RGBColor.rgb += sample3.rgb;
	    Output.RGBColor.rgb *= 0.25;
	    Output.RGBColor.rgb *= brightness;
	    Output.RGBColor.a = 1;

	    return Output;
	}

In the code above, for each pixel I sample the texture in four different places and average the result. I use this function once to downsample the screen (brightness = 1.0), then again when blurring (brightness = 1.5). On most cards this works great; on the GeForce cards it doesn't look great at all. For some reason, the end result of the downsample is a much dimmer image than the original. It looks as though the result of each "+=" addition is clamped to 1 before being assigned back to Output.RGBColor. I tested this theory by going back to a card on which the shader works fine and putting a "saturate" call between the adds, and the resulting image was very similar to the bad result I'm seeing on the GeForce.

I tried using a separate float4 instead of Output.RGBColor to hold the sum; that didn't change anything. I also tried multiplying each sample by 0.25 before adding it to Output.RGBColor so the value never gets higher than one. That actually fixed the downsample pass, but dimmed the blur! Rounding errors, I think? What is it about these older GeForce cards that makes simple arithmetic results differ? Are their pixel shader instructions clamping everything? How can I get around this?
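For reference, the scale-first version of that sum looked roughly like this (same variables as in the shader above; just the lines that changed):

	// Attempted workaround: weight each tap by 0.25 before adding, so the
	// running total never goes above 1.
	Output.RGBColor.rgb  = sample0.rgb * 0.25;
	Output.RGBColor.rgb += sample1.rgb * 0.25;
	Output.RGBColor.rgb += sample2.rgb * 0.25;
	Output.RGBColor.rgb += sample3.rgb * 0.25;
	Output.RGBColor.rgb *= brightness;   // the blur still comes out dim when brightness = 1.5
	Output.RGBColor.a = 1;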

Pixel shader 1.x was only meant for values from 0..1. I'm surprised the problem exists on a GeForce FX card, but the 3 and 4 aren't surprising.

Pixel shader 1.x also only allows constants between 0 and 1, so a brightness of 1.5 will get clamped to 1. You could pass in a brightness of 0.5 and 0.75 instead and have the shader use brightness*2. This won't even take an extra instruction in ps_1_1; it will just use the _x2 modifier on the instruction.
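Roughly, the whole function could end up looking like this. This is an untested sketch that combines the scale-each-tap-by-0.25 trick from your post with the halved constant; on the app side you'd set the constant to half the real brightness:

	PS_OUTPUT ps_Resample( VS_OUTPUT_RESAMPLE inVert, uniform float brightness )
	{
	    PS_OUTPUT Output;

	    float4 sample0 = tex2D(SourceSampler, inVert.uv0);
	    float4 sample1 = tex2D(SourceSampler, inVert.uv1);
	    float4 sample2 = tex2D(SourceSampler, inVert.uv2);
	    float4 sample3 = tex2D(SourceSampler, inVert.uv3);

	    // Weight each tap by 0.25 before summing so the running total stays
	    // inside the 0..1 range that ps_1_x hardware clamps to.
	    Output.RGBColor.rgb  = sample0.rgb * 0.25;
	    Output.RGBColor.rgb += sample1.rgb * 0.25;
	    Output.RGBColor.rgb += sample2.rgb * 0.25;
	    Output.RGBColor.rgb += sample3.rgb * 0.25;

	    // The app passes half the real brightness (0.5 for the downsample,
	    // 0.75 for the blur); the *2 can compile down to the _x2 modifier.
	    Output.RGBColor.rgb *= brightness * 2;
	    Output.RGBColor.a = 1;

	    return Output;
	}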

Okay thanks. That explains a lot.

I also noticed earlier today that changing the brightness parameter didn't change anything unless it was exactly 2.0! 1.9 and 2.1 would look the same as 1.0, but 2.0 would be insanely bright. It must be ignoring the brightness multiplication unless brightness is exactly 2, in which case it uses the _x2 modifier you mentioned? Strange... very strange.

