# Differences between SM2 and SM3 renders?


Take a look at this: Those two images were created using the exact same pixel shader code, but compiled for SM2 and SM3 respectively. In SM2 the specular highlights are fugly - they're huge and bright, almost as if they were vertex-based highlights instead of per-pixel (they're not, I double- and triple-checked that one just to be sure).

So is this a difference in rendering models, or perhaps in shader compilers, between SM2 and SM3? Just for reference, here's the shader code (just in case I'm doing something stupid which is causing the weird highlights):
```hlsl
float4 SM2Light(float4 inCol, float2 tc0, float3 norm, float3 vpos)
{
    float4 c = tex2D(diffMap, tc0);

    float4 outCol = inCol*c; // multiply texture color by accumulated vertex lighting color

    // now apply sunlight
    if (sunlight.type==2) // sunlight must be type 2 or it's off
    {
        float3 diffy, l, speccy;
        float a,t,r;

        diffy = c.rgb;
        // colors[1][0] thru [1][2] are the specular color
        float3 speccyColor = float3(colors[1][0], colors[1][1], colors[1][2])*c.rgb;
        l = normalize(-sunlight.pos); // note pos is actually dir in sunlight
        norm = normalize(norm);
        a = max(dot(norm, l), 0.0f); // alter light based on angle of incidence
        diffy.rgb *= (sunlight.col.rgb)*a; // standard lighting mul'd by angle of incidence

        //specular lighting here
        r = reflect(-l, norm);
        //t = pow(abs(max(dot(r, normalize(eyePos - vpos)), 0.50f)), 12);
        t = pow(abs(max(dot(r, normalize(eyePos - vpos)), 0.0f)), colors[1][3]);
        // colors[1][3] is the specular power

        speccy.rgb = sunlight.specMod*t*(speccyColor.rgb * sunlight.col.rgb);

        outCol.rgb += diffy.rgb + speccy.rgb;
    }

    // mul to match mesh material diffuse color settings...
    outCol *= float4(colors[0][0], colors[0][1], colors[0][2], colors[0][3]);

    // return the modified color
    return outCol;
}

// this is the main entry point for the pixel shader
float4 mainPixel(float4 inCol : TEXCOORD3, float2 tc0 : TEXCOORD0,
                 float3 norm : TEXCOORD1, float3 vpos : TEXCOORD2) : COLOR0
{
    float4 col = SM2Light(inCol, tc0, norm, vpos);
    return col;
}
```



##### Share on other sites
SM 3.0 requires a vertex shader to be present, otherwise you'll get garbage.
And if you're using one, please post it.

Cheers
Dark Sylinc

##### Share on other sites
Yep, using a vertex shader. I'll post it, but as it's SM2 that's acting up, I don't see how it'll help much...

```hlsl
OutputVS SM3Vertex(float3 posL,
                   float3 N,
                   float2 tc0)
{
    // do your vertex shader code here and return an Output VS object...
    OutputVS outVS = (OutputVS)0; // zero the output data

    outVS.posH = mul(float4(posL, 1.0f), mWVP); // push out the modified vertex data...

    float3 vpw = mul(float4(posL,1.0f), worldMat).xyz; // get vertex pos in world space

    float3 norm = mul(float4(N, 0.0f), worldInvMat).xyz;
    norm = normalize(norm);

    outVS.vpos = vpw;
    outVS.norm = norm;
    outVS.tex0 = tc0;

    return outVS;
}
```

The SM2 version is almost identical, but with some added vertex lighting (which, before you ask, is not relevant as it's for point lights, and the only light in the test scene is sunlight)

##### Share on other sites
OK:

//Shouldn't it be 1.0f instead of 0.0f?
float3 norm = mul(float4(N, 0.0f), worldInvMat).xyz;

Quote:
 outVS.vpos = vpw; outVS.norm = norm; outVS.tex0 = tc0;

So, you output 1 position and 2 tex coordinates from the vertex shader.
But the pixel shader takes 4 tex coordinates as input.

And you compute "float4 outCol = inCol*c;" regardless of the conditional "sunlight.type".
This means you will get undefined behaviour, probably handled differently in SM 2.0 and 3.0.

Cheers
Dark Sylinc

##### Share on other sites
Try clamping the lower bound of the exponent to pow to 0.001

##### Share on other sites
Quote:
 //Shouldn't it be 1.0f instead of 0.0f?
 float3 norm = mul(float4(N, 0.0f), worldInvMat).xyz;

Um..no? Normals multiplied by the world inverse transpose matrix need to have a w component of 0. Just to be sure I tried it with 1 instead, but it didn't fix anything.
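A minimal sketch of why the w component doesn't matter for the part that's kept. The helper name `TransformNormal` and the parameter `worldInvTranspose` are made up for illustration; the only assumption is that the matrix passed in really is the world inverse transpose.

```hlsl
// Hypothetical helper, not part of the original vertex shader.
// A normal is a direction, so only the upper-left 3x3 of the matrix is used.
float3 TransformNormal(float3 n, float4x4 worldInvTranspose)
{
    // Padding with w = 0 and taking .xyz reads exactly the same nine
    // matrix elements as multiplying by the float3x3 cast of the matrix.
    float3 viaW0   = mul(float4(n, 0.0f), worldInvTranspose).xyz;
    float3 viaCast = mul(n, (float3x3)worldInvTranspose);

    // Both give the same vector; return one of them, renormalized.
    return normalize(viaCast + 0.0f * viaW0);
}
```

Either form on its own would do in the vertex shader; both are computed here only to show that they're interchangeable.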

Quote:
 So, you output 1 position and 2 tex coordinates from the vertex shader. But the pixel shader takes 4 tex coordinates as input. And you compute "float4 outCol = inCol*c;" regardless of the conditional "sunlight.type". This means you will get undefined behaviour, probably handled differently in SM 2.0 and 3.0.

All the outputs from the vertex shader go through TEXCOORD0 - TEXCOORD4, except of course for POSITION0, which is the screen space position. The 'vpos' variable going into the pixel shaders is the world space of the vertex, and is fed through using a TEXCOORD register.

The outCol = inCol*c line takes the standard lighting output of the vertex shader (the inCol var) and multiplies it by the sampled texture color to give a diffuse color starting point for the lighting algorithm. These vars have all been tested and are storing/passing in appropriate values.

Quote:
 Try clamping the lower bound of the exponent to pow to 0.001

Nice idea, but it had no effect :(

##### Share on other sites
I found it, and yes, it was something monumentally stupid.

In the shader I compute the reflect vector, then attempt to store it in a single float, instead of a float3. The worst part was, I debugged that shader 10+ times in PIX, and each time I saw the variable "r" pop up in the vars list with just a single value, and the penny didn't drop.

I wish shaders had casting checks like C++ - it would save me so much time when I do stupid things like this, which is about 3 times a week on average :)
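For anyone hitting the same thing, here's a minimal sketch of the specular term with the reflection vector kept as a `float3`. The helper name `SpecularFactor` and its parameters are made up for illustration; in the shader from the first post, the fix amounts to declaring `r` as `float3` instead of `float`.

```hlsl
// Hypothetical helper (not in the original shader): specular factor with the
// reflection vector stored in a float3 rather than a float.
float SpecularFactor(float3 l, float3 norm, float3 toEye, float specPower)
{
    // reflect() returns a vector; assigning it to a float would truncate it
    // to its x component and throw away y and z.
    float3 r = reflect(-l, norm);

    // Cosine of the angle between the reflected light and the view direction,
    // clamped so back-facing reflections contribute nothing.
    float rDotV = max(dot(r, normalize(toEye)), 0.0f);

    return pow(rDotV, specPower);
}
```

In the original code this would be called as `t = SpecularFactor(l, norm, eyePos - vpos, colors[1][3]);`. As far as I know, fxc reports this kind of assignment as warning X3206 (implicit truncation of vector type), so it's worth reading the compile output rather than just checking that it compiled.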

##### Share on other sites
Quote:
Original post by Damocles
Quote:
 Try clamping the lower bound of the exponent to pow to 0.001

Nice idea, but it had no effect :(

It's a good idea in general, because pow on shader hardware is an approximation that is done with exp and log instructions, and the result is undefined when the exponent is 0. It's also one of the few areas where different hardware is allowed to differ, for some reason.
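A minimal sketch of that clamp. The name `SafePow` is made up; the 0.001 lower bound on the exponent is the value suggested earlier in the thread, and the small lower bound on the base is an extra assumption of mine to keep log away from 0, not something anyone here tested.

```hlsl
// Hypothetical helper: pow with both arguments kept away from zero.
float SafePow(float x, float y)
{
    // Assumption: a tiny positive floor on the base avoids log(0).
    float safeBase = max(x, 0.0001f);

    // Lower bound on the exponent, as suggested in this thread.
    float safeExponent = max(y, 0.001f);

    return pow(safeBase, safeExponent);
}
```

It wouldn't have fixed the highlight problem above, but it keeps the specular term from depending on how a particular GPU handles the degenerate cases.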

##### Share on other sites
Quote:
 It's a good idea in general, because pow on shader hardware is an approximation that is done with exp and log instructions, and the result is undefined when the exponent is 0. It's also one of the few areas where different hardware is allowed to differ, for some reason.

That's good to know. Thanks for the tip.
