shader compile flags destroy my shaders (SOLVED)


Hi, my shader works fine on my graphics card (nVidia 6600) but doesn't work on an ATI Radeon 9600. I enabled D3DXSHADER_FORCE_PS_SOFTWARE_NOOPT and the VS equivalent on my computer to try to find the problem, and now I have an entirely black screen. What exactly do these flags do apart from skipping optimizations on the shaders? The DirectX docs say they "Force the compiler to compile against the next highest available software target for pixel shaders", but I don't know what that means. Thanks. [Edited by - jrmcv on September 18, 2007 11:57:04 AM]

These flags allow the shader compiler to use more resources than the hardware profile allows. This is useful if a shader refuses to compile for one of the hardware profiles: you can then take the assembler output and check what the problem is. But such shaders will never run on real hardware.
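For reference, they are just extra flags passed to the compile/effect-creation call. Something like this (an untested sketch; the file name is only a placeholder and error handling is simplified):

// Debug-style compile: skip optimizations and force the software targets.
// Only useful for inspecting the generated assembly; don't ship with these flags.
DWORD flags = D3DXSHADER_DEBUG
            | D3DXSHADER_SKIPOPTIMIZATION
            | D3DXSHADER_FORCE_VS_SOFTWARE_NOOPT
            | D3DXSHADER_FORCE_PS_SOFTWARE_NOOPT;

ID3DXEffect* effect = NULL;
ID3DXBuffer* errors = NULL;
HRESULT hr = D3DXCreateEffectFromFile(device, "lighting.fx", NULL, NULL,
                                      flags, NULL, &effect, &errors);
if (FAILED(hr) && errors)
    OutputDebugStringA((const char*)errors->GetBufferPointer());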

A 9600 should support shader model 2.0, so that's probably not the problem. It may be a difference in hardware support for something else, like texture formats, or data types such as half precision.

There may be other differences too; sometimes ATI's drivers are less forgiving than NVIDIA's about certain mistakes.

If you force the PS to compile for a software target, it will only run on the REF device, which I assume is not what you're using.

I'd suggest running on REF to see what results you get, as it'd help you see whether the problem is in NVIDIA's drivers or AMD's.
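Switching to REF is just a different device type at creation time, roughly like this (a sketch only; d3d, hwnd and presentParams are whatever you already use):

// Create a reference-rasterizer device instead of a HAL device.
// REF is very slow but implements the full spec, so it's useful as ground truth.
IDirect3DDevice9* device = NULL;
HRESULT hr = d3d->CreateDevice(D3DADAPTER_DEFAULT,
                               D3DDEVTYPE_REF,   // instead of D3DDEVTYPE_HAL
                               hwnd,
                               D3DCREATE_SOFTWARE_VERTEXPROCESSING,
                               &presentParams,
                               &device);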

Thanks again for the replies.

The program runs fine under REF on mine, but I don't know what it's like on his, and he's reluctant to start installing the SDK. If it runs under REF on mine it should on his too, right? Because it's just a simulation of the hardware standard.

It looks as if some of the object transforms are being ignored; a lot of the objects always end up in one place at the bottom of the screen. I don't understand this because it works on mine and I haven't seen any errors. Has anyone else seen that? Any ideas on where to go next, because I'm pretty stuck? I have tried moving stuff around and turning off optimizations in the shaders and it still happens. I know his card supports shader model 3.0 too, because he's run other things of mine that use it. He also gets the same FPS as me, which he usually does, so that implies it is still doing the work.


EDIT: I've also tried changing the shader target model and stripping the shader down to just basic transforms, and it still doesn't work.

EDIT2: Are floating point textures (A16B16G16R16F) clamped on any hardware? Are they definitely signed across hardware? (I ask because one of these textures contributes to the transform matrix.)



P.S.: Sorry, I know it's off topic, but it's sort of on topic at the same time. :-)

Quote:
Original post by jrmcv
EDIT2: Are floating point textures (A16B16G16R16F) clamped on any hardware? Are they definitely signed across hardware? (I ask because one of these textures contributes to the transform matrix.)

I have no idea, but it looks like a good direction to check. Are the textures created at runtime or beforehand? If at runtime, then saving them on his machine for comparison might give you a clue.
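If they're built at runtime, something along these lines on both machines would let you diff the contents (a sketch; the file name and texture variable are placeholders):

// Save the runtime-generated texture so it can be compared across machines.
// DDS keeps the A16B16G16R16F data without any format conversion.
HRESULT hr = D3DXSaveTextureToFile("worldpos_dump.dds",
                                   D3DXIFF_DDS,
                                   worldPosTexture,
                                   NULL);
if (FAILED(hr))
    OutputDebugStringA("Failed to save texture dump\n");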

Quote:
Original post by jrmcv
The program runs fine under REF on mine, but I don't know what it's like on his, and he's reluctant to start installing the SDK. If it runs under REF on mine it should on his too, right? Because it's just a simulation of the hardware standard.
Yes, there's no need to run the SDK+REF on his hardware as it will be identical.

What your results are telling you is that the REF agrees with your hardware, therefore you're using some sort of feature or capability that the Radeon 9600 doesn't support.

I had a similar issue a year or so back where ATI hardware didn't clamp values for SM1 targets but the NV hardware did (NV was correct), so these things do happen...

Quote:
Original post by jrmcv
I have tried moving stuff around, and turning off optimizations in shaders and it still happens.
Disabling optimizations isn't as useful as you might think. A lot of the time it'll blow the instruction limit and fail to compile - that's OK if you're targeting longer SM3 shaders, but it does make things hard when running SM2 shaders.

Quote:
Original post by jrmcv
it still doesn't work.
Unless I missed it, you haven't quoted any return codes or debug warnings. Are you 110% sure that every resource is successfully created and every shader compiled? It could be that you've done something that is unsupported, a creation call failed, and you're rendering with a missing resource bound to your pipeline...
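For example, I'd check every render-target creation along these lines (a sketch; the names are placeholders for whatever you use):

// Check the HRESULT of every resource creation - a failed render-target
// creation on one card would leave you rendering into nothing.
IDirect3DTexture9* worldPosRT = NULL;
HRESULT hr = device->CreateTexture(width, height, 1,
                                   D3DUSAGE_RENDERTARGET,
                                   D3DFMT_A16B16G16R16F,
                                   D3DPOOL_DEFAULT,
                                   &worldPosRT, NULL);
if (FAILED(hr))
{
    OutputDebugStringA("A16B16G16R16F render target creation failed\n");
    // fall back or bail out here rather than continuing with a NULL target
}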

Running via PIX is a great idea, but that'll be hard if he doesn't have the SDK. Another option is to download the NV shader analyser tools to see if they complain about anything.

Quote:
Original post by jrmcv
EDIT2: Are floating point textures (A16B16G16R16F) clamped on any hardware? Are they definitely signed across hardware? (I ask because one of these textures contributes to the transform matrix.)
They should be standard half-precision floating point values with no clamping.

hth
Jack

Hi again, and thanks for the reply.

I think I have narrowed the problem down to that shader not running. It runs all the shaders before and after it but ignores this one. The matrices aren't set up through the fixed pipeline (all the SetTransform calls use identity matrices), so if the shader isn't running that explains why nothing is being transformed correctly.

I have the Effect->Begin and Effect->BeginPass calls wrapped in FAILED() checks but still get no errors. I have no debug errors on mine apart from redundant state changes, and I can't run the debugger on my friend's computer to get more information. Has anyone seen this before and/or have ideas on what might be going on?



Here's a simplified idea of my code, if it helps (deferred shading/rendering... sort of):

// Step 1: draw albedo textures and material properties
SetRenderTarget(0, albedo);    // A8R8G8B8
SetRenderTarget(1, materials); // A8R8G8B8
Effect->Begin();
for each object
    // set up textures and matrices here...
    Effect->BeginPass(0);
    Draw();
    Effect->EndPass();
next
Effect->End();

// Step 2: draw normals and world positions
SetRenderTarget(0, Normals);  // A16B16G16R16F
SetRenderTarget(1, WorldPos); // A16B16G16R16F
Effect->Begin();
for each object
    // set up textures and matrices here...
    Effect->BeginPass(0);
    Draw();
    Effect->EndPass();
next
Effect->End();

// Step 3: draw ambient pass as a full-screen quad
SetRenderTarget(0, LightAccumulate); // A8R8G8B8
SetRenderTarget(1, NULL);
Effect->Begin();
// set up textures and matrices here...
Effect->BeginPass(0);
DrawQuad();
Effect->EndPass();

// Step 4: (where it all goes wrong) draw lights
for each light
    // set up textures and matrices here...
    Effect->BeginPass(1);
    Draw();
    Effect->EndPass();
next light
Effect->End();





I'm open to any random stabs in the dark too if you're not sure, because I'm running out of ideas.

Thanks again

EDIT: Apologies for the fluctuation between C and BASIC; I thought it might be easier to skim through. I'm looking into the nVidia shader analysis tool, but will that work with ATI? That's the card I'm having problems with; nVidia is fine.

Yeah, here it is. It's pretty messy (and has few comments) because I've been playing with it quite a bit trying to find the problem, but see what you can make of it.



struct VS_OUTPUT_LIGHT
{
    float4 Position : POSITION;  // vertex position
    float4 ScreenUV : TEXCOORD0; // screen-space UV coordinates for referencing the G-buffer
    float  dist     : TEXCOORD1; // not used anymore, will be removed
};

VS_OUTPUT_LIGHT RenderSceneVS_OMNI( float4 vPos : POSITION, float2 texUV : TEXCOORD0 )
{
    VS_OUTPUT_LIGHT Output;
    float4 ppos = mul(vPos, worldViewProjection);
    Output.Position = ppos;

    Output.ScreenUV = ppos;
    float zDist = distance(viewInverse[3].xyz, mul(vPos, mWorld));
    Output.dist = zDist;
    return Output;
}


PS_OUTPUT_ZP RenderScenePS_OMNI( VS_OUTPUT_LIGHT In )
{
    PS_OUTPUT_ZP Output;
    In.ScreenUV.xy /= In.ScreenUV.w;
    Output.RGBColor = float4(0,0,0,0);

    float2 texelpos = 0.5 * In.ScreenUV.xy + float2(0.5, 0.5);
    texelpos.y = 1.0f - texelpos.y;

    float3 wPos = tex2D(gb4_sampler, texelpos).rgb + viewInverse[3].xyz;

    float3 LPOS = LightPos;
    float dist = distance(wPos, LPOS);
    float intensity;
    dist = max(0, 1 - (dist / (LightDist * 0.85)));
    clip(dist);
    float4 mapNorm = tex2D(gb3_sampler, texelpos);

    float3 vLDIR = normalize(wPos.xyz - LightPos);

    intensity = max(0, dot(vLDIR, -mapNorm.xyz)) * min(1, dist * 1.3);
    clip(intensity);

    float3 albedo = tex2D(gb1_sampler, texelpos).rgb;

    float3 E = normalize(wPos - viewInverse[3].xyz);
    float3 HalfVec = normalize(E + vLDIR);
    float spec = min(1, pow(max(0, dot(-mapNorm.xyz, HalfVec)), 50) * dist);

    float3 diffuse = float3(1,1,1) * intensity;
    Output.RGBColor = float4(diffuse * albedo + spec * 0.9, 1);
    return Output;
}



Thanks

How about the technique declaration for that shader code? Could be quite important! Also, what's the declaration of PS_OUTPUT_ZP? Your previous pseudocode suggests MRT (the SetRenderTarget() calls) but your HLSL fragment doesn't.

Have you tried using ID3DXEffect::ValidateTechnique() and asserting on the result?

Quote:
I have the Effect->Begin and Effect->BeginPass calls wrapped in FAILED() checks but still get no errors.
How about the code where you actually create the effects and retrieve handles? Compiling an effect also returns a buffer of error messages when appropriate.
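Roughly what I mean, as a sketch (the file name and technique name here are placeholders for whatever you use):

ID3DXEffect* effect = NULL;
ID3DXBuffer* errors = NULL;
HRESULT hr = D3DXCreateEffectFromFile(device, "lighting.fx", NULL, NULL,
                                      0, NULL, &effect, &errors);
if (errors) // warnings and errors both end up in this buffer
    OutputDebugStringA((const char*)errors->GetBufferPointer());
if (FAILED(hr))
    return hr;

// Check that the device can actually run the technique before using it.
D3DXHANDLE tech = effect->GetTechniqueByName("YourTechnique");
if (FAILED(effect->ValidateTechnique(tech)))
    OutputDebugStringA("Technique does not validate on this device\n");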

Quote:
I'm looking into the nVidia shader analysis tool, but will that work with ATI? That's the card I'm having problems with; nVidia is fine.
Got things muddled up, you want the AMD/ATI shader analyzer then!


Jack

Hi again. The error buffer is being checked and doesn't report anything. Sorry, this isn't the shader used to build the G-buffer, which uses MRT; I think those passes are OK because he can switch to viewing them in the application and they sound like they're working.



sampler gb1_sampler = sampler_state // albedo
{
    texture   = <texGB1>;
    AddressU  = WRAP;
    AddressV  = WRAP;
    AddressW  = WRAP;
    MIPFILTER = LINEAR;
    MINFILTER = LINEAR;
    MAGFILTER = LINEAR;
};
sampler gb2_sampler = sampler_state // material / specular level / power
{
    texture   = <texGB2>;
    AddressU  = WRAP;
    AddressV  = WRAP;
    AddressW  = WRAP;
    MIPFILTER = POINT;
    MINFILTER = POINT;
    MAGFILTER = POINT;
};
sampler gb3_sampler = sampler_state // normals in world space
{
    texture   = <texGB3>;
    AddressU  = WRAP;
    AddressV  = WRAP;
    AddressW  = WRAP;
    MIPFILTER = LINEAR;
    MINFILTER = LINEAR;
    MAGFILTER = LINEAR;
};
sampler gb4_sampler = sampler_state // world position
{
    texture   = <texGB4>;
    AddressU  = WRAP;
    AddressV  = WRAP;
    AddressW  = WRAP;
    MIPFILTER = LINEAR;
    MINFILTER = LINEAR;
    MAGFILTER = LINEAR;
};


struct PS_OUTPUT_ZP
{
    float4 RGBColor : COLOR0;
};

technique RenderLighting
{
    pass P0 // AMBIENT
    {
        zenable = false;

        VertexShader = compile vs_2_0 RenderSceneVS_AMBIENT( );
        PixelShader  = compile ps_2_0 RenderScenePS_AMBIENT( );
    }
    pass P1 // OMNI
    {
        alphablendenable = true;
        srcblend         = one;
        destblend        = one;
        zwriteenable     = false;

        VertexShader = compile vs_3_0 RenderSceneVS_OMNI( );
        PixelShader  = compile ps_3_0 RenderScenePS_OMNI( );
    }
}


I haven't included the whole FX file because there's quite a lot and I didn't want to clutter the post. Pass 0 does work... but pass 1 doesn't.

Originally these were both SM2, but I changed them to SM3 just to make sure there were no problems with the shader model.

Two things immediately jump out here:

  1. Radeon 9600s don't support shader model 3, so the second technique will not work.
  2. Radeon 9600s can't do any blending or filtering with floating point textures, so that won't work either.


For #2 I'm not 100% sure (from a quick glance) whether you're using it, but if you do have it enabled then the driver might still be rejecting your draw calls.

Provided the above is correct, the lesson to be learnt is "enumerate, enumerate, enumerate" [grin]
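Something along these lines at startup would catch both issues (a sketch only; the adapter format and error handling are simplified, and d3d/device are whatever you already have):

// 1. Check the pixel shader version the card actually exposes.
D3DCAPS9 caps;
device->GetDeviceCaps(&caps);
if (caps.PixelShaderVersion < D3DPS_VERSION(3, 0))
    OutputDebugStringA("No SM3 - fall back to a ps_2_0 technique\n");

// 2. Check whether A16B16G16R16F textures support filtering and
//    post-pixel-shader blending on this device.
HRESULT canFilter = d3d->CheckDeviceFormat(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL,
                                           D3DFMT_X8R8G8B8,
                                           D3DUSAGE_QUERY_FILTER,
                                           D3DRTYPE_TEXTURE,
                                           D3DFMT_A16B16G16R16F);
HRESULT canBlend  = d3d->CheckDeviceFormat(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL,
                                           D3DFMT_X8R8G8B8,
                                           D3DUSAGE_QUERY_POSTPIXELSHADER_BLENDING,
                                           D3DRTYPE_TEXTURE,
                                           D3DFMT_A16B16G16R16F);
if (FAILED(canFilter) || FAILED(canBlend))
    OutputDebugStringA("No FP16 filtering/blending - use POINT sampling and avoid blending into FP targets\n");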

hth
Jack

I'll try changing them and see if that works. I can't find out until tomorrow because my friend is out at the moment. I presumed Effect->Begin() would return an error if anything failed to compile for the available hardware; that's the reason I didn't want to use precompiled shaders.

Thanks for the help

I've had SM3/SM4 hardware for so long now that I've not had anything fail to compile or load in my apps, but I wouldn't be so sure of Effect->Begin() failing if you use an SM3 shader on non-SM3 hardware. You'll probably get some debug output, but it's not uncommon for draw (and related) calls to silently fail...
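One way to guard against that (a sketch, assuming the effect file also declares a ps_2_0 fallback technique) is to let D3DX pick a technique the device can actually run:

// Ask D3DX for the first technique in the effect that validates on this device.
// If the SM3 technique can't run here, this falls through to the SM2 fallback,
// assuming the effect file declares one.
D3DXHANDLE validTech = NULL;
if (SUCCEEDED(effect->FindNextValidTechnique(NULL, &validTech)) && validTech)
    effect->SetTechnique(validTech);
else
    OutputDebugStringA("No valid technique found on this device\n");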

hth
Jack

Quote:
Original post by jollyjeffers
Two things immediately jump out here:

  1. Radeon 9600s don't support shader model 3, so the second technique will not work.
  2. Radeon 9600s can't do any blending or filtering with floating point textures, so that won't work either.


Correct. FP16 blending was introduced with the GeForce 6800, and later on the Radeon X1800. FP16 filtering was present on the 6800, but did not appear on Radeon cards until the HD 2900 XT.

