nVidia HLSL Bug on GeForce 480+ / Quadro 4000

I've got a problem...

When we ship our product, I precompile our shaders with fxc.
I get the occasional "didn't work" complaint. Usually that turns out to be a bug
on the client side when handing up the constants and render state.
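For context, a minimal sketch (not our actual loader) of what I mean by handing things up from the client side: load the precompiled effect and push one constant by semantic. The file name, function names, and matrix are illustrative only.

    // Minimal sketch only, not the shipping code: load a precompiled .fxo and
    // push one constant by its semantic, the way the client side hands things up.
    #include <d3d9.h>
    #include <d3dx9.h>   // link with d3dx9.lib

    ID3DXEffect* LoadPrecompiledEffect(IDirect3DDevice9* device)
    {
        ID3DXEffect* effect = NULL;
        ID3DXBuffer* errors = NULL;

        // The .fxo was built offline with fxc.
        if (FAILED(D3DXCreateEffectFromFileA(device, "chrome.fxo",
                                             NULL, NULL, 0, NULL,
                                             &effect, &errors)))
        {
            if (errors) errors->Release();
            return NULL;
        }
        return effect;
    }

    void HandUpConstants(ID3DXEffect* effect, const D3DXMATRIX& worldViewProj)
    {
        // Constants are bound by semantic, matching WORLDVIEWPROJECTION in the .fx.
        D3DXHANDLE h = effect->GetParameterBySemantic(NULL, "WORLDVIEWPROJECTION");
        effect->SetMatrix(h, &worldViewProj);
    }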

I got a report from our QA engineer that the chrome is broken
on the new GeForce card. The same shader [both precompiled and compiled from source]
runs on everything from the GeForce 6800 up through the 9xxx series.

On the new card, the shader literally doesn't draw anything.
It also trashes other meshes in the scene.

The bug seems to be related to the Wrap0 render state. If I comment it out,
the reflection map appears. With it there, nada.

In this repro the wrap isn't that important; however, in the real shader
I've got another set of 2D textures I'm reading from and using the coordinates--
so I have to have the Wrap0 set.
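For reference, the Wrap0 state can also be driven from the application rather than the technique; a sketch of the equivalent D3D9 call, assuming a valid device pointer:

    #include <d3d9.h>

    // Sketch: the FX state "Wrap0 = U|V|W;" expressed as the equivalent render
    // state for texture coordinate set 0. Passing 0 clears the wrap flags again.
    void SetWrap0(IDirect3DDevice9* device, bool enable)
    {
        device->SetRenderState(D3DRS_WRAP0,
            enable ? (D3DWRAP_U | D3DWRAP_V | D3DWRAP_W) : 0);
    }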

Edit: tested on the Quadro; it's broken there as well.
//I've got a Quadro 4k around here, somewhere. If it doesn't work there
//then I can actually submit a bug to nvidia; however, they usually ignore
//GeForce bugs.


Here is a sample cube map...
http://members.gamedev.net/farplane/BlowersGallery.dds.zip

float4x4 worldViewProj : WORLDVIEWPROJECTION;
float4x4 worldView     : WORLDVIEW;
float4x4 viewI         : VIEWINVERSE;
float4x4 worldIT       : WORLDINVERSETRANSPOSE;
float4x4 world         : WORLD;
float4x4 worldI        : WORLDINVERSE;

texture CubeMap
<
    string ResourceName = "BlowersGallery.dds";
    string ResourceType = "Cube";
>;

samplerCUBE CubeMapSampler = sampler_state
{
    Texture   = <CubeMap>;
    MinFilter = Linear;
    MagFilter = Linear;
    MipFilter = Linear;
    AddressU  = WRAP;
    AddressV  = WRAP;
    AddressW  = WRAP;
};

struct VS_INPUT
{
    float4 Position : POSITION0;
    float3 Normal   : NORMAL0;
    float3 Binormal : BINORMAL0;
    float3 Tangent  : TANGENT0;
    float2 UV       : TEXCOORD0;
    float2 UV1      : TEXCOORD1;
    float2 UV2      : TEXCOORD2;
};

struct VS_OUTPUT
{
    half4  Position       : POSITION;
    half4  Diffuse        : COLOR0;
    half4  Specular       : COLOR1;
    half4  UVPrevMainMain : TEXCOORD0;
    half4  UVMaskAlpha    : TEXCOORD1;
    half4  UVEnv          : TEXCOORD2;
    float3 WorldNormal    : TEXCOORD3;
    float4 WorldPosition  : TEXCOORD4;
    float3 WorldView      : TEXCOORD5;
    float3 ModelNormal    : TEXCOORD6;
};

VS_OUTPUT mainVS(VS_INPUT IN)
{
    VS_OUTPUT OUT = (VS_OUTPUT)0;

    OUT.Position    = mul(float4(IN.Position.xyz, 1.0), worldViewProj);
    OUT.WorldNormal = mul(IN.Normal, worldIT);
    OUT.WorldView   = normalize(viewI[3].xyz - mul(IN.Position, world));

    return OUT;
}

float4 mainPS(VS_OUTPUT IN) : COLOR
{
    float4 PS = float4(0.0f, 0.0f, 0.0f, 0.0f);

    float3 N = normalize(IN.WorldNormal);
    float3 V = normalize(IN.WorldView);
    float3 R = reflect(-V, N);

    PS = texCUBE(CubeMapSampler, R);

    return PS;
}

technique technique_main
{
    pass pass0
    {
        //Wrap0 = U|V|W;
        VertexShader = compile vs_3_0 mainVS();
        PixelShader  = compile ps_3_0 mainPS();
    }
}


[Edited by - Mathucub on November 10, 2010 9:37:28 AM]
Quote:Original post by Mathucub

In this repro the wrap isn't that important; however, in the real shader
I've got another set of 2D textures I'm reading from and using the coordinates--
so I have to have the Wrap0 set.


You are using a cube sampler in the shader you posted; what do you need the wrap for in the "real" shader? The coordinates are normalized direction vectors, and wrapping just means 5.5 maps to 0.5 in planar coordinates.

In case you do something completely different in this real shader, like using a 2D sampler instead of a cube one, try Wrap0 = U|V, without the W component.

I don't see anything wrong.

A) Check the DX debug runtimes for anything unusual.
B) I see you're using COLOR0 & COLOR1; try using TEXCOORDs instead. I wouldn't be surprised if the newer GPU is assigning the COLORn semantics to the same physical registers as one of the other TEXCOORDn, thus producing garbage.
C) Check for buffer overruns when writing to your vertex buffers from your application. They have the annoying property that everything shows up correctly and goes unnoticed until the code runs on a different GPU (a sketch of that kind of guard is below).
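Regarding (C), a rough sketch of the guard I mean; the vertex layout and buffer here are placeholders for whatever your app actually writes:

    #include <d3d9.h>
    #include <cassert>
    #include <cstring>

    // Hypothetical vertex layout; substitute the application's real one.
    struct Vertex { float pos[3]; float normal[3]; float uv[2]; };

    void FillVertexBuffer(IDirect3DVertexBuffer9* vb,
                          const Vertex* src, UINT vertexCount)
    {
        D3DVERTEXBUFFER_DESC desc;
        vb->GetDesc(&desc);

        // The guard: never write more bytes than the buffer was created with.
        const UINT bytesToWrite = vertexCount * sizeof(Vertex);
        assert(bytesToWrite <= desc.Size && "vertex buffer overrun");

        void* dst = NULL;
        if (SUCCEEDED(vb->Lock(0, bytesToWrite, &dst, 0)))
        {
            memcpy(dst, src, bytesToWrite);
            vb->Unlock();
        }
    }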

Cheers
Dark Sylinc
Quote:Original post by JohnnyCode
Quote:Original post by Mathucub

In this repro the wrap isn't that important; however, in the real shader
I've got another set of 2D textures I'm reading from and using the coordinates--
so I have to have the Wrap0 set.


You are using a cube sampler in the shader you posted; what do you need the wrap for in the "real" shader? The coordinates are normalized direction vectors, and wrapping just means 5.5 maps to 0.5 in planar coordinates.

In case you do something completely different in this real shader, like using a 2D sampler instead of a cube one, try Wrap0 = U|V, without the W component.



In the real shader there are four other 2D textures [wrap, prior wrap, light map,
and distortion]. I automatically extrude meshes from 2D to 3D, so I have
to have the monotonic wrap enabled.

The first thing I tried was getting rid of W; however, having only U or V present
also fails.
Quote:Original post by Matias Goldberg
I don't see anything wrong.

A) Check the DX debug runtimes for anything unusual.
B) I see you're using COLOR0 & COLOR1; try using TEXCOORDs instead. I wouldn't be surprised if the newer GPU is assigning the COLORn semantics to the same physical registers as one of the other TEXCOORDn, thus producing garbage.
C) Check for buffer overruns when writing to your vertex buffers from your application. They have the annoying property that everything shows up correctly and goes unnoticed until the code runs on a different GPU.

Cheers
Dark Sylinc


Nothing in the debug runtime.

I'll try switching to TEXCOORDS; however, if that is true I'm in trouble...

For my internal lighting system I'm already running out of TEXCOORDS for
many of the more complex shaders. I allow other shaders to graft in -- the user
has to just include my headers and bind to my #define's. I think in per-pixel
shader mode I'm down to 2 open TEXCOORDS.

If I have to move from COLOR it will kill a lot of installs that are live in the wild.

I will be seriously disappointed in nVidia if they did that.

I've had a similar issue with them in the past: I had a vertex shader that had
to use EVERY component of EVERY TEXCOORD as output. On 1 machine it would blue
screen until they updated my Quadro driver.
Quote:Original post by Mathucub
Quote:Original post by Matias Goldberg
I don't see anything wrong.

A) Check the DX debug runtimes for anything unusual.
B) I see you're using COLOR0 & COLOR1; try using TEXCOORDs instead. I wouldn't be surprised if the newer GPU is assigning the COLORn semantics to the same physical registers as one of the other TEXCOORDn, thus producing garbage.
C) Check for buffer overruns when writing to your vertex buffers from your application. They have the annoying property that everything shows up correctly and goes unnoticed until the code runs on a different GPU.

Cheers
Dark Sylinc


Nothing in the debug runtime.

I'll try switching to TEXCOORDS; however, if that is true I'm in trouble...

For my internal lighting system I'm already running out of TEXCOORDS for
many of the more complex shaders. I allow other shaders to graft in -- the user
has to just include my headers and bind to my #define's. I think in per-pixel
shader mode I'm down to 2 open TEXCOORDS.

If I have to move from COLOR it will kill a lot of installs that are live in the wild.

I will be seriously disappointed in nVidia if they did that.

I've had a similar issue with them in the past: I had a vertex shader that had
to use EVERY component of EVERY TEXCOORD as output. On 1 machine it would blue
screen until they updated my Quadro driver.


Switching to TEXCOORDs didn't help.
Bummer!
That leaves memory corruption or a driver/device bug.
Can we see screenshots? How does it look on the machines where it works, and how does it look on the new card?

Also:
Try playing with the NVIDIA Control Panel (a good start: everything off).
Try seeing what happens with NVPerfHUD.
A hunch: use point filtering for the cube map, no mipmaps (a sketch follows below).
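On that hunch, a sketch of forcing it from the app side right before the draw, assuming the cube map lands on sampler stage 0 (you can get the same effect by editing the sampler_state block in the .fx):

    #include <d3d9.h>

    // Sketch: override the cube map sampler with point filtering and no mip
    // selection, bypassing the Linear/Linear/Linear settings in the effect.
    void ForcePointNoMip(IDirect3DDevice9* device, DWORD stage)
    {
        device->SetSamplerState(stage, D3DSAMP_MINFILTER, D3DTEXF_POINT);
        device->SetSamplerState(stage, D3DSAMP_MAGFILTER, D3DTEXF_POINT);
        device->SetSamplerState(stage, D3DSAMP_MIPFILTER, D3DTEXF_NONE);
    }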

Cheers
Dark Sylinc
Quote:Original post by Matias Goldberg
Bummer!
That leaves memory corruption or a driver/device bug.
Can we see screenshots? How does it look on the machines where it works, and how does it look on the new card?

Also:
Try playing with the NVIDIA Control Panel (a good start: everything off).
Try seeing what happens with NVPerfHUD.
A hunch: use point filtering for the cube map, no mipmaps.

Cheers
Dark Sylinc


At this point I'm sure it's driver-related.
I've tested six earlier video cards. Everything up until the GeForce 480 and Quadro 4000 works.



This is what it looks like when it works. I'm not going to post the failed view...
I can't even take a screenshot of it on the newer cards. Every DrawPrimitive call succeeds, but
it just flickers randomly. This isn't a UV-style flicker; the entire vertex buffer simply doesn't
get drawn. No output into the backbuffer/render target at all.

I know Wrap0 can be dangerous... Wrap0 on a textured quad is a nightmare; however,
this is not a case where I would expect damage.

I've got ways around it... Just none are good. Luckily this hasn't hit anything on air, yet.

How does this work in the gaming world? Reading specific video card IDs and driver versions
and providing alternate shaders?
Indeed it looks like a driver bug or hardware flaw.

However, I can't help noticing that you only mention testing on NVIDIA. What about ATI and Intel?

In my experience NVIDIA cards tend to be much more forgiving when it comes to errors, memory corruption and shader nastiness. This is especially true for OpenGL, but it still holds for Direct3D.

Good luck with the bug report.
Cheers
Dark Sylinc

Edit:
Quote:Original post by Mathucub
How does this work in the gaming world? Reading specific video card IDs and driver versions and providing alternate shaders?

Usually. But using the device & vendor ID can be tricky, since it doesn't guarantee all incompatible cards will be included, nor does it account for incompatible cards which may become compatible in the future (or a GPU reporting a wrong ID, but that's really unlikely). Also, you need to make sure the IDs you've retrieved are from the main display device that renders the game (it's not unusual to see a Radeon HD 5xxx paired with a GeForce card for CUDA or PhysX).

The best way is to leave an option, e.g. "[x] Force GeForce 480 fix", and enable it by default when the device & vendor ID match, while informing the user about the situation and that they can change it at any time at their own risk.
This way you ensure all users will be able to play the game flawlessly, even if a particular ID isn't in your blacklist database, or the card has since become compatible.
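For illustration, a sketch of that default-on check using the adapter identifier; 0x10DE is NVIDIA's vendor ID, and the device ID list here is just a placeholder to be filled in from your own testing:

    #include <d3d9.h>

    // Sketch of the blacklist-plus-option approach: enable the workaround by
    // default when the vendor/device ID matches, but keep it user-overridable.
    // Make sure 'adapter' is the adapter that actually renders the scene.
    bool DefaultToGF480Fix(IDirect3D9* d3d, UINT adapter)
    {
        D3DADAPTER_IDENTIFIER9 id;
        if (FAILED(d3d->GetAdapterIdentifier(adapter, 0, &id)))
            return false;

        if (id.VendorId != 0x10DE)          // not an NVIDIA adapter
            return false;

        // Placeholder device IDs; a real list comes from QA reports/testing.
        static const DWORD affected[] = { 0x06C0 /* example only */ };
        for (size_t i = 0; i < sizeof(affected) / sizeof(affected[0]); ++i)
            if (id.DeviceId == affected[i])
                return true;                // tick "[x] Force GeForce 480 fix" by default

        return false;
    }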
Quote:Original post by Matias Goldberg
Indeed it looks like a driver bug or hardware flaw.

However, I can't help noticing that you only mention testing on NVIDIA. What about ATI and Intel?

In my experience NVIDIA cards tend to be much more forgiving when it comes to errors, memory corruption and shader nastiness. This is especially true for OpenGL, but it still holds for Direct3D.

Good luck with the bug report.
Cheers
Dark Sylinc

Edit:
Quote:Original post by Mathucub
How does this work in the gaming world? Reading specific video card IDs and driver versions and providing alternate shaders?

Usually. But using the device & vendor ID can be tricky, since it doesn't guarantee all incompatible cards will be included, nor does it account for incompatible cards which may become compatible in the future (or a GPU reporting a wrong ID, but that's really unlikely). Also, you need to make sure the IDs you've retrieved are from the main display device that renders the game (it's not unusual to see a Radeon HD 5xxx paired with a GeForce card for CUDA or PhysX).

The best way is to leave an option, e.g. "[x] Force GeForce 480 fix", and enable it by default when the device & vendor ID match, while informing the user about the situation and that they can change it at any time at their own risk.
This way you ensure all users will be able to play the game flawlessly, even if a particular ID isn't in your blacklist database, or the card has since become compatible.


We work in somewhat different worlds. I'm in TV broadcast. We use
the same "gaming" technology [DX9, moving to 11] but do things a bit differently.
Our stuff is on air 24/7/365, so we validate all the hardware and only allow
the system to run on qualified platforms. You can use any machine to edit
scenes/templates off air--but anything that has an SDI uplink must
be validated.

ATi really isn't an option. The nVidia SDI card only talks to Quadros.

We put the bug report in. Two at once, actually: under Windows 7 our texture bandwidth from
CPU->GPU seems to have been cut in half with the last driver update.
We're stuck on the prior generation of cards and drivers until it's resolved.

