Indeed, the rate of change remains constant between the two vertices. That's because, mathematically, everything that varies per pixel cancels out. If you add some variation to the formula so that terms unique to each pixel remain, you may get the smooth results you wanted.
Try normalizing the normal before taking ddx/ddy; that may do the trick:
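float4 PS(VS_OUT pIn) : SV_Target
{
    // Renormalize first: the interpolated normal is generally not unit length.
    float3 n = normalize(pIn.normalW);
    float3 dndx = ddx(n); // screen-space x derivative of the unit normal
    float3 dndy = ddy(n); // screen-space y derivative of the unit normal
    float sn = 100.0f;    // scale so the small derivatives are visible

    // Debug views: keep exactly one return active at a time.
    return float4(n, 1);
    // return float4(sn * dndx, 1);
    // return float4(sn * dndy, 1);
}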
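To see the cancellation numerically, here is a minimal CPU-side sketch in Python (not shader code). It assumes the normal is interpolated linearly along a scanline and ignores perspective correction. The vertex normals `n0` and `n1` and the helper names are made up for illustration.

```python
import math

def lerp(a, b, t):
    """Linear interpolation between two 3-vectors."""
    return tuple((1.0 - t) * x + t * y for x, y in zip(a, b))

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def forward_differences(samples):
    """Difference between each sample and the next one (a crude stand-in for ddx)."""
    return [tuple(q - p for p, q in zip(a, b))
            for a, b in zip(samples, samples[1:])]

# Hypothetical vertex normals at the two ends of an edge.
n0 = (1.0, 0.0, 0.0)
n1 = (0.0, 1.0, 0.0)

# Five evenly spaced "pixels" between the two vertices.
ts = [i / 4 for i in range(5)]
raw = [lerp(n0, n1, t) for t in ts]    # interpolated, not renormalized
unit = [normalize(v) for v in raw]     # renormalized per pixel

raw_diffs = forward_differences(raw)   # identical for every pixel
unit_diffs = forward_differences(unit) # changes from pixel to pixel

for r, u in zip(raw_diffs, unit_diffs):
    print(r, u)
```

In this sketch the differences of the raw interpolated normal are the same at every step, while the differences of the renormalized normal vary, because `normalize` is not a linear function.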
I tried that and it didn't help. But going by your description, how would that help?
Another thing I was wondering about is how the pixel quads are implemented. Suppose I have 9 pixels in my buffer:
1 2 3
4 5 6
7 8 9
Then pixel 7's ddx is 8-7 and its ddy is 4-7. Now, is pixel 8's ddx 9-8 and its ddy 5-8? The reason I ask is that this would make every pixel dependent on its neighbours, which would prevent a lot of parallelism, wouldn't it?
Or is it the case that pixels 4, 5, 7 and 8 all share the same ddx (8-7) and the same ddy (4-7)? That would mean you only get dependencies within a pixel quad, but the result is less accurate.
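To make the two possibilities concrete, here is a small Python sketch of both schemes on the 3x3 grid above. It only models the question and makes no claim about what the hardware actually does. The pixel values are invented, and anchoring the shared derivatives at pixel 7 of the quad is an assumption of this sketch.

```python
# Values stored at pixels 1..9, laid out row-major as in the grid above.
# The numbers are made up purely for illustration.
WIDTH = 3
values = {1: 1.0, 2: 4.0, 3: 9.0,
          4: 2.0, 5: 6.0, 6: 12.0,
          7: 3.0, 8: 8.0, 9: 15.0}

def per_pixel(p):
    """Scheme 1: every pixel differences against its own right and upper neighbour."""
    has_right = p % WIDTH != 0   # pixels 3, 6, 9 have no right neighbour
    has_up = p > WIDTH           # pixels 1, 2, 3 have no upper neighbour
    ddx = values[p + 1] - values[p] if has_right else None
    ddy = values[p - WIDTH] - values[p] if has_up else None
    return ddx, ddy

def quad_shared(quad=(4, 5, 7, 8)):
    """Scheme 2: one ddx and one ddy, computed once and reused by the whole 2x2 quad."""
    top_left, top_right, bottom_left, bottom_right = quad
    ddx = values[bottom_right] - values[bottom_left]  # 8 - 7
    ddy = values[top_left] - values[bottom_left]      # 4 - 7
    return {p: (ddx, ddy) for p in quad}

for p in (7, 8):
    print("per-pixel", p, per_pixel(p))
print("quad-shared", quad_shared())
```

In the first scheme pixel 8 has to read pixel 9, which lies outside the quad. In the second scheme every read stays inside pixels 4, 5, 7 and 8, at the cost of all four pixels getting the same derivative.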