warning X4714: ~'excessive temp registers'

3 comments, last by MJP 12 years ago
I am compiling the following compute shader using fxc:

void RayCast(uint3 ThreadIndex : SV_DispatchThreadID)
{
    RayStruct Ray = GenerateCameraRay(ThreadIndex.xy);
    RaySceneResultStruct RaySceneResult = GetRaySceneResult(Ray);
    uint PixelIndex = ThreadIndex.x + ThreadIndex.y * ResWidthDiv2 * 2;

    if(RaySceneResult.DoesMeet)
    {
        RayHits[PixelIndex].Direction = Ray.Direction;
        RayHits[PixelIndex].Position = RaySceneResult.Position;
        RayHits[PixelIndex].Normal = RaySceneResult.Normal;
        RayHits[PixelIndex].ObjectType = RaySceneResult.ObjectType;
        RayHits[PixelIndex].ObjectIndex = RaySceneResult.ObjectIndex;
    }
}


(where RayHits is a UAV) and I get the following warning:

warning X4714: sum of temp registers and indexable temp registers times 1024 threads exceeds the recommended total 16384. Performance may be reduced.

Performance is significantly reduced.

Simply commenting out the line "RayHits[PixelIndex].Normal = RaySceneResult.Normal;" gets rid of the warning, but, confusingly for me, removing all other lines inside the if statement except that one does not get rid of the warning...

It might be of interest that the asm output shows that 13 r# registers are used without the specified line, and 19 are used with that line (16 being the limit specified by 16384/1024), regardless of whether all the other lines in the if statement are present.
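
For reference, the arithmetic behind the warning can be written out as a small sketch. The 32 x 32 x 1 group layout, the Output buffer, and the BudgetExample entry point are assumptions made for illustration; the warning itself only implies 1024 threads per group.

```hlsl
// Budget implied by warning X4714: a recommended total of 16384 temp
// registers for the whole thread group.
static const uint RecommendedTotal = 16384;
static const uint ThreadsPerGroup  = 1024;  // assumed layout: 32 x 32 x 1
static const uint BudgetPerThread  = RecommendedTotal / ThreadsPerGroup; // 16

// 13 registers * 1024 threads = 13312 <= 16384 : no warning
// 19 registers * 1024 threads = 19456 >  16384 : warning X4714

RWStructuredBuffer<uint> Output : register(u0); // hypothetical output buffer

[numthreads(32, 32, 1)]
void BudgetExample(uint3 GroupThreadIndex : SV_GroupThreadID)
{
    // Each thread of the group writes the per-thread budget to its own slot.
    Output[GroupThreadIndex.x + GroupThreadIndex.y * 32] = BudgetPerThread;
}
```

By the same formula, a smaller group (say 512 threads) would raise the per-thread budget to 32 registers, though whether that actually helps performance in this shader is untested.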

WTF. What makes that line so special, and how can I fix this?

Well, when you comment out that line you're also going to cause the compiler to strip out all of the instructions needed to generate the data for that line. So in your case, all of the code in your other functions for getting the normal of the intersecting surface will get stripped out.
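
A minimal sketch of that effect, under assumed names: HitStruct, Hits, ComputeNormal, and DeadCodeExample are hypothetical stand-ins, not the original poster's code. Comparing the assembly listing with and without the marked line should show the normal computation appearing and disappearing.

```hlsl
struct HitStruct   // hypothetical stand-in for one element of RayHits
{
    float3 Position;
    float3 Normal;
};

RWStructuredBuffer<HitStruct> Hits : register(u0); // hypothetical UAV

// Hypothetical, deliberately costly normal computation.
float3 ComputeNormal(float3 P)
{
    float3 N = float3(0, 0, 0);
    [unroll]
    for (uint i = 0; i < 8; ++i)
    {
        N += sin(P * (i + 1)) * cos(P.yzx * (i + 1));
    }
    return normalize(N);
}

[numthreads(64, 1, 1)]
void DeadCodeExample(uint3 ThreadIndex : SV_DispatchThreadID)
{
    float3 Position = float3(ThreadIndex);
    float3 Normal = ComputeNormal(Position);

    Hits[ThreadIndex.x].Position = Position;

    // If the next line is commented out, nothing reads Normal any more, so
    // the compiler is free to drop the ComputeNormal call together with the
    // temp registers it would have needed.
    Hits[ThreadIndex.x].Normal = Normal;
}
```

The same reasoning explains why removing every other store but keeping the Normal store leaves the register count unchanged: the expensive code is still live as long as its result is written somewhere.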
Wow, the compiler is smarter than I thought. I guess the solution then is to simplify the calculation of the normal if possible. Many thanks.
Hmm, better yet, I'll divide my shaders into single functions, which IIRC is the way shaders are supposed to be coded anyway...

E.g. the code for GenerateCameraRay() will be put into one shader, and the code for GetRaySceneResult() will be split across several shaders, one for each type of object, etc.
Yeah, the compiler is extremely aggressive with optimizations, since all of the code run by a shader is compiled simultaneously (except in the case of dynamic shader linkage) and also because the HLSL language is pretty simple. This also means that function calls are pretty much always inlined, so using them typically has no effect on the resulting assembly. So you're free to organize your functions and #includes however you'd like, and you'll always end up with a tightly-optimized shader.
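
To illustrate the inlining point, here is a small sketch with two hypothetical entry points that differ only in whether a helper function is used. The names Results, SquaredLength, WithHelper, and WithoutHelper are assumptions for the example. Compiling each entry point separately with fxc (selecting it with /E and writing the listing with /Fc) should give essentially the same assembly.

```hlsl
RWStructuredBuffer<float> Results : register(u0); // hypothetical output buffer

// Hypothetical helper; the compiler is expected to inline it at the call site.
float SquaredLength(float3 V)
{
    return dot(V, V);
}

// Version that goes through the helper function.
[numthreads(64, 1, 1)]
void WithHelper(uint3 ThreadIndex : SV_DispatchThreadID)
{
    float3 V = float3(ThreadIndex);
    Results[ThreadIndex.x] = SquaredLength(V);
}

// Version with the helper's body written out by hand.
[numthreads(64, 1, 1)]
void WithoutHelper(uint3 ThreadIndex : SV_DispatchThreadID)
{
    float3 V = float3(ThreadIndex);
    Results[ThreadIndex.x] = dot(V, V);
}
```

Because the helper disappears after inlining, moving code between functions within a single shader should not be expected to change the temp register count on its own.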
