HolyOdin

Performance between half and float under SM3.0

This topic is 1969 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Is there a performance gain if I change some variables from fp32 to fp16?
for example:
void PSMain(VS_OUTPUT psIn, out float4 oCol : COLOR0, out float4 oCol1 : COLOR1)
{
    float4 wNormal = mul(g_Instant_Constant.World, float4(psIn.oNormal, 0.0f));
    float3 normal = wNormal.xyz;
    normal = normalize(normal);
    float4 color;
    color = tex2D(g_DiffuseTex, psIn.UV0.xy);
    color.a *= psIn.DifLight.a;
    ...
}
and
void PSMain(VS_OUTPUT psIn, out half4 oCol : COLOR0, out half4 oCol1 : COLOR1)
{
    half4 wNormal = mul(g_Instant_Constant.World, half4(psIn.oNormal, 0.0f));
    half3 normal = wNormal.xyz;
    normal = normalize(normal);
    half4 color;
    color = tex2D(g_DiffuseTex, psIn.UV0.xy);
    color.a *= psIn.DifLight.a;
    ...
}
Thanks for your help.


I read a recent article (can't find the link now) that mentioned half is slower. If I remember correctly, it was used on the old NVIDIA GeForce FX 5xxx series cards because it performed better there.

On some old cards, from around 2004 to 2006 maybe, the half type did actually make your shaders run faster. Most GPUs, though, ignore it and treat it the same as float.


D3D10+ specifies full float (source), so even if running SM3 code on such hardware, you're more likely to get half mapped to float.


The only GPUs that ever supported half precision in shaders were Nvidia's FX series, 6000 series, and 7000 series GPUs. On the FX series, using half precision was actually critical for achieving good performance, since full precision came with a significant performance penalty. ATI's early DX9 hardware used an unusual 24-bit precision internally for everything, which was possible because the SM2.0 spec was somewhat loose in how it defined the precision and format of floating-point operations. Later ATI DX9 hardware used full 32-bit precision for everything, since SM3.0 required IEEE compliance (or at least something much closer to it).

For SM4.0, the half-precision instructions and registers were completely removed from the specification. Using the "half" type in HLSL will cause the compiler to use full-precision instructions, and in practice no DX10 or DX11 GPUs support half-precision arithmetic internally. Weirdly enough, lower-precision instructions have made a comeback in D3D11.1, primarily for mobile hardware. However, in 11.1 the syntax for using them is different: you have to use types like "min16float" and "min16uint".
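
As a rough illustration of the 11.1-style syntax, here is a minimal sketch of the texture-and-alpha part of the original poster's shader written with min16float. The PSInput struct, the g_Sampler name, and the register bindings are assumptions made up for this example, not taken from the original code. Whether it runs any faster depends entirely on the hardware and driver, so treat it as something to profile rather than a guaranteed win.

```hlsl
// Hypothetical resource bindings (names and registers are assumptions).
Texture2D    g_DiffuseTex : register(t0);
SamplerState g_Sampler    : register(s0);

// Hypothetical input layout, loosely modeled on the VS_OUTPUT in the question.
struct PSInput
{
    float4 pos      : SV_Position;
    float2 uv       : TEXCOORD0;
    float4 difLight : COLOR0;
};

float4 PSMain(PSInput psIn) : SV_Target
{
    // min16float asks for *at least* 16 bits of precision.
    // Hardware without minimum-precision support is free to
    // evaluate this at full 32-bit precision instead.
    min16float4 color = (min16float4)g_DiffuseTex.Sample(g_Sampler, psIn.uv);

    // Keep the arithmetic in the low-precision type.
    color.a *= (min16float)psIn.difLight.a;

    // Implicitly widened back to float4 for the render target output.
    return color;
}
```

Texture coordinates are left as full float here, since 16 bits can run out of precision when addressing larger textures.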

Edited by MJP
