Performance between half and float under SM3.0


5 replies to this topic

#1 HolyOdin   Members   -  Reputation: 130


Posted 28 April 2013 - 02:58 AM

Is there a performance gain if I change some variables from fp32 to fp16?
for example:
void PSMain(VS_OUTPUT psIn, out float4 oCol : COLOR0, out float4 oCol1 : COLOR1)
{
    float4 wNormal = mul(g_Instant_Constant.World, float4(psIn.oNormal, 0.0f));
    float3 normal = wNormal.xyz;
    normal = normalize(normal);
    float4 color;
    color = tex2D(g_DiffuseTex, psIn.UV0.xy);
    color.a *= psIn.DifLight.a;

and
void PSMain(VS_OUTPUT psIn, out half4 oCol : COLOR0, out half4 oCol1 : COLOR1)
{
    half4 wNormal = mul(g_Instant_Constant.World, half4(psIn.oNormal, 0.0f));
    half3 normal = wNormal.xyz;
    normal = normalize(normal);
    half4 color;
    color = tex2D(g_DiffuseTex, psIn.UV0.xy);
    color.a *= psIn.DifLight.a;

Thanks for your help.

#2 kunos   Crossbones+   -  Reputation: 2205


Posted 28 April 2013 - 03:09 AM

In theory yes, there should be. In practice, I haven't seen one.


Stefano Casillo
Lead Programmer
TWITTER: @KunosStefano
AssettoCorsa - netKar PRO - Kunos Simulazioni

#3 belfegor   Crossbones+   -  Reputation: 2554


Posted 28 April 2013 - 03:18 AM

I read a recent article (can't find the link now) that mentioned half is slower. If I remember correctly, it was used on the old Nvidia FX 5xxx series cards because it performed better there.



#4 Hodgman   Moderators   -  Reputation: 29302


Posted 28 April 2013 - 03:18 AM

On some old cards, from around '04/'05/'06 maybe, the half type did actually make your shaders run faster. Most GPUs, though, ignore it and treat it the same as float.

#5 mhagain   Crossbones+   -  Reputation: 7802


Posted 28 April 2013 - 04:17 AM

D3D10+ specifies full float (source), so even if running SM3 code on such hardware, you're more likely to get half mapped to float.


It appears that the gentleman thought C++ was extremely difficult and he was overjoyed that the machine was absorbing it; he understood that good C++ is difficult but the best C++ is well-nigh unintelligible.


#6 MJP   Moderators   -  Reputation: 10810


Posted 29 April 2013 - 01:24 AM

The only GPUs that ever supported half precision in shaders were Nvidia's FX series, 6000 series, and 7000 series GPUs. On the FX series, using half precision was actually critical for achieving good performance, since full precision came with a significant performance penalty. ATI's early DX9 hardware used a weird 24-bit precision internally for everything, since the spec for SM2.0 was somewhat loose in how it defined the precision and format of floating-point operations. Later ATI DX9 hardware used full 32-bit precision for everything, since SM3.0 required IEEE compliance (or at least something much closer to it).

For SM4.0, the half-precision instructions and registers were completely removed from the specification. Using the "half" type in HLSL will cause the compiler to use full-precision instructions, and in practice no DX10 or DX11 GPUs support half-precision arithmetic internally. Weirdly enough, lower-precision instructions have made a comeback in D3D11.1, primarily for mobile hardware. However, in 11.1 the syntax for using them is different: you have to use types like "min16float" and "min16uint".
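
As a rough illustration of that 11.1 syntax, here is a minimal sketch of a pixel shader using min16float, assuming an SM4.0+ target on a D3D11.1 runtime. The resource names (g_DiffuseTex, g_Sampler) and the PSInput layout are hypothetical, loosely modeled on the code in the original question. Note that min16float only requests at least 16 bits of precision; the driver is free to run the math at full 32-bit, so a speedup is not guaranteed.

```hlsl
// Hypothetical resources, named after the original question's texture.
Texture2D    g_DiffuseTex : register(t0);
SamplerState g_Sampler    : register(s0);

// Hypothetical input layout; interpolants stay full precision here.
struct PSInput
{
    float4 pos      : SV_Position;
    float2 uv       : TEXCOORD0;
    float4 difLight : COLOR0;
};

float4 PSMain(PSInput psIn) : SV_Target
{
    // Request "at least 16-bit" precision for the color math.
    // Hardware that lacks it simply evaluates this at 32-bit.
    min16float4 color = (min16float4)g_DiffuseTex.Sample(g_Sampler, psIn.uv);
    color.a *= (min16float)psIn.difLight.a;

    // Converted back to float4 on output.
    return color;
}
```

Texture coordinates and positions are kept as float in this sketch, since those are the values most likely to show visible artifacts at reduced precision.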


Edited by MJP, 29 April 2013 - 02:08 AM.




