JoeJ

How will fp16 affect games


Personally I'm excited about Vega's upcoming double-rate fp16 capability.
In my realtime GI project I can probably use it for everything, including positions.
I don't expect wonders. We already store compressed data, so there is no bandwidth to gain, but it should still be a big speedup.

I guess the same is true for any renderer as well, but I'm no expert here.
What do you think? Do you plan to use it, or do you already have experience you can share?
What expectations do you have, and where are the limitations?


I wonder why the feature is not in Scorpio, while PS4Pro already has it.
(Assuming a game uses equal amounts of fp32 and fp16 calculations, both have roughly the same GFLOPS. If it uses mainly fp16, PS4Pro wins!)
I'm not sure if consumer Volta will have faster fp16 than fp32.
Also we read a lot that fp16 is an AI thing and not for games, which IMHO is total nonsense. Isn't it?
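
The Scorpio vs PS4Pro comparison above can be sanity-checked with a little arithmetic. This is only a rough sketch: the peak rates used below (4.2 TFLOPS fp32 and 8.4 TFLOPS fp16 for PS4Pro, 6.0 TFLOPS for Scorpio at either precision) are commonly quoted figures that I am assuming here, not numbers from this thread, and `effective_tflops` is a made-up helper that ignores everything except ALU rate.

```python
def effective_tflops(fp32_rate, fp16_rate, fp16_fraction):
    """Blended peak throughput when a fraction of all operations runs as fp16.

    Rates are peak TFLOPS (assumed figures, see the text above).
    The time per operation is the weighted sum of the per-precision times.
    """
    time_per_op = (1.0 - fp16_fraction) / fp32_rate + fp16_fraction / fp16_rate
    return 1.0 / time_per_op


# Assumed peak rates: PS4Pro with double-rate fp16, Scorpio with same-rate fp16.
ps4pro_mixed = effective_tflops(4.2, 8.4, 0.5)    # about 5.6
ps4pro_all_fp16 = effective_tflops(4.2, 8.4, 1.0)  # about 8.4
scorpio_mixed = effective_tflops(6.0, 6.0, 0.5)    # about 6.0

print(ps4pro_mixed, ps4pro_all_fp16, scorpio_mixed)
```

With a 50/50 mix the two land close together under these assumptions, and the double-rate machine only pulls ahead once most of the work is fp16.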


Having read around a little, this comment on Reddit sums up a lot about FP16 quite simply:

 

 

FP16 is less accurate, with just 5 bits for the exponent and 10 bits for the fraction. So there are fewer small numbers and a greater distance between large numbers. The maximum value is also far smaller than it is for FP32.
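
To put numbers on the quote above, here is a minimal sketch using Python's standard `struct` module, whose `"e"` format is IEEE 754 half precision. The helper names `to_fp16` and `fp16_spacing` are made up for illustration, and the spacing formula only covers the normal range (it ignores subnormals).

```python
import math
import struct


def to_fp16(x):
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Layout: 1 sign bit, 5 exponent bits, 10 fraction bits.
FP16_MAX = (2 - 2**-10) * 2**15   # largest finite value, 65504.0
FP16_MIN_SUBNORMAL = 2.0**-24     # smallest positive value


def fp16_spacing(x):
    """Gap between neighbouring fp16 values around a normal-range |x|."""
    _, e = math.frexp(abs(x))     # abs(x) = m * 2**e with 0.5 <= m < 1
    return 2.0 ** (e - 11)        # exponent is e - 1, minus 10 fraction bits


print(FP16_MAX)               # 65504.0
print(fp16_spacing(1.0))      # 0.0009765625 (2**-10)
print(fp16_spacing(2048.0))   # 2.0, so odd integers above 2048 are lost
print(to_fp16(0.1))           # close to 0.1, but not equal
```

The gap doubles with every power of two, which is the "greater distance between large numbers" in the quote: above 2048 even whole numbers can no longer all be represented.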

 

For anything around geometry, I'm not sure it has a real practical use, though I'm sure Hodge might have some nefarious way of using it. But I could see some benefits in local coordinate systems. I know some people working on long-running, popular commercial games with vast distances, where this does play a part. So maybe there is performance to gain here.

It may also be useful for texture UVs (one I can see being very useful), since those are small numbers most of the time.
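
Whether fp16 UVs are precise enough depends on the texture size, and that is easy to check. The sketch below assumes UVs in the 0..1 range and uses hypothetical helper names; it relies only on the fact that fp16 values in [0.5, 1.0) are spaced 2^-11 apart.

```python
import struct


def to_fp16(x):
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def uv_step_in_texels(texture_size):
    """Size of one fp16 step for a UV in [0.5, 1.0), measured in texels."""
    return 2.0**-11 * texture_size


def texel_centre(index, texture_size):
    """UV of the centre of texel `index` in a texture `texture_size` wide."""
    return (index + 0.5) / texture_size


print(uv_step_in_texels(1024))   # 0.5 texel: half-texel positions survive
print(uv_step_in_texels(4096))   # 2.0 texels: neighbouring texels merge

# Two adjacent texel centres of a 4096-wide texture collapse to the same fp16 UV.
a = to_fp16(texel_centre(2049, 4096))
b = to_fp16(texel_centre(2050, 4096))
print(a == b)
```

So under these assumptions fp16 UVs look fine up to roughly 1K textures and get coarse beyond that, unless the UVs are stored relative to something smaller, such as a tile or atlas entry.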

I would like to know, though, how well FP16 and FP32 play together, because it might be similar to the single-to-double precision story, where the cast itself has a cost. So FP16 might be limited in the real applications you can use it for.

GeForce FX, it actually did have float16 computation hardware too. These new GPUs are just bringing it back into style again

 This is what I was thinking, read the thread title and I was like "FP16? Again?". Also quite important in mobile GPUs. Haven't kept up with the GPU news lately...

Edited by TheChubu


It can be quite useful on PS4 Neo, I expect the same on new PC HW. You just have to know what you're doing and not mix it with fp32 (too much). And profile to see if it's better than with fp32.

The yield can be savings in interpolators (VS->PS param cache), double rate of (some) ALU, maybe even lower register pressure? It's totally good for games, e.g. HDR colour computations where you don't need that much precision but also anything else.

I hope the register count doubles but also LDS if we use it for fp16 data (we would get 100% occupancy for almost everything).
Conversion to and from fp32 should be no issue. There might be an instruction that does this in one cycle, or even adds fp16+fp32 with free conversion.

I guess something like a deferred lighting pass could be done entirely with fp16.
But skinning a detailed character in fp16 and adding the result to an fp32 offset? Probably too many artifacts.
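
The size of that skinning error can be estimated before trying it on a GPU. The sketch below is an assumption-laden illustration: it treats one coordinate of a skinned vertex as a local offset in metres, rounds it to fp16, and adds it to an fp32 world offset. The helper names and the example values (1.7 m local, 1000 m offset) are made up.

```python
import struct


def to_fp16(x):
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def world_position(local, offset):
    """Skinned result kept in fp16, then added to an fp32 world offset."""
    return to_fp32(to_fp32(offset) + to_fp16(local))


local, offset = 1.7, 1000.0       # hypothetical values, in metres
error = abs(world_position(local, offset) - (offset + local))
print(error)                      # a fraction of a millimetre

# Between 1 m and 2 m, fp16 steps are 2**-10 m, so rounding costs at most half a step.
print(abs(to_fp16(local) - local) <= 2.0**-11)
```

For a character about 2 m tall this is sub-millimetre per rounding, but a real skinning chain rounds after every fp16 operation, so the accumulated error would be larger than this single-rounding estimate.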


I hope the register count doubles but also LDS if we use it for fp16 data

It's already common to compress data going in/out of LDS in cases where it avoids a spill to memory - the f32tof16 and f16tof32 intrinsics let you do the conversion and pack 2 floats into a 32-bit LDS variable ;)
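
As a CPU-side illustration of that packing, here is a minimal Python sketch. It is not shader code: the function names are made up, and `struct`'s `"e"` format stands in for the f32tof16 and f16tof32 conversions. The first value goes into the low 16 bits, the second into the high 16 bits.

```python
import struct


def pack_half2x16(a, b):
    """Pack two floats as fp16 into one 32-bit word (a low, b high)."""
    lo, hi = struct.unpack("<2H", struct.pack("<2e", a, b))
    return lo | (hi << 16)


def unpack_half2x16(word):
    """Inverse of pack_half2x16: return the two fp16 values as floats."""
    return struct.unpack("<2e", struct.pack("<2H", word & 0xFFFF, word >> 16))


word = pack_half2x16(1.0, -2.0)
print(hex(word))               # 0xc0003c00
print(unpack_half2x16(word))   # (1.0, -2.0)
```

Values that are not exactly representable in fp16 come back rounded, which is the price paid for halving the LDS footprint.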

Edited by Hodgman


I hope the register count doubles but also LDS if we use it for fp16 data

It's already common to compress data going in/out of LDS in cases where it avoids a spill to memory - the f32tof16 and f16tof32 intrinsics let you do the conversion and pack 2 floats into a 32-bit LDS variable ;)


I've done this a lot, and since then I think fp16 would work for me even for positions.
OpenCL is more flexible here: it allows you to pack or unpack a single value to the high or low word. We often use 1 or 3 numbers and want the remaining 16 bits for other stuff, so that's very nice to have.
To get this in GLSL I wrote my own conversion routines, but I can't remember if this was slower than using the built-in functions with one zero and masking.
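
The "one half plus 16 spare bits" layout can be sketched like this. It is a CPU-side Python illustration with made-up function names, not the OpenCL or GLSL routines themselves: the fp16 value sits in the high word and an arbitrary 16-bit payload in the low word.

```python
import struct


def pack_half_hi(value, payload):
    """Store value as fp16 in the high 16 bits, payload in the low 16 bits."""
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    return (bits << 16) | (payload & 0xFFFF)


def unpack_half_hi(word):
    """Return (fp16 value as float, 16-bit payload) from a packed word."""
    (value,) = struct.unpack("<e", struct.pack("<H", word >> 16))
    return value, word & 0xFFFF


word = pack_half_hi(0.5, 0xBEEF)
print(hex(word))              # 0x3800beef
print(unpack_half_hi(word))   # (0.5, 48879)
```

The "built-in with one zero and masking" variant mentioned above would instead pack the value together with 0.0 and then OR the payload into the unused half, which gives the same bit pattern.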

Reminds me of how much better OpenCL C is than shading languages.
OpenCL has pointers, so you can use the same LDS memory as float4* and later as int2* or half*.
With native fp16, the need for this in shading languages will only become more obvious.
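
For readers who have not used OpenCL, the pointer reuse described above looks roughly like the following. This is only a Python analogy under my own assumptions: a `bytearray` stands in for a block of LDS, `memoryview.cast` stands in for the float* and int* views, and `reuse_scratch` is a made-up name.

```python
import struct


def reuse_scratch():
    """Write floats, read the same bytes as ints, then reuse them for halves."""
    scratch = bytearray(16)                     # stand-in for 16 bytes of LDS
    as_float = memoryview(scratch).cast("f")    # like a float* view
    as_int = memoryview(scratch).cast("i")      # like an int* view, same bytes

    as_float[0] = 1.0
    float_bits = as_int[0]                      # raw bits of 1.0f

    # Later in the kernel the same 16 bytes hold eight fp16 values instead.
    struct.pack_into("<8e", scratch, 0, *[0.5 * i for i in range(8)])
    halves = struct.unpack_from("<8e", scratch, 0)
    return float_bits, halves


bits, halves = reuse_scratch()
print(hex(bits))   # 0x3f800000
print(halves)      # (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5)
```

In a shading language without pointers, each of these views needs its own declared array or explicit bit-cast calls, which is the inconvenience being pointed out.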

Khronos plans to merge CL & VK, but I hope someone builds an open source GLSL replacement on top of SPIR-V before that happens... :)
