DX9 Creating Many Threads

Started by MysteryX October 06, 2015 03:13 AM

11 comments, last by MysteryX 8 years, 6 months ago

Nik02

4,359

October 09, 2015 04:07 PM

You could do the conversion in a pixel shader.

Niko Suni

Adam_42

3,664

October 09, 2015 11:01 PM

If for some strange reason you can't use the GPU for the conversion, recent CPUs provide intrinsics that convert between float and half quickly. See http://blogs.msdn.com/b/chuckw/archive/2012/09/11/directxmath-f16c-and-fma.asp

The intrinsics are _mm_cvtps_ph() and _mm_cvtph_ps(), which require F16C instruction set support.

Even if you can't use those intrinsics, the other DirectXMath functions may be quicker than the functions you're currently using.
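For reference, here is a minimal sketch of how those two intrinsics might be applied to a whole buffer rather than one value at a time. The helper names floatsToHalves and halvesToFloats and the F16C_TARGET macro are made up for illustration and are not part of DirectXMath or D3DX; the macro assumes a GCC or Clang build where F16C is not enabled globally. It only runs on a CPU that actually supports F16C.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>
#include <immintrin.h>  // _mm_cvtps_ph, _mm_cvtph_ps

// Assumption: on GCC/Clang, enable F16C per function instead of passing -mf16c.
#if defined(__GNUC__) || defined(__clang__)
#define F16C_TARGET __attribute__((target("f16c")))
#else
#define F16C_TARGET
#endif

// Hypothetical helper: float -> half for a whole buffer, four values per step.
F16C_TARGET
std::vector<std::uint16_t> floatsToHalves(const std::vector<float>& src) {
    std::vector<std::uint16_t> dst(src.size());
    std::size_t i = 0;
    for (; i + 4 <= src.size(); i += 4) {
        __m128i h = _mm_cvtps_ph(_mm_loadu_ps(&src[i]), _MM_FROUND_TO_NEAREST_INT);
        // The four halves sit in the low 64 bits of the result.
        _mm_storel_epi64(reinterpret_cast<__m128i*>(&dst[i]), h);
    }
    const std::size_t rest = src.size() - i;
    assert(rest < 4);
    if (rest > 0) {
        // Tail: pad to four floats so the loads and stores stay in bounds.
        float in[4] = {0.0f, 0.0f, 0.0f, 0.0f};
        std::uint16_t out[4];
        std::memcpy(in, &src[i], rest * sizeof(float));
        __m128i h = _mm_cvtps_ph(_mm_loadu_ps(in), _MM_FROUND_TO_NEAREST_INT);
        _mm_storel_epi64(reinterpret_cast<__m128i*>(out), h);
        std::memcpy(&dst[i], out, rest * sizeof(std::uint16_t));
    }
    return dst;
}

// Hypothetical helper: half -> float, the reverse direction.
F16C_TARGET
std::vector<float> halvesToFloats(const std::vector<std::uint16_t>& src) {
    std::vector<float> dst(src.size());
    std::size_t i = 0;
    for (; i + 4 <= src.size(); i += 4) {
        __m128i h = _mm_loadl_epi64(reinterpret_cast<const __m128i*>(&src[i]));
        _mm_storeu_ps(&dst[i], _mm_cvtph_ps(h));
    }
    const std::size_t rest = src.size() - i;
    assert(rest < 4);
    if (rest > 0) {
        std::uint16_t in[4] = {0, 0, 0, 0};
        float out[4];
        std::memcpy(in, &src[i], rest * sizeof(std::uint16_t));
        __m128i h = _mm_loadl_epi64(reinterpret_cast<const __m128i*>(in));
        _mm_storeu_ps(out, _mm_cvtph_ps(h));
        std::memcpy(&dst[i], out, rest * sizeof(float));
    }
    return dst;
}
```

The 256-bit variants (_mm256_cvtps_ph and _mm256_cvtph_ps) handle eight values per step in the same way; whether that is measurably faster depends on the rest of the pipeline.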

MysteryX

285

Author

October 10, 2015 12:31 AM

I was doing the conversion pixel by pixel, which was taking forever. The updated code processes them all at once in a buffer, which is much faster.

Is there a performance difference between DirectX Math and the DX9 conversion function?

Or is there a way to convert straight from INT into half-float? Converting within the shader could be a good idea, but converting UINT16 to float16 would cause cropping.

As for the memory usage and the many threads and devices being created, I'll see if I can chain the various shaders one after the other in the same instance, reconfiguring the same device with a different shader at each step in the chain. This would probably drastically improve memory usage and performance.

Edit: I replaced the DX9 function with DirectX Math. Performance went up from 18.5 fps to 20 fps.
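On the question above about going straight from an integer to half-float: one possible route, sketched here under assumptions, is to build the 16-bit half pattern directly from the integer with plain bit operations. The function name uint16ToHalfBits is hypothetical and is not a DirectXMath or D3DX call. It also shows one reading of the "cropping" concern: a half has only 11 significant bits, so integers above 2048 get rounded, and values from 65520 upward become infinity.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts an unsigned 16-bit integer directly to
// IEEE 754 binary16 bits, rounding to nearest (ties to even).
std::uint16_t uint16ToHalfBits(std::uint16_t value) {
    if (value == 0) return 0;

    // Position of the highest set bit is the unbiased exponent.
    int exponent = 15;
    while (((value >> exponent) & 1u) == 0) --exponent;
    assert(exponent >= 0 && exponent <= 15);

    std::uint32_t mantissa;
    if (exponent <= 10) {
        // Value fits in 11 significant bits: exact, just align and drop
        // the implicit leading one.
        mantissa = (static_cast<std::uint32_t>(value) << (10 - exponent)) & 0x3FFu;
    } else {
        // Too many bits: discard the low ones with round-to-nearest-even.
        const int shift = exponent - 10;
        const std::uint32_t v = value;
        std::uint32_t kept = v >> shift;  // 11 bits including the leading one
        const std::uint32_t remainder = v & ((1u << shift) - 1u);
        const std::uint32_t halfway = 1u << (shift - 1);
        if (remainder > halfway || (remainder == halfway && (kept & 1u))) ++kept;
        if (kept == 0x800u) {  // rounding carried into the next power of two
            kept >>= 1;
            ++exponent;
        }
        mantissa = kept & 0x3FFu;
    }

    if (exponent > 15) return 0x7C00;  // past the largest finite half: +infinity
    return static_cast<std::uint16_t>(((exponent + 15) << 10) | mantissa);
}
```

With this sketch, 2048 and 2049 both map to 0x6800 (2048.0), and 65535 maps to 0x7C00 (infinity). If the full UINT16 range has to survive, scaling the values into a smaller range before conversion, or using a different texture format, would be worth considering.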

This topic is closed to new replies.
