MysteryX

Reviewing Memory Transfers


I have written some code that takes video frames, runs a series of HLSL pixel shaders on them, and returns the processed video.

 

I would like to review the overall memory transfers to see whether they're done properly and whether they could be optimized, and then I have some questions.

 

In this sample, we have 3 input textures out of 9 slots, and we run 3 shaders.
D = D3DPOOL_DEFAULT, S = D3DPOOL_SYSTEMMEM, R = D3DUSAGE_RENDERTARGET

m_InputTextures
Index  0 1 2 3 4 5 6 7 8 9 10 11
CPU    D D D                   S
GPU    R R R             R  R  R

m_RenderTargets contains one R texture per output resolution. Each command renders into the matching render target, which is then copied into the next available input slot: Command1 renders to RenderTarget[0] and is copied to index 9, Command2's output goes to index 10, Command3's to index 11, and so on. Only the final output needs to be copied from the GPU back to the CPU, which requires a SYSTEMMEM texture.

All the processing is done in the D3DFMT_A16B16G16R16 format.
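
To make the flow concrete, here's roughly what one pass looks like. This is a simplified sketch, not the actual code: the surface names are illustrative, and the quad setup and error handling are trimmed.

// Render through the pixel shader into a render target, then copy the result
// into the next available input slot with StretchRect.
HR(m_pDevice->SetRenderTarget(0, pRenderTargetSurface));   // surface of m_RenderTargets[n]
HR(m_pDevice->SetPixelShader(pShader));
// ... SetTexture for the used input slots + DrawPrimitive for the full-screen quad ...
HR(m_pDevice->StretchRect(pRenderTargetSurface, NULL, pNextInputSurface, NULL, D3DTEXF_NONE));

// After the last command, only the final result goes back to the CPU,
// via the single D3DPOOL_SYSTEMMEM surface (index 11 above):
HR(m_pDevice->GetRenderTargetData(pFinalRenderTargetSurface, pSystemMemSurface));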

 

The full code is here

 

Any comments so far?

 

Here are a few questions.

 

1. Do all GPU-side textures need to be created with D3DUSAGE_RENDERTARGET? If I don't, the StretchRect calls I'm using to move the data around fail. (See the creation sketch below the questions.)

 

2. Can I do the processing in D3DFMT_A16B16G16R16 and then return the result in D3DFMT_X8R8G8B8 to avoid converting from 16-bit to 8-bit on the CPU? If so, which textures do I have to change?

 

3. If I want to work with half-float data, can the input textures be D3DFMT_A32B32G32R32F while all the processing is done in D3DFMT_A16B16G16R16F, to avoid doing the float-to-half-float conversion on the CPU? If so, which textures do I have to change? Similarly, I could pass input frames as D3DFMT_X8R8G8B8 and process them in D3DFMT_A16B16G16R16.
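
To illustrate question 1, the GPU-side textures are currently all created along these lines (a sketch; variable names are illustrative):

// Every GPU-side texture gets the render-target usage flag so StretchRect
// accepts it; the question is whether Usage can be 0 for the intermediate ones.
IDirect3DTexture9* pTexture = NULL;
HR(m_pDevice->CreateTexture(width, height, 1,
                            D3DUSAGE_RENDERTARGET,   // needed on every GPU-side texture?
                            D3DFMT_A16B16G16R16,
                            D3DPOOL_DEFAULT, &pTexture, NULL));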

 

Edit: To be more specific, I support 3 pixel formats: D3DFMT_X8R8G8B8, D3DFMT_A16B16G16R16 and D3DFMT_A16B16G16R16F. I'd like to be able to take the input in D3DFMT_X8R8G8B8, do the processing in D3DFMT_A16B16G16R16F, and give back the result in D3DFMT_A16B16G16R16 (or use any other combination). How can I do that?
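
In other words, the chain I'm after looks roughly like this; which of these surfaces actually have to change is exactly my question (a sketch; names are illustrative):

D3DFORMAT formatIn      = D3DFMT_X8R8G8B8;        // frames arrive in this format
D3DFORMAT formatProcess = D3DFMT_A16B16G16R16F;   // shaders run in this format
D3DFORMAT formatOut     = D3DFMT_A16B16G16R16;    // result is handed back in this format

// CPU-filled input surface:
HR(m_pDevice->CreateOffscreenPlainSurface(width, height, formatIn,
                                          D3DPOOL_DEFAULT, &pInputSurface, NULL));
// Intermediate render targets used by the shader passes:
HR(m_pDevice->CreateTexture(width, height, 1, D3DUSAGE_RENDERTARGET,
                            formatProcess, D3DPOOL_DEFAULT, &pProcessTexture, NULL));
// Final render target plus the SYSTEMMEM surface used for readback:
HR(m_pDevice->CreateTexture(width, height, 1, D3DUSAGE_RENDERTARGET,
                            formatOut, D3DPOOL_DEFAULT, &pOutputTexture, NULL));
HR(m_pDevice->CreateOffscreenPlainSurface(width, height, formatOut,
                                          D3DPOOL_SYSTEMMEM, &pReadbackSurface, NULL));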

 

Edit2: By changing the last RenderTarget and the last m_InputTextures entry to D3DFMT_A16B16G16R16F, I'm able to read the result back as half-float successfully. If I change them to D3DFMT_X8R8G8B8, however, the R and B channels are reversed and the image is repeated twice side by side. Other than that, if I compare the result with the regular 16-bit processing, the image is exactly the same, which indicates it was internally processed as D3DFMT_A16B16G16R16. Creating the texture as D3DFMT_A8B8G8R8 fails.

 

I've also tried changing the input textures to D3DPOOL_SYSTEMMEM and using UpdateSurface instead of StretchRect. It works, but performance is no better and memory usage is higher.
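
For reference, the two upload paths I compared look roughly like this (a sketch; surface names are illustrative):

// A) What I use now: fill a D3DPOOL_DEFAULT offscreen plain surface on the CPU,
//    then copy it into the input render target with StretchRect.
HR(m_pDevice->StretchRect(pCpuFilledSurface, NULL, pInputRenderTarget, NULL, D3DTEXF_NONE));

// B) The alternative I tested: fill a D3DPOOL_SYSTEMMEM surface and push it to
//    the GPU with UpdateSurface (which, as far as I can tell, requires the same
//    format on both sides, so it can't do any format conversion itself).
HR(m_pDevice->UpdateSurface(pSystemMemSurface, NULL, pDefaultPoolSurface, NULL));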


As an update, it appears that both the "inverted pixels" and "duplicated width" problems were bugs in my own code. I fixed those, and I'm now able to process 16-bit data and get an 8-bit result. I suppose the same will work for the input textures.

 

And considering my (failed) test with UpdateSurface, it seems the StretchRect approach I'm using is the right one.

 

Edit: Got it working. It can now take 8-bit frames as input, process them as half-float data, and return the output as 8-bit frames, with all the data conversion done on the GPU instead of the CPU.


I have a problem with the memory texture redesign. Changing the last RenderTarget and the textures used to read the data back to D3DFMT_X8R8G8B8 works fine.

 

However, changing the very first input textures to D3DFMT_X8R8G8B8 causes image distortion:

 

HR(m_pDevice->CreateOffscreenPlainSurface(width, height, m_formatIn, D3DPOOL_DEFAULT, &Obj->Memory, NULL));
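
For context, the CPU-side fill of that surface also has to follow m_formatIn. A pitch-aware version of the upload would look roughly like this (a sketch; pSourceFrame is illustrative, and the 4-vs-8 bytes-per-pixel choice assumes D3DFMT_X8R8G8B8 vs. the 16-bit formats):

// Copy the incoming frame row by row, because the locked Pitch can be wider
// than width * bytesPerPixel. (Illustrative names; error handling trimmed.)
int bytesPerPixel = (m_formatIn == D3DFMT_X8R8G8B8) ? 4 : 8;
D3DLOCKED_RECT lock;
HR(Obj->Memory->LockRect(&lock, NULL, 0));
for (int y = 0; y < height; y++) {
    memcpy((BYTE*)lock.pBits + y * lock.Pitch,
           pSourceFrame + y * width * bytesPerPixel,
           width * bytesPerPixel);
}
HR(Obj->Memory->UnlockRect());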

 

Here are screenshots comparing the result of using the first input texture as D3DFMT_X8R8G8B8 vs. D3DFMT_A16B16G16R16. All other textures in the processing chain are D3DFMT_A16B16G16R16.

 

(Screenshots: Texture_In1.png, Texture_In2.png)

 

Is there a way to do the pixel conversion on the GPU without having such distortion?
