Reviewing Memory Transfers


I have written some code that takes video frames, runs a series of HLSL pixel shaders on them, and returns the processed video.


I would like to review the overall memory transfers to see whether they are done properly and whether they could be optimized, and then I have some questions.


In this sample, we have 3 input textures out of 9 slots, and we run 3 shaders (D = input texture written from the CPU, R = GPU render target, S = SYSTEMMEM readback surface):

Index  0 1 2 3 4 5 6 7 8 9 10 11
CPU    D D D                   S
GPU    R R R             R  R  R

m_RenderTargets contains one R texture per output resolution. Each render target then gets copied into the next available index: Command1 outputs to RenderTarget[0] and is then copied to index 9, Command2 outputs to index 10, Command3 outputs to index 11, and so on. Only the final output needs to be copied from the GPU back to the CPU, which requires a SYSTEMMEM texture.
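The chain above can be sketched with the D3D9 calls involved. This is a minimal sketch, assuming an initialized device; RunPass, ReadBack, and the surface parameters are hypothetical names, not taken from the original code (requires d3d9.h and a Windows build environment):

```cpp
#include <d3d9.h>

// One shader pass: render into an R texture, then park the result in the
// next free GPU-side slot so the chain can continue.
HRESULT RunPass(IDirect3DDevice9* pDevice,
                IDirect3DSurface9* pRenderTarget,  // an R texture from m_RenderTargets
                IDirect3DSurface9* pDestGpuSlot)   // next available GPU-side index
{
    HRESULT hr = pDevice->SetRenderTarget(0, pRenderTarget);
    if (FAILED(hr)) return hr;

    // ... SetTexture / SetPixelShader / DrawPrimitive for this pass ...

    // GPU-to-GPU copy; the data never leaves video memory here.
    return pDevice->StretchRect(pRenderTarget, NULL, pDestGpuSlot, NULL,
                                D3DTEXF_NONE);
}

// Only the final result crosses back to the CPU, via a SYSTEMMEM surface
// of the same size and format as the render target.
HRESULT ReadBack(IDirect3DDevice9* pDevice,
                 IDirect3DSurface9* pFinalRT,   // last render target (DEFAULT pool)
                 IDirect3DSurface9* pSysMem)    // SYSTEMMEM surface
{
    HRESULT hr = pDevice->GetRenderTargetData(pFinalRT, pSysMem);
    if (FAILED(hr)) return hr;

    D3DLOCKED_RECT lr;
    hr = pSysMem->LockRect(&lr, NULL, D3DLOCK_READONLY);
    if (FAILED(hr)) return hr;
    // ... copy rows out of lr.pBits, honoring lr.Pitch ...
    return pSysMem->UnlockRect();
}
```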

All the processing is done with D3DFMT_A16B16G16R16 format.


The full code is here


Any comments so far?


Here are a few questions.


1. Do all GPU-side textures need to be created as render targets? If I don't, the StretchRect calls I use to move the data around fail.


2. Can I do the processing in D3DFMT_A16B16G16R16 and then return the result in D3DFMT_X8R8G8B8 to avoid converting from 16-bit to 8-bit on the CPU? If so, which textures do I have to change?
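In D3D9, StretchRect between render targets can also perform format conversion, so one option (if the hardware supports the conversion) is to keep the intermediates at D3DFMT_A16B16G16R16 and make only the last render target and the readback surface D3DFMT_X8R8G8B8. Support for a given conversion can be queried up front; a minimal sketch, where CanConvert is a hypothetical helper:

```cpp
#include <d3d9.h>

// Ask the adapter whether StretchRect can convert between two surface formats.
bool CanConvert(IDirect3D9* pD3D, D3DFORMAT from, D3DFORMAT to)
{
    return SUCCEEDED(pD3D->CheckDeviceFormatConversion(
        D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, from, to));
}

// e.g. CanConvert(pD3D, D3DFMT_A16B16G16R16, D3DFMT_X8R8G8B8)
```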


3. If I want to work with half-float data, can the input textures be D3DFMT_A32B32G32R32F while all the processing is done in D3DFMT_A16B16G16R16F, to avoid doing the float to half-float conversion on the CPU? If so, which textures do I have to change? Similarly, I could pass input frames as D3DFMT_X8R8G8B8 and process them in D3DFMT_A16B16G16R16.
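For scale, this is roughly the per-pixel work the GPU path avoids: a standalone float-to-half conversion on the CPU. A simplified sketch (FloatToHalf is a hypothetical helper; no NaN or denormal handling):

```cpp
#include <cstdint>
#include <cstring>

// Convert one IEEE 754 single-precision float to a half-float bit pattern.
uint16_t FloatToHalf(float f)
{
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));               // reinterpret the float bits
    uint32_t sign = (bits >> 16) & 0x8000u;             // move sign to half position
    int32_t  exp  = ((bits >> 23) & 0xFFu) - 127 + 15;  // rebias exponent (127 -> 15)
    uint32_t mant = (bits >> 13) & 0x3FFu;              // keep top 10 mantissa bits
    if (exp <= 0)  return (uint16_t)sign;               // underflow: flush to zero
    if (exp >= 31) return (uint16_t)(sign | 0x7C00u);   // overflow: clamp to infinity
    return (uint16_t)(sign | ((uint32_t)exp << 10) | mant);
}
```

Running this over every pixel of every frame is exactly the cost that moving the conversion into the shader pipeline removes.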


Edit: To be more specific, I support 3 pixel formats: D3DFMT_X8R8G8B8, D3DFMT_A16B16G16R16 and D3DFMT_A16B16G16R16F. I'd like to be able to take the input in D3DFMT_X8R8G8B8, do the processing in D3DFMT_A16B16G16R16F and give back the result in D3DFMT_A16B16G16R16 (or any other combination). How can I do that?


Edit2: By changing the last RenderTarget and the last m_InputTextures set to D3DFMT_A16B16G16R16F, I'm able to read the data back as half-float successfully. If I change it to D3DFMT_X8R8G8B8, however, the R and B channels are reversed and the image is repeated twice side-by-side. Other than that, comparing the result with the regular 16-bit processing, the image is exactly the same, which indicates it was internally processed as D3DFMT_A16B16G16R16. Creating the texture as D3DFMT_A8B8G8R8 fails.


I've also tried changing the input textures to D3DPOOL_SYSTEMMEM and using UpdateSurface instead of StretchRect. It works, but performance is no better and memory usage is higher.

Edited by MysteryX


As an update, it appears that both the "reversed channels" and "duplicated width" problems were bugs in my own code. I fixed those, and I'm now able to process 16-bit data and get an 8-bit result. I suppose the same will work for the input textures.


And considering my (failed) test with UpdateSurface, it seems the way I'm doing it with StretchRect is the right way.


Edit: Got it working. It can now take 8-bit frames as input, process with half-float data and get the output as 8-bit frames, and all the data conversion is done on the GPU instead of the CPU.


I have a problem with the redesigned texture memory. Changing the last RenderTarget and the readback textures to D3DFMT_X8R8G8B8 works fine.


However, changing the very first input textures to D3DFMT_X8R8G8B8 causes image distortion:


HR(m_pDevice->CreateOffscreenPlainSurface(width, height, m_formatIn, D3DPOOL_DEFAULT, &Obj->Memory, NULL));


Here are screenshots comparing the result between using the first input texture as D3DFMT_X8R8G8B8 vs D3DFMT_A16B16G16R16. All other textures in the processing chain are D3DFMT_A16B16G16R16.


[Screenshots: Texture_In1.png vs Texture_In2.png]


Is there a way to do the pixel conversion on the GPU without having such distortion?
