Rendering Shader into Texture

13 comments, last by MysteryX 8 years, 8 months ago

I'm writing a DLL that takes video frames in, processes them through an HLSL shader with DirectX 9, and then returns the result in memory. It works when rendering to a backbuffer with pixel format D3DFMT_X8R8G8B8, but I need to run the shader and get the result in D3DFMT_A32B32G32R32F or D3DFMT_A16B16G16R16F, which the backbuffer doesn't support.

If I add this line of code, I can run with the D3DFMT_A32B32G32R32F format, but the shader no longer runs.

m_pDevice->SetRenderTarget(0, m_pTextureSurface);

I'm starting to think that the reason the HLSL shader isn't being run is that, for optimization purposes, the commands only get executed when the scene is presented to the display. What makes me think this is that I got a crash when rendering a video larger than the back buffer... but instead of crashing when rendering or presenting the scene, it only crashed at the very end, when calling GetRenderTarget().

How can I force the device to render the result of the HLSL shader into another texture in D3DFMT_A32B32G32R32F format, without going through the backbuffer and its more limited format?

I don't need to display anything to the screen. In fact, this DLL has no UI.

Creating a renderable texture and using SetRenderTarget is correct, so it "should" be working.

Do you have details on any of your crashes?

The source code is here

https://github.com/mysteryx93/AviSynthShader/blob/master/VideoPresenter/D3D9RenderImpl.cpp

I call Initialize, then ProcessFrame for every frame.

I don't know if any of you have time to look through the source code, but I really don't know what else I can provide to identify the issue. It works without SetRenderTarget with D3DFMT_X8R8G8B8 pixel format, and the minute I add SetRenderTarget, I get the original image back without having the shader applied.

Are you always running this code?

	HR(m_pDevice->ColorFill(m_pTextureSurface, NULL, D3DCOLOR_ARGB(0xFF, 0, 0, 0)));
	HR(m_pDevice->StretchRect(m_pOffsceenSurface, NULL, m_pTextureSurface, NULL, D3DTEXF_LINEAR));
Also, try 16-bit halfs, and try point filtering instead of linear (some hardware doesn't support linear filtering on float formats). And enable the D3D debug runtime; it will show warnings and errors.

Also try PIX or Intel's tool ( https://software.intel.com/en-us/gpa ): step through the scene and check where reality diverges from your expectations ;)
You have:
m_pDevice->SetRenderTarget(0, m_pTextureSurface);
And also:
m_pDevice->SetTexture(0, m_pTexture);

But texture/textureSurface are both views of the same resource (the former is a readable shader resource view, and the latter is a writable render target view).

You can't simultaneously have a resource bound for both reading and writing -- this is a "hazard" (race condition), so the API tries to be helpful and resolves the hazard for you, probably by setting the render target to NULL... causing the draw to have no effect.

You need your source and destination resources to be different.
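
Something like this, in other words (just a sketch with hypothetical names):

// Read from one resource, render into a different one.
HR(m_pDevice->SetTexture(0, m_pInputTexture));        // source: sampled by the shader
HR(m_pDevice->SetRenderTarget(0, m_pOutputSurface));  // destination: a surface of a *different* texture
// ... draw the full-screen quad ...
HR(m_pDevice->SetTexture(0, NULL));                   // unbind before reading the result back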

Under Windows XP, you could use the Direct3D control panel to enable D3D9 debug mode, which would have spammed error messages to the debug console about this... but I think D3D9's debug mode has been abandoned by Microsoft on newer versions of Windows...

WOW, thanks for the useful answers! Neither Google nor the StackExchange community was helping me with this, but you guys are proving to be very helpful :-)

I'll explore these tomorrow.

I haven't written the code to do it with half-floats yet, as C++ doesn't natively support them. What's the best way to do that conversion? Also, one person who was helpful but had little time said that D3DFMT_A16B16G16R16F allows values to overflow the 0-1 range, while D3DFMT_A16B16G16R16 is for when all values are within 0 and 1. With D3DFMT_A16B16G16R16, is the data stored differently from a "regular" half-float, by not having a sign bit?

StretchRect is being run.

Help me understand the proper flow of data. I'm trying to add support for passing additional textures to the shader and it's not working yet, so it appears I have to rewrite that.

What I just tried that isn't working: creating a texture in DefaultPool, and copying data from memory into its surface. LockRect failed.

What I'm currently doing for the main texture:

- CreateOffscreenPlainSurface, and copy frame data into that surface from memory

- CreateTexture as RenderTarget

- Render scene, copy frame from OffScreenPlainSurface to RenderTarget.

- Call GetRenderTarget, CreateOffscreenPlainSurface, GetRenderTargetData, and copy from OffscreenSurface back into memory

So I suppose what's missing is that an extra texture must be created as the input for the shader? I took this from a sample; perhaps it was "working" but they were missing that.

Do I need to create an OffscreenPlainSurface and a Texture for each additional input texture? What's the best way to copy from the OffscreenPlainSurface into the Texture, with StretchRect?
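
Here's roughly the upload flow I have in mind, sketched out (names like frameData are made up; error handling omitted):

// Fill a system-memory surface with the frame data, then copy it into a
// default-pool texture that the shader can sample.
IDirect3DSurface9* staging = NULL;
HR(m_pDevice->CreateOffscreenPlainSurface(width, height, D3DFMT_A32B32G32R32F,
    D3DPOOL_SYSTEMMEM, &staging, NULL));

D3DLOCKED_RECT lock;
HR(staging->LockRect(&lock, NULL, 0));  // SYSTEMMEM surfaces can be locked
for (UINT y = 0; y < height; y++)       // copy row by row; Pitch can be larger than width * 16
    memcpy((BYTE*)lock.pBits + y * lock.Pitch, frameData + y * width * 16, width * 16);
HR(staging->UnlockRect());

IDirect3DTexture9* inputTex = NULL;
IDirect3DSurface9* inputSurf = NULL;
HR(m_pDevice->CreateTexture(width, height, 1, 0, D3DFMT_A32B32G32R32F,
    D3DPOOL_DEFAULT, &inputTex, NULL));
HR(inputTex->GetSurfaceLevel(0, &inputSurf));
HR(m_pDevice->UpdateSurface(staging, NULL, inputSurf, NULL));  // SYSTEMMEM -> DEFAULT copy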

Thanks!

Float32 -> Float16 is possible, but expensive on the CPU. I use this code: http://pastebin.com/n1eBJYUT
Newer CPUs actually have native instructions to do the conversion (in the AVX instruction set), but they're not widespread/common yet :(

Blah16F is Float16 / half-float.
Blah16 is unsigned short / uint16, interpreted as a fixed-point fraction where 0 -> 0.0 and 65535 -> 1.0.
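
For what it's worth, here's a minimal sketch of both conversions. The intrinsic needs a CPU and compiler that support the F16C instructions (the native conversion I mentioned); otherwise fall back to a software routine like the pastebin above:

#include <immintrin.h>  // F16C float <-> half conversion intrinsics
#include <stdint.h>

// Float32 -> Float16 using the hardware instruction.
uint16_t FloatToHalf(float f) {
    return _cvtss_sh(f, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
}

// Float32 in [0,1] -> 16-bit unsigned fixed point, as stored by D3DFMT_A16B16G16R16.
uint16_t FloatToUNorm16(float f) {
    if (f < 0.0f) f = 0.0f;  // the format is unsigned: no sign bit, so clamp
    if (f > 1.0f) f = 1.0f;
    return (uint16_t)(f * 65535.0f + 0.5f);
}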

Thanks, this is extremely helpful. I rewrote the texture buffer logic. It currently renders only a black frame (?), but besides that, does the overall logic look good this time? A shader should take two input frames and return one frame.

You'd call Initialize, then CreateInputTexture for each input frame, CopyToBuffer for each input frame, and then ProcessFrame to get the result.

https://github.com/mysteryx93/AviSynthShader/blob/master/VideoPresenter/D3D9RenderImpl.cpp
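
So the calling code would look something like this (a sketch; the actual signatures are in the repo and may differ):

// Intended usage, simplified.
render.Initialize(/* device settings */);
render.CreateInputTexture(0, width, height);  // one per input frame
render.CreateInputTexture(1, width, height);
render.CopyToBuffer(frameA, 0);               // upload each input frame
render.CopyToBuffer(frameB, 1);
render.ProcessFrame(outputBuffer);            // run the shader, read the result back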

EDIT: Uncommenting the SetRenderTarget line got my input to display

EDIT2: The source input displays, but the shader isn't being run.

I still can't figure out why the shader isn't running. Is this the right way of processing the data?


HR(m_pDevice->ColorFill(m_pRenderTargetSurface, NULL, D3DCOLOR_ARGB(0xFF, 0, 0, 0)));
HR(m_pDevice->StretchRect(m_InputTextures[0].Memory, NULL, m_pRenderTargetSurface, NULL, D3DTEXF_LINEAR));
return m_pDevice->Present(NULL, NULL, NULL, NULL);

In this case I'm copying a texture over to the render target and expecting the shader to run on it. But what if there are two or three input textures, such as for a diff? Do I just arbitrarily copy one of the two textures over to get the shader to run?

I also noticed that this was wrong. Replacing the 1 with i still doesn't get the shader to run.


for (int i = 0; i < maxTextures; i++) {
    if (m_InputTextures[i].Texture != NULL)
        SCENE_HR(m_pDevice->SetTexture(1, m_InputTextures[i].Texture), m_pDevice);
}
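
For reference, the corrected loop. As I understand it, the stage index passed to SetTexture has to match the sampler register (s0, s1, ...) that the HLSL declares:

// Bind each input texture to the stage matching its HLSL sampler register.
// HLSL side: sampler Input0 : register(s0); sampler Input1 : register(s1); ...
for (int i = 0; i < maxTextures; i++) {
    if (m_InputTextures[i].Texture != NULL)
        SCENE_HR(m_pDevice->SetTexture(i, m_InputTextures[i].Texture), m_pDevice);
}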

What else can I look at?

Btw, in my experience, when something isn't working when it should and I just can't figure out what's wrong, it usually turns out to be something really stupid, like a typo or a basic flaw in the logic.

This topic is closed to new replies.
