damiena

Direct3D 11 Read/Write Texture render target


Hello,

I've started a pixel shader 4.0 project emulating old hardware, but I've come across a limitation where a resource can't be bound as both an input and an output of the pipeline at the same time. I'm thinking that to get around this limitation, I can update the project to use pixel shader 5.0 and Unordered Access Views. However, the samples in the DXSDK only cover Compute Shaders (which I'm not using). So I'm wondering: what is the simplest way to bind a texture as a render target and also use it as the input of a pixel shader? For those who are familiar with GL_NV_texture_barrier, what is the simplest way to emulate that functionality in Direct3D?

Bind a render target (I think you need to set a dummy render target for this to work even if you don't plan to write anything to it, but I'm not 100% sure) and your UAVs using OMSetRenderTargetsAndUnorderedAccessViews, then use the UAVs in the pixel shader as read/write.

In your pixel shader declare it as RWTexture2D<yourFormat> reTexture.
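Roughly like this (just a sketch; dummyRTV and uav are placeholder names for views you have already created):

[code]
// Sketch: one dummy render target plus one UAV. The UAV slots come right after
// the render target slots, so with 1 RTV the UAV starts at slot 1 (register u1 in HLSL).
ID3D11RenderTargetView* rtvs[1] = { dummyRTV };
ID3D11UnorderedAccessView* uavs[1] = { uav };

context->OMSetRenderTargetsAndUnorderedAccessViews(
    1, rtvs,      // NumRTVs, render target views
    NULL,         // no depth-stencil view
    1, 1, uavs,   // UAVStartSlot (= NumRTVs), NumUAVs, UAVs
    NULL);        // pUAVInitialCounts, only used for append/consume buffers
[/code]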

[quote name='n3Xus' timestamp='1305544247' post='4811383']
Bind a render target (I think you need to set a dummy render target for this to work even if you don't plan to write anything to it, but I'm not 100% sure) and your UAVs using OMSetRenderTargetsAndUnorderedAccessViews, then use the UAVs in the pixel shader as read/write.

In your pixel shader declare it as RWTexture2D<yourFormat> reTexture.
[/quote]

Looking at the documentation, RWTexture2D doesn't have the Load method, which I require. My code works as follows:


1) Render everything to Framebuffer1, which is a texture with a render target view.

2) Before calling SwapChain->Present, copy Framebuffer1's contents to Framebuffer2, which is a texture with a shader resource view, and draw Framebuffer2 as a textured quad on the screen.

Copying the contents of the previous frame lets me use them in the current frame (a la ping-ponging), but I've run into the limitation I wrote about in the first post: I have to use the 'updated' contents of Framebuffer2 even though it hasn't been updated yet (Present hasn't been called yet).
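For context, the copy I do just before Present is essentially this (Framebuffer1Tex/Framebuffer2Tex are the underlying ID3D11Texture2D objects, same size and format; the names are just placeholders):

[code]
// Mirror the render target texture into the shader-resource texture, then present.
context->CopyResource(Framebuffer2Tex, Framebuffer1Tex); // dest, source
swapChain->Present(0, 0);
[/code]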

Is your suggestion that I should create another render target view (Framebuffer3) and a UAV and do the following:

[code]
ID3D11RenderTargetView* rtv[2] = { Framebuffer1, Framebuffer3 };
ID3D11UnorderedAccessView* uav[1] = { pixelshaderUAV };

context->OMSetRenderTargetsAndUnorderedAccessViews(2, rtv, NULL, 0, 1, uav, 0);
[/code]

Should Framebuffer3 and pixelshaderUAV be bound to the same texture? Also, what modifications would need to be made to the following pixel shader? Simply add RWTexture2D<float4> to the top?

[code]
Texture2D ColorLookup : register(t0);

float4 main(float4 pos : SV_POSITION, float3 tex : TEXCOORD) : SV_Target
{
    float4 IndexLookup = ColorLookup.Load(int3(tex)); // Load takes integer texel coordinates (x, y, mip)

    // ... do some calculations on IndexLookup

    return ColorLookup.Load(/* calculated value based on uniforms and IndexLookup */);
}
[/code]

I don't understand what you are trying to do.
What are you 'actually' trying to do?

You wrote:
[quote]...where I have to use 'updated' contents of Framebuffer2 even though it hasn't been updated yet (Present hasn't been called yet).[/quote]
This confuses me. The swap chain's Present just copies the swap chain's render target to the front buffer so you can see it on the display. Even if you don't call Present, everything still gets rendered internally; you just won't see any results on your screen.



One mistake you made:

This line:
[quote]context->OMSetRenderTargetsAndUnorderedAccessViews (2, rtv, NULL, 0, 1, uav,0); [/quote]
should be:
[quote]context->OMSetRenderTargetsAndUnorderedAccessViews (2, rtv, NULL, 2, 1, uav,0); [/quote]
The UAV slots start right after the render-target slots, so with two RTVs bound the first UAV has to go in slot 2. See the docs for the details.
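To spell out how the slots line up in your case (same placeholder names as your snippet):

[code]
// Output-merger slots with this call:
//   slot 0: rtv[0] (Framebuffer1)
//   slot 1: rtv[1] (Framebuffer3)
//   slot 2: uav[0] (pixelshaderUAV) -> declare the RWTexture2D at register(u2) in HLSL
context->OMSetRenderTargetsAndUnorderedAccessViews(2, rtv, NULL, 2, 1, uav, 0);
[/code]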

[quote name='n3Xus' timestamp='1305569126' post='4811552']
I don't understand what you are trying to do.
What are you 'actually' trying to do?
[/quote]

I'm trying to emulate the GPU of a console. Instead of creating one texture and binding it as both a render target view (RTV) and a shader resource view (SRV), I create two textures: one as an RTV and the other as an SRV. After I finish the rendering commands into the RTV, I copy its contents into the SRV texture and present the SRV's contents to the screen/front buffer using a texture-mapped quad. The SRV is also used by rendering commands to the RTV that may internally reference data previously rendered into the RTV, which avoids the read/write hazards the Direct3D runtime is so strict about. My problem occurs when the console's GPU starts to reuse the contents of the RTV before the SRV has been updated with the latest changes (which only happens on a Present or on a texture upload event).
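To make that concrete, the two textures are created roughly like this (a sketch with placeholder names, error handling omitted):

[code]
// Two textures of identical size/format: one only written through an RTV,
// the other only read through an SRV.
D3D11_TEXTURE2D_DESC desc = {};
desc.Width            = fbWidth;   // emulated framebuffer size (placeholders)
desc.Height           = fbHeight;
desc.MipLevels        = 1;
desc.ArraySize        = 1;
desc.Format           = DXGI_FORMAT_R8G8B8A8_UNORM; // whichever format I end up using
desc.SampleDesc.Count = 1;
desc.Usage            = D3D11_USAGE_DEFAULT;

desc.BindFlags = D3D11_BIND_RENDER_TARGET;
device->CreateTexture2D(&desc, NULL, &Framebuffer1Tex);

desc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
device->CreateTexture2D(&desc, NULL, &Framebuffer2Tex);

device->CreateRenderTargetView(Framebuffer1Tex, NULL, &Framebuffer1RTV);
device->CreateShaderResourceView(Framebuffer2Tex, NULL, &Framebuffer2SRV);
[/code]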

So you do this:
Render everything onto RTV.
Copy from RTV into SRV.
Then use the contents of the SRV as a texture on a fullscreen quad.
Draw this fullscreen quad to the swap chain's RTV. Why not draw directly onto the swap chain's RTV and then copy from the swap chain's RTV into the SRV?

[quote]The SRV is also used with rendering commands to the RTV [/quote]
Does this sentence say: "I use the previous frame's data (which is now stored in the SRV) for some effects/whatever when I render this frame onto the RTV"?


[quote]...that may internally reference data uploaded to the RTV[/quote]
What do you mean by this?

So the problem is that your RTV gets overwritten before you have the chance to copy it to the SRV? If so, this sounds more like a logic error. Are you doing some crazy multithreading?

And the Present we are talking about is actually the emulated GPU's Present, not the DX11 Present?

[quote name='n3Xus' timestamp='1305573744' post='4811582']
So you do this:
Render everything onto RTV.
Copy from RTV into SRV.
Then use the contents of the SRV as a texture on a fullscreen quad.

[/quote]


Pretty much.

[quote]Why not draw directly onto the swap chain's RTV and then copy from the swap chain's RTV into the SRV?[/quote]

That technique was used years ago when GPUs didn't support offscreen render targets, and it required a lot of guesswork to determine which part of the RTV was being presented to the screen. The whole SRV isn't presented when drawing the full-screen quad: only a sub-section of it, because the console supported double buffering. The remaining portions of the SRV (not displayed at DX11's Present call) may be used for texture mapping in later RTV frames.
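As an illustration of the sub-section display, the quad's texture coordinates come from the displayed rectangle, something like this (placeholder names):

[code]
// Convert the displayed pixel rectangle into UVs, so only that sub-section
// of the SRV texture ends up on screen.
float u0 = displayRect.left   / (float)fbWidth;
float v0 = displayRect.top    / (float)fbHeight;
float u1 = displayRect.right  / (float)fbWidth;
float v1 = displayRect.bottom / (float)fbHeight;
// The quad's four vertices then use (u0,v0)..(u1,v1) as texture coordinates.
[/code]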

[quote name='n3Xus']
Does this sentence say: "I use the previous frame's data (which is now stored in the SRV) for some effects/whatever when I render this frame onto the RTV"?
[/quote]

Yes. An example of an 'effect' would be paletted textures.


[quote name='n3Xus']

[quote]...that may internally reference data uploaded to the RTV[/quote]


What do you mean by this?
[/quote]

I mean that, similar to how 'modern' GPUs let users render to an offscreen texture and then use that texture when presenting to the screen, the console's GPU lets the user render to a portion of the screen that isn't being displayed and use the results for texture mapping (e.g. updated texture palettes). No multithreading is being done. One way to 'solve' this problem would be to copy the result of every draw call from the RTV to the SRV, but that would be very slow, especially when there are hundreds of draw calls per frame.

[quote]And the Present we are talking about is actually the emulated GPU's Present, not the DX11 Present?[/quote]


They are the same. When the emulated GPU says to present, I copy its contents to the SRV and use DX11's present.

You should've said that you were using one "big" texture for everything in the very first post :) . I don't have any experience with console programming, but I read something about this "one texture for everything" approach once.

ID3D11DeviceContext::UpdateSubresource has a 'const D3D11_BOX *pDstBox' parameter. By setting it you can copy directly into just a certain region of your texture, and it doesn't matter whether that texture is bound as an RTV or an SRV.
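Something along these lines (a sketch; the names and the 4-bytes-per-pixel assumption are mine):

[code]
// Update a 64x64 region at (x, y) of an R8G8B8A8 texture from CPU memory.
D3D11_BOX box;
box.left  = x;  box.right  = x + 64;
box.top   = y;  box.bottom = y + 64;
box.front = 0;  box.back   = 1;

deviceContext->UpdateSubresource(
    targetTexture,  // destination resource
    0,              // destination subresource (mip 0)
    &box,           // region to overwrite
    srcPixels,      // pointer to the source data
    64 * 4,         // source row pitch (64 pixels * 4 bytes)
    0);             // source depth pitch (unused for 2D)
[/code]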

Using a UAV is definitely a way to go, since it will allow you to read from and write to the entire texture at the same time. The problem is that using a UAV can be slow.

You could try changing the D3D11_VIEWPORT structure to tell DX which part of the texture you want to write to (the other parts stay as they are; only the part of the texture inside the viewport gets overwritten).
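For example (a sketch with placeholder values):

[code]
// Restrict rendering to a 256x256 region at (x, y) of the render target;
// pixels outside the viewport are left as they are.
D3D11_VIEWPORT vp;
vp.TopLeftX = (FLOAT)x;
vp.TopLeftY = (FLOAT)y;
vp.Width    = 256.0f;
vp.Height   = 256.0f;
vp.MinDepth = 0.0f;
vp.MaxDepth = 1.0f;
deviceContext->RSSetViewports(1, &vp);
[/code]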



DirectX isn't designed for having one big read/write texture for everything, so you may want to experiment with it.


If you were to use a UAV, would you still need to ping-pong?

[quote name='n3Xus' timestamp='1305579837' post='4811626']
The ID3D11DeviceContext::UpdateSubresource has a parameter 'const D3D11_BOX *pDstBox'. By setting this parameter you can directly copy only to a certain region of your texture and it doesn't matter if that texture is an RTV or SRV.
[/quote]

I've been using ID3D11DeviceContext::CopySubresourceRegion for all of my texture uploads that are smaller than the original texture, and ID3D11DeviceContext::CopyResource just before calling IDXGISwapChain::Present. It works very well, considering that the number of texture uploads per frame is usually small :). The documentation from both Nvidia and AMD discourages using ID3D11DeviceContext::UpdateSubresource.
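For what it's worth, my upload path looks roughly like this (a sketch with placeholder names):

[code]
// Copy a w x h region of an upload texture into the framebuffer texture at (dstX, dstY).
D3D11_BOX srcBox;
srcBox.left  = 0;  srcBox.right  = w;
srcBox.top   = 0;  srcBox.bottom = h;
srcBox.front = 0;  srcBox.back   = 1;

context->CopySubresourceRegion(
    Framebuffer1Tex, 0,   // destination resource, subresource
    dstX, dstY, 0,        // destination x, y, z
    uploadTex, 0,         // source resource, subresource
    &srcBox);             // region of the source to copy

// ...and just before Present, mirror the whole thing into the SRV texture:
context->CopyResource(Framebuffer2Tex, Framebuffer1Tex);
[/code]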

The framebuffer is ~1-2MB (depending on the DXGI format I choose), and I'm given the (texture) coordinates of which section to display prior to calling Present, so the viewport isn't the issue.

[quote]If you were to use a UAV, would you still need to ping-pong?[/quote]

Theoretically, I wouldn't have to. I've read elsewhere of instances where it can be slow, but I won't know until I try it out :). My problem is that I haven't seen any tutorials showing the use of UAVs without a Compute Shader.

Here's how you do it:

Create a texture resource for a dummy render target and a separate texture resource for your unordered access view (do this just to be safe; maybe you could create just one texture resource and use it both as a render target and as a UAV, but I'm not sure).

I'll skip the render target creation code.



Create the resource and uav:

[code]
ID3D11Texture2D* textureResourceForUAV;
ID3D11UnorderedAccessView* uav;

D3D11_TEXTURE2D_DESC d;
d.ArraySize = 1;
d.BindFlags = D3D11_BIND_UNORDERED_ACCESS;
d.CPUAccessFlags = 0;
d.Format = DXGI_FORMAT_R32_UINT;          // Change this to what you need
d.Height = displayMode.GetScreenHeight(); // Change this to what you need
d.Width = displayMode.GetScreenWidth();   // Change this to what you need
d.MipLevels = 1;
d.MiscFlags = 0;
d.SampleDesc.Count = 1;
d.SampleDesc.Quality = 0;
d.Usage = D3D11_USAGE_DEFAULT;
HR(dev->CreateTexture2D(&d, NULL, &textureResourceForUAV));

D3D11_UNORDERED_ACCESS_VIEW_DESC du;
du.Format = d.Format;
du.Texture2D.MipSlice = 0;
du.ViewDimension = D3D11_UAV_DIMENSION_TEXTURE2D;
HR(dev->CreateUnorderedAccessView(textureResourceForUAV, &du, &uav));[/code]



DX11 will probably complain if you create an SRV for textureResourceForUAV and have it bound to the pipeline at the same time as the UAV, so I'll assume you keep it in read/write mode all the time (aka slow mode ^^).


Bind it like this:
[code]deviceContext->OMSetRenderTargetsAndUnorderedAccessViews(1,&yourDummyRenderTarget,NULL,1,1,&uav,0);[/code]

Now the shader code:
Since I don't know how/when you read from or write to the UAV texture, I will assume every shader can read and write it.

[code]// Add this at the top; also change the <float4> to whatever format you need.
// UAVs live in the u registers; with the bind call above (UAVStartSlot = 1) this one is u1.
RWTexture2D<float4> uavTexture : register(u1);

// And then in your pixel shader
float4 pixelShaderMain(in PS_IN input) : SV_Target0 // Match the float4 to whatever format your dummy render target is
{
    // Read from the UAV.
    // The value in the [ ] operator is an int2, so if you have texcoords in the [0,1] range
    // be sure to convert them to integer coordinates by multiplying them by the texture size.
    float4 a = uavTexture[int2(input.texcoords.xy * textureSize.xy)];

    // Write
    uavTexture[int2(30, 500)] = float4(1, 0, 0, 0);

    return float4(0, 0, 0, 0); // It doesn't matter what you return to the dummy render target
}[/code]

Hope it helps, ask if anything is unclear :)

Thanks a lot!! It appears to be working so far without slowdown. I had to modify your suggested code in line with recommendations of [url="https://encrypted.google.com/url?sa=t&source=web&cd=1&ved=0CBkQFjAA&url=http%3A%2F%2Fmsdn.microsoft.com%2Fen-us%2Flibrary%2Fff728749(v%3Dvs.85).aspx&ei=RtXRTYSDO8negQemj4XPCw&usg=AFQjCNGI0Ni1RYPtt4zRMrSxULpCJfcvUA"]Unpacking and Packing DXGI_FORMAT for In-Place Image Editing[/url], as I had gotten a compiler warning/error about writing to the float4 RWTexture2D.
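In case it helps anyone else, my understanding of the change (a sketch with my own names, not the article's code): the texture is created with a typeless format, the UAV reinterprets it as one uint per pixel, and the shader packs/unpacks the four 8-bit channels manually instead of writing a float4 directly.

[code]
D3D11_TEXTURE2D_DESC d = {};
d.Width = fbWidth;  d.Height = fbHeight;
d.MipLevels = 1;    d.ArraySize = 1;
d.Format = DXGI_FORMAT_R8G8B8A8_TYPELESS;   // typeless so the UAV can reinterpret it
d.SampleDesc.Count = 1;
d.Usage = D3D11_USAGE_DEFAULT;
d.BindFlags = D3D11_BIND_UNORDERED_ACCESS;
HR(dev->CreateTexture2D(&d, NULL, &textureResourceForUAV));

D3D11_UNORDERED_ACCESS_VIEW_DESC du = {};
du.Format = DXGI_FORMAT_R32_UINT;           // one uint per pixel, packed/unpacked in HLSL
du.ViewDimension = D3D11_UAV_DIMENSION_TEXTURE2D;
du.Texture2D.MipSlice = 0;
HR(dev->CreateUnorderedAccessView(textureResourceForUAV, &du, &uav));
// In HLSL the texture is then declared as RWTexture2D<uint> and the R8G8B8A8 value
// is packed/unpacked with shifts and masks, as described in the linked article.
[/code]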
