Gama_Quant

DX11 ID3D11DeviceContext::OMSetRenderTargetsAndUnorderedAccessViews(...)

Hi,

I'm trying to implement Order Independent Transparency in DX11 as described [url="http://www4.atword.jp/cathy39/category/direct3d11/oit-direct3d11/"]here[/url], but I'm stuck on the following problem: my app seems unable to bind an UnorderedAccessView to the proper slot of the pixel shader. (I'm using the DX11 Effects framework for shaders.)
Although I think I bind them correctly, I'm getting this debug message:
[quote]
D3D11: INFO: ID3D11DeviceContext::DrawIndexed: The Pixel Shader unit expects an Unordered Access View at Slot 0, but none is bound. This is OK, as reads of an unbound Unordered Access View are defined to return 0 and writes are ignored. It is also possible the developer knows the data will not be used anyway. This is only a problem if the developer actually intended to bind a Unordered Access View here. [ EXECUTION INFO #2097374: DEVICE_UNORDEREDACCESSVIEW_NOT_SET ]
[/quote]

The relevant part of the pixel shader (it just writes the value 5 for every affected pixel), compiled as ps_5_0:
[CODE]
...
RWByteAddressBuffer OIT_StartOffsetBuffer : register( u0 );
[earlydepthstencil]
void PS_StoreFragments( PS_INPUT input)
{
uint x = input.position.x;
uint y = input.position.y;

// Read and update Start Offset Buffer.
uint uIndex = y * iScreenWidth + x; // get offset from position
uint uStartOffsetAddress = 4 * uIndex;
uint uOldStartOffset;
OIT_StartOffsetBuffer.InterlockedExchange( uStartOffsetAddress, 5, uOldStartOffset );
return;
}
...
[/CODE]
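
For reference, the addressing used by the shader above can be mirrored on the CPU. This is only a sketch: `StartOffsetByteAddress` is a hypothetical helper name, and `screenWidth` stands in for the shader's `iScreenWidth`. It assumes one 32-bit entry per pixel in row-major order, which is what the `4 * uIndex` in the shader implies.

```cpp
#include <cassert>
#include <cstdint>

// CPU-side mirror of the shader's addressing (illustrative only).
// screenWidth plays the role of iScreenWidth in the shader.
std::uint32_t StartOffsetByteAddress(std::uint32_t x, std::uint32_t y,
                                     std::uint32_t screenWidth)
{
    assert(x < screenWidth);                          // pixel must lie inside its row
    const std::uint32_t index = y * screenWidth + x;  // one uint per pixel, row-major
    return 4u * index;                                // raw buffers are addressed in bytes
}
```

A helper like this can be handy when reading the buffer back on the CPU to check which pixels were actually written.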

Here is how my app is trying to bind the UAVs:
[CODE]
ID3D11RenderTargetView* pViewNULL[ 1 ] = { NULL };
ID3D11DepthStencilView* pDSVNULL = NULL;
m_pDevCon->OMSetRenderTargets( 1, pViewNULL, pDSVNULL );
ID3D11UnorderedAccessView * apUAVs[1] = { m_pDX_UAV };

// Initialize UAV counters
UINT initCounters[] = { 0 };
// bind render targets & UAVs m_pDevCon->OMSetRenderTargetsAndUnorderedAccessViews(0,NULL, NULL, 0, 1, &(apUAVs[0]), initCounters);
[/CODE]

Here is how I create m_pDX_UAV:
[CODE]
bool CreateUAV(UINT uiWidth, UINT uiHeight, UINT uiItemByteSize)
{
uiWidth = max(1,uiWidth);
uiHeight = max(1,uiHeight);
lx_uint32 itemCount = uiWidth * uiHeight;
uiByteSize = itemCount * uiItemByteSize;

// create buffer
D3D11_BUFFER_DESC descBuf;
memset( &descBuf, 0, sizeof( descBuf ) );
descBuf.StructureByteStride = uiItemByteSize;
descBuf.ByteWidth = uiByteSize;
descBuf.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS;
descBuf.BindFlags = D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE;
if (FAILED( m_pDev->CreateBuffer( &descBuf, NULL, &m_pDX_Buffer ) ))
{
return false;
}
// create UAV
D3D11_UNORDERED_ACCESS_VIEW_DESC descUAV;
memset( &descUAV, 0, sizeof( descUAV ) );
descUAV.ViewDimension = D3D11_UAV_DIMENSION_BUFFER;
descUAV.Buffer.FirstElement = 0;
descUAV.Format = DXGI_FORMAT_R32_TYPELESS;
descUAV.Buffer.NumElements = itemCount;
descUAV.Buffer.Flags = D3D11_BUFFER_UAV_FLAG_RAW;
if (FAILED( m_pDev->CreateUnorderedAccessView( m_pDX_Buffer, &descUAV, &m_pDX_UAV ) ))
{
return false;
}
// create SRV (a raw view uses the BUFFEREX dimension and a typeless format)
D3D11_SHADER_RESOURCE_VIEW_DESC descSRV;
memset( &descSRV, 0, sizeof( descSRV ) );
descSRV.ViewDimension = D3D11_SRV_DIMENSION_BUFFEREX;
descSRV.Format = DXGI_FORMAT_R32_TYPELESS;
descSRV.BufferEx.Flags = D3D11_BUFFEREX_SRV_FLAG_RAW;
descSRV.BufferEx.FirstElement = 0;
descSRV.BufferEx.NumElements = itemCount;
if (FAILED( m_pDev->CreateShaderResourceView( m_pDX_Buffer, &descSRV, &m_pDX_SRV ) ))
{
return false;
}
return true;
}
[/CODE]
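
A side note on the sizes used in CreateUAV: as far as I understand, a raw (R32_TYPELESS) view counts its elements in 32-bit words, so `NumElements = itemCount` only covers the whole buffer when `uiItemByteSize` is 4. The sketch below shows the arithmetic under that assumption. `RawBufferSizes` and `ComputeRawBufferSizes` are hypothetical names, not part of the original code, and no D3D calls are made.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the original post: sizes for a raw buffer.
struct RawBufferSizes
{
    std::uint32_t byteWidth;    // value for D3D11_BUFFER_DESC::ByteWidth
    std::uint32_t numElements;  // number of 32-bit words covered by the raw view
};

RawBufferSizes ComputeRawBufferSizes(std::uint32_t width, std::uint32_t height,
                                     std::uint32_t itemByteSize)
{
    assert(itemByteSize % 4u == 0u);               // raw views address whole 32-bit words
    const std::uint32_t w = width  ? width  : 1u;  // same clamp as max(1, uiWidth)
    const std::uint32_t h = height ? height : 1u;  // same clamp as max(1, uiHeight)
    const std::uint32_t byteWidth = w * h * itemByteSize;
    return { byteWidth, byteWidth / 4u };
}
```

With 4-byte items the two counts coincide (for example 640 x 480 gives 307200 elements), which is the case the start offset buffer in this thread relies on.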

Does anybody have any idea why this is happening? Does the order matter between binding the UAVs and applying the shader pass via ID3DX11EffectPass::Apply(...)? (Currently I bind the UAVs first and then call Apply().) Do I need to bind a DepthStencilView?

Thanks for any suggestion.

[quote]
ID3D11RenderTargetView* pViewNULL[ 1 ] = { NULL };
ID3D11DepthStencilView* pDSVNULL = NULL;
m_pDevCon->OMSetRenderTargets( 1, pViewNULL, pDSVNULL );
ID3D11UnorderedAccessView * apUAVs[1] = { m_pDX_UAV };

// Initialize UAV counters
UINT initCounters[] = { 0 };
// bind render targets & UAVs m_pDevCon->OMSetRenderTargetsAndUnorderedAccessViews(0,NULL, NULL, 0, 1, &(apUAVs[0]), initCounters);
[/quote]

FYI: according to the documentation, calling OMSetRenderTargets with NULL first is redundant, as the call to OMSetRenderTargetsAndUnorderedAccessViews should clear any conflicting render target and depth stencil views. Edited by Hornsj3

Oh, it's just a bad copy & paste. Of course, in the actual code the call is not commented out:
[CODE]
ID3D11UnorderedAccessView * apUAVs[1] = { m_pDX_UAV };
// Initialize UAV counters
UINT initCounters[] = { 0 };
// bind render targets & UAVs
m_pDevCon->OMSetRenderTargetsAndUnorderedAccessViews(0,NULL, NULL, 0, 1, &(apUAVs[0]), initCounters);
[/CODE]

Setting the render targets to NULL beforehand really is redundant, thanks. But removing it doesn't solve the binding problem.

I used PIX to track the API calls and found out that the DX Effects framework itself works with UAVs and binds them to the shaders. Now I'm twice as confused as I was before... :D It seems that the Effects framework completely overrides the standard API calls. Oh my...

Does anybody have any experience with using UAVs through the Effects framework?

When I used PIX, the app crashed. This is the part of the log that shows the UAVs being set by the Effects framework, which makes the app's own UAV-related calls redundant:

[source lang="cpp"]
...
PRE: <this=0x09fb1648>ID3D11DeviceContext::OMSetRenderTargetsAndUnorderedAccessViews(0, NULL, NULL, 0, 1, 0x00B575FC, 0x00B575F0) // this is apps call
Frame 000001 ........POST: <><this=0x09fb1648> ID3D11DeviceContext::OMSetRenderTargetsAndUnorderedAccessViews(0, NULL, NULL, 0, 1, 0x00B575FC, 0x00B575F0)
Frame 000001 ........PRE: <this=0x09fb13e0>ID3D11Device::CreateInputLayout(0x0CF86490, 3, 0x2B04514C, 26616, 0x00B5777C) D3D11: INFO: Create InputLayout: Name="unnamed", Addr=0x0872DBF4, ExtRef=1, IntRef=0 [ STATE_CREATION INFO #2097264: CREATE_INPUTLAYOUT ]
Frame 000001 ............PRE: AddObject(D3D11 Input Layout, 0x09F49988, 0x0872DBF4)
Frame 000001 ............POST: <TRUE> AddObject(D3D11 Input Layout, 0x09F49988, 0x0872DBF4)
Frame 000001 ........POST: <S_OK><this=0x09fb13e0> ID3D11Device::CreateInputLayout(0x0CF86490, 3, 0x2B04514C, 26616, 0x00B5777C)
Frame 000001 ........PRE: <this=0x09fb1648>ID3D11DeviceContext::IASetInputLayout(0x09F49988)
Frame 000001 ........POST: <><this=0x09fb1648> ID3D11DeviceContext::IASetInputLayout(0x09F49988)
Frame 000001 ........PRE: <this=0x09fb1648>ID3D11DeviceContext::IASetVertexBuffers(0, 1, 0x2ADF9768, 0x00B57770, 0x00B57764)
Frame 000001 ........POST: <><this=0x09fb1648> ID3D11DeviceContext::IASetVertexBuffers(0, 1, 0x2ADF9768, 0x00B57770, 0x00B57764)
Frame 000001 ........PRE: <this=0x09fb1648>ID3D11DeviceContext::IASetIndexBuffer(0x0A0072B8, DXGI_FORMAT_R16_UINT, 0)
Frame 000001 ........POST: <><this=0x09fb1648> ID3D11DeviceContext::IASetIndexBuffer(0x0A0072B8, DXGI_FORMAT_R16_UINT, 0)
Frame 000001 ........PRE: <this=0x09fb1648>ID3D11DeviceContext::IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP)
Frame 000001 ........POST: <><this=0x09fb1648> ID3D11DeviceContext::IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP)
Frame 000001 ........PRE: <this=0x09fb1648>ID3D11DeviceContext::UpdateSubresource(0x09F55D58, 0, NULL, 0x2B0A3694, 784, 784)
Frame 000001 ........POST: <><this=0x09fb1648> ID3D11DeviceContext::UpdateSubresource(0x09F55D58, 0, NULL, 0x2B0A3694, 784, 784)
Frame 000001 ........PRE: <this=0x09fb1648>ID3D11DeviceContext::VSSetConstantBuffers(0, 1, 0x2B0A4BC4)
Frame 000001 ........POST: <><this=0x09fb1648> ID3D11DeviceContext::VSSetConstantBuffers(0, 1, 0x2B0A4BC4)
Frame 000001 ........PRE: <this=0x09fb1648>ID3D11DeviceContext::VSSetShader(0x09F53290, NULL, 0)
Frame 000001 ........POST: <><this=0x09fb1648> ID3D11DeviceContext::VSSetShader(0x09F53290, NULL, 0)
Frame 000001 ........PRE: <this=0x09fb1648>ID3D11DeviceContext::PSSetConstantBuffers(0, 1, 0x2B0A4BEC)
Frame 000001 ........POST: <><this=0x09fb1648> ID3D11DeviceContext::PSSetConstantBuffers(0, 1, 0x2B0A4BEC)
Frame 000001 ........PRE: <this=0x09fb1648>ID3D11DeviceContext::OMSetRenderTargetsAndUnorderedAccessViews(-1, NULL, NULL, 0, 1, 0x2B0A4BF4, 0x0D1E3860) // this is Effect framework call
...
[/source]

[quote name='xoofx' timestamp='1340095177' post='4950517']
You should set the OIT_StartOffsetBuffer variable on the Effect (using AsUnorderedAccessView() on the variable) rather than calling OMSetRenderTargetsAndUnorderedAccessViews directly. (Check Effects11/EffectRuntime.cpp:205 in the DirectX SDK.)
[/quote]

Yes, you're right. I had just started analyzing Effects11 and found it myself... I've also noticed that Effects11 provides no support for setting a UAV counter's initial value, so I had to change it a bit. Now it seems to be working properly. :D
Thanks a lot to everyone.

To sum up this topic:
[quote]If you're using the Effects11 framework and want to bind an UnorderedAccessView for a shader to write to, do NOT use OMSetRenderTargetsAndUnorderedAccessViews (it has no effect, because the framework rebinds the UAV slot itself when the pass is applied); use ID3DX11EffectUnorderedAccessViewVariable::SetUnorderedAccessView(pUAV) instead. Beware that the standard Effects11 framework does NOT provide any way to set the UAVs' initial counter values! (This can be added quickly and easily, though.)[/quote]
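
The ordering problem behind this summary can be shown with a toy model. This is deliberately NOT the real D3D11 or Effects11 API: `Uav`, `DeviceContext`, `EffectPass` and their methods are hypothetical stand-ins that only mimic the sequence visible in the PIX log, where the framework issues its own OMSetRenderTargetsAndUnorderedAccessViews call during Apply(). In real code, the fixed path would presumably look like `pEffect->GetVariableByName("OIT_StartOffsetBuffer")->AsUnorderedAccessView()->SetUnorderedAccessView(pUAV)` before applying the pass.

```cpp
#include <cassert>

// Toy model, NOT the real D3D11 / Effects11 API: every type here is a
// hypothetical stand-in used only to show the order of state changes.
struct Uav { int id; };

struct DeviceContext
{
    const Uav* psUavSlot0 = nullptr;  // what the pixel shader sees at slot u0
    void BindUav(const Uav* uav) { psUavSlot0 = uav; }
    bool DrawSeesUav() const { return psUavSlot0 != nullptr; }
};

struct EffectPass
{
    const Uav* uavVariable = nullptr;  // value stored in the effect variable
    void SetUavVariable(const Uav* uav) { uavVariable = uav; }
    // Apply() pushes the effect's own state to the context, replacing
    // whatever the application bound to that slot earlier.
    void Apply(DeviceContext& ctx) const { ctx.BindUav(uavVariable); }
};

// Original approach: bind directly on the context, then Apply().
bool DrawWithDirectBind(const Uav& uav)
{
    DeviceContext ctx;
    EffectPass pass;
    ctx.BindUav(&uav);  // application binds the UAV
    pass.Apply(ctx);    // effect overwrites it with its unset (null) variable
    return ctx.DrawSeesUav();
}

// Fixed approach: hand the UAV to the effect variable, then Apply().
bool DrawWithEffectVariable(const Uav& uav)
{
    DeviceContext ctx;
    EffectPass pass;
    pass.SetUavVariable(&uav);
    pass.Apply(ctx);
    return ctx.DrawSeesUav();
}
```

In this model the direct bind is lost as soon as the pass is applied, which matches the DEVICE_UNORDEREDACCESSVIEW_NOT_SET message from the first post, while the effect-variable path leaves the UAV bound at draw time.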
