

Member Since 12 Apr 2010
Offline Last Active Mar 04 2015 03:01 AM

Topics I've Started

How to initialize Texture2DArray?

02 March 2015 - 05:10 AM



I'm using DirectXTK for texture creation from *.dds files. Though the documentation says that DDSTextureLoader supports texture arrays, I couldn't figure out how to create an array. So I'm loading textures with D3DReadFileToBlob() and creating a texture from the blob with CreateDDSTextureFromMemory(). And now the interesting part - this code works fine:

CreateDDSTextureFromMemory(device,
    static_cast<uint8_t*>(textureBlob1->GetBufferPointer()),
    textureBlob1->GetBufferSize(),
    texture.ReleaseAndGetAddressOf(), nullptr);

D3D11_TEXTURE2D_DESC texElementDesc;
ComPtr<ID3D11Texture2D> tex2D;
texture.As(&tex2D);               // the loaded resource is a Texture2D
tex2D->GetDesc(&texElementDesc);  // fill the description from the loaded texture

D3D11_SHADER_RESOURCE_VIEW_DESC viewDesc{};
viewDesc.Format = texElementDesc.Format;
viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D;
viewDesc.Texture2D.MostDetailedMip = 0;
viewDesc.Texture2D.MipLevels = 1;
hr = device->CreateShaderResourceView(texture.Get(), &viewDesc, textureView.ReleaseAndGetAddressOf());

But this is only single texture. Let's change it for an array:

CreateDDSTextureFromMemory(device,
    static_cast<uint8_t*>(textureBlob1->GetBufferPointer()),
    textureBlob1->GetBufferSize(),
    texture.ReleaseAndGetAddressOf(), nullptr);

D3D11_TEXTURE2D_DESC texElementDesc;
ComPtr<ID3D11Texture2D> tex2D;
texture.As(&tex2D);               // the loaded resource is a Texture2D
tex2D->GetDesc(&texElementDesc);  // fill the description from the loaded texture

D3D11_TEXTURE2D_DESC texArrayDesc;
texArrayDesc.Width = texElementDesc.Width;
texArrayDesc.Height = texElementDesc.Height;
texArrayDesc.MipLevels = 1;
texArrayDesc.ArraySize = 1;
texArrayDesc.Format = texElementDesc.Format;
texArrayDesc.SampleDesc.Count = 1;
texArrayDesc.SampleDesc.Quality = 0;
texArrayDesc.Usage = D3D11_USAGE_DEFAULT;
texArrayDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
texArrayDesc.CPUAccessFlags = 0;
texArrayDesc.MiscFlags = 0;

D3D11_SUBRESOURCE_DATA resourceData1{};
resourceData1.pSysMem = static_cast<uint8_t*>(textureBlob1->GetBufferPointer());
resourceData1.SysMemPitch = texElementDesc.Width * sizeof(uint8_t) * 4;
resourceData1.SysMemSlicePitch = texElementDesc.Width * texElementDesc.Height * sizeof(uint8_t) * 4;

ComPtr<ID3D11Texture2D> texArray;
hr = device->CreateTexture2D(&texArrayDesc, &resourceData1, texArray.ReleaseAndGetAddressOf());

D3D11_SHADER_RESOURCE_VIEW_DESC viewDesc{};
viewDesc.Format = texArrayDesc.Format;
viewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2DARRAY;
viewDesc.Texture2DArray.MostDetailedMip = 0;
viewDesc.Texture2DArray.MipLevels = 1;
viewDesc.Texture2DArray.FirstArraySlice = 0;
viewDesc.Texture2DArray.ArraySize = 1;

hr = device->CreateShaderResourceView(texArray.Get(), &viewDesc, &textureView);

And, as you might guess, this doesn't work. Actually, it works, but strangely: some textures look cut off, and some become fully transparent. Personally, I think the problem is here:

D3D11_SUBRESOURCE_DATA resourceData1{};
resourceData1.pSysMem = static_cast<uint8_t*>(textureBlob1->GetBufferPointer());
resourceData1.SysMemPitch = texElementDesc.Width * sizeof(uint8_t) * 4;
resourceData1.SysMemSlicePitch = texElementDesc.Width * texElementDesc.Height * sizeof(uint8_t) * 4;

I know there's a way to update the texture with the context's UpdateSubresource(). But I'm interested in initializing the texture at creation.
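For what it's worth, one likely suspect is exactly that D3D11_SUBRESOURCE_DATA: GetBufferPointer() returns the whole DDS blob, which begins with a 4-byte magic and a 124-byte header, so pSysMem would point at header bytes rather than pixels, and the pitch math only holds for uncompressed 32-bit formats. A minimal sketch of the pitch math CreateTexture2D expects, assuming DXGI_FORMAT_R8G8B8A8_UNORM (the helper name is mine, not part of DirectXTK):

```cpp
#include <cassert>
#include <cstdint>

// Byte pitches D3D11_SUBRESOURCE_DATA expects for an uncompressed 32-bit
// format such as DXGI_FORMAT_R8G8B8A8_UNORM. Block-compressed formats
// (BC1..BC7) use a per-4x4-block row pitch instead, so this math does not
// apply to them. Note that pSysMem must point at the pixel data itself,
// i.e. past the 128-byte DDS magic + header, not at the start of the blob.
struct Pitches { uint32_t row; uint32_t slice; };

Pitches Rgba8Pitches(uint32_t width, uint32_t height) {
    Pitches p;
    p.row   = width * 4;        // 4 bytes per texel
    p.slice = p.row * height;   // one full mip-0 image
    return p;
}
```

For an N-element array the usual pattern is one D3D11_SUBRESOURCE_DATA entry per slice, with texArrayDesc.ArraySize = N.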

DirectCompute. How many threads can run at the same time?

27 February 2015 - 10:24 AM


This is the information I found on MSDN:

  • The maximum number of threads per group is limited to 1024.
  • The maximum dispatch dimension is limited to 65535 per dimension.

So, in theory, I can run the compute shader 65535 * 65535 * 65535 * 1024 times in one dispatch. That's a huge number, and I bet no GPU can run that many threads in parallel. But what happens if I make a call like this? Will the groups execute in order? I mean, if there are no "free" thread groups, will processing stall until some group finishes? And what is the maximum number of groups I can run simultaneously?
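To put those limits in numbers, here is a sketch using the D3D11 cs_5_0 limits; the values are reproduced as plain integers so it compiles without the Windows SDK (in d3d11.h they are the constants named in the comment):

```cpp
#include <cassert>
#include <cstdint>

// D3D11 cs_5_0 limits, normally spelled
// D3D11_CS_DISPATCH_MAX_THREAD_GROUPS_PER_DIMENSION (65535) and
// D3D11_CS_THREAD_GROUP_MAX_THREADS_PER_GROUP (1024) in d3d11.h.
constexpr uint64_t kMaxGroupsPerDim    = 65535;
constexpr uint64_t kMaxThreadsPerGroup = 1024;

// Upper bound on threads a single Dispatch() call can *request*. This is
// the number of threads launched, not the number running in parallel - the
// hardware schedules groups onto compute units as they become free, so a
// huge dispatch simply keeps the GPU fed until all groups have run.
constexpr uint64_t kMaxThreadsPerDispatch =
    kMaxGroupsPerDim * kMaxGroupsPerDim * kMaxGroupsPerDim * kMaxThreadsPerGroup;
```

How many groups actually run simultaneously depends on the specific GPU (number of compute units and occupancy), which is why the API only caps what you may request.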

Does the number of draw calls always matter?

12 February 2015 - 04:02 AM



In some places I've read that many draw calls are evil. Other resources state that a draw call by itself is not so bad, but a GPU state change is. Let's say I need to draw a lot of elements, and the only thing I need to know is the depth after the previous element. With a single draw call I can't read the depth buffer (well, actually I can, but it won't show me the correct results - it has to be synchronized after the draw call). So, if I make 1000 draw calls without any state change, how bad is that compared with a single draw call for 1000 elements? How can I measure it precisely (I bet that measuring how many nanoseconds a draw call takes won't give me a correct result)?
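For precise measurement, D3D11 offers GPU timestamp queries: two D3D11_QUERY_TIMESTAMP queries bracketing the draws, plus a D3D11_QUERY_TIMESTAMP_DISJOINT query for the tick frequency, which time the GPU itself rather than the CPU-side call. A minimal sketch of the conversion step, with the tick values as plain numbers since the query plumbing needs a live device:

```cpp
#include <cassert>
#include <cstdint>

// Converts GPU timestamp ticks to milliseconds. In a real app, `begin` and
// `end` come from two D3D11_QUERY_TIMESTAMP queries issued around the draw
// calls, and `frequency` (ticks per second) from the
// D3D11_QUERY_TIMESTAMP_DISJOINT result; here they are plain numbers.
double TicksToMs(uint64_t begin, uint64_t end, uint64_t frequency) {
    return static_cast<double>(end - begin) * 1000.0
         / static_cast<double>(frequency);
}
```

The disjoint query also reports whether the counter became unreliable (e.g. due to a clock change) mid-measurement, in which case the sample should be discarded.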

Clip (mask out) child by parent bounds.

11 February 2015 - 02:54 AM



I'm working on a GUI. I have a parent-child relationship, and I want to make the parent "maskable", i.e. every child pixel outside the parent should be discarded. The problem is that I render my GUI in a single pass and can't figure out how to use the stencil buffer (I think it's not possible in my case). Also, my elements are semi-transparent.


I render all elements without depth testing - I submit geometry in order, from deepest to nearest. And I'm trying to achieve the following:




I numbered the quads in order: 1 is the deepest, then 2; 3 is a child of 2 and should be clipped; 4 is above 2 (and above 3, since it's a child). All quads are semi-transparent.


As you can see, stenciling will not help here. So I tried a different approach: every element has an id, and every child also stores its parent's id. In the fragment shader, when I render a parent, I write its id to a RWTexture. Then, when I render a child, I check whether the RWTexture holds the parent's id; if not, I discard the pixel. That worked amazingly well! But here http://www.gamedev.net/topic/665588-does-gpu-guaranties-order-of-execution/ clever guys told me that pixel shaders execute in random order, so writes to a RWTexture happen in random order too. So it's possible that a child will try to read before its parent has written (though I never saw this in my tests), and I can't rely on this technique any more. Right now I don't have any ideas how to achieve clipping in a single draw call.


P.S.: the layout and number of elements is not known at compile time. Also, elements can be transformed and have rounded corners, so there's no way to analytically calculate a child pixel's position relative to its parent.

Does the GPU guarantee order of execution?

09 February 2015 - 06:31 AM



In my application I render indexed geometry. For simplicity, let's talk about two triangles, A and B. In my index buffer I put the three vertices of triangle A first and the three vertices of triangle B next. I turned off depth testing, and in my application B always appears above A.


As far as I know, the GPU is extremely parallel. I imagine the vertex shader runs for all six vertices at the same time, then comes the pixel shader. I also think pixels are processed in parallel, i.e. the same screen pixel can be processed at the same time for A and B. Since I turned off depth/stencil testing, the pixel shader executes twice for a single pixel on the screen. But why does B always "win"? If both pixels are processed at the same time, there should be situations where the pixel shader for A takes longer to execute (but I never saw one). Is there a rule? Is there a guarantee?
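For reference, the setup described boils down to an index buffer like the one below. The short answer, as I understand the D3D pipeline, is that shading may run in any order, but the output merger applies per-pixel results in primitive order, i.e. the order primitives appear in the buffer:

```cpp
#include <cassert>
#include <cstdint>
#include <array>

// Index buffer from the description above: triangle A occupies indices 0..2
// and triangle B indices 3..5, so B is submitted after A. Even if A's pixel
// shader finishes later, the output merger writes each pixel's results in
// primitive order, which is why B consistently lands on top with depth
// testing off.
std::array<uint16_t, 6> indices = { 0, 1, 2, 3, 4, 5 };

// Primitive id for a position in a triangle-list index buffer.
constexpr uint32_t PrimitiveId(uint32_t indexPos) { return indexPos / 3; }
```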