
#5175329 Is it possible to bind an ID3D11RenderTargetView to different slots at the sa...

Posted by MJP on 21 August 2014 - 01:20 PM

I haven't actually tried it, but I'm fairly certain that it won't work since it would result in ambiguity as to which pixel shader output actually gets written to memory. You could verify very easily by trying it with the debug runtimes enabled, and checking for errors in the debugger output.

#5174370 VSM without filtering the maps and easiest way to filter them

Posted by MJP on 17 August 2014 - 11:45 PM

Hardware filtering on its own is not going to give you a large penumbra. A large penumbra is achieved by using a very large filter kernel, and bilinear is only 2x2. Trilinear and anisotropic filtering only kick in once there's some minification involved (shadow map resolution is greater than the effective pixel shader resolution), and they are designed to preserve texture detail while avoiding aliasing. If you want a large penumbra that "softens" your shadows everywhere, then you should pre-filter your shadow map with a large filter kernel (try 7x7 or so).

If you were using PCF, you could not pre-filter your shadow map. You would have to filter in the pixel shader, using however many samples are necessary. Since VSM's are filterable, you can use a separable filter kernel when pre-filtering to reduce the actual number of samples required. In addition, if you use a caching scheme for your shadow maps then you can amortize the filtering cost across multiple frames by caching the filtered result. With standard shadow maps, you must pay the filtering cost every time you sample the shadow map. 
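To make the separable-filtering point concrete, here's a minimal CPU sketch (plain C++ with a hypothetical BoxBlur1D helper; a real implementation would run as two shader passes over the VSM moment texture). An NxN box kernel applied as a horizontal pass followed by a vertical pass touches 2N texels per pixel instead of N*N, and produces the same result as the full 2D kernel:

```cpp
#include <vector>

// One 1D pass of a box filter over a w x h single-channel image.
// Out-of-bounds taps are skipped and the sum renormalized (clamp-style edges).
std::vector<float> BoxBlur1D(const std::vector<float>& img, int w, int h,
                             int radius, bool horizontal)
{
    std::vector<float> out(img.size());
    for (int y = 0; y < h; ++y)
    {
        for (int x = 0; x < w; ++x)
        {
            float sum = 0.0f;
            int count = 0;
            for (int t = -radius; t <= radius; ++t)
            {
                int sx = horizontal ? x + t : x;
                int sy = horizontal ? y : y + t;
                if (sx >= 0 && sx < w && sy >= 0 && sy < h)
                {
                    sum += img[sy * w + sx];
                    ++count;
                }
            }
            out[y * w + x] = sum / count;
        }
    }
    return out;
}

// Separable blur: horizontal pass, then vertical pass over the result.
std::vector<float> BoxBlurSeparable(const std::vector<float>& img,
                                    int w, int h, int radius)
{
    return BoxBlur1D(BoxBlur1D(img, w, h, radius, true), w, h, radius, false);
}
```

A 7x7 kernel done this way costs 14 taps per pixel instead of 49, and the same trick applies to Gaussian weights since a Gaussian is also separable.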

Aside from that, VSM also lets you utilize MSAA and hardware trilinear + anisotropic filtering. MSAA will increase your rasterization resolution to reduce sub-pixel aliasing, but you won't have to pay the cost of filtering or sampling at that increased resolution. As I already mentioned, trilinear and anisotropic will prevent aliasing when there is minification involved. This usually happens when viewing surfaces at a grazing angle. Here's some images from my sample app to show you what I mean:


[Images: Shadows_PCF.png (left), Shadows_Aniso.png (right)]


The image on the left shows shadows being cast onto a ground plane, using only 2x2 PCF filtering. You can see that as you get further away from the camera, the shadows just turn into a mess of aliasing artifacts due to extreme undersampling. The image on the right is using EVSM (a variant of VSM) with mipmaps and 16x anisotropic filtering. Notice how the shadows don't have the same aliasing artifacts, and smoothly fade into a mix of shadowed and non-shadowed lighting.


#5173690 dx12 - dynamically indexable resources -what exactly it means?

Posted by MJP on 14 August 2014 - 02:49 PM

Yes, that's correct: it allows a shader to dynamically select a texture without having to use atlases or texture arrays. However there may be performance implications for doing this, depending on the hardware.

#5173689 Why does Deferred Lighting work with MSAA?

Posted by MJP on 14 August 2014 - 02:48 PM

Your understanding is correct. In order to get the "correct" MSAA results (same results as if you used forward rendering), your lighting pass has to output lighting for every subsample in each pixel. Then, your geometry pass would also have to sample only the relevant subsamples from the lighting buffer based on the coverage of the triangle over the pixel. If you don't get this then your per-pixel lighting will cause strange artifacts at edges, and the results won't really appear to be antialiased (especially when using HDR lighting with arbitrary intensities).
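As a toy illustration of "sample only the relevant subsamples" (a CPU sketch with hypothetical names, not the actual shader code): given a 4x MSAA lighting buffer and the triangle's coverage mask for a pixel, the geometry pass would average only the covered samples.

```cpp
#include <cstdint>

// Average only the subsamples covered by the current triangle.
// `samples` holds the lighting value at each of the pixel's 4 MSAA samples,
// `coverageMask` has bit i set if the triangle covers sample i.
float ResolveCoveredLighting(const float samples[4], uint32_t coverageMask)
{
    float sum = 0.0f;
    int covered = 0;
    for (int i = 0; i < 4; ++i)
    {
        if (coverageMask & (1u << i))
        {
            sum += samples[i];
            ++covered;
        }
    }
    return covered > 0 ? sum / covered : 0.0f;
}
```

Averaging all four samples regardless of coverage is exactly what causes the edge artifacts described above, since lighting from the wrong surface bleeds across the silhouette.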


Deferred lighting is useful for situations where you really need a very small G-Buffer (TBDR GPUs on phones, Xbox 360 eDRAM), but otherwise it really doesn't have any advantages over a more traditional deferred rendering setup.

#5173659 Is it possible to load asm/FX-Shaders in DirectX11

Posted by MJP on 14 August 2014 - 12:23 PM

DX11 only supports loading bytecode that was compiled by the HLSL compiler. There is no longer an assembler, so your only option is to author in HLSL. HLSL has also changed quite a bit since DX9, so your shader code will likely need to be overhauled to make it compile and work.


Also, this forum supports "code" tags for posting code if you'd like to use that in the future.

#5173224 Unbinding resources?

Posted by MJP on 12 August 2014 - 06:08 PM

There are generally 2 reasons to un-bind a resource after you're done drawing with it:


  1. To prevent DX errors and warnings caused by having a resource simultaneously bound as an input and an output. For instance, if you have a texture bound to slot 5 and then try to bind it as a render target, you will get an error from the debug runtime saying that there's a read/write conflict.
  2. To make it clear which resources are actually being used when using a graphical debugging tool such as PIX/VS Graphics Diagnostics/RenderDoc/etc.


The first one is pretty easy to handle, since you're usually not switching render targets very frequently except during post-processing. For the second one, I usually have a conditional macro that enables unbinding resources after draw calls during debug builds. 

#5172900 Directx 11 constant buffer question

Posted by MJP on 11 August 2014 - 02:38 PM

HLSL has some weird packing/alignment rules for constant buffers due to legacy hardware issues. The rules are described here, and I would suggest reading them.


What's getting you in this case is that your "w" value will get aligned to the next 16-byte boundary to satisfy the packing rules, which state that a vector type can't span a 16-byte boundary. So when w is a float3 it will start at byte offset 80, but when you use scalar types they start at byte offset 76.
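The boundary rule can be sketched numerically (a hypothetical helper, not part of any API): a vector that would cross a 16-byte boundary gets pushed up to the next one, while scalars pack tightly.

```cpp
#include <cstddef>

// Byte offset the HLSL packing rules assign to a vector of `components`
// 4-byte floats when the previous member ends at byte offset `cursor`:
// a vector may not straddle a 16-byte boundary.
size_t HlslPackOffset(size_t cursor, size_t components)
{
    size_t bytes = components * 4;
    if (cursor / 16 != (cursor + bytes - 1) / 16)  // would cross a boundary
        cursor = (cursor + 15) & ~size_t(15);      // align up to 16
    return cursor;
}
```

With a float3 "w" following 76 bytes of other members, HlslPackOffset(76, 3) gives 80; with a scalar, HlslPackOffset(76, 1) gives 76, matching the behavior described above.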

#5172530 Draw Textures as Lightflares

Posted by MJP on 09 August 2014 - 07:05 PM

Most games will only draw lens flare sprites at hand-selected locations in their level (usually light sources). Some games will use a screen-space approach similar to bloom, where bright spots have a filter kernel applied to them and the result is composited over the screen (this is most likely what's being done in the screenshot that you posted). This is nice because it affects everything on the screen, but it can be difficult to achieve interesting shapes since you have to create them using filter kernels. It also doesn't let you generate flares for off-screen features. In The Order we use a screen-space approach, but we use FFTs to apply the filtering in the frequency domain. This lets you use arbitrary kernels from a texture, which is cool. However, the FFT is expensive, so you pretty much have to do it at a low resolution.

With D3D11-level hardware you could certainly try an approach where you analyze the screen, and spawn sprites at bright locations. Append buffers are useful for this, since you can analyze each pixel and throw the bright ones into your append buffer. Then you can use DrawIndirect to render all of the sprites in the buffer. I played around with this a long time ago and it's definitely workable, but at the time I had some trouble making it temporally stable while also keeping the amount of overdraw to a minimum.

#5172529 Alpha-Test and Forward+ Rendering (Z-prepass questions)

Posted by MJP on 09 August 2014 - 06:51 PM

Alpha-tested geometry tends to mess up z-buffer compression and hierarchical z representations. So usually you want to render it after your "normal" opaques, so that the normal geometry can get the benefit of full-speed depth testing.


As for why they return a color from their pixel shader...I have no idea. In our engine we use a void return type for our alpha-tested depth-only pixel shader.

#5172526 Point light shadowmapping

Posted by MJP on 09 August 2014 - 06:35 PM

It looks like you're not setting up your depth stencil views correctly. If you want 6 DSV's where each one targets a single face of your cube map, then you should use D3D11_DSV_DIMENSION_TEXTURE2DARRAY and re-enable the code that you commented out for specifying the array slice.

#5172126 Sampler questions

Posted by MJP on 07 August 2014 - 03:00 PM

  1. It is completely valid for a texture to have only 1 mip level and still get sampled with mipmapping enabled. It may not be as efficient as it could be if you used a different sampler state, but it's still valid.
  2. Sure, you can re-use the same sampler for those cases if you just want to sample them with filtering.

For things like your G-Buffer, you typically don't want to use a sampler at all: you don't want any filtering, and you want to sample using a pixel coordinate instead of [0, 1] texture coordinates. This can be done easily using the Load() function, or the array operator []. Here's an example:

Texture2D GBufferAlbedo;

float4 PSMain(in float4 ScreenPos : SV_Position) : SV_Target0
{
    float3 albedo = GBufferAlbedo[uint2(ScreenPos.xy)].xyz;

    // ...do lighting stuff

    return float4(albedo, 1.0f);
}

#5171964 Cubemap texture as depth buffer (shadowmapping)

Posted by MJP on 06 August 2014 - 05:04 PM

Argh! I had a whole bunch of explanatory text following the code, but it looks like the editor ate it. Sorry about that.


The typeless format is definitely the trickiest part of this. Basically D3D will make the distinction between depth and color formats, so if you want to alias a texture using both types then you need to use a TYPELESS format when creating the underlying texture resource. Then you use the "D" format when creating  the depth stencil view, and the "R" format when creating the shader resource view. This way the depth stencil view interprets the data as a depth buffer, and the shader resource view interprets it as a color texture.


The code I posted doesn't actually handle cubemaps, it just handles texture arrays. However, cubemaps are really just a special case of texture arrays with an array size of 6. You'll definitely want to continue using the D3D11_RESOURCE_MISC_TEXTURECUBE flag if you're creating a cubemap. To render to each face individually, you'll want to do something similar to what I'm doing in the code I pasted, where you create 6 separate depth stencil views. When you create a DSV, you can essentially bind it to a single slice of a texture array, so you can use that to effectively target a single face. Then just bind the appropriate DSV to your context when rendering. Also note that if you're using a cubemap, you'll want to use D3D11_SRV_DIMENSION_TEXTURECUBE (along with the corresponding TextureCube member of the SRV desc structure) instead of Texture2D or Texture2DArray.


The read-only DSV is just something I've used in a few special cases, you probably won't need it. Normally D3D doesn't let you bind a depth buffer to the context if you also have it simultaneously bound as a shader resource view. This is to prevent read-write ordering conflicts. However if you create a read-only DSV, then you can have it bound as both a depth buffer and an SRV simultaneously since it's read-only in both cases. However you of course can only do depth testing like this, and can't enable depth writes.

#5171775 Creating readable mipmaps in D3D11

Posted by MJP on 05 August 2014 - 08:22 PM

You need to use the RowPitch member of D3D11_MAPPED_SUBRESOURCE when reading your mapped staging texture. Staging textures can have their width padded in order to accommodate hardware requirements, so you need to take it into account in your code. Typically what you'll do is read the data one row at a time in a loop. For each iteration you'll memcpy a single row of unpadded texture data, and then increment your source pointer by the pitch size.
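The row-by-row copy looks something like this (a sketch with made-up names; in real code the mapped pointer and RowPitch come from ID3D11DeviceContext::Map on the staging texture):

```cpp
#include <cstdint>
#include <cstring>

// Copy `height` rows of tightly-packed pixel data out of a mapped staging
// texture whose rows are padded out to `rowPitch` bytes each.
void CopyPitchedRows(const uint8_t* mappedData, size_t rowPitch,
                     size_t rowBytes, size_t height, uint8_t* dest)
{
    for (size_t y = 0; y < height; ++y)
    {
        std::memcpy(dest + y * rowBytes, mappedData, rowBytes);
        mappedData += rowPitch;  // advance by the padded pitch, not rowBytes
    }
}
```

Here rowBytes would be width * bytesPerPixel for the mip level being read, while rowPitch is whatever D3D11_MAPPED_SUBRESOURCE::RowPitch reports.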

#5171771 Cubemap texture as depth buffer (shadowmapping)

Posted by MJP on 05 August 2014 - 08:11 PM

This is actually a little tricky to get right in D3D11, so I don't blame you for having trouble. Here's some code from my sample framework that should help you:


void DepthStencilBuffer::Initialize(ID3D11Device* device,
                                    uint32 width,
                                    uint32 height,
                                    DXGI_FORMAT format,
                                    bool32 useAsShaderResource,
                                    uint32 multiSamples,
                                    uint32 msQuality,
                                    uint32 arraySize)
{
    uint32 bindFlags = D3D11_BIND_DEPTH_STENCIL;
    if (useAsShaderResource)
        bindFlags |= D3D11_BIND_SHADER_RESOURCE;

    // Pick a TYPELESS format for the texture so it can be aliased
    // as both a depth buffer and a shader resource
    DXGI_FORMAT dsTexFormat;
    if (!useAsShaderResource)
        dsTexFormat = format;
    else if (format == DXGI_FORMAT_D16_UNORM)
        dsTexFormat = DXGI_FORMAT_R16_TYPELESS;
    else if (format == DXGI_FORMAT_D24_UNORM_S8_UINT)
        dsTexFormat = DXGI_FORMAT_R24G8_TYPELESS;
    else
        dsTexFormat = DXGI_FORMAT_R32_TYPELESS;

    D3D11_TEXTURE2D_DESC desc;
    desc.Width = width;
    desc.Height = height;
    desc.ArraySize = arraySize;
    desc.BindFlags = bindFlags;
    desc.CPUAccessFlags = 0;
    desc.Format = dsTexFormat;
    desc.MipLevels = 1;
    desc.MiscFlags = 0;
    desc.SampleDesc.Count = multiSamples;
    desc.SampleDesc.Quality = msQuality;
    desc.Usage = D3D11_USAGE_DEFAULT;
    DXCall(device->CreateTexture2D(&desc, nullptr, &Texture));

    // Create one DSV per array slice
    for (uint32 i = 0; i < arraySize; ++i)
    {
        D3D11_DEPTH_STENCIL_VIEW_DESC dsvDesc;
        ID3D11DepthStencilViewPtr dsView;
        dsvDesc.Format = format;

        if (arraySize == 1)
        {
            dsvDesc.ViewDimension = multiSamples > 1 ? D3D11_DSV_DIMENSION_TEXTURE2DMS
                                                     : D3D11_DSV_DIMENSION_TEXTURE2D;
            dsvDesc.Texture2D.MipSlice = 0;
        }
        else
        {
            if (multiSamples > 1)
            {
                dsvDesc.ViewDimension = D3D11_DSV_DIMENSION_TEXTURE2DMSARRAY;
                dsvDesc.Texture2DMSArray.ArraySize = 1;
                dsvDesc.Texture2DMSArray.FirstArraySlice = i;
            }
            else
            {
                dsvDesc.ViewDimension = D3D11_DSV_DIMENSION_TEXTURE2DARRAY;
                dsvDesc.Texture2DArray.ArraySize = 1;
                dsvDesc.Texture2DArray.FirstArraySlice = i;
                dsvDesc.Texture2DArray.MipSlice = 0;
            }
        }

        dsvDesc.Flags = 0;
        DXCall(device->CreateDepthStencilView(Texture, &dsvDesc, &dsView));
        ArraySlices.push_back(dsView);

        if (i == 0)
        {
            // Also create a read-only DSV
            dsvDesc.Flags = D3D11_DSV_READ_ONLY_DEPTH;
            if (format == DXGI_FORMAT_D24_UNORM_S8_UINT || format == DXGI_FORMAT_D32_FLOAT_S8X24_UINT)
                dsvDesc.Flags |= D3D11_DSV_READ_ONLY_STENCIL;
            DXCall(device->CreateDepthStencilView(Texture, &dsvDesc, &ReadOnlyDSView));
            dsvDesc.Flags = 0;
        }
    }

    DSView = ArraySlices[0];

    if (useAsShaderResource)
    {
        // Pick the matching "R" format for the SRV
        DXGI_FORMAT dsSRVFormat;
        if (format == DXGI_FORMAT_D16_UNORM)
            dsSRVFormat = DXGI_FORMAT_R16_UNORM;
        else if (format == DXGI_FORMAT_D24_UNORM_S8_UINT)
            dsSRVFormat = DXGI_FORMAT_R24_UNORM_X8_TYPELESS;
        else
            dsSRVFormat = DXGI_FORMAT_R32_FLOAT;

        D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc;
        srvDesc.Format = dsSRVFormat;

        if (arraySize == 1)
        {
            srvDesc.ViewDimension = multiSamples > 1 ? D3D11_SRV_DIMENSION_TEXTURE2DMS
                                                     : D3D11_SRV_DIMENSION_TEXTURE2D;
            srvDesc.Texture2D.MipLevels = 1;
            srvDesc.Texture2D.MostDetailedMip = 0;
        }
        else
        {
            srvDesc.ViewDimension = multiSamples > 1 ? D3D11_SRV_DIMENSION_TEXTURE2DMSARRAY
                                                     : D3D11_SRV_DIMENSION_TEXTURE2DARRAY;
            srvDesc.Texture2DArray.ArraySize = arraySize;
            srvDesc.Texture2DArray.FirstArraySlice = 0;
            srvDesc.Texture2DArray.MipLevels = 1;
            srvDesc.Texture2DArray.MostDetailedMip = 0;
        }

        DXCall(device->CreateShaderResourceView(Texture, &srvDesc, &SRView));
    }
    else
        SRView = nullptr;

    Width = width;
    Height = height;
    MultiSamples = multiSamples;
    Format = format;
    ArraySize = arraySize;
}

#5171760 High cpu usage

Posted by MJP on 05 August 2014 - 07:24 PM

Your app is set up to render as quickly as it can, with no sleeping. For such a simple scene, you're probably updating and rendering hundreds of frames per second. This will naturally result in your main thread saturating a core of your CPU. There are three easy things you can do to fix this:


  1. Use VSYNC. This is done by passing "1" as the first parameter of IDXGISwapChain::Present. Doing this causes the CPU and GPU to wait for the next sync interval, which essentially prevents your app from running faster than the refresh rate of your display (typically 60Hz). If you're not doing very much in a frame, then your thread will spend a lot of time just waiting, which will keep your CPU usage low. However, if you start doing a lot of CPU work each frame, you will see your CPU usage increase.
  2. Do lots of work on the GPU. If your GPU takes a long time per frame to render, the CPU will have to wait for it, which gives it some idle time.
  3. Call Sleep(). For games this is usually not a good idea, since you want your game to achieve as high a framerate as possible. However if you're not making a game and framerate is not a concern, then it's the best way to guarantee that your thread doesn't saturate a CPU core.
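For the third option, a slightly better pattern than a fixed Sleep() is to sleep away only the remainder of a target frame interval. Here's a sketch using std::chrono (a hypothetical helper; on Windows you may also want timeBeginPeriod to improve sleep granularity):

```cpp
#include <chrono>
#include <thread>

// Sleep away whatever is left of a frame at `targetHz`, measured from
// `frameStart`. Call once per frame after all update/render work is done.
void LimitFrameRate(std::chrono::steady_clock::time_point frameStart,
                    double targetHz)
{
    using namespace std::chrono;
    const auto target = duration_cast<steady_clock::duration>(
        duration<double>(1.0 / targetHz));
    const auto elapsed = steady_clock::now() - frameStart;
    if (elapsed < target)
        std::this_thread::sleep_for(target - elapsed);
}
```

This way a heavy frame isn't penalized with a full extra sleep, while a light frame still yields the CPU instead of spinning.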