Jump to content
  • Advertisement
Holy Fuzz

DX11 What are possible causes of a hung Direct3D 11 device?

Recommended Posts

I am working on a game (shameless plug: Cosmoteer) that is written in a custom game engine on top of Direct3D 11. (It's written in C# using SharpDX, though I think that's immaterial to the problem at hand.)

The problem I'm having is that a small but understandably-frustrated percentage of my players (about 1.5% of about 10K players/day) are getting frequent device hangs. Specifically, the call to IDXGISwapChain::Present() is failing with DXGI_ERROR_DEVICE_REMOVED, and calling GetDeviceRemovedReason() returns DXGI_ERROR_DEVICE_HUNG. I'm not ready to dismiss the errors as unsolveable driver issues because these players claim to not be having problems with any other games, and there are more complaints on my own forums about this issue than there are for games with orders of magnitude more players.

My first debugging step was, of course, to turn on the Direct3D debug layer and look for any errors/warnings in the output. Locally, the game runs 100% free of any errors or warnings. (And yes, I verified that I'm actually getting debug output by deliberately causing a warning.) I've also had several players run the game with the debug layer turned on, and they are also 100% free of errors/warnings, except for the actual hung device:

[MessageIdDeviceRemovalProcessAtFault] [Error] [Execution] : ID3D11Device::RemoveDevice: Device removal has been triggered for the following reason (DXGI_ERROR_DEVICE_HUNG: The Device took an unreasonable amount of time to execute its commands, or the hardware crashed/hung. As a result, the TDR (Timeout Detection and Recovery) mechanism has been triggered. The current Device Context was executing commands when the hang occurred. The application may want to respawn and fallback to less aggressive use of the display hardware).

So something my game is doing is causing the device to hang and the TDR to be triggered for a small percentage of players. The latest update of my game measures the time spent in IDXGISwapChain::Present(), and indeed in every case of a hung device, it spends more than 2 seconds in Present() before returning the error. AFAIK my game isn't doing anything particularly "aggressive" with the display hardware, and logs report that average FPS for the few seconds before the hang is usually 60+.

So now I'm pretty stumped! I have zero clues about what specifically could be causing the hung device for these players, and I can only debug post-mortem since I can't reproduce the issue locally. Are there any additional ways to figure out what could be causing a hung device? Are there any common causes of this?

Here's my remarkably un-interesting Present() call:

SwapChain.Present(_vsyncIn ? 1 : 0, PresentFlags.None);

I'd be happy to share any other code that might be relevant, though I don't myself know what that might be. (And if anyone is feeling especially generous with their time and wants to look at my full code, I can give you read access to my Git repo on Bitbucket.)

Some additional clues and things I've already investigated:

1. The errors happen on all OS'es my game supports (Windows 7, 8, 10, both 32-bit and 64-bit), GPU vendors (Intel, Nvidia, AMD), and driver versions. I've been unable to discern any patterns with the game hanging on specific hardware or drivers.

2. For the most part, the hang seems to happen at random. Some individual players report it crashes in somewhat consistent places (such as on startup or when doing a certain action in the game), but there is no consistency between players.

3. Many players have reported that turning on V-Sync significantly reduces (but does not eliminate) the errors.

4. I have assured that my code never makes calls to the immediate context or DXGI on multiple threads at the same time by wrapping literally every call to the immediate context and DXGI in a mutex region (C# lock statement). (My code *does* sometimes make calls to the immediate context off the main thread to create resources, but these calls are always synchronized with the main thread.) I also tried synchronizing all calls to the D3D device as well, even though that's supposed to be thread-safe. (Which did not solve *this* problem, but did, curiously, fix another crash a few players were having.)

5. The handful of places where my game accesses memory through pointers (it's written in C#, so it's pretty rare to use raw pointers) are done through a special SafePtr that guards against out-of-bounds access and checks to make sure the memory hasn't been deallocated/unmapped. So I'm 99% sure I'm not writing to memory I shouldn't be writing to.

6. None of my shaders use any loops.

Thanks for any clues or insights you can provide. I know there's not a lot to go on here, which is part of my problem. I'm coming to you all because I'm out of ideas for what do investigate next, and I'm hoping someone else here has ideas for possible causes I can investigate.

Thanks again!
 

Share this post


Link to post
Share on other sites
Advertisement

Usually this problem happens because you have:

  1. Corrupted memory or similar memory error. e.g. setting a dangling pointer as a texture SRV is bad.
  2. Shader being used with uninitialized data (const buffer, tex buffer, vertex buffer, etc)
  3. Infinite loop inside shader. Be it vertex, compute, or pixel shader. This is often caused by uninitialized variables, or variables with very large values or NaNs.
  4. Some very obscure API usage the debug layer didn't catch.

However you ruled out most of these (except point #1 & #4). #1 is debugged the same way you debug any kind of memory corruption (either use a third party tool or override malloc and hook your own sanitizer).

Other causes for these issues are:

  • Out of date drivers. Seriously. This happens very often. Ask for driver version. If it's very old, ask them to update their drivers. This happens more often than you think. GPU problems that magically go away after updating drivers.
  • Overclocked / overheating systems. Simple games will often allow everything in the GPU to run at 100%, something AAA games often don't achieve (because there's usually a huge bottleneck somewhere). It would explain why using VSync helps with the problem.
  • Switchable graphics. Some notebooks may have Intel + NVIDIA GPUs combination (or Intel + AMD, but that's less common) and for some reason the system may have decided to switch the GPU (e.g. battery, thermal throttling).
  • Monitor issues. The user literally detached / unplugged the monitor to which the active GPU was rendering to. More common on laptops and Win 10 tablets.
  • Third party applications. Apps like MSI Afterburner, Plays.tv, Mumble hook themselves to D3D11 to intercept game's calls and either capture video or render overlays on top of it. They can also cause problems. Having a dump of all active processes when the game hung the GPU can be a good way to rule this out. If you find a common third app between a large percentage of these users, ask them to turn it off.

 

1 hour ago, Holy Fuzz said:

4. I have assured that my code never makes calls to the immediate context or DXGI on multiple threads at the same time by wrapping literally every call to the immediate context and DXGI in a mutex region (C# lock statement). (My code *does* sometimes make calls to the immediate context off the main thread to create resources, but these calls are always synchronized with the main thread.) I also tried synchronizing all calls to the D3D device as well, even though that's supposed to be thread-safe. (Which did not solve *this* problem, but did, curiously, fix another crash a few players were having.)
 

Errr, unless you're really really good at multithreading, you shouldn't be making D3D API calls from other threads. It's asking for a lot of problems.

This also explains why VSync diminishes the problem, since you're likely in a race condition and now the access patterns have changed.

IIRC accessing the immediate context from two threads is not allowed, even if protected by a mutex.

Update: It's allowed, but still you're asking for trouble. Also you better be synchronizing your context absolutely perfect.

Edited by Matias Goldberg

Share this post


Link to post
Share on other sites

Matias, that's very helpful info, thanks a lot! It gives me some good next-steps to look into.

 

25 minutes ago, Matias Goldberg said:

Infinite loop inside shader. Be it vertex, compute, or pixel shader. This is often caused by uninitialized variables, or variables with very large values or NaNs.

Can this only happen if I have explicit loops in my shader code, or it can it also happen by passing bad values to an intrinsic function?

 

25 minutes ago, Matias Goldberg said:

Simple games will often allow everything in the GPU to run at 100%, something AAA games often don't achieve (because there's usually a huge bottleneck somewhere). It would explain why using VSync helps with the problem.

I had a similar thought, so I added a by-default 100 FPS limit to my game. No discernible drop in device hangs though.

 

24 minutes ago, Matias Goldberg said:

Switchable graphics. Some notebooks may have Intel + NVIDIA GPUs combination (or Intel + AMD, but that's less common) and for some reason the system may have decided to switch the GPU (e.g. battery, thermal throttling).

Would this commonly cause a hang, or a different DEVICE_REMOVED error?

 

33 minutes ago, Matias Goldberg said:

Errr, unless you're really really good at multithreading, you shouldn't be making D3D API calls from other threads. It's asking for a lot of problems.

 

Is this just because multithreading is hard to get right, or additionally because there could be underlying problems in D3D/drivers that could be causing problems when called from multiple threads even with proper synchronization?

 

Thanks again for your help! Much appreciated.

Share this post


Link to post
Share on other sites
1 hour ago, Holy Fuzz said:

Can this only happen if I have explicit loops in my shader code, or it can it also happen by passing bad values to an intrinsic function?

I'm not sure I understand by "bad values to an intrinsic function".

As for your loops, if they look like these:

for( int i=0; i<4; ++i )
{
}
                   
//Or this:
#define LOOP_COUNT 4
for( int i=0; i<LOOP_COUNT; ++i )
{
}

It's fine. But if it looks like this:

uniform int myConstValue;

for( int i=0; i<myConstValue; ++i )
{
}

Then the value you pass to myConstValue is potentially dangerous (you better never send an insanely huge value)

1 hour ago, Holy Fuzz said:

Would this commonly cause a hang, or a different DEVICE_REMOVED error?

Normally yes, but lots of things can happen to report the wrong enum (driver bugs, the GPU actually hung while switching)

1 hour ago, Holy Fuzz said:

Is this just because multithreading is hard to get right, or additionally because there could be underlying problems in D3D/drivers that could be causing problems when called from multiple threads even with proper synchronization?

Both. Multithreading is hard to get right. You said you properly put a mutex around the immediate context... but do you really put the mutex on every single usage of the immediate context? Is it also possible the mutex is malfunctioning (i.e. unlocking from a thread without locking it first)?

Additionally, a driver may be reading data from the immediate context and assuming it's fully single threaded so it begins to read the data from a worker thread while you're actually still writing to it from a secondary thread. Technically, this would be a driver bug. It may even be fixed by now, but your user could be running an old driver.

Edited by Matias Goldberg

Share this post


Link to post
Share on other sites

Oh btw on loops:

If your loop is based on equality of floating point, then there could be issues. For example:

for( float x=0; x == 0.3; x += 0.1 )
{
}

May spin forever due to precision issues.

Share this post


Link to post
Share on other sites
7 hours ago, Holy Fuzz said:

My code *does* sometimes make calls to the immediate context off the main thread to create resources

Resource creation happens on the device, not the context, and the device is already thread safe. If all you're doing is creating resources, you probably don't need to share the immediate context like this.

Share this post


Link to post
Share on other sites
3 hours ago, Hodgman said:

Resource creation happens on the device, not the context, and the device is already thread safe. If all you're doing is creating resources, you probably don't need to share the immediate context like this.

Yeah, you're right. I think I was thinking of updating resources after creation, though looking over my code again, I don't think I actually ever do that on anywhere but the main thread.

That being said, I fixed another GPU crash some players were experiencing simply by wrapping calls to the device in lock statements, so I'm not convinced that all drivers are as thread-safe as they're supposed to be.

Share this post


Link to post
Share on other sites
On 7/28/2017 at 4:32 PM, Holy Fuzz said:

That being said, I fixed another GPU crash some players were experiencing simply by wrapping calls to the device in lock statements, so I'm not convinced that all drivers are as thread-safe as they're supposed to be.

Lots of other games rely on the D3D device adhering to its thread safety promises, and would crash if those promises weren't kept.

So either - you've found a thread-safety bug in D3D which only occurs in your game, or, you've got a threading bug elsewhere in your code, and adding extra sync points elsewhere in your program has happened to change the timings just enough that the race is now unlikely to occur (but still lurking as a non-symptomatic bug).

Share this post


Link to post
Share on other sites

According to MSDN, thread safety is managed by the runtime rather than the driver (clicky), so I'd consider it even more unlikely that the OP has found a bug in this behaviour.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Advertisement
  • Advertisement
  • Popular Tags

  • Similar Content

    • By Void
      Hi, I'm trying to do a comparision with DirectInput GUID e.g GUID_XAxis, GUID_YAxis from a value I get from GetProperty
      eg
      DIPROPRANGE propRange;

      DIJoystick->GetProperty (DIPROP_RANGE, &propRange.diph);
      // This will crash
      if (GUID_XAxis == MAKEDIPROP (propRange.diph.dwObj))
        ;

      How should I be comparing the GUID from GetProperty?
    • By MikhailGorobets
      I have a problem with SSAO. On left hand black area.
      Code shader:
      Texture2D<uint> texGBufferNormal : register(t0); Texture2D<float> texGBufferDepth : register(t1); Texture2D<float4> texSSAONoise : register(t2); float3 GetUV(float3 position) { float4 vp = mul(float4(position, 1.0), ViewProject); vp.xy = float2(0.5, 0.5) + float2(0.5, -0.5) * vp.xy / vp.w; return float3(vp.xy, vp.z / vp.w); } float3 GetNormal(in Texture2D<uint> texNormal, in int3 coord) { return normalize(2.0 * UnpackNormalSphermap(texNormal.Load(coord)) - 1.0); } float3 GetPosition(in Texture2D<float> texDepth, in int3 coord) { float4 position = 1.0; float2 size; texDepth.GetDimensions(size.x, size.y); position.x = 2.0 * (coord.x / size.x) - 1.0; position.y = -(2.0 * (coord.y / size.y) - 1.0); position.z = texDepth.Load(coord); position = mul(position, ViewProjectInverse); position /= position.w; return position.xyz; } float3 GetPosition(in float2 coord, float depth) { float4 position = 1.0; position.x = 2.0 * coord.x - 1.0; position.y = -(2.0 * coord.y - 1.0); position.z = depth; position = mul(position, ViewProjectInverse); position /= position.w; return position.xyz; } float DepthInvSqrt(float nonLinearDepth) { return 1 / sqrt(1.0 - nonLinearDepth); } float GetDepth(in Texture2D<float> texDepth, float2 uv) { return texGBufferDepth.Sample(samplerPoint, uv); } float GetDepth(in Texture2D<float> texDepth, int3 screenPos) { return texGBufferDepth.Load(screenPos); } float CalculateOcclusion(in float3 position, in float3 direction, in float radius, in float pixelDepth) { float3 uv = GetUV(position + radius * direction); float d1 = DepthInvSqrt(GetDepth(texGBufferDepth, uv.xy)); float d2 = DepthInvSqrt(uv.z); return step(d1 - d2, 0) * min(1.0, radius / abs(d2 - pixelDepth)); } float GetRNDTexFactor(float2 texSize) { float width; float height; texGBufferDepth.GetDimensions(width, height); return float2(width, height) / texSize; } float main(FullScreenPSIn input) : SV_TARGET0 { int3 screenPos = int3(input.Position.xy, 0); float depth = DepthInvSqrt(GetDepth(texGBufferDepth, screenPos)); float3 normal = GetNormal(texGBufferNormal, screenPos); float3 position = GetPosition(texGBufferDepth, screenPos) + normal * SSAO_NORMAL_BIAS; float3 random = normalize(2.0 * texSSAONoise.Sample(samplerNoise, input.Texcoord * GetRNDTexFactor(SSAO_RND_TEX_SIZE)).rgb - 1.0); float SSAO = 0; [unroll] for (int index = 0; index < SSAO_KERNEL_SIZE; index++) { float3 dir = reflect(SamplesKernel[index].xyz, random); SSAO += CalculateOcclusion(position, dir * sign(dot(dir, normal)), SSAO_RADIUS, depth); } return 1.0 - SSAO / SSAO_KERNEL_SIZE; }  



    • By MarcusAseth
      I've been following this tutorial -> https://www.3dgep.com/introduction-to-directx-11/#The_Main_Function , did all the steps,and I ended up with the main.cpp you can see below.
      The problem is the call at line 516 
      g_d3dDeviceContext->UpdateSubresource(g_d3dConstantBuffers[CB_Frame], 0, nullptr, &g_ViewMatrix, 0, 0); which is crashing the program, and the very odd thing is that the first time trough it works fine, it crash the app the second time it is called...
      Can someone help me understand why? 😕    I have no idea...
      #include <Direct3D_11PCH.h> //Shaders using namespace DirectX; // Globals //Window const unsigned g_WindowWidth = 1024; const unsigned g_WindowHeight = 768; const char* g_WindowClassName = "DirectXWindowClass"; const char* g_WindowName = "DirectX 11"; HWND g_WinHnd = nullptr; const bool g_EnableVSync = true; //Device and SwapChain ID3D11Device* g_d3dDevice = nullptr; ID3D11DeviceContext* g_d3dDeviceContext = nullptr; IDXGISwapChain* g_d3dSwapChain = nullptr; //RenderTarget view ID3D11RenderTargetView* g_d3dRenderTargerView = nullptr; //DepthStencil view ID3D11DepthStencilView* g_d3dDepthStencilView = nullptr; //Depth Buffer Texture ID3D11Texture2D* g_d3dDepthStencilBuffer = nullptr; // Define the functionality of the depth/stencil stages ID3D11DepthStencilState* g_d3dDepthStencilState = nullptr; // Define the functionality of the rasterizer stage ID3D11RasterizerState* g_d3dRasterizerState = nullptr; D3D11_VIEWPORT g_Viewport{}; //Vertex Buffer data ID3D11InputLayout* g_d3dInputLayout = nullptr; ID3D11Buffer* g_d3dVertexBuffer = nullptr; ID3D11Buffer* g_d3dIndexBuffer = nullptr; //Shader Data ID3D11VertexShader* g_d3dVertexShader = nullptr; ID3D11PixelShader* g_d3dPixelShader = nullptr; //Shader Resources enum ConstantBuffer { CB_Application, CB_Frame, CB_Object, NumConstantBuffers }; ID3D11Buffer* g_d3dConstantBuffers[ConstantBuffer::NumConstantBuffers]; //Demo parameter XMMATRIX g_WorldMatrix; XMMATRIX g_ViewMatrix; XMMATRIX g_ProjectionMatrix; // Vertex data for a colored cube. struct VertexPosColor { XMFLOAT3 Position; XMFLOAT3 Color; }; VertexPosColor g_Vertices[8] = { { XMFLOAT3(-1.0f, -1.0f, -1.0f), XMFLOAT3(0.0f, 0.0f, 0.0f) }, // 0 { XMFLOAT3(-1.0f, 1.0f, -1.0f), XMFLOAT3(0.0f, 1.0f, 0.0f) }, // 1 { XMFLOAT3(1.0f, 1.0f, -1.0f), XMFLOAT3(1.0f, 1.0f, 0.0f) }, // 2 { XMFLOAT3(1.0f, -1.0f, -1.0f), XMFLOAT3(1.0f, 0.0f, 0.0f) }, // 3 { XMFLOAT3(-1.0f, -1.0f, 1.0f), XMFLOAT3(0.0f, 0.0f, 1.0f) }, // 4 { XMFLOAT3(-1.0f, 1.0f, 1.0f), XMFLOAT3(0.0f, 1.0f, 1.0f) }, // 5 { XMFLOAT3(1.0f, 1.0f, 1.0f), XMFLOAT3(1.0f, 1.0f, 1.0f) }, // 6 { XMFLOAT3(1.0f, -1.0f, 1.0f), XMFLOAT3(1.0f, 0.0f, 1.0f) } // 7 }; WORD g_Indicies[36] = { 0, 1, 2, 0, 2, 3, 4, 6, 5, 4, 7, 6, 4, 5, 1, 4, 1, 0, 3, 2, 6, 3, 6, 7, 1, 5, 6, 1, 6, 2, 4, 0, 3, 4, 3, 7 }; //Forward Declaration LRESULT CALLBACK WindowProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam); bool LoadContent(); int Run(); void Update(float deltaTime); void Clear(const FLOAT clearColor[4], FLOAT clearDepth, UINT8 clearStencil); void Present(bool vSync); void Render(); void CleanUp(); int InitApplication(HINSTANCE hInstance, int cmdShow); int InitDirectX(HINSTANCE hInstance, BOOL vsync); int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR cmd, int cmdShow) { UNREFERENCED_PARAMETER(hPrevInstance); UNREFERENCED_PARAMETER(cmd); // Check for DirectX Math library support. if (!XMVerifyCPUSupport()) { MessageBox(nullptr, TEXT("Failed to verify DirectX Math library support."), nullptr, MB_OK); return -1; } if (InitApplication(hInstance, cmdShow) != 0) { MessageBox(nullptr, TEXT("Failed to create applicaiton window."), nullptr, MB_OK); return -1; } if (InitDirectX(hInstance, g_EnableVSync) != 0) { MessageBox(nullptr, TEXT("Failed to initialize DirectX."), nullptr, MB_OK); CleanUp(); return -1; } if (!LoadContent()) { MessageBox(nullptr, TEXT("Failed to load content."), nullptr, MB_OK); CleanUp(); return -1; } int returnCode = Run(); CleanUp(); return returnCode; } int Run() { MSG msg{}; static DWORD previousTime = timeGetTime(); static const float targetFramerate = 30.0f; static const float maxTimeStep = 1.0f / targetFramerate; while (msg.message != WM_QUIT) { if (PeekMessage(&msg, 0, 0, 0, PM_REMOVE)) { TranslateMessage(&msg); DispatchMessage(&msg); } else { DWORD currentTime = timeGetTime(); float deltaTime = (currentTime - previousTime) / 1000.0f; previousTime = currentTime; deltaTime = std::min<float>(deltaTime, maxTimeStep); Update(deltaTime); Render(); } } return static_cast<int>(msg.wParam); } LRESULT CALLBACK WindowProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam) { PAINTSTRUCT paintstruct; HDC hDC; switch (msg) { case WM_PAINT: { hDC = BeginPaint(hwnd, &paintstruct); EndPaint(hwnd, &paintstruct); }break; case WM_DESTROY: { PostQuitMessage(0); }break; default: return DefWindowProc(hwnd, msg, wParam, lParam); break; } return 0; } int InitApplication(HINSTANCE hInstance, int cmdShow) { //Register Window class WNDCLASSEX mainWindow{}; mainWindow.cbSize = sizeof(WNDCLASSEX); mainWindow.style = CS_HREDRAW | CS_VREDRAW; mainWindow.lpfnWndProc = &WindowProc; mainWindow.hInstance = hInstance; mainWindow.hCursor = LoadCursor(NULL, IDC_ARROW); mainWindow.hbrBackground = (HBRUSH)(COLOR_WINDOW + 1); mainWindow.lpszMenuName = nullptr; mainWindow.lpszClassName = g_WindowClassName; if (!RegisterClassEx(&mainWindow)) { return -1; } RECT client{ 0,0,g_WindowWidth,g_WindowHeight }; AdjustWindowRect(&client, WS_OVERLAPPEDWINDOW, false); // Create Window g_WinHnd = CreateWindowEx(NULL, g_WindowClassName, g_WindowName, WS_OVERLAPPEDWINDOW | WS_VISIBLE, CW_USEDEFAULT, CW_USEDEFAULT, client.right - client.left, client.bottom - client.top, nullptr, nullptr, hInstance, nullptr); if (!g_WinHnd) { return -1; } UpdateWindow(g_WinHnd); return 0; } int InitDirectX(HINSTANCE hInstance, BOOL vsync) { assert(g_WinHnd != nullptr); RECT client{}; GetClientRect(g_WinHnd, &client); unsigned int clientWidth = client.right - client.left; unsigned int clientHeight = client.bottom - client.top; //Direct3D Initialization HRESULT hr{}; //SwapChainDesc DXGI_RATIONAL refreshRate = vsync ? DXGI_RATIONAL{ 1, 60 } : DXGI_RATIONAL{ 0, 1 }; DXGI_SWAP_CHAIN_DESC swapChainDesc{}; swapChainDesc.BufferDesc.Width = clientWidth; swapChainDesc.BufferDesc.Height = clientHeight; swapChainDesc.BufferDesc.RefreshRate = refreshRate; swapChainDesc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM; swapChainDesc.BufferDesc.Scaling = DXGI_MODE_SCALING_CENTERED; swapChainDesc.SampleDesc.Count = 1; swapChainDesc.SampleDesc.Quality = 0; swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT; swapChainDesc.BufferCount = 1; swapChainDesc.OutputWindow = g_WinHnd; swapChainDesc.Windowed = true; swapChainDesc.SwapEffect = DXGI_SWAP_EFFECT_DISCARD; UINT createDeviceFlags{}; #if _DEBUG createDeviceFlags = D3D11_CREATE_DEVICE_DEBUG; #endif //Feature levels const D3D_FEATURE_LEVEL features[]{ D3D_FEATURE_LEVEL_11_0 }; D3D_FEATURE_LEVEL featureLevel; hr = D3D11CreateDeviceAndSwapChain( nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, createDeviceFlags, features, _countof(features), D3D11_SDK_VERSION, &swapChainDesc, &g_d3dSwapChain, &g_d3dDevice, &featureLevel, &g_d3dDeviceContext ); if (FAILED(hr)) { return -1; } //Render Target View ID3D11Texture2D* backBuffer; hr = g_d3dSwapChain->GetBuffer(0, __uuidof(ID3D11Texture2D), reinterpret_cast<void**>(&backBuffer)); if (FAILED(hr)) { return -1; } hr = g_d3dDevice->CreateRenderTargetView(backBuffer, nullptr, &g_d3dRenderTargerView); if (FAILED(hr)) { return -1; } SafeRelease(backBuffer); //Depth Stencil View D3D11_TEXTURE2D_DESC depthStencilBufferDesc{}; depthStencilBufferDesc.Width = clientWidth; depthStencilBufferDesc.Height = clientHeight; depthStencilBufferDesc.MipLevels = 1; depthStencilBufferDesc.ArraySize = 1; depthStencilBufferDesc.Format = DXGI_FORMAT_D24_UNORM_S8_UINT; depthStencilBufferDesc.SampleDesc.Count = 1; depthStencilBufferDesc.SampleDesc.Quality = 0; depthStencilBufferDesc.Usage = D3D11_USAGE_DEFAULT; depthStencilBufferDesc.BindFlags = D3D11_BIND_DEPTH_STENCIL; hr = g_d3dDevice->CreateTexture2D(&depthStencilBufferDesc, nullptr, &g_d3dDepthStencilBuffer); if (FAILED(hr)) { return -1; } hr = g_d3dDevice->CreateDepthStencilView(g_d3dDepthStencilBuffer, nullptr, &g_d3dDepthStencilView); if (FAILED(hr)) { return -1; } //Set States D3D11_DEPTH_STENCIL_DESC depthStencilStateDesc{}; depthStencilStateDesc.DepthEnable = true; depthStencilStateDesc.DepthWriteMask = D3D11_DEPTH_WRITE_MASK_ALL; depthStencilStateDesc.DepthFunc = D3D11_COMPARISON_LESS; depthStencilStateDesc.StencilEnable = false; hr = g_d3dDevice->CreateDepthStencilState(&depthStencilStateDesc, &g_d3dDepthStencilState); if (FAILED(hr)) { return -1; } D3D11_RASTERIZER_DESC rasterizerStateDesc{}; rasterizerStateDesc.FillMode = D3D11_FILL_SOLID; rasterizerStateDesc.CullMode = D3D11_CULL_BACK; rasterizerStateDesc.FrontCounterClockwise = FALSE; rasterizerStateDesc.DepthClipEnable = TRUE; rasterizerStateDesc.ScissorEnable = FALSE;; rasterizerStateDesc.MultisampleEnable = FALSE; hr = g_d3dDevice->CreateRasterizerState(&rasterizerStateDesc, &g_d3dRasterizerState); if (FAILED(hr)) { return -1; } //Set Viewport g_Viewport.Width = static_cast<float>(clientWidth); g_Viewport.Height = static_cast<float>(clientHeight); g_Viewport.TopLeftX = 0.0f; g_Viewport.TopLeftY = 0.0f; g_Viewport.MinDepth = 0.0f; g_Viewport.MaxDepth = 1.0f; return 0; } bool LoadContent() { //Load Shaders HRESULT hr; assert(g_d3dDevice); //VS ID3DBlob* vsBlob = nullptr; D3DReadFileToBlob(L"../Shaders/SimpleVertexShader.cso", &vsBlob); assert(vsBlob); hr = g_d3dDevice->CreateVertexShader(vsBlob->GetBufferPointer(), vsBlob->GetBufferSize(), nullptr, &g_d3dVertexShader); if (FAILED(hr)) { SafeRelease(vsBlob); return false; } //Create VS Input Layout D3D11_INPUT_ELEMENT_DESC vertexLayoutDesc[] = { { "POSITION", 0, DXGI_FORMAT_R32G32B32A32_FLOAT, 0, offsetof(VertexPosColor, Position), D3D11_INPUT_PER_VERTEX_DATA ,0 }, { "COLOR", 0, DXGI_FORMAT_R32G32B32A32_FLOAT, 0, offsetof(VertexPosColor, Color), D3D11_INPUT_PER_VERTEX_DATA ,0 } }; hr = g_d3dDevice->CreateInputLayout(vertexLayoutDesc, _countof(vertexLayoutDesc), vsBlob->GetBufferPointer(), vsBlob->GetBufferSize(), &g_d3dInputLayout); if (FAILED(hr)) { SafeRelease(vsBlob); return false; } SafeRelease(vsBlob); //PS ID3DBlob* psBlob = nullptr; D3DReadFileToBlob(L"../Shaders/SimplePixelShader.cso", &psBlob); assert(psBlob); hr = g_d3dDevice->CreatePixelShader(psBlob->GetBufferPointer(), psBlob->GetBufferSize(), nullptr, &g_d3dPixelShader); SafeRelease(psBlob); if (FAILED(hr)) { return false; } //Load Vertex Buffer D3D11_BUFFER_DESC vertexBufferDesc{}; vertexBufferDesc.ByteWidth = sizeof(VertexPosColor) * _countof(g_Vertices); vertexBufferDesc.Usage = D3D11_USAGE_DEFAULT; vertexBufferDesc.BindFlags = D3D11_BIND_VERTEX_BUFFER; D3D11_SUBRESOURCE_DATA resourceData{}; resourceData.pSysMem = g_Vertices; hr = g_d3dDevice->CreateBuffer(&vertexBufferDesc, &resourceData, &g_d3dVertexBuffer); if (FAILED(hr)) { return false; } //Load Index Buffer D3D11_BUFFER_DESC indexBufferDesc{}; indexBufferDesc.ByteWidth = sizeof(WORD) * _countof(g_Indicies); indexBufferDesc.Usage = D3D11_USAGE_DEFAULT; indexBufferDesc.BindFlags = D3D11_BIND_INDEX_BUFFER; resourceData.pSysMem = g_Indicies; hr = g_d3dDevice->CreateBuffer(&indexBufferDesc, &resourceData, &g_d3dIndexBuffer); if (FAILED(hr)) { return false; } //Load Constant Buffers D3D11_BUFFER_DESC cBufferDesc{}; cBufferDesc.ByteWidth = sizeof(XMMATRIX); cBufferDesc.Usage = D3D11_USAGE_DEFAULT; cBufferDesc.BindFlags = D3D11_BIND_CONSTANT_BUFFER; for (size_t bufferID = 0; bufferID < NumConstantBuffers; bufferID++) { hr = g_d3dDevice->CreateBuffer(&cBufferDesc, nullptr, &g_d3dConstantBuffers[bufferID]); if (FAILED(hr)) { return false; } } //Setup Projection Matrix RECT client{}; GetClientRect(g_WinHnd, &client); float clientWidth = static_cast<float>(client.right - client.left); float clientHeight = static_cast<float>(client.bottom - client.top); g_ProjectionMatrix = DirectX::XMMatrixPerspectiveFovLH(XMConvertToRadians(45.0f), clientWidth / clientHeight, 0.1f, 100.0f); g_d3dDeviceContext->UpdateSubresource(g_d3dConstantBuffers[CB_Application], 0, nullptr, &g_ProjectionMatrix, 0, 0); return true; } void Update(float deltaTime) { XMVECTOR eyePosition = XMVectorSet(0, 0, -10, 1); XMVECTOR focusPoint = XMVectorSet(0, 0, 0, 1); XMVECTOR upDirection = XMVectorSet(0, 1, 0, 0); g_ViewMatrix = DirectX::XMMatrixLookAtLH(eyePosition, focusPoint, upDirection); g_d3dDeviceContext->UpdateSubresource(g_d3dConstantBuffers[CB_Frame], 0, nullptr, &g_ViewMatrix, 0, 0); static float angle = 0.0f; angle += 90.0f * deltaTime; XMVECTOR rotationAxis = XMVectorSet(0, 1, 1, 0); g_WorldMatrix = DirectX::XMMatrixRotationAxis(rotationAxis, XMConvertToRadians(angle)); g_d3dDeviceContext->UpdateSubresource(g_d3dConstantBuffers[CB_Object], 0, nullptr, &g_WorldMatrix, 0, 0); } void Clear(const FLOAT clearColor[4], FLOAT clearDepth, UINT8 clearStencil) { g_d3dDeviceContext->ClearRenderTargetView(g_d3dRenderTargerView, clearColor); g_d3dDeviceContext->ClearDepthStencilView(g_d3dDepthStencilView, D3D11_CLEAR_DEPTH | D3D11_CLEAR_STENCIL, clearDepth, clearStencil); } void Present(bool vSync) { if (vSync) { g_d3dSwapChain->Present(1, 0); } else { g_d3dSwapChain->Present(0, 0); } } void Render() { assert(g_d3dDevice); assert(g_d3dDeviceContext); Clear(Colors::CornflowerBlue, 1.0f, 0); //IA const UINT vertexStride = sizeof(VertexPosColor); const UINT offset = 0; g_d3dDeviceContext->IASetVertexBuffers(0, 1, &g_d3dVertexBuffer, &vertexStride, &offset); g_d3dDeviceContext->IASetInputLayout(g_d3dInputLayout); g_d3dDeviceContext->IASetIndexBuffer(g_d3dIndexBuffer, DXGI_FORMAT_R16_UINT, 0); g_d3dDeviceContext->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST); //VS g_d3dDeviceContext->VSSetShader(g_d3dVertexShader, nullptr, 0); g_d3dDeviceContext->VSGetConstantBuffers(0, NumConstantBuffers, g_d3dConstantBuffers); //RS g_d3dDeviceContext->RSSetState(g_d3dRasterizerState); g_d3dDeviceContext->RSSetViewports(1, &g_Viewport); //PS g_d3dDeviceContext->PSSetShader(g_d3dPixelShader, nullptr, 0); //OM g_d3dDeviceContext->OMSetRenderTargets(1, &g_d3dRenderTargerView, g_d3dDepthStencilView); g_d3dDeviceContext->OMSetDepthStencilState(g_d3dDepthStencilState, 1); //draw g_d3dDeviceContext->DrawIndexed(_countof(g_Indicies), 0, 0); Present(g_EnableVSync); } void CleanUp() { SafeRelease(g_d3dVertexShader); SafeRelease(g_d3dPixelShader); SafeRelease(g_d3dVertexBuffer); SafeRelease(g_d3dIndexBuffer); SafeRelease(g_d3dInputLayout); SafeRelease(g_d3dDepthStencilBuffer); for (size_t bufferID = 0; bufferID < NumConstantBuffers; bufferID++) { SafeRelease(g_d3dConstantBuffers[bufferID]); } SafeRelease(g_d3dDepthStencilState); SafeRelease(g_d3dRasterizerState); SafeRelease(g_d3dRenderTargerView); SafeRelease(g_d3dDepthStencilView); SafeRelease(g_d3dSwapChain); SafeRelease(g_d3dDeviceContext); SafeRelease(g_d3dDevice); }  
    • By MarcusAseth
      Hi guys, I'm trying to learn this stuff but running into some problems 😕
      I've compiled my .hlsl into a header file which contains the global variable with the precompiled shader data:
      //... // Approximately 83 instruction slots used #endif const BYTE g_vs[] = { 68, 88, 66, 67, 143, 82, 13, 236, 152, 133, 219, 113, 173, 135, 18, 87, 122, 208, 124, 76, 1, 0, 0, 0, 16, 76, 0, 0, 6, 0, //.... And now following the "Compiling at build time to header files" example at this msdn link , I've included the header files in my main.cpp and I'm trying to create the vertex shader like this:
      hr = g_d3dDevice->CreateVertexShader(g_vs, sizeof(g_vs), nullptr, &g_d3dVertexShader); if (FAILED(hr)) { return -1; } and this is failing, entering the if and returing -1.
      Can someone point out what I'm doing wrong? 😕 
    • By Toastmastern
      Hello everyone,
      After a few years of break from coding and my planet render game I'm giving it a go again from a different angle. What I'm struggling with now is that I have created a Frustum that works fine for now atleast, it does what it's supose to do alltho not perfect. But with the frustum came very low FPS, since what I'm doing right now just to see if the Frustum worked is to recreate the vertex buffer every frame that the camera detected movement. This is of course very costly and not the way to do it. Thats why I'm now trying to learn how to create a dynamic vertexbuffer instead and to map and unmap the vertexes, in the end my goal is to update only part of the vertexbuffer that is needed, but one step at a time ^^

      So below is my code which I use to create the Dynamic buffer. The issue is that I want the size of the vertex buffer to be big enough to handle bigger vertex buffers then just mPlanetMesh.vertices.size() due to more vertices being added later when I start to do LOD and stuff, the first render isn't the biggest one I will need.
      vertexBufferDesc.Usage = D3D11_USAGE_DYNAMIC; vertexBufferDesc.ByteWidth = mPlanetMesh.vertices.size(); vertexBufferDesc.BindFlags = D3D11_BIND_VERTEX_BUFFER; vertexBufferDesc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE; vertexBufferDesc.MiscFlags = 0; vertexBufferDesc.StructureByteStride = 0; vertexData.pSysMem = &mPlanetMesh.vertices[0]; vertexData.SysMemPitch = 0; vertexData.SysMemSlicePitch = 0; result = device->CreateBuffer(&vertexBufferDesc, &vertexData, &mVertexBuffer); if (FAILED(result)) { return false; } What happens is that the 
      result = device->CreateBuffer(&vertexBufferDesc, &vertexData, &mVertexBuffer); Makes it crash due to Access Violation. When I put the vertices.size() in it works without issues, but when I try to set it to like vertices.size() * 2 it crashes.
      I googled my eyes dry tonight but doesn't seem to find people with the same kind of issue, I've read that the vertex buffer can be bigger if needed. What I'm I doing wrong here?
       
      Best Regards and Thanks in advance
      Toastmastern
    • By yonisi
      Hi,
      I have a terrain engine where the terrain and water are on different grids. So I'm trying to render planar reflections of the terrain into the water grid. After reading some web pages and docs and also trying to learn from the RasterTek reflections demo and the small water bodies demo as well. What I do is as follows:
      1. Create a Reflection view matrix  - Technically I ONLY flip the camera position in the Y direction (Positive Y is up) and add to it 2 * waterLevel. Then I update the View matrix and I save that matrix for later. The code:
      void Camera::UpdateReflectionViewMatrix( float waterLevel ) { mBackupPosition = mPosition; mBackupLook = mLook; mPosition.y = -mPosition.y + 2.0f * waterLevel; //mLook.y = -mLook.y + 2.0f * waterLevel; UpdateViewMatrix(); mReflectionView = View(); } 2. I render the Terrain geometry to a 512x512 sized Render target by using the Reflection view matrix and an opposite culling (My Terrain is using front culling by nature so I'm using back culling for the Reflction render pass). Let me say that I checked with the Graphics debugger and the Reflection Render target looks "OK" at this stage (Picture attached). I don't know if the fact that the terrain is shown only at the top are of the texture is expected or not, but it seems OK.

      3. Render the Reflection texture into the water using projective texturing - I hope this step is OK code wise. Basically I'm sending to the shader the WorldReflectionViewProj matrix that was created at step 1 in order to use it for the projective texture coordinates, I then convert the position in the DS (Water and terrain are drawn with Tessellation) to the projective tex coords using that WorldReflectionViewProj matrix, then I sample the reflection texture after setting up the coordinates in the PS. Here is the code:
      //Send the ReflectionWorldViewProj matrix to the shader: XMStoreFloat4x4(&mPerFrameCB.Data.ReflectionWorldViewProj, XMMatrixTranspose( ( mWorld * pCam->GetReflectedView() ) * mProj )); //Setting up the Projective tex coords in the DS: Output.projTexPosition = mul(float4(worldPos.xyz, 1), g_ReflectionWorldViewProj); //Setting up the coords in the PS and sampling the reflection texture: float2 projTexCoords; projTexCoords.x = input.projTexPosition.x / input.projTexPosition.w / 2.0 + 0.5; projTexCoords.y = -input.projTexPosition.y / input.projTexPosition.w / 2.0 + 0.5; projTexCoords += normal.xz * 0.025; float4 reflectionColor = gReflectionMap.SampleLevel(SamplerClampLinear, projTexCoords, 0); texColor += reflectionColor * 0.25; I'll add that when compiling the PS I'm getting a warning on those dividing by input.projTexPosition.w for a possible float division by 0, I tried to add some offset or some minimum to the dividing term but that still not solved my issue.
      Here is the problem itself. At relatively flat view angles I'm seeing correct reflections (Or at least so it seems), but as I pitch the camera down, I'm seeing those artifacts which I have no idea where are coming from. I'm culling the terrain in the reflection render pass when it's lower than water height (I have heightmaps for that).
       
      Any help will be appreciated because I don't know what is wrong or where else to look.
    • By thmfrnk
      Hi,
      I am looking for a usefull commandline based texture compression tool with the rights to be able to ship with my application. It should have following caps:
      Supports all major image format as source files (jpeg, png, tga, bmp) Export as DDS Compression Formats BC1, BC2, BC3, BC4, BC7 I am actually using the nvdxt tool from Nvidia, but it does not support BC4 (which I need for one-channel 8bit textures). Everything else which I found wasn't really useful.
      Any suggestions?
      Thx
       
    • By trojanfoe
      I have been trying to create a BlendState for my UI text sprites so that they are both alpha-blended (so you can see them) and invert the pixel they are rendered over (again, so you can see them).
      In order to get alpha blending you would need:
      SrcBlend = SRC_ALPHA DestBlend = INV_SRC_ALPHA and in order to have inverted colours you would need something like:
      SrcBlend = INV_DEST_COLOR DestBlend = INV_SRC_COLOR and you can't have both.
      So I have come to the conclusion that it's not possible; am I right?
    • By Royma
      In traditional way, it needs 6 passes for a point light and many passes for cascaded shadow mapping to generate shadow maps. Recently I learnt a method that using a geometry shader to generate all the shadow maps in one pass.I specify a render target and a depth-stencil buffer which are both Texture2dArray in DirectX11.It looks much better than the traditional way I think.But after I implemented it, I found cascaded shadow mapping runs much slower than the traditional way.The fps slow down from 60 to 35.I don't know why.I guess may be I should do some culling or maybe the geometry shader is not efficient.
      I want to know the reason that I reduced the drawcalls from 8 to 1, but it runs slow down.Should I abandon this method or is there any way to optimize this method to run more efficiently than multi-pass rendering?
      Here is the gs code:

      [maxvertexcount(24)]
      void main(
          triangle DepthGsIn input[3] : SV_POSITION,
          inout TriangleStream< DepthPsIn > output
      )
      {
          for (uint k = 0; k < 8; ++k)
          {
              DepthPsIn element;
              element.RTIndex = k;
              for (uint i = 0; i < 3; ++i)
              {
                  float2 shadowSlopeBias = calculateShadowSlopeBias(input.normal, -g_cameras[k].world[1]);
                  float shadowBias = shadowSlopeBias.y * g_cameras[k].shadowMapParameters.x + g_cameras[k].shadowMapParameters.y;
                  element.position = input.position + shadowBias * g_cameras[k].world[1];
                  element.position = mul(element.position, g_cameras[k].viewProjection);
                  element.depth = element.position.z / element.position.w;
                  
                  output.Append(element);
              }
              output.RestartStrip();
          }
      }
       
    • By savail
      Hey,
      There are a few things which confuse me regarding DirectX 11 and HLSL shaders in general. I would be very grateful for your advice!
      1. Let's take for example a scene which invokes 2 totally separate pipeline render passes interchangeably. I understand I need to bind correct shaders for each of the render pass and potentially blend/depth or rasterizer state but what about resources such as Constant Buffers, Shader Resource Views and Unordered Access Views? Assuming that the second render pass uses none of the resources used by the first pass, do I still need to unbind the resources and clean pipeline state after first pass? Or is it ok to leave pipeline with unbound garbage since anything I'd need to bind for second pass would overwrite contents in the appropriate register slots anyway?
      2. Is it a good practice to assign register slots manually to all resources in HLSL?
      3. I thought about assigning manually register slots for every distinct render pass up to the maximum slot limit if neccessary. For example in 1 render pass I invoke 3 CS's, 2 VS's and 2 PS's and for all resources used by those shaders I try to fill as many register slots as neccessary and potentially reuse many times the same slot in shaders sharing the same resource. I was wondering if there is any performance penalty or gain when I bind all of my needed resources at the start of render pass and never gonna have to do it again until next render pass? - this means potentially binding a lot of registers and having excessive number of bound resources for every shader that is run.
      4. Is it a good practice to create a separate include file for every resource that occurs in >= 2 shader files or is it better to duplicate the declarations? In first case, the code is imo easier to maintain and edit but might be harder to read if there's too many includes. I've come up with a compromise between these 2 like this: create a separate include file for every CB that occurs in >= 2 shader files and a separate include file for every sampler I ever need to use. All other resources like srvs and uavs I prefer to duplicate in multiple shaders because they take much less space than CB for example... I'm not sure however if that's a good practice
  • Advertisement
  • Popular Now

  • Forum Statistics

    • Total Topics
      631394
    • Total Posts
      2999754
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

Participate in the game development conversation and more when you create an account on GameDev.net!

Sign me up!