
DX12 Anyone else having problems with D3D12 compute shaders on NVIDIA hardware?


I'm having an odd problem with D3D12 compute shaders. I have a very simple compute shader that does nothing but write the global ID of the current thread out to a buffer:

RWStructuredBuffer<uint> g_pathVisibility : register(u0, space1);

cbuffer cbPushConstants : register(b0)
{
	uint g_count;
};

[numthreads(32, 1, 1)]
void main(uint3 DTid : SV_DispatchThreadID)
{
	// Guard so the last group doesn't write past the end of the buffer.
	if (DTid.x < g_count)
	{
		g_pathVisibility[DTid.x] = DTid.x + 1;
	}
}

I'm allocating two buffers, each with space for 128 integers. One buffer is the output buffer for the shader above, and the other is a copy-destination buffer for CPU readback. If I set numthreads() to any power of two (for example, it's set to 32 above), I get a device reset error on NVIDIA hardware only. If I set numthreads() to any non-power-of-two value, the shader works as expected. The exceptionally odd thing is that all of the compute shaders in the D3D12 samples work fine with numthreads() containing powers of two. It doesn't matter whether I execute the compute shader on a graphics queue or a compute queue; the result is the same either way. I've tested this on a GTX 1080 and a GTX 1070 with identical results. AMD cards seem to work as expected.

Anyone have any idea what the hell could be going on? I tried asking NVIDIA on their boards, but as usual they never responded. I'm using their latest drivers. I've attached my sample application if anyone is interested; it's a UWP app, since Visual Studio provides a nice D3D12 app template that I use to play around with simple ideas. The shader in question in the project is TestCompute.hlsl, and the function where the magic happens is Sample3DSceneRenderer::TestCompute(), at line 1006 in Sample3DSceneRenderer.cpp.
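In case it helps, the host-side setup boils down to the following (a simplified sketch rather than the exact project code; the variable names are illustrative and error handling is omitted):

// Output buffer in a default heap (UAV writes), plus a readback buffer for the CPU.
const UINT kCount = 128;
const UINT64 kBufferSize = kCount * sizeof(uint32_t);

D3D12_HEAP_PROPERTIES defaultHeap  = { D3D12_HEAP_TYPE_DEFAULT };
D3D12_HEAP_PROPERTIES readbackHeap = { D3D12_HEAP_TYPE_READBACK };

D3D12_RESOURCE_DESC bufDesc = {};
bufDesc.Dimension        = D3D12_RESOURCE_DIMENSION_BUFFER;
bufDesc.Width            = kBufferSize;
bufDesc.Height           = 1;
bufDesc.DepthOrArraySize = 1;
bufDesc.MipLevels        = 1;
bufDesc.SampleDesc.Count = 1;
bufDesc.Layout           = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
bufDesc.Flags            = D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS;

device->CreateCommittedResource(&defaultHeap, D3D12_HEAP_FLAG_NONE, &bufDesc,
    D3D12_RESOURCE_STATE_UNORDERED_ACCESS, nullptr, IID_PPV_ARGS(&outputBuffer));

bufDesc.Flags = D3D12_RESOURCE_FLAG_NONE;
device->CreateCommittedResource(&readbackHeap, D3D12_HEAP_FLAG_NONE, &bufDesc,
    D3D12_RESOURCE_STATE_COPY_DEST, nullptr, IID_PPV_ARGS(&readbackBuffer));

// UAV for the structured buffer, created in a shader-visible CBV/SRV/UAV heap.
D3D12_UNORDERED_ACCESS_VIEW_DESC uavDesc = {};
uavDesc.ViewDimension              = D3D12_UAV_DIMENSION_BUFFER;
uavDesc.Format                     = DXGI_FORMAT_UNKNOWN; // structured buffers use UNKNOWN
uavDesc.Buffer.NumElements         = kCount;
uavDesc.Buffer.StructureByteStride = sizeof(uint32_t);
device->CreateUnorderedAccessView(outputBuffer.Get(), nullptr, &uavDesc, uavCpuHandle);

// Record: bind everything, dispatch enough groups to cover kCount threads, copy out.
ID3D12DescriptorHeap* heaps[] = { descriptorHeap.Get() };
commandList->SetDescriptorHeaps(1, heaps);
commandList->SetComputeRootSignature(rootSignature.Get());
commandList->SetPipelineState(pipelineState.Get());
commandList->SetComputeRoot32BitConstant(0, kCount, 0);      // g_count
commandList->SetComputeRootDescriptorTable(1, uavGpuHandle); // g_pathVisibility

commandList->Dispatch((kCount + 31) / 32, 1, 1);             // round up: 32 threads per group

D3D12_RESOURCE_BARRIER barrier = {};
barrier.Type                   = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
barrier.Transition.pResource   = outputBuffer.Get();
barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_UNORDERED_ACCESS;
barrier.Transition.StateAfter  = D3D12_RESOURCE_STATE_COPY_SOURCE;
barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
commandList->ResourceBarrier(1, &barrier);

commandList->CopyResource(readbackBuffer.Get(), outputBuffer.Get());
// ...then Close(), ExecuteCommandLists(), wait on a fence, and Map() the
// readback buffer to inspect the 128 values.

Note that Dispatch() rounds the group count up, so the if (DTid.x < g_count) guard in the shader is what keeps the last group from writing past the end of the buffer.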



I've definitely run into a few Nvidia DX12 driver bugs (especially when DX12 was new), but I haven't personally seen anything with compute shaders. The driver and/or shader JIT is probably just trying to do something clever, and ends up doing something bad. 


I get no GPU hang here on a 980 Ti, but I do get a GPU-Based Validation error that you seem to have introduced:

D3D12 ERROR: GPU-BASED VALIDATION: Dispatch, Descriptor heap index out of bounds: Heap Index To DescriptorTableStart: [0], Heap Index From HeapStart: [0], Heap Type: D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV, Num Descriptor Entries: 0, Index of Descriptor Range: 0, Shader Stage: COMPUTE, Root Parameter Index: [1], Dispatch Index: [0], Shader Code: TestCompute.hlsl(13,3-40), Asm Instruction Range: [0xbc-0xdf], Asm Operand Index: [0], Command List: 0x000001F3C5E38C20:'m_testComputeList', SRV/UAV/CBV Descriptor Heap: 0x000001F3C5C824B0:'m_testComputeCBVHeap', Sampler Descriptor Heap: <not set>, Pipeline State: 0x000001F3C5973380:'m_testComputePipeline',  [ EXECUTION ERROR #936: GPU_BASED_VALIDATION_DESCRIPTOR_HEAP_INDEX_OUT_OF_BOUNDS]
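For reference, GPU-Based Validation has to be opted into explicitly before device creation; a minimal sketch of enabling it (assuming the usual debug-layer setup and WRL ComPtr helpers):

// Enable the debug layer and opt in to GPU-Based Validation before creating the device.
Microsoft::WRL::ComPtr<ID3D12Debug> debugController;
if (SUCCEEDED(D3D12GetDebugInterface(IID_PPV_ARGS(&debugController))))
{
    debugController->EnableDebugLayer();

    Microsoft::WRL::ComPtr<ID3D12Debug1> debugController1;
    if (SUCCEEDED(debugController.As(&debugController1)))
    {
        // GBV instruments shaders to validate descriptor accesses on the GPU
        // timeline, which is what produces the error above.
        debugController1->SetEnableGPUBasedValidation(TRUE);
    }
}
// ...then D3D12CreateDevice() as usual.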


Turns out the hang wasn't 100%. It 'succeeded' and rendered the cube after the test the first few times, but did hang on a later run. The GPU-Based Validation error is still there, though.


@ajmiles Interesting, I don't get any GPU validation errors. Did you change anything in the code, or perhaps global D3D12 or driver settings? I've tried removing the root constants, setting the UAV register space to 0, and hardcoding g_count to 128 in the shader so that there's only the UAV, but that had no effect. I also tried switching from a RWStructuredBuffer to a plain RWBuffer, but that also had no effect. No matter what I do, numthreads() with 32 (or any power of two) fails and numthreads() with 31 (or any non-power of two) succeeds. I don't suppose there's any other insight you can provide on your end, given that I'm not getting the validation errors? Presumably, if the descriptor heap and root descriptor settings were actually invalid, the shader wouldn't be able to write successfully with a non-power-of-two dispatch?
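For context, the root signature is meant to match the shader's bindings: root constants at b0 in parameter 0, and a one-entry UAV descriptor table at u0/space1 in parameter 1. Roughly this (a simplified sketch, not the exact project code):

// Parameter 0 = root constants at b0, parameter 1 = UAV table at u0/space1.
D3D12_DESCRIPTOR_RANGE range = {};
range.RangeType                         = D3D12_DESCRIPTOR_RANGE_TYPE_UAV;
range.NumDescriptors                    = 1;
range.BaseShaderRegister                = 0; // u0
range.RegisterSpace                     = 1; // space1
range.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND;

D3D12_ROOT_PARAMETER params[2] = {};
params[0].ParameterType            = D3D12_ROOT_PARAMETER_TYPE_32BIT_CONSTANTS;
params[0].Constants.ShaderRegister = 0; // b0
params[0].Constants.Num32BitValues = 1; // g_count
params[0].ShaderVisibility         = D3D12_SHADER_VISIBILITY_ALL;

params[1].ParameterType                       = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
params[1].DescriptorTable.NumDescriptorRanges = 1;
params[1].DescriptorTable.pDescriptorRanges   = &range;
params[1].ShaderVisibility                    = D3D12_SHADER_VISIBILITY_ALL;

D3D12_ROOT_SIGNATURE_DESC rsDesc = {};
rsDesc.NumParameters = 2;
rsDesc.pParameters   = params;

Microsoft::WRL::ComPtr<ID3DBlob> blob, error;
D3D12SerializeRootSignature(&rsDesc, D3D_ROOT_SIGNATURE_VERSION_1, &blob, &error);
device->CreateRootSignature(0, blob->GetBufferPointer(), blob->GetBufferSize(),
    IID_PPV_ARGS(&rootSignature));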


It's possible that the version I'm on (16251) has newer GPU Validation bits than what you're running.

What version of Windows 10 are you running? Run 'winver' at a command prompt and there should be an OS Build number in parentheses.


That could be it. I'm on build 15063.483 (Creators Update), and it looks like you're using a July 26 Windows Insider Preview build. That still doesn't explain why, if the descriptor heap were corrupt, the shader could write successfully with a non-power-of-two dispatch but not with a power of two. Do you see anything I'm doing wrong with my descriptor heap?

