Mr_Fox

DX12 bug only appears when the debug layer is off... Need help :( (NV driver bug, fixed in 378.49)


Hey Guys,

 

Recently I ran into a rendering bug that only shows up when the debug layer is off. Even after I fixed all validation errors and warnings (including the GPU-Based Validation ones), the bug still appears whenever the debug layer is off. It happens on every GPU available to me (GTX 680M, GTX 1080) but does not happen on the WARP device.*

After days of struggling I found two ways to 'solve' the bug: 1. replace one particular split barrier with a normal barrier; 2. break one command list into two and submit them to the GPU in order. Neither 'solution' makes any sense to me, and I have almost run out of ideas. Please see here and there.

 

So I trimmed my project to remove the Kinect dependencies (it uses the Kinect color and depth sensor images as input) and made a repo for anyone who is interested or willing to test/help (thanks!).

Here is the repo: https://github.com/pengliu916/BugRepo.git

 

To successfully compile and run the code you need a DX12-capable GPU and Windows SDK 10.0.14393.0. To get rid of the GPU-Based Validation warnings/errors, your GPU needs to support typed load.

 

(The following is not necessary for understanding the bug, but just in case someone needs more information.)

This project originally used the depth map from the Kinect depth sensor to create/update a TSDF (truncated signed distance field) volume and reconstruct a 3D model of what the Kinect sees. To maintain this dynamic sparse volume efficiently, I use blocks to avoid updating every voxel every frame: instead of checking each voxel against the depth map, I first check each block (containing 8^3 voxels) against the depth map, and then in the next pass do the voxel-vs-depth-map check only for voxels in the blocks that need it. The bug is in this block update routine.

To avoid depending on Kinect, I modified the project to use a GPU-generated depth map as input (a sphere rotating with a radius in the foreground, with a wall in the background). So that the extremely slow WARP device also generates a reasonable result, I made the animation frame-based rather than time-based, and I changed the volume resolution to 64^3. You could change it to 512^3, and it will run at 70 fps on a GTX 680M, but remember to make the voxel size small to see the whole picture; then you will kind of see that this is a data corruption bug. You can also press the 'ResetVolume' button to reset the related resources. All other features shown on the right panel may not work or may even crash, since I removed a lot of important components in a very short time...

 

If you compile and run the project directly, you will see the following:

 

[attachment=34634:Bug.PNG]

 

So you see the sphere is broken, and the background wall is broken, because of the wrong block update. (I visualize a wrong block, i.e. a missing block, as a small red box and a correct block as a big green box.) Ideally you should not see any small red boxes, only big green boxes appearing and disappearing as the sphere moves, as in the following HWDeviceReference and WarpDeviceReference images:

 

[attachment=34632:HWDeviceReference.PNG][attachment=34633:WarpDeviceReference.PNG]

 

*I lied: the WARP device won't give you the expected result (though it still doesn't show this bug) unless you uncheck the circled checkbox. But that is totally unrelated: the checkbox is for the rendering part, while the bug is in the volume updating part.

 

So to 'solve' the bug, there are three ways:

1. Change Core::g_config.enableDebuglayer to true in file KinectVisualizer.cpp, line 180, and make sure you are in a debug build (the debug layer is disabled in other builds).


void
KinectVisualizer::OnConfiguration()
{
    Core::g_config.FXAA = false;
    Core::g_config.warpDevice = false;
    Core::g_config.enableDebuglayer = false; // changing this to true enables the debug layer
    Core::g_config.enableGPUBasedValidationInDebug = false;
    Core::g_config.swapChainDesc.Width = _width;
    Core::g_config.swapChainDesc.Height = _height;
    Core::g_config.swapChainDesc.BufferCount = 5;
    Core::g_config.passThroughMsg = true;
    Core::g_config.useSceneBuf = false;
}

This enables the debug layer, and under the Debug build the bug magically disappears (I don't know why...).

 

2. Comment out line 1215 in file TSDFVolume\TSDFVolume.cpp

            cptCtx.DispatchIndirect(_indirectParams, 0);
        }
        //======================================================================
        // Code Part A
        //
        // The following line will cause the bug if 'Code Part B' is commented
        
        BeginTrans(cptCtx, _occupiedBlocksBuf, UAV); // this line


        // Add blocks to UpdateBlockQueue from DepthMap
        Trans(cptCtx, _fuseBlockVol, UAV);
        Trans(cptCtx, *pDepthTex, psSRV | csSRV);
        Trans(cptCtx, *pWeightTex, csSRV);
        Trans(cptCtx, _updateBlocksBuf, UAV);

This removes the split transition (the begin half is gone, and the matching end half automatically becomes a normal transition). This 'fixes' the bug, and I also don't know why...

 

3. Uncomment the 3 lines of code starting at line 1259 in TSDFVolume\TSDFVolume.cpp

        //======================================================================
        // Code Part B
        //
        // The following 3 lines are one workaround for the bug if
        // 'Code Part A' is uncommented
        
        //cptCtx.Flush();
        //cptCtx.SetRootSignature(_rootsig);
        //_UpdateAndBindConstantBuffer(cptCtx);
        
        
        // Update voxels in blocks from UpdateBlockQueue and create queues for
        // NewOccupiedBlocks and FreedOccupiedBlocks
        Trans(cptCtx, _occupiedBlocksBuf, UAV);
        Trans(cptCtx, _renderBlockVol, UAV);
        Trans(cptCtx, _fuseBlockVol, UAV);
        Trans(cptCtx, _newFuseBlocksBuf, UAV);

This ends recording of the current command list, flushes all cached resource barriers, submits the list to the GPU for execution, grabs a new command list from the command-list pool for the following GPU calls, and sets the root signature and constant buffer again. This also 'fixes' the bug, and I don't know why...
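For readers unfamiliar with split barriers, here is a minimal CPU-only sketch of the bookkeeping that workaround 2 relies on: a begin half recorded early, and a later transition that becomes either the end half or a plain full barrier. It is not the repo's code. `BarrierRecorder`, `Resource`, `State` and `Flag` are hypothetical stand-ins; in D3D12 the corresponding flags are `D3D12_RESOURCE_BARRIER_FLAG_BEGIN_ONLY` and `D3D12_RESOURCE_BARRIER_FLAG_END_ONLY`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for D3D12 resource states and barrier flags.
enum class State { Common, SRV, UAV };
enum class Flag { None, BeginOnly, EndOnly };

struct Barrier {
    std::string resource;
    State before;
    State after;
    Flag flag;
};

struct Resource {
    std::string name;
    State current = State::Common;
    bool hasPending = false;        // true between BeginTrans and the matching Trans
    State pending = State::Common;  // target announced by BeginTrans
};

class BarrierRecorder {
public:
    // First half of a split barrier: announce the transition early.
    void BeginTrans(Resource& r, State target) {
        assert(!r.hasPending && "a split barrier is already open");
        if (r.current == target) return;  // already in the target state
        barriers_.push_back({r.name, r.current, target, Flag::BeginOnly});
        r.hasPending = true;
        r.pending = target;
    }

    // Second half if BeginTrans was recorded, otherwise a normal full barrier.
    void Trans(Resource& r, State target) {
        if (r.hasPending) {
            assert(r.pending == target && "end half must match the begin half");
            barriers_.push_back({r.name, r.current, target, Flag::EndOnly});
            r.hasPending = false;
            r.current = target;
        } else if (r.current != target) {
            barriers_.push_back({r.name, r.current, target, Flag::None});
            r.current = target;
        }
    }

    const std::vector<Barrier>& barriers() const { return barriers_; }

private:
    std::vector<Barrier> barriers_;
};
```

With `BeginTrans` followed by `Trans` on the same resource, the recorder holds a begin-only and an end-only barrier. With the `BeginTrans` call commented out, the same `Trans` records a single normal barrier, which is the situation workaround 2 creates.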

 

 

Please let me know if you have any trouble compiling and running the code. Any comments are appreciated.

 

Thanks in advance

Edited by Mr_Fox


I get an error there:

 

    macro_[2].Definition = "1";
    V(_Compile(L"BlockQueueCreate_cs", macro_, &blockQCreate_Pass1CS));
    macro_[2].Definition = "0"; macro_[3].Definition = "1";
    V(_Compile(L"BlockQueueCreate_cs", macro_, &blockQCreate_Pass2CS));   // <-- fails here

 

The first shader compiles fine, but the second fails:

 

...
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\ucrtbase.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\vcruntime140.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Unloaded 'C:\Windows\System32\ucrtbase.dll'
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\uxtheme.dll'. Cannot find or open the PDB file.
[ INFO    ]: DX12Framework start
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\msctf.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\dwmapi.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\rmclient.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Unloaded 'C:\Windows\System32\rmclient.dll'
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\amdxc64.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\version.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\amdihk64.dll'. Module was built without symbols.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\detoured.dll'. Module was built without symbols.
'MiniEngine-KinectVisualizer.exe' (Win32): Unloaded 'C:\Windows\System32\detoured.dll'
'MiniEngine-KinectVisualizer.exe' (Win32): Unloaded 'C:\Windows\System32\amdihk64.dll'
'MiniEngine-KinectVisualizer.exe' (Win32): Unloaded 'C:\Windows\System32\version.dll'
'MiniEngine-KinectVisualizer.exe' (Win32): Unloaded 'C:\Windows\System32\amdxc64.dll'
[ INFO    ]: D3D12-capable hardware found (selected):  AMD Radeon (TM) R9 Fury Series (4072 MB)
[ INFO    ]: D3D12-capable hardware found:  Microsoft Basic Render Driver (0 MB)
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\amdxc64.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\version.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\amdihk64.dll'. Module was built without symbols.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\detoured.dll'. Module was built without symbols.
[ WARN    ]: Tier 1, 2 and 3 are supported.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\DXGIDebug.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\dcomp.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\rsaenh.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\bcrypt.dll'. Cannot find or open the PDB file.
'MiniEngine-KinectVisualizer.exe' (Win32): Loaded 'C:\Windows\System32\cryptbase.dll'. Cannot find or open the PDB file.
[ INFO    ]: QueryHeap created
[ INFO    ]: Typed load is supported
[ ERROR    ]: C:\dev\BugRepo-master\x64\Debug\TSDFVolume_BlockQueueCreate_cs.hlsl(108,9-39): error X4532: cannot map expression to cs_5_1 instruction set

The program '[0x2174] MiniEngine-KinectVisualizer.exe' has exited with code 0 (0x0).

 

108 probably means the line number in the source file? At that line there is:

 

    InterlockedOr(
        tex_uavFuseBlockVol[u3BlockIdx], BLOCKSTATEMASK_UPDATE, uOrig);


I get an error there:

 

Thanks for trying, but I just tried on all the machines in my lab and none of them get this error (though I only have one GTX 680M machine and two GTX 1080 machines). I guess I should ask for an AMD test machine...

 

All the shader sources are there, and 108 is the line number; for this kind of prototype project I compile all shaders at run time. Unfortunately I can't reproduce the error you encountered. Maybe you could try compiling with SM 5.0 (in file TSDFVolume\TSDFVolume.cpp, lines 123 - 125, change every "_5_1" to "_5_0")? At least this change works for the Nvidia GPUs here.
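If hand-editing the target strings becomes tedious, the same idea can be expressed as a fallback loop. This is only a sketch under assumptions: `CompileWithFallback` and the `CompileFn` callback are hypothetical names, and the callback stands in for whatever the project's own compile helper does.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical compile callback: returns true if `entry` compiled for `target`.
using CompileFn =
    std::function<bool(const std::string& entry, const std::string& target)>;

// Tries each target profile in order and returns the first one that compiles.
// Returns an empty string if none of them compiled.
std::string CompileWithFallback(const std::string& entry,
                                const std::vector<std::string>& targets,
                                const CompileFn& compile)
{
    assert(!targets.empty() && "need at least one target profile");
    for (const std::string& target : targets) {
        if (compile(entry, target)) {
            return target;
        }
    }
    return std::string();
}
```

Called with the targets `{"cs_5_1", "cs_5_0"}`, this would pick `cs_5_1` where it compiles and drop to `cs_5_0` on a compiler or driver combination that rejects it, so the result also tells you which profile each machine ended up using.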


Or you could try turning on the debug layer in file KinectVisualizer.cpp, line 180, to see whether that gives a more informative error message. It's weird that the compile result differs so much across vendors...


Changing to 5_0 fixed it, and it runs on AMD now. With all volume size settings changed to 512 I get 7 ms in release mode.

 

The bug does not happen; I always see a whole sphere.

There are some artefacts at the silhouette (as in every screenshot you posted), but I guess these do not indicate the bug.

...making an NV driver bug more likely.


Thanks JoeJ, I updated the repo with that fix.

The artifacts on the silhouette are caused by deliberately rejecting surfaces whose normal diverges too far from the view direction. That's necessary because those pixels are less stable from the Kinect, so I just reject them.

 

I hope more people can test it and confirm that it is not a bug of mine somewhere...

And, well, that's both good news and bad news for me: on the good side, I think I can just use the workaround and move forward; on the bad side, it means all those days of work were wasted...

 

But that brings me to a question: in a prototype project like this, I can hit a weird bug that takes me several days and is not my fault. Much more complex game engines must run into many more such issues, so how do they know whether something is a driver bug or not? Tracking down such bugs may be a waste of precious time for them.

 

Again, thank you for helping me with this, really appreciated.


I can tell you that on an Intel card (HD 530) the application TDRs. Do you have a DX12 capable Intel GPU to test on?


I can tell you that on an Intel card (HD 530) the application TDRs. Do you have a DX12 capable Intel GPU to test on?
 

Thanks Adam, I will try to find a DX12 Intel GPU to test it on within the hour. Do you have another dedicated GPU to see whether it works there? Thanks.


But that brings me to a question: in a prototype project like this, I can hit a weird bug that takes me several days and is not my fault. Much more complex game engines must run into many more such issues, so how do they know whether something is a driver bug or not? Tracking down such bugs may be a waste of precious time for them.

 

Driver bugs are a common thing. I can't confirm this is a driver bug (I lack the DX knowledge), but if you think the behaviour is against the spec you should talk with NV. They will listen, look at your repo and fix the bug, although the process might take some time (months).

To make it easier for them you could strip your project down until only the bug remains (e.g. printing just some numbers to prove the misbehaviour would be enough).

Usually you just post on their public forums.
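As a rough idea of what "printing just some numbers" could look like: read the block buffer back from a run that behaves (WARP, or the debug-layer run) and from a run that does not, then compare the two on the CPU. The sketch below only covers the comparison. `CompareReadbacks` and `DiffReport` are hypothetical names, and getting the data off the GPU is assumed to happen elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Summary of how two readback buffers differ.
struct DiffReport {
    std::size_t mismatches = 0;  // number of differing elements
    std::size_t firstIndex = 0;  // index of the first difference; valid only if mismatches > 0
};

// Compares a reference readback (e.g. from WARP) against the suspect one.
DiffReport CompareReadbacks(const std::vector<std::uint32_t>& reference,
                            const std::vector<std::uint32_t>& actual)
{
    assert(reference.size() == actual.size() && "buffers must have the same size");
    DiffReport report;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        if (reference[i] != actual[i]) {
            if (report.mismatches == 0) {
                report.firstIndex = i;
            }
            ++report.mismatches;
        }
    }
    return report;
}
```

A bug report that says "element N differs between WARP and hardware after one dispatch" is much easier for a driver team to act on than a screenshot of a broken sphere.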

 

 

I can tell you that on an Intel card (HD 530) the application TDRs. Do you have a DX12 capable Intel GPU to test on?

 

In the reduction code there is the assumption that the GPU has at least 32 threads in lockstep and some LDS barriers are missing.

Intel has only 8 threads in lockstep, so the code will likely go wrong.

You should add all the barriers and leave it to the compiler to remove them if it's safe to do so.
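To illustrate why the lockstep width matters, here is a small CPU model of a shared-memory tree reduction. It is not the repo's shader, and the function names are made up for the example. The barrier-free version simulates one schedule a GPU is allowed to pick when there is no barrier: each wave runs all reduction steps before the next wave starts. In HLSL the barrier in question would be `GroupMemoryBarrierWithGroupSync()`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Barrier-free schedule: every wave of `waveSize` threads runs ALL reduction
// steps before the next wave starts. Threads inside a wave run in lockstep.
int ReduceWithoutBarriers(std::vector<int> shared, std::size_t waveSize)
{
    const std::size_t n = shared.size();
    assert(n > 0 && (n & (n - 1)) == 0 && "size must be a power of two");
    assert(waveSize > 0);
    for (std::size_t waveStart = 0; waveStart < n; waveStart += waveSize) {
        for (std::size_t stride = n / 2; stride > 0; stride /= 2) {
            for (std::size_t tid = waveStart;
                 tid < waveStart + waveSize && tid < n; ++tid) {
                if (tid < stride) {
                    shared[tid] += shared[tid + stride];
                }
            }
        }
    }
    return shared[0];
}

// With a group-wide barrier after every step, all threads finish step k
// before any thread starts step k + 1, whatever the wave size is.
int ReduceWithBarriers(std::vector<int> shared)
{
    const std::size_t n = shared.size();
    assert(n > 0 && (n & (n - 1)) == 0 && "size must be a power of two");
    for (std::size_t stride = n / 2; stride > 0; stride /= 2) {
        for (std::size_t tid = 0; tid < stride; ++tid) {
            shared[tid] += shared[tid + stride];
        }
        // On the GPU, the group-wide barrier sits here.
    }
    return shared[0];
}
```

For 64 ones, the barrier version returns 64. The barrier-free version also returns 64 with a wave size of 32, because every active thread then lives in the first wave, but with a wave size of 8 the first wave reads partial sums that later waves have not written yet and ends up with the wrong total.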

 

 

One interesting thing: changing some block size value from 8x8 to a higher value (16x16 IIRC) in the UI doubled performance for me, from 7 ms to 3.6 ms.

Is that expected and is it possible to show detailed profiler timings somehow?

In the reduction code there is the assumption that the GPU has at least 32 threads in lockstep and some LDS barriers are missing. Intel has only 8 threads in lockstep, so the code will likely go wrong. You should add all the barriers and leave it to the compiler to remove them if it's safe to do so.

Thanks JoeJ, the reduction should have been removed since the actual call to that pass was removed; I will update the repo.

 

 

 

One interesting thing: changing some block size value from 8x8 to a higher value (16x16 IIRC) in the UI doubled performance for me, from 7 ms to 3.6 ms. Is that expected, and is it possible to show detailed profiler timings somehow?

Yup, I built this project with lots of different approaches as options, as you can see from the side panel. There are two different block volumes, one for the update block volume and one for the render block volume, and their resolutions can be changed. They are meant to be experimented with to figure out the best configuration.

 

For GPU timing, there is an Engine Panel in the bottom right corner, and its first collapsible line is GPU Profiler; click on it to open the options for showing timings for all passes. But there is probably another NV bug there, since I only observe corrupted timing info on the GTX 1080.

Edited by Mr_Fox
