
DX11 HLSL race condition when writing to shared memory passed to function


maxest    622

I have code like this:

groupshared uint tempData[ElementsCount];

[numthreads(ElementsCount/2, 1, 1)]
void CSMain(uint3 gID: SV_GroupID, uint3 gtID: SV_GroupThreadID)
{
    tempData[gtID.x] = 0;
}

And it works fine. Now I change it to this:

void MyFunc(inout uint3 gtID: SV_GroupThreadID, inout uint inputData[ElementsCount])
{
    inputData[gtID.x] = 0;
}

groupshared uint tempData[ElementsCount];

[numthreads(ElementsCount/2, 1, 1)]
void CSMain(uint3 gID: SV_GroupID, uint3 gtID: SV_GroupThreadID)
{
    MyFunc(gtID, tempData);
}

and I get "error X3695: race condition writing to shared memory detected, consider making this write conditional." Is there any way to work around this?

Adam_42    3629

Based on a quick search I found these:

 

https://www.gamedev.net/topic/594131-dx11-compute-shader-race-condition-error-when-using-optimization-level-2-or-3/

http://xboxforums.create.msdn.com/forums/t/63981.aspx

 

Based on those it sounds like there might be a bug in certain versions of the compiler. I'd suggest trying to use either command line fxc.exe or a more recent version of the d3dcompiler dll to see if it makes any difference.

maxest    622

I came across those threads as well, and they don't describe my problem.

Also, I'm not really sure how to update my d3dcompiler. I'm using Windows 10 so I presume it gets updated automatically. Although I use Visual Studio 2013 so I cannot really be sure if the most up-to-date dll is used.

I found out that the problem appears even in this code:

static const int ElementsCount = 512;


groupshared uint tempData[2 * ElementsCount];


void MyFunc(inout uint3 gtID: SV_GroupThreadID, inout uint inputData[2 * ElementsCount])
{

}


[numthreads(ElementsCount, 1, 1)]
void CSMain(uint3 gID: SV_GroupID, uint3 gtID: SV_GroupThreadID)
{
    MyFunc(gtID, tempData);
}

Note that I don't even write anything to tempData in MyFunc.
I also found that the problem goes away if I remove the "inout" modifier, but then the array probably just gets copied, because the code no longer works as expected.

Adam_42    3629

https://blogs.msdn.microsoft.com/chuckw/2012/05/07/hlsl-fxc-and-d3dcompile/ explains all the details of how the compiler DLL works.

If you want to check which version of the DLL your program is using, then just pause it in the debugger and look through the modules window for the DLL. I believe the latest version is D3dcompiler_47.dll

Have you tried compiling the shader using fxc.exe?

galop1n    938

There are many versions of d3dcompiler_47.dll, which was a very dumb idea…

If your shader is simply compiled by Visual Studio as an HLSL source file, the fxc and DLL you use are probably tied to the Windows SDK that is set up in your project.

I'm not saying that getting the latest one would solve this, but you may still be running an outdated compiler :)

Adam_42    3629

I've reproduced the behaviour, and simplified the case that goes wrong. Here's my minimal failing case:

groupshared uint tempData[1];

void MyFunc(inout uint inputData[1])
{
}

[numthreads(2, 1, 1)]
void CSMain()
{
    MyFunc(tempData);
}

It looks like just passing the argument to the function is enough to make it fail to compile.

Here's a workaround for the problem - don't pass the array as a function argument:

#define ElementsCount 256

groupshared uint tempData[ElementsCount];

void MyFunc(in uint3 gtID)
{
    tempData[gtID.x] = 0;
}

[numthreads(ElementsCount/2, 1, 1)]
void CSMain(uint3 gID: SV_GroupID, uint3 gtID: SV_GroupThreadID)
{
    MyFunc(gtID);
}

maxest    622

Yeah, I'm perfectly aware of that workaround, and that is how I do it. But because I can't pass a shared memory array to a function, I can't make the function more general. Instead I need to copy it into each of the files that use it.

maxest    622

I found a better workaround. It's so simple I can't imagine how I didn't come up with it sooner: I just used a macro.

Still, it would be nice if this bug were fixed. In the meantime I will be using macros in place of functions that take shared buffers as input.
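
The macro itself was not posted, so the following is only a guess at what it might look like, using the names from the earlier snippets. CLEAR_SHARED is a hypothetical name. The idea is that the preprocessor substitutes the array's name textually, so the groupshared array is never passed as an inout argument. Since HLSL inout parameters are copy-in/copy-out, that write-back from every thread is likely what the compiler reports as a race, though this is an assumption rather than something confirmed in the thread.

```hlsl
#define ElementsCount 256

// Hypothetical macro standing in for MyFunc. `sharedArray` is replaced
// textually by the name of a groupshared array, so no array is ever
// passed as a function argument.
#define CLEAR_SHARED(sharedArray, index) \
    sharedArray[index] = 0

groupshared uint tempData[ElementsCount];

[numthreads(ElementsCount / 2, 1, 1)]
void CSMain(uint3 gID : SV_GroupID, uint3 gtID : SV_GroupThreadID)
{
    // Expands to: tempData[gtID.x] = 0;
    CLEAR_SHARED(tempData, gtID.x);
}
```

After preprocessing this is the same code as the first version that compiled fine, while the macro can live in a shared include and be reused with any groupshared array.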

