DX11 HLSL unexpected dot product results



I wrote a deferred renderer a few years ago and have now picked the project up again to fix some outstanding bugs and extend some features. The project is written in C++ with DirectX 11 and HLSL.

While fixing the bugs I stumbled across strange behavior in one of my shader files, which took me some time to track down. At first I thought it had to do with my depth reconstruction algorithm in the point light shader, but after implementing alternative algorithms based on MJP's code snippets I ruled this out. It appears as if the dot function inside the shader sometimes (but reproducibly) yields wrong results. Also, when switching from D3D_DRIVER_TYPE_HARDWARE to D3D_DRIVER_TYPE_WARP the problem disappeared completely, so to me it seems like this is either some kind of HLSL/DX11 issue or a driver issue. I am using a GTX 980 for rendering and have the latest NVIDIA driver installed; I also tried an older laptop with an NVIDIA card, which gave the same strange results.

Here are some images that show the problem:


The final scene rendered with D3D_DRIVER_TYPE_HARDWARE:


Visualization of the light composition rendertarget with D3D_DRIVER_TYPE_HARDWARE:


The final scene rendered with D3D_DRIVER_TYPE_WARP (this is how it should always look!):


So when debugging the wrong pixels with the Visual Studio Graphics Analyzer, I found that the HLSL dot function returns unexpected and wrong values during my point light computations:


And the dot product of (-0.51, 0.78, 0.36) and (0, 1, 0) obviously should not be 0...

I am no expert in asm hlsl output, but the compiled shader code looks like this (last line is the dot product of lightVec and normal):


Does anyone have an idea on how to fix this issue or on how to avoid the strange dot product behavior?


It's a complete shot in the dark, but how about implementing the dot product yourself, and seeing what that yields? I once ran into a similar issue with a pow() method in a mobile environment where on one device it would give erroneous results, and on the other correct. Though, haha, it was a mobile environment.

But, yeah. The dot product is a relatively simple operation to implement, and other than tooling around with your drivers, you can eliminate that as a variable. Though, tbh, if you are experiencing this same issue on different generations of GPUs, I'm not sure it's the right direction either. But hey, who knows.



WARP working and hardware not can indeed be an indication of a driver error. But a dot? I also bet the compiler will issue a dot instruction even if you handwrite it :P

That debug view is suspicious but I for one wouldn't trust it. I never had much luck with shader debugging particularly because of such behaviour. But since you got output, color debugging it is. Dump the dot result directly afterwards, "wrapped lighting" style:

return float4(diffuseFactor.xxx * 0.5 + 0.5, 1.0);

I rather suspect NaNs coming from those pows or something. Check the shader compiler log; it might spit out warnings.
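
As a sketch of what the debug output above encodes: the `* 0.5 + 0.5` remap sends a dot product in [-1, 1] to a displayable gray level in [0, 1]. The C++ below mirrors that arithmetic on the CPU; `remapForDisplay` is a hypothetical helper name, not something from the thread.

```cpp
// Maps a signed diffuse term from [-1, 1] into the displayable range [0, 1],
// mirroring "diffuseFactor * 0.5 + 0.5" from the shader debug line.
// remapForDisplay is a hypothetical name used only for this illustration.
constexpr float remapForDisplay(float diffuseFactor) {
    return diffuseFactor * 0.5f + 0.5f;
}
```

With this mapping a dot product of -1 shows as black, 0 as mid-gray, and 1 as white, so negative results stay visible instead of being clamped to black.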


Thank you guys ;)
I rarely use shader debugging myself but in this case I didn't know what else to do.

Well, I finally found the reason for this strange behavior:

After implementing the dot product by myself like this:

float diffuseFactorXY = lightVec.x * normal.x + lightVec.y * normal.y;
float diffuseFactorZ = lightVec.z * normal.z;

float diffuseFactor = diffuseFactorXY + diffuseFactorZ;

I noticed that only the diffuseFactorZ is causing the issues, and specifically it was the normal.z value. So I took a closer look where it came from.

I am using compressed normals in my g-buffer, so I only store the x and y components and reconstruct the z component with sqrt(1 - normal.x^2 - normal.y^2), recovering the normal's sign from another g-buffer entry.
However, I forgot to normalize my normal before writing it to the g-buffer, so x^2 + y^2 could exceed 1 and the sqrt function received a negative argument. So yeah, sometimes the simplest mistakes can cause really strange issues in a totally different place.
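
The failure mode described above can be reproduced on the CPU. The following C++ is a minimal sketch, not the renderer's actual code: `Vec3`, `PackedNormal`, `encodeNormal`, `decodeNormal`, and `reconstructZNaive` are hypothetical names, the sign is stored as a plain float for simplicity, and the clamp in `decodeNormal` is an extra safety net that the thread does not mention. The input normal is assumed to be non-zero.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

// Two stored components plus a sign, standing in for the g-buffer layout.
struct PackedNormal { float x, y, signZ; };

// Naive reconstruction: NaN under IEEE rules whenever x^2 + y^2 > 1,
// which can happen if the stored normal was not unit length.
float reconstructZNaive(float x, float y) {
    return std::sqrt(1.0f - x * x - y * y);
}

// Normalize BEFORE packing (the step that was missing in the post).
// Assumes n is not the zero vector.
PackedNormal encodeNormal(const Vec3& n) {
    const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    const float x = n.x / len, y = n.y / len, z = n.z / len;
    return { x, y, z < 0.0f ? -1.0f : 1.0f };
}

// Rebuild z from x and y; the clamp guards against a slightly negative
// radicand caused by rounding (an added assumption, not from the thread).
Vec3 decodeNormal(const PackedNormal& p) {
    const float radicand = std::max(0.0f, 1.0f - p.x * p.x - p.y * p.y);
    return { p.x, p.y, p.signZ * std::sqrt(radicand) };
}
```

Feeding the unnormalized components (0.9, 0.9) to `reconstructZNaive` gives a negative radicand and therefore NaN, while encoding (0.9, 0.9, 0.9) through `encodeNormal` and decoding it again yields a z of roughly 0.577.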

I still find it confusing that WARP ignored this issue and seemed to return 0 instead of NaN from sqrt. Also, the debugger didn't show the normal.z value as NaN but simply as 0. There still is a NaN value in HLSL, isn't there?


WARP runs on the CPU, so floats should behave in an IEEE-compliant way. Don't rely on this for the GPU; at least assume it can behave differently. The debugger probably emulates the instructions on the CPU, too.

You can check for NaNs in HLSL though:

if (isnan(normal.z))
    return float4(1.0, 0.0, 0.0, 1.0); // Red alert, this is not a drill

Final note: Instrumenting your HLSL code can of course rearrange the instructions and give different results. And then hide the bug :( 

Edited by unbird

