Scoob Droolins

  1. This post reminded me to ask this - my engine does all rendering in HDR using DXGI_FORMAT_R16G16B16A16_FLOAT targets.  The final frame (scan-out) buffer is DXGI_FORMAT_R8G8B8A8_UNORM_SRGB so linear-to-gamma is handled automatically by HW.  Switching to  DXGI_FORMAT_R10G10B10A2_UNORM for scan-out, there is no automatic gamma encode.  Should I use the standard inverse sRGB compand or is there a different method for 10-bits?
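For reference, here is a minimal CPU-side sketch of what the manual encode could look like, assuming the display still expects sRGB-encoded SDR values (if it does, the transfer curve itself does not depend on the bit depth, only the quantization step does). The function names are hypothetical; in practice the same math would sit at the end of the final pixel shader.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Standard sRGB encode (linear -> display-referred), with the input clamped to [0, 1].
inline float LinearToSrgb(float linear)
{
    linear = std::min(std::max(linear, 0.0f), 1.0f);
    return (linear <= 0.0031308f)
        ? 12.92f * linear
        : 1.055f * std::pow(linear, 1.0f / 2.4f) - 0.055f;
}

// Quantize an encoded [0, 1] value to a 10-bit UNORM code (0..1023).
inline std::uint32_t ToUnorm10(float encoded)
{
    encoded = std::min(std::max(encoded, 0.0f), 1.0f);
    return static_cast<std::uint32_t>(std::lround(encoded * 1023.0f));
}

// Pack linear RGB plus a 2-bit alpha into the R10G10B10A2 bit layout:
// R in bits 0-9, G in bits 10-19, B in bits 20-29, A in bits 30-31.
inline std::uint32_t PackR10G10B10A2(float r, float g, float b, std::uint32_t a2)
{
    assert(a2 <= 3u);
    return ToUnorm10(LinearToSrgb(r))
         | (ToUnorm10(LinearToSrgb(g)) << 10)
         | (ToUnorm10(LinearToSrgb(b)) << 20)
         | (a2 << 30);
}
```

The packing step only illustrates the quantization; on the GPU the output merger does the UNORM conversion, so the shader would just return the encoded floats.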
  2. Instanced stereo rendering using DirectX

    This will get you there - short and sweet.
  3. Problems with exporting FBX files

    Try the last function on the page - CalculateGlobalTransform - it worked predictably enough for me when exporting from 3ds Max.
  4. Alpha to Coverage w/o MSAA

    In my D3D11 implementation, I am getting a proper alpha test just by setting the alpha-to-coverage flag in the blend state. I have removed the 'clip' instructions from my pixel shaders and it still works, even with no MSAA.  So ATC is acting just like clip - is there any performance advantage here?
  5. getting rid of near and far clipping planes

    If you're on DX11, you can just set DepthClipEnable = false in the D3D11_RASTERIZER_DESC structure when creating a rasterizer state.  According to the docs, this causes the rasterizer to skip the 0 <= z <= w clip check.  In practice it only skips the z <= w part and still clips close geometry right at 0.  This is a much neater solution than messing with the projection matrix, especially if you are using pre-defined rasterizer states.
  6. Thanks for all your replies.  The good news is that this was not a driver bug - it was my bug, a mix-up in identifying the texture format.  The DXT5 texture was being loaded as BC2 instead of BC3.
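To illustrate why that kind of mix-up shows up as stripes in the alpha channel, here is a rough sketch of how the same 8-byte alpha block decodes under each format: BC3 stores two endpoints plus 3-bit palette indices, while BC2 stores an explicit 4-bit alpha per texel. The function names are hypothetical, and the integer rounding is an assumption that may differ slightly from what hardware does.

```cpp
#include <cassert>
#include <cstdint>

// Decode one texel's alpha (texel 0..15, row-major) from the 8-byte BC3 alpha block.
inline std::uint8_t DecodeBc3Alpha(const std::uint8_t block[8], int texel)
{
    assert(texel >= 0 && texel < 16);
    const unsigned a0 = block[0];
    const unsigned a1 = block[1];

    // Bytes 2..7 hold sixteen 3-bit palette indices, little-endian.
    std::uint64_t bits = 0;
    for (int i = 0; i < 6; ++i)
        bits |= static_cast<std::uint64_t>(block[2 + i]) << (8 * i);
    const unsigned index = static_cast<unsigned>((bits >> (3 * texel)) & 0x7u);

    if (index == 0) return static_cast<std::uint8_t>(a0);
    if (index == 1) return static_cast<std::uint8_t>(a1);
    if (a0 > a1)    // 8-entry palette: six interpolated values between the endpoints
        return static_cast<std::uint8_t>(((8 - index) * a0 + (index - 1) * a1) / 7);
    if (index == 6) return 0;      // 6-entry palette reserves explicit 0 ...
    if (index == 7) return 255;    // ... and explicit 255
    return static_cast<std::uint8_t>(((6 - index) * a0 + (index - 1) * a1) / 5);
}

// Decode one texel's alpha from the 8-byte BC2 alpha block (explicit 4 bits per texel).
inline std::uint8_t DecodeBc2Alpha(const std::uint8_t block[8], int texel)
{
    assert(texel >= 0 && texel < 16);
    const unsigned byte = block[texel / 2];
    const unsigned nibble = (texel & 1) ? (byte >> 4) : (byte & 0xFu);
    return static_cast<std::uint8_t>(nibble * 17u);  // expand 4 bits to 8 by replication
}
```

For example, a BC3 block of { 255, 0, 0, 0, 0, 0, 0, 0 } is fully opaque, but read as BC2 only the first two texels are opaque and the rest are transparent.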
  7. Hi - I've implemented a pretty effective alpha test using Texture2D.GatherAlpha; it works even better with alpha-to-coverage enabled.  If it gets past the clip(), I multiply the scalar alpha return value with the final color.

    cbuffer PAlphaTest : register (cb2)
    {
      float4 alphaTestThresh;
    };

    float MainSampleAlphaTest (Texture2D tex, SamplerState samp, float2 texCoord)
    {
      // Fetch the alpha of the four texels in the bilinear footprint and average them.
      float4 alphaSamp = tex.GatherAlpha (samp, texCoord, int2(0,0));
      float alpha = dot (alphaSamp, 0.25);
      // Discard the pixel when the averaged alpha is below the threshold.
      clip (alpha - alphaTestThresh.x);
      return (alpha);
    }

    With a DXGI_FORMAT_R8G8B8A8_UNORM_SRGB texture I get this:

    [attachment=28982:a0.jpg]

    With a DXGI_FORMAT_BC3_UNORM_SRGB texture I get this:

    [attachment=28983:a1.jpg]

    Strange bunch of lines with the BC3 texture - anyone seen this effect before? Thanks.
  8. 5. Use D3DXSaveTextureToFile to save RT2 directly to a file
  9. Soft Particles

    In D3D9, you can indeed create a multisample render target, but it is not a texture, and cannot be used in a shader.  Your only option would be to resolve this target to a non-multisample texture of the same size, and use that in your shader.  Use StretchRect for the resolve.
  10. Simplex Noise Texture Lookups?

    If you're interested in a non-texture method for Perlin simplex noise, I found this code here:  I've used it in a pixel shader to generate really nice fine-grained noise when using screen X,Y as a parameter to the method.

    // Note: fmod matches GLSL mod() only for non-negative inputs.
    float3 permute(float3 x) { return fmod(((x*34.0)+1.0)*x, 289.0); }

    // Perlin simplex noise
    float snoise(float2 v)
    {
      const float4 C = float4(0.211324865405187, 0.366025403784439, -0.577350269189626, 0.024390243902439);

      // First corner
      float2 i = floor(v + dot(v, C.yy) );
      float2 x0 = v - i + dot(i, C.xx);

      // Other corners
      float2 i1;
      i1 = (x0.x > x0.y) ? float2(1.0, 0.0) : float2(0.0, 1.0);
      // i1.x = step( x0.y, x0.x ); // x0.x > x0.y ? 1.0 : 0.0
      // i1.y = 1.0 - i1.x;
      float4 x12 = x0.xyxy + C.xxzz;
      x12.xy -= i1;

      // Permutations
      i = fmod(i, 289.0);
      float3 p = permute( permute( i.y + float3(0.0, i1.y, 1.0 )) + i.x + float3(0.0, i1.x, 1.0 ));

      // Falloff from the squared distances to the three corners
      // (the third term was lost in the post; x12.zw is assumed here)
      float3 m = max(0.5 - float3(dot(x0,x0), dot(x12.xy,x12.xy), dot(x12.zw,x12.zw)), 0.0);
      m = m*m ;
      m = m*m ;

      // Gradients
      float3 x = 2.0 * frac(p * C.www) - 1.0;
      float3 h = abs(x) - 0.5;
      float3 ox = floor(x + 0.5);
      float3 a0 = x - ox;
      m *= 1.79284291400159 - 0.85373472095314 * ( a0*a0 + h*h );

      // Final noise value
      float3 g;
      g.x = a0.x * x0.x + h.x * x0.y;
      g.yz = a0.yz * x12.xz + h.yz * x12.yw;
      return 130.0 * dot(m, g);
    }
  11. You have to sample the leaf texture and somehow output the texel alpha, either by multiplying it by your depth value (if you're using float textures) or just output it straightaway (if you're using a depth texture / PCF).

    BTW, here are some useful values to use for the alpha test reference value with alpha test func = greater:

    0xBF - render when one of four pixels is color key value
    0x7F - render when two of four pixels are color key value
    0x3F - render when three of four pixels are color key value
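Those three reference values appear to line up with 3/4, 2/4 and 1/4 of 255, i.e. the alpha a bilinear sample would produce when some of the four contributing texels are color-keyed. A small sketch of that arithmetic, assuming equal weights for the four texels, keyed texels at alpha 0, all others at alpha 255, and truncation to 8 bits (the function name is hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// Alpha of an equally weighted 2x2 sample when `keyed` of the four texels
// are fully transparent (alpha 0) and the rest are fully opaque (alpha 255).
inline std::uint8_t AlphaWithKeyedTexels(int keyed)
{
    assert(keyed >= 0 && keyed <= 4);
    return static_cast<std::uint8_t>((255 * (4 - keyed)) / 4);
}
```

With one, two and three keyed texels this yields 0xBF, 0x7F and 0x3F respectively.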
  12. Heat Distortion Effect

    Think of the heat effect as just one type of scene distortion which uses the current backbuffer as a source texture. There's also frosted/rippled glass, raindrops, ice, and so on. These materials should be drawn last, so our approach is to resolve the current backbuffer into a same-sized texture, then sample it with a screen projection. Shaders which use this texture don't use any lighting, it's already pre-lit. Works nicely for a lot of materials, and doesn't cause any serious performance issues.
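The "screen projection" lookup can be sketched on the CPU side roughly as follows. This is only an illustration under assumed D3D conventions (clip space with y up, texture v pointing down) and it ignores the D3D9 half-texel offset; the types and function names are hypothetical, and in a real shader the offset would come from a noise or normal map.

```cpp
#include <cassert>

struct Float2 { float x, y; };
struct Float4 { float x, y, z, w; };

// Map a clip-space position to [0, 1] texture coordinates of the resolved backbuffer.
inline Float2 ClipToScreenUv(const Float4& clip)
{
    assert(clip.w != 0.0f);
    const float ndcX = clip.x / clip.w;   // perspective divide
    const float ndcY = clip.y / clip.w;
    return { ndcX * 0.5f + 0.5f, -ndcY * 0.5f + 0.5f };
}

// Shift the lookup by a small distortion vector to get the heat / glass ripple.
inline Float2 DistortedUv(const Float4& clip, Float2 offset, float strength)
{
    const Float2 uv = ClipToScreenUv(clip);
    return { uv.x + offset.x * strength, uv.y + offset.y * strength };
}
```

With zero strength the material samples exactly the pixel behind it, which is a handy sanity check before adding the distortion.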
  13. Using CSM with real depth textures (D24X8) and HW PCF, how is the issue of oversampling in the farther cascades solved? This is solved with MIPs and anisotropic filtering for standard color maps. But with depth maps, you can't create MIPs (except manually) and you can't autogen MIPs. Is the only solution, short of VSM, to dump HW PCF and go with FP24 or FP32 textures, which allow autogen of MIP levels?
  14. fast way of Checking hlsl code

    Try GPU Shader Analyzer from AMD - besides compiling shaders, you get static performance analysis and asm output, which is great for figuring out exactly what your shaders are doing.
  15. Shader limitations and best practices

    Regarding your cone stepping loop - you'd have to check the asm output to see if this bit is being unrolled (inlined), but if it is, it could be caused by defining cone_steps as a const int = 15. The compiler then knows the maximum number of iterations and will unroll the whole thing if there are enough instruction slots. Try leaving it undefined and setting the value from the CPU; the compiler then doesn't know the maximum number of iterations and will leave this bit as a loop. Whether that will help performance is unknown, but it will reduce instruction count by a ton.