# DX11 vs_4_0 optimisations


## Recommended Posts

Hiya,
Sorry, the title should read ps_4_0 optimisations...

I've searched the forums for this problem and couldn't find anything related.
I'm writing a VS and PS in DX11 using shader model 4. If I use optimisation level 3 in the D3DCompile function, I get around 65fps; however, if I turn off optimisations using the skip optimisations flag, I get 80fps!
Has anyone heard of this kind of thing? Could it simply be the drivers for my GPU, or the GPU itself? The shaders are nothing special; the VS is a basic 'pass-through' to the PS. Any ideas or help are appreciated. The PS is below:

    cbuffer ScreenDim
    {
        float screenWidth;
        float screenHeight;
        float2 padding;
    }

    struct PixelLightingType
    {
        float4 position : SV_POSITION;
        float2 tex : TEXCOORDS0;
        float4 lightPR : TEXCOORDS1;
        float4 lightCI : TEXCOORDS2;
    };

    Texture2D inTex[2];
    SamplerState Sampler;

    float4 LightingPixelShader(PixelLightingType input) : SV_TARGET
    {
        float4 outColor;
        float depth = inTex[0].Sample(Sampler,input.tex).r;
        float3 normal = inTex[1].Sample(Sampler,input.tex).rgb;
        normal = normal*2-1;
        normal = normalize(normal);

        float3 pixel;
        pixel.x = screenWidth * input.tex.x;
        pixel.y = screenHeight * input.tex.y;
        pixel.z = depth;

        float3 shading = 0;
        float3 lightDir = input.lightPR.xyz - pixel;
        float cone = saturate(1 - length( lightDir)/input.lightPR.w);
        if (cone>0)
        {
            float distance = 1/length(lightDir) * input.lightCI.w;
            float amount = max(dot(normal + depth , normalize(distance)),0);
            shading = distance * amount * cone * input.lightCI.rgb;
        }

        outColor = float4(shading,1);
        return outColor;
    }

Dave

##### Share on other sites
If you're actually sure that it's the pixel shader that's causing this performance delta, then I would look at the compiled assembly and see what the differences are. It might be putting in a branch instruction in one version, and flattening it in the other.

##### Share on other sites
Hiya MJP,

Yes, I'm 100% sure it's the pixel shader. Or at least I think I am.
I'm compiling the VS and PS separately and changing the compilation flag only for the PS. I'm not using the FX framework at all. Without optimisations, the PS ends up with 45 instruction slots, which include two 'if else endif' blocks nested one inside the other. The optimised version is only 26 instruction slots with no nesting or branching, but it's almost 20% slower.
TBH, the assembly was the first place I looked. Is it worth me posting the assembly output here?
Are you thinking it might be something stalling in the pipeline?

##### Share on other sites
You can try forcing the branch in the optimized version, and see if that speeds things up. Just do this:
    [branch] if (cone>0) { ... }

##### Share on other sites
Wow, thank you.

It increased the instruction slots to 31 but brought the framerate back up to 80.
I've read about those attributes in the docs, but I thought they would make things slower as more instruction slots would be used. Do you know where I could find information about the speed of the shader commands and functions?

Thank you for that tip and for fixing it up! And I've learned something new too.
Thanks again.

Dave.

### Similar Content

• By Stewie.G
Hi,

I've been trying to implement a basic gaussian blur using the gaussian formula, and here is what it looks like so far:
    float gaussian(float x, float sigma)
    {
        float pi = 3.14159;
        float sigma_square = sigma * sigma;
        float a = 1 / sqrt(2 * pi*sigma_square);
        float b = exp(-((x*x) / (2 * sigma_square)));
        return a * b;
    }
My problem is that I don't quite know what sigma should be.
It seems that if I provide a random value for sigma, weights in my kernel won't add up to 1.
So I ended up calling my gaussian function with sigma == 1, which gives me weights adding up to 1, but also a very subtle blur.
Here is what my kernel looks like with sigma == 1
[0]    0.0033238872995488885
[1]    0.023804742479357766
[2]    0.09713820127276819
[3]    0.22585307043511713
[4]    0.29920669915475656
[5]    0.22585307043511713
[6]    0.09713820127276819
[7]    0.023804742479357766
[8]    0.0033238872995488885

I would have liked it to be more "rounded" at the top, or to have a better spread, instead of wasting [0], [1] and [2] on values below 0.1.
Based on my experiments, the key to this is to provide a different sigma, but if I do, my kernel values no longer add up to 1, which results in a darker blur.
I've found this post
... which helped me a bit, but I am really confused by the part where he divides sigma by 3.
Can someone please explain how sigma works? How is it related to my kernel size, how can I balance my weights with different sigmas, etc.?

Thanks :-)
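
One common way to keep the weights summing to 1 for any sigma is to divide every weight by the sum of all weights after sampling the Gaussian. The following is only a minimal C++ sketch of that idea, not the method from the linked post: `gaussian` repeats the formula from the question, while `make_kernel` and its `radius` parameter are hypothetical names introduced for illustration.

```cpp
#include <cmath>
#include <vector>

// Gaussian density, same formula as in the question above.
float gaussian(float x, float sigma)
{
    const float pi = 3.14159265f;
    const float sigma_square = sigma * sigma;
    const float a = 1.0f / std::sqrt(2.0f * pi * sigma_square);
    const float b = std::exp(-(x * x) / (2.0f * sigma_square));
    return a * b;
}

// Hypothetical helper: builds a (2 * radius + 1)-tap kernel and divides each
// weight by the total, so the weights sum to 1 whatever sigma is.
std::vector<float> make_kernel(int radius, float sigma)
{
    std::vector<float> weights;
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        const float w = gaussian(static_cast<float>(i), sigma);
        weights.push_back(w);
        sum += w;
    }
    for (float& w : weights) {
        w /= sum;  // renormalise so the blur does not darken the image
    }
    return weights;
}
```

Calling `make_kernel(4, 2.0f)` would give a 9-tap kernel with a flatter top than sigma == 1. As a rough rule of thumb (an assumption here, not something stated in the thread), sigma is often chosen somewhere around a third to a half of the radius, so that the kernel covers most of the curve before it is cut off.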

• Is it possible to asynchronously create a Texture2D using DirectX11?
I have a native Unity plugin that downloads 8K textures from a server and displays them to the user for a VR application. This works well, but there's a large frame drop when calling CreateTexture2D. To remedy this, I've tried creating a separate thread that creates the texture, but the frame drop is still present.
Is there anything else that I could do to prevent that frame drop from occurring?

• I'm trying to draw a circle using math:

    class coordenates
    {
    public:
        coordenates(float x=0, float y=0)
        {
            X = x;
            Y = y;
        }
        float X;
        float Y;
    };

    coordenates RotationPoints(coordenates ActualPosition, double angle)
    {
        coordenates NewPosition;
        NewPosition.X = ActualPosition.X*sin(angle) - ActualPosition.Y*sin(angle);
        NewPosition.Y = ActualPosition.Y*cos(angle) + ActualPosition.X*cos(angle);
        return NewPosition;
    }

But now I know that this has one problem, because I don't use the origin.
Even so, I'm having trouble working out how to rotate the point.
These coordinates are floating-point values between -1 and 1.
Can anyone advise me further on how to create the circle?
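
For reference, the standard 2D rotation uses cos for the X term and sin for the Y term in the first row, and the reverse in the second, whereas the function above uses only sin in one line and only cos in the other. The following is a minimal C++ sketch of rotating a point about an arbitrary origin and stepping it around a circle. `Point`, `rotate_about` and `circle_points` are hypothetical names for illustration and are not taken from the post.

```cpp
#include <cmath>
#include <vector>

struct Point { float x; float y; };

// Rotates p about center by angle (radians, counter-clockwise) using the
// standard rotation: x' = x*cos - y*sin, y' = x*sin + y*cos.
Point rotate_about(Point p, Point center, double angle)
{
    const double c = std::cos(angle);
    const double s = std::sin(angle);
    const double dx = p.x - center.x;  // move so the origin is at center
    const double dy = p.y - center.y;
    return { static_cast<float>(center.x + dx * c - dy * s),
             static_cast<float>(center.y + dx * s + dy * c) };
}

// Generates `segments` points on a circle by rotating one start point
// around the center in equal angular steps.
std::vector<Point> circle_points(Point center, float radius, int segments)
{
    const double two_pi = 6.283185307179586;
    std::vector<Point> pts;
    const Point start{ center.x + radius, center.y };
    for (int i = 0; i < segments; ++i) {
        pts.push_back(rotate_about(start, center, two_pi * i / segments));
    }
    return pts;
}
```

With coordinates in the -1 to 1 range, something like `circle_points({0.0f, 0.0f}, 0.5f, 32)` would give 32 vertices that can be joined as a line loop. If the viewport is not square, the circle will still look stretched unless the aspect ratio is corrected separately.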
• By isu diss
I managed to convert the OpenGL code at http://john-chapman-graphics.blogspot.co.uk/2013/02/pseudo-lens-flare.html to HLSL, but unfortunately I don't know how to add it to my atmospheric scattering code (Sky - first image). Can anyone help me?
I tried to bind the sky texture as an SRV and implement the lens flare code in the pixel shader, but I don't know how to separate them (second image).

• By jonwil
I have some code (not written by me) that is creating a window to draw stuff into using these:
CreateDXGIFactory1 to create an IDXGIFactory1
dxgi_factory->CreateSwapChain to create an IDXGISwapChain
D3D11CreateDevice to create an ID3D11Device and an ID3D11DeviceContext
Other code (that I don't quite understand) that creates various IDXGIAdapter1 and IDXGIOutput instances
Still other code (that I don't quite understand) that is creating some ID3D11RenderTargetView and ID3D11DepthStencilView instances and is doing something with those as well (possibly loading them into the graphics context somewhere, although I can't quite see where)
What I want to do is to create a second window and draw stuff to that as well as to the main window (all drawing would happen on the one thread with all the drawing to the sub-window happening in one block and outside of any rendering being done to the main window). Do I need to create a second IDXGISwapChain for my new window? Do I need to create a second ID3D11Device or different IDXGIAdapter1 and IDXGIOutput interfaces? How do I tell Direct3D which window I want to render to? Are there particular d3d11 functions I should be looking for that are involved in this?
I am good with Direct3D9 but this is the first time I am working with Direct3D11 (and the guy who wrote the code has left our team so I can't ask him for help