DX11 slow on integrated graphics


Hi, I'm writing a simple DX11 app and stumbled on the fact that it runs quite slowly on integrated graphics with a hardware feature level of DX10. More precisely, I'm getting only about 18 fps on my office Intel GMA X4500, but my desktop ATI HD6850 renders the same code at around 1900 fps. I expected the code to run slower, but not 100 times slower. Here's what I got from PIX:



I can't imagine that OMSetRenderTargets or constant buffer updates can take that much time, so it must be the buffer clears? I guess the timings are not accurate at all... What could help improve performance on this machine? Could the slowdown be caused by the 32-bit texture format?


I believe that is a DX10 card so you would be in software emulation if you call DX11.

Nope. If you use a feature level of D3D10 you get hardware acceleration of D3D10-level features - that's kind of the whole point of feature levels.


There are several methods for tuning shaders.
Maybe you should re-examine your shaders.
For example, developers often remove if statements where possible (for better threading).
Another method suggested by Intel is to render static and dynamic shadow maps into separate textures, because they change at different frequencies.
Maybe the Intel driver cannot optimize your code.
The solution would be to research faster algorithms.
Intel GPUs are not very fast right now, but that should improve in the future.

Well, what do you expect? It's an integrated card; it's not supposed to be fast. That said, 18 fps does seem a bit slow. The timings do make sense because GPUs are asynchronous devices: when you tell them to Draw(), you are actually telling them to "Draw() as soon as you can", which returns instantly. Later, when you call a constant buffer update or a render target change, you are forced to wait for the GPU to finish rendering, since you can't update resources that are in use. This is why those calls appear to take forever.

Try doing it with a very simple test case and see if you get the same timings/results. It could just be that your integrated graphics card isn't very optimized for DX11-reduced/DX10 (it may very well have been slapped on it as an afterthought).
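
The deferred-cost effect described above can be illustrated with a toy model. This is only a sketch, not how PIX or any driver actually accounts for time: `ToyGpuQueue`, `Sample`, `profile_frame`, and the millisecond costs are all hypothetical names and made-up numbers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: a "GPU" that accumulates queued work and charges the CPU
// for all of it at the next call that has to wait. Costs are made-up numbers.
struct ToyGpuQueue {
    double pending_ms = 0.0;  // GPU work queued but not yet waited on

    // Queue work; returns the CPU time charged to this call (none in this model).
    double submit(double gpu_cost_ms) {
        assert(gpu_cost_ms >= 0.0);
        pending_ms += gpu_cost_ms;
        return 0.0;
    }

    // A call that must wait for the GPU (e.g. touching a resource in use);
    // returns the CPU time charged, which is all the work queued so far.
    double blocking_call() {
        double waited = pending_ms;
        pending_ms = 0.0;
        return waited;
    }
};

struct Sample {
    std::string name;
    double cpu_ms;
};

// Records what a CPU-side profiler would see for a hypothetical frame.
std::vector<Sample> profile_frame() {
    ToyGpuQueue gpu;
    std::vector<Sample> out;
    out.push_back({"Draw (shadow map)", gpu.submit(20.0)});
    out.push_back({"Draw (blur)", gpu.submit(10.0)});
    out.push_back({"Update constant buffer", gpu.blocking_call()});
    return out;
}
```

In this model both draws are reported as taking 0 ms and the constant buffer update as taking 30 ms, even though the update itself did no rendering work.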

I see that you're doing a lot of render target setting and clearing here as well. This is not going to play well with integrated graphics, which are quite weak in terms of fillrate.

Depending on what you're doing you may be able to get away without clearing the render targets. If you're drawing over the full extents of the target, for example, you really don't need to clear as everything is going to be covered anyway - that should get you back a few frames.

For your final draw, do you even need a depth/stencil view? All that you're doing is blasting the end results of your render to the screen, so you may be able to drop the depth/stencil, and disable depth test/depth write for this part of the draw.

Also very important to consider is that if you're clearing depth you should also clear stencil at the same time - even if you're not using it. This is because depth and stencil are often interleaved with 24 bits for depth and 8 for stencil (it's not clear from your shot if you have this format) so clearing both together can get you a MUCH faster clear.
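
To see why a combined clear can be cheaper when depth and stencil share one word, here is a conceptual sketch. The bit layout (depth in the low 24 bits, stencil in the high 8) and all function names are assumptions for illustration; real hardware may tile or compress depth and use dedicated fast-clear paths, so treat this as the read-modify-write argument only.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative layout (an assumption for this sketch): depth in the low
// 24 bits and stencil in the high 8 bits of one 32-bit word per pixel.
constexpr std::uint32_t kDepthMask   = 0x00FFFFFFu;
constexpr std::uint32_t kStencilMask = 0xFF000000u;

std::uint32_t pack(std::uint32_t depth24, std::uint8_t stencil) {
    assert(depth24 <= kDepthMask);
    return (static_cast<std::uint32_t>(stencil) << 24) | depth24;
}

// Depth-only clear: the old word must be read so the stencil bits survive.
std::uint32_t clear_depth_only(std::uint32_t old_word, std::uint32_t depth24) {
    assert(depth24 <= kDepthMask);
    return (old_word & kStencilMask) | depth24;
}

// Combined clear: the whole word is overwritten; the old contents are never read.
std::uint32_t clear_depth_and_stencil(std::uint32_t depth24, std::uint8_t stencil) {
    return pack(depth24, stencil);
}
```

The depth-only path needs the previous value of every pixel, while the combined path is a plain store of a constant, which is the kind of operation that is easy to accelerate.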

Finally, those R32G32 textures are not going to perform well at all on this kind of hardware. Consider a simpler format - do you really need all that precision?
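
As a rough way to reason about the format choice, the sketch below estimates how many bytes are written when every pixel of a render target is touched. R32G32 is 8 bytes per pixel and R16G16 is 4; the helper names, the 1024x1024 target size, and the pass count are hypothetical, and the estimate counts writes only, so it cannot tell you whether bandwidth is the real bottleneck.

```cpp
#include <cassert>
#include <cstdint>

// Bytes written by touching every pixel of a render target once.
std::uint64_t bytes_per_pass(std::uint32_t width, std::uint32_t height,
                             std::uint32_t bytes_per_pixel) {
    assert(bytes_per_pixel > 0);
    return static_cast<std::uint64_t>(width) * height * bytes_per_pixel;
}

// Bytes written per second for a number of full-target passes per frame.
std::uint64_t bytes_per_second(std::uint64_t per_pass, std::uint32_t passes,
                               std::uint32_t fps) {
    return per_pass * passes * fps;
}
```

For a hypothetical 1024x1024 target, `bytes_per_pass(1024, 1024, 8)` gives 8 MiB for R32G32 and `bytes_per_pass(1024, 1024, 4)` gives 4 MiB for R16G16, so each clear or full-screen pass writes half as much data with the smaller format.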

Thanks for the tips. I will try to optimize the shaders later; yesterday I was just too tired. Here are some things I tried and the results:
changed texture format to R16G16: +0 fps
removed three useless render target clears: +1 fps
added clear depth flag to last depth/stencil clear: +7 fps
I don't think I can remove the last depth test, because I would then need to implement some sort of geometry sorting by depth. My scene: render a variance shadow map to R16G16, perform a Gaussian blur on the X axis, then a Gaussian blur on the Y axis, then render the whole scene. I will try to get more accurate timings with the flush() command; maybe then I can track down the culprit, or just confirm that this GPU can't render squat.
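
Since the X and Y blur passes described above each touch every pixel of the shadow map, the kernel radius is one knob worth checking on weak hardware. The following is a minimal sketch of computing normalized 1D Gaussian weights on the CPU and comparing tap counts; the function names, radius, and sigma are hypothetical and not taken from the poster's code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Normalized 1D Gaussian weights for taps at offsets -radius..+radius.
std::vector<double> gaussian_weights(int radius, double sigma) {
    assert(radius >= 0 && sigma > 0.0);
    std::vector<double> w(2 * radius + 1);
    double sum = 0.0;
    for (int i = -radius; i <= radius; ++i) {
        double v = std::exp(-(i * i) / (2.0 * sigma * sigma));
        w[i + radius] = v;
        sum += v;
    }
    for (double& v : w) v /= sum;  // make the weights sum to 1
    return w;
}

// Texture fetches per pixel: two 1D passes versus one full 2D kernel.
int taps_separable(int radius) { return 2 * (2 * radius + 1); }
int taps_full_2d(int radius)   { return (2 * radius + 1) * (2 * radius + 1); }
```

With a radius of 3, the separable form costs 14 fetches per pixel instead of 49, and shrinking the radius reduces the cost of both passes further at the price of a sharper shadow edge.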
