galop1n last won the day on November 2

galop1n had the most liked content!

Community Reputation

1005 Excellent


About galop1n

  1. A failed view creation can result in a device removed, but it should not have crashed in the first place. Are you sure you did not destroy the texture and end up with a stale pointer?
  2. Dx11 is the worst: most command building is delayed, and Present is a black box that does all the work. Deferred contexts are even worse, rolling even more work into Present… You can use the no-wait flag at Present to dissociate the vsync wait from the real CPU work. But that black box also means you can't know exactly which part of your frame is really eating your CPU time. It can be memory defrag, first-time shader compiles, etc. Nvidia does a better job than AMD at multithreading the driver for you, but again, at the price of opacity. Rule of thumb: optimize for AMD (a lost cause without AMD insider support), and assume it will work as well or better on nVidia, for your own sanity. On the GPU, heavy use of timestamps helps to inspect frame duration.
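
For the GPU timestamp part, a minimal sketch of the conversion step only, assuming you already read back two timestamp values and the tick frequency (in D3D11 that frequency comes from the disjoint timestamp query data). The helper name `gpuTicksToMs` is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert a pair of GPU timestamps into milliseconds.
// `frequency` is in ticks per second, as reported by the API together with
// the timestamps. Results are only meaningful if the API did not flag the
// measured interval as disjoint.
inline double gpuTicksToMs(std::uint64_t beginTicks,
                           std::uint64_t endTicks,
                           std::uint64_t frequency)
{
    assert(frequency > 0 && endTicks >= beginTicks);
    return static_cast<double>(endTicks - beginTicks) * 1000.0
         / static_cast<double>(frequency);
}
```

Wrapping each pass of the frame in a begin/end timestamp pair and feeding the results through something like this gives a per-pass breakdown, which is about the only visibility you get on the GPU side.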
  3. Gamma correction - sanity check

    A swapchain in 8888, either unorm or srgb, still expects sRGB content for display. So you are likely to keep it srgb unless you do a copy from a separate 8888 unorm view of your offscreen surface. The offscreen surface needs enough precision, so 8888 srgb or HDR. You will soon see that you need a tonemap, so don't assume LDR all the way.
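
As a sanity check of what the srgb formats do for you in hardware, here is a sketch of the standard sRGB transfer functions on the CPU. The function names are illustrative, and inputs are assumed to be in the [0, 1] range.

```cpp
#include <cassert>
#include <cmath>

// Decode an sRGB-encoded channel to linear light (what an srgb view does
// when you sample from it).
inline float srgbToLinear(float c)
{
    assert(c >= 0.0f);
    return (c <= 0.04045f) ? c / 12.92f
                           : std::pow((c + 0.055f) / 1.055f, 2.4f);
}

// Encode a linear channel to sRGB (what an srgb render target does when
// you write to it).
inline float linearToSrgb(float c)
{
    assert(c >= 0.0f);
    return (c <= 0.0031308f) ? c * 12.92f
                             : 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f;
}
```

Lighting math belongs in linear space; the encode happens once, on the final write. If you copy through an unorm view instead, the values pass through untouched, so they must already be sRGB encoded.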
  4. You can also make the step count depend on the view angle: if you are looking straight at a wall, you won't see much parallax to start with. Same thing with distance: you can transition to a height of zero in the distance and, doing so, skip the stepping. All the subsequent texture fetches, not only the one to the depth map, need to use a custom gradient or you are going to get edge artifacts. And this is sad because SampleGrad is half rate. I personally use the original uv gradient for everything. Using BC4_UNORM definitely matters for bandwidth too. And last, most attempts at a quad tree are a false good idea; the shader logic overhead usually means it gets crushed by the brute-force version… And I am not even talking about silhouette-aware versions… Maximum sadness.
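
The angle and distance controls can be sketched on the CPU like this. It is only an illustration of the idea, not shader code: the function names, the linear ramps, and the parameters (`minSteps`, `maxSteps`, `fadeStart`, `fadeEnd`) are all assumptions you would tune or replace.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Step count from the view angle. nDotV is dot(normal, viewDir) with both
// vectors normalized: 1 means looking straight at the surface (little
// parallax, few steps), 0 means a grazing angle (many steps).
inline int parallaxStepCount(float nDotV, int minSteps, int maxSteps)
{
    assert(minSteps >= 0 && maxSteps >= minSteps);
    const float facing = std::min(std::max(std::fabs(nDotV), 0.0f), 1.0f);
    const float t = 1.0f - facing;
    return minSteps + static_cast<int>(
        std::lround(t * static_cast<float>(maxSteps - minSteps)));
}

// Height scale from the distance: full height up to fadeStart, zero from
// fadeEnd onward. When this returns 0 the stepping can be skipped entirely.
inline float parallaxHeightScale(float distance, float fadeStart, float fadeEnd)
{
    assert(fadeEnd > fadeStart);
    const float t = (distance - fadeStart) / (fadeEnd - fadeStart);
    return 1.0f - std::min(std::max(t, 0.0f), 1.0f);
}
```

In a real shader both values would be computed per pixel before the ray-march loop, with an early out when the height scale reaches zero.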
  5. Because between the draw of the cube and the draw of the pyramid, you change the matrix to use. It is not a concern of the shader; it is a matter of pointing the constant buffer view at a different value (and, because you are doing DX12, of dealing with the memory lifetime of the buffer content).
  6. No one said it, so I will say it: "If you are not yet an expert at DX11, or if you don't know why your application is in the 1% that would benefit from DX12 explicitly, then you should stick to DX11". DX12 is not a replacement for DX11; it is an edge-case scenario, just like Vulkan is an edge-case scenario to OpenGL. You are struggling with rendering two objects; DX12 is not for you until you have at least many years of intense DX11 usage behind you.
  7. It is illegal and undefined behavior; it may work on some hardware/drivers and not on others. Especially with DX11, where a lot happens behind the scenes (defrags, renaming, paging, …).
  8. My recommendation is even better: use a StructuredBuffer for that.
  9. Because it would be a waste, and because it is not C++. Don't overthink it ;)
  10. Unless you are on some nvidia hardware and doing last-chance optimization, you should just stick to a structured buffer for large storage of constants. You do not get the exotic (not C++ compatible) alignment rules, and you do not get the restriction of having to update the full buffer (lifting that needs dx11.1, so Windows 8 minimum, just saying fyi, as with dx12 this is not a problem anyway).
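
To make the alignment point concrete, here is a small sketch comparing where array elements land in a constant buffer versus a structured buffer. It only models the rule that each array element in a cbuffer starts on its own 16-byte register; the helper names are made up, and `float` is assumed to be 4 bytes.

```cpp
#include <cassert>
#include <cstddef>

// Round a size up to the next multiple of 16 bytes (one float4 register).
constexpr std::size_t roundUp16(std::size_t bytes)
{
    return (bytes + 15) / 16 * 16;
}

// Byte offset of element `index` of an array declared inside a cbuffer:
// every array element starts on its own 16-byte register.
constexpr std::size_t cbufferArrayOffset(std::size_t index, std::size_t elementSize)
{
    return index * roundUp16(elementSize);
}

// Byte offset of element `index` in a structured buffer: elements are
// packed at the declared stride, like a plain C++ array.
constexpr std::size_t structuredBufferOffset(std::size_t index, std::size_t stride)
{
    return index * stride;
}

// A plain C++ `float weights[4]` matches the structured buffer layout only.
static_assert(structuredBufferOffset(3, sizeof(float)) == 3 * sizeof(float),
              "structured buffer matches a C++ float array");
static_assert(cbufferArrayOffset(3, sizeof(float)) == 48,
              "a float array in a cbuffer burns one register per element");
```

So a `memcpy` of a C++ float array lands correctly in a structured buffer, while the same array in a cbuffer needs manual padding on the CPU side.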
  11. When you trigger a draw call, you provide a base vertex location and a start index location. If you want to draw a cube and a pyramid from the same vertex and index buffers, you just have to issue two draw calls with the proper index count and offsets to the wanted geometry. If you want to draw both the cube and the pyramid in a single draw call, treated as a single geometry, you will have to do something like skinned geometry: provide a bone index in the vertices, and read that in the vertex shader to index into an array of world matrices. Also, I would advocate against describing everything as matrices, and for thinking in terms of transformations instead, because a transformation does not have to be a matrix. For example, a world transformation could be a quaternion for orientation, a vector3 for translation and a scalar for scale (no, don't do non-uniform scale, it is bad!).
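
The two-draw-call approach can be sketched without any graphics API: pack both meshes into shared arrays and remember, per mesh, the three values an indexed draw needs. `SharedGeometry`, `DrawRange` and `Vertex` are illustrative names, and each mesh's indices are assumed to be relative to its own first vertex.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Vertex { float x, y, z; };

// The three arguments of an indexed draw: how many indices to read, where
// they start in the shared index buffer, and which vertex index 0 maps to.
struct DrawRange {
    std::uint32_t indexCount;
    std::uint32_t startIndexLocation;
    std::int32_t  baseVertexLocation;
};

struct SharedGeometry {
    std::vector<Vertex>        vertices;
    std::vector<std::uint32_t> indices;

    // Append one mesh and return the range to pass to its draw call.
    // Indices are copied unchanged; the base vertex does the rebasing.
    DrawRange append(const std::vector<Vertex>& meshVertices,
                     const std::vector<std::uint32_t>& meshIndices)
    {
        const DrawRange range{
            static_cast<std::uint32_t>(meshIndices.size()),
            static_cast<std::uint32_t>(indices.size()),
            static_cast<std::int32_t>(vertices.size())
        };
        vertices.insert(vertices.end(), meshVertices.begin(), meshVertices.end());
        indices.insert(indices.end(), meshIndices.begin(), meshIndices.end());
        return range;
    }
};
```

After uploading `vertices` and `indices` once, each object is drawn with its own range, changing only the per-object transform between the two draws.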
  12. Be careful about optimizing the DirectX bytecode. Optimizations to it usually do not translate well to the vendors' final uCode and may even give the driver compiler a harder time.
  13. What is numlevel in your case? I usually keep the quality level at zero (but sadly I don't use MSAA much at work and have not done MSAA in a long time). Hardware like GCN has features to do MSAA while dropping some fragments (like 4x but with memory for only 2 fragments), which needs a very custom compute resolve. This is on console; PC does not expose that, but the quality level may, and that would then make UAVs undefined if != 0.
  14. Zero in pNumQualityLevels, not the HRESULT
  15. DirectCompute thread groups

    Some GCN hardware has a halved wave spawn rate if you use the Z dimension; not sure if it is still true or not. GCN again: there is an input VGPR per dimension and no combined one, at least on PS4 (taken from a compute ISA: s14 = s_tgid_x, s15 = s_tgid_y, v0 = v_thread_id_x, v1 = v_thread_id_y). You could look at the ISA in Pix for AMD to confirm all that on PC.