Community Reputation

122 Neutral

About lubbe75

  1. DX12 DX12 and threading

    OK. So if I always end up in the wait section, it means the GPU is taking longer over its work than the CPU. Would I gain anything here by adding more allocators? I'm already at good speed, but I'm aiming for all the low-hanging fruit here. So, without doing multithreading, does it mean that I'm only using one GPU, even if the hardware has more than one? Does the DX12 driver ever utilise multiple GPUs without me telling it to do so?
  2. DX12 DX12 and threading

    After testing, I always seem to get into the wait section on every frame (gpuLag == 2 on every frame) except for the first frame. In your code, where does CurrentGPUFrame advance, other than in the wait section? Maybe I am just confused by the SharpDX equivalent of the code. I can't really find any SharpDX documentation on fencing, or even on some of the functions involved. Maybe someone with SharpDX experience can clarify? Here is my rendering code.

    Note:
    frameIndex is 0 or 1
    fence does not have a Wait function in SharpDX
    fenceEvent is an AutoResetEvent

        // populating & executing commandList, presenting
        ...
        CurrentCPUFrame++;
        commandQueue.Signal(fence, CurrentCPUFrame);
        int gpuLag = CurrentCPUFrame - CurrentGPUFrame;
        if (gpuLag >= 2)
        {
            fence.SetEventOnCompletion(CurrentGPUFrame + 1, fenceEvent.SafeWaitHandle.DangerousGetHandle());
            fenceEvent.WaitOne();
            CurrentGPUFrame++;
        }
        frameIndex = swapChain.CurrentBackBufferIndex;
        // resetting commandList with allocator[frameIndex]
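The counter behaviour described above can be checked with a small model. This is a sketch in plain Python, not SharpDX: `FakeFence` and `run` are hypothetical names, and the GPU is assumed to be idle so that every signal completes immediately. It suggests that, in the loop as posted, the wait branch is entered from the second frame onwards even when the GPU is keeping up, because `CurrentGPUFrame` only advances inside that branch. Refreshing the counter from the fence's completed value (the `CompletedValue` property used elsewhere in this thread) removes that effect in the model.

```python
class FakeFence:
    """Hypothetical stand-in for a D3D12 fence: holds the last value the GPU reached."""

    def __init__(self):
        self.completed_value = 0


def run(frames, refresh_from_fence):
    """Run the loop from the post; return how many frames enter the wait branch."""
    fence = FakeFence()
    cpu_frame = 0   # CurrentCPUFrame
    gpu_frame = 0   # CurrentGPUFrame
    waits = 0
    for _ in range(frames):
        cpu_frame += 1
        # Model assumption: the GPU is idle, so the signal completes at once.
        fence.completed_value = cpu_frame
        if refresh_from_fence:
            # Variant: ask the fence how far the GPU really is.
            gpu_frame = fence.completed_value
        gpu_lag = cpu_frame - gpu_frame
        if gpu_lag >= 2:
            waits += 1      # SetEventOnCompletion + WaitOne would go here
            gpu_frame += 1
    return waits


print(run(10, refresh_from_fence=False))  # counter only advances in the wait branch
print(run(10, refresh_from_fence=True))   # counter follows the fence
```

Entering the branch is therefore not, on its own, evidence that the GPU is behind; in a real application the time actually spent blocked in `WaitOne` would be the more telling measurement.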
  3. DX12 DX12 and threading

    Thanks for the good example MJP. Time to draw something really heavy for testing purposes
  4. DX12 DX12 and threading

    Thanks for that link, Infinisearch! MJP, I have tried what you suggested, but I got poorer results compared to the straightforward 1-allocator method. Here is what I tried: after initialising, setting frameIndex to 0 and resetting commandList with allocator 0, I run the following loop (pseudo-code):

        populate commandList;
        execute commandList;
        reset commandList (using allocator[frameIndex]);
        present the frame;
        frameIndex = swapChain.CurrentBackBufferIndex; // 0 -> 1, 1 -> 0
        if (frameIndex == 1)
        {
            // set the fence after frame 0, 2, 4, 6, 8, ...
            commandQueue.Signal(fence, fenceValue);
        }
        else
        {
            // wait for the fence after frame 1, 3, 5, 7, 9, ...
            int currentFence = fenceValue;
            fenceValue++;
            if (fence.CompletedValue < currentFence)
            {
                fence.SetEventOnCompletion(currentFence, fenceEvent.SafeWaitHandle.DangerousGetHandle());
                fenceEvent.WaitOne();
            }
        }

    Have I understood the idea correctly (I think I have)? Perhaps something here gets done in the wrong order?
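For comparison with the pseudo-code above, here is a sketch of a commonly used alternative ordering: signal the fence after every frame, remember the signalled value per allocator, and wait only just before an allocator is about to be reset. It is a plain Python model, not SharpDX. `FakeQueue`, `render` and `guard` are hypothetical names, `frame % allocators` is assumed to mirror `CurrentBackBufferIndex`, and the GPU is assumed to make progress only when the CPU waits on it (a worst-case, GPU-bound scenario).

```python
class FakeQueue:
    """Hypothetical model of a command queue plus fence; not a SharpDX type."""

    def __init__(self):
        self.signalled = 0   # last value passed to Signal
        self.completed = 0   # what fence.CompletedValue would report

    def signal(self):
        self.signalled += 1
        return self.signalled

    def wait_for(self, value):
        # Models SetEventOnCompletion + WaitOne: block until the GPU gets there.
        self.completed = max(self.completed, value)


def render(frames, allocators=2):
    """Return (number of waits, largest CPU lead over the GPU in frames)."""
    queue = FakeQueue()
    guard = [0] * allocators       # fence value protecting each allocator
    waits = 0
    max_lead = 0
    for frame in range(frames):
        slot = frame % allocators  # assumption: mirrors CurrentBackBufferIndex
        if queue.completed < guard[slot]:
            waits += 1
            queue.wait_for(guard[slot])
        # Safe point: the GPU has finished the last frame recorded from this allocator.
        assert queue.completed >= guard[slot]
        # ... reset allocator[slot], record, execute, present ...
        guard[slot] = queue.signal()   # signal after every frame, not every other one
        max_lead = max(max_lead, queue.signalled - queue.completed)
    return waits, max_lead


print(render(6, allocators=2))
print(render(6, allocators=1))
```

In this model the CPU never resets an allocator the GPU may still be reading, and with two allocators it is allowed to run up to two frames ahead instead of one.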
  5. DX12 DX12 and threading

    Thanks for the tips and the links! After reading a bit more I get the idea that threading is mainly for recording command lists. Is this correct? Would this also include executing command lists? Before adding threads, will I gain anything from using multiple command lists, command allocators or command queues? I have read somewhere that using multiple command allocators can increase performance, since I may not have to wait as often before recording the next frame. I guess it's a matter of experimenting with the number of allocators needed in my case. Would using multiple command lists or multiple command queues have the same effect as using multiple allocators, or will this only make sense with multi-threading? I'm currently at a stage where my DX9 renderer is about 20 times faster than my DX12 renderer, so I'm guessing it's mainly multi-threading that is missing. Do you know of any other obvious and common beginner mistakes when starting with DX12?
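The point about multiple allocators reducing waiting can be illustrated with a rough timing model. This is a sketch under stated assumptions, not a measurement: `total_time` is a hypothetical helper, each frame is assumed to cost a fixed `cpu_ms` to record and a fixed `gpu_ms` to execute, and an allocator may only be reused once the GPU has finished the frame that last used it.

```python
def total_time(frames, allocators, cpu_ms, gpu_ms):
    """Time until the GPU finishes the last frame in a simple two-stage pipeline model."""
    gpu_done = []     # completion time of each submitted frame
    cpu_free = 0.0    # when the CPU can start recording the next frame
    for i in range(frames):
        start = cpu_free
        if i >= allocators:
            # The allocator for this frame was last used by frame i - allocators;
            # it may only be reset once the GPU has finished that frame.
            start = max(start, gpu_done[i - allocators])
        submit = start + cpu_ms                       # recording takes cpu_ms
        gpu_start = max(submit, gpu_done[-1] if gpu_done else 0.0)
        gpu_done.append(gpu_start + gpu_ms)           # execution takes gpu_ms
        cpu_free = submit
    return gpu_done[-1]


for count in (1, 2, 3):
    print(count, total_time(100, count, cpu_ms=4, gpu_ms=6))
```

With 4 ms of recording and 6 ms of GPU work per frame, the model gives 1000 ms for 100 frames with one allocator and 604 ms with two, because recording and execution then overlap. A third allocator changes nothing here, since the GPU is already the bottleneck.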
  6. Being new to DirectX 12, I am looking for examples of how to use threading. I have done lots of OpenGL in the past and some DirectX, but with DX12 the threading magic is gone and I understand that threading is crucial to get good performance. In my project I currently have one thread doing it all. I have one command list, one command allocator, one bundle and one bundle allocator. I also have a million triangles, so it's about time that I start doing this. How do I split things up? How many threads should I use? How many command lists and allocators? I realize this is a beginner's question, but I have to begin somewhere. I would be grateful if someone could point me in a direction where I could find a simple code sample, tutorial or something similar. Thanks!
  7. Thank you! I will take a look right away.
  8. I am looking for some example projects and tutorials using SharpDX, in particular DX12 examples. I have only found a few, among them the port of Microsoft's D3D12 Hello World examples and Johan Falk's tutorials. For instance, I would like to see an example of how to use multisampling, and of debugging with SharpDX and DX12. Let me know if you have any useful examples. Thanks!
  9. I'm writing a 3D engine using SharpDX and DX12. It takes a handle to a System.Windows.Forms.Control to draw onto. This handle is used when creating the swapchain (it's set as the OutputHandle in the SwapChainDescription). After rendering I want to give up this control to another renderer (for instance a GDI renderer), so I dispose various objects, among them the swapchain. However, no other renderer seems to be able to draw on this control after my DX12 renderer has used it. I see no exceptions or strange behaviour when debugging the other renderers trying to draw, except that nothing gets drawn to the area. If I then switch back to my DX12 renderer it can still draw to the control, but no other renderer seems to be able to. If I don't use my DX12 renderer, then I am able to switch between other renderers with no problem. My DX12 renderer is clearly messing up something in the control somehow, but what could I be doing wrong with just SharpDX calls? I read a tip about not disposing when in fullscreen mode, but I don't use fullscreen so it can't be that. Anyway, my question is: how do I properly release this handle to my control so that others can draw to it later? Disposing things doesn't seem to be enough.
  10. So what you are saying is that I should not use RGBA16F when reading an HDR texture, but use the RGB16F format instead. Hmm... that makes sense of course. I just have to make sure then that the HDR (RGBE) values are correctly converted to RGB values when reading the texture from file. I'm using OpenSceneGraph and I suppose it is handled there. Thanks for your patience in explaining this again and again. Still, if anyone has an OpenGL example using HDR textures and tone mapping, please let me know. I'm sure there is plenty more to learn from it.
  11. Does anybody have a working example using HDR textures and tone mapping in OpenGL? Has anybody managed to add two HDR values together properly? I still get banding problems with the addition formula mentioned here. Still searching...
  12. Ok, this is what I am doing... I use Paul Debevec's light probe images (vertical cross, .hdr format) as cube map textures. The texture format for the intermediate surface is GL_RGBA16F_ARB. When I render to this intermediate surface I don't do any encoding/decoding; I just fetch values in the fragment shader by calling the built-in textureCube function. In the final pass I render my intermediate surface to screen, using a decoding function in the fragment shader:

        vec4 texture2DRGBE( sampler2D sampler, vec2 coords, vec4 mean )
        {
            vec4 rgbe = floor( texture2D( sampler, coords ) * 255. + 0.5 );
            float e = rgbe.a - ( 128. + 8. );
            vec4 result = vec4( rgbe.rgb * 0.5 * exp2( e ), 1.0 );
            return result;
        }

      This gives good results. Are you saying that I don't need to do any decoding here, and can just use texture2D straight away? Of course I could use some good tone mapping, but it still looks right when using the texture2DRGBE function.
  13. So, if I use the raw HDR colours (in my case RGBA16F) to which I have not done any decoding or anything, then the GPU understands that my vec4 is an encoded RGBE colour? Let's say I have two vectors:

        vec4 v1 = vec4(a1,b1,c1,d1);
        vec4 v2 = vec4(a2,b2,c2,d2);

      Now if I want to add these together I get (a1+a2, b1+b2, c1+c2, d1+d2). But this would not be correct if the vectors represent colours where the last component represents the exponent value (correct me if I am wrong here). Do you mean that the GPU can detect that I am dealing with RGBA16F vectors and do the summation correctly for me? The same goes for scaling a vector I suppose.
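The worry about adding exponent-encoded colours can be made concrete with a small worked example. This is a sketch in Python rather than GLSL, assuming the common Radiance-style convention value = mantissa * 2^(E - 136). The helper names `decode_rgbe`, `encode_rgbe` and `add_rgbe` are hypothetical, and the extra 0.5 factor from the shader in the thread is not reproduced. It applies only to byte-encoded RGBE data; an RGBA16F texture stores four independent half-float channels, so ordinary component-wise addition is the correct sum there.

```python
import math


def decode_rgbe(r, g, b, e):
    """Shared-exponent bytes -> linear floats (Radiance-style convention, assumed)."""
    if e == 0:
        return (0.0, 0.0, 0.0)
    scale = math.ldexp(1.0, e - (128 + 8))
    return (r * scale, g * scale, b * scale)


def encode_rgbe(r, g, b):
    """Linear floats -> shared-exponent bytes, using the largest channel's exponent."""
    largest = max(r, g, b)
    if largest < 1e-32:
        return (0, 0, 0, 0)
    mantissa, exponent = math.frexp(largest)
    scale = mantissa * 256.0 / largest
    return (int(r * scale), int(g * scale), int(b * scale), exponent + 128)


def add_rgbe(a, b):
    """Add two RGBE colours by decoding, summing as floats, and re-encoding."""
    fa, fb = decode_rgbe(*a), decode_rgbe(*b)
    return encode_rgbe(fa[0] + fb[0], fa[1] + fb[1], fa[2] + fb[2])


v1 = (128, 0, 0, 129)   # decodes to red = 1.0
v2 = (128, 0, 0, 130)   # decodes to red = 2.0
print(decode_rgbe(*add_rgbe(v1, v2)))           # (3.0, 0.0, 0.0)
naive = tuple(x + y for x, y in zip(v1, v2))    # (256, 0, 0, 259)
print(naive)
```

The naive component-wise sum overflows both the mantissa and the exponent byte, which is why encoded values have to be decoded before any arithmetic.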
  14. Hmm... I'm puzzled. Probably because I am all new to HDR rendering. I am using the GL_RGBA16F_ARB texture format for the intermediate texture (as in the stage before rendering to screen). When you say that I don't have to add any specific encoding/decoding math to my shaders, you mean before the final render-to-screen stage, right? I still have to provide my own decoding shader for rendering to screen, correct? And there are no built-in algorithms for adding, subtracting and scaling RGBA16F values, correct?
  15. MJP, I was not able to find the source code to Jack Hoxley's sample for some reason. Could you give me a pointer to the code itself? In my case I'm just a hobbyist and I don't need to target pre-DX10 hardware. So does it mean I should change my 16 bit float RGBE to something else? Right now it seems to fit nicely since I use images in HDR format.