
Community Reputation

1522 Excellent

About lipsryme

  • Rank
    Advanced Member
  1. I tried your suggested solution and it seems to work nicely, but the one thing I can't quite comprehend is how those mutex locks work. How can you be sure that the lock() call in RenderDone isn't made at the same time as the lock() in MainDone? And what if one thread locks the mutex and the other also attempts to lock it before it has been unlocked again — is that not an issue? Also, mainCount and renderCount are incremented but never reset. I understand this might still work even after an overflow, but it still seems kind of like undefined behavior... just making sure that's 'as designed'?
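For what it's worth, the guarantee being asked about here is exactly what std::mutex provides: if both threads call lock() at the same time, one of them simply blocks until the other calls unlock(), so the two critical sections can never overlap. A minimal sketch (the function and counter names are hypothetical stand-ins, not the code from this thread):

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Two threads each lock the same std::mutex before touching their counter.
// Whichever thread arrives second at lock() blocks until the first unlocks,
// so the increments can never race even when both call lock() "at once".
std::mutex gate;
long mainCount = 0;
long renderCount = 0;

void MainDone()   { std::lock_guard<std::mutex> lk(gate); ++mainCount; }
void RenderDone() { std::lock_guard<std::mutex> lk(gate); ++renderCount; }

long runBothThreads(int iterations)
{
    std::thread a([&] { for (int i = 0; i < iterations; ++i) MainDone(); });
    std::thread b([&] { for (int i = 0; i < iterations; ++i) RenderDone(); });
    a.join();
    b.join();
    return mainCount + renderCount;
}
```

If the mutex did not serialize the two lock() calls, some increments would be lost and the total would come up short.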
  2. "Is this deliberate? That's not likely to be the cause of your problem, but you do know that this is not the same as calling wait()? On the contrary, given a zero timeout you are never waiting; this is busy spinning."

I believe that is for avoiding the spurious-wakeup case. So if the two threads blocking each other is the issue (which I believe it is), how would I avoid them doing it? Some kind of restriction on who goes first? I thought the order of set and wait would be given by the mutex locks.

@Krypt0n I figured there needs to be a lock before setting and resetting the flag as well, however this implementation seemed to be the consensus of what you find on the internet for some reason... it still does not fix the original issue. Also, not sure if it helps you figure it out, but if I set a wait time in the condition to something like 20ms it runs, however slowly (obviously).

Update: Basically what I can tell you from debugging is that when I run it in debug and pause where it hangs, it stops here:

MAIN THREAD:

```cpp
this->renderer->RenderDoneCondition().Wait();
this->renderer->RenderCommandsReadyCondition().Set();
this->renderer->BufferSwappedCondition().Wait(); // Hangs here
```

RENDER THREAD:

```cpp
// Wait for [MAINTHREAD] to provide us with command list data
bufferSwappedCondition.Reset();
renderDoneCondition.Set();
renderCommandsReadyCondition.Wait(); // hangs here
renderDoneCondition.Reset();
renderCommandsReadyCondition.Reset();

// Swap buffer IDs
renderThreadID = !renderThreadID;

// Notify [MAINTHREAD] that the buffer was swapped, so we can generate new commands in parallel
bufferSwappedCondition.Set();
```

With the flag states being: renderCommandsReadyCondition = false, bufferSwappedCondition = false, renderDoneCondition = true.
  3. Well, basically all the main thread does with renderThreadID is use it as the subscript to access the correct one of the two buffers (i.e. the one that is not being worked on by the render thread), like so: commandList[renderThreadID][size]. I tried making it volatile as you suggested but that didn't help. Also, I believe I'm already handling the spurious-wakeup condition:

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

class Event
{
public:
    Event() : flag(false) {}

    void Set()
    {
        flag = true;
        condition.notify_all();
    }

    void Reset()
    {
        flag = false;
        condition.notify_all();
    }

    void Wait()
    {
        std::unique_lock<std::mutex> lk(m);
        while (!condition.wait_for(lk, std::chrono::milliseconds(0),
                                   [this]() { return flag; }))
        {
            if (flag) // avoid spurious wakeup
                break;
        }
    }

    void Wait(const long milliseconds)
    {
        std::unique_lock<std::mutex> lk(m);
        condition.wait_for(lk, std::chrono::milliseconds(milliseconds),
                           [this]() { return flag; });
    }

private:
    mutable std::mutex m;
    mutable std::condition_variable condition;
    bool flag;
};
```

The only synchronization between the threads is the wait/signalling. If that works correctly, there should be no need for a barrier around that renderThreadID variable, or should there? Or perhaps I need to rethink this more...
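As the replies above point out, the zero-timeout wait_for() loop never actually sleeps. A sketch of the non-spinning form of the same Event: wait() with a predicate blocks until notified and re-checks the flag on wakeup, which already covers spurious wakeups, and the flag is written under the mutex so a Set() cannot slip between the predicate check and the sleep. This is an illustrative rewrite, not the engine's code:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

class Event
{
public:
    void Set()
    {
        {
            std::lock_guard<std::mutex> lk(m);
            flag = true;               // write the flag under the mutex
        }
        condition.notify_all();
    }

    void Reset()
    {
        std::lock_guard<std::mutex> lk(m);
        flag = false;
    }

    void Wait()
    {
        std::unique_lock<std::mutex> lk(m);
        condition.wait(lk, [this] { return flag; }); // sleeps; no busy loop
    }

private:
    std::mutex m;
    std::condition_variable condition;
    bool flag = false;
};

bool setThenWait()
{
    Event e;
    std::thread t([&] { e.Set(); });
    e.Wait();   // returns once the flag is observed true, no matter who ran first
    t.join();
    return true;
}
```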
  4. EDIT: Sorry guys, having issues with the forum. Hey guys, I'm trying to implement double-buffered multithreading in my renderer, but have run into an issue with the synchronization. Here's the function on the [main thread] that generates the command list:

```cpp
void FERenderer::Render(ISwapChain *swapChain)
{
    // Set current swapChain to render to
    this->currentChain = swapChain;
    this->renderer->SetCurrentSwapChain(swapChain);

    // Create [RenderThread] here (will wait until [MainThread] is done generating command list)
#ifdef MULTITHREADED_RENDERER_ENABLED
    if (!this->renderThreadCreated && !this->contentManager->IsCompilingShader())
    {
        this->renderThread = std::thread(&IBackEndRenderer::ExecuteCommandList,
                                         this->renderer, this->commandList,
                                         std::ref(renderThreadID));
        this->renderThreadCreated = true;
        this->renderThread.detach();
    }
    this->CreateCommandList();
#else
    this->CreateCommandList();
    if (!this->contentManager->IsCompilingShader())
    {
        renderThreadID = 1; // workaround
        this->renderer->ExecuteCommandList(this->commandList, std::ref(renderThreadID));
        renderThreadID = 0;
    }
#endif

    this->profiler->FrameComplete();

#ifdef MULTITHREADED_RENDERER_ENABLED
    this->renderer->RenderDoneCondition().Wait();
    this->renderer->RenderCommandsReadyCondition().Set();
    this->renderer->BufferSwappedCondition().Wait(); // <- If removed, no deadlock!
#endif
}
```

And this is the function on the [render thread] that executes the command list:

```cpp
void LipsRenderD3D11::ExecuteCommandList(std::vector<unsigned char> *cmdList, bool &renderThreadID)
{
#ifdef MULTITHREADED_RENDERER_ENABLED
    while (true)
#endif
    {
#ifdef MULTITHREADED_RENDERER_ENABLED
        // Wait for [MAINTHREAD] to provide us with command list data
        bufferSwappedCondition.Reset();
        renderDoneCondition.Set();
        renderCommandsReadyCondition.Wait();
        renderDoneCondition.Reset();
        renderCommandsReadyCondition.Reset();

        // Swap buffer IDs
        renderThreadID = !renderThreadID;

        // Notify [MAINTHREAD] that the buffer was swapped, so we can generate new commands in parallel
        bufferSwappedCondition.Set();
#endif
        // Early out if forcing shutdown of renderThread
        if (this->forceShutdown)
            return;

        const size_t dataSize = cmdList[!renderThreadID].size();
        unsigned char *commandList = cmdList[!renderThreadID].data();
        unsigned int n = 0;
        SwapChainD3D11 *swapChain = (SwapChainD3D11*)this->currentChain;

        while (n < dataSize)
        {
            eRC e_RC = (eRC)ReadCommand<unsigned int>(commandList, n);
            switch (e_RC)
            {
                //...
            }
        }
    }
}
```

I tested this synchronization design in a console application and logically it should really work, however putting it in my renderer I'm getting a deadlock as soon as I run my engine. The deadlock doesn't happen when I remove the BufferSwappedCondition().Wait(), but then it crashes. It runs okay as far as I can tell if I make renderThreadID atomic and then have it do the swap (which creates a form of synchronization), but I'm not sure if that's a viable workaround. Do you guys have any idea why this leads to a deadlock?
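One way to avoid the Set/Wait ordering race entirely is the counter-based handshake hinted at in the suggested solution from reply 1: a single mutex, a single condition variable, and one frame counter per thread. Each thread bumps its own counter when its frame is done and waits until the other side has caught up, so no flag ever needs to be reset. This is a sketch with names of my own choosing, not the engine's actual code:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
unsigned long mainFrames = 0, renderFrames = 0;

// Main thread: "my frame N is done" -- wait until the render thread has
// also finished frame N. Counters only ever grow, so no reset races exist.
void MainFrameDone()
{
    std::unique_lock<std::mutex> lk(m);
    ++mainFrames;
    cv.notify_all();
    cv.wait(lk, [] { return renderFrames >= mainFrames; });
}

// Render thread: mirror image of the above.
void RenderFrameDone()
{
    std::unique_lock<std::mutex> lk(m);
    ++renderFrames;
    cv.notify_all();
    cv.wait(lk, [] { return mainFrames >= renderFrames; });
}

bool runFrames(int n)
{
    std::thread r([&] { for (int i = 0; i < n; ++i) RenderFrameDone(); });
    for (int i = 0; i < n; ++i) MainFrameDone();
    r.join();
    return mainFrames == (unsigned long)n && renderFrames == (unsigned long)n;
}
```

Because the counters are compared rather than set/reset, a wakeup can never be "consumed" twice, and the two threads stay within one frame of each other, which is exactly the double-buffer invariant.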
  5. What would be the advantage of running a post-FX pass for each light probe and blending them using alpha blending, instead of just sampling all probes in one pass and applying the weighting using constant-buffer variables? Wouldn't the latter be more efficient?
  6. lipsryme

    Glowing objects.

    What you are talking about is normally called an emissive material, because it "emits" light. You can also add a light source to it, but you probably don't need to. What you do in a deferred context (which you are using, if I read it correctly) is store a float variable as a kind of emissive intensity multiplier in your G-buffer, and during your lighting pass you multiply your result with this multiplier to increase its light output. To make it glow, however, you will have to do bloom as a post process. In short:

    1. Mask out the brightest parts of your image.
    2. Blur that result using a Gaussian filter.
    3. Add it back onto your original image.
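The three bloom steps above can be sketched on a 1-D strip of luminance values; a real implementation runs the same math per pixel on the GPU, and the threshold and blur kernel here are illustrative assumptions, not values from the post:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

std::vector<float> Bloom(const std::vector<float>& image, float threshold)
{
    // 1. Mask out the brightest parts of the image (bright pass).
    std::vector<float> bright(image.size());
    for (size_t i = 0; i < image.size(); ++i)
        bright[i] = std::max(0.0f, image[i] - threshold);

    // 2. Blur the bright pass with a tiny Gaussian kernel (clamped edges).
    const float kernel[3] = { 0.25f, 0.5f, 0.25f };
    std::vector<float> blurred(image.size(), 0.0f);
    for (size_t i = 0; i < image.size(); ++i)
        for (int k = -1; k <= 1; ++k)
        {
            size_t j = std::min(image.size() - 1,
                                (size_t)std::max<long>(0, (long)i + k));
            blurred[i] += kernel[k + 1] * bright[j];
        }

    // 3. Add the blurred glow back onto the original image.
    std::vector<float> result(image.size());
    for (size_t i = 0; i < image.size(); ++i)
        result[i] = image[i] + blurred[i];
    return result;
}
```

A single bright pixel ends up brighter itself and bleeds into its neighbours, which is the glow.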
  7. @frenetic I'm okay for now with manually defining the boundaries. @kalle_h So I'd have to do a final pass that divides by the accumulated alpha value? Additionally, I'm curious how to do the blend-weight calculation...
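What the "divide by the accumulated alpha" final pass works out to, sketched on scalars: each probe contributes color times weight into an accumulation target, and the last pass normalizes by the weight sum, so the weights need not sum to one. The names are illustrative:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

float BlendProbes(const std::vector<float>& probeColor,
                  const std::vector<float>& weight)
{
    float colorSum = 0.0f, weightSum = 0.0f;
    for (size_t i = 0; i < probeColor.size(); ++i)
    {
        colorSum  += probeColor[i] * weight[i]; // accumulated via additive blending
        weightSum += weight[i];                 // accumulated alpha
    }
    // Final pass: normalize by the accumulated alpha.
    return weightSum > 0.0f ? colorSum / weightSum : 0.0f;
}
```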
  8. Sebastien Lagarde briefly mentions a deferred blending method in his blog post here: https://seblagarde.wordpress.com/2012/09/29/image-based-lighting-approaches-and-parallax-corrected-cubemap/ But he doesn't go into detail and only states that it is "easy" to implement in a deferred context. Can anyone tell me how this works, exactly?
  9. Hey guys, I'm curious what you think about this problem. Let's say I want to blend light probes together in a deferred way. The way I read it, you find out which probes are visible on screen (basically frustum-cull them) and then blend K light probes together. Now I guess we obviously need to limit this to a fairly low number, say 4? How would I determine which 4 light probes these are? Compare the distance of the shaded pixel in world space to each light probe's world-space position? Having captured a cubemap for both specular and diffuse reflections, this already means we need to sample 8 cubemaps just to blend them. However, let's assume I want to capture cubemaps for 4 key daytime intervals (sunrise / noon / sunset / night); that's 4 variations of the same cubemap, so in total we would have to sample 2 * 4 * 4 = 32 cubemaps, only to blend 4 light probes together! I haven't actually tested real-world performance yet, but it seems a bit insane to me in theory. That's also 32 shader resource slots I'd have to fill... Is there a better / faster way to do this?
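The "which 4 probes" question from above, sketched in the most straightforward way: sort candidate probes by squared distance to the shaded position and keep the closest K. The Probe type and names are made up for illustration; a real engine would do this per tile or per cluster rather than per pixel:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Probe { float x, y, z; int id; };

std::vector<int> ClosestProbes(const std::vector<Probe>& probes,
                               float px, float py, float pz, size_t k)
{
    std::vector<Probe> sorted = probes;
    // Squared distance is enough for ordering; no sqrt needed.
    std::sort(sorted.begin(), sorted.end(),
              [&](const Probe& a, const Probe& b)
              {
                  float da = (a.x - px) * (a.x - px) + (a.y - py) * (a.y - py)
                           + (a.z - pz) * (a.z - pz);
                  float db = (b.x - px) * (b.x - px) + (b.y - py) * (b.y - py)
                           + (b.z - pz) * (b.z - pz);
                  return da < db;
              });
    std::vector<int> ids;
    for (size_t i = 0; i < sorted.size() && i < k; ++i)
        ids.push_back(sorted[i].id);
    return ids;
}
```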
  10. OK, somehow this is still not working... Let's see. I start with a 32-bit float variable that I have scaled to be in [0.0-1.0]. Now if I wanted to set the MSB, I would shift a 1 into it. However, to do the shift I have to convert this float variable to uint in HLSL. So I'm guessing that doing something like:

```hlsl
float fVariable = 255.0f;
float x = float(uint(fVariable));
```

...would not guarantee me the same 255.0f in the end; the 255.0f would convert to some 32-bit unsigned integer value, e.g. 1337, and that value would then become 1337.0f afterwards. So the question is: how can I set the MSB of my float value and store that into the 8-bit UNORM channel without throwing away the actual float value (0-255, or 0-127 in that case)? This gets even trickier when trying to retrieve the value.

Update: Let's try a simple case:

```hlsl
output.Target3.b = 0.0f;
output.Target3.b = float(uint(output.Target3.b * 255.0f) | 0x000000FE) / 255.0f;
```

Converting the actual float value to uint only works if I multiply by 255, because the conversion just takes the value and cuts off the decimals, so the [0-1] value needs to be [0-255] first. With that I have my 8-bit value (actually 32-bit) that I can OR with a mask to set the 8th bit... then divide by 255 to get the normalized float value [0-1] back for storage. So far so good. Assuming the GPU stores a 32-bit value that is < 255 without rearranging anything, we would have values of [0-1] after retrieving it, with the MSB marked as 1. We should be able to retrieve the flag like so:

```hlsl
bool flag = uint(Target3.b * 255.0f) & 0x000000FE;
```

And this seems to work so far. However, retrieving the value does not :(

```hlsl
float test = float(uint(Target3.b * 255.0f) & 0x00000080) / 255.0f; // doesn't work
```

I multiply this [0-1] value by 255 and convert to uint to get a value in [0-255], then mask this with 0x80 against the MSB of my (expected) 8-bit value, and in the end convert this to float again. The divide is just for outputting it on screen for debug purposes. I can't think of a reason why this wouldn't work, except that the GPU doesn't store values [0.0-255.0] as a 1:1 mapping to [0.0-1.0].

Update 2: I figured it out! After going through all of the values in my head / on paper and debugging them via RenderDoc... this is the final working code:

Encode:

```hlsl
output.Target3.b = TranslucencyPower;
output.Target3.a = TranslucencyAmbient;
#ifdef IsSkinShading
output.Target3.b = float(uint(output.Target3.b) | 0x00000080) / 255.0f; // Set MSB to 1
output.Target3.a = float(uint(output.Target3.a) & 0x0000007F) / 255.0f; // Set MSB to 0
#endif
```

Decode:

```hlsl
bool materialID_Flag1 = uint(Target3.b * 255.0f) & 0x00000080;
bool materialID_Flag2 = uint(Target3.a * 255.0f) & 0x00000080;
float TranslucencyPower   = (uint(Target3.b * 255.0f) & 0x0000007F) / 255.0f;
float TranslucencyAmbient = (uint(Target3.a * 255.0f) & 0x0000007F) / 255.0f;
```
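The final encode/decode above can be simulated on the CPU to see why it works: an 8-bit UNORM channel effectively stores round(x * 255) / 255, so as long as the float is treated strictly as an integer in [0, 255], setting bit 7 on write and masking it off on read survives the round trip. Sketch only; the helper names are mine:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Writing a [0,255] integer into an 8-bit UNORM channel...
float unormStore(uint32_t v)       { return v / 255.0f; }
// ...and reading it back: the hardware quantizes to round(f * 255).
uint32_t unormLoad(float f)        { return (uint32_t)std::round(f * 255.0f); }

float encodeWithFlag(uint32_t value7bit, bool flag)
{
    uint32_t v = flag ? (value7bit | 0x80u)   // set MSB to 1
                      : (value7bit & 0x7Fu);  // set MSB to 0
    return unormStore(v);
}

bool decodeFlag(float stored)      { return (unormLoad(stored) & 0x80u) != 0; }
uint32_t decodeValue(float stored) { return unormLoad(stored) & 0x7Fu; }
```

The round() on load is what makes the quantization harmless: 228/255 stored as a float multiplies back to something a hair off 228.0, and rounding snaps it to exactly 228 before the bit masking.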
  11. Hey guys, I'm currently having problems with the following: I have a float variable that I use to store 8-bit data [0-255] in a color channel of an R8G8B8A8 render target. What I want to do is take the MSB of that value and use it as a boolean flag, so basically reduce my value range to [0-127]. However, something doesn't seem to work quite right. I'm thinking it has to do with the floating-point value being stored / converted differently?

Here's my HLSL code right now:

G-buffer:

```hlsl
output.Target3.b = TranslucencyPower * 0.01f;
uint id = 0;
#ifdef IsSkinShading
id = 1;
#endif
output.Target3.b = asuint(output.Target3.b) | (id << 7); // Store materialID data in MSB
```

Shading:

```hlsl
bool isSkinShading = (asuint(Target3.b) >> 7) & 0x0FFFFFFF;
float TranslucencyPower = (asuint(Target3.b) & 0x0FFFFFFF) * 100.0f;
```

The boolean flag works perfectly like this, but the actual variable (the one that stores TranslucencyPower) does not. Any idea how I can retrieve this value correctly?

UPDATE: I figured it out! First of all, you need to preserve the float value by using a regular uint cast instead of asuint(). The next important thing is that you have to convert to and from float again when storing in the 8-bit UNORM render target. Here's the working code:

```hlsl
output.Target3.b = float((uint(TranslucencyPower) | (id << 7)) / 255.0f); // Store materialID data in MSB
float TranslucencyPower = float(uint(Target3.b * 255.0f) & 0x0FFFFFFF);
```
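The distinction the UPDATE hinges on, shown on the CPU: HLSL's asuint() reinterprets the raw IEEE-754 bits of the float (like a memcpy between float and uint), while a uint() cast converts the numeric value with truncation. ORing a flag into the raw bit pattern scrambles the number; ORing it into the converted integer does not. The function names below are my own stand-ins for the two HLSL behaviours:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// HLSL asuint(): bit reinterpretation -- same 32 bits, new type.
uint32_t asuint_cpu(float f)
{
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// HLSL uint(): value conversion -- 100.0f becomes the integer 100.
uint32_t uint_cast(float f)
{
    return (uint32_t)f;
}
```

So `asuint(100.0f)` is the IEEE-754 encoding 0x42C80000, nowhere near 100; flipping its low bits changes the float's value unpredictably, which is exactly why TranslucencyPower came back wrong before the fix.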
  12. Thanks for the answers, that made it clear :)
  13. I was reading about data-driven renderer architecture, and it was stated that you'd submit a "drawcall" that has all its necessary state self-contained. But they also stated that you'd reset all states back to null afterwards so that no conflicts remain. I'm currently wondering whether that makes sense from a performance point of view, since flushing everything only to re-set it again for the next drawcall seems like a lot of wasted state switches. Right now I'm leaning more towards leaving the states as they are after the drawcall, which might be more error-prone but has no wasted state switching. However, I did have issues with state remnants before, and it can be a real nuisance to debug sometimes... Any suggestions?
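A common middle ground between "reset everything" and "leave everything" is a shadow state cache: keep a copy of the last state you actually bound and skip the API call when the newly requested state matches, so draw calls stay fully specified without paying for redundant switches. A minimal sketch with a made-up integer standing in for a real state object:

```cpp
#include <cassert>
#include <cstdint>

struct StateCache
{
    uint32_t current = 0;   // last state actually bound on the device
    int apiCalls = 0;       // how many real state switches happened

    void SetState(uint32_t s)
    {
        if (s == current)   // redundant: the device already has this state
            return;
        current = s;        // stand-in for the real SetBlendState/etc. call
        ++apiCalls;
    }
};
```

Each drawcall can then submit its full state every time (keeping the self-contained property), while the cache filters out the switches that would have been wasted.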
  14. Is it possible? I remember reading about it in Brian Karis's Unreal Engine 4 temporal AA presentation, but I'm not having success implementing it in my renderer. Setting the tonemap / inverse-tonemap thing aside, I can't get my post effects to work without jittering even after the temporal resolve, since they depend on the depth buffer, which I guess is impossible to resolve? There's a picture in his presentation that depicts it as Pre-DOF-Setup -> Reproject -> Other PostFX. I'm not sure what kind of secret sauce that pre-setup contains, but just doing the CoC generation before the resolve and the blur afterwards still doesn't work for me. Would I have to temporally resolve the CoC texture as well, i.e. store two versions of it, alternate, and blend? I'm using SMAA T2x, which jitters over 2 frames during the geometry passes and then combines them using velocity. It just feels so wasteful to perform all my post-processing on the aliased image and then make the end result smooth again... Has anyone done this successfully already, or knows how they do it in UE4?
  15. Not sure if there's some fancy solution for this, but the easiest that comes to mind is to provide a texture the same size as the image you want to blur that acts as a mask (completely black, with the parts you want blurred in white). Then in your blur shader you can use a simple if/else branch to either blur or not blur each pixel.
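A branch-free variant of the mask idea above: blur the whole image once, then lerp between the original and the blurred version by the mask value per pixel, which also gives partial blur for free wherever the mask is grey. A 1-D sketch with illustrative names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<float> MaskedBlur(const std::vector<float>& original,
                              const std::vector<float>& blurred,
                              const std::vector<float>& mask) // 0 = sharp, 1 = blurred
{
    std::vector<float> out(original.size());
    for (size_t i = 0; i < original.size(); ++i)
        out[i] = original[i] * (1.0f - mask[i]) + blurred[i] * mask[i]; // lerp
    return out;
}
```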