
acerskyline

Member
  • Content Count

    24
  • Joined

  • Last visited

Community Reputation

2 Neutral

2 Followers

About acerskyline

  • Rank
    Member

Personal Information

  • Interests
    Programming

Social

  • Github
    https://github.com/ACskyline

Recent Profile Visitors

674 profile views
  1. acerskyline

    GPU memory allocator

    Thanks for your answers!
  2. Why do we need a GPU memory allocator? One of the most important reasons for using a CPU memory allocator is to minimize memory fragmentation and increase the cache hit ratio. These goals are achieved by allocating one contiguous block of memory and provisioning that block to the application. Now, let's assume I do not need dynamic allocation: do I still need a CPU memory allocator? IMO the only potentially plausible advantage is the cache hit ratio. As for GPU memory, most of my little demos do not need dynamic buffer or texture allocation (load & unload). Everything is loaded at initialization and never needs unloading until the application exits. Is there any reason to use a memory allocator in this situation?
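For the load-once case described in this post, the simplest allocator is a linear (bump) allocator that only hands out aligned offsets into one large block. The following is a minimal sketch under that assumption; `LinearAllocator` and its member names are hypothetical, and creating the actual GPU heap and placing resources at the returned offsets is left to the graphics API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bump (linear) allocator over one large block of GPU memory.
// It only computes byte offsets; it does not touch any graphics API.
class LinearAllocator {
public:
    explicit LinearAllocator(std::uint64_t capacity) : capacity_(capacity) {}

    // Reserves `size` bytes aligned to `alignment` (must be a power of two).
    // On success writes the offset to `offset` and returns true;
    // returns false if the block does not have enough room left.
    bool Allocate(std::uint64_t size, std::uint64_t alignment, std::uint64_t& offset) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        // Round the current head up to the next multiple of `alignment`.
        const std::uint64_t aligned = (head_ + alignment - 1) & ~(alignment - 1);
        if (aligned > capacity_ || size > capacity_ - aligned) {
            return false;
        }
        offset = aligned;
        head_ = aligned + size;
        return true;
    }

    // Releases everything at once, e.g. at application shutdown.
    void Reset() { head_ = 0; }

    std::uint64_t Used() const { return head_; }

private:
    std::uint64_t capacity_;
    std::uint64_t head_ = 0;
};
```

With no unloading there is nothing to fragment, so the allocator needs no free list: every resource loaded at initialization gets the next aligned offset, and `Reset` is the only way to free memory.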
  3. acerskyline

    D3D12 Fence and Present

    Thank you so much for answering all my questions! All your answers are very helpful. I learned a lot. Thanks again!
  4. acerskyline

    D3D12 Fence and Present

    Oh wait, I think I found a possible reason. Maybe it's because the copy operation in the blt model has not finished and is still holding the front buffer. There ARE 3 buffers (1 front, 2 back), but the "display buffer" is currently using one of them (the front buffer) to copy from, so the GPU command list is blocked until the copy operation finishes. Is this valid?
  5. acerskyline

    D3D12 Fence and Present

    Question 1: does this mean the present-to-render-target barrier is unnecessary, since the entire command list is stalled by some magic in the driver(?), as opposed to the command list executing but getting blocked at the barrier? A separate question: according to the Microsoft DX12 page, the buffer count parameter of DXGI_SWAP_CHAIN_DESC is: So, question 2: in the above example, isn't the actual buffer count 4 (the number you created the swap chain with)? 1 of them is the front buffer and 3 of them are back buffers. Only this way can it support the point that Because if the top "colored block" is not part of the swap chain (meaning you created the swap chain with buffer count 3), why is the GPU blocked by it?
  6. acerskyline

    D3D12 Fence and Present

    Based on your reply, I changed the original Intel diagram a little bit just to make sure I understand what you mean. The first diagram is the original one. The second diagram is what I made. The third one has some marks so that you know what I'm talking about. In the third diagram, the red rectangle indicates what I changed: I made the GPU work last longer. That caused some other changes to the pipeline. Indicated by the yellow rectangle, I presume this is what you mean by . The GPU work lasts longer for that frame. Consequently, the "present queue" has to wait for the GPU to finish this frame. Also, by I think you are saying that, since the "present queue" will wait for the GPU work to finish, we might as well think of the frame as not being put in the "present queue" until the GPU finishes its work for that frame.

    1. My first question is: which way better visualizes what happens at the hardware level? (Conceptually they make no difference. It only changes where the start of a "colored block" in the "present queue" is, and the start does not matter as much as the end.)
    2. My second question is: within the green rectangle, the (light blue) CPU thread is blocked by a fence (dark blue) and then blocked by Present (purple), am I right?
    3. My third question is: within the blue rectangle, the brown "GPU thread" (command queue) is blocked by a present-to-render-target barrier, am I right?
  7. acerskyline

    D3D12 Fence and Present

    Yeah! I totally agree. I am waiting for this. So, if Present is a queued operation, why does this diagram indicate that the CPU thread generates two "colored blocks", one on the GPU queue and one on the "present queue", with a timeline that makes it look like the Present is ahead of the actual rendering? Does that make sense?
  8. acerskyline

    D3D12 Fence and Present

    Sorry, I should have given a little explanation. What I'm asking is what will happen if the GPU hasn't finished rendering the frame but the Present is being "executed" to display it. The reason I didn't draw the rest of the pipeline is not that it's drained; I just don't think it's necessary to show the rest, since it's irrelevant and it's also a lot of work to type it out ; ).
  9. acerskyline

    D3D12 Fence and Present

    Continuing my previous example, please bear with me. Now, assume one of the previous 3 frames is done (really done, as in on-screen), and the GPU workload for the current frame is very heavy.

    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    0. We have already completed steps 1 to 8 three times (i = 1, 2, 3; now i = 1 again).
    1. WaitForSingleObject(i)
    2. Barrier(i) present->render target   <---------------- "GPU thread" (command queue) was here
    3. Record commands...
    4. Barrier(i) render target->present
    5. ExecuteCommandList
    6. Signal
    7. Present                             <---------------- CPU thread was here
    8. Go to step 1
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    cpu    ... present|
    gpu    ... barrier|----------------heavy work----------------|
    ------------------------------------------------------------
           | 3 | 1 |
           | 2 | 3 | 1 |
           | 1 | 2 | 3 | 1 |
    ------------------------------------------------------------
    screen | 3 | 1 | 2 | 3 | ? |
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

    My question is: what should the question mark be in the diagram above? Or can this even happen? Thanks!
  10. acerskyline

    D3D12 Fence and Present

    Doesn't calling WaitForSingleObject on a fence block the CPU thread? Also, I am wondering whether Present blocks the GPU thread. Assume I have called Present 3 times very quickly. Before calling Present a 4th time, I call ExecuteCommandList, then Signal, and then Present. So it looks like this:

    0. We have already completed steps 1 to 8 three times (i = 1, 2, 3; now i = 1 again).
    1. WaitForSingleObject(i)
    2. Barrier(i) present->render target
    3. Record commands...
    4. Barrier(i) render target->present
    5. ExecuteCommandList
    6. Signal
    7. Present
    8. Go to step 1

    Under these circumstances, please answer the following questions:

    A. Step 1 may block the CPU thread if the previous work for frame 1 has not finished on the GPU. Am I right?
    B. Assume the previous work for frame 1 has finished on the GPU. Step 7 may block the CPU thread if none of the previous 3 frames is done (really done, as in on-screen). Am I right?
    C. If the answer to B is yes, then the CPU thread will be blocked at step 7, but the command list has already been submitted. What will happen to the GPU thread? Will it be blocked? If yes, by what? (I suspect by the barrier recorded in step 2, the present->render target barrier.) If no, where will the GPU render to when none of the previous 3 frames is done (really done, as in on-screen)?
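Question A in the loop above can be explored with a small simulation. The sketch below is not D3D12 code: `MockQueue`, `RetireOne` and `RunFrames` are hypothetical names, and the "GPU" is assumed to be the worst case that only finishes a frame when the CPU is forced to wait for it. Present and vsync (step 7) are deliberately not modelled, so this illustrates only the fence wait in step 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Mock of a command queue plus fence; NOT the D3D12 API.
struct MockQueue {
    std::deque<std::uint64_t> pendingSignals;  // fence values queued behind GPU work
    std::uint64_t completed = 0;               // what the fence would currently report

    // Step 6: queue a fence signal behind the work submitted so far.
    void Signal(std::uint64_t value) { pendingSignals.push_back(value); }

    // Pretend the GPU finished the oldest frame that was submitted.
    void RetireOne() {
        if (!pendingSignals.empty()) {
            completed = pendingSignals.front();
            pendingSignals.pop_front();
        }
    }
};

// Runs `frames` iterations of the loop with `slots` frame slots and
// returns how many times the CPU had to stall at step 1.
int RunFrames(int frames, int slots) {
    MockQueue queue;
    std::vector<std::uint64_t> slotFence(static_cast<std::size_t>(slots), 0);
    std::uint64_t nextValue = 1;
    int stalls = 0;

    for (int f = 0; f < frames; ++f) {
        const std::size_t i = static_cast<std::size_t>(f % slots);

        // Step 1: wait until the work last submitted for slot i is done.
        if (queue.completed < slotFence[i]) {
            ++stalls;
            while (queue.completed < slotFence[i]) {
                queue.RetireOne();
            }
        }

        // Steps 2-5: record and execute commands (not modelled).

        // Step 6: signal a new, larger fence value for this slot.
        slotFence[i] = nextValue++;
        queue.Signal(slotFence[i]);

        // Step 7: Present (not modelled).
    }
    return stalls;
}
```

In this model the CPU never stalls during the first `slots` frames and stalls on every frame after that, which matches the intuition that the fence wait lets the CPU run at most `slots` frames ahead of the GPU.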
  11. I have been trying to figure out how fences and Present synchronize the pipeline when using vsync. I have read:

      1. https://computergraphics.stackexchange.com/questions/2166/how-does-vsync-affect-fps-exactly-when-not-at-full-vsync-fps
      2. https://www.gamedev.net/forums/topic/677527-dx12-fences-and-swap-chain-present/
      3. https://www.gamedev.net/forums/topic/679050-how-come-changing-dxgi-swap-chain-descbuffercount-has-no-effect/
      4. https://software.intel.com/en-us/articles/sample-application-for-direct3d-12-flip-model-swap-chains
      5. https://docs.microsoft.com/en-us/windows/desktop/api/dxgi/nf-dxgi-idxgiswapchain-present

      But I'm still a little confused. My main question is: assuming we are using triple buffering, will Present block the CPU thread? If yes, when? I made this picture; please tell me which combination is the correct situation for the next frame. In my opinion it should be B, E, H. But if it really is B, E, H, it doesn't conform to what link #4 suggests in its classic mode section. As a matter of fact, I don't even understand how the GPU thread could be 2 vsyncs later than the CPU thread in the first place in that situation. Also, if it really is B, E, H, it doesn't conform to what Nathan Reed suggested in link #1. In his example, the CPU thread does not seem to be throttled by Present or vsync at all: the CPU thread starts working right after the GPU finishes its work.
  12. I was signaling before Present. Once I moved the Signal after Present, the error was gone. The way I understand it is this:

      1. Present submits work to the queue.
      2. When a Signal is triggered, we can only be sure that the work submitted before that Signal is done.

      Therefore, when we insert the Signal before Present, we can't guarantee that all the work on the queue is done even when the Signal has been triggered, because the work submitted by Present may still be on the queue waiting for execution. Now I have the same question as John321: Signal before Present, or Signal after Present? Which is the better approach? Pros and cons? The official example always seems to put the Signal before Present, so I guess we should stick to this rule and, when it comes to releasing command queues, always do an extra Signal to avoid "command queues being final-released while still in use by the GPU"?
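The ordering argument in this post can be checked with a toy in-order queue. This is a sketch under the post's own assumption that Present submits work to the same queue; `MockQueue`, `Op` and `PendingAfterWait` are hypothetical names and none of this is the real D3D12 API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>

// One queued operation. signalValue == 0 means ordinary work;
// any other value means "set the fence to this value".
struct Op {
    std::string name;
    std::uint64_t signalValue;
};

// Mock in-order queue with a single fence; NOT the D3D12 API.
struct MockQueue {
    std::deque<Op> pending;
    std::uint64_t completed = 0;

    void Submit(const std::string& name) { pending.push_back({name, 0}); }
    void Signal(std::uint64_t value) { pending.push_back({"signal", value}); }

    // Executes queued operations in order until the fence reaches `value`.
    void WaitFor(std::uint64_t value) {
        while (completed < value && !pending.empty()) {
            if (pending.front().signalValue != 0) {
                completed = pending.front().signalValue;
            }
            pending.pop_front();
        }
    }
};

// Submits one frame, waits on its fence value, and returns how many
// operations are still queued once the wait returns.
std::size_t PendingAfterWait(bool signalAfterPresent) {
    MockQueue queue;
    queue.Submit("draw");
    if (!signalAfterPresent) queue.Signal(1);
    queue.Submit("present");  // assumption from the post: Present queues work
    if (signalAfterPresent) queue.Signal(1);
    queue.WaitFor(1);
    return queue.pending.size();
}
```

With the Signal placed before Present, the wait returns while the Present work is still queued, which is the state in which releasing the queue would be unsafe. An extra Signal issued after the last Present, followed by a wait on that new value, drains the queue in this model.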
  13. Yes, I agree with this. But why do I have to signal again? At the end of every frame I put a commandQueue->Signal right after commandQueue->ExecuteCommandLists. This means that whenever I submit work to the GPU, there will always be a Signal at the end of it. So why do I have to signal again when I want to clean up? Why can't I just wait for the Signal I put after ExecuteCommandLists? Is it caused by the extra system call thing mentioned on this page? My code is here. Uncomment this in main.cpp to enable the debug layer:

      // debug layer
      if (!EnableDebugLayer())
      {
          MessageBox(0, L"Failed to enable debug layer", L"Error", MB_OK);
          return 1;
      }

      Uncomment the above two code snippets to see this error.
  14. I encountered this problem when releasing D3D12 resources. I didn't use ComPtr, so I have to release everything manually. After enabling the debug layer, I saw this error:

      D3D12 ERROR: ID3D12CommandQueue::<final-release>: A Command Queue (0x000002EBE2F3C7A0:'Unnamed ID3D12CommandQueue Object') is being final-released while still in use by the GPU. This is invalid and can lead to application instability. [ EXECUTION ERROR #921: OBJECT_DELETED_WHILE_STILL_IN_USE]

      I wanted to figure out which queue it was, so I enabled the debug device and used ReportLiveDeviceObjects to try to identify the queue, but it showed the same error. All my queues had names, and ReportLiveDeviceObjects worked on other resources. After googling around, I found this page. It described a similar problem, which seemed to have something to do with finishing the unfinished frames. Before I made any changes, my cleanup code looked like this:

      void Cleanup()
      {
          // wait for the gpu to finish all frames
          for (int i = 0; i < FrameBufferCount; ++i)
          {
              frameIndex = i;
              //fenceValue[i]++;                                //------> FIRST COMMENTED CODE SNIPPET
              //commandQueue->Signal(fence[i], fenceValue[i]);  //------> SECOND COMMENTED CODE SNIPPET
              WaitForPreviousFrame(i);
          }

          //////////////////////////////////////////////////////////////////
          // FROM HERE ON, THE CODE HAS NOTHING TO DO WITH THE QUESTION;
          // IT IS HERE FOR THE SAKE OF COMPLETENESS
          //////////////////////////////////////////////////////////////////

          // close the fence event
          CloseHandle(fenceEvent);

          // release gpu resources in the scene
          mRenderer.Release();
          mScene.Release();

          // imgui stuff
          ImGui_ImplDX12_Shutdown();
          ImGui_ImplWin32_Shutdown();
          ImGui::DestroyContext();
          SAFE_RELEASE(g_pd3dSrvDescHeap);

          // direct input stuff
          DIKeyboard->Unacquire();
          DIMouse->Unacquire();
          DirectInput->Release();

          // other stuff
          ...
      }

      Obviously it didn't work; that's why I googled the problem. After seeing what was proposed on that page, I added the first and the second commented code snippets. The problem was immediately solved, and the error above disappeared completely. Then I tried commenting the first snippet out and using only the second. It also worked. So I am wondering what happened.

      1. Why do I have to signal it? Isn't WaitForPreviousFrame enough?
      2. What does the answer on that page mean? What does the operating system have to do with this?

      For your information, the WaitForPreviousFrame function looks like this:

      void WaitForPreviousFrame(int frameIndexOverride = -1)
      {
          HRESULT hr;

          // swap the current rtv buffer index so we draw on the correct buffer
          frameIndex = frameIndexOverride < 0 ? swapChain->GetCurrentBackBufferIndex() : frameIndexOverride;

          // if the current fence value is still less than "fenceValue", then we know the GPU has not finished executing
          // the command queue since it has not reached the "commandQueue->Signal(fence, fenceValue)" command
          if (fence[frameIndex]->GetCompletedValue() < fenceValue[frameIndex])
          {
              // we have the fence create an event which is signaled once the fence's current value is "fenceValue"
              hr = fence[frameIndex]->SetEventOnCompletion(fenceValue[frameIndex], fenceEvent);
              if (FAILED(hr))
              {
                  Running = false;
              }

              // We will wait until the fence has triggered the event that its current value has reached "fenceValue". Once its value
              // has reached "fenceValue", we know the command queue has finished executing
              WaitForSingleObject(fenceEvent, INFINITE);
          }

          // increment fenceValue for next frame
          fenceValue[frameIndex]++;
      }
  15. acerskyline

    Mipmap in Procedural Virtual Texture

    Is this why there is always a low-res pre-pass that renders the whole screen, followed by a compute pass that decides which ones are needed? For example, the image in this slide.