Present method is taking far too long...

Started by
9 comments, last by #Include Graphics 7 years, 1 month ago

Hi everyone!

Just wanted to check whether these timings are normal or not.

Currently, the time spent in my render loop is split among these stages (high-level view):

- World ( 0.000482142845 s )

- Shadows ( 0.000348214293 s )

- Post processing ( 0.000352678559 s )

- UI ( 0.000214285712 s )

- Present ( 0.0351071432 s )

As we can see, the Present method, which is a direct call to the DirectX API in the form d3dDevice->Present( NULL, NULL, NULL, NULL ), is taking a huge amount of time compared with every other stage.

- Vsync is currently off.

- SwapEffect = D3DSWAPEFFECT_FLIP or D3DSWAPEFFECT_DISCARD

So my question now is: is that normal? Am I doing something wrong? Is there a common way to prepare everything so that Present returns faster?

Could someone please enlighten me? :D

Please share ideas. Maybe I am wrong, but intuitively I think there is something more behind this.

Thanks so much in advance.


The simple way to think of it is that every D3D draw/set function is actually recording those commands into a "command buffer". When you call Present, D3D sends the current command buffer to the GPU, and also waits if the GPU hasn't yet finished working on the previous command buffer.

So in your case, I would guess that your GPU is taking around 37 milliseconds to execute the commands that you're giving it each frame... whereas your CPU is only taking ~2 milliseconds to prepare those commands, which means that the CPU has to wait ~35 milliseconds every frame to avoid leaving the GPU further and further behind, with an endless list of commands building up.


One more thing:

The shadow generation time is rather small, so I suspect you are measuring CPU time.

For CPU time, that is OK.

But you probably want to measure the real execution time on the GPU (how long the GPU takes to render the shadows, not how long the CPU takes to build the command list).

If so, you can do it with NSight, GPUView, or the ATI/Intel graphics profilers, or write your own

(in DX11 there is ID3D11Query: https://msdn.microsoft.com/en-us/library/windows/desktop/ff476578(v=vs.85).aspx )

Yes, thanks to Hodgman's explanation I realized that I was measuring CPU time.

Now I need to know how to make the CPU stall no longer than necessary in Present, so that it can process the next frame until it reaches the next Present, and only then stall.

I use BackBufferCount = 2 (I think that is triple buffering, isn't it?)

Am I wrong in supposing this will do the trick? Or do I have to use some special flags in the behaviourFlags? I recently saw D3DPRESENT_DONOTWAIT, but it seems to be quite tricky and buggy.

"Now I need to know how to make the CPU stall no longer than necessary in Present, so that it can process the next frame until it reaches the next Present, and only then stall."

Unfortunately, I am not familiar with DX9, and for now I have a single thread of execution that runs this sequence:

1. Handle input.

2. Update state.

3. Fill the structure required for complete frame rendering.

4. Render the entire frame with data from this structure.

As you can see, 3 <-> 4 is a good sync point: the renderer may live in a different thread.
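The four steps above can be sketched roughly like this. Every type and function name here is hypothetical and the "rendering" just counts draw calls. The point is only that step 4 reads nothing except the structure filled in step 3, which is what would make it safe to hand that structure to a render thread.

```cpp
#include <cstddef>
#include <vector>

struct InputState { float moveX = 0.0f; };
struct GameState { float playerX = 0.0f; };
struct DrawItem { int meshId; float x; };
struct FrameData { std::vector<DrawItem> items; };

// 1. Handle input (here: just a value passed in by the caller).
InputState handleInput(float moveX) {
    InputState input;
    input.moveX = moveX;
    return input;
}

// 2. Update state.
void updateState(GameState& state, const InputState& input, float dt) {
    state.playerX += input.moveX * dt;
}

// 3. Copy everything the renderer needs into a self-contained structure.
FrameData buildFrameData(const GameState& state) {
    FrameData frame;
    frame.items.push_back(DrawItem{1, state.playerX});
    return frame;
}

// 4. Render from the snapshot only; returns the number of draw calls it would issue.
std::size_t renderFrame(const FrameData& frame) {
    std::size_t drawCalls = 0;
    for (const DrawItem& item : frame.items) {
        (void)item; // a real renderer would issue a draw call for this item
        ++drawCalls;
    }
    return drawCalls;
}
```

Because renderFrame never touches GameState, the simulation could already be updating the next frame while the previous FrameData is being rendered.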

So there are 2 cases (with DX11):

1. If VSync is on, I should benefit from 2 threads (the rendering thread will not stall the simulation thread).

2. If VSync is off (no wait on Present()), the only benefit is the time required to submit all commands to DX.

But in my tests I haven't yet seen the CPU need more than 2.5 ms per frame to submit all commands.

This is an image taken from NSight a long time ago (vsync is off); CPU time is only 2 ms total:

[Image: aY95yBA.png]

IMHO, the way to go beyond 2 cores is to make the scene update multithreaded.

"Now I need to know how to make the CPU stall no longer than necessary in Present, so that it can process the next frame until it reaches the next Present, and only then stall."

That's what it does by default. You get a nice high-latency / high-bandwidth pipeline, with the GPU working ~1+ frames behind the CPU.

It will block when there isn't a free back-buffer available. IIRC D3D9 creates 1+BackBufferCount buffers, so yeah, if you've got that set to 2, then you've got triple buffering. This means that you can have three full frames worth of commands queued at the GPU before the CPU will stall.

"I use BackBufferCount = 2 (I think that is triple buffering, isn't it?)"

That's called double buffering, assuming you are flipping between both of those back buffers when presenting.

No, sorry, I think Hodgman and I were right.

If you have one back buffer, then you have your current buffer plus that one, so you are using two render buffers (double buffering).

If you have two back buffers, then you have your current buffer plus those two, so you are using three render buffers (triple buffering).

Here is an old topic that clarifies this:

https://www.gamedev.net/topic/169446-what-is-the-backbuffer-count/

I know what it is, I just didn't realize you had another frame buffer besides the two back buffers. So if you do have 3 buffers in total, then you're right, that's triple buffering.

Correct, for D3D9, it's a count of back buffers, not including an implicit front buffer. It's indeterminate whether calling Present will let you render to a buffer that's not included in your back buffer count. For example, in windowed mode, you only render to back buffers, not the front buffer, but in fullscreen, the front buffer is renderable.

For D3D10+, the back buffer count is the renderable buffer count. This means that after N calls to Present, you'll be rendering to the same buffer again, for both fullscreen and windowed.

So in fullscreen D3D9, 1 back buffer = double-buffering, but not necessarily in windowed mode. For D3D10+, you need 2 back buffers to get double-buffering in fullscreen.

