Rendering Pipeline

5 comments, last by OuncleJulien 19 years, 10 months ago
I have a question about how I should construct my engine's rendering pipeline. I've been going through the developer.nvidia.com docs and read that you should keep the CPU busy by having the GPU render the previous frame while the CPU gets the current frame ready for processing.

My current pipeline goes like this:
AIandPhysics()
UpdateGameObjects()
Clear()
BeginScene()
DrawObjects()
EndScene()
Present()

I'm guessing the Present function blocks until the entire scene has been fully rendered and displayed, is this correct? If so the CPU is stalled waiting for the GPU to finish.

So, in order to make the CPU and GPU work better in parallel, as suggested by the nvidia docs, should the pipeline go like this?
AIandPhysics()
UpdateGameObjects()
Present() // Renders the previous frame
Clear()
BeginScene()
DrawObjects()
EndScene()

So after EndScene is called, the GPU can still be working on rendering the frame while the CPU begins on the next frame. Is this the most efficient way to utilize CPU and GPU parallelization? If not, what would you suggest? Thanks guys!
quote:Original post by OuncleJulien
I'm guessing the Present function blocks until the entire scene has been fully rendered and displayed, is this correct? If so the CPU is stalled waiting for the GPU to finish.


Nah. The only time Present will block is if vsync is on and the vertical blanking interval hasn't passed yet (the vertical blanking interval being the amount of time required before the monitor can refresh and start displaying again). Otherwise Present returns immediately. What Present actually does is swap the pointers to the back buffer and front buffer, so that the next set of rendering commands you give the graphics card is applied to the new back buffer. I suppose this could also depend on the swap effect you are using. For D3DSWAPEFFECT_DISCARD there would definitely be no need for Present to wait, but for D3DSWAPEFFECT_FLIP I'm not too sure.

There are some methods that would actually stall the graphics card when you call them, particularly the texture and vertex/index buffer lock functions. See the D3DLOCK flags in the DX SDK to see what happens. For dynamic resources, specifying D3DLOCK_DISCARD or D3DLOCK_NOOVERWRITE allows the graphics driver to return immediately after the call, so using these flags as wisely as possible is your best bet for achieving CPU/GPU parallelism.
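
A rough sketch of the usual append pattern for a dynamic buffer: keep writing into unused space with D3DLOCK_NOOVERWRITE, and when the buffer runs out of room, lock with D3DLOCK_DISCARD and restart at offset 0. The names LockMode, LockRequest and DynamicBufferCursor are made up for illustration, and the code makes no Direct3D calls at all; it only works out the offset and the flag you would then pass to Lock().

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for D3DLOCK_DISCARD / D3DLOCK_NOOVERWRITE (hypothetical names,
// so the sketch compiles without the DirectX SDK).
enum class LockMode { Discard, NoOverwrite };

struct LockRequest {
    std::size_t offset;  // byte offset at which to lock
    LockMode mode;       // which flag to pass to Lock()
};

// Tracks the write position inside a dynamic buffer of fixed capacity.
class DynamicBufferCursor {
public:
    explicit DynamicBufferCursor(std::size_t capacity)
        : capacity_(capacity), next_(0) {}

    // Reserve `bytes` and report how the buffer should be locked.
    LockRequest reserve(std::size_t bytes) {
        assert(bytes <= capacity_);
        if (next_ + bytes > capacity_) {
            // Out of room: ask for a fresh buffer and start over at 0.
            next_ = bytes;
            return LockRequest{0, LockMode::Discard};
        }
        // Still room: append after data the GPU may still be reading.
        LockRequest request{next_, LockMode::NoOverwrite};
        next_ += bytes;
        return request;
    }

private:
    std::size_t capacity_;
    std::size_t next_;
};
```

With a 100-byte buffer, reserving 60 bytes twice gives offset 0 with NoOverwrite, then offset 0 with Discard, because the second batch no longer fits behind the first.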

In actuality, the first list you had would seem more appropriate to me.

This block:
Clear()
BeginScene()
DrawObjects()
EndScene()
Present()

would send out all the commands to the graphics card. Then while all that stuff is being rendered, your CPU is already on the next loop doing this:
AIandPhysics()
UpdateGameObjects()

It's how you handle locking and filling resources inside the rendering block that will truly affect everything else. In fact, I think I had a link lying around somewhere that was about GPU/CPU parallelism, but I can't find it right now and I must be off. I'll post later on and get the link to you if I find it.

| TripleBuffer Software |
| Plug-in Manager :: DX Utility Engine :: C++ Debug Kit :: DirectX Tutorials :: Awesome Books |
aliak.net
Thanks for the reply! I appreciate you finding that link for me. I'll keep an eye out for your next post.
Put a timer call around Present() and observe. You will find that it is a blocking call in all cases. When VSync is on, it blocks until it can sync with the next frame. If it is off, it blocks until it has completed some work. My guess is that this is either AGP transfer time or some other GPU processing. For example, on ATI cards, if you enable N-patches, Present() takes a long time to return.
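
A minimal sketch of that kind of timer, assuming nothing beyond the standard library. time_call_ms and fake_present are hypothetical names; fake_present only sleeps for 5 ms and stands in for the real Present() call, which you would wrap in a lambda instead.

```cpp
#include <chrono>
#include <thread>

// Returns how long `fn` took to run, in milliseconds.
template <typename Fn>
double time_call_ms(Fn&& fn) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    fn();
    const auto stop = clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical stand-in for Present(): pretends to block for 5 ms.
inline void fake_present() {
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
}
```

Logging time_call_ms for every frame, with vsync on and off, should show whether and for how long the call blocks on your own hardware and driver.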

Count your blessings before you count your problems.
www.wiu.edu/users/muaiq
WHQL certified drivers can buffer up to 3 frames. So sometimes Present() blocks so that the driver finishes the 3 queued frames before you queue any others.

quote:WHQL certified drivers can buffer up to 3 frames. So sometimes Present() blocks so that the driver finishes the 3 queued frames before you queue any others.


Would that be a maximum of 3 or is 3 like a minimum number? So this would mean that the more frames the card can...buffer, the less trouble present will give.

quote:Thanks for the reply! I appreciate you finding that link for me. I'll keep an eye out for your next post.


Sorry, the link I had was about something else. But there was a small discussion on the directxdev mailing list about this a while back. I don't think they came to any conclusion, though.

Anyway, go here and search for "parallel rendering" and you'll get the thread with that discussion.

quote:Original post by IFooBar
Would that be a maximum of 3 or is 3 like a minimum number? So this would mean that the more frames the card can...buffer, the less trouble present will give.


Maximum of 3. There were many issues with drivers buffering more to make benchmarks look better. The downside is that buffering more frames also makes your game less responsive. Essentially the player is reacting to things that actually occurred numerous frames in the past.
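
To put a rough number on that, here is a back-of-the-envelope sketch. It assumes the game is GPU-bound with vsync on, so each queued frame waits about one refresh interval before it is shown; added_latency_ms is a hypothetical helper, not anything from the driver or the SDK.

```cpp
// Rough extra input latency, in milliseconds, when the driver is holding
// `queued_frames` frames and each one takes one refresh interval to show.
constexpr double added_latency_ms(int queued_frames, double refresh_hz) {
    return queued_frames * (1000.0 / refresh_hz);
}
```

Under those assumptions, 3 queued frames at 60 Hz come to about 50 ms between the input being sampled and the result reaching the screen.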



Stay Casual,

Ken
Drunken Hyena

