
Archived

This topic is now archived and is closed to further replies.

softimage

how to count the time spent on gpu?


It seems that DirectX can send instructions to the GPU and return without waiting for completion. So how can I find out how much time the GPU takes to finish the computation?

I use VMR9 to read MPEG files and apply a pixel shader effect to every video frame VMR9 provides. For each frame my program calls: BeginScene, SetRenderTarget, SetPixelShader, DrawPrimitive, EndScene. I timed this with DXUtil_Timer() and the result was 1/580 second.

Then I added a second pass, so each frame became: BeginScene, ..., EndScene, BeginScene, ..., EndScene. The video began to stutter, so I suspect the load is too heavy - yet the time measured across the BeginScene...EndScene calls was still only 1/560 second. That makes me suspect the calls return before the GPU has finished.

I know there is a program called FRAPS that can count the FPS of games, but for video it's fixed at 30 fps.

Unfortunately, I doubt you'll get anywhere with this. Nvidia/ATI don't tend to tell you how the finer workings of their GPUs operate - and for good reason, really.

With all the parallel-processing / batched-execution tricks they play to get speed increases, it's probably no simple matter to work out how long something will take.

As for the async nature of D3D not waiting for the render to complete before returning: you can sometimes get an idea of what the GPU will do by looking at the drivers. I know one of my previous drivers had a "render no more than __ frames ahead" setting, and I'm guessing that means it will only queue __ frames before forcing D3D to wait until it's caught up (this is noticeable with some profiling tools in specific scenarios).

The only other option is probably to ask devrel@ati.com - ATI have always helped me in the past; Nvidia weren't as helpful, but I haven't spoken to them in a while.

hth
Jack

quote:
Original post by jollyjeffers
I know one of my previous drivers had a "render no more than __ frames ahead" setting, and I'm guessing that means it will only queue __ frames before forcing D3D to wait until it's caught up (this is noticeable with some profiling tools in specific scenarios).

All WHQL drivers can buffer up to a maximum of 3 frames, most probably so that - as Rich Thomson says - "[...] the GPU can be displaying frame N and rendering frame N+1 while the CPU queues frame N+2. Any more queueing than that and you introduce delays. Any less queueing than that and you lose possibilities for parallelism."

Muhammad Haggag

I used CreateRenderTarget (lockable) and then LockRect() / UnlockRect() after EndScene().
The result is almost the same as what the tool FRAPS reports.
But somehow the CPU usage rises from 6% to 40% when locking.

I found that on a GeForce FX 5600, to keep 30 fps on a 512x256 video,
I can put at most 68 "dp4" instructions in the shader, compared to 47 on an FX 5200.

Is that common?
How can games achieve 100 fps with complicated scenes?
Or is the pixel shader still not fast enough?

[edited by - softimage on March 3, 2004 8:56:59 PM]


  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

Participate in the game development conversation and more when you create an account on GameDev.net!

Sign me up!