Measuring Latency [Solved]

Hi all,


I'm writing a tool to measure latency in 3d applications (mostly games).

I define latency as the time between:
1. The first D3D command of frame N (D3D 9, 10, or 11)

2. The actual presentation time on the GPU (imagine fullscreen exclusive mode: the time of the actual flip)


I'd prefer to stay out of the kernel and use only user-mode code.


Using queries I can get:
a. The CPU time of the first D3D command

b. The GPU time of the present event


My question is: how do I calculate a delta between those two? They obviously come from two different processors, each with its own clock.


Note: assume that clock frequencies don't change in the middle of the test.

All I can think of is some initial synchronization phase, but that sounds a bit messy.

Any ideas will be highly appreciated =)

Edited by yoelshoshan

Thank you MJP!

I need to do this programmatically; it's not some "post-mortem" analysis of an application.


But I can take the following from your idea:

1. What I want to do is obviously possible - if GPUView can do it, so can I =)

2. Maybe I can reverse-engineer GPUView a bit, but my guess is that it uses kernel code, which I don't want to do until I'm sure the problem isn't solvable in user code.


Any more ideas?

It will drastically affect performance, but you can draw to a buffer immediately after the Present, then map/lock that buffer and record the CPU time when the Map call returns (the driver will stall the CPU until the GPU has finished drawing to the buffer).

Thanks for your replies =)


Hodgman, as you said, it changes performance and behavior drastically, so it's not acceptable.

The point is to measure the current latency; with your suggestion, not only is performance altered, the latency itself changes drastically, since frames can no longer be queued.


Sharing the options I can think of:


1. Using GPU timestamp queries and somehow learning how to match a CPU timestamp to a GPU timestamp (tricky...)

2. Polling event queries on a dedicated thread, at around 1000 GetData calls per minute.
I need to check how much of the core it runs on this eats... hopefully not too much, since it's not a full busy-poll.

3. Probably the best method remains waiting on low-level events, the same way GPUView does.


BTW - this code will not run in my own application; it is injected into other applications. But any solution that works in my own application without altering the original latency/performance is acceptable as well :)


Then one more approach you can try is to use sigslot on your rendering thread and bind the two respective events to it.

Of course, with sigslot you can work across multiple threads as well.

Then, every time one of these events fires, you can send a signal and record the time for it.

A GPU timestamp won't help you with CPU events, and vice versa.


However, to keep things simpler - unless your app needs a latency delta precise to nanoseconds - you can still go with a simple time(NULL) sort of call just before and after your events. It doesn't cost much and keeps things fairly simple.


If anyone is interested, I solved the problem of measuring latency.

I had to move outside the scope of user-mode DirectX.

The solution was injecting D3DKMT_SIGNALSYNCHRONIZATIONOBJECT2 calls after presents.

It sounds off-topic for this forum, so I won't go deeper into the description.

