Measuring Latency [Solved]


Hi all,

I'm writing a tool to measure latency in 3d applications (mostly games).

My definition of latency is the time between:
1. The first D3D command of frame N (D3D 9, 10, or 11)

2. The actual presentation time on the GPU (imagine fullscreen: the time of the actual flip)

I prefer not to dive into kernel mode and to use only user-mode code.

Using queries I can get:
a. The CPU time of the first D3D command

b. The GPU time of the present event

My question is: how do I calculate a delta between those two? They obviously come from two different processors, each with its own clock.

Note: assume that the clock frequency doesn't change in the middle of the test.
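For reference, here is a minimal D3D11 sketch of the two measurement points I mean (just an illustration, assuming an existing device and immediate context; D3D9/10 have equivalent query types):

```cpp
// Sketch only: capture (a) CPU time at the frame's first D3D command and
// (b) a GPU timestamp near Present. Error handling omitted.
#include <d3d11.h>
#include <windows.h>

void CreateTimingQueries(ID3D11Device* dev, ID3D11Query** disjoint, ID3D11Query** tsPresent)
{
    D3D11_QUERY_DESC qd = {};
    qd.Query = D3D11_QUERY_TIMESTAMP_DISJOINT;   // brackets the frame, reports GPU tick frequency
    dev->CreateQuery(&qd, disjoint);
    qd.Query = D3D11_QUERY_TIMESTAMP;            // raw GPU tick value
    dev->CreateQuery(&qd, tsPresent);
}

// Per frame:
//   ctx->Begin(disjoint);
//   QueryPerformanceCounter(&cpuFirstCommand);  // (a) CPU clock
//   ... the application's draw calls ...
//   ctx->End(tsPresent);                        // (b) GPU clock, issued right before Present
//   ctx->End(disjoint);
//
// GetData later returns GPU ticks plus the tick frequency, but those ticks live on the
// GPU's clock, which is exactly the "two different processors" problem above.
```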


All I can think of is some initial synchronization phase, but it sounds a bit messy.

Any ideas will be highly appreciated =)


I'm not sure if it will get you exactly what you want, but you should check out GPUView.

Thank you MJP!

I need to do it programmatically; it's not a "post-mortem" analysis of an application.

But I can take the following from your idea:

1. What I want to do is obviously possible - if GPUView can do it, so can I =)

2. Maybe I can reverse-engineer GPUView a bit, but my guess is that it uses kernel code, which I don't want to touch until I'm sure the problem isn't solvable in user code.

Any more ideas?

It will drastically affect performance, but you can draw to a buffer immediately after the present, and then map/lock the buffer, checking the time when the map operation completes on the CPU (as this will cause the driver to stall the CPU until the buffer has been drawn to).
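For example, a rough D3D11 version of this idea might look as follows (just a sketch, assuming a device, immediate context and non-multisampled swap chain are available; a real tool would create the staging texture once instead of per call):

```cpp
// Copy the back buffer to a staging texture and map it: the Map call blocks the CPU
// until the GPU has finished everything queued before the copy.
#include <d3d11.h>
#include <dxgi.h>
#include <windows.h>

LONGLONG StallAndTimestamp(ID3D11Device* dev, ID3D11DeviceContext* ctx, IDXGISwapChain* swap)
{
    ID3D11Texture2D* backBuffer = nullptr;
    swap->GetBuffer(0, __uuidof(ID3D11Texture2D), (void**)&backBuffer);

    D3D11_TEXTURE2D_DESC desc;
    backBuffer->GetDesc(&desc);
    desc.Usage = D3D11_USAGE_STAGING;            // CPU-readable copy
    desc.BindFlags = 0;
    desc.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
    desc.MiscFlags = 0;

    ID3D11Texture2D* staging = nullptr;
    dev->CreateTexture2D(&desc, nullptr, &staging);

    ctx->CopyResource(staging, backBuffer);              // queued right after Present
    D3D11_MAPPED_SUBRESOURCE mapped;
    ctx->Map(staging, 0, D3D11_MAP_READ, 0, &mapped);    // blocks until the GPU is done
    ctx->Unmap(staging, 0);

    LARGE_INTEGER t;
    QueryPerformanceCounter(&t);                         // CPU time when the GPU caught up
    staging->Release();
    backBuffer->Release();
    return t.QuadPart;
}
```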

If your rendering happens on a single thread and you must do it programmatically, then your only options are the one mentioned above or a crude get-time call immediately before and after each event.

Thanks for your replies =)

Hodgman, as you said, it changes performance and behavior drastically, so it's not acceptable.

The point is to measure the current latency, and by doing what you suggest, not only is performance altered: the latency itself changes drastically, since I no longer allow queuing of frames.

Sharing the options I can think of:

1. Using GPU timestamp queries and somehow learning how to match a CPU timestamp to a GPU timestamp (tricky...)

2. Polling event queries on a dedicated thread, at around 1000 GetData calls per minute (see the sketch below the list).
I have to check how much this eats from the core it runs on... hopefully not too much, since it's not a fully busy poll.
3. Probably the best method remains waiting on low-level events, the same way GPUView does.
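Here is roughly what I have in mind for option 2 (a sketch only, assuming the injected code shares the application's device and serializes access to the immediate context, which is not free-threaded; handOffToPoller is a hypothetical helper):

```cpp
#include <d3d11.h>
#include <windows.h>

struct PendingFrame
{
    ID3D11Query*  gpuDoneQuery;     // D3D11_QUERY_EVENT issued right after Present
    LARGE_INTEGER cpuFirstCommand;  // CPU time of the frame's first D3D command
};

// Render thread, right after Present:
//   ctx->End(frame.gpuDoneQuery);   // the query is signaled once the GPU passes this point
//   handOffToPoller(frame);         // hypothetical helper: queue the frame for the poller

// Dedicated thread: poll gently instead of busy-waiting.
void PollFrame(ID3D11DeviceContext* ctx, const PendingFrame& frame)
{
    while (ctx->GetData(frame.gpuDoneQuery, nullptr, 0, 0) == S_FALSE)
        Sleep(1);                                   // keeps the core mostly idle between checks

    LARGE_INTEGER gpuDone, freq;
    QueryPerformanceCounter(&gpuDone);              // CPU clock, sampled when the GPU finished
    QueryPerformanceFrequency(&freq);
    double latencyMs =
        1000.0 * (gpuDone.QuadPart - frame.cpuFirstCommand.QuadPart) / freq.QuadPart;
    (void)latencyMs;  // record/report it; note it includes the ~1 ms polling granularity
}
```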

BTW, this code will not run in my own application; it is injected into other applications. But any solution that works in my own application without altering the original latency/performance is acceptable as well :)

Then one more way you can try is to use sigslot on your rendering thread and bind the two respective events to it.

Of course, with sigslot you can work across multiple threads as well.

So every time one of these gets executed, you can send a signal and record the time for it.

A GPU timestamp won't help you with CPU events and vice versa.

However, to keep things simple, unless your app needs a latency delta precise to nanoseconds, you can still go with a simple time(NULL) sort of call just before and after your events. It doesn't cost much and keeps things fairly simple.
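For example, something like this on the CPU side (a sketch; QueryPerformanceCounter is used here instead of time(NULL) for sub-millisecond resolution):

```cpp
#include <windows.h>

// Sample the CPU clock just before/after an event of interest and report the delta.
double ElapsedMs(const LARGE_INTEGER& start, const LARGE_INTEGER& end)
{
    LARGE_INTEGER freq;
    QueryPerformanceFrequency(&freq);
    return 1000.0 * (end.QuadPart - start.QuadPart) / freq.QuadPart;
}

// Usage:
//   LARGE_INTEGER before, after;
//   QueryPerformanceCounter(&before);   // just before the event
//   ...                                 // the event being timed
//   QueryPerformanceCounter(&after);    // just after the event
//   double ms = ElapsedMs(before, after);
```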

If anyone is interested, I solved the problem of measuring latency.

I had to move outside the scope of user-mode DirectX.

The solution was injecting D3DKMT_SIGNALSYNCHRONIZATIONOBJECT2 calls after presents:

http://msdn.microsoft.com/en-us/library/windows/hardware/ff548357%28v=vs.85%29.aspx

It sounds off-topic for this forum, so I won't go deep into the description.
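For anyone who wants to dig in: the D3DKMT* entry points are ordinary user-mode exports from gdi32.dll, so they can be resolved at runtime without linking against the WDK import library. A minimal sketch (the structure itself is declared in the WDK header d3dkmthk.h and its fields are filled in per the MSDN page linked above, which I'm leaving out here):

```cpp
#include <windows.h>

// Returns NTSTATUS; the parameter is really a const D3DKMT_SIGNALSYNCHRONIZATIONOBJECT2*,
// left as void* here to keep the sketch self-contained without the WDK headers.
typedef LONG (APIENTRY* PFN_D3DKMTSignalSynchronizationObject2)(const void* pData);

PFN_D3DKMTSignalSynchronizationObject2 LoadSignalThunk()
{
    HMODULE gdi32 = LoadLibraryW(L"gdi32.dll");   // the D3DKMT* thunks are exported from here
    if (!gdi32)
        return nullptr;
    return reinterpret_cast<PFN_D3DKMTSignalSynchronizationObject2>(
        GetProcAddress(gdi32, "D3DKMTSignalSynchronizationObject2"));
}
```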
