duhroach

LOCKRECT performance metrics

I'm wondering if anyone knows about LockRect performance with respect to how much data is being locked. For example, let's say that I have a 2x2 square of a 256x256 image that I need to modify. Is it more efficient to lock just the 2x2 square, or is there no penalty for locking the entire image? I'm guessing that a larger part of the perf hit is actually how long the object in question is locked, and if the GPU has to stall waiting for the CPU to unlock a resource before it can use it. Anyone have some info? ~Main

Quote:
Original post by duhroach
For example, let's say that I have a 2x2 square of a 256x256 image that I need to modify. Is it more efficient to lock just the 2x2 square, or is there no penalty for locking the entire image?

It's been a while since I've tried it, but it is better to lock ONLY the part you need. I seem to remember that it's up to the driver how it wants to interpret that, but in your example there is a good chance it can save some memory bandwidth by only working on the 2x2 (or close) area.

When you lock the data you'll get a pointer through which you can modify it, and that pointer refers to the region you've requested (obviously). The texture/surface data therefore needs to be in a memory location that you can read from and write to (as allowed), so you could envisage the hardware/driver copying the necessary data to AGP/SYSMEM for the duration of the lock. Copying 256x256 pixels is obviously worse than copying 2x2 [wink]
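
To make the pointer part concrete, here is a minimal, portable sketch. It does not call Direct3D at all: `Rect`, `LockedRect`, `MockSurface`, `fillRect` and `rectBytes` are hypothetical stand-ins, assumed only to mirror how `LockRect` hands back a `pBits` pointer to the top-left of the requested rectangle together with a `Pitch` that is still the byte width of a full surface row. The 4-bytes-per-pixel format is an assumption as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for RECT and D3DLOCKED_RECT (not the real D3D types).
struct Rect { int left, top, right, bottom; };  // right/bottom are exclusive
struct LockedRect { int pitch; void* bits; };   // pitch = bytes per full row

// Mock 32-bit surface held in system memory. On real hardware the pitch
// may be larger than width * 4, so rows must always be stepped by pitch.
class MockSurface {
public:
    MockSurface(int w, int h)
        : width_(w), height_(h), pitch_(w * 4),
          data_(static_cast<std::size_t>(w) * 4 * static_cast<std::size_t>(h), 0) {}

    // Returns a pointer to the top-left pixel of the requested rect.
    // The pitch is that of the whole surface, not of the rect.
    LockedRect lock(const Rect& r) {
        assert(r.left >= 0 && r.top >= 0 && r.left < r.right &&
               r.top < r.bottom && r.right <= width_ && r.bottom <= height_);
        std::uint8_t* p = data_.data()
            + static_cast<std::size_t>(r.top) * pitch_
            + static_cast<std::size_t>(r.left) * 4;
        return LockedRect{pitch_, p};
    }

    void unlock() {}  // nothing to release in the mock

    // Reads one pixel back, for checking results.
    std::uint32_t pixel(int x, int y) const {
        std::uint32_t v;
        std::memcpy(&v, data_.data()
            + static_cast<std::size_t>(y) * pitch_
            + static_cast<std::size_t>(x) * 4, 4);
        return v;
    }

private:
    int width_, height_, pitch_;
    std::vector<std::uint8_t> data_;
};

// Fills only the requested rect, walking rows by the surface pitch.
void fillRect(MockSurface& s, const Rect& r, std::uint32_t argb) {
    LockedRect lr = s.lock(r);
    std::uint8_t* row = static_cast<std::uint8_t*>(lr.bits);
    for (int y = r.top; y < r.bottom; ++y, row += lr.pitch) {
        for (int x = 0; x < r.right - r.left; ++x) {
            std::memcpy(row + x * 4, &argb, 4);
        }
    }
    s.unlock();
}

// Bytes covered by a rect at 4 bytes per pixel.
std::size_t rectBytes(const Rect& r) {
    return static_cast<std::size_t>(r.right - r.left)
         * static_cast<std::size_t>(r.bottom - r.top) * 4;
}
```

With these definitions, `rectBytes` gives 16 bytes for a 2x2 rect and 262144 bytes for the full 256x256 image, which is the gap a driver could exploit if it copies only the locked region.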

Quote:
Original post by duhroach
I'm guessing that a larger part of the perf hit is actually how long the object in question is locked

Conventional wisdom is to engineer your code such that the lock is maintained only as long as is absolutely necessary.
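
One way to read that advice: do all the pixel generation into your own CPU-side buffer first, so the only work done while the lock is held is a straight row copy. A rough sketch of that copy follows. `copyRows` is a hypothetical helper, not part of Direct3D, and the staging-buffer approach is an assumption about how you might structure things.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `rows` rows of `rowBytes` bytes each from src to dst.
// Source and destination may use different pitches (bytes per row).
//
// Intended order of operations (the D3D calls are shown only as comments):
//   1. Build the new pixels in a staging buffer (no lock held).
//   2. LockRect on just the sub-rectangle you need.
//   3. copyRows(locked.pBits, locked.Pitch, staging, stagingPitch, ...).
//   4. UnlockRect straight away.
void copyRows(void* dst, std::size_t dstPitch,
              const void* src, std::size_t srcPitch,
              std::size_t rowBytes, std::size_t rows) {
    assert(dst != nullptr && src != nullptr);
    std::uint8_t* d = static_cast<std::uint8_t*>(dst);
    const std::uint8_t* s = static_cast<const std::uint8_t*>(src);
    for (std::size_t y = 0; y < rows; ++y) {
        std::memcpy(d + y * dstPitch, s + y * srcPitch, rowBytes);
    }
}
```

For the 2x2 case at 4 bytes per pixel, that is two 8-byte copies under the lock and nothing else.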

Quote:
Original post by duhroach
if the GPU has to stall waiting for the CPU to unlock a resource before it can use it.

Stalling the GPU can be a pretty big performance hit as I'm sure you're aware.

One of the best ways to speed up resource locking is making sure that the data is created in the correct pool (D3DPOOL_*) and with the correct usage parameters (D3DUSAGE_*) - this can at the very least hint to the driver as to what you're planning to do with the resource.

Running the debug runtimes against your application will often warn you if you're using a sub-optimal usage/pool combination for your situation.

hth
Jack
