Speeding up LockRect

5 comments, last by Dont Disturb 17 years, 11 months ago
Hi all, I'm working on some software that relies on having graphics in main memory. I'm doing some rendering to a Direct3D surface, then locking the surface and reading the image back to main memory. Unfortunately this is /really/ slow (3 frames per second). Is there any way to speed this up? I'm using a PCI Express video card, and I heard that this was supposed to speed up reading back from video memory. Is there some way to turn on this ability? Thanks very much
That sounds really slow. How large is the texture?
Things to try (just guesswork): use DX release mode, update drivers.
Reading back from video memory is still very slow. It's faster on a PCI-Express slot, but still slow as hell.

Still, you should get more than 3 FPS, so I suspect there's something else going on in your code. Can we see some code? And are you using the debug D3D runtimes? They'll give you much more information if something is going wrong.
Just a random thought really... have you run your profiling on a Release build of your code?

In my experience, the debug builds tend to be much slower when dealing with DMA - lots of under/over run and other access-related debugging stuff can really hurt performance.

Also, are you reading back each pixel (e.g. a nested for() loop) or are you grabbing the whole block? It'll be substantially faster to grab a single huge block of binary data and then process each element than to combine both during a lock/unlock...
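
To illustrate the "grab the whole block" idea, here is a minimal sketch of copying a locked image out with one memcpy per row and doing the per-pixel work afterwards. `LockedRect` is a hypothetical stand-in that mirrors the `Pitch` and `pBits` fields of `D3DLOCKED_RECT`, and `copyLockedImage` is an illustrative name, not part of Direct3D. The copy goes row by row because the pitch can be larger than the visible row width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in mirroring the two fields of D3DLOCKED_RECT.
struct LockedRect {
    int   Pitch;  // bytes from the start of one row to the start of the next
    void* pBits;  // pointer to the first row of the locked image
};

// Copy the whole locked image into a tightly packed buffer, one memcpy per row.
// Per-pixel processing can then run on the returned buffer after the unlock.
std::vector<std::uint8_t> copyLockedImage(const LockedRect& locked,
                                          std::size_t width,
                                          std::size_t height,
                                          std::size_t bytesPerPixel)
{
    const std::size_t rowBytes = width * bytesPerPixel;
    assert(locked.pBits != nullptr);
    assert(locked.Pitch >= 0 &&
           static_cast<std::size_t>(locked.Pitch) >= rowBytes);

    const std::size_t pitch = static_cast<std::size_t>(locked.Pitch);
    const std::uint8_t* src = static_cast<const std::uint8_t*>(locked.pBits);

    std::vector<std::uint8_t> out(rowBytes * height);
    for (std::size_t y = 0; y < height; ++y) {
        // Skip the padding at the end of each source row.
        std::memcpy(out.data() + y * rowBytes, src + y * pitch, rowBytes);
    }
    return out;
}
```

The lock only needs to be held for the duration of the copy; anything that touches individual pixels works on the returned vector.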

Another one that I've posted about before that people have told me works quite well is to use a ring-buffer approach. Store N images, rendering to each one after another. For each frame you download 1/Nth of the previous frame. This way you can maximize concurrency between the CPU and GPU and keep both busy at the same time without stalling either unnecessarily...
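
The slot rotation behind the ring-buffer idea could look roughly like the sketch below. `ReadbackRing` and its methods are hypothetical names for illustration only. It covers just the bookkeeping of which slot to render into and which older slot is safe to read; it reads back a whole older slot per frame rather than splitting the download into 1/N pieces, and it contains no Direct3D calls.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: tracks which of N slots to render into this frame
// and which older slot to read back while the GPU is busy elsewhere.
class ReadbackRing {
public:
    explicit ReadbackRing(std::size_t slotCount) : slotCount_(slotCount) {
        assert(slotCount >= 2);  // with one slot, read and write would collide
    }

    // Slot to render into for the current frame.
    std::size_t writeSlot() const { return frame_ % slotCount_; }

    // False during the first slotCount-1 frames, before the ring has filled.
    bool hasReadSlot() const { return frame_ >= slotCount_ - 1; }

    // Oldest rendered slot: the one written slotCount-1 frames ago.
    // It is never the same slot as writeSlot().
    std::size_t readSlot() const {
        assert(hasReadSlot());
        return (frame_ + 1) % slotCount_;
    }

    // Advance to the next frame.
    void endFrame() { ++frame_; }

private:
    std::size_t slotCount_;
    std::size_t frame_ = 0;
};
```

Each frame, the application would render into the surface at `writeSlot()`, read back the surface at `readSlot()` once `hasReadSlot()` is true, and then call `endFrame()`. The cost is a latency of N-1 frames on the data that reaches main memory.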

hth
Jack Hoxley

Use GetRenderTargetData(). It's *much* faster than locking a full render target surface. (It's still not blistering fast, but it should improve your frame rate.)
I posted this reply to a similar question recently. Using this technique may avoid stalling the GPU each frame.
Thanks for all your help guys, GetRenderTargetData is about 130 times faster than doing a memcpy from the locked data!

