D3DPOOL_SYSTEMMEM vs D3DPOOL_DEFAULT?

I just ran a simple speed test of the operations on different types of VB. Here is my setup and result:

----------- Test Begin --------------------

size_t nVertexCount  = 1500;
size_t nSegmentCount = 15;

CVertexBuffer HugeBuffer;
CVertexBuffer SmallBuffer[15];

HugeBuffer.Create(15 * 1500);
for (i = 0; i < 15; ++i)
{
    SmallBuffer[i].Create(1500);
}
// fill some data...

--------- Using D3DPOOL_SYSTEMMEM ---------

Loop(100)
{
    HugeBuffer.Lock();
    HugeBuffer.Unlock();
    HugeBuffer.SetStreamSource();
    for (i = 0; i < 15; ++i)
    {
        // if this line is uncommented, FPS drops to 7
        //HugeBuffer.DrawPrimitive(i * 1500, 500);
    }
}
=> FPS = 403

Loop(100)
{
    for (i = 0; i < 15; ++i)
    {
        SmallBuffer[i].Lock();
        SmallBuffer[i].Unlock();
        SmallBuffer[i].SetStreamSource();
        // if this line is uncommented, FPS drops to 7
        //SmallBuffer[i].DrawPrimitive(0, 500);
    }
}
=> FPS = 403

--------- Using D3DPOOL_DEFAULT ---------

Loop(100)
{
    HugeBuffer.Lock();
    HugeBuffer.Unlock();
    HugeBuffer.SetStreamSource();
    for (i = 0; i < 15; ++i)
    {
        // if this line is uncommented, FPS drops to 7
        //HugeBuffer.DrawPrimitive(i * 1500, 500);
    }
}
=> FPS = 253

Loop(100)
{
    for (i = 0; i < 15; ++i)
    {
        SmallBuffer[i].Lock();
        SmallBuffer[i].Unlock();
        SmallBuffer[i].SetStreamSource();
        // if this line is uncommented, FPS drops to 7
        //SmallBuffer[i].DrawPrimitive(0, 500);
    }
}
=> FPS = 17

----------- Test End --------------------

In this test I simulated two ways of drawing the same total triangle count: one big vertex buffer used as a vertex cache (HugeBuffer), and a set of simple, small vertex buffers (SmallBuffer[15]). I tested both the SYSTEMMEM pool and the DEFAULT pool, and now I'm confused about the results.

My first confusion: with the SYSTEMMEM pool, the Lock operation has almost no performance cost. Does D3D treat a SYSTEMMEM VB as just a normal memory block, so that the address of a SYSTEMMEM VB never changes?

My second confusion: I believed that a SYSTEMMEM VB would be slower than a DEFAULT VB, because it needs to be copied to video memory. But there is no FPS difference between the SYSTEMMEM VB and the DEFAULT VB (with DrawPrimitive enabled, both end up at 7 FPS).

I hope someone can help me.
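(For context: CVertexBuffer::Create is assumed to be a thin wrapper around IDirect3DDevice9::CreateVertexBuffer, roughly like the sketch below. The vertex struct, FVF, and usage flags are placeholders, not the real wrapper code; the only thing that changes between the two test runs is the pool.)

// Sketch of creating the buffers in the two pools being compared.
struct TestVertex
{
    float x, y, z;   // position
    DWORD color;     // diffuse colour
};
const DWORD TEST_FVF = D3DFVF_XYZ | D3DFVF_DIFFUSE;

HRESULT CreateTestVB(IDirect3DDevice9* pDevice, size_t vertexCount,
                     D3DPOOL pool,  // D3DPOOL_SYSTEMMEM or D3DPOOL_DEFAULT
                     IDirect3DVertexBuffer9** ppVB)
{
    return pDevice->CreateVertexBuffer(
        UINT(vertexCount * sizeof(TestVertex)),
        0,          // usage flags unknown; D3DUSAGE_WRITEONLY is worth trying for the DEFAULT pool
        TEST_FVF,
        pool,
        ppVB,
        NULL);
}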
Is that CVertexBuffer a wrapper or something?


ace
Perhaps the graphics system works with actual data and temporary data.

The actual data lives either in system memory or in video card memory, so locking is faster when the data is in system memory; data in video card memory has to be read back before it can be locked. The upside is that data already in video card memory doesn't have to be uploaded to the graphics card again as long as it isn't locked.

The reason there's no performance hit is that, if you don't change the data each frame, the data uploaded from system memory to the graphics card is kept in temporary video card memory. Once you have a lot of resources, that temporary video card memory fills up and performance will drop.

I hope you understand my story, but I have no actual proof. It's just speculation. [smile]
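For reference, the usual way to avoid the lock stall described above on a D3DPOOL_DEFAULT buffer is the dynamic-buffer pattern from the D3D9 documentation: create the buffer with D3DUSAGE_DYNAMIC and lock it with D3DLOCK_DISCARD. A minimal sketch follows (pDevice, the vertex size, and the FVF are assumed here, not taken from the test code):

// Dynamic vertex buffer sketch: D3DLOCK_DISCARD lets the driver hand back a
// fresh block of memory instead of stalling until the GPU has finished with
// the old contents.
const UINT  nVertices  = 1500;
const UINT  vertexSize = 16;                           // assumed: position + diffuse
const DWORD fvf        = D3DFVF_XYZ | D3DFVF_DIFFUSE;  // assumed vertex format

IDirect3DVertexBuffer9* pVB = NULL;
pDevice->CreateVertexBuffer(nVertices * vertexSize,
                            D3DUSAGE_DYNAMIC | D3DUSAGE_WRITEONLY,
                            fvf,
                            D3DPOOL_DEFAULT,   // D3DUSAGE_DYNAMIC is not allowed with D3DPOOL_MANAGED
                            &pVB, NULL);

void* pData = NULL;
if (SUCCEEDED(pVB->Lock(0, 0, &pData, D3DLOCK_DISCARD)))  // valid only on dynamic buffers
{
    // ... write the new vertex data into pData ...
    pVB->Unlock();
}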
ZzZZ.Z..ZzzZz....Z.. Good night!
Well, it depends a lot on your video memory, and your card's chip can play a great role here. I think the vertex count you used is far below the limits of current cards (tens of thousands of vertices), so you can't really differentiate between the results. Also remember that FPS is not an integer, so you may be missing the exact values, and FPS is not a linear measure of speed (in fact it is highly non-linear), so try using frame time instead of FPS.
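As an example, measuring the frame time directly (a small sketch using QueryPerformanceCounter, not tied to the test code above) makes the numbers comparable in a linear way:

// Measure frame time in milliseconds rather than FPS (sketch).
LARGE_INTEGER freq, t0, t1;
QueryPerformanceFrequency(&freq);

QueryPerformanceCounter(&t0);
// ... render one frame ...
QueryPerformanceCounter(&t1);

double frameMs = 1000.0 * double(t1.QuadPart - t0.QuadPart) / double(freq.QuadPart);
// e.g. 403 FPS is about 2.5 ms per frame and 253 FPS about 4.0 ms, so the
// 150 FPS gap above is only ~1.5 ms of extra work per frame.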
As for the difference between the DEFAULT and SYSTEMMEM vertex buffers, it does seem somewhat strange, since as the Direct3D documentation widely states, a DEFAULT VB should be much faster. But again we come back to the point above, i.e. vertex count: the more vertices you have, the more difference you will see. There is also a special point regarding your usage flags: sometimes specifying the read-only flag when locking can hugely improve performance.
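For example, if a lock is only used for reading, passing the read-only flag looks roughly like this (a sketch; pVB is assumed to be an existing IDirect3DVertexBuffer9*):

// D3DLOCK_READONLY tells the runtime the application will not write to the
// locked data, which can let it skip work when the buffer is unlocked.
void* pData = NULL;
if (SUCCEEDED(pVB->Lock(0, 0, &pData, D3DLOCK_READONLY)))
{
    // ... read vertex data from pData ...
    pVB->Unlock();
}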
One final note: always try to use the defaults, as non-default settings generally don't have predictable behaviour.
Check out my web site: galaxyroad.com
Visit galaxyroad.com, which will soon have English content too
Benchmarking Direct3D applications is a risky sport. Period.

To get a meaningful result you need a very sound understanding of the mechanics of a theoretical Direct3D application, a reasonable knowledge of the hardware, and an exceptional knowledge of the drivers.

Put simply, there are many ways the results can be invalidated, not least because they just aren't representative of real-world performance.

Have you tried running with the debug runtimes and maximum output? You might well be getting some hints about performance from there. I can't remember the exact mappings, but some operations on resources are quicker depending on what pool they're in. It might well be that your "benchmark" has a natural bias in one direction.

My apologies for ruining the party, but you probably won't get anything particularly useful from this sort of test. What you really need to do is integrate some profiling code into a "real world" game/application. Even then you need to be careful about how you generate performance statistics, as there are so many possible ways to get skewed results.

hth
Jack

Jack Hoxley [ Forum FAQ | Revised FAQ | MVP Profile | Developer Journal ]

Thanks to all you warmhearted people.

Maybe the result is wrong. I think I should test it again following jollyjeffers's advice.

This topic is closed to new replies.
