renqilin

D3DPOOL_SYSTEMMEM vs D3DPOOL_DEFAULT?

renqilin    194
I just ran a simple speed test of operations on different types of vertex buffer (VB). Here is my result:

----------- Test Begin --------------------
size_t nVertexCount = 1500;
size_t nSegmentCount = 15;
CVertexBuffer HugeBuffer;
CVertexBuffer SmallBuffer[15];

HugeBuffer.Create(15 * 1500);
for (i = 0; i < 15; ++i)
{
    SmallBuffer[i].Create(1500);
}
// fill some data...

--------- Using D3DPOOL_SYSTEMMEM ---------
Loop(100)
{
    HugeBuffer.Lock();
    HugeBuffer.Unlock();
    HugeBuffer.SetStreamSource();
    for (i = 0; i < 15; ++i)
    {
        // with the call below uncommented, FPS = 7
        //HugeBuffer.DrawPrimitive(i * 1500, 500);
    }
}
= FPS(403)

Loop(100)
{
    for (i = 0; i < 15; ++i)
    {
        SmallBuffer[i].Lock();
        SmallBuffer[i].Unlock();
        SmallBuffer[i].SetStreamSource();
        // with the call below uncommented, FPS = 7
        // SmallBuffer[i].DrawPrimitive(0, 500);
    }
}
= FPS(403)

--------- Using D3DPOOL_DEFAULT ---------
Loop(100)
{
    HugeBuffer.Lock();
    HugeBuffer.Unlock();
    HugeBuffer.SetStreamSource();
    for (i = 0; i < 15; ++i)
    {
        // with the call below uncommented, FPS = 7
        //HugeBuffer.DrawPrimitive(i * 1500, 500);
    }
}
= FPS(253)

Loop(100)
{
    for (i = 0; i < 15; ++i)
    {
        SmallBuffer[i].Lock();
        SmallBuffer[i].Unlock();
        SmallBuffer[i].SetStreamSource();
        // with the call below uncommented, FPS = 7
        // SmallBuffer[i].DrawPrimitive(0, 500);
    }
}
= FPS(17)
----------- Test End --------------------

In my test I simulated the use of a vertex cache (HugeBuffer) and, with the same total triangle count, a set of simple, small vertex buffers (SmallBuffer[15]). I tested both with the SYSTEMMEM flag and with the DEFAULT flag. But now I'm confused about the result.

My first confusion: when using the SYSTEMMEM flag, the Lock operation has almost no performance cost. Does D3D treat a SYSTEMMEM VB as just a normal memory block, so that the address of a SYSTEMMEM VB never changes?

My second confusion: I believed that a SYSTEMMEM VB would be slower than a DEFAULT VB, because it needs to be copied to video memory. But ... there is no FPS difference between the SYSTEMMEM VB and the DEFAULT VB.

I hope someone can help me.

Pipo DeClown    804
Perhaps the graphics system works with actual data and temporary data.

The actual data lives either in system memory or in video card memory, so locking it is faster when it is in system memory; data in video card memory has to be retrieved before it can be locked. The bright side is that data in video card memory doesn't have to be uploaded to the graphics card as long as it isn't locked.

The reason there's no performance hit is that, if you don't change the data each frame, the data uploaded from system memory to the graphics card is kept in temporary video card memory. When you have a lot of resources, that temporary video card memory fills up and performance drops.

I hope you can follow my story, but I have no actual proof. It's just speculation. [smile]
ZzZZ.Z..ZzzZz....Z.. Good night!

vcGamer    100
Well, it depends a lot on your video memory, and your card's chip can play a great role here. But I think the vertex count you used is a long way from the limit set by cards (I mean tens of thousands of vertices), so you can't tell the results apart. Also remember that FPS is not an integer, so you may be missing the exact values. Another point about FPS is that it is not a linear measure of speed (in fact it is highly non-linear); try using frame time instead of FPS.
As for the difference between the default and system VBs, it does seem somewhat strange, since, as is widely mentioned in the Direct3D documentation, a default VB should be much faster. But again we come back to the point above, i.e. vertex count: the more vertices you have, the bigger the difference you'll see. Your usage flags also matter; sometimes specifying the write-only usage flag (D3DUSAGE_WRITEONLY) can hugely improve performance.
One final note: always try to use the defaults, as non-default settings generally don't give predictable results.
check out web site: galaxyroad.com

jollyjeffers    1570
Benchmarking Direct3D applications is a risky sport. Period.

You need a very sound understanding of the mechanics of a theoretical Direct3D application, a reasonable knowledge of the hardware and an exceptional knowledge of the drivers to get a meaningful result.

Put simply, there are so many ways that the results can be voided, not least because they just aren't representative of real-world performance.

Have you tried running with the debug runtimes and maximum output? You might well get some hints regarding performance from there. I can't remember the exact mappings, but some operations on resources are quicker depending on what pool they're in. It might well be that your "benchmark" has a natural bias in one direction.

My apologies for ruining the party, but you probably won't get anything particularly useful from this sort of test. What you really need to do is integrate some profiling code into a "real world" game/application. Even then you need to be careful about how you generate performance statistics, as there are so many possible ways to get skewed results.

hth
Jack
