Performance problem with dynamic vertex buffer

11 comments, last by juhaszt 21 years, 2 months ago
I have created a dynamic vertex buffer with a size of 1MB:

dev->CreateVertexBuffer(sizeVB, D3DUSAGE_DYNAMIC, D3DFVF_MYVERTEX, D3DPOOL_DEFAULT, &pVBuffer);

I fill it with data using memcpy(). Filling it takes about 7 ms on average, which works out to a transfer rate of about 140MB/sec. I think that is far too slow, since my computer runs AGP at 2X speed. How can memcpy() be so slow when filling a dynamic vertex buffer? (My configuration: 360MHz Celeron, GeForce2 MX.)
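For reference, the throughput figure can be checked with a few lines of C++. This is only a sketch of the arithmetic; the function name is made up for illustration, and the result depends on whether "1MB" is taken as 10^6 or 2^20 bytes.

```cpp
// Throughput in MB/s (1 MB = 1,000,000 bytes) for `bytes` transferred
// in `milliseconds`. Hypothetical helper, not part of any Direct3D API.
double throughput_mb_per_s(double bytes, double milliseconds) {
    const double megabytes = bytes / 1.0e6;
    const double seconds = milliseconds / 1000.0;
    return megabytes / seconds;
}
```

With a 7 ms fill time this gives roughly 143 MB/s for 1,000,000 bytes and roughly 150 MB/s for 1,048,576 bytes, so "about 140MB/sec" is in the right range either way.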
programmer
I haven't done any tests on the actual performance increase, BUT IIRC the debug spew will probably recommend that you also create the VB with D3DUSAGE_WRITEONLY.

That may increase performance.

Neil


WHATCHA GONNA DO WHEN THE LARGEST ARMS IN THE WORLD RUN WILD ON YOU?!?!
I just read something today from Michael Abrash; he stated that memcpy() was too slow. After writing his own function he doubled the speed.
Thank you all :)
Don't copy into the vertex buffer if it's not absolutely necessary. Write the data directly into the buffer instead.
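A rough sketch of the difference between the two approaches, in plain C++ so it compiles without the Direct3D headers. Here `dst` stands in for the pointer you get back from locking the vertex buffer, and the `Vertex` layout and both function names are made up for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical vertex layout for illustration; not the poster's D3DFVF_MYVERTEX.
struct Vertex {
    float x, y, z;
};

// Copying approach: build the vertices in a temporary array, then memcpy
// them into the destination. Every vertex is written twice and read once.
void fill_via_copy(Vertex* dst, std::size_t count) {
    std::vector<Vertex> staging(count);
    for (std::size_t i = 0; i < count; ++i) {
        staging[i] = Vertex{static_cast<float>(i), 0.0f, 0.0f};
    }
    std::memcpy(dst, staging.data(), count * sizeof(Vertex));
}

// Direct approach: generate each vertex straight into the destination.
// Writes are sequential and the destination is never read back.
void fill_direct(Vertex* dst, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[i].x = static_cast<float>(i);
        dst[i].y = 0.0f;
        dst[i].z = 0.0f;
    }
}
```

Both functions leave the same data in `dst`; the direct version just skips the intermediate array and the extra pass over memory.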

And make ABSOLUTELY SURE that you are using D3DUSAGE_WRITEONLY when creating the buffer and D3DLOCK_DISCARD or D3DLOCK_NOOVERWRITE when locking it. If not, you'll have severe performance problems.

(I can see now that D3DUSAGE_WRITEONLY is missing... yikes.)

[Insert cool signature here]
Out of curiosity, what kind of throughput do you get writing to a sysmem VB?

Ditto on what CamelFly said: Write your vertices directly to the VB if at all possible.

FWIW, I've noticed no difference in write performance using D3DUSAGE_WRITEONLY, but that's not to say it couldn't make a difference on other hardware/drivers. (If anything, it might motivate the driver to put the VB in AGP mem vs. local mem, where you would have faster writes. But with D3DUSAGE_DYNAMIC I doubt it would put them in local mem in the first place.)

You should still definitely be using it.

I have tried D3DUSAGE_WRITEONLY too, but with that I got the same throughput!!!
programmer
There is no guarantee that writeonly will increase performance, just a warning that not setting it might kill it :-)

[Insert cool signature here]
I think you are getting a little confused about AGP memory vs. the AGP bus... AGP memory IS system memory (organized into a large block by a technique called GART); the graphics card can access this memory via the AGP bus (at fairly high speed).

You are using a Celeron at 360MHz, which has a 66MHz FSB (and I would assume <= PC100 RAM?), so your memory-to-memory copies are not going to be super fast (sorry). The roughly 140-150MB/s you measured is fairly respectable, given that you are doing a copy (one read, one write) plus all of the OS overhead. memcpy can be optimized a lot (by removing safeguards and making assumptions), but that is another topic; it is the fastest non-asm option you have (short of using someone else's library).

As someone mentioned, do not copy to the vertex buffer unless you need to (and lock only the part of the VB you need, not 0 to maxSize). Make sure you create it with D3DUSAGE_WRITEONLY and lock with D3DLOCK_DISCARD or D3DLOCK_NOOVERWRITE so that you do not have to wait on Lock/Unlock.
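One common way to combine partial locks with the two lock flags is to walk through the buffer in batches: lock only the next free sub-range with no-overwrite, and when the buffer is full, wrap to the start and lock with discard. The sketch below only models that bookkeeping in plain C++. `LockMode`, `LockRange` and `DynamicBufferCursor` are hypothetical names, and the enum values merely stand in for D3DLOCK_NOOVERWRITE and D3DLOCK_DISCARD so it builds without the Direct3D headers.

```cpp
#include <cstddef>

// Hypothetical stand-ins for D3DLOCK_NOOVERWRITE and D3DLOCK_DISCARD.
enum class LockMode { NoOverwrite, Discard };

// Describes the sub-range of the buffer to lock for one batch.
struct LockRange {
    std::size_t offset_bytes;
    std::size_t size_bytes;
    LockMode mode;
};

// Hands out consecutive sub-ranges of a dynamic vertex buffer.
class DynamicBufferCursor {
public:
    explicit DynamicBufferCursor(std::size_t capacity_bytes)
        : capacity_(capacity_bytes), next_(0) {}

    // Returns the range to lock for a batch of `size_bytes`.
    // Assumes size_bytes <= capacity.
    LockRange allocate(std::size_t size_bytes) {
        LockMode mode = LockMode::NoOverwrite;
        if (next_ + size_bytes > capacity_) {
            // Out of room: restart at offset 0 and request a discard lock,
            // so earlier contents do not have to be preserved.
            next_ = 0;
            mode = LockMode::Discard;
        }
        LockRange range{next_, size_bytes, mode};
        next_ += size_bytes;
        return range;
    }

private:
    std::size_t capacity_;
    std::size_t next_;
};
```

In real code the returned offset, size and mode would be passed on to the vertex buffer's Lock call, and the vertices for that batch written into the returned pointer before unlocking and drawing.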


Cheers,~Entz-=-=-=-=-=-=-=-=-=-=-=-=-=-http:www.leviathan3d.com (under construction)
Yes, I forgot to say that I use D3DLOCK_DISCARD when locking the vertex buffer.

Another question:

Will the dynamic vertex buffer be created in AGP memory rather than in the video card's memory? If so, rendering from a dynamic vertex buffer will be slower than rendering from a static vertex buffer, which is created in the video card's memory. Right?

Thanks for your answers anyway!!



programmer
Correct, a vertex buffer stored in video memory will be substantially faster to render from, but remember that such VBs cannot be written to or read from without huge performance penalties (as the entire VB would have to be copied to and from AGP memory).

One thing I forgot to mention is that 1MB is a HUGE amount of data (almost 28000 vertices with my FVF). So again, the 7ms is actually pretty good. The optimal dynamic vertex buffer size is around 1000-2000 vertices, depending on your card.
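To put those numbers side by side, here is a small sketch of the vertex-count arithmetic. The 36-byte `ExampleVertex` layout is purely an assumption (the actual FVF is not given in the thread); it is used only because it happens to come out close to the quoted count when 1MB is taken as 1,000,000 bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 36-byte layout: position, normal, packed diffuse colour and
// one set of texture coordinates. An assumption, not the poster's actual FVF.
struct ExampleVertex {
    float position[3];      // 12 bytes
    float normal[3];        // 12 bytes
    std::uint32_t diffuse;  //  4 bytes
    float uv[2];            //  8 bytes
};

static_assert(sizeof(ExampleVertex) == 36, "layout assumed to be tightly packed");

// Number of whole vertices of `vertex_bytes` each that fit in `buffer_bytes`.
constexpr std::size_t vertices_that_fit(std::size_t buffer_bytes,
                                        std::size_t vertex_bytes) {
    return buffer_bytes / vertex_bytes;
}
```

With this layout, 1,000,000 bytes hold 27777 vertices, whereas a 2000-vertex buffer needs only 72,000 bytes, which gives a feel for how far 1MB is from the suggested size.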

Again, with a VB of that size, do not lock the entire buffer; it will slow you down a lot. Do you need that many vertices? (Just curious)

[edited by - Entz on January 21, 2003 9:34:19 PM]
Cheers,~Entz-=-=-=-=-=-=-=-=-=-=-=-=-=-http:www.leviathan3d.com (under construction)
