Performance of VBO on Nvidia hardware

This topic is 3666 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I am trying to move from unsigned int to unsigned short for my IBO values. I have been told that shorts will give me 2x the performance over ints. I can see they use 50% less memory, but so far I haven't seen any performance increase, although from what I can tell I am shader- or texture-bound anyway. What I have read on Nvidia's page is that using unsigned int is OK, but if you use glDrawRangeElements() the driver will convert the 32-bit indices to 16-bit if the range is small enough. Can anyone verify this? If that is the case, can I assume I am getting the best performance possible without recoding my renderer to use a 16-bit IBO, as long as my index range fits in 16 bits? Thanks

It's not going to give you 2x the performance. That's ridiculous.

If the Nvidia doc says it can convert 32-bit to 16-bit, then that's what they do. Which document says that?

"glDrawRangeElements Instead of glDrawElements

Using range elements is more efficient for two reasons:
- If the specified range can fit into a 16-bit integer, the driver can optimize the format of indices to pass to the GPU. It can turn a 32-bit integer format into a 16-bit integer format. In this case, there is a gain of 2×.
- The range is precious information for the VBO manager, which can use it to optimize its internal memory configuration."

That's quoted from Nvidia's "Using Vertex Buffer Objects" whitepaper (using vertex buffer objects.pdf); look it up on their site.

It's a 4-year-old whitepaper, so things could have changed since then. However, when I ran a few benchmarks some weeks ago, 16-bit indices were 5-10% faster than 32-bit indices on an NVIDIA 7800 GT on PCI Express x16 (on ATI the performance seems the same). I have never seen twice the performance. I guess they meant that the index data can be transferred twice as fast, and maybe that was relevant on AGP in 2003. I think today we're bottlenecked somewhere else long before that becomes relevant.

It probably converts them only in specific cases, for example if the indices are in RAM and the upper 16 bits are all zeroes, so the driver can send the card 16-bit indices.
It's not an optimization that I would take seriously, and as ndhb said, the document is 4 years old.

