
Data throughput


Hey all,

Just looking for some guesstimates of how many vertices one can safely update per frame in a dynamic buffer.

 

I have done some testing and the limit seems lower than I thought it would be.

 

For reference, I am uploading roughly 5MB of data at about 20 bytes per vertex.

So I can push through about 250,000 vertices per frame, and that seems to be the limit on my machine.

 

To be honest, that should probably be enough for what I need. I just wondered if that number seemed on the low side to anyone else? I thought double that should have been easily achievable.

 

Cheers

 

(on a low-middish range gpu if that helps)


(on a low-middish range gpu if that helps)

 

This depends on many factors: CPU, RAM, graphics card (what counts as a low-middish range GPU?), bandwidth (PCI Express), caching, how the data is stored, read and sent...

Also, what API are you using? DirectX? OpenGL? Vulkan?

If you're on OpenGL, what kind of buffer and which calls are you using?

What framerate are you expecting? What other tasks are you doing?

 

As far as I know, graphics cards are still not designed for that kind of heavy per-frame updating. When VBOs were introduced in OpenGL, there was absolutely no difference between static, dynamic and stream buffers. I haven't tested this in a while, but the last comments I read here and there suggest that graphics cards are still designed to render data that stays in their memory long term.


Thanks Silence!

Yes, there are many factors at play here, but I was just hoping for a "sounds about right" or "sounds a bit low" kind of reply.

 

That said.

api: DirectX

gpu: Nvidia Quadro K2200

cpu: Intel Xeon E5-2630 @2.5GHz

ram: 32GB

 

extra info:

assume not much else is going on as I am pretty much just trying to find limits of different aspects of the renderer

 

Really only looking for rough, ballpark, anecdotal kinds of answers though. Just curious, and nothing major is resting on any answer either.


You're not even saying your framerate.

At 60 fps you're pushing 286MB/s. At 120 fps you're pushing 572MB/s.
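For anyone who wants to check those figures, here is a minimal sketch of the arithmetic, assuming the 250,000 vertices and 20 bytes per vertex quoted above. The function names are made up for illustration, and the MB values match only if "MB" is read as MiB (2^20 bytes).

```cpp
#include <cstdint>

// Bytes uploaded per second for a given per-frame vertex count.
// (Hypothetical helper, not part of any API.)
constexpr std::uint64_t uploadBytesPerSecond(std::uint64_t verticesPerFrame,
                                             std::uint64_t strideBytes,
                                             std::uint64_t framesPerSecond) {
    return verticesPerFrame * strideBytes * framesPerSecond;
}

// Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes).
constexpr double toMiB(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

With these inputs, 60 fps works out to 300,000,000 bytes per second, roughly 286 MiB/s, and 120 fps is twice that.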

"and that seems to be the limit"... what is your limit? 60fps? 120fps? 240fps? 30fps?

 

There are a lot of things that could go wrong. First, you would have to compare the framerate against having the buffers stored in host memory, to rule out other GPU bottlenecks.

Second, if you happen to read from that buffer by accident (i.e. the generated assembly reads the data back even if your C code doesn't), you'll hit severe perf. penalties due to write-combining memory.
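One way to stay clear of accidental read-backs is to build each vertex in an ordinary local variable and then store it to the mapped pointer in a single sequential write. The sketch below assumes a made-up 20-byte `Vertex` layout and a hypothetical `fillVertices` helper. In a real renderer `mapped` would be the pointer returned by the map call, and it is still worth checking the generated assembly.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical 20-byte vertex layout, for illustration only.
struct Vertex {
    float x, y, z;        // position, 12 bytes
    std::uint32_t color;  // packed RGBA, 4 bytes
    std::uint16_t u, v;   // packed texture coordinates, 4 bytes
};
static_assert(sizeof(Vertex) == 20, "sketch assumes a 20-byte vertex");

// Write-only fill: the mapped destination is only ever stored to.
inline void fillVertices(void* mapped, const Vertex* src, std::size_t count, float dx) {
    auto* dst = static_cast<unsigned char*>(mapped);
    for (std::size_t i = 0; i < count; ++i) {
        Vertex v = src[i];  // read from ordinary cached memory
        v.x += dx;          // modify the local copy, not the mapped memory
        std::memcpy(dst + i * sizeof(Vertex), &v, sizeof(Vertex));  // sequential store
    }
}
```

The pattern to avoid is something like `mappedVerts[i].x += dx;`, which reads from the mapped memory before writing it back.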

Third you could be hitting CPU limits (i.e. your CPU can't pull the data fast enough) and thus doing it in another thread could increase the framerate.

Fourth, you could be hitting the DISCARD limit (e.g. on AMD it is 4MB per frame), so you should use D3D11_MAP_WRITE_NO_OVERWRITE instead.
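The usual way to lean on NO_OVERWRITE is to treat the dynamic buffer as an append-only ring: keep writing at increasing offsets with NO_OVERWRITE, and only DISCARD when the buffer is full. Below is a rough, API-free sketch of that bookkeeping. `MapMode`, `Allocation` and `DynamicBufferCursor` are hypothetical names, and the enum values are only stand-ins for D3D11_MAP_WRITE_DISCARD and D3D11_MAP_WRITE_NO_OVERWRITE. It assumes each region is written once and not touched again until the next discard.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for D3D11_MAP_WRITE_DISCARD / D3D11_MAP_WRITE_NO_OVERWRITE.
enum class MapMode { Discard, NoOverwrite };

struct Allocation {
    std::size_t offset;  // byte offset to write at inside the buffer
    MapMode mode;        // which map flag the caller should use
};

// Tracks the write position inside one large dynamic buffer.
class DynamicBufferCursor {
public:
    explicit DynamicBufferCursor(std::size_t capacity)
        : capacity_(capacity), cursor_(capacity) {}  // start "full": first map discards

    Allocation allocate(std::size_t bytes) {
        assert(bytes > 0 && bytes <= capacity_);
        if (cursor_ + bytes > capacity_) {
            // No room left: restart at offset 0 and ask for a fresh buffer.
            cursor_ = bytes;
            return {0, MapMode::Discard};
        }
        const std::size_t offset = cursor_;
        cursor_ += bytes;  // append after the previously written data
        return {offset, MapMode::NoOverwrite};
    }

private:
    std::size_t capacity_;
    std::size_t cursor_;
};
```

Sizing the buffer to hold several frames of data means the DISCARD path is taken only once every few frames instead of on every upload.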

Fifth, you're not even describing your specs or how you're implementing the upload.

Sixth, the rest of your code also consumes RAM bandwidth. It's common to see RAM with 24-32GB/s of bandwidth nowadays. If you're doing something else or reading/writing your data more than once, you could be hitting that limit.

Seventh... PCI-E 3.0 x16 theoretical bandwidth is 15.75GB/s. It's safe to assume that in practice you should be able to reach 7GB/s if you do things right. That means pushing about 6 million vertices per frame at 60fps with 20 bytes per vertex (assuming your GPU & CPU can handle the rest; you may hit another bottleneck first). So: No. Your numbers don't look right (assuming your "limit" was 60fps).
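The 6 million figure comes from dividing the assumed practical bandwidth by the framerate and the vertex size. A small sketch of that calculation, with `vertexBudgetPerFrame` being a made-up helper name and the 7GB/s taken as 7e9 bytes per second:

```cpp
// Vertices that fit in one frame's share of the bus bandwidth.
// (Hypothetical helper; the inputs are the assumptions quoted in the post.)
constexpr double vertexBudgetPerFrame(double busBytesPerSecond,
                                      double framesPerSecond,
                                      double strideBytes) {
    return busBytesPerSecond / framesPerSecond / strideBytes;
}
```

`vertexBudgetPerFrame(7e9, 60.0, 20.0)` comes out at roughly 5.8 million vertices, so the 250,000 reported earlier in the thread is only about 4% of that budget.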

Edited by Matias Goldberg

Wow Matias very thorough reply!

Yes 60fps was indeed the target and the fact that you think any of those things could be a problem makes me think more could be achieved.
Some limitations could be out of my control but you gave me a great idea to test the theoretical limits by taking out all the variables I can.

So I will do 2 tests: one with all the data staying on the GPU as a baseline, and a second one uploading all the data each frame but doing no calculations on the CPU, i.e. not changing any of the data but just re-uploading it.
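A simple way to compare the two tests is to time the same number of frames for each and look at the average. The sketch below is only an illustration: `averageFrameMs` is a hypothetical helper, and the callback is assumed to render and present one full frame. CPU-side timing like this is only meaningful if vsync is off and the frame call actually waits for the GPU, otherwise it mostly measures how fast commands are queued.

```cpp
#include <chrono>
#include <functional>

// Average wall-clock milliseconds per call of `frame` over `frames` calls.
inline double averageFrameMs(const std::function<void()>& frame, int frames) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < frames; ++i) {
        frame();  // assumed to render (and optionally upload) one frame
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return frames > 0 ? elapsed.count() / frames : 0.0;
}
```

Running it once with a frame callback that skips the upload and once with one that re-uploads everything gives two averages, and the difference is an estimate of the upload cost per frame.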

That should give me some good indicators.

I will also use a GPU profiler to see what commands got issued, to help give me a better indication.

Thanks for the great reply!
