Why is the bus a limiting factor when using immediate mode?

8 comments, last by Solias 16 years, 2 months ago
Hi, when you are not using any VBOs etc. (I think this is called immediate mode?), why is the bus a limiting factor? Take a GeForce 6600, which has a maximum bandwidth of 8.8 GB/s: rendering at 60 fps still gives ~146 MB per frame. Assuming you didn't keep anything in video memory, surely this is enough to transfer all the textures etc. that are required every frame? Much appreciated, NB.
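The per-frame figure in the question can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name is made up for illustration, and decimal units (1 GB = 1000 MB) are assumed.

```python
def per_frame_budget_mb(bandwidth_gb_per_s: float, fps: float) -> float:
    """Theoretical data budget per frame in MB (assumes 1 GB = 1000 MB)."""
    return bandwidth_gb_per_s * 1000.0 / fps


# 8.8 GB/s is the bus figure quoted in the question, 60 is the target frame rate.
budget = per_frame_budget_mb(8.8, 60)
print(f"{budget:.1f} MB per frame")
```

This is a best-case number: it assumes the bus is saturated for the whole frame, which the replies below explain does not happen in practice.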
It isn't the theoretical 8 GB/s from main memory to the video card that is the problem. The problem is that RAM has X bandwidth (much less than 8 GB/s), which the video card and the CPU have to share.
Quote:
The problem is that RAM has X bandwidth (much less than 8 GB/s), which the video card and the CPU have to share.

Which RAM are you referring to, the GPU's or the PC's? Is there a way I can find out the bandwidth of the RAM?

Thanks,
NB.
System RAM. If you have PC3200 DDR RAM, then your RAM has a bandwidth of 3200 MB/s, less than half the 8 GB/s bandwidth of the PCI-E bus. The bandwidth of the GPU's internal memory is usually much higher; some new cards have 50 GB/s internally.
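As a rough sketch of where the 3200 MB/s figure comes from: PC3200 is DDR-400, which performs 400 million transfers per second over a 64-bit (8-byte) module. The function name and the `channels` parameter below are illustrative, and the 8000 MB/s bus figure is simply the 8 GB/s quoted in this thread.

```python
def ddr_peak_bandwidth_mb_s(mega_transfers_per_s: int,
                            bus_width_bits: int = 64,
                            channels: int = 1) -> int:
    """Peak theoretical bandwidth in MB/s: transfers/s * bytes per transfer * channels."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_s * bytes_per_transfer * channels


pc3200 = ddr_peak_bandwidth_mb_s(400)   # DDR-400, single channel
pcie_bus_mb_s = 8000                    # 8 GB/s figure quoted in this thread

print(f"PC3200: {pc3200} MB/s")
print(f"Fraction of bus bandwidth: {pc3200 / pcie_bus_mb_s:.2f}")
```

These are peak numbers; sustained throughput is lower, and the CPU competes with the video card for the same memory.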
Quote:Original post by stonemetal
System RAM. If you have PC3200 DDR RAM, then your RAM has a bandwidth of 3200 MB/s, less than half the 8 GB/s bandwidth of the PCI-E bus. The bandwidth of the GPU's internal memory is usually much higher; some new cards have 50 GB/s internally.


Ah, that should explain it... How do they manage to get 50 GB/s?
They use really fast memory and a really wide bus. Currently ATI uses a 256-bit bus, while system RAM is usually 32-64 bit.
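To make the "fast memory, wide bus" point concrete, here is a small sketch of the usual peak-bandwidth formula (bus width in bytes times effective transfer rate). The function name is hypothetical, and the 1600 MT/s rate is an illustrative assumption chosen to land near 50 GB/s, not the specification of any particular card.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, effective_rate_mt_s: float) -> float:
    """Peak bandwidth in GB/s = (bus width in bytes) * (million transfers/s) / 1000."""
    return bus_width_bits / 8 * effective_rate_mt_s / 1000.0


# Assumed example: a 256-bit bus at 1600 MT/s effective (illustrative only).
gpu = memory_bandwidth_gb_s(256, 1600)
# System RAM: 64-bit bus at 400 MT/s (PC3200, single channel).
system = memory_bandwidth_gb_s(64, 400)

print(f"GPU memory: {gpu:.1f} GB/s")
print(f"System RAM: {system:.1f} GB/s")
```

The bus is four times wider and the assumed transfer rate is four times higher, which is where the roughly sixteenfold difference comes from.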
Wonderful. Thanks for all your help. :)
Even in OpenGL immediate mode, you're generally not transferring textures across the bus every frame, just geometry. The raw bandwidth isn't the bottleneck anyway; the problem is the inefficient way you're utilizing it (lots and lots of small transfers instead of a few big ones per frame). You're not likely to get anywhere near the actual maximum bandwidth of the bus or system RAM in immediate mode.

If you compare the basic immediate mode (IM) style with glBegin() and glEnd() to using vertex arrays (VA), both require transferring the geometry each frame, so the bandwidth requirement is technically the same, but VA can be a lot faster because you're transferring bigger chunks at a time. Even this probably can't use the entire memory bandwidth, since a game usually does other things besides transferring geometry. You'll find that VBOs become faster than VA at scene sizes far below the theoretical limit (about 53 MB per frame for PC3200 RAM at 60 fps), maybe around 10-20 MB.
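The "many small transfers versus a few big ones" point can be illustrated without OpenGL at all. The sketch below is only an analogy: it copies the same vertex data into a staging buffer either one vertex at a time or in a single bulk write. All names are made up, and Python's per-call overhead is not the same thing as driver overhead, so treat the timings as showing the general shape only.

```python
import struct
import time

VERTEX_FORMAT = "<3f"                          # x, y, z as 32-bit floats
VERTEX_SIZE = struct.calcsize(VERTEX_FORMAT)   # 12 bytes per vertex


def copy_per_vertex(vertices, staging):
    """Write each vertex separately: many small transfers."""
    for i, vertex in enumerate(vertices):
        struct.pack_into(VERTEX_FORMAT, staging, i * VERTEX_SIZE, *vertex)
    return len(vertices)   # number of separate writes


def copy_bulk(packed, staging):
    """Write the whole pre-packed array at once: one large transfer."""
    staging[:len(packed)] = packed
    return 1               # number of separate writes


vertices = [(float(i), float(i + 1), float(i + 2)) for i in range(100_000)]
packed = b"".join(struct.pack(VERTEX_FORMAT, *v) for v in vertices)

small = bytearray(len(packed))
bulk = bytearray(len(packed))

start = time.perf_counter()
small_writes = copy_per_vertex(vertices, small)
small_time = time.perf_counter() - start

start = time.perf_counter()
bulk_writes = copy_bulk(packed, bulk)
bulk_time = time.perf_counter() - start

print(f"per-vertex: {small_writes} writes, {small_time * 1000:.2f} ms")
print(f"bulk:       {bulk_writes} write, {bulk_time * 1000:.2f} ms")
```

Both paths move exactly the same number of bytes; only the number of separate operations differs.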
Quote:Original post by stonemetal
They use really fast memory and a really wide bus. Currently ATI uses a 256-bit bus, while system RAM is usually 32-64 bit.


64-128 bits on modern systems.

Quote:Original post by stonemetal
System RAM. If you have PC3200 DDR RAM, then your RAM has a bandwidth of 3200 MB/s, less than half the 8 GB/s bandwidth of the PCI-E bus.


6400 MB/s with dual channel.
Quote:Original post by Fingers_
If you compare the basic IM style with glBegin() and glEnd() to using Vertex Arrays, both require transferring the geometry each frame so the bandwidth requirement is technically the same, but VA can be a lot faster because you're transferring bigger chunks at a time.


I would expect the driver to batch up multiple primitives before sending them, at the very least if they are in the same begin/end block. Another cost of IM is the CPU cost of all those function calls: one for each vertex attribute of each vertex. Each of those functions just copies a few bytes of data. Do that in a loop for a large model with multiple attributes and there will be some overhead. There is also some extra validation overhead with IM.
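A back-of-the-envelope count shows how quickly the per-vertex calls add up. This sketch assumes the classic fixed-function pattern: in immediate mode, one glBegin, one call per attribute per vertex (for example glNormal3f, glTexCoord2f, glVertex3f), and one glEnd; with vertex arrays, one enable call and one pointer call per attribute plus a single draw call. The function names are hypothetical and the model is deliberately simplified.

```python
def immediate_mode_calls(vertex_count: int, attributes_per_vertex: int) -> int:
    """glBegin + one call per attribute per vertex + glEnd."""
    return 2 + vertex_count * attributes_per_vertex


def vertex_array_calls(attributes_per_vertex: int) -> int:
    """Per attribute: one enable call and one pointer call; then one draw call."""
    return 2 * attributes_per_vertex + 1


# Example: a 10,000-vertex model with position, normal and texture coordinate.
im_calls = immediate_mode_calls(10_000, 3)
va_calls = vertex_array_calls(3)

print(f"immediate mode: {im_calls} calls per frame")
print(f"vertex arrays:  {va_calls} calls per frame")
```

The immediate-mode count grows with the vertex count, while the vertex-array count depends only on the number of attributes.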
