CPU - GPU transfer time


NycRunner    122
Hi, I'm trying to make a simple benchmark application in which I can measure the time needed to transfer data to and from the GPU. In my benchmark application I use the following code to send the data:
timer.startTimer();
glTexSubImage2D(GL_TEXTURE_RECTANGLE_ARB,0,0,0,texSize,texSize,
GL_RGBA,GL_FLOAT,data);
timer.stopTimer();
For reading the data back I use:
timer.startTimer();
glGetTexImage(GL_TEXTURE_RECTANGLE_ARB, 0, GL_RGBA, GL_FLOAT, data);
timer.stopTimer();
I've placed this code in a loop which iterates 10 times, so that I can calculate an average after the loop. Now, the strange part is that during the first iteration, sending and receiving take much longer than during the remaining iterations. I'm wondering what causes this. Does it have something to do with caching, maybe? If so, how can I prevent it so I can measure the real transfer times? I hope someone can clear this up for me. Thanks in advance.

mikeman    2942
Quote:
 timer.startTimer();
 glTexSubImage2D(GL_TEXTURE_RECTANGLE_ARB,0,0,0,texSize,texSize,
 GL_RGBA,GL_FLOAT,data);
 timer.stopTimer();

You're not measuring anything very useful this way. You think you're measuring how long GL takes to execute glTexSubImage2D, but remember that GL is not running on the CPU. It runs in parallel with your program, on the GPU. glTexSubImage2D() just "tells" OpenGL to put that command in the pipeline, then returns immediately without waiting for the command to execute. This is the correct way to time:

glFinish(); // Wait until all previous OpenGL operations are finished, so that we don't include them in the timing.
timer.startTimer();
glTexSubImage2D(...);
glFinish(); // Wait until glTexSubImage2D() is completed.
timer.stopTimer();
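Putting that advice together for both directions, one timed iteration might look roughly like this. This is only a sketch: `timer` is the original poster's timer class, `getElapsed()` is a hypothetical accessor on it, and glGetTexImage is one possible readback path (glReadPixels from an FBO-era setup being another):

```cpp
// Sketch: time upload and readback separately, bracketing each with
// glFinish() so earlier queued GPU work can't leak into the measurement.
glFinish();                            // drain any earlier GL work

timer.startTimer();
glTexSubImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, 0, 0, texSize, texSize,
                GL_RGBA, GL_FLOAT, data);
glFinish();                            // wait for the upload itself
timer.stopTimer();
double uploadTime = timer.getElapsed();   // hypothetical accessor

timer.startTimer();
glGetTexImage(GL_TEXTURE_RECTANGLE_ARB, 0, GL_RGBA, GL_FLOAT, data);
glFinish();                            // readback already blocks in practice,
                                       // but an explicit finish is cheap here
timer.stopTimer();
double readbackTime = timer.getElapsed();
```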

NycRunner    122
Yeah, I knew about that; I'd implemented my benchmark with glFinish() before, but then I read somewhere that glGetTexImage() and glReadPixels() force an implicit glFinish(), so I removed the explicit calls again.

iliak    278
My 2 cents... wouldn't a call to glFlush() be useful here too?

Code-R    136
Using common sense, glTexImage2D HAS to do the equivalent of a glFinish. Why? Because the data pointer could be delete[]-ed right after it's called! Although it's possible that the driver buffers it by copying the data to a buffer somewhere in local RAM, I doubt it'd do that. Could somebody enlighten us all here?

_the_phantom_    11250
glTexImage2D causes the driver to take a copy of the data; whether that data ends up in VRAM or in system RAM doesn't matter to the application.

There is no reason for ANY commands to be issued to the GPU when the function is called; the function could just copy the data to system RAM and associate it with the correct texture object. The upload doesn't have to occur until some later time.
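Because of this deferral, one common workaround for measuring the actual transfer (rather than the driver-side copy) is to force the upload by sampling from the texture before synchronizing. This is only a sketch of the idea, using fixed-function immediate mode to match the era of the thread; drivers are free to behave differently:

```cpp
// Sketch: make the driver commit the texture to the GPU by drawing a
// tiny quad that samples it, then waiting for that draw to finish.
glTexSubImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, 0, 0, texSize, texSize,
                GL_RGBA, GL_FLOAT, data);

glBegin(GL_QUADS);                    // minimal draw that samples the texture
glTexCoord2f(0.0f, 0.0f); glVertex2f(0.0f, 0.0f);
glTexCoord2f(1.0f, 0.0f); glVertex2f(1.0f, 0.0f);
glTexCoord2f(1.0f, 1.0f); glVertex2f(1.0f, 1.0f);
glTexCoord2f(0.0f, 1.0f); glVertex2f(0.0f, 1.0f);
glEnd();

glFinish();                           // upload and draw are now both complete
```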

Avalon    133
But what if it *does* matter to the application? Do you have any recommendation for timing actual transfers (Pre OpenGL 1.2)?

_the_phantom_    11250
Not using texture objects might well be enough to simulate it: upload and bind the texture right away.

Although there are still no guarantees; it's up to the driver what happens.

bpoint    464
If you have an nVidia card with the GL_NV_fence extension, then you can precisely check the amount of time a specific OpenGL call takes. I use it in my own stuff for timing GPU performance as well as CPU performance.

See the extension specification under "Can fences be used as a form of performance monitoring?" for some sample code too.
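For reference, that suggestion might look roughly like this. A sketch only, assuming the GL_NV_fence entry points have already been loaded (e.g. via an extension loader) and `timer` is the original poster's timer class:

```cpp
// Sketch: bracket the upload with an NV fence so we block only until
// the commands issued before the fence have completed.
GLuint fence;
glGenFencesNV(1, &fence);

timer.startTimer();
glTexSubImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, 0, 0, texSize, texSize,
                GL_RGBA, GL_FLOAT, data);
glSetFenceNV(fence, GL_ALL_COMPLETED_NV); // place a fence after the upload
glFinishFenceNV(fence);                   // block until the fence signals
timer.stopTimer();

glDeleteFencesNV(1, &fence);
```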

x86asm    122
I think you have to call glFlush(). That is the only way I got reasonable results from timing texture uploads.