lwi

glTexImage2D performance issues (long post)

Hello... this thread is a follow-up to my original thread: http://www.gamedev.net/community/forums/topic.asp?topic_id=285014

To recapitulate: I am making a video player. I obtain my source images from an in-house library that provides me with a separate array for each color component, i.e.:

class myImgStorage { float** pixels; } // actually more complex than this, but you get the picture.

To display those images, I use multitexturing: I create one texture for the red component, one for the green, and one for the blue, add them up, and display them on a quad. This works, but it's slow. Here is a bit of code to explain my problem:

// texture loader ********************************************
void createTexture( GLuint textureArray[], const GLvoid *pixels, GLsizei width, GLsizei height, GLenum format, int textureID)
{
    glGenTextures(1, &textureArray[textureID]);
    glBindTexture(GL_TEXTURE_2D, textureArray[textureID]);

    // create a blank 2^n x 2^m texture
    glTexImage2D( GL_TEXTURE_2D, 0, GL_RGB, 1024, 1024, 0, GL_RGB, GL_FLOAT, NULL );

    // load the actual pixels. The image size is most likely not powers of 2
    glTexSubImage2D( GL_TEXTURE_2D, 0, 0,0, width, height, format, GL_FLOAT, pixels );

    glTexParameterf( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST );
    glTexParameterf( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST );
}

// paint method **********************************************
void paintGL()
{
    // creating the textures from each color component array
    // contained in (*imgStorage). Index 0 is red, 1 is green,
    // 2 is blue.
    // (*imgStorage)[i].getArray()[i] gets the pixel intensities
    // for the color component.
    createTexture(texture, (*imgStorage)[0].getArray()[0], (*imgStorage).getImageWidth(), (*imgStorage).getImageHeight(), GL_RED, 0);
    createTexture(texture, (*imgStorage)[1].getArray()[0], (*imgStorage).getImageWidth(), (*imgStorage).getImageHeight(), GL_RED, 1);
    createTexture(texture, (*imgStorage)[2].getArray()[0], (*imgStorage).getImageWidth(), (*imgStorage).getImageHeight(), GL_RED, 2);

    glClear (GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); // Clear Screen And Depth Buffer

    // texture unit 0
    glActiveTextureARB(GL_TEXTURE0_ARB);
    glEnable(GL_TEXTURE_2D);
    glBindTexture(GL_TEXTURE_2D, texture[0]);
    glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_ARB);
    glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB_ARB, GL_REPLACE);
    glTexEnvf(GL_TEXTURE_ENV, GL_SOURCE0_RGB_ARB, GL_TEXTURE);
    glTexEnvf(GL_TEXTURE_ENV, GL_OPERAND0_RGB_ARB, GL_SRC_COLOR);

    // texture unit 1
    glActiveTextureARB(GL_TEXTURE1_ARB);
    glEnable(GL_TEXTURE_2D);
    glBindTexture(GL_TEXTURE_2D, texture[1]);
    glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_ARB);
    glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB_ARB, GL_ADD);
    glTexEnvf(GL_TEXTURE_ENV, GL_SOURCE0_RGB_ARB, GL_TEXTURE1_ARB);
    glTexEnvf(GL_TEXTURE_ENV, GL_OPERAND0_RGB_ARB, GL_SRC_COLOR);
    glTexEnvf(GL_TEXTURE_ENV, GL_SOURCE1_RGB_ARB, GL_TEXTURE0_ARB);
    glTexEnvf(GL_TEXTURE_ENV, GL_OPERAND1_RGB_ARB, GL_SRC_COLOR);

    // texture unit 2
    glActiveTextureARB(GL_TEXTURE2_ARB);
    glEnable(GL_TEXTURE_2D);
    glBindTexture(GL_TEXTURE_2D, texture[2]);
    glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_ARB);
    glTexEnvi(GL_TEXTURE_ENV, GL_COMBINE_RGB_ARB, GL_ADD);
    glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE0_RGB_ARB, GL_PREVIOUS_ARB);
    glTexEnvi(GL_TEXTURE_ENV, GL_OPERAND0_RGB_ARB, GL_SRC_COLOR);
    glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE1_RGB_ARB, GL_TEXTURE);
    glTexEnvi(GL_TEXTURE_ENV, GL_OPERAND1_RGB_ARB, GL_SRC_COLOR);

    glLoadIdentity(); // Reset The Modelview Matrix
    glTranslatef(-0.0f,0.0f,-1.1f);

    float quadWidth = 720.0f / 1024.0f;
    float quadHeight = 480.0f / 1024.0f;
    float aspectRatio = 480.0f / 720.0f;

    glBegin(GL_QUADS); // Begin Drawing The Background (One Quad)
        //glTexCoord2f(quadWidth, 0.0f);
        glMultiTexCoord2fARB(GL_TEXTURE0_ARB, quadWidth, 0.0f);
        glMultiTexCoord2fARB(GL_TEXTURE1_ARB, quadWidth, 0.0f);
        glMultiTexCoord2fARB(GL_TEXTURE2_ARB, quadWidth, 0.0f);
        glVertex3f( 1.0f, aspectRatio, 0.1f);

        //glTexCoord2f(0.0f, 0.0f);
        glMultiTexCoord2fARB(GL_TEXTURE0_ARB, 0.0f, 0.0f);
        glMultiTexCoord2fARB(GL_TEXTURE1_ARB, 0.0f, 0.0f);
        glMultiTexCoord2fARB(GL_TEXTURE2_ARB, 0.0f, 0.0f);
        glVertex3f(-1.0f, aspectRatio, 0.1f);

        //glTexCoord2f(0.0f, quadHeight);
        glMultiTexCoord2fARB(GL_TEXTURE0_ARB, 0.0f, quadHeight);
        glMultiTexCoord2fARB(GL_TEXTURE1_ARB, 0.0f, quadHeight);
        glMultiTexCoord2fARB(GL_TEXTURE2_ARB, 0.0f, quadHeight);
        glVertex3f(-1.0f,-aspectRatio, 0.1f);

        //glTexCoord2f(quadWidth, quadHeight);
        glMultiTexCoord2fARB(GL_TEXTURE0_ARB, quadWidth, quadHeight);
        glMultiTexCoord2fARB(GL_TEXTURE1_ARB, quadWidth, quadHeight);
        glMultiTexCoord2fARB(GL_TEXTURE2_ARB, quadWidth, quadHeight);
        glVertex3f( 1.0f,-aspectRatio, 0.1f);
    glEnd(); // Done Drawing The Background

    glFlush (); // Flush The GL Rendering Pipeline
}
/******************************************************/

The problem is the texture loading in createTexture(). Since I am dealing with video, and therefore hundreds of images, I must load each of these images sequentially as textures with glTex(Sub)Image2D. This takes time: about 0.05 seconds per call to createTexture(). Yet I see no alternative. Also, note that I am dealing with floats for my image data.

Does anyone have any suggestions to make this more efficient?

Thanks in advance,
Lwi
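A side note on the padding used above: the post hard-codes a 1024x1024 texture for a 720x480 image. Assuming the hardware only needs each dimension to be a power of two (as the post implies), the height could be padded to 512 rather than 1024. The sketch below computes the padded size and the matching texture coordinates; the names nextPowerOfTwo, PaddedTexture and padToPowerOfTwo are made up for illustration and are not from the thread.

```cpp
#include <cassert>

// Smallest power of two that is >= n. Hypothetical helper, not from the thread.
unsigned nextPowerOfTwo(unsigned n) {
    assert(n > 0 && n <= (1u << 31));
    unsigned p = 1;
    while (p < n) p <<= 1;
    return p;
}

// Padded texture size plus the texture coordinates of the image's far corner.
struct PaddedTexture {
    unsigned texWidth;
    unsigned texHeight;
    float maxS; // plays the role of quadWidth in the post
    float maxT; // plays the role of quadHeight in the post
};

PaddedTexture padToPowerOfTwo(unsigned imageWidth, unsigned imageHeight) {
    PaddedTexture t;
    t.texWidth = nextPowerOfTwo(imageWidth);
    t.texHeight = nextPowerOfTwo(imageHeight);
    t.maxS = static_cast<float>(imageWidth) / static_cast<float>(t.texWidth);
    t.maxT = static_cast<float>(imageHeight) / static_cast<float>(t.texHeight);
    return t;
}
```

For 720x480 this gives a 1024x512 texture, half the texels of 1024x1024. It would not change the amount of data sent by glTexSubImage2D, which depends only on the image size.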

Don't keep recreating the texture. Create the textures once on startup, store the handles, and then call glTexSubImage2D() on them each frame instead. This should be MUCH faster.

As a side note, when posting large amounts of code it's best to use the [ source ][ /source ] tags. And thanks for only posting the relevant bits; you've no idea how annoying it is when someone posts ALL the program [headshake]

But I have to, since there are too many of them. A one-minute video at 30fps gives 1800 textures, and I am working with 720x480 images. They can't possibly all fit in the video card's memory.

And there is probably a set limit on how many texture handles you can create with glGenTextures and glBindTexture.

If you tell me that there isn't a limit, I'll be very happy!

Don't create a new texture each time. Create the three textures just once, in an init function, and then each frame just use glTexSubImage2D to update the texture.

Also try to find out where your bottleneck is. What happens if you remove the createTexture calls, so you're doing all the drawing but not actually recreating the textures? Similarly, what happens if you update the textures but don't actually do any drawing?

Enigma
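One way to run the experiment suggested above is to time each stage separately. Below is a minimal sketch of a timing helper using std::chrono; the name averageSeconds is made up for illustration. Because OpenGL may queue commands and return before they finish, the timed work would normally end with glFinish() so that the measurement includes the actual upload or draw.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` `repeats` times and returns the average wall-clock seconds per run.
// Hypothetical helper, not from the thread.
template <typename Work>
double averageSeconds(Work work, int repeats) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const auto end = std::chrono::steady_clock::now();
    const std::chrono::duration<double> elapsed = end - start;
    return elapsed.count() / repeats;
}

// Possible usage inside the player (updateTextures and drawQuad stand for the
// poster's own code and are assumptions here):
//   double upload = averageSeconds([] { updateTextures(); glFinish(); }, 100);
//   double draw   = averageSeconds([] { drawQuad();       glFinish(); }, 100);
```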

Quote:
Original post by lwi
But I have to, since there are too many of them. A one-minute video at 30fps gives 1800 textures, and I am working with 720x480 images. They can't possibly all fit in the video card's memory.

And there is probably a set limit on how many texture handles you can create with glGenTextures and glBindTexture.

If you tell me that there isn't a limit, I'll be very happy!


glTexSubImage2D will replace the contents of the existing texture, so it won't use up more memory.

Enigma
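To put rough numbers on the memory question, here is a small arithmetic sketch. It assumes 3 float planes of 4 bytes per sample on the CPU side, and, purely as an assumption, 4 bytes per texel for the three 1024x1024 textures (the real texel size depends on the internal format the driver picks). The function name frameBytes is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for `planes` images of width x height samples,
// each sample taking bytesPerSample bytes. Hypothetical helper.
std::uint64_t frameBytes(std::uint64_t width, std::uint64_t height,
                         std::uint64_t planes, std::uint64_t bytesPerSample) {
    assert(width > 0 && height > 0 && planes > 0 && bytesPerSample > 0);
    return width * height * planes * bytesPerSample;
}

// One 720x480 frame as three float planes:
//   frameBytes(720, 480, 3, 4)        = 4,147,200 bytes (about 4 MB)
// One minute at 30 fps, every frame kept as its own textures:
//   frameBytes(720, 480, 3, 4) * 1800 = 7,464,960,000 bytes (about 7 GB)
// Three reused 1024x1024 textures, assuming 4 bytes per texel:
//   frameBytes(1024, 1024, 3, 4)      = 12,582,912 bytes (12 MiB)
```

So keeping every frame resident is out of the question, while three reused textures stay at a fixed, small footprint no matter how long the video is.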

The bottleneck is the call to glTexSubImage2D in createTexture().

In createTexture(), I first call glTexImage2D to create a power-of-2-sized texture, then fill it with a call to glTexSubImage2D.

The time for each call is as follows:
glTexImage2D: 0.00578582 seconds
glTexSubImage2D: 0.0428587 seconds

... it varies from frame to frame but this is more or less the average.

You can clearly see where the bottleneck is.


If I created all the textures in an init() function, would it exceed the memory of the video card and produce an error, or would it be somehow managed transparently by some unknown (to me) process?

Lwi

Just as Enigma said, glTexImage2D is a very, very expensive operation, as you are telling the driver to recreate and set up an entirely new texture. glTexSubImage2D just replaces pixel data and nothing else, which under a normal OpenGL implementation is much faster. Also, OpenGL is a state machine, so a state that has been set will not change until you explicitly change it again. It therefore seems to me you are making many redundant calls to glEnable and glTexEnvf, unless you change the texture target and the operations of those texture units somewhere else in your program. Try doing them once only.


About your question on how many textures glGenTextures can handle: what it returns are ID numbers for textures, and they are given back as GLuint, so assuming that is a 32-bit int you can have a few billion. But in your original code you seem to create 3 new textures per frame without ever deleting them, so I assume you will run out of textures that way after a while unless you delete them (imagine a user leaving your player on with a movie looping while he goes to work, or something similar).


Creating your textures in an init function and then replacing the pixel data within your paint loop should increase performance.

Yes, I had already noticed I didn't delete my textures, and indeed the app was slowing down after a while, but I rectified that problem. Either way, this is not the main issue.

You keep telling me that glTexImage2D is costly, and that glTexSubImage2D is faster, but as I have shown, it is glTexSubImage2D that takes the most time, not glTexImage2D, since glTexImage2D is only used to create a blank power-of-2-sized texture.

Of course I realize creating all the textures for the whole movie in the initialisation would speed up the rendering, but doesn't the very nature of videos (having thousands of potentially large images) prohibit me from doing this?

Lwi

Quote:
Original post by lwi
The bottleneck is the call to glTexSubImage2D in createTexture().

In createTexture(), I first call glTexImage2D to create a power-of-2-sized texture, then fill it with a call to glTexSubImage2D.

The time for each call is as follows:
glTexImage2D: 0.00578582 seconds
glTexSubImage2D: 0.0428587 seconds

... it varies from frame to frame but this is more or less the average.

You can clearly see where the bottleneck is.


Possibly not true. For starters, your glTexImage2D() call doesn't upload any texture information, so it is going to appear quicker to execute. HOWEVER, there could be a hidden cost associated with creating a buffer with no texture data and then calling glTexSubImage2D() on it afterwards, which is slowing things down, and this is occurring every frame (I'm not certain, but I think drivers set up the state but don't reserve the memory until data is pushed into the buffer).

Quote:

If I created all the textures in an init() function, would it exceed the memory of the video card and produce an error, or would it be somehow managed transparently by some unknown (to me) process?


As I said in my first post, and as Enigma and todderod have echoed, you DON'T want to create 1000s of textures on startup; this would be foolish.
Instead you create just three textures and fill them with black (to ensure the data area is reserved), set up the texture params and store the handles to them.
Then in your main loop you call an UpdateTexture() function which performs the glTexSubImage2D() call and updates the pre-existing textures.

Do not call glGenTextures() in your main loop.
Do not call glTexImage2D() in your main loop.

Short of writing the code for you, I can't make it any clearer than that.
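For readers following along, the advice above can be sketched roughly as follows. This is an illustration, not code from any poster. The names initPlaneTextures, updatePlaneTextures, g_planes, kTexSize and the USE_REAL_OPENGL switch are made up. The stand-in GL declarations exist only so the snippet compiles without OpenGL headers and counts calls; a real program would include its platform's OpenGL header and need a current GL context. The GL_RGB internal format and GL_RED upload format are kept from the original post.

```cpp
#include <cassert>
#include <vector>

#ifdef USE_REAL_OPENGL   // hypothetical switch, not from the thread
#include <GL/gl.h>
#else
// Stand-ins so the sketch compiles without OpenGL; they only count calls.
typedef unsigned int GLuint;
typedef unsigned int GLenum;
typedef int GLint;
typedef int GLsizei;
typedef void GLvoid;
const GLenum GL_TEXTURE_2D = 0x0DE1, GL_RGB = 0x1907, GL_RED = 0x1903,
             GL_FLOAT = 0x1406, GL_NEAREST = 0x2600,
             GL_TEXTURE_MAG_FILTER = 0x2800, GL_TEXTURE_MIN_FILTER = 0x2801;
int g_texImageCalls = 0;
int g_texSubImageCalls = 0;
void glGenTextures(GLsizei n, GLuint* names) {
    static GLuint next = 1;
    for (GLsizei i = 0; i < n; ++i) names[i] = next++;
}
void glBindTexture(GLenum, GLuint) {}
void glTexParameteri(GLenum, GLenum, GLint) {}
void glTexImage2D(GLenum, GLint, GLint, GLsizei, GLsizei, GLint,
                  GLenum, GLenum, const GLvoid*) { ++g_texImageCalls; }
void glTexSubImage2D(GLenum, GLint, GLint, GLint, GLsizei, GLsizei,
                     GLenum, GLenum, const GLvoid*) { ++g_texSubImageCalls; }
#endif

const int kTexSize = 1024;  // padded size used in the original post
GLuint g_planes[3];         // red, green and blue plane textures

// Call once at startup: allocate three textures and fill them with black.
void initPlaneTextures() {
    std::vector<float> black(kTexSize * kTexSize, 0.0f);
    glGenTextures(3, g_planes);
    for (int i = 0; i < 3; ++i) {
        glBindTexture(GL_TEXTURE_2D, g_planes[i]);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, kTexSize, kTexSize, 0,
                     GL_RED, GL_FLOAT, black.data());
    }
}

// Call every frame: overwrite the pixels of the existing textures.
// No glGenTextures and no glTexImage2D here.
void updatePlaneTextures(const float* const planes[3], int width, int height) {
    assert(width > 0 && width <= kTexSize);
    assert(height > 0 && height <= kTexSize);
    for (int i = 0; i < 3; ++i) {
        glBindTexture(GL_TEXTURE_2D, g_planes[i]);
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                        GL_RED, GL_FLOAT, planes[i]);
    }
}
```

With this shape, the per-frame cost is three glTexSubImage2D uploads and nothing else, and the number of texture objects stays at three for the life of the program.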

Edit: Read phantom's post; it says the same thing but is easier to read.

The slowdown was most likely due to old textures (1024x1024 each, still residing in memory) causing your app to use more and more AGP memory and, in the worst case, virtual memory.

However, either I don't understand what you mean or I am just slow, but there is a difference between pixel/texel data and an actual texture: the same texture can in one frame contain the image of a cow, and in the next frame the SAME texture can display a rabbit. Yes, you point out that glTexSubImage2D takes longer for you than glTexImage2D, but in your use of glTexImage2D (you provide it with a NULL pointer) you are not filling the texture with any data, so indeed it takes less time to execute than sending down width*height*4 bytes of data for your image with glTexSubImage2D. But to me it seems all your calls to glTexImage2D are redundant. You can create 3 textures with glTexImage2D when you start your application, and then reuse these textures for every frame of video data you want to display; there is no need for any more textures than this. What you do in your paint function would then be to call glTexSubImage2D to replace the data within the texture with the data that you want to display this frame.

Oh! now I get it. Sorry for being a little dense, I am still rather newbiesque.

I'll reply back when it's done.

Thanks for your patience.

Lwi

Oooookk... it's done.

- I now create 3 textures in the init stage.
- I have an updateTexture method that fills the 3 existing textures with only calls to glBindTexture and glTexSubImage2D.
- My paintGL function is now of the form:

updateTextures;
multitexturing;
apply it on a quad;

The call to glTexSubImage2D still takes around 0.04 seconds. Darn diddly-arn.

Right now I'm guessing my real problem is the fact that I use float image data, and that my images are 720x480, which might be a bit much. Is it?

Lwi

Try changing your glTexSubImage2D calls to:
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height, format, GL_UNSIGNED_BYTE, pixels);

and time it. It's completely wrong and will give you completely useless output, but it'll highlight whether using floats is the main cause of the slowdown or not.

Enigma
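The test above only reinterprets the float bytes, which is why the picture comes out wrong. For comparison, a real conversion would have to be done on the CPU before the upload. The sketch below assumes the float samples lie in [0, 1] (the thread does not state the range) and also interleaves the three planes into one RGB byte buffer, so that a single GL_RGB / GL_UNSIGNED_BYTE upload could replace three float uploads. The names toByte and interleaveRGB are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Maps a float in [0, 1] to 0..255 with rounding. Out-of-range values are
// clamped; NaN maps to 0. The [0, 1] range is an assumption.
unsigned char toByte(float v) {
    if (!(v > 0.0f)) return 0;
    if (v >= 1.0f) return 255;
    return static_cast<unsigned char>(v * 255.0f + 0.5f);
}

// Builds one interleaved RGBRGB... byte buffer from three float planes.
std::vector<unsigned char> interleaveRGB(const float* r, const float* g,
                                         const float* b, std::size_t pixelCount) {
    assert(r != nullptr && g != nullptr && b != nullptr);
    std::vector<unsigned char> out(pixelCount * 3);
    for (std::size_t i = 0; i < pixelCount; ++i) {
        out[3 * i + 0] = toByte(r[i]);
        out[3 * i + 1] = toByte(g[i]);
        out[3 * i + 2] = toByte(b[i]);
    }
    return out;
}
```

Whether this is faster overall depends on how the CPU-side conversion cost compares with the upload cost, so it would need to be measured. Note also that tightly packed RGB byte rows are only 4-byte aligned when width * 3 is a multiple of 4 (true for 720); otherwise GL_UNPACK_ALIGNMENT would have to be set to 1 with glPixelStorei.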

Changing GL_FLOAT to GL_UNSIGNED_BYTE had no effect. However, as expected, changing the width and the height affected the performance a lot.

So I'm not really *there* yet.

[Edited by - lwi on December 1, 2004 12:38:57 PM]

Ok... I'm not getting any replies, so I'll throw out another idea:

What if I took a completely different approach and used glDrawPixels to display my images in a 2D context?

The last post in this thread seems to indicate it would be faster:
http://groups.google.ca/groups?hl=en&lr=&threadm=bhi9l2%24tus%2401%241%40news.t-online.com&rnum=3&prev=/groups%3Fq%3Dopengl%2Bvideo%2BglTexSubImage2D%26hl%3Den%26lr%3D%26selm%3Dbhi9l2%2524tus%252401%25241%2540news.t-online.com%26rnum%3D3

Any thoughts?

Perhaps use glDrawPixels, though glTexSubImage2D shouldn't be slow (unless you're doing it wrong). On my GeForce FX 5900 I can get ~300 million texels/second with RGB, and over 1 billion with packed pixels.
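To compare that figure with the timings reported earlier in the thread, the numbers can be converted into the same unit. This is only a rough sketch: the hardware, formats and drivers differ between the two posters, and the function names are made up for illustration.

```cpp
#include <cassert>

// Upload throughput in texels per second for one width x height upload
// that took `seconds`. Hypothetical helper.
double texelsPerSecond(int width, int height, double seconds) {
    assert(width > 0 && height > 0 && seconds > 0.0);
    return static_cast<double>(width) * height / seconds;
}

// Time one width x height upload would take at a given throughput.
double secondsPerUpload(int width, int height, double texelsPerSec) {
    assert(width > 0 && height > 0 && texelsPerSec > 0.0);
    return static_cast<double>(width) * height / texelsPerSec;
}

// Measured earlier in the thread: one 720x480 glTexSubImage2D call in 0.0428587 s
//   texelsPerSecond(720, 480, 0.0428587)  is about 8.06 million texels/s
// At the 300 million texels/s quoted above:
//   secondsPerUpload(720, 480, 300.0e6)   is about 0.00115 s per plane
```

The measured rate is roughly 37 times lower than the quoted one. That gap suggests the upload is going through a slow path, for example a format conversion in the driver, rather than hitting a limit on what a 720x480 upload can achieve.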
