EXT_PIXEL_BUFFER_OBJECT for Render to Texture

glJack    122
Here is how a PBO can be used to implement render-to-texture:
glBindBuffer(GL_PIXEL_PACK_BUFFER_EXT, m_iBuffer);
glReadPixels(0,0, 512, 512, GL_RGB, GL_UNSIGNED_BYTE, NULL);
glBindBuffer(GL_PIXEL_PACK_BUFFER_EXT, 0);

glBindTexture(GL_TEXTURE_2D, m_iTexture);
	
glBindBuffer(GL_PIXEL_UNPACK_BUFFER_EXT, m_iBuffer);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 512, 512, GL_RGB, GL_UNSIGNED_BYTE, NULL);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER_EXT, 0);
It works, but on a GeForce2 GTS it is only 3% faster than the following:
char buff[512][512][3];
glReadPixels(0,0, 512, 512, GL_RGB, GL_UNSIGNED_BYTE, buff);
glBindTexture(GL_TEXTURE_2D, m_iTexture);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 512, 512, GL_RGB, GL_UNSIGNED_BYTE, buff);
What's wrong? Why doesn't the PBO speed up the data transfer within the video card? Well, maybe the main intent of PBOs is to parallelize data exchange between the app and the graphics card, but even then it should also increase performance in this case, I think.
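
For reference, both snippets above move the same amount of pixel data per frame, and the PBO version assumes m_iBuffer already has storage of at least that size. A minimal sketch of the size calculation, assuming rows are padded to GL_PACK_ALIGNMENT (4 by default); pbo_size_bytes is a hypothetical helper name, not an OpenGL function:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of OpenGL): upper bound on the bytes
 * needed to hold a width x height readback when every row is padded to
 * a multiple of `alignment` bytes, which is how glReadPixels lays out
 * rows for a given GL_PACK_ALIGNMENT (1, 2, 4 or 8). */
static size_t pbo_size_bytes(size_t width, size_t height,
                             size_t bytes_per_pixel, size_t alignment)
{
    assert(alignment == 1 || alignment == 2 ||
           alignment == 4 || alignment == 8);

    size_t row = width * bytes_per_pixel;
    /* Round the row length up to the next multiple of the alignment. */
    size_t stride = (row + alignment - 1) / alignment * alignment;
    return stride * height;
}
```

For the 512 x 512, GL_RGB, GL_UNSIGNED_BYTE case, pbo_size_bytes(512, 512, 3, 4) gives 786432 bytes (768 KiB). That would be the size handed to glBufferData (or its ARB-suffixed equivalent) for the pack buffer, and it is also what travels to system memory and back each frame in the plain glReadPixels version.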

dimebolt    440
Quote:
Original post by glJack

char buff[512][512][3];
glReadPixels(0,0, 512, 512, GL_RGB, GL_UNSIGNED_BYTE, buff);
glBindTexture(GL_TEXTURE_2D, m_iTexture);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 512, 512, GL_RGB, GL_UNSIGNED_BYTE, buff);


If this is all you need, glCopyTexImage2D(...) will be a lot faster. Although it's supposed to be better on newer hardware, glReadPixels() is still dreadfully slow. You won't be able to modify or access the texture data in your app, though (because the copy happens within GPU memory).

Tom

glJack    122
Quote:

If this is all you need, glCopyTexImage2D(...) will be a lot faster.

I tried it, and it is twice as fast. Yes, it is faster and I'd use it.

But I just want to compare the performance of the 1st and 2nd cases. I think that in the first case data copying has to be faster than in the second case, because in the 1st case all copying operations happen within graphics card/driver memory, while in the 2nd case data is copied to system memory and read back from it.

The first piece of code, which uses a PBO, has to be nearly as fast as glCopyTexImage2D(...) is.

Why is there no difference in performance between these 2 approaches?
Maybe this is because I test them on a GeForce 2 GTS card. Maybe the GeForce 2 GTS doesn't have hardware support for this extension, and all PBO memory allocations/operations happen within system memory?

dimebolt    440
Quote:
Original post by glJack
But I just want to compare the performance of the 1st and 2nd cases. I think that in the first case data copying has to be faster than in the second case, because in the 1st case all copying operations happen within graphics card/driver memory, while in the 2nd case data is copied to system memory and read back from it.

The first piece of code, which uses a PBO, has to be nearly as fast as glCopyTexImage2D(...) is.


Do you have any evidence to support this claim?

Tom

Guest Anonymous Poster   
Quote:
Original post by dimebolt
Do you have any evidence to support this claim?

Tom


He thinks that it should be faster. If he is wrong, just explain to him why. Why does everybody have to show evidence for everything these days? This is not a political show.

dimebolt    440
Quote:

He thinks that it should be faster. If he is wrong, just explain to him why. Why does everybody have to show evidence for everything these days? This is not a political show.


If I knew why, I would tell him why. I do not know the mechanics behind GL_PIXEL_PACK_BUFFER_EXT (and I'm sure I'm not alone in that). From glJack's posts, I get the impression he is aware of these mechanics. All I meant was to ask for a link or pointer to a document that explains why it should work as he claims it should. English not being my first language, I probably used the wrong phrasing to convey this message.

Tom

glJack    122
Quote:

Do you have any evidence to support this claim?


No, I don't. If I did, I would post it here. As I already said,
I think that it should be faster, because all data transfer operations happen within graphics card memory, as in the case of glCopyTexImage2D, so the speed has to be comparable with that.

Maybe the PBO approach doesn't have to be as fast as glCopyTexImage2D, but it has to be more than just 3% faster than pure
glReadPixels(0,0, 512, 512, GL_RGB, GL_UNSIGNED_BYTE, buff);

The fact that speed difference is minimal made me think that GeForce 2 GTS
implements PBO ext in software, and keeps all PBO data in system memory...

Quote:

He thinks that it should be faster. If he is wrong, just explain to him why. Why does everybody have to show evidence for everything these days? This is not a political show.

I really just think it should be faster; I'm not claiming it as a fact.

Anonymous poster
Thanx :)

dimebolt    440
Quote:
Original post by glJack
The fact that speed difference is minimal made me think that GeForce 2 GTS
implements PBO ext in software, and keeps all PBO data in system memory...


I don't know enough about GL_PIXEL_PACK_BUFFER_EXT to answer your question. Could you post the entire code? I can try it later today on my GeForce6 at home. That one should definitely support the extension. Then we'll at least know if the problem is caused by your GeForce2. It is quite likely that the problem lies there, because the pixel buffer object extension (which defines GL_PIXEL_PACK_BUFFER_EXT) is more recent than the GeForce2.

Tom

glJack    122
dimebolt
the source and binaries are here: www.glplanet3d.newmail.ru/pbo/pbo.html

The demo will run for 30 seconds and generate an "out.txt" file.
Thank you for your assistance!

By the way, which tag is used to post a link?

[Edited by - glJack on July 5, 2005 11:30:57 AM]

dimebolt    440
Quote:
Original post by glJack
dimebolt
the source and binaries are here: www.glplanet3d.newmail.ru/pbo/pbo.html

The demo will run for 30 seconds and generate an "out.txt" file.
Thank you for your assistance!

By the way, which tag is used to post a link?


Sorry, but I was already offline when you posted. Did you take it down again? Because the link is dead now... You can use normal HTML ("a href" tags) to post links.

Tom

dimebolt    440
Quote:
Original post by glJack
Try this link again. I fixed it:
www.glplanet3d.newmail.ru/pbo/pbo.html


Yes, that worked. I got your program working on a geforce3 (Geforce6 results tomorrow). These were the results (VS6 compiled with O2):
glReadPixels pure : 636
glReadPixels + PBO : 618
glCopyTexImage2D : 3705

As you can see, similar results... But the GeForce3 probably also predates PBOs.

Note: you did not add your 'Primitive' class, so I replaced the drawing of the teapot with:

glBegin(GL_QUADS);
glVertex3d(-1,-1,0);
glVertex3d( 1,-1,0);
glVertex3d( 1, 1,0);
glVertex3d(-1, 1,0);
glEnd();


This should also give more accurate performance results, as the rendering time of the teapot is taken out of the loop. That's why glCopyTexImage2D is now about 6 times faster.

When I get home, I'll try it on my Geforce6 for the interesting test...

Tom

dimebolt    440
The results are not much different on my geforce6 (recent Nvidia drivers installed):
glReadPixels pure : 2558
glReadPixels + PBO : 2530
glCopyTexImage2D : 10611

In fact, the PBO path is slightly slower than the pure version, just like on the GeForce3. Apparently glReadPixels still goes through main memory on both cards and drivers tested here. I don't know why. It appears we'll have to use glCopyTexImage2D or, even better, FBOs for efficient RTT. This presentation from Nvidia explains how to use FBOs and why FBOs are even faster than glCopyTexImage2D(...).
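
To put numbers on the comparison, here is a small sketch of the ratio arithmetic. It assumes a higher score means faster, which is how the figures are read in this thread; speedup is a hypothetical helper name:

```c
#include <assert.h>

/* Ratio of a benchmark score to a baseline score.
 * Assumption: higher scores are better, as the results in this
 * thread are interpreted. A value below 1.0 means "slower". */
static double speedup(double score, double baseline)
{
    assert(baseline > 0.0);
    return score / baseline;
}
```

With the posted figures, glCopyTexImage2D against pure glReadPixels comes out at roughly 5.8x on the GeForce3 (3705 / 636) and roughly 4.1x on the GeForce6 (10611 / 2558), while the PBO path sits at about 0.97x and 0.99x of the pure path respectively.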

Tom

[Edited by - dimebolt on July 7, 2005 8:15:34 AM]

Kylotan    9860
Can anybody confirm the above? Virus checkers can sometimes give out erroneous warnings due to the somewhat primitive signature checking some of them use.

glJack    122
Tom, thanks. You really helped me a lot! I appreciate your help.

Quote:

VIRUS WARNING


I don't have antivirus software installed, so I can't vouch for the executable's safety. Sorry if there is a virus.
Anyway, the problem is solved by now, so don't download it.

glJack    122

--------------------------------------------------------
glBindBuffer(GL_PIXEL_PACK_BUFFER_EXT, buff_0);
glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, NULL);
glBindBuffer(GL_PIXEL_PACK_BUFFER_EXT, buff_1);
glReadPixels(0, height/2, width, height, GL_RGB, GL_UNSIGNED_BYTE, NULL);
--------------------------------------------------------

Are these 2 data transfers supposed to run in parallel?
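
A note on the snippet above, assuming width and height are the framebuffer size: the second call starts at row height/2 but still asks for height rows, so it reaches past the top of the framebuffer. If the intent is to read the frame as two halves, each call would cover height/2 rows. Whether the two transfers actually overlap is up to the driver: with a pack buffer bound, glReadPixels is allowed to return before the copy has finished, but nothing obliges it to. A minimal sketch of the row arithmetic, with band_rows as a hypothetical helper name:

```c
#include <assert.h>

/* Hypothetical helper: split `height` rows into `bands` horizontal
 * strips and report the first row and the row count of strip `i`
 * (0-based). Strips cover every row exactly once, also when height
 * is not divisible by bands. */
static void band_rows(int height, int bands, int i, int *y, int *rows)
{
    assert(height >= 0 && bands > 0 && i >= 0 && i < bands);
    assert(y != 0 && rows != 0);

    *y    = (int)((long long)height * i / bands);
    *rows = (int)((long long)height * (i + 1) / bands) - *y;
}
```

Each strip would then be read with glReadPixels(0, y, width, rows, ...) into its own buffer, and each buffer needs storage for only that many rows.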

Myopic Rhino    2315
Quote:
Original post by Kylotan
Can anybody confirm the above? Virus checkers can sometimes give out erroneous warnings due to the somewhat primitive signature checking some of them use.
NAV 2005 flags it as the same virus.

dimebolt    440
Quote:
Original post by Myopic Rhino
Quote:
Original post by Kylotan
Can anybody confirm the above? Virus checkers can sometimes give out erroneous warnings due to the somewhat primitive signature checking some of them use.
NAV 2005 flags it as the same virus.


Good thing I never tried the exe :)

Tom

Guest Anonymous Poster   
Hi,
I'm a beginner in image processing.
I just want to read the framebuffer with glReadPixels and then draw it with glDrawPixels.
I had some problems and it didn't work. Could anyone write me a few lines of code as an example?
(Including variable initialization and anything else I need for that.)

Thanks, all

Name_Unknown    100
Quote:
Original post by dimebolt
Quote:
Original post by glJack
The fact that speed difference is minimal made me think that GeForce 2 GTS
implements PBO ext in software, and keeps all PBO data in system memory...


I don't know enough about GL_PIXEL_PACK_BUFFER_EXT to answer your question. Could you post the entire code? I can try it later today on my GeForce6 at home. That one should definitely support the extension. Then we'll at least know if the problem is caused by your GeForce2. It is quite likely that the problem lies there, because the pixel buffer object extension (which defines GL_PIXEL_PACK_BUFFER_EXT) is more recent than the GeForce2.

Tom



Just because a driver supports an extension doesn't mean it is actually implemented in hardware. Some cards 'support' ARB vertex programs but actually run them in software in the driver. NVidia's drivers cover all of their cards, so an extension being exposed doesn't mean it is done in hardware; it may be that the GeForce 2 has this extension but the driver emulates it (maybe even with glTexSubImage2D...). I don't know; nobody but NVidia knows how their drivers work.
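
To make that concrete: about the most an application can check is whether the driver advertises the extension, which says nothing about hardware versus software execution. A minimal sketch of that check follows. In a real program the list would come from glGetString(GL_EXTENSIONS); here it is a plain string argument so the sketch needs no GL context, and has_extension is a hypothetical helper name:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `name` appears as a whole space-separated token in
 * `ext_list`, else 0. A plain strstr() is not enough, because one
 * extension name can be a prefix of another. */
static int has_extension(const char *ext_list, const char *name)
{
    assert(ext_list != NULL && name != NULL);

    size_t len = strlen(name);
    if (len == 0 || strchr(name, ' ') != NULL)
        return 0;   /* not a valid single extension name */

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the match only if it is delimited on both sides. */
        int starts = (p == ext_list) || (p[-1] == ' ');
        int ends   = (p[len] == ' ') || (p[len] == '\0');
        if (starts && ends)
            return 1;
        p += len;   /* keep searching after this partial match */
    }
    return 0;
}
```

For this thread the names to look for would be GL_EXT_pixel_buffer_object or GL_ARB_pixel_buffer_object. A result of 1 only means the entry points are usable; the benchmark remains the only way to see how the driver implements them.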
