Renaissanz

Using pixel buffer objects with glReadPixels


Hi everyone, have a question regarding pixel buffer objects. I am running a P4 1.5 with a GeForce 5200 FX (AGP of course), and the 80+ Forceware drivers. I am trying to use PBO to increase the speed of glReadPixels. I've implemented a class for doing just that, and alternatively, doing a normal glReadPixels (for comparison). I am reading both the color and depth components. My normal glReadPixels looks like:
void readPixelsNormal()
{
  // init mem
  unsigned int * m_pBufferColor = new unsigned int[GetWidth() * GetHeight()]; 
  unsigned int * m_pBufferDepth = new unsigned int[GetWidth() * GetHeight()];

  glReadPixels(0, 0, GetWidth(), GetHeight(), GL_RGBA, GL_UNSIGNED_BYTE, 
               m_pBufferColor);
  //... do stuff to the pixels ...

  glReadPixels(0, 0, GetWidth(), GetHeight(), GL_DEPTH_COMPONENT,
               GL_UNSIGNED_INT, m_pBufferDepth);
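  //... do stuff to the pixels ...

  // release the per-call buffers so they are not leaked every frame
  delete[] m_pBufferColor;
  delete[] m_pBufferDepth;
}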

Getting about 30 FPS using the above approach. Now next is my PBO implementation:
// with a PBO bound, glReadPixels treats its pointer argument as a
// byte offset into the bound buffer rather than a client address
#define BUFFER_OFFSET(i) ((char *)NULL + (i))

// PBO generated IDs
GLuint m_pPBO[2] = {0, 0};
unsigned int * m_pBuffer;

void initPBO()
{
  // init the PBOs
  glGenBuffersARB(2, m_pPBO);
  glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, m_pPBO[0]);
  glBufferDataARB(GL_PIXEL_PACK_BUFFER_EXT, 
                  (GetWidth() * GetHeight() * sizeof(unsigned int)), 
                  NULL, 
                  GL_STREAM_READ);

  glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, m_pPBO[1]);
  glBufferDataARB(GL_PIXEL_PACK_BUFFER_EXT, 
                  (GetWidth() * GetHeight() * sizeof(unsigned int)), 
                  NULL, 
                  GL_STREAM_READ);
  
  // bind it to nothing so other stuff doesn't
  // think it should use the PBOs
  glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, 0);
}

void readPixelsPBO()
{
  // bind buffer #1
  glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, m_pPBO[0]);

  // read pixels
  glReadPixels(0, 0, GetWidth(), GetHeight(), GL_RGBA,
               GL_UNSIGNED_BYTE, BUFFER_OFFSET(0));

  // map memory from card
  m_pBuffer = static_cast<unsigned int *>(glMapBufferARB(GL_PIXEL_PACK_BUFFER_EXT, GL_READ_ONLY_ARB));

  //... do stuff to pixels ...
  
  // unmap the memory
  if (!glUnmapBufferARB(GL_PIXEL_PACK_BUFFER_EXT))
  {
    //  handle the error
  }

  // bind buffer #2
  glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, m_pPBO[1]);

  // read pixels
  glReadPixels(0, 0, GetWidth(), GetHeight(), GL_RGBA,
               GL_UNSIGNED_BYTE, BUFFER_OFFSET(0));

  // map memory from card
  m_pBuffer = static_cast<unsigned int *>(glMapBufferARB(GL_PIXEL_PACK_BUFFER_EXT, GL_READ_ONLY_ARB));

  //... do stuff to pixels ...

  // unmap the memory
  if (!glUnmapBufferARB(GL_PIXEL_PACK_BUFFER_EXT))
  {
    //  handle the error
  }

  // bind it to nothing so other stuff doesn't
  // think it should use the PBOs
  glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, 0);
}

void killPBO()
{
  // kill the PBOs
  glDeleteBuffersARB(2, m_pPBO);
  glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, 0);  
}


It's weird. This second approach yields ~27 FPS. If I'm bypassing the normal readback pipeline, and directly accessing card memory, shouldn't I be getting some kick-butt framerate? In addition, since I'm using STREAM data, shouldn't the glReadPixels be returning immediately and behave asynchronously? Am I doing something wrong? What do you gurus think?

glReadPixels should be returning immediately, but you map the buffer right after it, and the map has to wait until all the data has arrived. What you should do is bind the first buffer and read into that, then bind the second buffer and read into that. Then bind the first buffer again and map it, which will wait until the first glReadPixels has completed. You can then do whatever you want with the data from the first glReadPixels while the second glReadPixels is completing. When you are done, unmap the first buffer and map the second one; if the second glReadPixels hasn't completed by then, the map will wait until it has. Then do what you want with the data from the second glReadPixels, and remember to unmap the second buffer as well.
- bind first buffer
- glReadPixels on first buffer
- bind second buffer
- glReadPixels on second buffer
- bind first buffer and map it
- use first buffer's data
- unmap first buffer
- bind second buffer and map it
- use second buffer's data
- unmap second buffer
EDIT: There is an asynchronous glReadPixels example in the GL_ARB_pixel_buffer_object spec. Go there and search for "Example 3"
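
The ordering in the list above can be sketched without a GL context. This is only a minimal sketch: `RecordingApi` and `readPixelsTwoBuffers` are hypothetical names, not part of OpenGL or of the original post. The recorder stands in for glBindBufferARB, glReadPixels, glMapBufferARB and glUnmapBufferARB and only logs the order of calls, so the sequence itself can be checked.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the GL calls in the post; it only records call
// order. In real code bind() would be glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, id),
// readPixels() would be glReadPixels(..., BUFFER_OFFSET(0)), and so on.
struct RecordingApi {
    std::vector<std::string> log;
    std::vector<unsigned int> fakePixels = std::vector<unsigned int>(4, 0u);

    void bind(unsigned int id) { log.push_back("bind " + std::to_string(id)); }
    void readPixels() { log.push_back("read"); }
    const unsigned int* map() { log.push_back("map"); return fakePixels.data(); }
    bool unmap() { log.push_back("unmap"); return true; }
};

// Issue both readbacks first, then map each buffer in turn, so the second
// transfer can proceed while the first buffer's data is being processed.
template <typename Api, typename Process>
void readPixelsTwoBuffers(Api& gl, const unsigned int (&pbo)[2], Process process)
{
    for (unsigned int id : pbo) {
        gl.bind(id);
        gl.readPixels();
    }
    for (unsigned int id : pbo) {
        gl.bind(id);
        const unsigned int* pixels = gl.map();  // may wait for this transfer
        assert(pixels != nullptr);
        process(pixels);
        if (!gl.unmap()) {
            // buffer contents were lost; handle the error
        }
    }
    gl.bind(0);  // unbind so later calls use client memory again
}
```

Compared with the code in the first post, the only change is that both reads are issued before the first map. Whether that actually overlaps the transfers still depends on the driver.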

Well, I gave it a try Kalidor, and there was 1 FPS improvement. The traditional glReadPixels still kicks its butt though.

I'd suspect that perhaps my video card wasn't performing well, BUT I also had someone try it on a P4 3.0 w/ a GeForce 6800 Ultra 512 MB, and glReadPixels still kicked PBO's hiney.

I've got to be doing something wrong.

Quote:
Original post by Renaissanz
Well, I gave it a try Kalidor, and there was 1 FPS improvement. The traditional glReadPixels still kicks its butt though.

I'd suspect that perhaps my video card wasn't performing well, BUT I also had someone try it on a P4 3.0 w/ a GeForce 6800 Ultra 512 MB, and glReadPixels still kicked PBO's hiney.

I've got to be doing something wrong.
Hmm, I'm not sure then. I don't have too much experience with PBOs so I don't completely understand the ins-and-outs of using it, but doing it the way I described should at least be somewhat faster than the traditional way. Weird... Maybe someone else with more PBO experience will come along to help out. Good luck.

It is also possible that PBOs are not implemented in a performant way in the drivers yet (they still could be using the standard glReadPixels path, rather than performing copies asynchronously).

Regardless, Kalidor's suggestion is the proper way to approach this problem. The best way to improve on that approach is to go ahead and do some more work before MapBuffer:

For example:

glReadPixels into buffer 1
glReadPixels into buffer 2
// do something else for a while
map buffer 1
map buffer 2

This gives the driver more time to perform an asynchronous copy to system memory.
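
One possible way to "do something else for a while" is to defer the map by a whole frame: each frame starts a read into one buffer and maps the buffer that was filled on the previous frame. This goes a step beyond what is suggested above and costs one frame of latency. In this minimal sketch, `FakeGl` and `PingPongReadback` are hypothetical names; `FakeGl` only records call order in place of the real glBindBufferARB / glReadPixels / glMapBufferARB / glUnmapBufferARB calls.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical recorder standing in for the real GL calls; logs call order only.
struct FakeGl {
    std::vector<std::string> log;
    unsigned int pixel = 0;

    void bind(unsigned int id) { log.push_back("bind " + std::to_string(id)); }
    void readPixels() { log.push_back("read"); }
    const unsigned int* map() { log.push_back("map"); return &pixel; }
    bool unmap() { log.push_back("unmap"); return true; }
};

// Alternates between two pixel buffer objects across frames.
template <typename Api>
class PingPongReadback {
public:
    PingPongReadback(Api& gl, unsigned int pboA, unsigned int pboB)
        : m_gl(gl), m_pbo{pboA, pboB} {}

    // Call once per frame after rendering. Returns true if the previous
    // frame's pixels were handed to process().
    template <typename Process>
    bool endFrame(Process process)
    {
        const int writeIndex = m_frame % 2;
        const int readIndex = 1 - writeIndex;

        // Start this frame's transfer first ...
        m_gl.bind(m_pbo[writeIndex]);
        m_gl.readPixels();

        // ... then consume the buffer that was filled one frame ago.
        bool delivered = false;
        if (m_frame > 0) {
            m_gl.bind(m_pbo[readIndex]);
            const unsigned int* pixels = m_gl.map();
            assert(pixels != nullptr);
            process(pixels);
            m_gl.unmap();
            delivered = true;
        }
        m_gl.bind(0);  // leave no pack buffer bound
        ++m_frame;
        return delivered;
    }

private:
    Api& m_gl;
    unsigned int m_pbo[2];
    int m_frame = 0;
};
```

The first call delivers nothing, because no earlier frame has been read yet. Every later call maps a buffer whose transfer has had a full frame to finish.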

RichardS: if I'm understanding you correctly:

Quote:
"This gives the driver more time to perform an asynchronous copy to system memory."

Then I'm not really getting direct memory access to the buffer?

Probably not. If your app wants to access data from a glReadPixels, you are still going to be doing a readback from the card to system memory. I may be wrong, but I don't believe that VRAM can be efficiently exposed for reading, at least when using AGP. You definitely can efficiently *write* directly to VRAM though.

By using PBOs in this way, you can start the memcpy, then go off and do other things while the data is DMA'd around, assuming the driver actually optimizes this. Without PBO, you'll block on the glReadPixels until 1) your scene finishes rendering, and 2) the copy can complete. This is essentially putting a glFinish() in the middle of your code.
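
The difference can be put into a rough toy model. This is a sketch under strong assumptions, not a measurement: `FrameCosts` and both functions are hypothetical, the driver is assumed to overlap the transfer perfectly, and any numbers fed in are placeholders.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-frame costs in milliseconds (placeholders, not measurements).
struct FrameCosts {
    double gpuRenderMs;  // time for the GPU to finish rendering the scene
    double copyMs;       // time to transfer the pixels to system memory
    double otherCpuMs;   // unrelated CPU work done each frame
    double processMs;    // CPU time spent on the read-back pixels
};

// Plain glReadPixels: the CPU waits for rendering and the copy, then does
// everything else afterwards.
double blockingFrameMs(const FrameCosts& c)
{
    assert(c.gpuRenderMs >= 0 && c.copyMs >= 0 && c.otherCpuMs >= 0 && c.processMs >= 0);
    return c.gpuRenderMs + c.copyMs + c.otherCpuMs + c.processMs;
}

// Idealized PBO path: the other CPU work runs while rendering and the copy
// are in flight, and the map waits only for whatever is still outstanding.
double overlappedFrameMs(const FrameCosts& c)
{
    assert(c.gpuRenderMs >= 0 && c.copyMs >= 0 && c.otherCpuMs >= 0 && c.processMs >= 0);
    return std::max(c.gpuRenderMs + c.copyMs, c.otherCpuMs) + c.processMs;
}
```

In this model the two paths cost the same when `otherCpuMs` is zero, which is the situation in the first post, where the map follows the read immediately. The saving is at most the amount of independent CPU work that is placed between the read and the map.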

However, PBOs merely extend the API such that the driver gains the flexibility to optimize this. It may not (yet). The spec doesn't require any particular performance characteristics; it requires only correctness.

If the PBO route is only slightly slower, I would use it anyway, because it does allow the drivers a lot more room to maneuver in the future (even if they do not already).

That being said, I have no idea what they're doing today...

Well, I can understand what's been said about PBOs, but I have one burning question:

Why isn't PBO read back at least equal in performance to a regular glReadPixels?

Here is a good paper on using PBOs to get efficient pixel transfers. It may not help in increasing your performance too much since the previous suggestion didn't help, but it's still a good read and worth the few minutes it'll take.
