How to read frame buffers fast?

Hi all, I need to read the z-buffer many times for preprocessing, using the command: glReadPixels(0, 0, height, height, GL_DEPTH_COMPONENT, GL_FLOAT, pixelsDepth); Each pixel is a float and thus takes 4 bytes (on Windows). This read is a bottleneck for my preprocessing. Is there a faster way to read the buffer, or a way to accelerate it? I could use a stencil buffer with 1 byte per pixel instead of 4; would the read be faster? Thanks for the answers!
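
To put rough numbers on the question, here is a minimal sketch of the size arithmetic for one full-buffer readback. The helper name `readbackBytes` and the 400x400 example size are illustrative assumptions, not something from the post, and the calculation ignores any row padding that GL_PACK_ALIGNMENT may add.

```cpp
#include <cassert>
#include <cstddef>

// Bytes moved by one tightly packed readback of a width x height region.
// Hypothetical helper for illustration; row padding is not modelled.
std::size_t readbackBytes(std::size_t width, std::size_t height,
                          std::size_t bytesPerPixel) {
    assert(bytesPerPixel > 0);  // a pixel must occupy at least one byte
    return width * height * bytesPerPixel;
}
```

For a 400x400 region this gives 640000 bytes at 4 bytes per pixel (GL_FLOAT) and 160000 bytes at 1 byte per pixel. The transfer size is only part of the cost, so measured times need not scale by the same factor.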

There isn't a way to speed up readbacks from the card. AGP was only really designed to work in one direction, and the cards are basically set up to be written to quickly.

Depending on the complexity of your preprocessor, it might be better to write a basic software version instead of a fully rendered hardware version.

If you don't need the accuracy, consider using one or two bytes instead of four.

unsigned char (1 byte)
unsigned short (2 bytes)

It is a lot faster, but you also have a much more limited range.


The good thing is that, unless you actually need the accuracy, you can convert from, let's say, the unsigned char to a float just by using
floatval=charval*0.0039215686274509803922f;
(the constant is 1/255).

And if you are just trying to render the depth image directly to a bitmap or use it for image editing, you only need a char for the bitmap, and an int at most for image editing.
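
The conversion above can be sketched for a whole buffer as follows. This assumes the depth values were already read back as unsigned bytes; the function name `depthBytesToFloats` and its parameters are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>

// Convert `count` 8-bit depth values (0..255) to floats in [0, 1].
// Dividing by 255 applies the same scale as multiplying by 0.0039215686f.
void depthBytesToFloats(const unsigned char* src, float* dst,
                        std::size_t count) {
    assert((src != nullptr && dst != nullptr) || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = src[i] / 255.0f;
    }
}
```

Each byte maps to one of only 256 distinct depth levels, which is the limited range mentioned above.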

Hi

Dredge-Master, thanks for your answer.

I only need one byte of accuracy for the preprocessing.

1. Which is preferred (faster): using the stencil buffer or the depth buffer (assuming I can change its accuracy only for the preprocessing)?

Reading 4-byte data will be a lot slower than reading 1-byte data (stencil).

If you are going to read both as 1 byte (GL_UNSIGNED_BYTE being the fastest), then both will be pretty much the same, give or take a few clock ticks.


Assuming that your buffers are all single-precision floats, the depth buffer will be the faster one.


Basically, the floating-point value is clamped to the range 0 to 1.
Since we are using a single unsigned byte, we just multiply it by 2^8 - 1 (255) and bingo, you have your value.

For the stencil buffer, according to the documentation, it is converted to a fixed-point number, shifted left or right, has the index offset added to it, and can then possibly be run through a stencil map.

This is a bit more processing than for depth, so it's a bit slower in theory.

The problem is that what you do with the data afterwards can be the real bottleneck. It may end up that your stencil comparison (b[x][y]!=pixelmap[x][y]) is a lot faster than your depth comparisons(b[x][y]>pixelmap[x][y]&&c[x][y]

The best way is to use a timer of some sort. Use QueryPerformanceCounter and QueryPerformanceFrequency.
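
A minimal sketch of such a timer is shown below. Note that it swaps in the portable std::chrono::steady_clock in place of QueryPerformanceCounter, and the helper name `timeMicroseconds` is an assumption made for this example. For a readback test, the callable would presumably wrap glFinish(), the glReadPixels call, and a second glFinish(), so that pending rendering is not counted as read time.

```cpp
#include <cassert>
#include <chrono>

// Run `work` `repeats` times and return the average time per call in
// microseconds (integer division, so very fast calls may report 0).
template <typename Work>
long long timeMicroseconds(Work work, int repeats = 1) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const auto total =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    return total.count() / repeats;
}
```

Averaging over several repeats helps smooth out noise from other programs running at the same time.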




Reading the stencil buffer is not a good idea either, as it's practically ALWAYS stored together with the depth buffer, i.e. the depth buffer gets 24 bits and the stencil buffer gets 8 bits, so you end up reading 4 bytes/pixel anyway.
Why do you need to read from the frame buffer anyway? The reason to do this must be a VERY good one. And even if it's vital, since you don't mind the loss of precision, how about a very tightly coded, simple software rasteriser? You can read back the results from that anytime.
By the way, reading back from the framebuffer can in fact fail on some cards, so be wary of that.

- JQ
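
To illustrate the 24 + 8 bit packing described above, here is a small sketch that splits one 32-bit value into its depth and stencil parts. The bit layout (depth in the upper 24 bits, stencil in the lower 8) is an assumption that matches the packed GL_UNSIGNED_INT_24_8 pixel type from the later packed depth-stencil extension; the internal layout on any particular card is not confirmed by this thread. The names `DepthStencil` and `unpackDepthStencil` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct DepthStencil {
    float depth;           // normalised to [0, 1]
    std::uint8_t stencil;  // raw 8-bit stencil value
};

// Split one packed value, assuming depth occupies bits 31..8 and
// stencil occupies bits 7..0.
DepthStencil unpackDepthStencil(std::uint32_t packed) {
    const std::uint32_t depthBits = packed >> 8;
    DepthStencil result;
    result.depth = depthBits / 16777215.0f;  // divide by 2^24 - 1
    result.stencil = static_cast<std::uint8_t>(packed & 0xFFu);
    assert(result.depth >= 0.0f && result.depth <= 1.0f);
    return result;
}
```

Either way, each pixel costs 4 bytes to transfer, which is the point being made above.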

I did a couple of tests and I was way off.

These are counter ticks from the performance counter.

http://www.members.optushome.com.au/jlferry/image/stencil_index_wins.jpg 65kb jpeg.
(I'll refrain from posting screenshots here since it's 1024x768.)

Basically, on the screen (I think it was set to around 400x400x32bit) it was coming up with around 20 ticks for a

glFinish();
QueryPerformanceCounter(&S);
glReadBuffer(stencil_index);
glFinish();
QueryPerformanceCounter(&E);
Label2->Caption=E.LowPart-S.LowPart;

for each one (glDrawBuffer replacing glReadBuffer for the last half).


Anyway, the stencil index reading was blazing fast compared to the other tests.

For writing, I had come across it before: it is rather slow to write the depth component back to the screen, but I never knew how fast the others wrote back (I assumed it was slightly slower, not the other way around).

My only guess for the increase is that, since a lot of it is in the GPU, it may have quicker write speeds to the memory on the card than to the memory on my motherboard.


GF2-MX400 32mb ram
Athlon 1ghz, 768mb ram 60ms
1024x768x32 Samsung TFT response 12ms. I THINK it is currently set to 75fps.

And yes, both Total War and Winamp were running when I did the test.
