Rendering depth to client memory

Started by
10 comments, last by libicocco 12 years, 12 months ago
Hi,

I'm trying to render the depth of a COLLADA model to client memory so I can do some image processing on the CPU. I'm using VBOs to store the vertices in GPU memory. So far the only option that has worked for me is a framebuffer object with a texture attached, but it is too slow for my requirements (~1 ms/frame for rendering and ~3 ms/frame for copying the texture to client memory, 640x480 float). I cannot use two buffers and fetch the previous frame while computing the current one; I must request a render and obtain the depth corresponding to that request immediately. I wonder:

- What's the best option for these requirements: texture + framebuffer? Renderbuffer + framebuffer? Pixelbuffer?
- Is there any code sample showing how to do it with a pixelbuffer or RBO + FBO? I tried the latter but only got a black screen.
[quote]
....to perform some image processing in gpu...........for passing the texture to client memory...
[/quote]

By client memory you mean pc ram? Because if you use a framebuffer/render to texture you have a texture to bind to the gpu and can do gpu processing with it.

NBA2K, Madden, Maneater, Killing Floor, Sims http://www.pawlowskipinball.com/pinballeternal

sorry, i meant in cpu! and by client memory i mean pc ram, yes.

The performance of 3 ms/frame seems good. You can probably improve performance by reading back the native format of the depth buffer, which is normally a 16-bit integer, a 24-bit integer plus 8 bits of junk, or a 24-bit integer plus 8 bits of stencil (D24S8).
It might be a 32-bit integer.
It really depends on what you asked for:
GL_DEPTH_COMPONENT16? GL_DEPTH_COMPONENT24? GL_DEPTH_COMPONENT32? There is also GL_DEPTH24_STENCIL8 for D24S8.

http://www.opengl.org/wiki/GL_EXT_framebuffer_object#Quick_example.2C_render_to_texture_.282D.29.2C_mipmaps.2C_depth_stencil
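
If the depth is read back in a packed D24S8 layout, the CPU side still has to separate depth from stencil before any image processing. Here is a minimal sketch of that step, assuming the texels arrive as 32-bit words in the GL_UNSIGNED_INT_24_8 layout (depth in the upper 24 bits, stencil in the lower 8). The names `DepthStencil` and `unpackD24S8` are made up for illustration, and nothing here calls OpenGL.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the two unpacked planes.
struct DepthStencil {
    std::vector<float> depth;            // normalized to [0, 1]
    std::vector<std::uint8_t> stencil;   // raw 8-bit stencil values
};

// Splits packed depth/stencil words (assumed layout: depth in the upper
// 24 bits, stencil in the lower 8 bits) into separate arrays.
DepthStencil unpackD24S8(const std::vector<std::uint32_t>& packed) {
    DepthStencil out;
    out.depth.reserve(packed.size());
    out.stencil.reserve(packed.size());
    const float maxDepth = 16777215.0f;  // 2^24 - 1
    for (std::uint32_t texel : packed) {
        out.depth.push_back(static_cast<float>(texel >> 8) / maxDepth);
        out.stencil.push_back(static_cast<std::uint8_t>(texel & 0xFFu));
    }
    return out;
}
```

Whether the packed readback plus this conversion beats a direct float readback is something to measure on the actual card and driver; the sketch only shows that the unpacking itself is a shift and a divide per pixel.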
Sig: http://glhlib.sourceforge.net
an open source GLU replacement library. Much more modern than GLU.
float matrix[16], inverse_matrix[16];
glhLoadIdentityf2(matrix);
glhTranslatef2(matrix, 0.0, 0.0, 5.0);
glhRotateAboutXf2(matrix, angleInRadians);
glhScalef2(matrix, 1.0, 1.0, -1.0);
glhQuickInvertMatrixf2(matrix, inverse_matrix);
glUniformMatrix4fv(uniformLocation1, 1, FALSE, matrix);
glUniformMatrix4fv(uniformLocation2, 1, FALSE, inverse_matrix);
I think it should be faster than 3 ms/frame; ~100K floats/ms (640x480 floats / 3 ms) feels a bit slow for a 9800GT, doesn't it?
I asked for GL_DEPTH_COMPONENT:


[source lang="cpp"]
glTexImage2D( GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT, WIDTH, HEIGHT, 0, GL_DEPTH_COMPONENT, GL_FLOAT, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,GL_TEXTURE_2D, depthTextureId, 0);[/source]


and got it as GL_FLOAT:

[source lang="cpp"]
glGetTexImage( GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT,GL_FLOAT,maskBW.data);
[/source]
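
To put the "100K floats/ms" estimate in more familiar units, the arithmetic can be written down as a small helper. This is only a sketch; the function name `readbackMBps` is made up for illustration, and 1 MB is taken as 10^6 bytes.

```cpp
#include <cstddef>

// Readback throughput in megabytes per second (1 MB = 10^6 bytes),
// given the image size, bytes per pixel and the measured time.
constexpr double readbackMBps(std::size_t width, std::size_t height,
                              std::size_t bytesPerPixel, double milliseconds) {
    return static_cast<double>(width * height * bytesPerPixel)
           / (milliseconds * 1e-3) / 1e6;
}
```

For a 640x480 image of 4-byte floats, `readbackMBps(640, 480, 4, 3.0)` works out to about 409.6 MB/s (1,228,800 bytes in 3 ms), and the same image in 2 ms is about 614.4 MB/s.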



That is a common misconception. GL_FLOAT has no effect on how the texture is stored; it only describes the pixel data you pass in. The storage format is selected by the 3rd parameter (the internal format), which here is GL_DEPTH_COMPONENT.
Since you didn't specify the number of bits, the driver will select one for you. I have no idea whether it is 16-bit or 24-bit, but you can query it with:


[source lang="cpp"]
glBindTexture(GL_TEXTURE_2D, textureID);
int format;
glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_COMPONENT, &format);
[/source]


It might be faster than 3 ms if you read it in a native format.

http://www.opengl.org/wiki/Common_Mistakes#Depth_Buffer_Precision
Thanks for the tip! After a small correction in the command (GL_TEXTURE_COMPONENTS instead of GL_TEXTURE_COMPONENT) I got 6402, which according to this
http://www.cs.duke.e...ant-values.html (better sources?) is just GL_DEPTH_COMPONENT, so I didn't get much information from that.
Since I prefer 32f data (OpenCV only has floats with depth 32), I did it as follows:

[source lang="cpp"]
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT32F, WIDTH, HEIGHT, 0, GL_DEPTH_COMPONENT, GL_FLOAT, 0);
glGetTexImage(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT, GL_FLOAT, maskBW.data);
[/source]

with slightly better results (I guess it's saving a format conversion), but still not really good (~2 ms/frame). So how do I know which formats are native?
Shouldn't PBOs speed something like this up?
[quote]
So how do I know which formats are native?
[/quote]
You ask for the format.
Example: GL_DEPTH_COMPONENT16 is a 16-bit integer.
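
If a 16-bit depth format is read back as unsigned shorts, the transfer is half the size of a float readback (614,400 bytes instead of 1,228,800 for 640x480), but the data then has to be converted to 32f on the CPU before float-based processing. Below is a minimal sketch of that conversion, assuming the values are normalized 16-bit integers; the function name `depth16ToFloat` is made up for illustration and nothing here calls OpenGL.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts normalized 16-bit depth values (0..65535) to 32-bit floats
// in [0, 1]. The input is assumed to come from a 16-bit depth readback.
std::vector<float> depth16ToFloat(const std::vector<std::uint16_t>& depth16) {
    std::vector<float> depth32(depth16.size());
    for (std::size_t i = 0; i < depth16.size(); ++i) {
        depth32[i] = static_cast<float>(depth16[i]) / 65535.0f;
    }
    return depth32;
}
```

Whether the smaller transfer outweighs the extra CPU pass was not measured in this thread, so treat it as something to time on the target hardware. With OpenCV, the same scaling can also be done with `cv::Mat::convertTo`.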

[quote]
Shouldn't PBOs speed something like this up?
[/quote]
No, a PBO doesn't speed up the transfer itself. PBOs are asynchronous, and if you know how to use them, you can hide the slowness of your graphics card/driver.
You would just use the PBO to start downloading the pixels and then do some other CPU work for a while. In the meantime, the pixels get downloaded. After a while, all the pixels will have been downloaded.

[quote]
So how do I know which formats are native?

You ask for the format.
Example: GL_DEPTH_COMPONENT16 is a 16-bit integer.
[/quote]

Yes, but does the speed of the different formats depend only on their size, or are some of them faster because they suit the hardware better? For example, I've read that GL_UNSIGNED_INT_24_8 would be faster; why?

This topic is closed to new replies.
