spek

OpenGL Reconstructing pixel 3D position from depth


Hi, I saw it's not the first time this has been asked here, but with the information from those posts I didn't succeed either. I'm playing with SSAO (with the help of this website: http://rgba.scenesp.org/iq/computer/articles/ssao/ssao.htm ). One of the things I need to do is reconstruct the 3D world position for each pixel.

First of all, why don't they just store the 3D position in a map (they are only using one channel of those RGBA textures in this case), instead of reconstructing it with the help of the depth? Or is it not exactly the same? I might be confusing terms such as "clip space", "world space", "eye space", and so on.

Anyway, this is what I did. First I render the whole scene to a texture. I store the depth by doing:

pos = mul( in.pos, modelViewProjectionMatrix );
...
out.color.r = pos.z / pos.w;

In a second pass, I draw a quad that fills the screen. Each pixel should reconstruct the 3D position, but here it goes wrong, I think. I tried two ways:
// 1.
pos.xy = in.vertexPos.xy; // in vertex shader
...
float   depth   = tex2D( sceneDepthMap, texcoords.xy ).r;
float4  pPos3D  = mul( float4(pos.x, pos.y, depth, 1.0), InvProj );
pPos3D.xyz = pPos3D.xyz / pPos3D.www;
pPos3D.w   = 1.0f;
Maybe I'm using the wrong matrix for "InvProj". If I understand it well, it's the "inverse projection matrix" (I'm using OpenGL). I tried other matrices as well, though (the inverse modelview projection matrix). The other way:
// 2.
pos.xy = in.vertexPos.xy; // in vertex shader
viewVector = pos.xyz - cameraPos.xyz;
...
viewVector = normalize(viewVector);
float   depth   = tex2D( sceneDepthMap, texcoords.xy ).r;
float3  pPos3D  = cameraPos.xyz + viewVector.xyz * depth;
I suppose 'viewVector' is not correct here... Both ways give wrong results. If I compare it with the real 3D position rendered as a color, it's just totally different. My lacking knowledge about matrices and "spaces" is probably causing the problem... Anyone have an idea what goes wrong? Greetings, Rick

Well, for your first bit about storing depth... the reason why it's not always desirable to store position is because it can require more bandwidth and memory space to store and retrieve the data. Storing position requires at least a 64-bpp floating-point surface, which can also cause you to revert to 64-bpp surfaces for all of your g-buffer textures on hardware that requires a uniform bpp across multiple render targets. Precision problems can also be encountered with 16 bits per component, especially when storing world-space position.

What I do is store linear eye-space depth instead of z/w. To get this I multiply the vertex position by the worldView matrix, and then store the .z component in the pixel shader (divided by the z-value of the camera frustum's far clip plane). Then in my second pass, which is a full-screen pass, I compute a "camera direction vector", which is a vector pointing to the far corners of the camera frustum. You can either pass this as part of the vertex, or calculate it for each vertex in the vertex shader (I calculate it). Then all you need to do is pass this to your pixel shader, and the world-space position for the pixel is this:


worldPos = cameraPos + pixelDepth * screenDir;


This is pretty nice, especially since it's only one MADD instruction. What I actually do is perform all my calculations in view space, which makes determining the frustum corners much easier. Regardless, I think it's a more elegant solution than using z/w.

EDIT: Your code for calculating the "viewVector" in the second bit of code seems to be wrong. Is in.vertexPos.xy the position in object space or in world space? Also, your depth would have to be un-normalized eye-space depth for that to work (not z/w), and I'm not sure if you were doing that.

>> the reason why it's not always desirable to store position is because it can require more bandwidth and memory space to store and retrieve the data
Sounds logical. But as far as I know, you can only use a 16/32F format with 4 channels anyway (in OpenGL, not sure about Direct3D). Storing only the depth could be very useful for deferred shading, since there is a lot of other data to store as well. But in the examples I saw, the RGBA texture was only used for storing the depth. So I thought maybe I was doing something wrong; maybe the pixel position is not converted to "world space" but to another space... or something.


As for the depth z/w part, let's see if I'm doing it exactly right:

// vertex shader
out.pos = mul( modelViewProjection, in.vertexPos );
// Pixel shader
out.depth = out.pos.z / out.pos.w;

If I look at the result of this (stored as a texture), it seems to be ok. Although I can't see if the depth is 100% correct of course.


As for the second part, at least my "3dPos = cameraPos + viewVec * depth" was correct :).
>> Is in.vertexPos.xy the position in object space or in world space?
Normally I would multiply the in.vertexPos with the ModelViewProjection indeed. But in this case, it isn't necessary. I render a quad with corner coordinates (-1,-1 .. +1,+1), since it needs to stay in front of the camera ('HUD quad'). The vertex shader just passes these coordinates. I don't know in which space they are then, "Projection Space"?. But...... maybe this won't work for calculating the viewVector.

And how to calculate the farplane corners? Sorry for these dumb questions, but all that matrix and space stuff is really difficult to me. Talking about it, what exactly is clip-space, screen-space and view-space? I understand world and object space, but these others are confusing. You get that when a point is transformed into the view frustum or something (where 0,0,0 would be the camera position?) ?

Thanks for helping MJP,
Rick

I am struggling with this too... it seems I am close but not there yet. I wish someone could post a complete shader instead of these little fragments, as well as some pictures that maybe show the interpolants rendered out, so we can test to see if each stage is working...

One of my main problems is that the usual method of drawing a fullscreen quad uses pretransformed vertices, so I can't use a vertex shader for that pass... otherwise I have trouble mapping the pixels to texels perfectly (the screen-sized textures get a bit filtered otherwise).

Does anyone have a good method of drawing a screen-aligned quad that maps the texture perfectly to the screen pixels and uses a vertex shader?

Quote:
Original post by spek
>> the reason why it's not always desirable to store position is because it can require more bandwidth and memory space to store and retrieve the data
Sounds logical. But as far as I know, you can only use a 16/32F format with 4 channels anyway (in OpenGL, not sure about Direct3D). Storing only the depth could be very useful for deferred shading, since there is a lot of other data to store as well. But in the examples I saw, the RGBA texture was only used for storing the depth. So I thought maybe I was doing something wrong; maybe the pixel position is not converted to "world space" but to another space... or something.


I have no idea which formats are supported in GL; in D3D I use a very convenient single-channel 32-bit floating-point format for storing my depth. As for storing depth as RGBA8, I remember that Fabio Policarpo's deferred shading tutorial does this and uses a pair of functions for encoding and decoding a single floating-point value in RGBA8 format. I'd imagine this should work in a pinch, at the cost of some shader math.
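For reference, a common way to do that packing in Cg/HLSL-style code looks something like the following. This is just a typical variant of the trick (not necessarily the exact functions from that tutorial), and it only works for values in the [0, 1) range:

float4 packFloatToRGBA8(float value)
{
    // Spread the value across the four channels with increasing precision.
    float4 packed = frac(value * float4(1.0, 255.0, 65025.0, 16581375.0));
    // Subtract the part that is already stored in the next channel.
    packed -= packed.yzww * float4(1.0/255.0, 1.0/255.0, 1.0/255.0, 0.0);
    return packed;
}

float unpackFloatFromRGBA8(float4 packed)
{
    return dot(packed, float4(1.0, 1.0/255.0, 1.0/65025.0, 1.0/16581375.0));
}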

Quote:
Original post by spek
As for the depth z/w part, let's see if I'm doing it exactly right:

// vertex shader
out.pos = mul( modelViewProjection, in.vertexPos );
// Pixel shader
out.depth = out.pos.z / out.pos.w;

If I look at the result of this (stored as a texture), it seems to be ok. Although I can't see if the depth is 100% correct of course.


That is right for storing depth as z/w. It should look "right", since this is precisely what gets stored in the z-buffer. But in this case it's not really so much a case of "right", and more a case of "different". The difference between linear eye-space depth (which you need to use for reconstructing position from the frustum corners) and z/w is that z/w maps the range between the near and far planes of your projection frustum non-linearly. So for example, if your projection frustum had a near plane at 1.0 and a far plane at 1000.0, a z/w value of 0.0 would correspond to a view-space depth of 1.0 and a z/w value of 1.0 would correspond to a view-space depth of 1000.0, but a z/w value of 0.5 would correspond to a view-space depth of only about 2.0, since most of the z/w range is packed in close to the near plane. The depth I use for calculating position from the frustum corners is linear instead: it's the view-space depth divided by the distance to the far frustum plane, so 0.0 corresponds to a view-space depth of 0.0, 1.0 to 1000.0, and 0.5 to 500.0. The difference is subtle, but important. My code for calculating this depth value looks something like this:


//vertex shader
OUT.viewSpacePos = mul(IN.position, worldViewMatrix);

//pixel shader
OUT.depth = viewSpacePos.z / cameraFarZ; //cameraFarZ is the z value of the far clip plane of the projection frustum
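(As an aside: if you do keep z/w in your depth texture, you can still get linear view-space depth back in the shader. A minimal sketch, assuming a D3D-style depth range where z/w runs from 0.0 at the near plane to 1.0 at the far plane, with nearZ/farZ passed in by the application:

float linearDepthFromZOverW(float zOverW, float nearZ, float farZ)
{
    // Inverts the non-linear z/w mapping back to view-space distance along z.
    return (nearZ * farZ) / (farZ - zOverW * (farZ - nearZ));
}

With nearZ = 1.0 and farZ = 1000.0 this gives 1.0 for an input of 0.0, roughly 2.0 for 0.5, and 1000.0 for 1.0, which is the non-linearity I described above.)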



Quote:
Original post by spek

As for the second part, at least my "3dPos = cameraPos + viewVec * depth" was correct :).
>> Is in.vertexPos.xy the position in object space or in world space?
Normally I would multiply the in.vertexPos with the ModelViewProjection indeed. But in this case, it isn't necessary. I render a quad with corner coordinates (-1,-1 .. +1,+1), since it needs to stay in front of the camera ('HUD quad'). The vertex shader just passes these coordinates. I don't know in which space they are then, "Projection Space"?. But...... maybe this won't work for calculating the viewVector.

And how to calculate the farplane corners? Sorry for these dumb questions, but all that matrix and space stuff is really difficult to me. Talking about it, what exactly is clip-space, screen-space and view-space? I understand world and object space, but these others are confusing. You get that when a point is transformed into the view frustum or something (where 0,0,0 would be the camera position?) ?

Thanks for helping MJP,
Rick


Let's start with the different coordinate spaces. I'll try to explain as best as I understand, please forgive me if it turns out my own understanding is inaccurate:

View space (also known as eye space) is a coordinate system based on the location and orientation of the camera. The camera position is always <0,0,0> in view space, since it's centered on the camera. This also means that if a certain point is 5 units directly in front of where the camera is facing, it will have a view-space position of <0,0,5>. Since view space is just a translation and a rotation of your original world space, view space can be used for performing lighting calculations. Spaces that involve the perspective projection matrix can't be used for this, since perspective projection is a non-linear operation.

Clip space is the result of transforming a view-space position by a perspective projection matrix. The result of this is not immediately usable, since the x, y, and z components must be divided by the w component to determine the point's screen position. This screen position is referred to as normalized device coordinates. In the vertex shader, the clip-space position is output since it can still be linearly interpolated in this form; once you perform the perspective division (divide by w), you can no longer interpolate linearly, which is needed for rasterization.
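In code, the chain looks something like this (a tiny illustration only; the matrix name is a placeholder, and I'm using the mul(matrix, vector) convention from the OpenGL/Cg snippets in this thread):

float3 objectToNDC(float4 objPos, float4x4 modelViewProjMatrix)
{
    float4 clipPos = mul(modelViewProjMatrix, objPos); // clip space: what the vertex shader outputs
    return clipPos.xyz / clipPos.w;                    // perspective divide -> normalized device coordinates
}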

So getting back to your code... your full-screen quad's vertex coordinates are going to be in some form of post-perspective format (I assume they're pre-transformed), and this means they're not directly usable for calculating vectors, since they're not in world space or view space.

As for your frustum coordinates, they're very easy to calculate. You should check out this article, which explains how the frustum works better than I could; after that you should understand how to calculate any corner of the frustum. The corner I use in my implementation is referred to as "ftr" in that article.
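To give a rough idea (a hedged sketch with made-up names, not code from that article): if you know the vertical field of view, the aspect ratio and the far-plane distance, the eye-space position of the far plane's top-right corner can be computed like this; the other three corners just flip the signs of x and/or y:

float3 farTopRightCorner(float fovY, float aspect, float farZ)
{
    // fovY is the full vertical field of view in radians.
    float halfHeight = farZ * tan(fovY * 0.5);
    float halfWidth  = halfHeight * aspect;
    // In OpenGL eye space the camera looks down -z, so the far plane sits at z = -farZ
    // (a D3D-style left-handed setup would use +farZ instead).
    return float3(halfWidth, halfHeight, -farZ);
}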

EDIT: this is how I actually perform view-space reconstruction in my renderer:

In depth pass:
-multiply object-space position by worldView matrix to get view-space position
-divide view-space z by the farZ of the view frustum

In lighting pass:
-pass in coordinate of upper-right vertex of the view frustum's far clip plane
-viewDirection.xy = (projectedPos.xy/projectedPos.w) * frustumCoord.xy
-viewDirection.z = frustumCoord.z;
-view-space position is then viewDirection * pixelDepth
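A minimal Cg-style sketch of those two passes might look like the code below. All names (matrices, farFrustumCorner, cameraFarZ) are placeholders for whatever your application provides, I'm using the OpenGL mul(matrix, vector) convention, and the negation is there because OpenGL eye space looks down -z; treat it as a sketch of the idea rather than drop-in code:

// Depth pass
void depthVS(float4 objPos         : POSITION,
             out float4 clipPos    : POSITION,
             out float3 eyePos     : TEXCOORD0,
             uniform float4x4 modelViewProjMatrix,
             uniform float4x4 modelViewMatrix)
{
    clipPos = mul(modelViewProjMatrix, objPos);  // clip space, for rasterization
    eyePos  = mul(modelViewMatrix, objPos).xyz;  // eye (view) space, for the depth value
}

float4 depthPS(float3 eyePos : TEXCOORD0,
               uniform float cameraFarZ) : COLOR
{
    // Normalized linear eye-space depth; eyePos.z is negative in GL eye space, hence the minus.
    float d = -eyePos.z / cameraFarZ;
    return float4(d, d, d, 1.0);
}

// Reconstruction pass
void reconstructVS(float4 objPos         : POSITION,
                   float2 uv             : TEXCOORD0,
                   out float4 clipPos    : POSITION,
                   out float2 outUv      : TEXCOORD0,
                   out float3 viewDir    : TEXCOORD1,
                   uniform float4x4 modelViewProjMatrix,
                   uniform float3 farFrustumCorner) // eye-space upper-right corner of the far plane
{
    clipPos    = mul(modelViewProjMatrix, objPos);
    outUv      = uv;
    viewDir.xy = (clipPos.xy / clipPos.w) * farFrustumCorner.xy;
    viewDir.z  = farFrustumCorner.z;
}

float4 reconstructPS(float2 uv      : TEXCOORD0,
                     float3 viewDir : TEXCOORD1,
                     uniform sampler2D depthMap) : COLOR
{
    float depth   = tex2D(depthMap, uv).r;  // normalized linear depth from the first pass
    float3 eyePos = viewDir * depth;        // eye-space position of this pixel
    return float4(eyePos, 1.0);             // written out as a color just for testing
}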

Quote:
Original post by Matt Aufderheide
I am struggling with this too... it seems I am close but not there yet. I wish someone could post a complete shader instead of these little fragments, as well as some pictures that maybe show the interpolants rendered out, so we can test to see if each stage is working...

One of my main problems is that the usual method of drawing a fullscreen quad uses pretransformed vertices, so I can't use a vertex shader for that pass... otherwise I have trouble mapping the pixels to texels perfectly (the screen-sized textures get a bit filtered otherwise).

Does anyone have a good method of drawing a screen-aligned quad that maps the texture perfectly to the screen pixels and uses a vertex shader?


I'm away for the weekend with only my laptop, so unfortunately right now I can't post more than code snippets reconstructed from memory. When I get home on Sunday, I will post some more complete shader code and some pictures of off-screen surfaces.

As for directly mapping pixels to texels, are you using Direct3D9? If you are, this article is required reading.
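From memory, the gist of it is that in D3D9 pixel centers and texel centers are offset from each other by half a pixel, so a full-screen quad needs to be nudged by half a pixel (or its texture coordinates by half a texel). A hedged sketch, assuming the quad is already in clip space with w = 1 and that the render-target size is passed in as a uniform:

void fullscreenQuadVS(float4 pos          : POSITION,  // (-1,-1) .. (+1,+1), w = 1
                      float2 uv           : TEXCOORD0,
                      out float4 clipPos  : POSITION,
                      out float2 outUv    : TEXCOORD0,
                      uniform float2 screenSize)       // render-target size in pixels
{
    clipPos = pos;
    // Shift the quad by half a pixel so each pixel samples its matching texel.
    clipPos.xy += float2(-1.0, 1.0) / screenSize;
    outUv = uv;
}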

Thanks for the very useful information! Unfortunately my head is still messed up from the alcohol yesterday, so learning hurts now :). But I'll print that "space" information out.

I calculate the far plane's top-right (world) position and pass that as a parameter to the vertex shader. Judging from the numbers, I think it's correct. But shouldn't I translate that world position to view space as well? And you say that you multiply the vertex positions for the depth pass with the "worldView matrix". Do you mean the modelViewProjection matrix with that, or just the modelView? In OpenGL / Cg I can choose between 4 matrices (texture, projection, modelView, modelViewProjection), in combination with identity, inverse, transpose, or inverse transpose. I suppose you mean the modelViewProjection, since that one gives me a "good looking" result for the depth texture. So, now I have this:

out.vertexPos = mul( modelViewProj, in.vertexPos );
...
out.color.r = out.vertexPos.z / 500; // 500 is the maximum view distance


Now I'm still messing around with those quad coordinates. Normally, the coordinates aren't transformed at all. In OpenGL, I just pass the 4 "screen corner" coordinates like this:

glVertex2f(-1, -1);
glVertex2f( 1, -1);
glVertex2f( 1, 1);
glVertex2f(-1, 1);

In the vertex shader I just copy those values, and that's it. No multiplications with matrices. I don't know in which space the coordinates are... But probably not the right coordinates to calculate the view direction. And/or the frustum TR coordinate is not in the right space either?


// vertex shader
out.Pos.xy = in.Pos.xy; // just copy (for a screen filling quad)

// MVP = modelViewProjection Matrix
// farTR is the Farplane top right position, in world space
float4 projPos = mul( MVP, iPos ); // ???

out.ViewDir.xy = (projPos.xy / projPos.w) * farTR.xy;
out.ViewDir.z = farTR.z;

// Fragment shader
in.viewDir = normalize( in.viewDir );
float depth = tex2D( depthMap, texcoords );
float3 pos3D = cameraPos.xyz + in.viewDir * depth;



Do you normalize the viewDirection? Sorry, but I ask everything in detail. I feel that I'm "close, but no cigar", and that could depend on little stupid errors.

Thanks for the help again!
Rick

Quote:
Original post by spek
In the vertex shader I just copy those values, and that's it. No multiplications with matrices. I don't know in which space the coordinates are... But probably not the right coordinates to calculate the view direction.

The values you pass in with glVertex*() aren't inherently in any space. The space they're in is defined by the transformations you perform on them in your vertex shader. For instance, if you do this:
out.Pos = mul(worldViewPerspective, in.Pos);
It means the input values were in model space, otherwise you wouldn't have performed the world transformation on them. If you instead used the viewPerspective matrix, it means the input values were in world space, which is why you didn't need the world transformation but still needed the view transformation. Likewise, the perspective transform would mean all values were in view space, which is why no view space transformation was needed. Finally, if you just set the output equal to the input, it would mean all values were already in perspective space. Thus glVertex*() is just a means to input "position" values into your vertex shader. How you interpret those values is up to you.

That being said, this is probably the easiest way to do what you need. In the fragment shader:
float4 perspective_position = float4(in.Pos.x, in.Pos.y, tex2D(sceneDepthMap, texcoords.xy).r * in.Pos.w, in.Pos.w);
float4 world_position_4d = mul(invViewPerspective, perspective_position);
float3 world_position_3d = world_position_4d.xyz / world_position_4d.w;
Where in.Pos.xyzw is in perspective space (should already be the case in the fragment shader). This isn't an optimal solution since it requires a matrix multiplication per-fragment, but it is the "omg-I-can't-get-it-to-work-I'm-going-to-cry" solution that works based on the fundamental principles of these transformations. The vector-based solution proposed by MJP (and used by the article you referenced) gives better performance, yet the logic is a little trickier IMO and it requires some more precise coordination between your input values, vertex shaders and fragment shaders so that everything is in the correct space at the right time. You can pursue that approach if you're interested, but I just wanted to give you something to fall back on in the meantime.

Sorry for the late reply, lots of things to do this weekend :) Now it's time to relax and do some programming again.

Thanks again for the notes on matrix multiplications. I know the coordinates are just "values", but the matrix usage often confuses me. I should print this explanation out along with MJP's text as well. One of the problems is probably that I use the wrong matrices. In this case, I don't know how to get the "inverse view perspective" matrix. I use Cg for the shaders, and in combination with OpenGL it offers to pass the following matrices:
- ModelView Matrix
- ModelViewProjection Matrix
- Texture Matrix
- Projection Matrix
In combination with identity/inverse/transpose/inverse transpose.

But I guess I have to construct this inverse view perspective matrix myself, just like you could modify the texture matrix for projective texturing, true? Or is it listed above, but with another name? I'm not very familiar with the technical "jargon". I tried the inverse modelView and modelViewProjection, but that didn't give the right results, probably that is something different...

Greetings,
Rick

First off, sorry for my rambling posts and for taking some time to reply. I've been traveling over the weekend, and my access to the internet was severely limited.

Back to the topic... as for how the various coordinate spaces relate to traditional OpenGL matrix types, I had to go look it up myself since I'm not really familiar with GL. Section 9.011 of the OpenGL FAQ seems to do a good job of explaining it all. According to that, what I call the "worldView" matrix is the same as your "modelView" matrix. In Direct3D, 3 matrices are used for the transformation instead of 2 (world, view, and projection respectively), whereas in GL the first two parts are combined into one matrix. The process goes something like this:


Object Coordinates are transformed by the World matrix to produce World Coordinates.

World Coordinates are transformed by the View matrix to produce Eye (View-space) Coordinates.

Eye Coordinates are transformed by the Projection matrix to produce Clip Coordinates.

Clip Coordinate X, Y, and Z are divided by Clip Coordinate W to produce Normalized Device Coordinates.


Now for the first solution to your problem, getting world-space coordinates from a buffer filled with z/w values: you need a matrix that does the reverse of those last two steps (the inverse viewProjection matrix). But as you've indicated, you don't have a viewProjection matrix to start off with. This means you'll have to create it and invert it in your application, and then pass the matrix as a shader constant. Doing this isn't too hard, since you already have the projection matrix; you just need a view matrix as well. In Direct3D I use a helper function for creating a view matrix, but if you don't have access to such a function in GL it's not a problem to create one. Just think about what a view matrix does: it takes coordinates that are in world space and transforms them so that they are coordinates relative to your camera's position and orientation. This means that if your camera is located at <0,10,0>, you must translate your original coordinate by <0,-10,0>. If the camera is rotated 90 degrees about the y-axis, the coordinate must be rotated -90 degrees about the same axis. So in other words, you must come up with a transformation matrix for your camera and then invert it. Then you can multiply this with your projection matrix, invert the product, and voila: an inverse viewProjection matrix.

[Edited by - MJP on December 2, 2007 6:10:29 PM]

Now as for the second solution to your problem... I fear I've misled you a bit by talking too much about my own specific implementation details. For example, the method I was describing produces eye-space coordinates rather than world-space coordinates, and certain portions would have to be modified for calculating world-space coordinates (specifically: the way I calculate the frustum corners only works for the eye-space coordinates of the corners, or world-space if your camera is not rotated about the z-axis). I also do some things the way I do because I use light volumes and not full-screen quads (with quads, viewDirection can be calculated in the app rather than in the shaders).

So I think I'll just start over and explain an algorithm that you can use for generating world-space coordinates, using full screen quads. Then perhaps I will post some example code, or discuss some specific optimizations.



Step 1: create normalized eye-space depth buffer

In vertex shader:
-calculate eye-space position of vertex (transform by worldView matrix for D3D, modelView matrix for GL)
-pass eye-space position to fragment shader

In fragment shader:
-divide z-component of eye-space position by the z-value of the view frustum's far clip plane
-Output the value to the depth buffer

Step 2: calculate world-space position from depth buffer

In application:
-calculate, for each of the 4 corners of the view frustum's far clip plane, the world-space vector from the camera position to that corner
-pass these vectors to the vertex shader as constants or as values in the vertices of the full-screen quad (the 4 corner vectors should be mapped to the 4 points of the quad)
-pass the world-space position of the camera to the fragment shader
-render the quad

In vertex shader:
-retrieve the frustum-corner vector for the current vertex
-pass as "viewDirection" to the fragment shader (do not normalize!)

In fragment shader:
-do not normalize "viewDirection"!
-read depth value from the depth buffer
-world-space position of the fragment is "cameraPos + (viewDirection * depth)"
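To make that concrete, here is a hedged end-to-end sketch in Cg-style code. The uniform and attribute names are placeholders, the per-vertex far-plane corner is passed in through the normal (any attribute will do), the corner is the world-space vector from the camera to that corner as described above, and the negation in the depth pass is there because OpenGL eye space looks down -z:

// Step 1: depth pass
void depthVS(float4 objPos         : POSITION,
             out float4 clipPos    : POSITION,
             out float3 eyePos     : TEXCOORD0,
             uniform float4x4 modelViewProjMatrix,
             uniform float4x4 modelViewMatrix)
{
    clipPos = mul(modelViewProjMatrix, objPos);
    eyePos  = mul(modelViewMatrix, objPos).xyz;
}

float4 depthPS(float3 eyePos : TEXCOORD0,
               uniform float farZ) : COLOR
{
    float d = -eyePos.z / farZ;   // normalized linear eye-space depth, 0..1
    return float4(d, d, d, 1.0);
}

// Step 2: full-screen quad pass
void quadVS(float4 quadPos       : POSITION,   // (-1,-1) .. (+1,+1), already clip space
            float2 uv            : TEXCOORD0,
            float3 farCornerVec  : NORMAL,     // world-space vector: far-plane corner minus camera position
            out float4 clipPos   : POSITION,
            out float2 outUv     : TEXCOORD0,
            out float3 viewDir   : TEXCOORD1)
{
    clipPos = quadPos;       // no matrix needed for the quad
    outUv   = uv;
    viewDir = farCornerVec;  // do NOT normalize; the interpolator blends the four corners
}

float4 quadPS(float2 uv        : TEXCOORD0,
              float3 viewDir   : TEXCOORD1,
              uniform float3 cameraPos,        // world-space camera position
              uniform sampler2D depthMap) : COLOR
{
    float depth     = tex2D(depthMap, uv).r;
    float3 worldPos = cameraPos + viewDir * depth;   // again, do NOT normalize viewDir
    return float4(worldPos, 1.0);                    // written as a color just for testing
}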


[Edited by - MJP on April 9, 2008 2:33:30 PM]

Quote:
Original post by spek
Thanks again for the notes on matrix multiplications. I know the coordinates are just "values", but the matrix usage often confuses me. I should print this explanation out along with MJP's text as well. One of the problems is probably that I use the wrong matrices. In this case, I don't know how to get the "inverse view perspective" matrix. I use Cg for the shaders, and in combination with OpenGL it offers to pass the following matrices:
- ModelView Matrix
- ModelViewProjection Matrix
- Texture Matrix
- Projection Matrix
In combination with identity/inverse/transpose/inverse transpose.

But I guess I have to construct this inverse view perspective matrix myself, just like you could modify the texture matrix for projective texturing, true? Or is it listed above, but with another name? I'm not very familiar with the technical "jargon". I tried the inverse modelView and modelViewProjection, but that didn't give the right results, probably that is something different...

That's my fault; you'll want to use the combination CG_GL_MODELVIEW_PROJECTION_MATRIX / CG_GL_MATRIX_INVERSE. I should have mentioned before that the view matrix in Direct3D is the modelview matrix in OpenGL. The major difference between the two is that the OpenGL modelview matrix is also responsible for transforming from model space to world space, whereas Direct3D has a separate world matrix for that. This means you'll need to make sure that the modelview transformation stack contains only the view transformation (i.e. something generated by gluLookAt), or else the results you get from the shaders will be in model space. It shouldn't be a problem if you're properly using the stack, but I'm just putting it out there.

No need to apologize, you guys always help me here!

I'm getting close... I followed MJP's way of doing it. Now the results (3D positions) seem to be almost correct. I'm still using a 16-bit texture, so there is probably some inaccuracy in the depth.

But the real problem is still with the matrices, I think. When I start moving the camera (I start at ~0,0,0), the results quickly turn wrong. As you might have noticed, I'm bad with matrices. But I suppose this is because the camera position is used in the OpenGL ModelViewProjection matrix (is that the difference with the D3D view matrix)? So far I haven't done anything with the stack like you guys warned me about, so probably it's going wrong there.


// Application
... Calculate farplane coordinates; I pass them later on as the normals of that quad

// Before rendering the depth and quad, pass the camera matrix


// Depth Vertex Shader
// in.pos is an absolute world coordinate
out.Pos = mul( modelViewProjectionMatrix, in.pos );
// Depth Fragment Shader
out.color.r = out.pos.z / 500; // test 500 is max(test) view distance


// Test Quad Vertex Shader
// Just pass. Quad XY Coords are (-1,-1) (-1,+1) (+1,+1) and (+1,-1)
// Already in eye-space... right?
out.pos.xy = iPos.xy;

// Farplane coordinate (mapped on quad) is placed in the normal iNrm
// MVPI = Inverse ModelViewProjection Matrix, Cg OpenGL:
// CG_GL_MODELVIEW_PROJECTION_MATRIX, CG_GL_MATRIX_INVERSE
float4 worldPos = mul( MVPI, in.pos );

// The farplane world coordinates are stored inside the 4 normals
out.viewDir.xy = (worldPos.xy / worldPos.w) * normal.xy;
out.viewDir.z = normal.z;

// Test Quad Fragment Shader
float pDepth = f1tex2D( sceneDepth, iTex.xy ).r;
float3 pPos3D = cameraPos.xyz + in.viewDir.xyz * pDepth;
out.color= pPos3D; // test the result

I think the inverse modelview matrix is not in the right state when I pass it to the shaders.

Some other small questions:
- @MJP: the "farplane.z", is that the distance between the camera and the far plane (the maximum view distance)?
- @Zipster: when using your method, how do I calculate the depth? In the same way as in MJP's method (out.pos.z / farplane.z), or differently (out.pos.z / out.pos.w ...)?

[another edit]
I get the "same" results with Zipster's implementation (but with the depth changed to z/w instead of z/farplane.z). I also use the inverse modelViewProjection matrix here ("InvProj"):

float4 perspective_position = float4(iPos.x, iPos.y, pDepth * iPos.w, iPos.w);
float4 world_position_4d = mul(InvProj, perspective_position);
float3 world_position_3d = world_position_4d.xyz / world_position_4d.w;

Yet again, when I start moving away from 0,0,0 with the camera, the results go wrong.


Thanks for the detailed answers! Maybe I'll understand those matrices someday :)
Rick

[Edited by - spek on December 3, 2007 1:10:06 PM]

Hey guys!

I'm currently working on a deferred shading engine. It's up and running in its basic form, and I also have to calculate the view-space coordinates of each fragment from depth in subsequent lighting passes.

I acknowledge that the approach I use for the secondary pass is different (I literally draw a bounding volume with the current projection matrix as if it were still the geometry pass), whereas you guys are drawing a fullscreen quad in orthogonal projection (?) to do further processing.

I use OpenGL, but I'm sure DX could use a similar methodology since we are all using the same hardware.

I set up my frame buffer object with the respective colour buffers (standard 32-bit RGBA) along with the standard depth and stencil buffer (24-bit + 8-bit) bound as a texture (as opposed to a straightforward non-texture render target). Later lighting passes use that generic 24-bit depth buffer, read as a texture, to derive the screen-space location per fragment.

Obviously, that depth buffer doesn't store the value linearly: it gives higher precision closer to the viewer and less precision out into the distance. So it needs to be converted during any post passes. The equation for that is quite simple and can be found here, under the heading "The Resolution of Z"; 'a' and 'b' can be precalculated and handed to the frag shader to simplify that calculation.

During the lighting pass I draw my bounding volume in perspective projection space, as I mentioned above. Within the vertex shader I send a varying variable to the frag shader containing the view-space location of the bounding volume's fragment (gl_ModelViewMatrix * gl_Vertex). In the frag shader, I now have enough information to derive the view-space location of the actual fragment to be lit.

Since the bounding volume's fragment location lies on the same ray cast from the viewer (0,0,0) through the fragment to be lit, you can derive this location quite easily (two known points, and a third point with one known value, Z).

I realize you guys aren't drawing a bounding volume like me, but is there any reason why you couldn't draw "something" (a quad) to cover the entire view and derive everything from that? Seems likely to me. You could still calculate the world location by using the inverse modelview matrix. Since the depth buffer uses a 24-bit non-linear range, precision should be more than sufficient.

Anyway, I could be way off track. Good luck with it all!

Quote:
Original post by MJP

So I think I'll just start over and explain an algorithm that you can use for generating world-space coordinates, using full screen quads. Then perhaps I will post some example code, or discuss some specific optimizations.


MJP: Thank you very much! This explanation is all I needed; now it works perfectly, fast and with perfect quality. It's great to eliminate the need for an entire render target and just use depth. Now I just use 2 render targets for my deferred renderer... Thanks again.

@Hibread
The full-screen quad doesn't have much to do with deferred shading in this case; I'm trying to implement Screen Space Ambient Occlusion. You calculate the depth differences over the entire screen, so that's why. Nevertheless, I'm also doing deferred shading, so there probably will be a point where I need to calculate the world positions for light volumes as well (so far I'm writing x, y and z into a texture, but that's a waste of space of course).

MJP's implementation seems to be the fastest, since only a few basic instructions are needed in the fragment shader. Zipster showed another basic way to do it, which I'll use to verify the results now.

Anyway, thank you too for the tips!
Rick

Quote:
Original post by Matt Aufderheide
Quote:
Original post by MJP

So I think I'll just start over and explain an algorithm that you can use for generating world-space coordinates, using full screen quads. Then perhaps I will post some example code, or discuss some specific optimizations.


MJP: Thank you very much! This explanation is all I needed; now it works perfectly, fast and with perfect quality. It's great to eliminate the need for an entire render target and just use depth. Now I just use 2 render targets for my deferred renderer... Thanks again.


You're very welcome! It's something I've also found to be very useful for deferred renderers, and having spent a good deal of time figuring it out myself, I'm always willing to try to make it easier for anyone else.

Spek, I think I see your problem. When rendering your depth buffer for the "MJP method" (modest, aren't I?), you need to calculate the position in eye space before dividing by farplane.z (yes, this is the distance from the camera to your far frustum plane). What you seem to be doing is calculating its clip-space position and then using that. At least, this is based on my assumption that you mean in.pos is in object space, rather than world space. If it were already in world space, multiplying by the modelViewProjection matrix would add an extra transform and everything would come out wrong.

Since what you want is eye space, you should be doing this:


// Depth Vertex Shader
// in.pos is in object space
out.pos = mul( modelViewProjectionMatrix, in.pos ); //out.pos is in clip space
out.eyePos = mul( modelViewMatrix, in.pos ); //out.eyePos is in eye space

// Depth Fragment Shader
out.color.r = out.eyePos.z / 500; // test 500 is max(test) view distance


Now for the Zipster method, what you want is the z component of the position in normalized device coordinates. If you remember the GL FAQ, you get normalized device coordinates by taking the point in clip space (i.e. multiplied by modelViewProjection) and dividing by the w component. So your shaders would look like this:


// Depth Vertex Shader
// in.pos is in object space
out.pos = mul( modelViewProjectionMatrix, in.pos ); //out.pos is in clip space

// Depth Fragment Shader
out.color.r = out.pos.z / out.pos.w;

Quote:
Original post by spek

out.color.r = out.pos.z / 500; // test 500 is max(test) view distance


// Test Quad Vertex Shader
// Just pass. Quad XY Coords are (-1,-1) (-1,+1) (+1,+1) and (+1,-1)
// Already in eye-space... right?
out.pos.xy = iPos.xy;

// Farplane coordinate (mapped on quad) is placed in the normal iNrm
// MVPI = Inverse ModelViewProjection Matrix, Cg OpenGL:
// CG_GL_MODELVIEW_PROJECTION_MATRIX, CG_GL_MATRIX_INVERSE
float4 worldPos = mul( MVPI, in.pos );

// The farplane world coordinates are stored inside the 4 normals
out.viewDir.xy = (worldPos.xy / worldPos.w) * normal.xy;
out.viewDir.z = normal.z;

// Test Quad Fragment Shader
float pDepth = f1tex2D( sceneDepth, iTex.xy ).r;
float3 pPos3D = cameraPos.xyz + in.viewDir.xyz * pDepth;
out.color= pPos3D; // test the result


Okay, now for the second part: actually using your depth buffer to derive the world-space position. You seem to be combining parts of my method with parts of Zipster's method, with a dash of the things I was confusing you with previously. Are you storing all four corners of the frustum separately, with one corner for each quad vertex, or are you just storing the position of the upper-right corner? If you're storing all four corners, then things should be very simple for you:


// Test Quad Vertex Shader
// Just pass. Quad XY Coords are (-1,-1) (-1,+1) (+1,+1) and (+1,-1)
// (these are actually in normalized device coordinates)
out.pos.xy = iPos.xy;

// The farplane world coordinates are stored inside the 4 normals
out.viewDir = normal;

// Test Quad Fragment Shader
float pDepth = f1tex2D( sceneDepth, iTex.xy ).r;
float3 pPos3D = cameraPos.xyz + in.viewDir.xyz * pDepth;
out.color= pPos3D; // test the result


Now for Zipster's method, you're probably just not setting up your matrices right. You would need to do this:


Application:
-Create a view matrix using gluLookAt
-Set this view matrix as your modelView matrix in the stack
-use the same projection matrix you've been using all along

//vertex shader
// Just pass. Quad XY Coords are (-1,-1) (-1,+1) (+1,+1) and (+1,-1)
out.pos.xy = iPos.xy;

//fragment shader

float pDepth = f1tex2D( sceneDepth, iTex.xy ).r;
float4 perspective_position = float4(in.pos.x, in.pos.y, pDepth * in.pos.w, in.pos.w);

// MVPI = Inverse ModelViewProjection Matrix, Cg OpenGL:
// CG_GL_MODELVIEW_PROJECTION_MATRIX, CG_GL_MATRIX_INVERSE

float4 world_position_4d = mul(MVPI, perspective_position);
float3 world_position_3d = world_position_4d.xyz / world_position_4d.w;



Quote:
Original post by hibread

I realize you guys aren't drawing a bounding volume like me, but is there any reason why you couldn't draw "something" (a quad) to cover the entire view and derive everything from that? Seems likely to me. You could still calculate the world location by using the inverse modelview matrix. Since the depth buffer uses a 24-bit non-linear range, precision should be more than sufficient.


You could do it this way, but you are doing a bit of redundant calculation that can be done in the application instead. It's likely that you have the corners of your frustum stored already in either world space or eye space, so it's easy to just pass them along with the vertex info for the quad. In my renderer I actually calculate the quad positions in my vertex shader, but I'm probably going to make changes to pre-calculate them in the application. The difference will probably be minimal, though.

For light volumes, I do it almost exactly the way you do. The only difference is that I render my own depth buffer (I'm using D3D9, where it's quicker to just render your own depth buffer rather than access the device z-buffer), so I just store depth as a linear value.


Ouch my head.

I got the "Zipster method" working. I don't know what I did, but a little change somewhere made it work. Well, I have quite big banding artifacts (lines... or whatever you call them). But I suppose that is because I'm using a 16-bit float buffer instead of 32. I thought I'd just switch to 32F, but my GeForce 7600 Go (laptop) card doesn't seem to support that. Weird. Gonna try it on the Big Computer.

Instead of creating a "view matrix" (the one that isn't provided by OpenGL), I still use the inverse of the ModelViewProjection matrix. Something happened with that modelView stack, I guess... Anyway, I'm happy it works! But now I want the fast method of course, the infamous "MJP method". It almost gives the same result as the other, except when I start moving away from 0,0,0 and/or rotating. This is still with the old shader.

So I used the modelView matrix in the depth pass instead of the ModelViewProjection. But that gives even odder results. Maybe I'm still using the wrong matrix in the other vertex shader as well?

Greetings,
Rick

Quote:
Original post by MJP
Quote:
Original post by hibread

I realize you guys aren't drawing a bounding volume like me, but is there any reason why you couldn't draw "something" (a quad) to cover the entire view and derive everything from that? Seems likely to me. You could still calculate the world location by using the inverse modelview matrix. Since the depth buffer uses a 24-bit non-linear range, precision should be more than sufficient.


You could do it this way, but you are doing a bit of redundant calculation that can be done in the application instead. It's likely that you have the corners of your frustum stored already in either world space or eye space, so it's easy to just pass them along with the vertex info for the quad. In my renderer I actually calculate the quad positions in my vertex shader, but I'm probably going to make changes to pre-calculate them in the application. The difference will probably be minimal, though.

For light volumes, I do it almost exactly the way you do. The only difference is that I render my own depth buffer (I'm using D3D9, where it's quicker to just render your own depth buffer rather than access the device z-buffer), so I just store depth as a linear value.


G'day MJP!

I'm not sure what you mean by redundant calculations. I'm basically doing no work at all to derive the xyz value in view space. Here are my code snippets.

// Geometry Pass

- Do nothing special... we're using the generic 24bit depth texture attachment, so just make sure you have depth writes enabled at some stage.


// Lighting passes (this could be any pass really where you want the xyz location of the fragment)

Application Setup

Vector2 vec = pCam->DepthNearFarPlane();	
GLfloat depthParamA = vec.y / ( vec.y - vec.x);
GLfloat depthParamB = vec.y * vec.x / ( vec.x - vec.y);

glUniform2f( m_pShader->m_depthParamsID, depthParamA, depthParamB);


Using the formula here again, storing 'a' and 'b' in a uniform vec2.

Vertex Shader

varying vec3 volumeCoords;

void main()
{
    gl_Position = ftransform();
    volumeCoords = (gl_ModelViewMatrix * gl_Vertex).xyz;
}


Frag Shader

These are obviously only the relevant parts of the shader:

#extension GL_ARB_texture_rectangle : enable
varying vec3 volumeCoords;
uniform vec2 depthParams;
uniform sampler2DRect depth;

void main()
{
    // ... other stuff ... //

    // - negate depth buffer value to coincide with the depth value in modelview
    float fragDepth = -depthParams.y / ( texture2DRect( depth, gl_FragCoord.xy ).r - depthParams.x ); // Convert from non-linear to linear
    vec3 fragLocation = vec3( volumeCoords.xy * fragDepth / volumeCoords.z, fragDepth ); // Calculate the fragment's location in view space

    // ... other stuff ... //
}


That's all there is to it. I don't think that's too heavy or contains any redundant calculations.

With regard to D3D9 and the way you do it (to quote: "I'm using D3D9, where it's quicker to just render your own depth buffer rather than access the device z-buffer"), I feel something isn't quite right there. Firstly, reusing the actual depth buffer saves unnecessary space and bandwidth. And since you are storing the data linearly, precision up close to the camera would be far from adequate, I would have thought? If you were storing depth linearly using 16-bit floats, that's only 10 bits of precision. I could be very wrong on this...

Cheers!

hibread:

G-day to you as well!

When I was speaking of redundant calculations, I wasn't speaking of your specific approach for light volumes. When using light volumes, I can't think of a way to do it that is better than the approach you're taking. I was speaking of spek's situation, where he is using a full-screen quad with pre-transformed vertices. Those coordinates would have to be transformed to eye space or to world space, which are the "calculations" I was referring to. That is probably best left outside the shader, IMO.

As for my depth buffer... while it's true you utilize precision better when storing post-perspective z/w, there's not much difference when your near clip plane is relatively close to your camera position (which is often the case). Also, having to convert to linear z is undesirable. I use 32-bit floats for storing my depth, so precision isn't a problem. I'd imagine it would be at 16 bits, no matter how you store your depth.

Also, if you know of any convenient way of accessing the depth buffer as a texture in Direct3D 9, then I'd love to hear it! But AFAIK there's no good way of doing such a thing. Once I move on to D3D10, things should be much nicer in that regard.

[Edited by - MJP on December 4, 2007 1:25:55 PM]

Hi!
I've been struggling for weeks to get this working:
Discussion @ OpenGL.org

I can reconstruct position from linear depth stored in a color texture, but not using a depth attachment. The code for this is from one of my profs; I don't really understand what I'm doing there (it works fine, though). Take a look at the initial post in the thread I linked. I render light hulls for my spot lights, just like hibread, so I thought I should give his approach a try, but nothing is rendered.

Any idea what might be wrong? The way I render the light hulls? Projection matrix settings? :( Thanks!

//edit: oh, and another thing... I render my light hulls just on top of the scene, without depth test. If I could use the depth test, then the section that I marked red would be culled:



However, I don't have the depth buffer from the initial pass... well, I can render it into a texture, but would that help?

[Edited by - Vexator on December 4, 2007 6:50:13 AM]

