Why are these three methods for determining depth giving me different output?


Hi! I'm a beginner and an amateur, just trying to learn, explore, and play. I'm using MikuMikuDance in conjunction with MikuMikuEffect as an engine in order to explore HLSL (although I'm on the lookout for any other engines that will let me play with shaders as easily as I can with MMD). The engine uses DX9 and I'm using shader model 3.0. MMD and MME are not open source, and I'm not familiar with other frameworks, so I never know whether any problems I run into are entirely my own fault or whether they involve something specific to the engine.

I'm currently exploring techniques to write depth to an offscreen render target. I ended up playing with three different ways to write camera-space depth, but they all give me slightly different output, and I don't know why. I believe that understanding why would help me understand more than just the writing of depth.

Here's a section of my depth_VS. I write the length of Eye (Out.DepthV below) to an R32F render target and read it in a post-processing effect that rescales it by a constant. The matrices and CameraPosition are provided to the shader by the engine.

    Pos = mul(Pos, WorldMatrix);
    float3 Eye;
    float3 CamPos = CameraPosition;
    //CamPos = float3(ViewMatrix._41, ViewMatrix._42, ViewMatrix._43);
    Eye = CamPos - Pos.xyz;
    Pos = mul(Pos, ViewMatrix);
    //Eye = Pos.z;
    Out.Pos = mul(Pos, ProjMatrix);
    Out.DepthV = length(Eye);

The two commented-out lines are alternate ways of determining depth that seem to me like they should give the same output. All three techniques give something that looks like view depth--they have similar values, and they change roughly as expected as I move the camera around the scene. Yet all three give slightly different output from each other.

Thanks in advance for any help. I've been looking for good forums to ask for help regarding HLSL for a while. If this isn't a good forum for it, I apologize, and please let me know!


//CamPos = float3(ViewMatrix._41, ViewMatrix._42, ViewMatrix._43);

The camera position is not stored in the 4th row of the view matrix.

Eye = CamPos - Pos.xyz;

I think you mean Pos.xyz - CamPos, which would be the vector from the camera origin to point Pos. However, this doesn't take into consideration the orientation of the camera.

Pos = mul(Pos, ViewMatrix);

This does take the orientation of the camera into consideration.

//Eye = Pos.z;

Assigning scalar to vector?

Out.Pos = mul(Pos, ProjMatrix);

Out.DepthV = length(Eye);

Length is not the same as just the z-coordinate Pos.z.

-----Quat

Thank you for your response.

The camera position is not stored in the 4th row of the view matrix.

Apparently not. But I was under the impression that the view matrix contained only rotation and translation data (no scale or skew), and that under those circumstances the translation could be read out of the fourth row of the matrix?

I think you mean Pos.xyz - CamPos, which would be the vector from the camera origin to point Pos. However, this doesn't take into consideration the orientation of the camera.

My sources have indeed been using CameraPosition - PosWorld for their Eye variable. Is it more typical in your experience to call the opposite of this vector Eye? (It has the same length either way.)

I think you mean Pos.xyz - CamPos, which would be the vector from the camera origin to point Pos. However, this doesn't take into consideration the orientation of the camera.

I've been using Eye as a world-space vector, where you wouldn't want to take camera orientation into account. For example, to get the half vector in world space, so you don't have to do extra transformations on normals and light vectors. And, of course, depth is depth, regardless of its orientation. Is Eye used more commonly as a view-space vector in your experience?

Assigning scalar to vector?

Yeah, thanks :)

Length is not the same as just the z-coordinate Pos.z.

And yeah, I can see now that, even if I weren't screwing it up with the float3 = float conversion, I'd be dropping the x/y camera-space components with this.
Your "view" matrix is the inverse of the camera's transformation matrix: it brings coordinates from world-space into the camera's local frame of reference. So if you took the camera's position in world space and transformed it by your view matrix, you would get a result of (0, 0, 0). If you were to create a matrix that contained the camera's transformation (the combined rotation and translation), then the last row would be the camera's position.

As for the different depth values, it looks like you're choosing between two different possibilities here. The first one, which you currently have in your code, is the distance between the camera and the surface point being shaded:

Pos = mul(Pos, WorldMatrix);
float depth = length(Pos.xyz - CamPos); // .xyz, since Pos is a float4 here

This works, and is totally fine as long as "Pos" and "CamPos" are in the same coordinate space. So if they're both in world space, then everything works out. Note that it doesn't matter whether you do "Pos.xyz - CamPos" or "CamPos - Pos.xyz" here: both are equivalent for the purposes of computing the distance. The above calculation is also equivalent to the following:

Pos = mul(Pos, WorldMatrix);
Pos = mul(Pos, ViewMatrix);
float depth = length(Pos.xyz);   // CamPos is (0, 0, 0) in view space; use .xyz so w doesn't contribute

So if you transform a position to view space, then it's implicitly local to the camera. As I mentioned earlier, the camera is at (0, 0, 0) in view space, so you can just compute the length of the view-space position to get the distance from the camera.

Now let's talk about using just the "z" component of your view space position:

Pos = mul(Pos, WorldMatrix);
Pos = mul(Pos, ViewMatrix);
float depth = Pos.z;

This is valid, but it gives you a different value than what we computed earlier. The previous value was the distance between the camera and the surface; this time we've computed the result of projecting the camera->surface vector onto the camera's local Z axis. Here's a quick diagram:

[attachment=34082:Depth_Projection.png]

So in this diagram the blue point at "C" is the camera, and the blue arrow is the direction it's facing (AKA its local Z axis). The light blue point at "P" is the surface being shaded. The length of the red line represents the first depth value: the absolute distance between the camera and the surface. The length of the green line is the other depth value: the projection of the camera->position vector onto the camera's local Z axis. The length of this projection is proportional to the cosine of the angle between the C->P vector and the local Z axis, which makes it easy to compute with a dot product. In world space you would do it like this:

Pos = mul(Pos, WorldMatrix);
float3 camToPos = Pos.xyz - CamPos;
float projectedDepth = dot(camToPos, CamZAxis); // assuming that CamZAxis is normalized

If you do the same thing in view space, then CamZAxis is implicitly (0, 0, 1) since it's the local Z axis. So the dot product ends up giving you camToPos.x * 0 + camToPos.y * 0 + camToPos.z * 1, which ends up just being camToPos.z. And since camToPos = Pos - (0, 0, 0) in view space, you can just do Pos.z.
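
If you want to convince yourself of that equivalence numerically, here's a sketch (with "worldPos" standing in for your world-space position, and assuming the view matrix contains no scale, so the camera's forward axis can be read out of the third column of its upper 3x3):

// World-space version: project camera->surface onto the camera's forward axis.
float3 camToPos = worldPos.xyz - CamPos;
float3 camZAxis = float3(ViewMatrix._13, ViewMatrix._23, ViewMatrix._33);
float projWorld = dot(camToPos, camZAxis);

// View-space version: just take z.
float projView = mul(float4(worldPos.xyz, 1), ViewMatrix).z;

// projWorld == projView, up to floating-point error.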

This might seem obvious, but it's worth pointing out that the projected depth doesn't depend on the surface's position with respect to the camera's local X and Y directions. What this means is that you could move "P" anywhere along the orange line and you will get the same depth value. This is not true if you use absolute distance.
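
A concrete example, using two view-space points with the same z:

// Both points have the same projected depth, but different distances:
float3 a = float3(0, 0, 5); // straight ahead: a.z = 5, length(a) = 5
float3 b = float3(3, 4, 5); // off to the side: b.z = 5, length(b) = sqrt(50), ~7.07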

So now that we've gone through all of that, which one should you use? That really depends on what you're going to do with the depth value. Either one is useful in different scenarios, and you can always compute one from the other with a bit of math. Note that the value stored in the GPU's depth buffer is a projected depth rather than an absolute distance; it's also warped non-linearly as part of the perspective projection (post-projection depth is a linear function of 1/z).
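
As an example of going from one to the other, a common trick in a post pass is to scale the projected depth along the per-pixel view ray (just a sketch; "viewRay" is an assumed input that your post effect would reconstruct from the pixel's NDC position and the projection parameters):

// viewRay: view-space direction through this pixel, scaled so that viewRay.z == 1.
// projectedDepth: the view-space z we stored earlier.
float dist = projectedDepth * length(viewRay); // absolute camera->surface distance
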
Thank you so much for the explanation. It means a lot.

I see now that CameraPosition == float3(ViewInv._41, ViewInv._42, ViewInv._43). I think I was confused because ViewInv._41_42_43 does not equal -ViewMatrix._41_42_43 when there's any rotation involved. It's clear that I have to be more careful about how I think about my matrices.
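
For anyone else following along, the algebra behind that (a sketch, assuming a pure rotation + translation view matrix): the view matrix's translation row is the camera position rotated into view space and negated, so you can recover the camera position by undoing the rotation:

// The 4th row of ViewMatrix is -CameraPosition * R^T, where R is the camera's
// rotation and the upper 3x3 of ViewMatrix is R^T, so undoing it gives:
float3 t = float3(ViewMatrix._41, ViewMatrix._42, ViewMatrix._43);
float3 camPos = -mul(t, transpose((float3x3)ViewMatrix)); // == ViewInv._41_42_43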

I understand now why I was getting different results for all three.

I'll try to give myself a little more time to solve my own problems in the future :)
