I have been writing a DirectX / OpenGL rendering engine recently. As you may know, DirectX is by default associated with a left-handed coordinate system (LH) and OpenGL with a right-handed system (RH). You can compare both of them in the article title image to the right. You can also look at those two systems in another way. If you want to look in a positive direction, for LH you have Y as the UP axis and for RH you have Z as the UP axis. If you don't see it, rotate the RH system in the image. Today, in the time of shaders, you can use either system with either API, but you need to take care of a few things.
I have calculated both versions of the matrices for both systems. I am tired of remembering everything and/or calculating it all over again. For this reason I have created this document, where I summarize the needed combinations and some tips & tricks. This is not meant to be a tutorial on “How projection works” or “Where those values come from”. It is for people who are tired of looking up how to convert one system to another or one API to another. Or it is for those who don't care “why” but are happy to copy & paste my equations (however, don't blame me if there is something wrong).
The RH system has become something of a standard in computer graphics. However, for my personal purposes, the LH system seems more logical to visualise. In my engine, I wanted to leave the decision to the user. In the end, my system supports both orientations.
If we look more closely at DirectX and OpenGL, we can see one important difference in the projection. It doesn't matter whether we use an LH or RH system: in DirectX, projected depth is mapped to the interval [0, 1], while in OpenGL it is mapped to [-1, 1]. What does that mean? If we take the near clipping plane of a camera, it will always be mapped to 0 in DirectX, but in OpenGL it is more complicated. For the LH system, near will be 1, but for RH, it will become -1 (see graphs 5 and 6 in a later section). Of course, we can use the DirectX mapping in OpenGL (not the other way around), but in that case we are throwing away half of the depth buffer precision. In the following sections, we will discuss this more closely.
Personally, I think that whoever invented the OpenGL depth coordinates must have had a twisted sense of humour. DirectX's solution is far better and easier to understand.
[Note] The matrix order used in this article is row based. All operations are done in the order vector · matrix (as we can see in (1)), with indexing from (2).
For a column based matrix, the order of operations is reversed: matrix · vector (as we can see in (3)). You also need to change the positions of the matrix elements, as you can see from the example.
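Since equations (1) to (3) are not reproduced here, the following is only a minimal sketch of the two conventions. The helper names (`vec_mul_mat`, `mat_mul_vec`, `transpose`) and the translation matrix are my own illustration, not something taken from the original equations. It shows that a row based matrix used as vector · matrix gives the same result as its transpose used as matrix · vector.

```python
def vec_mul_mat(v, m):
    """Row based convention: result = v . M (v is a row vector)."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


def mat_mul_vec(m, v):
    """Column based convention: result = M . v (v is a column vector)."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def transpose(m):
    """Swap rows and columns of a 4x4 matrix."""
    return [[m[j][i] for j in range(4)] for i in range(4)]


# Translation by (2, 3, 4) in the row based layout:
# the offsets sit in the last ROW of the matrix.
translate_row = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [2.0, 3.0, 4.0, 1.0],
]

point = [1.0, 1.0, 1.0, 1.0]

# vector . matrix with the row based matrix
row_result = vec_mul_mat(point, translate_row)

# matrix . vector needs the transposed matrix (offsets in the last COLUMN)
col_result = mat_mul_vec(transpose(translate_row), point)
```

Both `row_result` and `col_result` come out as [3.0, 4.0, 5.0, 1.0]; only the layout and the order of multiplication differ.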
In the days of the fixed function pipeline, that was more problematic than it is today. In the time of shaders, we may use whatever system and layout we want and just change the order of operations or read values from different positions in the matrices.
World to View transformation
In every transformation pipeline, we first need to transform geometry from world coordinates to view (camera) space. After that, you can do the projection transformation. The view matrix must use the same system as your final projection, so it must be LH or RH. This section is included only for completeness, so you know how to transform a point. There will be no additional details about the view transformation.
The view matrix has the same layout for both systems (4).
The differences are in how the base vectors and the last row elements are calculated. You can see it in table 1.
Vector    LH                RH
look      |wLook - eye|     |eye - wLook|
right     |wUp x look|      |wUp x look|
up        |look x right|    |look x right|
Table 1: View vectors calculation. wLook is the camera lookAt target, eye is the camera position and wUp is the camera up vector, usually [0,1,0]. "x" stands for the vector (cross) product.
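The view matrix itself (4) is not reproduced here, so the following is only a sketch under assumptions: it uses the common look-at construction in the row based layout (base vectors in the columns, translation in the last row), and it treats the vertical bars in table 1 as normalization. The function names are hypothetical. In this sketch the last row is computed the same way for both systems, and only the direction of the look vector changes.

```python
import math


def sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def normalize(a):
    length = math.sqrt(dot(a, a))
    return [c / length for c in a]


def look_at(eye, w_look, w_up, left_handed=True):
    """Build a row based view matrix (assumed layout, see lead-in)."""
    # LH looks along +Z of view space, RH along -Z
    look = normalize(sub(w_look, eye) if left_handed else sub(eye, w_look))
    right = normalize(cross(w_up, look))
    up = cross(look, right)
    return [
        [right[0], up[0], look[0], 0.0],
        [right[1], up[1], look[1], 0.0],
        [right[2], up[2], look[2], 0.0],
        [-dot(right, eye), -dot(up, eye), -dot(look, eye), 1.0],
    ]


def transform_point(p, m):
    """Transform a 3D point with the row based convention (vector . matrix)."""
    v = [p[0], p[1], p[2], 1.0]
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


eye = [0.0, 0.0, -5.0]
target = [0.0, 0.0, 0.0]
w_up = [0.0, 1.0, 0.0]

view_lh = look_at(eye, target, w_up, left_handed=True)
view_rh = look_at(eye, target, w_up, left_handed=False)

target_lh = transform_point(target, view_lh)  # target ends up on +Z
target_rh = transform_point(target, view_rh)  # target ends up on -Z
```

A quick sanity check for any view matrix built this way: the eye position must land at the origin of view space, and the lookAt target must land on the positive Z axis for LH and on the negative Z axis for RH.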
For “3D world” rendering, you will probably use a perspective projection. Most of the time (in about 90% of cases) you will need a simplified perspective matrix (with a symmetric viewing volume). The pattern for such a projection matrix can be seen in (5). As you can see, this pattern is symmetric. For column and row major matrices, this simplified pattern will be the same, but the values of D and E will be transposed. Be aware of this: it can cause some headaches if you do it the other way around and don't notice.
Now, how projection works. Our input data are in view space coordinates, and we need to map them onto our screen. Since our screen is 2D (even if we have a so-called 3D display), every point has to be mapped to a 2D position. Let's take a simple example:
where x, y, z, w is an input point (w is the homogeneous coordinate; if we want to “debug” on paper, the best choice for this value is 1.0). The division by (D · z) is performed automatically after the vertex shader stage.
From equations 6 we have the coordinates of a point on the 2D screen. You may see that those values are not pixel coordinates (like [756, 653]); they are in the range [-1, 1] for both axes (in DirectX and also in OpenGL).
From equation 7 we have the depth of a pixel, in the range [0, 1] for DirectX and [-1, 1] for OpenGL. This value is used in the depth buffer to tell closer objects from more distant ones. Later on, we show what the depth values look like.
Those +1 / -1 values that you obtain after projection are known as normalized device coordinates (NDC). They form a cube, where the X and Y axes are in the interval [-1, 1] for both DirectX and OpenGL. The Z axis is more tricky. For DirectX, you have the interval [0, 1] and for OpenGL [-1, 1] (see figure 2). As you can see now, NDC is an LH system, no matter what input system you have chosen.
Everything that is inside this cube is visible on our screen. The screen is taken as the cube face at Z = 0 (DirectX), Z = 1 (OpenGL LH) or Z = -1 (OpenGL RH). What you see on your screen is basically the content of the NDC cube pressed onto a single plane.
Figure 2: OpenGL (Left) and DirectX (Right) NDC
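The projection step described above can be sketched in a few lines. This is an illustration under assumptions: the pattern (5) is not reproduced here, so I assume the usual row based layout with A and B on the diagonal, C and D in the third row and E in the fourth row, which is consistent with the division by (D · z). The numeric values below are arbitrary and chosen only to make the arithmetic easy to follow; they are not a real projection.

```python
def project(point, A, B, C, D, E):
    """Apply the simplified perspective pattern to a view space point.

    Assumed row based layout (vector . matrix):
        | A 0 0 0 |
        | 0 B 0 0 |
        | 0 0 C D |
        | 0 0 E 0 |
    """
    x, y, z, w = point
    # Output of the vertex shader (clip space)
    clip = [A * x, B * y, C * z + E * w, D * z]
    # Perspective divide, done by the hardware after the vertex shader
    return [clip[0] / clip[3], clip[1] / clip[3], clip[2] / clip[3]]


# Arbitrary example values, w = 1.0 as suggested for "debugging" on paper
ndc = project([1.0, 2.0, 4.0, 1.0], A=1.0, B=1.0, C=2.0, D=1.0, E=-1.0)
# clip = [1, 2, 7, 4]  ->  ndc = [0.25, 0.5, 1.75]
```

The first two components of `ndc` are the screen coordinates and the third one is the projected depth.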
We summarize the computations for the LH / RH systems and for DirectX and OpenGL in two different tables. The values differ between the LH and RH systems and, of course, between the APIs. In the following sections, you can spot the differences. If you are interested in where those values come from, look elsewhere (for example, OpenGL matrices are explained here: Link). There are plenty of resources and it would be pointless to go over it again here.
Table 2: Projection matrix calculation for DirectX. Input parameters are: fovY - field of view in Y direction, AR - aspect ratio of screen, n - Z-value of near clipping plane, f - Z-value of far clipping plane
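The body of table 2 is not reproduced here. As a stand-in, the sketch below uses the values documented for the D3DX helpers `D3DXMatrixPerspectiveFovLH` and `D3DXMatrixPerspectiveFovRH` in the row based layout; treat it as an assumption that these match the table. The Python function names are hypothetical. Here n and f are given as positive distances, and RH points in front of the camera have negative Z.

```python
import math


def perspective_dx(fov_y, aspect, n, f, left_handed=True):
    """Row based DirectX style perspective matrix, depth mapped to [0, 1]."""
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    x_scale = y_scale / aspect
    if left_handed:
        C, D, E = f / (f - n), 1.0, -n * f / (f - n)
    else:
        C, D, E = f / (n - f), -1.0, n * f / (n - f)
    return [
        [x_scale, 0.0, 0.0, 0.0],
        [0.0, y_scale, 0.0, 0.0],
        [0.0, 0.0, C, D],
        [0.0, 0.0, E, 0.0],
    ]


def ndc_depth(m, z):
    """Projected depth of the view space point (0, 0, z, 1) after the divide."""
    clip_z = z * m[2][2] + m[3][2]
    clip_w = z * m[2][3] + m[3][3]
    return clip_z / clip_w


proj_lh = perspective_dx(math.radians(60.0), 16.0 / 9.0, 0.1, 1.0, left_handed=True)
proj_rh = perspective_dx(math.radians(60.0), 16.0 / 9.0, 0.1, 1.0, left_handed=False)
```

With these matrices, `ndc_depth(proj_lh, 0.1)` and `ndc_depth(proj_rh, -0.1)` both give the near plane at depth 0, and the far plane (1.0 for LH, -1.0 for RH) ends up at depth 1.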
Changing only the values in the projection matrix won't work as expected. If we render the same scene with the same DirectX device settings, we end up with turned scene geometry for one of those matrices. This is caused by the depth comparison in the depth buffer. Changing these settings takes a little more code in DirectX than in OpenGL. You need to call the functions in code snippet 1 with the values in table 3.
Code 1: Code snippet settings for LH DirectX rendering
Table 3: DirectX settings for both systems
Table 4: Projection matrix calculation for OpenGL. Input parameters are: fovY - field of view in Y direction, AR - aspect ratio of screen, n - Z-value of near clipping plane, f - Z-value of far clipping plane
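The body of table 4 is not reproduced here either. The sketch below shows only the RH case and uses the matrix documented for `gluPerspective`, transposed into the row based layout used in this article; whether it matches the table entry exactly is an assumption. The LH variant is left out on purpose, because it depends on the convention chosen in the table. The Python function names are hypothetical.

```python
import math


def perspective_gl_rh(fov_y, aspect, n, f):
    """Row based OpenGL style RH perspective matrix, depth mapped to [-1, 1].

    n and f are positive distances; visible points have negative view space Z.
    """
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    x_scale = y_scale / aspect
    C = (f + n) / (n - f)
    D = -1.0
    E = 2.0 * f * n / (n - f)
    return [
        [x_scale, 0.0, 0.0, 0.0],
        [0.0, y_scale, 0.0, 0.0],
        [0.0, 0.0, C, D],
        [0.0, 0.0, E, 0.0],
    ]


def ndc_depth(m, z):
    """Projected depth of the view space point (0, 0, z, 1) after the divide."""
    clip_z = z * m[2][2] + m[3][2]
    clip_w = z * m[2][3] + m[3][3]
    return clip_z / clip_w


proj_gl = perspective_gl_rh(math.radians(60.0), 16.0 / 9.0, 0.1, 1.0)
near_depth = ndc_depth(proj_gl, -0.1)  # near plane
far_depth = ndc_depth(proj_gl, -1.0)   # far plane
```

Here the near plane lands at -1 and the far plane at +1, which is the RH behaviour described earlier in the article.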
Again, changing only the values in the projection matrix won't work as expected. If we render the same scene with the same OpenGL device settings, we end up with turned scene geometry for one of those matrices. This is caused by the depth comparison in the depth buffer. We need to change two things, as we see in table 5.
Table 5: OpenGL settings for both systems
If you set the depth comparison and the depth buffer clear value incorrectly, most of the time you will end up with a result like the one in figure 3. The correct scene should look like figure 4.
Figure 3: Incorrectly set depth function and clear for current projection
Figure 4: Correctly set depth function and clear for current projection
Using equation 7, we can calculate the projected depth for any input value. If we do this for values in the interval [near, far], we get the following result (see figures 5 and 6). Notice the x-axis of the second graph. For the RH system, we need to change the sign of near to -near in order to obtain the same results as for the LH system. In plain language, that means that for LH we are looking in the positive Z direction and for RH we are looking in the negative Z direction. In both cases, the viewer is located at the origin.
Figure 5: Projected depth with DirectX and OpenGL LH matrices (values used for calculation: near = 0.1, far = 1.0)
Figure 6: Projected depth with DirectX and OpenGL RH matrices (values used for calculation: near = -0.1, far = -1.0)
From the graphs above, we can see that for distances close to the camera, there is good precision in the depth buffer. On the other hand, for larger distances the precision is limited. That is not always desired.
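To make the shape of the curve concrete, here is a small sketch that samples the projected depth over [near, far]. It assumes the DirectX LH mapping depth = f / (f - n) · (1 - n / z), which follows from the D3DX style matrix values; the function name is hypothetical and the near / far values are the ones used for figure 5.

```python
def dx_lh_depth(z, n, f):
    """Projected depth in [0, 1] for a view space distance z (DirectX, LH)."""
    return f / (f - n) * (1.0 - n / z)


near, far = 0.1, 1.0

# Ten evenly spaced view space distances from near to far
samples = [near + (far - near) * i / 9 for i in range(10)]
depths = [dx_lh_depth(z, near, far) for z in samples]
```

With near = 0.1 and far = 1.0, a point at z = 0.2 already has a depth of about 0.56. More than half of the depth range is spent on the first tenth of the visible distance, and the remaining nine tenths have to share the rest.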
One possible solution is to keep your near and far distances as close together as possible. There will be fewer problems if you use the interval [0.1, 10] instead of [0.1, 100]. This is not always possible if we want to render large 3D world environments. This issue can, however, be solved, as we show in the next section.
As mentioned before, a classic perspective projection gives us limited depth precision. The bigger the distance from the viewer, the lower the precision. This problem is often noticeable as flickering pixels in the distance.
We can partially solve this with logarithmic depth. We decrease the precision for the near surroundings, but we get an almost linear distribution throughout the depth range. One disadvantage is that the logarithm does not work for negative input. Triangles that are partially visible and have some points behind the viewer (negative Z axis) won't be calculated correctly. Shader programs usually won't crash on the logarithm of a negative number, but the result is undefined. There are two possible solutions to this problem. You either tessellate your scene into triangles so small that the problem won't matter, or you write your depth in a pixel shader.
Writing depth in a pixel shader has a disadvantage: the early depth test, which normally rejects hidden fragments before the pixel shader runs, is turned off. There could be some performance impact, but you can limit it by using this trick only for near geometry that could be affected. To do that, you will need either a condition in your shader or different shaders based on the distance of the geometry from the viewer.
If you use this modification, be aware of one thing: The depth from vertex shader has range [-1, 1], but gl_FragDepth has range [0, 1]. It's again something OpenGL only, since DirectX has depth in [0, 1] all the time.
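The range conversion mentioned above is a simple remap. The sketch below mirrors the depth part of the OpenGL viewport transform with the default depth range of [0, 1]; the function name and parameter names are my own.

```python
def ndc_to_window_depth(z_ndc, range_near=0.0, range_far=1.0):
    """Map NDC depth in [-1, 1] to window depth in [range_near, range_far].

    With the default depth range [0, 1] this is simply 0.5 * z_ndc + 0.5.
    """
    return 0.5 * (range_far - range_near) * z_ndc + 0.5 * (range_far + range_near)


window_near = ndc_to_window_depth(-1.0)  # 0.0
window_mid = ndc_to_window_depth(0.0)    # 0.5
window_far = ndc_to_window_depth(1.0)    # 1.0
```

So if a depth value was computed for the [-1, 1] range and should be written to gl_FragDepth, it has to go through this remap first.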
For a more detailed explanation, you can read an excellent article on the Outerra blog (Link). The equations in their solution use an RH system (they aimed primarily at OpenGL). So once again, we show the same equation in the LH and RH systems. Both versions are in table 6. This time only for OpenGL, since in DirectX the problem can be solved, as proposed in the article, by swapping near and far.
LH: gl_Position.z = (-2.0) * log((-gl_Position.z) * C + 1.0) / log(far * C + 1.0) + 1.0
RH: gl_Position.z = (2.0) * log((gl_Position.z) * C + 1.0) / log(far * C + 1.0) - 1.0
Table 6: Calculation of the new Z coordinate for depth using a logarithm. C is the linearization constant (default value 1.0), far is the camera far plane distance, and gl_Position is the output value from the vertex shader (in perspective projection). You MUST remember to multiply gl_Position.z by gl_Position.w before returning it from the shader.
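The two equations from table 6 can be tried out on the CPU. This is only a sketch: `d` stands for the positive value that goes into the logarithm (gl_Position.z in one equation, -gl_Position.z in the other), and the function names are hypothetical. The multiplication by gl_Position.w is left out, because here we look directly at the value after the perspective divide.

```python
import math


def log_depth_increasing(d, far, C=1.0):
    """Form 2.0 * log(d * C + 1) / log(far * C + 1) - 1.0.

    Goes from -1 at d = 0 to +1 at d = far.
    """
    return 2.0 * math.log(d * C + 1.0) / math.log(far * C + 1.0) - 1.0


def log_depth_decreasing(d, far, C=1.0):
    """Form -2.0 * log(d * C + 1) / log(far * C + 1) + 1.0.

    Goes from +1 at d = 0 to -1 at d = far.
    """
    return -2.0 * math.log(d * C + 1.0) / math.log(far * C + 1.0) + 1.0


far = 1.0
distances = [far * i / 10 for i in range(11)]
curve = [log_depth_increasing(d, far) for d in distances]
```

The two forms are mirror images of each other, so they cover the same [-1, 1] range and differ only in which end the near geometry is mapped to.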
If you have read the Outerra article and looked at my equations, you may notice that I used gl_Position.z in the logarithm calculations. I don't know if it is a mistake by Outerra, but with W, I get nearly the same results for the RH system (as if I used Z), while LH is totally messed up. Plus, W is already the linearized depth (the distance of a point from the viewer). So the first visible point has W = near and the last one has W = far.
If we plot classic vs. logarithmic depth with the equations from table 6, we end up with the following two graphs. The red curve is the same as in the previous chapter; the green one is our logarithmic depth.
Figure 7: Projected depth with classic perspective and with logarithmic one in LH (values used for calculation: near = 0.1, far = 1.0, C = 1.0)
Figure 8: Projected depth with classic perspective and with logarithmic one in RH (values used for calculation: near = 0.1, far = 1.0, C = 1.0)
You can observe the effect of both projections (the classic one and the logarithmic one) in this video (rendered with LH projection in OpenGL):
The last section related to projection will be a little different. So far, we have discussed perspective projection and rendering precision. In this section, another important technique will be converted to the LH and RH systems and to OpenGL / DirectX.
Oblique projection is not some kind of special projection that makes everything shiny. It is the classic perspective projection, only with improved clipping planes. The clipping planes of the classic projection are near and far, but here we change near to get a different effect. This kind of projection is mostly used for rendering water reflection textures. Of course, we can set a clipping plane manually in OpenGL or in DirectX, but that won't work in the mobile version (OpenGL ES) or the web version (WebGL), and in DirectX we would need a different set of shaders. Bottom line: the solution with a clipping plane is possible, but not as clean as oblique projection.
First we need to precompute some data. For clipping, we obviously need a clipping plane, and we need it in the coordinates of our current camera (view) space. This can be achieved by transforming our plane vector with the transposed inverse of the view matrix (we are assuming that the world matrix is set to identity).
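The plane transformation can be sketched as follows. Assumptions: points use the row based convention (point · matrix), a plane is stored as [a, b, c, d] with a·x + b·y + c·z + d = 0, and the view matrix in the example is a plain translation chosen only for illustration. The helper names are hypothetical, and the generic Gauss-Jordan inverse is used just to keep the sketch self-contained.

```python
def invert(m):
    """Invert a 4x4 matrix with Gauss-Jordan elimination and partial pivoting."""
    n = 4
    # Augment the matrix with the identity: [M | I]
    a = [[float(m[i][j]) for j in range(n)] +
         [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [a[r][k] - factor * a[col][k] for k in range(2 * n)]
    return [row[n:] for row in a]


def transform_point(p, m):
    """Row based convention: p' = p . M"""
    return [sum(p[k] * m[k][j] for k in range(4)) for j in range(4)]


def transform_plane(plane, view):
    """Transform a plane with the transposed inverse of the view matrix."""
    inv = invert(view)
    # Equivalent to plane (as a row vector) multiplied by transpose(inverse(view))
    return [sum(inv[j][k] * plane[k] for k in range(4)) for j in range(4)]


# Example view matrix: translation by +5 along Z (row based, offsets in last row)
view = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 5.0, 1.0],
]

world_plane = [0.0, 0.0, 1.0, -2.0]            # the plane z = 2 in world space
view_plane = transform_plane(world_plane, view)  # the plane z = 7 in view space

world_point = [3.0, 1.0, 2.0, 1.0]             # lies on the world plane
view_point = transform_point(world_point, view)
on_plane = sum(view_point[i] * view_plane[i] for i in range(4))  # close to 0
```

The check at the end is the useful part: a point that lies on the plane in world space must still satisfy the plane equation after both the point and the plane have been moved to view space.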
Transform q into camera space by multiplying it with the inverse of the projection matrix. For a simplified calculation, we have already used an inverted projection matrix.
In the DirectX system, we need to be careful, because the original article uses the OpenGL projection space with the Z coordinate in the range [-1, 1]. This is not possible in DirectX, so we need to change the equations and recalculate them with Z in the range [0, 1].
I have added an Excel file with the projection matrices. You can experiment for yourself by changing near and far, or any other parameter, and see the differences in depth. This is the same file that I used to create the posted graphs.