I think I understand the idea behind Shadow Mapping, however I'm having problems with implementation details.

In the VS I need a light position - but I don't have one! I only have a light direction, so what light position should I use?

I have a working camera class, with Projection and View matrices and all - how can I reuse this? I should set the camera position, but how do I calculate the "lookAt" parameter?

Is this supposed to be an orthographic or a perspective camera?

And one more thing - where in the 3D pipeline does the actual write to the Depth Buffer happen? In the PS or somewhere earlier?


37 minutes ago, Bartosz Boczula said:

In the VS I need a light position - but I don't have one! I only have a light direction, so what light position should I use?

I would recommend using the position of the light.  Unless the light is representing the sun, in which case just the direction can be used, you will need the light position.  But you didn't say if the light is a spotlight, point light, etc.  I'll assume spotlight.

53 minutes ago, Bartosz Boczula said:

I have a working camera class, with Projection and View matrices and all - how can I reuse this? I should set the camera position, but how do I calculate the "lookAt" parameter?

Create a new camera for the light if you want, though it's easy enough to build the view and ortho/perspective projection matrices from the position/direction and pass them to the shader. I don't know what programming language or graphics API you're using, so I'm keeping this generic. You shouldn't need to calculate the look-at, you have the direction: just set the camera direction to the light direction.

55 minutes ago, Bartosz Boczula said:

Is this supposed to be an orthographic or a perspective camera?

Spotlight would use a perspective camera.  A point light might generate a cubemap.  A sun would use orthographic.  etc.  Depends completely on the type of light source.

57 minutes ago, Bartosz Boczula said:

And one more thing - where in the 3D pipeline does the actual write to the Depth Buffer happen? In the PS or somewhere earlier?

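Not quite in the pixel shader. The vertex shader just multiplies the vertex by the MVP matrix you pass in, and the rasterizer interpolates the resulting depth for every pixel. The depth test and the write to the depth buffer happen in the fixed-function output-merger stage, after the pixel shader has run (or before it, when the hardware can do an early depth test). The pixel shader only changes the depth value if it explicitly outputs SV_Depth.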

1 hour ago, Bartosz Boczula said:

I think I understand the idea behind Shadow Mapping

I'm not trying to be mean, but I think you were using poor resources as all your questions should have been answered by whatever you were reading/watching to learn from.

These are OpenGL resources but the concepts are the same for D3D, the video is Java + OpenGL but again translating the concepts to C++/C# should be quite simple:


The last video is one of like 8 that show different techniques.


Thanks @Mike2343. Just for clarification, I'm writing in C++ with DX11. My current version has only Phong lighting, with one "sunlight" component, no point lights or spot lights. My camera component doesn't have a "direction" property, only LookAt, Up and Position. For lighting calculations, I only needed the sunlight direction and that was ok. So my question was - what sunlight position should I use? I mean, should I just pick whatever, or are there some "guidelines"?

P.S. This was my first post, thank you for not dissing it and explaining instead :)


I have only limited experience in this, so I honestly don't know if there are any good tricks (I'm sure more experts will be along soon), but I believe the general idea is to try and include all the objects in your frame and those outside it that might be casting shadows into the frame. Roughly speaking you could start by taking a point in the 'centre' of your scene as seen by the camera, use that as your lookat point, then subtract the light direction (scaled by however far back you want the camera) from it to get the 'shadow camera' position.
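To make the "centre minus light direction" idea concrete, here is a minimal sketch in plain C++. The `Vec3` struct and the `ShadowCameraPosition` helper are hypothetical names made up for illustration, and the sketch assumes the light direction points the way the light travels (from the sun towards the scene).

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Returns v scaled to unit length (assumes v is not the zero vector).
Vec3 Normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// Hypothetical helper: step back from the look-at point, against the
// direction the light travels, to get a position for the shadow camera.
Vec3 ShadowCameraPosition(Vec3 sceneCenter, Vec3 lightTravelDir, float distance) {
    Vec3 d = Normalize(lightTravelDir);
    return { sceneCenter.x - d.x * distance,
             sceneCenter.y - d.y * distance,
             sceneCenter.z - d.z * distance };
}
```

The returned position and `sceneCenter` could then be passed as the eye and focus points of a look-at function such as `XMMatrixLookAtLH`. The up vector given to that function must not be parallel to the light direction, so a sun shining straight down needs something other than (0, 1, 0).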

You can then widen the field of view of the shadow matrix / move shadow camera further back to try and get everything in. Of course the trick is to try and get everything in without losing shadow map resolution, and this will depend on the game. There are techniques to try and do this plus decrease the shadow map resolution further from the view camera. See:


10 hours ago, Bartosz Boczula said:

So my question was - what sunlight position should I use? I mean, should I just pick whatever, or are there some "guidelines"?

The first video I posted is actually about only supporting sunlight.  Java is not that far from C++ and it's more about the concepts anyway so it should be a good video to watch.  You want large values anyway for the sun position and use an orthographic projection.  He explains and even shows the math for getting a bounding volume for the light which is handy.  I haven't used DX11 but again, this is fairly simple, I'm sure you can figure out the vertex/pixel shaders from the GLSL ones he makes as they're like ~8 lines each.

As for your camera not having a direction, you might want to add that, it's a fairly useful feature but not essential.  I think he covers using direction in the video also, been a while since I watched it and I was more using it for background noise than a lesson.  Let us/me know if you still have issues after watching/reading the links above. 

10 hours ago, Bartosz Boczula said:

P.S. This was my first post, thank you for not dissing it and explaining instead

No worries we all start someplace, no point in discouraging new blood into the game dev scene.  Nor did you copy and paste your non-functioning code and say, "fix this for me" :)


Thanks guys, I'm making some good progress here. One question though - I changed my projection calculation from Perspective to Orthographic, but when I did that, it seems that the camera position has no impact at all on the final image. I set (100.0f, -100.0f, 100.0f) and then (1.0f, -1.0f, 1.0f) and the result was exactly the same. The only way to change the output is to change the viewWidth and viewHeight of this function:

DirectX::XMMatrixOrthographicLH(128.0f, 128.0f, 0.1f, 100.0f);

So when I set it to 1024x1024 the object gets really small, and when I set it to 128x128 the object gets bigger. Why is that?


That is exactly what an orthographic camera does. If you hold the direction the same, changing the position of the camera does not change the 'image', only where the view is centred. This is the closest I could find to an ortho depth map pulled off google.

If direction is the same and you move the camera position sideways, you would just be scrolling up and down through a portion of e.g. the following image. If you move the camera further away along the view direction, there is no effect (as long as everything stays between the near and far planes), as the rays are parallel in an ortho camera.

I would encourage you to try moving an ortho camera in e.g. blender to see how it works, before trying to use it in shadow mapping. I'm guessing the 128x128 figure you are quoting is the scale of the camera, which determines how much is fitted into the view. Fit more in and each object is going to be smaller.
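The behaviour described above can be checked with a few lines of arithmetic. This sketch mirrors the mapping that a left-handed orthographic projection such as `XMMatrixOrthographicLH` applies to a view-space point; `Vec3` and `OrthoProjectLH` are hypothetical names used only for illustration.

```cpp
struct Vec3 { float x, y, z; };

// View-space x and y are only scaled by the view size, and z is remapped
// linearly from [nearZ, farZ] to [0, 1]. There is no divide by distance.
Vec3 OrthoProjectLH(Vec3 viewPos, float viewWidth, float viewHeight,
                    float nearZ, float farZ) {
    return { 2.0f * viewPos.x / viewWidth,
             2.0f * viewPos.y / viewHeight,
             (viewPos.z - nearZ) / (farZ - nearZ) };
}
```

Projecting the same point at two different distances gives the same x and y, so backing the camera away along its view direction changes only the stored depth. Going from a 128x128 view to a 1024x1024 view divides x and y by 8, which is why the object appears smaller.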



@lawnjelly Ok, I decided to take a step back - I'm using my Perspective camera, the one that I know works. I created a blitter render pass, which should render my already filled depth buffer on the screen, however the result is all red. I'm guessing this has something to do with the formats. My depth texture is in R16_TYPELESS format, Shader Resource View in R16_UNORM, but my Render Target is in R8G8B8A8_UNORM - how would a 16-bit value be written to a 32-bit slot?



Hello Bartosz,

I am not a DX god but in the end this is how I understood it. You are wondering what the position of the light is because you want to position the camera correctly for the depth pass, right? In my case I had to either assume the bounds of my scene or pre-calculate them. Knowing the bounds of my scene I was able to assume a position for my light that would make sure no 3D objects are behind the projection. Knowing the bounds we are also able to calculate the minimum range of near/far values for our projection to keep an optimal depth precision while having all the scene objects inside the projection. With those bounds you can also figure out the left/top/right/bottom of your ortho projection to cover the whole scene. Of course it requires some math to do.
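One way to do that math is sketched below, assuming the scene bounds are an axis-aligned box and that the light's view matrix is built with the eye at the world origin. The 8 corners of the box are projected onto the light's axes and the min/max along each axis is kept. All names here (`Vec3`, `OrthoBounds`, `FitOrthoToBox`) are hypothetical and not part of any API.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 Cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Returns v scaled to unit length (assumes v is not the zero vector).
Vec3 Normalize(Vec3 v) {
    float len = std::sqrt(Dot(v, v));
    return { v.x / len, v.y / len, v.z / len };
}

struct OrthoBounds { float left, right, bottom, top, nearZ, farZ; };

// Computes the extents of a world-space box along the light's right/up/forward
// axes (left-handed), measured from the world origin.
// worldUp must not be parallel to lightTravelDir.
OrthoBounds FitOrthoToBox(Vec3 boxMin, Vec3 boxMax, Vec3 lightTravelDir, Vec3 worldUp) {
    Vec3 forward = Normalize(lightTravelDir);
    Vec3 right = Normalize(Cross(worldUp, forward));
    Vec3 up = Cross(forward, right);

    OrthoBounds b{};
    for (int i = 0; i < 8; ++i) {
        // Bits 0..2 of i select min or max on each axis, giving all 8 corners.
        Vec3 corner = { (i & 1) ? boxMax.x : boxMin.x,
                        (i & 2) ? boxMax.y : boxMin.y,
                        (i & 4) ? boxMax.z : boxMin.z };
        float x = Dot(corner, right);
        float y = Dot(corner, up);
        float z = Dot(corner, forward);
        if (i == 0) { b = { x, x, y, y, z, z }; continue; }
        b.left = std::min(b.left, x);     b.right = std::max(b.right, x);
        b.bottom = std::min(b.bottom, y); b.top = std::max(b.top, y);
        b.nearZ = std::min(b.nearZ, z);   b.farZ = std::max(b.farZ, z);
    }
    return b;
}
```

The six values could then go to an off-centre orthographic function such as `XMMatrixOrthographicOffCenterLH`. `nearZ` can come out negative when part of the box lies behind the origin; shifting the eye back along the light direction is one way to avoid that.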


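Your lookAt can safely be the position that you gave to the light plus the light direction, assuming the direction vector points the way the light travels (subtract it instead if your vector points from the surface towards the light). LookAt is always a position that your camera is pointing at, relative to its current position.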


Also, as mentioned above, an orthographic projection should be used for directional lights as it is supposed to mimic a light that is so far away that all the rays appear to go in the same direction. The orthographic projection does exactly that.


On 1/29/2018 at 8:09 AM, Bartosz Boczula said:

@lawnjelly Ok, I decided to take a step back - I'm using my Perspective camera, the one that I know works. I created a blitter render pass, which should render my already filled depth buffer on the screen, however the result is all red. I'm guessing this has something to do with the formats. My depth texture is in R16_TYPELESS format, Shader Resource View in R16_UNORM, but my Render Target is in R8G8B8A8_UNORM - how would a 16-bit value be written to a 32-bit slot?



First of all I've never heard of a 16 bits depth format. I either use


for full depth precision or 


in case I am using depth + stencil.

See this link : https://msdn.microsoft.com/en-us/library/windows/desktop/ff476464(v=vs.85).aspx

When you set your render target you specify both your render target AND the depth buffer to use. So your render target may be set to an R8G8B8A8_UNORM texture but your depth buffer must also be set to your 16-bit depth texture (for an R16_TYPELESS resource, the depth-stencil view would use D16_UNORM while the shader resource view uses R16_UNORM). The depth will be written to the depth buffer while your pixel shader will write colors to the R8G8B8A8_UNORM texture. Those are two completely different textures.
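On the "16-bit value into a 32-bit slot" question: the shader never sees raw bits, only normalized floats, so the formats do not have to match. The sketch below is a CPU-side illustration of that conversion and not D3D11 API code; the type and function names are hypothetical. It also shows a likely reason for the all-red blit: sampling a single-channel format returns the value in red, with green and blue reading as 0 and alpha as 1.

```cpp
#include <cmath>
#include <cstdint>

struct Float4 { float r, g, b, a; };
struct Rgba8 { std::uint8_t r, g, b, a; };

// UNORM16 -> float: the integer range [0, 65535] maps to [0.0, 1.0].
float Unorm16ToFloat(std::uint16_t v) { return v / 65535.0f; }

// float -> UNORM8: clamp to [0, 1], scale to [0, 255], round to nearest.
std::uint8_t FloatToUnorm8(float f) {
    float c = std::fmin(std::fmax(f, 0.0f), 1.0f);
    return static_cast<std::uint8_t>(std::lround(c * 255.0f));
}

// What a shader receives when sampling a single-channel UNORM16 texel:
// the missing green and blue channels read as 0 and alpha reads as 1.
Float4 SampleR16(std::uint16_t texel) {
    return { Unorm16ToFloat(texel), 0.0f, 0.0f, 1.0f };
}

// What ends up in an 8-bit-per-channel render target for a given shader output.
Rgba8 WriteToRgba8(Float4 c) {
    return { FloatToUnorm8(c.r), FloatToUnorm8(c.g),
             FloatToUnorm8(c.b), FloatToUnorm8(c.a) };
}
```

Feeding `SampleR16(65535)` straight into `WriteToRgba8` gives pure red, and a perspective depth buffer tends to hold values close to 1.0 for most of the scene, so an unmodified blit would look almost uniformly red.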


It is possible to omit the render target and specify only a depth buffer in case you only want to do a depth pass with just a vertex shader, which is what you should probably do when generating the depth map of your shadows.

Edited by ChuckNovice


Hey guys, thanks for all the support, thanks to that I made some progress! My blitter shader looks like this:

float4 main(PS_INPUT input) : SV_TARGET
{
    // Remap the sampled channels at different scales so small differences become visible.
    float4 val = shaderTexture.Sample(sampleType, input.textureCoordinates);
    return float4(val.b, frac(val.g * 10), frac(val.r * 100), 1.0);
}

This seems to work, I'm now able to see my D16 texture as an R8G8B8A8 Render Target. But of course, this can't be too easy, can it :)

This is my result:


I might add that this looks normal with Perspective camera. @ChuckNovice did you have such issue?
