In the previous thread about shadowing, I explained the basic implementation of shadow maps.
I quickly mentioned shadow acne and how it can be solved relatively well. The key point is to apply a z-offset to the vertices in shadow map space, at the shadow-map render-to-texture stage.
I used to multiply pos.w ( pos being the 4D vector containing the vertex position ) by a constant like 0.99; it solved some of those issues, but it was not good enough. In particular, the constant was never perfect in all cases. For example when the light rays were almost parallel with a polygon, acne was horrible.
Reducing the constant to 0.98, 0.97, etc. fixed the problem of parallel rays, at the cost of offsetting the shadows of orthogonal polygons too far. The shadows did not match the geometry anymore, and the objects seemed to float a bit above the shadows.
The main trick to solve all those issues is to not use a constant, but a variable that is a function of the slope between the light rays and the surface. I'm now simply linearly interpolating between two constants ( say, 0.95 for parallel rays, and 0.999 for orthogonal rays ), and multiplying pos.w by the interpolated value.
The assembly code looks like this:
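# ZFactor blend factor depending on slope compared to light direction:
TEMP lightDirObj; # light direction in object space
# mv[0], mv[1], mv[2] are assumed to be the first three rows of the matrix
DP3 lightDirObj.x, mv[0], lightDir;
DP3 lightDirObj.y, mv[1], lightDir;
DP3 lightDirObj.z, mv[2], lightDir;
# N.L, clamped so that normals facing away from the light count as parallel rays
DP3 lightDirObj.w, inNormal, lightDirObj;
MAX lightDirObj.w, lightDirObj.w, 0.0;
# blend factor = zfactor.x * N.L + zfactor.y, then scale pos.w by it
MAD lightDirObj.w, zfactor.x, lightDirObj.w, zfactor.y;
MUL pos.w, pos.w, lightDirObj.w;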
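Note: in the above code, mv is the modelview matrix, lightDir is the light direction in world space ( it is assumed to be a directional, infinite light ), and inNormal is the normal in object space. zfactor is a vector holding the two constants of the interpolation. With the MAD as written, the blend factor is zfactor.x * d + zfactor.y, where d is the clamped dot product, so zfactor.y has to hold the ZFactor value for parallel rays ( d = 0 ) and zfactor.x the difference between the values for orthogonal and parallel rays.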
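To make the interpolation easier to follow, here is a minimal Python sketch of the same arithmetic. It is only an illustration, not the shader itself: the function names are made up, both vectors are assumed to be unit length and in the same space, and light_dir is assumed to point from the surface toward the light.

```python
def z_factor(normal, light_dir, parallel=0.95, orthogonal=0.999):
    """Slope-dependent multiplier for pos.w (hypothetical helper).

    parallel   : factor used when the light grazes the surface (N.L = 0)
    orthogonal : factor used when the light hits the surface head-on (N.L = 1)
    """
    # Dot product between the surface normal and the light direction.
    d = sum(n * l for n, l in zip(normal, light_dir))
    # Same clamp as the MAX instruction: negative values count as grazing.
    d = max(d, 0.0)
    # Same shape as the MAD instruction: x * d + y,
    # with x = orthogonal - parallel and y = parallel.
    return parallel + (orthogonal - parallel) * d


def offset_w(w, normal, light_dir):
    """Apply the blended factor to the w component, like the final MUL."""
    return w * z_factor(normal, light_dir)
```

With the example constants from the text, a polygon facing the light gets a factor close to 0.999 and a polygon edge-on to the light gets 0.95.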
Shadow mapping experiments:
As I explained in the first part, standard shadow maps suffer from resolution problems, especially when you want them to cover a large area.
A very obvious improvement is, instead of mapping the shadow map to the whole scene, to map it to a portion of the scene: the one centered on the camera ( that's where you want to see shadows ! ), with a limited view distance. That's probably good enough for an FPS game; if you limit the distance at which shadows appear to, say, 200m from the camera, you'll get an "okay" resolution. Along the same lines, an RTS with a top-down / ortho camera will get a pretty much perfect resolution, since you can adjust the shadow map area to fit exactly within the frustum of the camera.
Another improvement is to adapt the shadow map area, not to a fixed region around the camera, but to the camera frustum itself. This might help a bit, but because the resolution is no longer constant, you might get good quality in some views and horrible quality in others. That's also the problem with all the perspective-tweaking techniques ( perspective shadow maps, light-space shadow maps ( LSSM ), trapezoidal shadow maps ).
LSSM is a technique I was very interested in, and it didn't seem too hard to implement, until I realized it suffered from the same problems as the other techniques. It's usually good ( hell, even.. excellent! ) but there's always a way to find a view where the quality is *worse* than with standard shadow maps. And because the resolution depends on the frustum, shadow texels flicker and change every time the camera moves or turns ( i.e. all the time ).
Per-object shadow maps:
Another idea that I had many months ago was to use standard shadow maps, but to assign a shadow map to each object instead of one to the whole scene. Then, every frame, the resolution of each shadow map would be recalculated based on the size of the object and its distance to the camera. An object that is small ( in dimensions ) and far away would get, say, a 64x64 shadow map; an object that is big ( in dimensions ) and far away would get a 256x256; an object that is small and close would get a 512x512; an object that is big and very close would get a 1024x1024 or more... you see the idea.
I implemented this algorithm using a cache / pool of shadow maps, recycling them to avoid re-creating them every time, but overall the performance was extremely bad. The reason was simply the explosion of the number of passes and of the amount of geometry to draw. As soon as an object's shadow map needed to be recalculated, all the objects in its shadow frustum had to be re-rendered. Imagine a scene with 50 objects near you that need to be updated, and 1000 objects a bit further away that all fall in the shadow frustum of some of those 50 objects. That's an insane amount of geometry to re-render... and it killed the framerate.
In addition, if an object was really, really big ( say, 2 km long ), there was no way it could get a high enough resolution to have high quality shadows, even with a 2048x2048 shadow map ( which would give about 1 texel per meter ).
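The per-object resolution choice described above could be sketched like this in Python. This is a hypothetical heuristic, not the code used in the engine: the size / distance ratio, the scale constant and the 64 to 2048 limits are all assumptions picked so that the results follow the same pattern as the examples.

```python
import math


def shadow_map_resolution(object_size_m, distance_m,
                          scale=1024.0, min_res=64, max_res=2048):
    """Pick a power-of-two shadow map size for one object (hypothetical).

    The apparent size of the object is approximated by size / distance;
    'scale' converts that ratio into a wanted number of texels.
    """
    ratio = object_size_m / max(distance_m, 1e-6)   # avoid dividing by zero
    wanted = max(ratio * scale, 1.0)
    # Round up to the next power of two, then clamp to the allowed range.
    resolution = 2 ** math.ceil(math.log2(wanted))
    return int(min(max(resolution, min_res), max_res))
```

For instance, a 2m object at 500m gets 64x64, a 100m object at 500m gets 256x256, and a 2m object at 5m gets 512x512. The 2048 clamp also shows the limit mentioned above: a 2 km object never gets more than about 1 texel per meter.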
Cascaded shadow maps:
My last words on the topic will be about the technique I'm currently using ( which is still nowhere near perfect ): cascaded shadow maps. The idea ( and the implementation ) behind cascaded shadow maps is quite simple: instead of using one shadow map centered on the camera, with a range of 200m ( which would give an "okay" resolution, but over too short a range ), we can use N shadow maps, all centered on the camera, each with a larger maximum range than the previous one: 200m (shadow map #0), 400m (shadow map #1), 800m (shadow map #2), 1600m (shadow map #3).
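As a rough sketch of the idea, the ranges above and the choice of a level for a given distance could look like this in Python. The function names and the doubling ratio parameter are assumptions made for the illustration.

```python
def cascade_ranges(first_range_m=200.0, levels=4, ratio=2.0):
    """Maximum range of each level: 200m, 400m, 800m, 1600m by default."""
    return [first_range_m * ratio ** i for i in range(levels)]


def select_cascade(distance_m, ranges):
    """Index of the first (most detailed) level that covers the distance.

    Returns None when the point is beyond the last level, i.e. it
    receives no shadow at all.
    """
    for index, max_range in enumerate(ranges):
        if distance_m <= max_range:
            return index
    return None
```

A point 300m away from the camera falls in shadow map #1, and anything beyond 1600m is not shadowed.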
Of course, increasing the number of shadow maps means rendering the scene N times rather than once. It's even worse than you might think, because the frustum of each shadow map is bigger than the previous one, so the amount of scene geometry that needs to be rendered grows with each level. The biggest level (the 4th one when N=4) requires rendering almost the whole scene. And here, culling against the camera frustum will not help, since we're rendering from the light's point of view ( and the light affects the whole scene ).
An intermediate solution is to use two levels of shadow maps: a high quality, very precise one around the camera ( say a 2048x2048 for a 200m distance ), and a rough one ( say a 2048x2048 for a 2000m distance ). This might work more or less well depending on your game; the bigger the jump in range from one level to the next, the more trouble you'll have hiding the "transition" between the levels. However, performance will be better than with more levels.
In Infinity, I'm pretty much stuck in a dead end. The images in this thread use N=4 ( 4 shadow maps ), covering larger and larger areas. Here are the values:
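To put numbers on that transition, here is a small Python sketch that computes the world-space size of one shadow texel. It assumes the "distance" is a radius around the camera, so a level covers a square of side 2 x range; that interpretation and the function name are assumptions.

```python
def texel_size_m(resolution, range_m):
    """World-space size of one shadow texel, in meters.

    Assumes the shadow map covers a square of side 2 * range_m
    centered on the camera.
    """
    return 2.0 * range_m / resolution


near = texel_size_m(2048, 200.0)    # about 0.195 m per texel
far = texel_size_m(2048, 2000.0)    # about 1.95 m per texel
```

Under that assumption, a texel of the rough level is 10 times larger than a texel of the precise one, which is the jump in quality that has to be hidden at the border between the two levels.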
- for level 1, SM is 1024x1024 up to a distance of 150m.
- for level 2, SM is 1024x1024 for distances between 150m and 500m.
- for level 3, SM is 1024x1024 for distances between 500m and 1666m.
- for level 4, SM is 1024x1024 for distances between 1666m and 5555m.
After 5.5 km, shadows disappear. To hide this, I'm using alpha blending with a factor that is a function of the distance, so that they fade out smoothly.
At this point you're probably wondering: "okay, but why do you need shadows to appear as far as 5.5 km" ? The bad news is that I actually need them to appear at up to 50 km, and I'm not happy at all with this 5.5 km limit.
Why is my target 50 km ? Because I use a planetary engine with an unlimited view distance. It's not impossible to have mountains up to 10 km high that cast shadows very far away, especially at sunset. In addition, some of our spaceships are *huge*. The Flamberge ( the biggest ship in the screenshots ) is 5 km long. With a shadow distance limited to 5.5 km, that means the Flamberge can cast shadows no longer than about its own length !
Then, you might ask, why don't you start with a larger first level ? For instance, the first shadow map level could have a range of 1000m instead of 150m. Of course, I can do that, but this would mean losing resolution for small objects. Some vehicles are only a few meters long, and I'd like their shadows to have a nice shape, and not be a horrible blurry blob of pixel(s).
Increasing N to 8 would probably allow me to reach this 50 km target, except that performance would suffer a lot. It already does with N = 4, so I'm quite afraid to go that way...
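A distance-based fade like this can be sketched in Python as a simple linear ramp. The distance where the fade starts ( fade_start_m ) is a hypothetical value, and the linear shape is an assumption too; any smooth function of the distance would do.

```python
def shadow_fade(distance_m, max_range_m=5555.0, fade_start_m=4500.0):
    """Shadow opacity: 1.0 = full shadow, 0.0 = no shadow.

    fade_start_m is a hypothetical parameter; the fade runs linearly
    from there to the end of the last shadow map level.
    """
    if distance_m <= fade_start_m:
        return 1.0
    if distance_m >= max_range_m:
        return 0.0
    return (max_range_m - distance_m) / (max_range_m - fade_start_m)
```

The returned value would be used as the blend factor for the shadow term, so that shadows are fully visible up to fade_start_m and gone at the end of level 4.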
Packed cascaded shadow maps:
To end this (long) topic on shadows, I'll speak quickly about another optimization I added to cascaded shadow maps. For N=4, you can pack four 1024x1024 shadow maps into a single 2048x2048 one. It's then just a matter of determining per-pixel ( not per-vertex, or you'll get wrong results due to the linear interpolation ) which of the 4 areas to sample. This is very handy, especially when you add softening of shadow samples ( taking 2x2 samples to blur the shadows a bit ).
Caching shadow maps does not seem to work well. One of my motivations for going for cascaded shadow maps was the hope that I'd be able to update the first shadow map level every frame, but only every 2 frames for the 2nd level, every 4 frames for the 3rd one, and every 8 frames ( or so ) for the 4th one. This doesn't work at all, simply because you can have moving objects in the scene, which cause all the shadow maps that are not updated every frame to be incorrectly projected onto the geometry; then, objects appear either completely in shadow, or completely in light, depending on the direction the object is traveling relative to the camera. So, forget about caching shadow maps: you'll have to recalculate all the shadow maps every frame anyway, even if you're not using algorithms derived from perspective shadow maps.
I'm currently using the packed cascaded shadow maps algorithm. Performance is pretty bad ( relatively.. I still get more than 60 fps in the screenshots shown in this thread, but the performance hit compared to no shadows is around 100-150% ). The range is limited to 5.5 km and I'm not sure how to increase it ( without killing the performance ). All the other shadow mapping techniques have their own limitations. Maybe a hybrid technique would work better; for example, I'm considering experimenting with a cascaded light space shadow map technique ! By using two levels of shadow maps, each shadow map being rendered with LSSM, I could maybe achieve much better results at a better framerate ! Unfortunately, all those experiments take a lot of time, so who knows what will happen in the future ?
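The lookup into the packed texture boils down to remapping the coordinates of one level into its quarter of the atlas. Here is a minimal Python sketch of that remapping; the row-major placement of the levels in a 2x2 grid and the function name are assumptions, since the actual layout is not described here.

```python
def atlas_uv(u, v, level, tiles_per_side=2):
    """Remap (u, v) in [0, 1] of one level into the packed atlas.

    Levels are assumed to be laid out row by row:
    level 0 = first row, first column, level 1 = first row, second column, etc.
    """
    if not 0 <= level < tiles_per_side * tiles_per_side:
        raise ValueError("level outside of the atlas")
    column = level % tiles_per_side
    row = level // tiles_per_side
    scale = 1.0 / tiles_per_side
    return ((column + u) * scale, (row + v) * scale)
```

With softened shadows, samples taken close to the edge of a tile can read texels from the neighbouring level, so the coordinates probably need to be kept a texel or so away from the tile borders.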