Shadowing part II

Published February 13, 2007
Shadow mapping acne:

In the previous thread about Shadowing, I explained the basic implementation of shadow maps.

I quickly mentioned shadow acne and how it can be solved relatively well. The key point is to apply a z-offset to the vertices in shadow map space, at the shadow-map render-to-texture stage.

I used to multiply pos.w ( pos being the 4D vector containing the vertex position ) by a constant like 0.99; it solved some of those issues, but it was not good enough. In particular, the constant was never perfect in all cases. For example, when the light rays were almost parallel to a polygon, acne was horrible.

Reducing the constant to 0.98, 0.97, etc. fixed the problem of parallel rays, at the cost of offsetting the shadows of orthogonal polygons too far. The shadows no longer matched the geometry, and objects seemed to float a bit above their shadows.

The main trick to solve all those issues is to not use a constant, but a variable that is a function of the slope between the light rays and the surface. I'm now simply interpolating linearly between two constants ( say, 0.95 for parallel rays and 0.999 for orthogonal rays ), and multiplying pos.w by the interpolated value.

The assembly code looks like this:

# ZFactor blend factor depending on slope compared to light direction:
TEMP lightDirObj;    # light direction in object space
DP3 lightDirObj.x, mv[0], lightDir;
DP3 lightDirObj.y, mv[1], lightDir;
DP3 lightDirObj.z, mv[2], lightDir;
DP3 lightDirObj.w, inNormal, lightDirObj;
MAX lightDirObj.w, lightDirObj.w, 0.0;
MAD lightDirObj.w, zfactor.x, lightDirObj.w, zfactor.y;
MUL pos.w, pos.w, lightDirObj.w;


Note: in the above code, mv is the modelview matrix, lightDir is the light direction in world space ( it is assumed to be a directional, infinite light ), and inNormal is the normal in object space. zfactor is a constant vector: for the MAD to blend between the two values, its Y component holds the ZFactor for parallel rays ( 0.95 in my example ), and its X component holds the difference between the orthogonal and parallel values ( 0.999 - 0.95 = 0.049 ).
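
For readers who don't speak vertex-program assembly, here is the same computation as a minimal C++ sketch ( the function and variable names are mine, not the engine's; lightDirObj is the light direction already transformed into object space, as the first three DP3s do above ):

#include <algorithm>

struct Vec3 { float x, y, z; };

static float dot( const Vec3& a, const Vec3& b )
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// zParallel   ~ 0.95  : factor when the light rays graze the surface ( N.L ~ 0 )
// zOrthogonal ~ 0.999 : factor when the light hits the surface head-on ( N.L ~ 1 )
float slopeZFactor( const Vec3& normalObj, const Vec3& lightDirObj,
                    float zParallel = 0.95f, float zOrthogonal = 0.999f )
{
    float nDotL = std::max( dot( normalObj, lightDirObj ), 0.0f );
    // Same blend as the MAD above, with zfactor.y = zParallel and
    // zfactor.x = zOrthogonal - zParallel.
    return zParallel + ( zOrthogonal - zParallel ) * nDotL;
}

// Usage at the shadow-map render-to-texture stage:
//   pos.w *= slopeZFactor( normal, lightDirObj );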

Shadow mapping experiments:

As I explained in the first part, standard shadow maps suffer from resolution problems, especially when you want them to cover a large area.

A very obvious improvement is, instead of mapping the shadow map to the whole scene, to map it to a portion of the scene: the one centered on the camera ( that's where you want to see shadows ! ), with a limited view distance. That's probably good enough for an FPS game; if you limit the distance at which shadows appear, say to 200m away from the camera, you'll get an "okay" resolution. In the same vein, an RTS with a top-down / ortho camera will get pretty much perfect resolution, since you can fit the shadow map area exactly to the camera's frustum.
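
To make the idea concrete, centering the shadow map on the camera boils down to building the light's orthographic volume around the camera position expressed in light space. This is just a sketch with made-up helper names, not Infinity's actual code:

struct OrthoBounds { float left, right, bottom, top; };

// camXLight / camYLight : camera position transformed into light space
// shadowRange           : e.g. 200.0f meters for an FPS-style setup
OrthoBounds shadowVolumeAroundCamera( float camXLight, float camYLight,
                                      float shadowRange )
{
    OrthoBounds b;
    b.left   = camXLight - shadowRange;
    b.right  = camXLight + shadowRange;
    b.bottom = camYLight - shadowRange;
    b.top    = camYLight + shadowRange;
    return b;   // feeds the light's orthographic projection, e.g.
                // glOrtho( b.left, b.right, b.bottom, b.top, zNear, zFar )
}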

Another improvement is adapting the shadow map area not to a fixed region around the camera, but to the camera frustum itself. This might help a bit, but because the resolution is no longer constant, you might get good resolution in some views and a horrible one in others. That's also the problem with all the perspective-tweaking techniques ( perspective shadow maps, light-space shadow maps ( LSSM ), trapezoidal shadow maps ).

LSSM is a technique I was very interested in, and that didn't seem too hard to implement, until I realized it suffered from the same problems as the other techniques. It's usually good ( hell, even.. excellent! ), but there's always a way to find a view where the quality is *worse* than standard shadow maps. And because the resolution depends on the frustum, shadow texels flicker and change every time the camera moves or turns ( i.e. all the time ).

Per-object shadow maps:

Another idea that I had many months ago was to use standard shadow maps, but to assign a shadow map to each object instead of the whole scene. Then, every frame, the resolution of each shadow map would be re-calculated based on the distance between the object and the camera. An object that is small ( in dimensions ) and far away would get, say, a 64x64 shadow map; an object that is big ( in dimensions ) and far away would get a 256x256; an object that is small and close would get a 512x512; an object that is big and very close would get a 1024x1024 or more.. you get the idea.
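
As a rough illustration of that heuristic ( the thresholds below are purely illustrative, not the values I actually used ):

#include <algorithm>

// objectRadius     : bounding-sphere radius of the object, in meters
// distanceToCamera : distance from the camera to the object, in meters
int perObjectShadowMapSize( float objectRadius, float distanceToCamera )
{
    // Rough measure of how big the object looks on screen.
    float apparentSize = objectRadius / std::max( distanceToCamera, 1.0f );

    if( apparentSize > 1.0f )  return 1024;  // big and very close
    if( apparentSize > 0.25f ) return 512;   // small and close
    if( apparentSize > 0.05f ) return 256;   // big but far away
    return 64;                               // small and far away
}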

I implemented this algorithm using a cache / pool of shadow maps, recycling them to avoid re-creating them every time, but overall the performance was extremely bad. The reason was simply the explosion of the number of passes, and the amount of geometry to draw. As soon as an object's shadow map needed to be re-calculated, all the objects falling inside its light frustum had to be re-rendered. Imagine a scene with 50 objects near you that need to be updated, and 1000 objects a bit further away that all fall into the light frustums of some of those 50 objects. That's an insane amount of geometry to re-render.. and it killed the framerate.

In addition, if an object was really, really big ( say, 2 km across ), there was no way it could get a high enough resolution to have high quality shadows, even with a 2048x2048 shadow map ( which would give roughly 1 texel per meter ).

Cascaded shadow maps:

My last words on the topic will be about the technique I'm currently using ( which is still nowhere near perfect ): cascaded shadow maps. The idea ( and the implementation ) behind cascaded shadow maps is quite simple: instead of using one shadow map centered on the camera with a range of 200m ( which would give an "okay" resolution, but a range that is far too short ), we can use N shadow maps, all centered on the camera, each covering an increasing maximum range: 200m ( shadow map #0 ), 400m ( shadow map #1 ), 800m ( shadow map #2 ), 1600m ( shadow map #3 ).

Of course, increasing the number of shadow maps means rendering the scene N times rather than once. It's even worse than you'd think, because the frustums of the shadow maps also get bigger and bigger, so as N increases, the amount of scene geometry that needs to be rendered into each shadow map increases too. The biggest level ( the 4th one when N=4 ) almost requires rendering the whole scene. And here, frustum culling will not help, since we're rendering from the light's point of view ( which affects the whole scene ).

An intermediate solution is to use two levels of shadow maps: one high-quality, very precise level around the camera ( say a 2048x2048 for a 200m distance ), and a rough one ( say a 2048x2048 for a 2000m distance ). This might work more or less well depending on your game; the larger the jump in range from one level to the next, the more trouble you'll have hiding the "transition" between levels. However, performance will be better than with more levels.

In Infinity, I'm pretty much stuck in a dead end. The images in this thread use N=4 ( 4 shadow maps ), covering larger and larger areas. Here are the values ( a small sketch of how such a progression can be generated follows the list ):

- for level 1, SM is 1024x1024 up to a distance of 150m.
- for level 2, SM is 1024x1024 for distances between 150m and 500m.
- for level 3, SM is 1024x1024 for distances between 500m and 1666m.
- for level 4, SM is 1024x1024 for distances between 1666m and 5555m.
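
Those ranges follow a rough geometric progression, each level covering about 3.3 times the distance of the previous one. A small sketch of how such a progression can be generated ( the function name is mine ):

#include <vector>

// levels     : number of cascades, e.g. 4
// firstRange : far distance of the first level, e.g. 150.0f
// ratio      : growth factor between levels, e.g. 10.0f / 3.0f
std::vector<float> cascadeSplitDistances( int levels, float firstRange, float ratio )
{
    std::vector<float> splits;
    float range = firstRange;
    for( int i = 0; i < levels; ++i )
    {
        splits.push_back( range );   // far distance covered by level i
        range *= ratio;
    }
    return splits;                   // { 150, 500, 1666.7, 5555.6 } with the values above
}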

Beyond 5.5 km, shadows disappear. To hide this, I'm using alpha blending with a factor that is a function of the distance, so that the shadows fade out smoothly.
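
The exact fade curve is not critical; assuming a simple linear fade ( the start distance below is only an example ), the blending factor can be computed like this:

// distance          : distance from the camera to the shaded point, in meters
// fadeStart         : distance where the fade begins, e.g. 4500.0f
// maxShadowDistance : distance where shadows are fully gone, 5555.0f here
float shadowFadeFactor( float distance, float fadeStart, float maxShadowDistance )
{
    float t = ( distance - fadeStart ) / ( maxShadowDistance - fadeStart );
    if( t < 0.0f ) t = 0.0f;
    if( t > 1.0f ) t = 1.0f;
    return 1.0f - t;   // multiply the shadow contribution by this factor
}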

At this point you're probably wondering.. "okay, but why do you need shadows to appear as far as 5.5 km ?" The answer is bad: I actually need them to appear up to 50 km away, and I'm not happy at all with this 5.5 km limit.

Why is my target 50 km ? Because I use a planetary engine with an unlimited view distance. It's not impossible to have mountains up to 10 km high that cast shadows very far away, especially at sunset. In addition, some of our spaceships are *huge*. The Flamberge ( the biggest ship in the screenshots ) is 5 km long. With a shadow distance limited to 5.5 km, the Flamberge can cast shadows no further than its own length !

Then, you might ask, why not start with larger ranges ? For instance, the first shadow map level could have a range of 1000m instead of 150m. Of course, I could do that, but it would mean losing resolution for small objects. Some vehicles are only a few meters long, and I'd like their shadows to have a nice shape, not to be a horrible blurry blob of pixels.

Increasing N to 8 would probably allow me to reach this 50 km distance target, except that performance would suffer a lot. It already does with N = 4, so I'm quite afraid to go that way...

Packed cascaded shadow maps:

To end this ( long ) topic on shadows, I'll quickly describe another optimization I added to cascaded shadow maps. For N=4, you can pack four 1024x1024 shadow maps into a single 2048x2048 one. It's then just a matter of determining per-pixel ( not per-vertex, or you'll get wrong results due to the linear interpolation ) which of the 4 areas to sample. This is very handy, especially when you add softening of the shadow samples ( taking 2x2 samples to blur the shadows a bit ).
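
The per-pixel selection itself is simple: pick the cascade from the pixel's distance to the camera, then remap its [0,1] shadow-map coordinates into the right quadrant of the atlas. In the engine this lives in the pixel shader; here is the same math written as C++ with illustrative names:

// splits : far distances of the cascades, e.g. { 150, 500, 1666, 5555 }
int chooseCascade( float distanceToCamera, const float* splits, int count )
{
    for( int i = 0; i < count; ++i )
        if( distanceToCamera < splits[i] )
            return i;
    return count - 1;
}

// u, v : shadow-map coordinates inside the chosen cascade, in [0,1].
// Cascade 0 maps to the ( 0, 0 ) quadrant of the 2048x2048 atlas,
// cascade 1 to ( 0.5, 0 ), cascade 2 to ( 0, 0.5 ), cascade 3 to ( 0.5, 0.5 ).
void packedAtlasUV( int cascade, float u, float v, float& atlasU, float& atlasV )
{
    float offsetU = ( cascade & 1 ) ? 0.5f : 0.0f;
    float offsetV = ( cascade & 2 ) ? 0.5f : 0.0f;
    atlasU = offsetU + u * 0.5f;
    atlasV = offsetV + v * 0.5f;
}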

Caching shadow maps does not seem to work well. One of my motivations for going with cascaded shadow maps was the hope that I'd be able to update the first shadow map level every frame, but only every 2 frames for the 2nd level, every 4 frames for the 3rd, and every 8 frames ( or so ) for the 4th. This doesn't work at all, simply because you can have moving objects in the scene, which cause any shadow map that is not updated every frame to be projected incorrectly onto the geometry; objects then appear either completely in shadow or completely in light, depending on the direction they are traveling relative to the camera.. So, forget about caching shadow maps: you'll have to recalculate all of them every frame anyway, even if you're not using algorithms derived from perspective shadow maps.

Conclusion:

I'm currently using the packed cascaded shadow maps algorithm. Performance is pretty bad ( relatively speaking.. I still get more than 60 fps in the screenshots shown in this thread, but the performance hit compared to no shadows is around 100-150% ). The range is limited to 5.5 km and I'm not sure how to increase it ( without killing the performance ). All the other shadow mapping techniques have their own limitations. Maybe a hybrid technique would work better; for example, I'm considering experimenting with a cascaded light space shadow map technique ! By using two levels of shadow maps, each rendered with LSSM, I could maybe achieve much better results at a better framerate ! Unfortunately, all those experiments take a lot of time, so who knows what will happen in the future ?





Comments

jollyjeffers
Very interesting post!

Have you tried logarithmic shadow maps? I was reading through their slide-deck a few weeks back and it struck me as quite interesting.

As for your problems - is it something where you can get acceptable results on your target hardware, but leave the option to ramp up the SM quality for those with turbo-charged hardware?

hth
Jack
February 13, 2007 02:28 PM
RAZORUNREAL
Ouch... Reading that, are you sure shadow volumes are such a bad trade-off? They take a lot of fillrate, sure, but compared to rendering huge chunks of the scene into high resolution textures, it might not be so bad. Maybe you could extrude shadows different distances depending on the size of the object? I'm just guessing of course. You'd probably have to make low resolution models of everything to generate shadows from, which would be a pain. But robust volume generation isn't that hard, I know of a paper about a simple algorithm that can generate an optimal volume no matter how many triangles share an edge. Haven't got it bookmarked on this pc though.

I don't know, it just seemed like you discarded them out of hand.
February 13, 2007 09:07 PM
Ysaneya
Quote:Original post by jollyjeffers
Very interesting post!

Have you tried logarithmic shadow maps? I was reading through their slide-deck a few weeks back and it struck me as quite interesting.


Ah yes, I remember reading that one too some time ago.

Unfortunately, it requires special hardware support.. if I understood the article well enough, it requires being able to rasterize curved triangles. At the moment, they render their curved triangles by drawing a bounding rectangle and, for each pixel, testing whether the pixel is inside the curved triangle :)

That might be do-able on DX10, but that's going a bit far just for a shadowing technique.. and I need it to work on DX9-level cards too (meaning: pixel shader 2.0).
February 14, 2007 04:27 AM
Ysaneya
Quote:Original post by RAZORUNREAL
Ouch... Reading that, are you sure shadow volumes are such a bad trade-off? ...
I don't know, it just seemed like you discarded them out of hand.


I discarded them quickly because I worked on them pretty deeply a few years ago (for the article published in ShaderX2).

Their inability to handle alpha-masked textures is a killer. For vegetation, for industrial textures in stations, or even more recently, for city domes:
http://www.fl-tw.com/Infinity/Media/Screenshots/city_20.jpg

Everything in the dome would appear shadowed with them..
February 14, 2007 04:31 AM
qinEtiQ
Very interesting journal! It actually convinced me to get a GameDev-Account... ;-)

Can you describe a bit more how you solved the problem of a flickering shadow when you move the camera? (Because that's what I gnaw on...)

Thanx,

Tim
February 14, 2007 09:45 AM
Brian Lawson
Quote:Of course, increasing the number of shadow maps means rendering the scene N times rather than once. It's even worse than you'd think, because the frustums of the shadow maps also get bigger and bigger, so as N increases, the amount of scene geometry that needs to be rendered into each shadow map increases too. The biggest level ( the 4th one when N=4 ) almost requires rendering the whole scene. And here, frustum culling will not help, since we're rendering from the light's point of view ( which affects the whole scene ).


I find the previous paragraph to be confusing and unclear and possibly incorrect. In a perfect world it means rendering N subsets of the scene ONE TIME each -- for a combined total ONE scene render.

This one below makes sense -- but clearly contradicts what is said in the previous paragraph:

Quote:- for level 1, SM is 1024x1024 up to a distance of 150m.
- for level 2, SM is 1024x1024 for distances between 150m and 500m.
- for level 3, SM is 1024x1024 for distances between 500m and 1666m.
- for level 4, SM is 1024x1024 for distances between 1666m and 5555m.


With a breakdown of distances like that -- there is no reason why the total rendered objects should ever amount to more than the entire scene re-rendered ONE TIME. i.e. roughly 1/4 of the scene will be rendered into each shadow map (there will be a few objects that overlap into two shadow maps -- i.e. boundary cases).

Consider this from a top down orthographic view of the view frustum. You're simply dividing it up into four sections. Each shadow map will use its own virtual orthographic frustum/projection to render its assigned section (subset) of the scene. Like I said -- the only redundant rendering will come from objects that sit on a boundary and can then be considered to be in both regions -- and will then get rendered accordingly into each shadow map for the assigned region/subset.

Hopefully that makes sense? :)

--
Brian L.
March 08, 2007 09:30 AM