Opinions on "Pixel-Correct Shadow Maps with Temporal Reprojection .."


Has anyone read or implemented the shadowing technique from "Pixel-Correct Shadow Maps with Temporal Reprojection and Shadow Test Confidence"? Does anyone know of any major drawbacks? I've read over the paper a few times, and it doesn't look like it would be too difficult to implement, but these research techniques sometimes have failure cases that aren't mentioned much in the papers (I'm remembering PSM and related frustum-warping techniques). There's no demo, but the paper shows results for the dueling-frusta case with pixel-correct shadows (after a certain number of frames). I was quite impressed by that screenshot, considering the dueling-frusta case is notorious for problems. It doesn't seem like it would work with dynamic scenes (objects moving), but I don't need dynamic scenes, so that's not a drawback for me.

I have implemented a basic PSSM/VSM shadow-mapping algorithm, but still end up with rather noticeable aliasing in large outdoor scenes. With fewer splits (3 or 4) and VSM, I end up with very soft shadows because there isn't enough resolution. Going the other way (more splits) still resulted in noticeable aliasing, and I had trouble blurring the shadow maps for VSM while maintaining good frame rates on the GeForce 6/7 line of cards. My PSSM implementation is pretty basic (I don't do any scene analysis), but even the GPU Gems 3 demo that does has very noticeable artifacts.

So, basically, I'm thinking of implementing that technique for this project I'm working on, but I wanted to check whether people knew of major drawbacks or reasons why this technique would fail in the general case. It doesn't seem like it would run too slowly on, say, a 7900 GT, which is roughly what I'm aiming for. If no one knows of any drawbacks, I'll probably try to implement it. Thanks for any comments or suggestions.
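
(For reference, by "basic" I mean my split distances are just the usual practical scheme from the PSSM work, a lambda blend of logarithmic and uniform distributions, along these lines:)

float splitDistance(float zNear, float zFar, int i, int numSplits, float lambda)
{
    // Practical split scheme: blend the logarithmic and uniform distances
    // for the near plane of split i (i = 1..numSplits-1).
    float fLog = zNear * pow(zFar / zNear, (float)i / numSplits);
    float fUni = zNear + (zFar - zNear) * (float)i / numSplits;
    return lerp(fUni, fLog, lambda);
}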

That's an interesting and very original idea, but I'd like to see it running in practice. I'm not so sure about the case where no pixels exist in the history buffer (due to the parallax effect). Apparently they're using the depth difference as a trick... and as soon as tricks are introduced, my confidence goes down.

But the idea behind it is brilliant and, I think, totally novel. Now, *if it works*, let's hope there won't be a patent to prevent us from using it :)

Ah, another catch: they mention using multiple render targets to avoid an extra pass. But if I'm not mistaken, you need alpha blending to update this history buffer, and because of the MRT restrictions (the blend state applies to every target), that would mean using alpha blending for the color buffer too. Of course, you can always write 1 to the alpha channel of the color fragment, but it sounds like "another trick".
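
To illustrate what I mean (a pure sketch, with invented function names): with SrcAlpha / InvSrcAlpha blending forced on both targets, writing alpha = 1 makes the color target pass through unchanged while the history target actually blends:

struct PSOutput
{
    float4 color   : COLOR0;  // alpha = 1, so src*1 + dst*0 leaves it as-is
    float4 history : COLOR1;  // blended by its alpha (the confidence)
};

PSOutput main(float2 uv : TEXCOORD0)
{
    PSOutput o;
    o.color   = float4(shadeScene(uv), 1.0);
    o.history = float4(newShadowSample(uv).xxx, sampleConfidence(uv));
    return o;
}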

Y.

Quote:
But the idea behind it is brilliant and, I think, totally novel. Now, *if it works*, let's hope there won't be a patent to prevent us from using it :)

They claim the Real-Time Reprojection Cache is somewhat similar to theirs. I only looked it over briefly, but it seems to touch on applying reprojection to shadow mapping. The "Pixel-Correct ..." technique seems much more developed than that brief mention.

Quote:
But if I'm not mistaken, you need to use alpha-blending in order to update this history buffer.

Perhaps I am mistaken, but I was under the impression they were updating it via render target ping-ponging.

Thanks for the comments. I too thought the depth difference was a little hackish. The Halton-sequence jittering seems a little hackish too, but perhaps it works well in practice.

I think I'm going to give it a shot, as it doesn't seem like it would take too long to implement once I figure out the right amount of pseudo-random sub-pixel jitter. They seem to gloss over that step at the end, and it looks important.
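
If it is ping-ponging, I imagine the update pass looks roughly like this. To be clear, the g_* names and the two helper functions are placeholders I made up, and the blend at the end is just my reading of their exponential smoothing, not code from the paper:

float4 updateHistory(float2 uv : TEXCOORD0) : COLOR
{
    // Recover the world-space position of this pixel from the depth buffer.
    float3 worldPos = reconstructWorldPos(uv, tex2D(g_Depth, uv).r);

    // Re-project into last frame's screen space and fetch the old result
    // from the previous history buffer (the "ping").
    float4 prev    = mul(float4(worldPos, 1.0), g_PrevViewProj);
    float2 prevUV  = prev.xy / prev.w * float2(0.5, -0.5) + 0.5;
    float  history = tex2D(g_PrevHistory, prevUV).r;

    // Standard shadow test against this frame's jittered light projection.
    float4 lp     = mul(float4(worldPos, 1.0), g_JitteredLightViewProj);
    float2 smUV   = lp.xy / lp.w * float2(0.5, -0.5) + 0.5;
    float  shadow = (tex2D(g_ShadowMap, smUV).r < lp.z / lp.w - g_Bias) ? 0.0 : 1.0;

    // Blend by the shadow-test confidence (section 3.3) and write to the
    // other history buffer (the "pong"); swap the two targets each frame.
    float c = getConfidence(smUV);
    return lerp(history, shadow, c);
}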

Quote:
Original post by Ysaneya
But the idea behind it is brilliant and, I think, totally novel.

Uhh, I mean no offense to the original authors, but it's not really very novel: it's iterative, jittered supersampling applied to shadow maps. There are a few shadow-map-specific details (multiple projections), but they are easily handled by letting the shadow degenerate in those cases. They use a really simple "confidence" estimate, though you could do a bit more work to reuse data better.

There was a paper this year at Graphics Hardware that attacked temporal coherence in a much more general (and, IMHO, more compelling) way [Edit: now linked above :)]: general data caching with reprojection. This shadowing method is actually less flexible than just using that caching together with some jittering on the shadow term. The GH paper even gives an example of accelerating a typically expensive PCF kernel.

First, the advantages of these techniques: they work great when you have a high frame rate. Once you're beyond the monitor refresh rate there is *no need* to keep rendering redundant frames; indeed, it's a huge waste of processing power. In those cases we definitely want to supersample shadows, shaders, or even the whole framebuffer. With iterative schemes this can be handled entirely dynamically, in that an increase in rendering cost translates into a loss of quality rather than a loss of performance.

The disadvantage, of course, is that if you don't have a high frame rate, iterative supersampling techniques waste time in the best case (updating the history buffer, etc.) and make quality worse in the worst case (extra flickering if you're not careful). With shadows this problem is magnified, since *either* the camera *or* any object in the scene moving will cause trouble for the caching. Indeed, if both are moving, the problem can become pretty significant.

The paper mentions needing 10-60 frames to "converge", and that's with a fairly high-resolution shadow map (1024^2 for fairly small scenes). The framebuffer is a pretty minimal 1024^2 as well. With larger shadow projections the convergence rates would undoubtedly get worse: convergence takes at least N frames, where N is the number of screen pixels covered by the largest projected shadow-map texel (a footprint of ~8x8 pixels in this case, so ~64 frames; that's not too terrible).

Thus if your game is running at 60fps, 60 frames (a full second) of convergence is unacceptable for even a moderate number of dynamic objects/cameras. If your scene is largely static, you should be prebaking/precomputing your lighting anyway (which is effectively what this technique is doing).

That said, I want to reiterate that if very little is changing per frame, re-rendering the same thing is stupid. This technique is a simple way to do something a bit more clever with fairly minimal overhead, but it's not particularly novel IMHO.

I disagree. I've never seen it before in any paper, game, or demo... (not that I read every shadow-mapping paper that gets written, but I read a lot of them). Frankly, it's the first time I've actually seen it mentioned as a possibility. It immediately made intuitive sense to me from the title alone.

I think your criticism may be jumping the gun; have you tried implementing it? It seems like a very simple and cheap way to improve shadow mapping. I assume you would use it to improve the quality of an already fairly decent shadow-mapping system, like cascaded shadows, so the convergence shouldn't be too noticeable. I don't see why you can't avoid flickering if you're careful.

That said I guess it would be nice to see a demo before trying to implement it myself...

Looks very interesting. I can see one "limitation" right off the bat: it's going to be tricky getting more than one shadowing light working at once. You might need more than one history buffer, though you might be able to share the same history buffer across all lights.

I have a couple of questions; maybe you guys understand what's going on better than I do.

In section 3.3 he talks about computing the confidence as

c = 1 - 2 * max(|x - center_x|, |y - center_y|)

I'm confused about which x, y and center_x, center_y he's talking about. I assume x and y are coordinates of the history-buffer pixel being rendered, in the range [0, 1], right? Or is the range [-1, 1]?
And what does he mean by center_x and center_y? He says the center of the pixel, but the center of what pixel? How is this center computed?

Also, I understand why jittering the light is important, but how does one go about "jittering the light space projection window in the light view plane"?

thanks
-chris

Quote:
Original post by Matt Aufderheide
I disagree. I've never seen it before in any paper, game, or demo... (not that I read every shadow-mapping paper that gets written, but I read a lot of them).

Oh, perhaps not in shadow mapping, but iterative jittered supersampling has been around since before I was born. Applying it to shadow maps - or indeed to any intermediate result in a shader computation - isn't much of a stretch, IMHO.

Quote:
Original post by Matt Aufderheide
I think your criticism may be jumping the gun; have you tried implementing it?

Not specifically for shadows, but I've implemented iterative supersampling algorithms many times. They work - and look especially good in screenshots - but of course they're only useful at high frame rates; otherwise they make a negligible difference and add overhead.

But so as to be perfectly clear, let me re-reiterate that people should definitely be doing something clever with the extra GPU power they have. Iterative supersampling is the obvious thing to do and degrades pretty well. Applying it to terms that are expected to alias more (potentially shadows) also makes sense. If those terms are low-frequency or largely static, so much the better.

Still, I prefer the much more general treatment of the topic in the GH paper. There's nothing about this technique specific to shadow maps; indeed, the paper mostly describes the "obvious" application of iterative supersampling to shadows, modulo some details (the domain-specific confidence checking).

Anyways I didn't mean to sound harsh, or imply that the technique is useless. It's just not exactly an epiphany IMHO.

Quote:
Original post by coderchris
In section 3.3 he talks about computing the confidence as
c = 1 - 2 * max(|x - center_x|, |y - center_y|)

I'm confused about which x, y and center_x, center_y he's talking about.

It's just the projected light-space "distance" (inf-norm) from the projection of the current fragment to the center of the relevant (point-sampled) texel in shadow space, which is assumed to be 100% "correct". (So c = 1 when (x, y) == (center_x, center_y).)

Quote:
Original post by coderchris
Also, I understand why jittering the light is important, but how does one go about "jittering the light space projection window in the light view plane"?

Normal projection jittering, except applied to the light projection. It can be as simple as:

Mat4x4 projection = [...];
// jitterx/jittery are sub-texel offsets in NDC units: one shadow-map
// texel spans 2/shadowMapSize across the [-1, 1] projection window.
Mat4x4 jitter = translate(jitterx, jittery, 0);
projection *= jitter;  // shifts the projection window in the light's view plane


Quote:
Original post by AndyTX
people should definitely be doing something clever with the extra GPU power they have
There is no such thing. I don't mean anything rude here, but in my experience only people who don't actually do games for a living talk about "extra GPU power." Another 100% mythical beast would be "enough memory."

Thanks Andy,

The projection makes sense, but I'm still confused about the confidence equation... The way I understand it: you render a full-screen quad to update the history texture. In that shader, you read the depth and use it to recover the world-space position. Then you back-project this position onto the shadow map to get the shadow-map texture coordinates you would normally use to determine whether a fragment is shadowed. In this paper, though, these coordinates are used as "x" and "y" (is that right?). Then "center_x" and "center_y" are x and y snapped to the nearest texel center (correct?)

Hey, lots of posts (sorry I've been a little busy).

Quote:
Original post by coderchris...

From what I understand, the conf(x, y) term basically says that points (in eye space) whose light-space projections land closer to the center of the corresponding shadow-map texel have a higher confidence of being the correct result (either in shadow or not). Seems like a bit of a hack, IMO, for the case where the points aren't equal (i.e., center_x != x or center_y != y).

The main reason I don't use pre-baked shadows is that I have a sizeable world, so they wouldn't all fit in texture memory (especially when you consider all the other textures, render targets, etc.). I realize I could use a mega-texturing technique or something similar, but this is just for a student/class project, so I'd rather not spend hours baking shadows (with some GI method or something) whenever I change one of my meshes.

Since this is just a little "advanced real-time rendering" class project (we get the entire semester to make something mildly impressive), I'm not bound by the same restrictions as a normal game (I have lots of memory, I'm not doing much on the CPU besides culling, there are no animations to skin, etc.). Talking to some people, it seems PSVSM is going to be a popular choice for shadows, so I thought I might try to make something a little more impressive shadow-wise, in addition to or on top of PSVSM.

After looking at the GH paper more, I do agree with AndyTX that it seems to be a more general solution, applicable to more problems than simply shadows.

Anyways, if I do implement this technique or anything similar, I probably won't post a demo until late December or January (when my class ends), so don't expect anything soon.

The thing that's confusing me a bit is the video ( http://www.cg.tuwien.ac.at/research/publications/2007/Scherzer-2007-PCS/ ). When the camera doesn't move at all, I was expecting the shadow edges not to flicker or update at all.

Y.

Quote:
Original post by Christer Ericson
There is no such thing. I don't mean anything rude here, but in my experience only people who don't actually do games for a living talk about "extra GPU power." Another 100% mythical beast would be "enough memory."

Oh, not at all, I completely agree with you :) Indeed, that's the reason why, IMHO, these iterative techniques don't end up getting used much: a "modern" game is often running far below 100fps, a range in which they make very little difference and add unnecessary overhead. That said, if you make, say, a puzzle game (or something fairly static) targeting 60fps on GeForce 6's, you could certainly do some supersampling on GeForce 8800's :)

Quote:
Original post by coderchris
In this paper, though, these coordinates are used as "x" and "y" (is that right?). Then "center_x" and "center_y" are x and y snapped to the nearest texel center (correct?)

I believe that's correct.

Quote:
Original post by Ysaneya
The thing that's confusing me a bit is the video ( http://www.cg.tuwien.ac.at/research/publications/2007/Scherzer-2007-PCS/ ). When the camera doesn't move at all, I was expecting the shadow edges not to flicker or update at all.

The samples are still being "faded out" with an exponential falloff, so new samples are continually introduced (every frame) and old ones eliminated. Unless convergence is very good, or the falloff is very slow (which would produce artifacts when the camera moves quickly), there will always be a little bit of "flicker". Probably not a huge issue in practice.

And as I missed something in the original post:
Quote:
Original post by wyrzy
My PSSM implementation is pretty basic (I don't do any scene analysis), but even the GPU Gems 3 demo that does has very noticeable artifacts.

Did you check out the PSVSM demo in Gems 3 (in the demo provided with the SAVSM chapter)? It's not covered in the actual chapter text, but the implementation is straightforward. I don't claim that it's a perfect implementation (I threw it together fairly quickly), but it produces pretty good results in practice IMHO and eliminates many of the artifacts associated with PSSM. In particular once you get to something like 3x 1024 VSM's with 4x shadow MSAA things are getting pretty near sub-pixel accuracy even for large framebuffers.

Now there are some bugs and inefficiencies in the split computation code, but generally they're explained in the source pretty well with "TODO"'s :D

Quote:
Original post by AndyTX
Did you check out the PSVSM demo in Gems 3 (in the demo provided with the SAVSM chapter)? It's not covered in the actual chapter text, but the implementation is straightforward. I don't claim that it's a perfect implementation (I threw it together fairly quickly), but it produces pretty good results in practice IMHO and eliminates many of the artifacts associated with PSSM. In particular once you get to something like 3x 1024 VSM's with 4x shadow MSAA things are getting pretty near sub-pixel accuracy even for large framebuffers.


Yeah, I checked it out briefly. I have an 8-series card, but I'd really like it to run on 7-series cards under D3D9 too. In that case I can't really use MSAA (I'm using RGBA32F targets), and the manual bilinear filtering of RGBA32F slows me down a little too.

I have coded up an initial PSVSM implementation, but I don't claim it's optimal or perfect, to say the least. I have had trouble with VSM at shallow angles with respect to the light (the shadow becomes unnaturally soft), though shallow angles are a difficult case in general. I'll probably try improving my PSVSM before attempting other techniques, or do something like VSM on the terrain and jittered PCF on objects close to the camera (otherwise their shadows appear unnaturally soft). I'm just planning ahead so I have a rough schedule, since my deadline can't be moved or delayed.

The 3x 1024x1024 blur (with manual filtering on RGBA32F targets) pushes the 7-series somewhat as well. However, I haven't really taken any time to optimize my implementation yet.
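
For reference, my blur is just the usual separable pass, run once horizontally and once vertically; roughly this, where g_TexelDir and g_Moments are my names:

float4 blurMoments(float2 uv : TEXCOORD0) : COLOR
{
    // 5-tap box filter in one direction: g_TexelDir is (1/width, 0) for
    // the horizontal pass and (0, 1/height) for the vertical pass.
    float4 sum = 0;
    for (int i = -2; i <= 2; ++i)
        sum += tex2D(g_Moments, uv + g_TexelDir * i);
    return sum / 5.0;
}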

Anyways, thanks for the comments, I think I have enough to work with.

Ok, so I have it all implemented as described in the paper, and I'm almost getting results. My shadow edges are soft and "swimming" (it almost looks like the shadow is "boiling"), with no movement going on in the scene. I've also noticed that if I increase the amount of jitter, I get jittery, penumbra-like shadows, which looks cool, but I was under the impression that it's supposed to converge toward crisp shadows... I'm almost positive it has something to do with my confidence function; does anybody see anything wrong with this:


// "projected" is the shadow-map texture coordinate of the current pixel
// in the history buffer, in [0, 1].
float getConfidence(float2 projected)
{
    // Position in texel units.
    float2 realPos = projected * shadowMapSize;
    // 'center_x'/'center_y': the center of the texel this position falls in.
    float2 texelCenter = floor(realPos) + float2(0.5, 0.5);
    // Inf-norm distance to the texel center is at most 0.5, so c is in [0, 1].
    float2 posAbs = abs(realPos - texelCenter);
    return 1.0 - max(posAbs.x, posAbs.y) * 2.0;
}


Perhaps I shouldn't be using the projected shadow-map coords for these calculations?

Here's a pic of what's going on in case you're curious... as you can see, the shadow edges aren't what they should be, and there are some weird artifacts near where the boxes touch the floor:

http://img502.imageshack.us/my.php?image=shadowpicta2.jpg

Quote:
Original post by coderchris
which looks cool, but I was under the impression that it's supposed to converge toward crisp shadows... I'm almost positive it has something to do with my confidence function; does anybody see anything wrong with this:

I haven't had a chance to look at your confidence function in detail, but one thing to note is that you shouldn't be jittering your samples by more than a single shadow-map texel. Since you're effectively using a box filter to reconstruct the multisampling, you definitely don't want too "wide" a filter, or you'll get over-blurring like you're seeing.

I believe the paper also uses a power function to control some of this (the video shows C^2, C^15, etc., which I assume is the confidence function). I may be wrong, as I didn't read it over extensively, but that's my recollection... higher powers will tend to disfavor samples that land far from the texel centers, though they conversely won't converge as quickly to a nice solution.
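
In code terms I'd expect those two knobs to look something like this; the names and the exact texel scaling are my guesses, not anything from the paper:

// Keep the jitter within +/- half a shadow-map texel: one texel spans
// 2/shadowMapSize across the [-1, 1] projection window.
float2 jitterNDC = (haltonXY - 0.5) * (2.0 / shadowMapSize);

// Sharpen the confidence with a power (the video's C^2, C^15, ...):
// higher exponents trust only near-center samples but converge slower.
float c = pow(getConfidence(projectedUV), confidencePower);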

Quote:
Original post by coderchris
I've also noticed that if I increase the amount of jitter, I get jittery, penumbra-like shadows

Do you jitter using the Halton sequence, as suggested in the paper? (It's a low-discrepancy sequence, not really a random one.) I don't know whether just using rand()/float(RAND_MAX) would give an acceptable jitter pattern, but there may be a reason they specifically recommend Halton.
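
For reference, the Halton sequence is just the radical inverse of the sample index in a prime base; a minimal sketch (untested, but this is the standard construction):

float halton(int index, int base)
{
    // Radical inverse: mirror the base-b digits of the index about the
    // radix point. Bases 2 and 3 give a well-distributed 2D sequence in [0, 1).
    float result = 0.0;
    float f = 1.0 / base;
    int i = index;
    while (i > 0)
    {
        result += f * (i % base);
        i /= base;
        f /= base;
    }
    return result;
}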


Regarding getConfidence(), it looks fine to me, but I've been busy lately and haven't had time to even start on this.

Also, as AndyTX mentioned, they use exponential smoothing (see page 3).

I am using the Halton sequence; I got the code outline from the very site you linked to. What range of values should I be passing into it? At the moment I have a counter variable that gets incremented every frame, and I pass that value along with a dimension to get the Halton number. When my counter reaches 1000 I reset it to 1; is that a good idea? Also, for the x jitter I use base 2, for the y jitter base 3, and for rotational jitter base 5... It seems pretty random and well distributed, so I don't think my jittering is too far off, except I guess I could scale it a little.

I also raise my confidence to a power as they suggest. I'll just have to mess around with different configurations of the confidence computation until it actually converges, because at the moment it doesn't converge at all; it just stays random and jittery.

