Opinions on "Pixel-Correct Shadow Maps with Temporal Reprojection .."

Has anyone read / implemented the shadowing technique from "Pixel-Correct Shadow Maps with Temporal Reprojection and Shadow Test Confidence"? Does anyone know of any major drawbacks? I've read over the paper a few times, and it doesn't look like it would be too difficult to implement, but these research techniques sometimes have failure cases the papers don't mention much (I'm remembering PSM and the related frustum-warping techniques). There's no demo, but it shows results for the dueling frusta case with pixel-correct shadows (after a certain number of frames). I was quite impressed by that screenshot, considering the dueling frusta case is known for problems. It seems like it wouldn't really work with dynamic scenes (objects moving), but I don't need it to, so that's not really a drawback for me.

I have implemented a basic PSSM/VSM shadow mapping algorithm, but I still end up with rather noticeable aliasing in large outdoor scenes. I've tried fewer splits (3 or 4) and using VSM, but I end up with very soft shadows because there isn't enough resolution. Going the other way (more splits) still resulted in noticeable aliasing, and I had trouble blurring the shadow maps for VSM while maintaining good frame rates on the GeForce 6/7 line of cards. My PSSM implementation is pretty basic (I don't do any scene analysis; my split setup is sketched at the end of this post), but even the GPU Gems 3 demo that does has very noticeable artifacts.

So, basically, I'm thinking of implementing this technique for the project I'm working on, but I wanted to check whether people knew of major drawbacks / reasons why it would fail in the general case. It doesn't seem like it would run too slowly on, say, a 7900 GT, which is roughly what I'm aiming for. If no one knows of any drawbacks I'll probably try implementing it. Thanks for any comments / suggestions.
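For reference, the split setup I mentioned is just the usual "practical split scheme" (a blend of logarithmic and uniform split distances); this snippet and the lambda blend factor are my own sketch, not anything from the temporal reprojection paper:

#include <cmath>
#include <vector>

// Practical split scheme: blend logarithmic and uniform split distances
// between the near and far planes. lambda = 0.5 is just a common default.
std::vector<float> ComputeSplitDistances(float nearPlane, float farPlane,
                                         int numSplits, float lambda = 0.5f)
{
    std::vector<float> splits(numSplits + 1);
    for (int i = 0; i <= numSplits; ++i)
    {
        float t = static_cast<float>(i) / numSplits;
        float logSplit = nearPlane * std::pow(farPlane / nearPlane, t);
        float uniSplit = nearPlane + (farPlane - nearPlane) * t;
        splits[i] = lambda * logSplit + (1.0f - lambda) * uniSplit;
    }
    return splits;
}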
That's an interesting and very original idea, but I'd like to see it running in practice. I'm not so sure about the case where no pixels exist in the history buffer (due to parallax). Apparently they're using the depth difference as a trick... and as soon as tricks are introduced, my confidence goes down.

But the idea behind it is brilliant and totally novel I think. Now *if it works* let's hope there won't be a patent to prevent us from using it :)

Ah, another catch: they mention using multiple render targets to avoid an extra pass. But if I'm not mistaken, you need alpha blending to update this history buffer, and because of the MRT restrictions that would mean you need alpha blending on the color buffer too. Of course you can always write 1 to the alpha channel of the color fragment, but it sounds like "another trick".

Y.
Quote:But the idea behind it is brilliant and totally novel I think. Now *if it works* let's hope there won't be a patent to prevent us from using it :)

They claim the Real Time Projection Cache is somewhat similar to theirs. I only briefly looked over it, but it seems like they briefly discuss applying their technique to shadow mapping. The "Pixel-Correct ..." technique seems much more developed than that brief mention.

Quote:But if I'm not mistaken, you need to use alpha-blending in order to update this history buffer.

Perhaps I am mistaken, but I was under the impression they were updating it via render target ping-ponging.

Thanks for the comments. I too thought the depth difference was a little hackish. The Halton-sequence jittering seems a little hackish too, but perhaps it works well in practice.

I think I'm going to give it a shot, as it doesn't seem like it would take too long to implement once I figure out the right amount of pseudo-random sub-pixel jittering that is needed. They seem to kind of gloss over that step at the end, and it seems like it would be important.
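For the jittering I'm picturing something like a 2D Halton sequence; the bases (2 and 3) and the [-0.5, 0.5) sub-texel range here are my guesses, since the paper doesn't spell them out:

#include <utility>

// Van der Corput radical inverse in the given base; pairing bases 2 and 3
// gives a 2D Halton sequence.
float Halton(int index, int base)
{
    float result = 0.0f;
    float f = 1.0f / base;
    int i = index;
    while (i > 0)
    {
        result += f * (i % base);
        i /= base;
        f /= base;
    }
    return result;
}

// Per-frame sub-texel jitter offset in [-0.5, 0.5) texel units.
std::pair<float, float> JitterOffset(int frame)
{
    return { Halton(frame + 1, 2) - 0.5f, Halton(frame + 1, 3) - 0.5f };
}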
Quote:Original post by Ysaneya
But the idea behind it is brilliant and totally novel I think.

Uhh, I mean no offense to the original authors, but it's not really very novel. It's just iterative, jittered super-sampling applied to shadow maps. There are a few details with respect to shadow maps (multiple projections), but they are easily overcome by letting the shadow degenerate in these cases. They actually use a really simple "confidence" estimate, but you can actually do a bit more work to reuse data better.

There was a paper this year at Graphics Hardware that attacked temporal coherence, etc. in a much more general (and, IMHO, more compelling) way [Edit: now linked above :)]: general data caching with reprojection. This shadowing method is actually less flexible than just using that caching together with some jittering on the shadow term. They even give an example of accelerating a typically expensive PCF kernel in the GH paper.

First, the advantages of these techniques: they work great when you have a high frame rate. Once you're beyond the monitor refresh rate there is *no need* to keep rendering redundant frames; indeed it's a huge waste of processing power. In those cases we definitely want to do some supersampling of shadows, shaders, or even the whole framebuffer. With iterative schemes this can be handled entirely dynamically, in that an increase in rendering cost translates into a loss in quality rather than a loss of performance.

The disadvantage, of course, is that if you don't have a high frame rate, iterative supersampling techniques waste time in the best case (updating the history buffer, etc.) and make quality worse in the worst case (extra flickering if you're not careful). With shadows this problem is magnified since *either* the camera *or* any objects in the scene moving will cause trouble for the caching. Indeed, if both are moving, the problem can become pretty significant.

The paper mentions needing 10-60 frames to "converge", and that's with a fairly high-resolution shadow map (1024^2 for a fairly small scene). The framebuffer is a pretty minimal 1024^2 as well. With larger shadow projections the convergence would undoubtedly get slower: it takes at least N frames, where N is the screen-space size of the largest projected shadow map texel (~8x8 in this case? That's not too terrible).

Thus if your game is running at 60fps, 60 frames of convergence is unacceptable for even a moderate number of dynamic objects/cameras. If your scene is largely static, you should be prebaking/precomputing your lighting anyway (which is effectively what this technique is doing).

That said, I want to reiterate that if very little is changing per frame, re-rendering the same thing is stupid. This technique is a simple way to do something a bit more clever with fairly minimal overhead, but it's not particularly novel IMHO.
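To make "reuse data better" a bit more concrete, here's the flavor of confidence-weighted history update I have in mind; this is a generic sketch of iterative accumulation, not the paper's exact update rule:

#include <algorithm>

struct HistoryTexel
{
    float shadow;      // accumulated shadow term
    float confidence;  // how much we trust the accumulated value
};

// Blend a new jittered shadow sample into the reprojected history. High
// confidence samples pull the average quickly; disoccluded pixels with no
// valid history (confidence ~0) just take the new sample.
HistoryTexel UpdateHistory(HistoryTexel history, float newShadow, float newConfidence)
{
    float w = newConfidence / std::max(newConfidence + history.confidence, 1e-4f);
    HistoryTexel out;
    out.shadow = history.shadow + w * (newShadow - history.shadow);
    out.confidence = std::max(history.confidence, newConfidence);
    return out;
}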
I disagree. I've never seen it before in any paper, game, or demo... (not that I read every shadow mapping paper that gets written, but I read a lot of them). Frankly, it's the first time I've actually seen it mentioned as a possibility. It immediately made intuitive sense to me just from the title alone.

I think your criticism may be jumping the gun; have you tried implementing it? It seems like a very simple and cheap way to improve shadow mapping. I assume you would use it to improve the quality of an already fairly decent shadow mapping system, like cascaded shadows, so the convergence shouldn't be too noticeable. I don't see why you can't avoid flickering if you are careful.

That said, I guess it would be nice to see a demo before trying to implement it myself...
Looks very interesting. I can see one "limitation" right off the bat: it's going to be tricky getting more than one shadowing light working at once. You might need more than one history buffer, though you might be able to use the same history buffer for all lights.

I have a couple of questions; maybe you guys understand what's going on better than I do.

In section 3.3 he talks about computing the confidence as
c = 1 - max(|x - centerx|, |y - centery|) * 2

I'm confused about which x, y and centerx, centery he's talking about. I assume the x and y are in terms of the history buffer being rendered to, in the range [0, 1], right? Or is the range [-1, 1]?
And what does he mean by centerx and centery? He says the center of the pixel, but the center of what pixel? How is this center computed?

Also, I understand why jittering the light is important, but how does one go about "jittering the light space projection window in the light view plane"?

thanks
-chris
Quote:Original post by Matt Aufderheide
I disagree. I've never seen it before in any paper, game, or demo... (not that I read every shadow mapping paper that gets written, but I read a lot of them).

Oh perhaps not in shadow mapping, but iterative jittered supersampling has been around since before I was born. Applying it to shadow maps - or indeed any intermediate result in a shader computation - isn't exactly too far of a stretch IMHO.

Quote:Original post by Matt Aufderheide
I think your criticism may be jumping the gun; have you tried implementing it?

Not specifically for shadows, but I've implemented iterative supersampling algorithms many times. They work - and look especially good in screenshots - but of course they're only useful with high frame rates; otherwise they make a negligible difference and add overhead.

But so as to be perfectly clear, let me re-reiterate that people should definitely be doing something clever with the extra GPU power they have. Iterative supersampling is the obvious thing to do and degrades pretty well. Applying it to terms that are expected to alias more (potentially shadows) also makes sense. If those terms are low-frequency or largely static so much the better.

Still, I prefer the much more general treatment of the topic in the GH paper. There's nothing about this technique that's specific to shadow maps; indeed, the paper mostly describes the "obvious" application of iterative supersampling to shadows, modulo some details (the domain-specific confidence checking stuff).

Anyways I didn't mean to sound harsh, or imply that the technique is useless. It's just not exactly an epiphany IMHO.
Quote:Original post by coderchris
in section 3.3 he talks about computing the confidence as
c = 1 - max(|x - centerx|, |y - centery|) * 2

I'm confused about which x, y and centerx, centery he's talking about.

It's just the projected light space "distance" (inf-norm) from the projection of the current fragment to the center of the relevant (point-sampled) texel in shadow space, which is assumed to be 100% "correct". (So c = 1 when x,y == centerx,centery.)
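In code it's something like this (assuming UVs in [0, 1] and a square shadow map of 'resolution' texels; the texel-space convention is my reading of the paper):

#include <algorithm>
#include <cmath>

float ShadowConfidence(float u, float v, float resolution)
{
    // Fragment position in continuous shadow-map texel coordinates.
    float tx = u * resolution;
    float ty = v * resolution;

    // Center of the texel this fragment falls into.
    float cx = std::floor(tx) + 0.5f;
    float cy = std::floor(ty) + 0.5f;

    // The inf-norm distance to that center is at most 0.5 texel, so scaling
    // by 2 gives 1 at the texel center and 0 at a texel corner.
    float d = std::max(std::fabs(tx - cx), std::fabs(ty - cy));
    return 1.0f - 2.0f * d;
}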

Quote:Original post by coderchris
Also, I understand why jittering the light is important, but how does one go about "jittering the light space projection window in the light view plane"?

Normal projection jittering, except for the light projection. Can be as simple as:
Mat4x4 projection = [...];                       // the light's projection matrix
Mat4x4 jitter = translate(jitterx, jittery, 0);  // this frame's sub-texel offset
projection *= jitter;
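One detail the paper glosses over: jitterx/jittery need to end up sub-texel sized once projected, e.g. an offset of half a texel corresponds to 1 / shadowMapSize in NDC units. Exactly how you fold that into the matrix (and on which side you multiply) depends on your math conventions, so the above is just the shape of it.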
Quote:Original post by AndyTX
people should definitely be doing something clever with the extra GPU power they have
There is no such thing. I don't mean anything rude here, but in my experience only people who don't actually do games for a living talk about "extra GPU power." Another 100% mythical beast would be "enough memory."

Thanks Andy,

The projection makes sense, but I'm still confused about the confidence equation...
So the way I understand it is: you render a full-screen quad to update the history texture. In this shader, you read the depth and use it to recover the world-space position. Then you back-project this position onto the shadow map to get the shadow map texture coordinates you would normally use to determine whether a fragment is shadowed. However, in this paper these coordinates are used as "x" and "y" (is that right?). Then "centerx" and "centery" are that x and y snapped to the nearest texel center (correct?)
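In code, I'm picturing the reconstruction roughly like this (just my understanding; the matrices and the [-1, 1] depth convention are assumptions on my part):

struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };

Vec4 Mul(const Mat4& a, const Vec4& v)
{
    return { a.m[0][0]*v.x + a.m[0][1]*v.y + a.m[0][2]*v.z + a.m[0][3]*v.w,
             a.m[1][0]*v.x + a.m[1][1]*v.y + a.m[1][2]*v.z + a.m[1][3]*v.w,
             a.m[2][0]*v.x + a.m[2][1]*v.y + a.m[2][2]*v.z + a.m[2][3]*v.w,
             a.m[3][0]*v.x + a.m[3][1]*v.y + a.m[3][2]*v.z + a.m[3][3]*v.w };
}

// screenU/screenV in [0, 1], depth read back from the depth buffer (here
// assumed to map to [-1, 1] clip z). invViewProj undoes the camera transform,
// lightViewProj projects into the shadow map.
void ScreenToShadowUV(float screenU, float screenV, float depth,
                      const Mat4& invViewProj, const Mat4& lightViewProj,
                      float& shadowU, float& shadowV)
{
    // Unproject the screen sample back to a world-space position.
    Vec4 clip = { screenU * 2.0f - 1.0f, screenV * 2.0f - 1.0f,
                  depth * 2.0f - 1.0f, 1.0f };
    Vec4 world = Mul(invViewProj, clip);
    world.x /= world.w; world.y /= world.w; world.z /= world.w; world.w = 1.0f;

    // Project into light space; the resulting UV is the "x, y" that gets
    // compared against the nearest texel center for the confidence term.
    Vec4 light = Mul(lightViewProj, world);
    shadowU = (light.x / light.w) * 0.5f + 0.5f;
    shadowV = (light.y / light.w) * 0.5f + 0.5f;
}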

This topic is closed to new replies.