Quote:Original post by Yann L
the price you pay for quality. But 4 samples are acceptable in terms of speed, and are comparable to nvidias hardware PCF from a quality point of view.
Comparable to hardware PCF? My experience has been the opposite, but I have only tested my code on ATI cards, so I don't know whether the result looks similar on NVidia cards (note to self: take some time to test it). Logic tells me it should, since I'm only using the ARB_fp path. I only get 4 shades of brightness at my shadow edges (with 4 samples, that is), whereas from what I saw, NVidia's PCF looks like a bilinear filter applied to the compare results (nice and smooth). It's fine if you are looking at the shadows from a distance, but if you zoom in, even 4 samples look quite ugly. 8 samples is a bit better, but you can still notice the sampling. 16 samples is almost perfect but horribly slow. Do you have some screenshots of 4 samples so I can compare with what I'm getting?
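To show what I mean by "4 shades": averaging four binary depth-compare results can only produce five distinct brightness levels, which is where the banding comes from. A quick Python sketch (the pcf4 helper is hypothetical, just illustrating the math, not actual shader code):

from itertools import product

def pcf4(shadow_tests):
    """Average four binary depth-compare results, as a 4-tap PCF does."""
    return sum(shadow_tests) / 4.0

# Every possible 4-tap outcome: only five distinct brightness levels,
# which is why unfiltered 4-sample PCF shows visible bands at the edges.
levels = sorted({pcf4(c) for c in product((0, 1), repeat=4)})
print(levels)  # [0.0, 0.25, 0.5, 0.75, 1.0]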
Quote:Original post by Yann L
Hmm. How exactly did you do the sampling ? I suspect something specific, but, hmm, can you post the relevant fragment program snippet ?
The sampling for dithering is done in eye space, which is one of the reasons it looks ugly. I basically have a small noise texture (a pattern tiled many times) that is used to randomly offset the texture coordinates after the shadow map projection.
I'm only posting the relevant parts of the shader:
# dithering
TEMP texc;
TEMP screenPos;
TEMP dither;
# texcoord 1 contains the tex coords for shadow projection
TXP dither, fragment.texcoord[1], texture[1], 2D;
MAD dither, dither, 2.0, -1.0;
MUL dither, dither, 0.0005;
RCP texc.w, fragment.texcoord[1].w;
MUL texc, fragment.texcoord[1], texc.w;
MUL screenPos.x, texc.x, 400.0;
MUL screenPos.y, texc.y, 300.0;
FRC screenPos, screenPos;
SGE dither, screenPos, 0.5;
ADD dither.y, dither.y, dither.x;
SGE dither.z, dither.y, 1.1;
SUB dither.z, 1.0, dither.z;
MUL dither.y, dither.y, dither.z;
MUL dither, dither, 0.0005;
After this, "dither" contains an offset that is applied when sampling the shadow map 4 times. I also tried a regular (less random) pattern, but the results are always ugly, and the performance drop is tremendous.
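For reference, here is the offset logic of that snippet traced in Python (the dither_offset helper is hypothetical; I'm assuming an 800x600 framebuffer, so the 400/300 multipliers give a repeating 2x2 pixel pattern):

def dither_offset(x, y, width=800, height=600, scale=0.0005):
    """Reproduce the 2x2 screen-space pattern from the fragment program."""
    tx, ty = x / width, y / height          # normalized coords (texc.xy)
    sx = (tx * 400.0) % 1.0                 # MUL + FRC: 2-pixel grid cells
    sy = (ty * 300.0) % 1.0
    dx = 1.0 if sx >= 0.5 else 0.0          # SGE dither, screenPos, 0.5
    dy = 1.0 if sy >= 0.5 else 0.0
    dy = dy + dx                            # ADD dither.y, dither.y, dither.x
    wrap = 1.0 if dy >= 1.1 else 0.0        # SGE dither.z, dither.y, 1.1
    dy = dy * (1.0 - wrap)                  # SUB + MUL: wrap (1,1) back to (1,0)
    return dx * scale, dy * scale           # final MUL by 0.0005

So over a 2x2 pixel block the four fragments get the four distinct offsets (0,0), (s,s), (0,s) and (s,0).
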
Quote:Original post by Yann L
misunderstanding due to my sloppy terminology: when I said "depth map", I was referring to your texture containing depth values, I didn't mean the GL_DEPTH_COMPONENTxx format specifically. No, you can't directly render a GL_DEPTH texture to the depth buffer without going through an intermediate format (eg. as a packed RGBA or floating point depth map, it can be rendered to the "real" zbuffer using a fragment program). If you keep everything in packed RGBA, then downsampling the map on the hardware is rather straightforward. Don't forget to turn off bilinear filtering, though.
True, but then you do not benefit from NVidia's hardware PCF. All of this is becoming a bit confusing :) So as I understand it, since you were using hardware PCF in your cathedral, you could not use that shadow map resizing trick, or am I even more confused than I thought :p ? Or is there a way to enable hardware PCF from within a pixel shader (doubtful)?
Quote:Original post by Yann L
OK. In that case, the hardware supported downsampling will be a little more involved, and might require a readback operation.
I think I'll just switch all my shadow maps to pixel shaders with depth encoded as RGBA; that way I will have a single path for all cards. That leaves the question of compatibility with older cards like the GF3/GF4: can all of these shaders be implemented with NV pixel shaders?
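For anyone following along, the RGBA encoding itself is simple enough. Here's a generic sketch of splitting a depth value across 8-bit channels and recovering it, done exactly in Python integers for clarity (this is not Yann's shader; in a fragment program you'd do the same thing with MUL/FRC and a DP4 at the end):

def pack_depth24(depth):
    """Split a depth in [0,1] into three 8-bit channels (R, G, B).

    24 bits is plenty for a shadow map; the alpha channel stays free.
    """
    d = int(depth * 0xFFFFFF + 0.5)          # quantize to 24 bits
    return (d >> 16 & 0xFF, d >> 8 & 0xFF, d & 0xFF)

def unpack_depth24(r, g, b):
    """Recover the depth as a weighted sum, like a DP4 in the shader."""
    return ((r << 16) | (g << 8) | b) / 0xFFFFFF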
Quote:Original post by Yann L
hardware can't unpack the texture as needed. First, I simulated depth cubemaps manually, by projecting six separate 2D maps and recombining the results.
I also tried that, but the performance drop is quite impressive too...
Quote:Original post by Yann L
I later dropped cubemaps in favour of dual paraboloid entirely. Even later, when I added the final ARB_FP code path, I used cubemaps again, with optional DP maps as a fallback option and as part of the LOD system.
That makes me wonder.. I know the theory behind DP maps, but would you say they are really worth the effort? As I understand it, you need a fairly highly tessellated scene to avoid the artifacts caused by texture coordinate interpolation, which is no longer linear under the paraboloid projection.
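Just to make sure we're talking about the same math, this is the standard paraboloid projection as I understand it (generic DP mapping, not your implementation; the dp_coords name is mine):

import math

def dp_coords(x, y, z):
    """Project a direction onto the front paraboloid map.

    Returns (s, t) in [0,1]^2 for directions with z >= 0; the back
    hemisphere uses a second map with z negated. The divide by (1 + z)
    is what makes interpolated texcoords non-linear across a triangle.
    """
    l = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / l, y / l, z / l            # normalize the direction
    s = x / (1.0 + z) * 0.5 + 0.5            # paraboloid divide, bias to [0,1]
    t = y / (1.0 + z) * 0.5 + 0.5
    return s, t

Since the per-vertex result is divided by (1 + z), linearly interpolating it across a large triangle is wrong, hence the need for fine tessellation.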
Y.