
## Recommended Posts

Recently I came across a paper called "Convolution Shadow Maps" (CSM for short, to keep them distinct from Cascaded or Coherent shadow maps ;-)). It sounds interesting: no light bleeding, but filtering properties like VSM. Standard 8-bits-per-channel textures instead of floating-point buffers, and for M=4 they need only two RGBA8 textures, and the artefacts look acceptable. What do you think about it? Has anyone started implementing it yet? Any opinions or experiences?
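For reference, the core idea (paraphrasing the paper from memory, so don't take my constants as gospel) is to expand the shadow test \(H(z - d)\), for receiver depth \(d\) and stored depth \(z\), into a truncated Fourier series:

```latex
f(d,z) \;\approx\; \frac{1}{2} + 2\sum_{k=1}^{M}\frac{1}{c_k}
  \Bigl[\cos(c_k d)\sin(c_k z) - \sin(c_k d)\cos(c_k z)\Bigr]
  \;=\; \frac{1}{2} + 2\sum_{k=1}^{M}\frac{\sin\bigl(c_k (z-d)\bigr)}{c_k},
\qquad c_k = \pi(2k-1).
```

Because the sin/cos basis textures depend only on \(z\), they can be filtered and mipmapped linearly, which is where the VSM-like filtering properties come from.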

---
I'm certainly interested in it, as it falls into the category of interesting ways to parameterize the visibility function, along with deep and variance shadow maps. Using a Fourier decomposition is also useful in that it filters linearly, just like variance shadow maps.

I haven't had time to implement it myself yet, but I do have a few concerns:

First, even though lower-precision textures can be used, one needs a lot of terms to get reasonable quality. In the paper they use 16 terms, which already feels a tad abusive of texture memory and bandwidth. And it shows: performance is pretty poor even with extremely simple scenes (e.g. 60 fps for a 512^2 shadow map on an 8800 GTX, for a scene with about a dozen polygons).

I'm not *too* concerned about this, since the problem will go away as hardware gets better, but note that brute-forcing a wide PCF kernel will become reasonable as well; in my testing even 8x8 PCF is already completely reasonable. So, assuming you have a reasonable projection parameterization (which any number of the projection/warping/splitting/etc. algorithms will give you nowadays), the performance benefit of hardware pre-filtering is lost if the alternative technique has to squander those gains on complexity.

Secondly, though the technique does not have the same "light bleeding" artifacts as VSM and other algorithms, it does suffer from significant problems near contact points. In the demo from the authors' page, even with 16 Fourier terms (the maximum in their implementation), "fading to light" is clearly visible near contact points. That's pretty unacceptable considering these are arguably the most important shadows with respect to giving the viewer good depth cues. This is unfortunately a consequence of the Fourier reconstruction, which is arguably not particularly well suited to representing step functions like visibility (it was clearly chosen for its linearity and fairly simple reconstruction).

To make matters worse, this problem grows as the light depth range increases. In the scene from Figure 6 in the paper, if you increase the light depth range from 2 to 10 (i.e. beyond barely covering the tiny scene), the artifacts even with M=16 are as bad as or worse than those shown in the paper for M=2! I don't see a good way around this; it is a fundamental limitation of the way truncated Fourier series work.
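You can see both problems with pencil and paper. Here's a toy evaluation of the truncated series as I understand it from the paper (c_k = pi*(2k-1); not the authors' code): at an exact contact point the reconstruction is pinned at 0.5 no matter how many terms you use, which is precisely the "fading to light" artifact.

```python
import math

def csm_visibility(d, z, M=16):
    """Truncated Fourier reconstruction of the shadow test H(z - d),
    as I understand the paper's expansion (c_k = pi * (2k - 1))."""
    s = 0.5
    for k in range(1, M + 1):
        c = math.pi * (2 * k - 1)
        s += 2.0 * math.sin(c * (z - d)) / c
    return s

# At a contact point (receiver depth == occluder depth) every sine term
# vanishes, so the series gives exactly 0.5 for any M.
print(csm_visibility(0.5, 0.5, M=16))   # exactly 0.5

# Just behind the occluder (should be fully shadowed, i.e. 0.0) the
# reconstruction is still far from 0 -- worse for fewer terms:
print(csm_visibility(0.51, 0.5, M=2))
print(csm_visibility(0.51, 0.5, M=16))
```

The same evaluation also shows why rescaling the depth range hurts: stretching the range shrinks the normalized gap (z - d), pushing every sample toward the badly-converging region around the step.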

There are a few other reasons why the technique is interesting with respect to plausible soft shadows (which they did not explore), but those are future work really. IMHO (I'm admittedly somewhat biased ;)), VSMs still have one large theoretical advantage which I believe makes them a good starting point for future work: in the "simplest" case they get the 100% correct result, and they are a bad approximation only when the visibility function necessarily must carry more information. Thus they seem like a rather good basis for an adaptive algorithm, as is really required to implement shadows efficiently.
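To illustrate the "simplest case is 100% correct" point numerically: when the filter region contains a single planar occluder over a flat receiver, the one-tailed Chebyshev bound that VSM evaluates recovers the true visibility fraction exactly. A quick sketch (my own toy code, not from any paper):

```python
def vsm_visibility(mu, sigma2, d):
    """One-tailed Chebyshev upper bound used by variance shadow maps:
    p_max(d) = sigma^2 / (sigma^2 + (d - mu)^2) for d > mu, else 1."""
    if d <= mu:
        return 1.0
    return sigma2 / (sigma2 + (d - mu) ** 2)

# Filter region: fraction p of texels see an occluder at depth z1,
# the rest see the receiver itself at depth z2; shade the receiver (d = z2).
p, z1, z2 = 0.3, 0.4, 0.9
mu     = p * z1 + (1 - p) * z2                    # E[z]
sigma2 = p * z1**2 + (1 - p) * z2**2 - mu**2      # E[z^2] - E[z]^2
print(vsm_visibility(mu, sigma2, z2))             # 1 - p = 0.7 (up to rounding)
```

With more than two depth layers in the filter footprint the bound is no longer tight, which is exactly when the distribution "must carry more information" and light bleeding can appear.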

Anyways I don't mean to trash the idea - I actually like it quite a lot and am particularly pleased that people are continuing to come up with new ways to parameterize the shadow function, following in the footsteps of deep, opacity and variance shadow maps to name a few.

To the OP, did you run their demo? I'm surprised that you found the artifacts acceptable with M=4, because I found them to be pretty bad in some cases even with M=16 (as described above). Furthermore I suspect that the quality improvement from adding more terms diminishes quickly, so removing the remaining artifacts would require an unreasonable number of terms.

---
Quote:
 Original post by AndyTX: To the OP, did you run their demo?

Yes, I tried to, but on Vista with a GeForce 7900 it didn't work, so I was just judging from the screenshots in the paper.
VSM seems to be the perfect reason for buying an 8800. I had (a little) hope that CSM might be more feasible on older hardware.

---
Quote:
 Original post by krausest: Yes, I tried to, but on Vista with a GeForce 7900 it didn't work, so I was just judging from the screenshots in the paper.

Odd that it doesn't run on a 7900... I wonder what feature(s) of the 8 series it uses. Unfortunately the screenshot in the paper is actually taken at a clever angle that occludes many of the artifacts. I can post some screenshots of what I'm talking about if you'd like.

Quote:
 Original post by krausest: VSM seems to be the perfect reason for buying an 8800. I had (a little) hope that CSM might be more feasible on older hardware.

fp32 filtering is certainly very nice, and I agree that VSM is pretty unusable at fp16 or lower (except for extremely small light ranges). Summed-area variance shadow maps work on older hardware (they do not require hardware filtering support), although they are more performance-intensive than 'stock' VSMs.
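In case it helps anyone unfamiliar with the summed-area variant: a SAT lets you average the (depth, depth²) moments over an arbitrary axis-aligned rectangle with just four fetches, which is why no hardware filtering is needed. A toy CPU sketch of the idea (illustrative only, not the actual SAVSM code):

```python
def build_sat(src):
    """sat[y][x] = sum of src over the rectangle [0..y] x [0..x]."""
    h, w = len(src), len(src[0])
    sat = [[0.0] * w for _ in range(h)]
    for y in range(h):
        row = 0.0
        for x in range(w):
            row += src[y][x]
            sat[y][x] = row + (sat[y - 1][x] if y else 0.0)
    return sat

def sat_average(sat, x0, y0, x1, y1):
    """Average of src over [x0..x1] x [y0..y1], using four table lookups."""
    total = sat[y1][x1]
    if x0:        total -= sat[y1][x0 - 1]
    if y0:        total -= sat[y0 - 1][x1]
    if x0 and y0: total += sat[y0 - 1][x0 - 1]
    return total / ((x1 - x0 + 1) * (y1 - y0 + 1))

depths = [[0.1, 0.2, 0.3],
          [0.4, 0.5, 0.6],
          [0.7, 0.8, 0.9]]
sat = build_sat(depths)
print(sat_average(sat, 1, 1, 2, 2))   # mean of 0.5, 0.6, 0.8, 0.9 ~= 0.7
```

On the GPU you'd build one SAT each for E[z] and E[z²] and feed the two rectangle averages straight into the Chebyshev bound; the catch is the precision the running sums eat up, which is part of the extra cost I mentioned.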

Back on topic though, I still have some hope for CSMs as well, and some ideas on how to make them better. In their current state however I think it's unfair for them to claim that their artifacts compare favourably to VSM's.

---
Quote:
 Original post by AndyTX: Odd that it doesn't run on a 7900... I wonder what feature(s) of the 8-series it uses.

It uses floating-point depth buffers (GL_NV_depth_buffer_float). It didn't run for me either (7950 GT), so after some GLIntercept-ing I found that it uses a depth texture with internal format GL_DEPTH_COMPONENT32F_NV.

HellRaiZer

---
Quote:
 Original post by HellRaiZer: It uses floating-point depth buffers (GL_NV_depth_buffer_float). It didn't run for me either (7950 GT), so after some GLIntercept-ing I found that it uses a depth texture with internal format GL_DEPTH_COMPONENT32F_NV.

Ah, good find. Probably overkill for shadows (or alternatively, they could just render to a single-channel float colour buffer and sacrifice the potential double-speed-z; no biggie), but whatever. My guess would be that it's faster to simply render out all of the Fourier terms in the shadow-rendering pass rather than rendering a shadow map and then transforming it to the Fourier representation, as using a high-precision depth buffer seems to imply (did you notice whether that's true from your GLIntercepting?).

---
I tried this out on my 8800 and it ran OK, but fairly slowly with 512x512 shadow maps.

As for the method, I've seen better. The fact that light bleeding is avoided is offset by other problems, and by the slow speed. Also, since no true penumbra is created, I don't really see why you'd spend this much power blurring the shadow like this. Just use a good PCF filter; with textured surfaces the shadow-edge pixels are barely detectable. Frankly, I'm not a big fan of VSM either, for similar reasons, but it's better than this in my opinion.

---
Quote:
 Original post by AndyTXMy guess would be that it's faster to simply render out all of the Fourier terms in the shadow rendering pass rather than rendering a shadow map and transforming it to the Fourier representation, as using a high-precision depth buffer seems to imply (did you notice whether that's true with your GLintercepting?).

It doesn't look like they render the Fourier terms directly. All the rendering commands for the incomplete FBO (the one with the D32F depth map) use the following shader, which as far as I can tell outputs fragment depth as colour.

```glsl
//======================================================
//   Vertex Shader 22
//======================================================
// This shader maps z-values in a regular hyperbolic range!
void main()
{
    gl_Position    = ftransform();
    gl_TexCoord[5] = gl_MultiTexCoord5;
}

//======================================================
//   Fragment Shader 23
//======================================================
// Pixel shader to pass z-values.
void main()
{
    gl_FragColor.x = gl_FragCoord.z;
}
```

When the D32F shadow map is used, the shader seems to use it as a regular shadow map (a simple shadow2DProj). I don't know if this is the correct way (I haven't read the paper, to be honest). The only shader which uses the shadow map is this:

```glsl
//======================================================
//   Vertex Shader 10
//======================================================
//
// lighting_lin.vtx
//
// OpenGL per-vertex lighting with linear depth values in texture
// coordinate z.
//
varying vec4 diffuse, ambient;
varying vec3 normal, light_dir, half_vector;
varying float dist;

void main()
{
    vec4 ec_pos;
    vec3 aux;

    normal = normalize(gl_NormalMatrix * gl_Normal);

    // These are the new lines of code to compute the light's direction.
    ec_pos    = gl_ModelViewMatrix * gl_Vertex;
    light_dir = (gl_LightSource[0].position - ec_pos).xyz;
    dist      = length(light_dir);
    aux       = normalize(light_dir);

    // Do not use the light source state half-vector! It is wrong!
    half_vector = -ec_pos.xyz + light_dir;

    // Compute the diffuse, ambient and globalAmbient terms.
    diffuse = gl_FrontMaterial.diffuse * gl_LightSource[0].diffuse;
    // The ambient terms have been separated since one of them
    // suffers attenuation.
    ambient = gl_FrontMaterial.ambient * gl_LightSource[0].ambient;

    gl_Position = ftransform();

    // Do a linear mapping by pre-multiplying perspective division.
    vec4 eye_pos = gl_TextureMatrix[0] * gl_Vertex;
    vec4 lin_pos = gl_TextureMatrix[2] * eye_pos;
    lin_pos.z    = (gl_TextureMatrix[1] * eye_pos).z;
    gl_TexCoord[0] = lin_pos;
    gl_TexCoord[5] = gl_MultiTexCoord5;
}

//======================================================
//   Fragment Shader 11
//======================================================
//
// shadow_mapping_pcf.pxl
//
varying vec4 diffuse, ambient, light_dir, half_vector;
varying vec3 normal;
varying float dist;

uniform int use_diffuse;
uniform int use_opacity;
uniform sampler2D diffuse_tex;
uniform sampler2D opacity_tex;
uniform sampler2DShadow shadow_map;

vec4  GetDiffuseTex() { return texture2DProj(diffuse_tex, gl_TexCoord[5]); }
float GetOpacityTex() { return texture2DProj(opacity_tex, gl_TexCoord[5]).x; }

// Reconstruct shadow test signal. We only use linear depth.
vec4 ShadowTerm()
{
    vec4 tc = gl_TexCoord[0];
    tc.z *= gl_TexCoord[0].w;   // Undo perspective division for z.
    return vec4(shadow2DProj(shadow_map, tc).x);
}

vec4 DiffusePart(const vec3 p_n, const vec3 p_ldir)
{
    // Compute the dot product between normal and normalized light dir.
    // Support two-sided lighting for dragon wings.
    float n_dot_l  = dot(p_n, normalize(p_ldir));
    vec4  diff_tex = (1 == use_diffuse) ? GetDiffuseTex() : vec4(1.0);
    return diffuse * diff_tex * n_dot_l;
}

vec4 SpecularPart(const vec3 p_n, const vec3 p_hv)
{
    float n_dot_hv = dot(p_n, normalize(p_hv));
    return gl_FrontMaterial.specular * gl_LightSource[0].specular *
           pow(n_dot_hv, gl_FrontMaterial.shininess);
}

// Main function to compute per-pixel lighting and modulate the final
// pixel color by the shadow term.
void main()
{
    vec4 color = gl_LightModel.ambient * gl_FrontMaterial.ambient;
    float att;

    // A fragment shader can't write a varying variable, hence we need
    // a new variable to store the normalized interpolated normal.
    vec3 n = normalize(normal);
    float spot_effect = dot(normalize(gl_LightSource[0].spotDirection),
                            normalize(-light_dir.xyz));
    if (spot_effect > gl_LightSource[0].spotCosCutoff)
    {
        att = 1.0 / (gl_LightSource[0].constantAttenuation +
                     gl_LightSource[0].linearAttenuation * dist +
                     gl_LightSource[0].quadraticAttenuation * dist * dist);
        color += vec4(att) * DiffusePart(n, normalize(light_dir.xyz));
        color += vec4(att) * SpecularPart(n, normalize(half_vector.xyz));
        // Finally, apply the spot effect and the shadow term.
        color *= pow(spot_effect, gl_LightSource[0].spotExponent);
        color *= ShadowTerm();
    }
    gl_FragColor = color;
    if (1 == use_opacity)
        gl_FragColor.w = GetOpacityTex();
}
```

HellRaiZer

---
Quote:
 Original post by Matt Aufderheide: As for the method, I've seen better. The fact that light bleeding is avoided is offset by other problems, and by the slow speed.

I do agree with you here, although perhaps the technique can be developed further.

Quote:
 Original post by Matt Aufderheide: Also, since no true penumbra is created, I don't really see why you'd spend this much power blurring the shadow like this. Just use a good PCF filter; with textured surfaces the shadow-edge pixels are barely detectable. Frankly, I'm not a big fan of VSM either, for similar reasons, but it's better than this in my opinion.

The big problem with PCF is that once you start using it "properly" (i.e. filtering over the actual sample extents in texture space, using ddx/ddy of the texture coordinates), it requires dynamic branching and gets very slow, since these filter regions can get rather big. The more common "neighborhood sampled" PCF works in simple cases, but really looks quite terrible in some pretty common cases that require anisotropic filtering, such as a first-person camera viewing shadows projected onto the ground. Compared to VSM/CSM with mipmapping, PCF starts to look even worse (the video on that site demonstrates nicely what happens, even with PCF)... These poorly-filtered shadows were acceptable when we couldn't do any better, but those times have passed IMHO. The other big problem with PCF is biasing, which gets unmanageable with large, dynamic filter sizes.
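For anyone comparing the two camps: PCF averages the *results* of the binary depth comparisons, not the depths themselves, which is exactly why it can't be pre-filtered the way VSM/CSM basis textures can. A toy CPU version of the common NxN neighborhood tap (illustrative only):

```python
def pcf(shadow_map, x, y, receiver_depth, radius=1):
    """Average the binary depth-test results over a (2r+1)^2 neighborhood."""
    h, w = len(shadow_map), len(shadow_map[0])
    lit, taps = 0, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            sx = min(max(x + dx, 0), w - 1)   # clamp taps to the map edge
            sy = min(max(y + dy, 0), h - 1)
            lit += receiver_depth <= shadow_map[sy][sx]
            taps += 1
    return lit / taps

# A vertical occluder edge in a 4x4 map (columns 0-1 at depth 0.3, the
# rest at 1.0): a receiver at depth 0.5 gets a soft ramp across the edge
# instead of a hard 0/1 step.
sm = [[0.3] * 2 + [1.0] * 2 for _ in range(4)]
print([round(pcf(sm, x, 1, 0.5), 2) for x in range(4)])  # [0.0, 0.33, 0.67, 1.0]
```

The cost is the (2r+1)^2 comparisons per shaded pixel, every frame; the linearly-filterable representations pay once at shadow-map build time instead, which is the whole trade this thread is about.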

Anyways I'm happy that people are still researching on this front as "stock PCF" is pretty unacceptable moving forward IMHO - it just scales terribly and if you avoid the poor scaling you must make a significant sacrifice to image quality. I don't think CSM (or even VSM) is the "final answer", but I would not be surprised if a killer robust, probably-adaptive algorithm emerges based on the same ideas.

---
Quote:
 Original post by HellRaiZer: When the D32F shadow map is used, the shader seems to use it as a regular shadow map (a simple shadow2DProj). I don't know if this is the correct way (I haven't read the paper, to be honest). The only shader which uses the shadow map is this:

From a glance, that looks almost like the standard PCF filter - did you set it to "CSM" before intercepting this stuff (for some reason the default mode is PCF)?

---
Quote:
 From a glance, that looks almost like the standard PCF filter - did you set it to "CSM" before intercepting this stuff (for some reason the default mode is PCF)?

Oops... sorry about that. You are right; I forgot about the CSM option.

I did it again (with CSM selected this time), and it looks like the shadow-map generation pass is the same as above. The actual shadow-projection pass is different:
```glsl
//======================================================
//   Vertex Shader 16
//======================================================
// lighting_lin.vtx -- identical to Vertex Shader 10 posted earlier
// (per-vertex lighting with linear depth in texture coordinate z).

//======================================================
//   Fragment Shader 17
//======================================================
//
// adaptive_fse_shadow_mapping.pxl
//
// (The GLIntercept info log reports several "warning C7503: OpenGL does
// not allow C-style casts" for this shader.)
//
varying vec4 diffuse, ambient;
varying vec3 normal, light_dir, half_vector;
varying float dist;

uniform int use_diffuse;
uniform int use_opacity;
// 0 = uncompress_flag
// 1 = n_terms
// 2 = near
// 3 = far-near
uniform vec4 options;
uniform mat4 weights;
uniform sampler2D diffuse_tex;
uniform sampler2D opacity_tex;
uniform sampler2D fse_sin[4];
uniform sampler2D fse_cos[4];
uniform sampler2DShadow shadow_map;
uniform sampler2D lod_tex;

const vec4 scale = vec4(2.5);
const vec4 bias  = vec4(1.25);
#define PI 3.14159265

vec4  GetDiffuseTex() { return texture2DProj(diffuse_tex, gl_TexCoord[5]); }
float GetOpacityTex() { return texture2DProj(opacity_tex, gl_TexCoord[5]).x; }
float GetHWLOD()      { return texture2DProj(lod_tex, gl_TexCoord[0]).x; }

// Compute a coefficient 4D vector Ck which starts at k.
vec4 Ck_v(const int k)
{
    return vec4(PI * (2.0 * float(k  ) - 1.0),
                PI * (2.0 * float(k+1) - 1.0),
                PI * (2.0 * float(k+2) - 1.0),
                PI * (2.0 * float(k+3) - 1.0));
}

// Determine the LOD for soft shadows.
float DetermineLOD(const vec3 ray_d, const vec3 ray_z)
{
    vec3 light_dir = vec3(0.0, 0.0, 1.0);
    float z = dot(ray_z, light_dir);
    float d = dot(ray_d, light_dir);
    float diff = max(d - z, 0.0) / z;   // P = S*(d-z)/z
    return (ray_d.z <= ray_z.z) ?
        0.0 :                 // Lit.
        // S  * (d-z)/z * MAX_LOD
        1.7 * diff * 9.0;     // In shadow.
}

// Undo near-far mapping to [0;1].
float BackToEyeSpace(const float val)
{
    // -(z_l * (far-near) - near).
    return -(-val * options.w - options.z);
}

vec4 ShadowTerm()
{
    // Later this register contains the distance d. We'll spare it to convert
    // the z texture-coordinate in order to maintain linear z.
    vec4 d_v;
    vec3 ray_d;
    ray_d.xy = vec2(2.0) * vec2(gl_TexCoord[0].xy / gl_TexCoord[0].w) - vec2(1.0);
    ray_d.z  = BackToEyeSpace(gl_TexCoord[0].z);

    // As we don't do any search we only get shadows for those samples which
    // hit the shadow map. Therefore, we don't have to compute the ray to
    // the sample of closest z.
    vec3 ray_z = ray_d;
    d_v = gl_TexCoord[0];
    d_v.z *= d_v.w;
    ray_z.z = BackToEyeSpace(shadow2DProj(shadow_map, d_v).x);
    d_v = vec4(gl_TexCoord[0].z);

    vec4 tmp, sin_val, cos_val;
    float sum0 = 0.0;
    float sum1 = 0.0;

    // Allow an adaptive FSE length. Go out to the maximum of 16.
    //float lod = DetermineLOD(ray_d, ray_z);
    //lod += GetHWLOD() * 9.0;
    for (int i = 0; i < 4; ++i)
    {
        int idx = i * 4;
        //sin_val = texture2DProjLod(fse_sin[i], gl_TexCoord[0], lod);
        //cos_val = texture2DProjLod(fse_cos[i], gl_TexCoord[0], lod);
        sin_val = texture2DProj(fse_sin[i], gl_TexCoord[0]);
        cos_val = texture2DProj(fse_cos[i], gl_TexCoord[0]);

        // Scale back.
        if (1 == (int)options.x)
        {
            sin_val = (sin_val * scale) - bias;
            cos_val = (cos_val * scale) - bias;
        }
        tmp = Ck_v((i*4)+1);

        // Add i-th component.
        sum0 += cos(tmp.x * d_v.x) / tmp.x * sin_val.x * weights[i].x;
        sum1 += sin(tmp.x * d_v.x) / tmp.x * cos_val.x * weights[i].x;
        ++idx;
        if ((int)options.y <= idx) break;

        sum0 += cos(tmp.y * d_v.y) / tmp.y * sin_val.y * weights[i].y;
        sum1 += sin(tmp.y * d_v.y) / tmp.y * cos_val.y * weights[i].y;
        ++idx;
        if ((int)options.y <= idx) break;

        sum0 += cos(tmp.z * d_v.z) / tmp.z * sin_val.z * weights[i].z;
        sum1 += sin(tmp.z * d_v.z) / tmp.z * cos_val.z * weights[i].z;
        ++idx;
        if ((int)options.y <= idx) break;

        sum0 += cos(tmp.w * d_v.w) / tmp.w * sin_val.w * weights[i].w;
        sum1 += sin(tmp.w * d_v.w) / tmp.w * cos_val.w * weights[i].w;
        ++idx;
        if ((int)options.y <= idx) break;
    }
    float rec = 0.5 + 2.0 * (sum0 - sum1);

    // This multiplication ensures that un-occluded pixels are not darkened.
    // Due to the scale and bias of the reconstructed shadow test function
    // it has a value of 0.5 for (d - z) close to 0.0.
    return vec4(clamp(2.0 * rec, 0.0, 1.0));
}

// DiffusePart(), SpecularPart() and main() are identical to the PCF
// shader posted earlier.
```

Does the usage of a D32F depth buffer make sense if they don't do the Fourier transform in the same pass? I'm just curious.

HellRaiZer

---
That looks like the shading part (it looks up the coefficients from the eight "fse" textures). I'm interested in whether they render those fse textures as part of the geometry-rendering pass, though, or whether they generate them from the depth buffer.

If they render to the fse textures directly, then a 32-bit float depth buffer is overkill certainly. However if they copy from the depth buffer, it may be necessary to store their linear depth metric.

---
I skimmed over the paper and read Section 4.1. As far as I can tell (from the GLIntercept output) they do it exactly as they explain it in the paper, with one exception: the separable Gaussian filter application, which I guess has to be enabled through the Convolution->Enable checkbox (which I didn't check).

After the regular shadow-map generation (the first shader I posted), they perform a number of full-screen passes (8, to be exact) with a shader named fse_generator, using the shadow map from the first pass. I don't know why they do 8 passes instead of just 2 with 4 render targets each, as they mention in the paper. After that, they generate mipmaps for the 8 textures. The next step (which has nothing to do with the algorithm) is to display the channels of all the textures on screen (if I understood the code correctly; as I said, the demo didn't run for me, so the only thing I see is two big quads on the right of the screen). Finally, they do the shadow-projection pass, which uses the shader from my last post.
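For what it's worth, the scale/bias constants in the projection shader (scale = 2.5, bias = 1.25) suggest what those fse_generator passes compute per texel: sin/cos basis values packed from [-1.25, 1.25] into a [0,1] texture channel. This is purely my guess from the "scale back" branch, sketched on the CPU (fse_texel and its weighting are assumptions, not the demo's code):

```python
import math

SCALE, BIAS = 2.5, 1.25   # constants from the demo's fragment shader

def pack(v):
    """Map a basis value in [-BIAS, BIAS] into a [0,1] texture channel."""
    return (v + BIAS) / SCALE

def unpack(t):
    """Inverse of pack -- the 'scale back' branch in the CSM shader."""
    return t * SCALE - BIAS

def fse_texel(z, k):
    """What each fse texel presumably stores for shadow-map depth z
    (my guess at fse_generator; c_k = pi * (2k - 1))."""
    c = math.pi * (2 * k - 1)
    return pack(math.sin(c * z)), pack(math.cos(c * z))

s, c = fse_texel(0.37, 3)
print(round(unpack(s), 6), round(unpack(c), 6))  # recovers sin/cos of c_3 * z
```

If that's right, the extra headroom beyond [-1, 1] would leave room for pre-weighted terms without clipping after the Gaussian convolution, but that last part is speculation.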

I hope I haven't forgotten anything. And sorry for pasting the redundant shader. :)
I guess if you need more info you can run GLIntercept on the demo to see what's going on. If not, let me know and I'll try to dig up as much info as I can.

HellRaiZer

---
Quote:
 Original post by HellRaiZer: I hope I haven't forgotten anything. And sorry for pasting the redundant shader. :) I guess if you need more info you can run GLIntercept on the demo to see what's going on. If not, let me know and I'll try to dig up as much info as I can.

Nope, that's plenty of info - thanks for the detective work!

Seems like an inefficient way to render the shadow map to me, but people get locked into the "render *only* depth or your performance will be terrible!" mentality and unfortunately seem willing to sacrifice 8+ passes to attain it. Then again, this is certainly speculation with respect to their application, so maybe they tried both ways and this was the fastest. I doubt that, however, since GLBench clearly shows the "perfect" scaling of G80 across MRTs.

By rendering in a single pass they also don't take advantage of hardware MSAA, which is one of the really big reasons to use VSM/CSM/etc.