Shader works on laptop, hangs on desktop.

Started by
2 comments, last by Tocs1001 10 years, 3 months ago

I recently completed my Single Pass Order Independent Transparency shader.

I decided to add some shadows to my lighting shader. Since my lighting is computed with Tiled Forward Shading, I put my shadow maps into a texture array of cube maps. When I added the line of code to sample the cube maps, it started to hang on glDrawElements(). After some time the graphics driver kills the program for having too many errors. However, it doesn't hang right away; there are a couple of seconds of it working correctly before it breaks.

I gave it a try on my laptop (NV 630m) and it works completely, and seemingly runs smoother than my desktop (NV 770) does without the shadows.

If I comment out the line sampling the cube map array for shadows it works.


attenuation *= texture(ShadowMaps, vec4(WL,shadow),comparedepth); 

Curiously, if I leave the line sampling the shadows, don't call my BRDF portion, and instead output attenuation, the shader doesn't hang.


color += vec4(attenuation,attenuation,attenuation,0.1);

What that looks like:

eePlhJI.png

I've pasted my shaders here, since they're kind of large: http://pastie.org/8624432#85,92

And an OpenGL log, though CodeXL doesn't seem to want to capture the whole log for a single frame.

https://gist.github.com/LordTocs/c2a59de6c3d9fa811d2b

I'm hoping it's not the drivers because I tried updating them to the latest. I hope someone spots something I'm doing incorrectly. I know it's a lot to sift through but I'm running out of ideas.

Screenshot from my laptop: http://i.imgur.com/9WspPLc.png

EDIT:

I've since added a debug callback to my OpenGL context. When the shader locks up, this comes over the debug output.


Debug(api, m): PERFORMANCE - Program/shader state performance warning: Fragment Shader is going to be recompiled because the shader key based on GL state mismatches.


I've never seen a hang as a result of it, but this is undefined behavior (at least in older versions of GL/D3D; maybe it's been changed in recent versions and I missed the news):


int lightCount = LightCountAndOffsets[index].x; // index is non-uniform so lightCount is non-uniform
int lightOffset = LightCountAndOffsets[index].y;

vec4 color = vec4(0.0, 0.0, 0.0, 0.0);

ShadePrep ();
for (int i = 0; i < lightCount; ++i) // non-uniform loop is slow
{
    int lightIndex = texelFetch(LightIndexLists, lightOffset + i).x; // texture read in non-uniform flow is undefined behavior; don't do this

Even if it were well-defined behavior, the non-uniform loop will be inefficient. Remember how GPU hardware works: multiple 'threads' are running simultaneously using the same instruction pointer in blocks of 4-32 (or more). If only some threads are executing conditional code then the other threads are basically sitting idle (though usually they're still executing all those instructions but ignoring the results). You can't always avoid non-uniform flow but you should keep it to small if blocks and avoid large conditional blocks or loops as much as you possibly can. Some algorithms just aren't suited for today's GPUs. Algorithms with lots of conditions are sometimes better executed when broken into multiple passes (with different passes for different cases).

In particular, this is why texture reads in non-uniform loops were made undefined behavior. All the threads might execute that instruction whether they're supposed to or not; the result is just ignored for threads that aren't supposed to be running that code. Accessing textures is slow (so non-uniform texture access can really hurt), and accessing textures with possibly bogus data may do any number of things (modern hardware should be robust and cope with it... should be).
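To put rough numbers on the lockstep behavior described above, here is a small CPU-side sketch in C++. It is only a toy model, not how any particular GPU or driver schedules work: the names WarpCost and simulateLoop are made up for illustration, and it assumes every thread in a group keeps stepping through the loop until the thread with the most lights has finished.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model: one group of threads runs a loop in lockstep,
// so the group keeps iterating until its busiest thread is done.
struct WarpCost {
    int iterationsExecuted; // loop iterations the whole group steps through
    int usefulLaneSlots;    // lane-iterations that did real work
    int idleLaneSlots;      // lane-iterations spent masked off
};

// lightCounts holds one loop trip count per thread in the group.
WarpCost simulateLoop(const std::vector<int>& lightCounts) {
    assert(!lightCounts.empty());
    const int longest = *std::max_element(lightCounts.begin(), lightCounts.end());

    int useful = 0;
    for (int count : lightCounts) {
        assert(count >= 0);
        useful += count;
    }

    // Every lane occupies a slot for every iteration the group executes.
    const int total = longest * static_cast<int>(lightCounts.size());
    return WarpCost{longest, useful, total - useful};
}
```

With four threads that all loop 8 times, no slots are idle. With trip counts of 1, 1, 1 and 8, the group still steps through 8 iterations, and 21 of the 32 lane slots do nothing useful.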

You should also try reducing this to a minimal test case that triggers the behavior. It would be easier to comb through your code then or to file a bug report with your HW vendor if the code seems correct.

Sean Middleditch – Game Systems Engineer – Join my team!

It probably didn't even compile on desktop, so you may just be rendering with fixed function pipeline without knowing it, if the shader is invalid.

"If program does not contain shader objects of type GL_FRAGMENT_SHADER, an executable will be installed on the vertex, and possibly geometry processors, but the results of fragment shader execution will be undefined."

Do you have a robust shader loader? Errors can happen while compiling each shader separately (vertex and fragment), as well as when linking the program.

So, in total there are 3 possible error scenarios, and thus 3 separate instances of where you would have to query for the "infolog."

Though, I have had it happen several times where the shader compiled, linked, and was clearly not working.

In those cases it could be many things, such as invalid square root arguments, normalizing a zero-length vector, etc.

Well, the loops aren't entirely slow. They're broken up by 32x32 tile, so pixels in the same area of the screen use the same list of lights. That means the same warp/wavefront (I think those are the words NV/AMD use, respectively) has the potential to be using the same list.

http://www.cse.chalmers.se/~olaolss/papers/tiled_shading_preprint.pdf
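As a rough illustration of why neighboring pixels end up with the same light list, here is a CPU-side C++ sketch of mapping a pixel to its tile. This is not the code from my shader: the names kTileSize, tilesPerRow and tileIndex are made up for the example, and it assumes 32x32 tiles stored row by row.

```cpp
#include <cassert>

// Assumed for illustration: 32x32 pixel tiles laid out row by row.
constexpr int kTileSize = 32;

// Number of tiles needed to cover one row of the screen (rounded up).
int tilesPerRow(int screenWidth) {
    assert(screenWidth > 0);
    return (screenWidth + kTileSize - 1) / kTileSize;
}

// Index of the tile containing a pixel. Every pixel inside the same
// 32x32 block gets the same index, and therefore the same light list.
int tileIndex(int pixelX, int pixelY, int screenWidth) {
    assert(pixelX >= 0 && pixelX < screenWidth && pixelY >= 0);
    return (pixelY / kTileSize) * tilesPerRow(screenWidth) + pixelX / kTileSize;
}
```

On a 1280-pixel-wide target, pixels (0, 0) and (31, 31) both land in tile 0, so they read the same count and offset. Only groups of threads that straddle a tile border would see different loop lengths.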

When I looked up the undefined-ness of texture sampling, though, you were correct. However, some kinds of texture sampling are OK, mainly the ones that don't rely on the computation of mip-maps or filtering. Because I'm using texelFetch() it should be defined... I think. (http://www.opengl.org/wiki/Sampler_(GLSL)#Non-uniform_flow_control)

That does point out that my sampling of the shadow maps happens in non-uniform control flow, and there I'm relying on the filtering. Perhaps that's the cause of the issue, though I'm not really sure how to fix it.

Thanks for your input.

EDIT: Missed the new post by Kaptein.

My shader loader checks for compile errors on every shader and link errors on every program. If anything comes up it prints the result and calls assert(false), so I can see the error. It's compiling. I just didn't include the vertex shader as it didn't seem related and there was already an enormous bit of code to look through. The vertex shader also isn't doing anything interesting.
