Compatibility issues with omnidirectional cube map shadows


Hello. I've implemented shadows for my point lights by rendering my scene to each of the 6 sides of a depth component cube map. It's working perfectly fine on OGL3 (DX10) level Nvidia and AMD cards, but after testing a bit I found out that there is a bug on Intel GPUs that causes the target parameter to be ignored when rendering to a cube map, meaning that it's only possible to render to the first face (POSITIVE_X) no matter which one I tell it to render to. I've reported this bug to Intel but there is no fix out yet (assuming there will be one). I also found out that depth component cube maps aren't supported on OGL2 hardware.

My first idea was to work around the problem by rendering the scene 6 times to a depth component 2D texture and copying the data from it to each face of the cube map. That should've worked around the Intel bug, but instead glCopyTexSubImage2D() gives me an Invalid Operation error, even though it works on all other OGL3 level cards. I'm about to give up on this front.

On OGL2 level cards, I can't use a depth cube map at all. I had the idea of copying the GL_DEPTH_COMPONENT16 texture to a GL_R16 cube map, but that also gave me an invalid operation error. The only solution I can see is to create both a 2D depth renderbuffer and a color buffer to store depth in, so I can later copy it to a color cube map.

My questions:

- Is there any way to efficiently copy/convert a 16-bit depth component texture to a single-channel 16-bit color cube map? Performance needs to be good since we're talking about really old hardware, so PBOs and probably shaders are out of the question. The data is identical; it's just that the OGL spec seems to prohibit the copy.

- Is it really this hard to get point light shadows working well? Cube map shadows seem like a huge no-go to me even though it's the first thing that comes up when I Google. How did people do cube map shadows before OGL3?

- Would I be better off just implementing dual paraboloid shadow maps instead? That should solve all the compatibility problems, since I'd just need two normal 2D depth textures (or I could pack them into one) instead of a cube map. It'd also be more CPU-efficient since it'd only be two render passes. How problematic are the seam and the distortion? How much worse is the quality compared to cube map shadows?

On older GL2/DX9 hardware, there probably isn't a way for the GPU hardware to copy a depth texture to a colour texture, at all (besides maybe doing a round-trip via the CPU...). So you'd have to use a pixel shader that outputs depth as colour.
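
To illustrate that last option, a depth-as-colour fragment shader could be as small as the sketch below (GLSL 1.10-era; v_lightSpaceDepth and u_lightFarPlane are placeholder names, not anything from this thread):

```glsl
// Fragment shader sketch: write normalized light-space depth into a colour
// channel instead of relying on a depth texture attachment.
varying float v_lightSpaceDepth;   // distance from the light, passed by the vertex shader
uniform float u_lightFarPlane;     // far plane / radius of the point light

void main()
{
    // Normalize depth to [0,1] so it fits an unsigned-normalized colour format.
    float depth = v_lightSpaceDepth / u_lightFarPlane;
    gl_FragColor = vec4(depth, depth, depth, 1.0);
}
```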

Instead of all this hassle to get depth data into a cube map, you can just use a 2D texture and simulate the cube-map lookup math yourself.
e.g. make a texture that's 6 times taller than it is wide, and look up a different square 2D region in it depending on which axis of your lookup coordinate is the largest and whether it's positive/negative. Yes, this will be slower than a hardware cubemap lookup, but it means you can render your depth values without extra hassle.
Also, hardware PCF on 2D depth targets has been supported since GeForce3+.
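
To make the face-selection idea concrete, here's a rough GLSL sketch of that lookup, assuming the six faces are stacked vertically in +X, -X, +Y, -Y, +Z, -Z order. The uniform name is a placeholder, and the exact s/t signs depend on how the face cameras were oriented when the shadow map was rendered, so treat those as an assumption:

```glsl
// Emulated cube-map shadow lookup into a 2D texture that is 6x taller than
// it is wide, with the faces stacked vertically as +X, -X, +Y, -Y, +Z, -Z.
uniform sampler2D u_shadowAtlas;

vec2 cubeDirToAtlasUV(vec3 dir)
{
    vec3 a = abs(dir);
    float face;   // 0..5, selects the vertical slot in the atlas
    vec2 st;      // per-face coords in [-1,1] before remapping
    if (a.x >= a.y && a.x >= a.z) {
        face = (dir.x > 0.0) ? 0.0 : 1.0;
        st = vec2(dir.x > 0.0 ? -dir.z : dir.z, -dir.y) / a.x;
    } else if (a.y >= a.z) {
        face = (dir.y > 0.0) ? 2.0 : 3.0;
        st = vec2(dir.x, dir.y > 0.0 ? dir.z : -dir.z) / a.y;
    } else {
        face = (dir.z > 0.0) ? 4.0 : 5.0;
        st = vec2(dir.z > 0.0 ? dir.x : -dir.x, -dir.y) / a.z;
    }
    st = st * 0.5 + 0.5;                       // [-1,1] -> [0,1]
    return vec2(st.x, (st.y + face) / 6.0);    // pack into the 6x1 column
}

float sampleEmulatedCubeShadow(vec3 lightToFragment, float fragmentDepth)
{
    float storedDepth = texture2D(u_shadowAtlas, cubeDirToAtlasUV(lightToFragment)).r;
    return (fragmentDepth <= storedDepth + 0.001) ? 1.0 : 0.0;  // small constant bias
}
```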

Thanks for the response! I now realize that this should've been posted in the OpenGL section...

Well, simulating a cube map would be a bit of a hassle, so I'd rather sink that time into dual paraboloid shadow mapping, assuming there are no big performance hits or artifacts. Is there any reason for me not to just go with dual paraboloid shadow maps instead?

Dual paraboloid shadow mapping is dependent on how finely tessellated your shadow caster meshes are, so you will get artifacts when there are not enough vertices and the paraboloid distortion cannot be done with sufficient precision. In contrast, using a simulated cube map allows any geometry to be used as a shadow caster.

During the lighting pass, I would assume the paraboloid distortion and the simulated cube map indexing are roughly equal in terms of performance cost.

Generating the dual paraboloid shadow map has the potential to be faster though, as it's only two views instead of six. However, depending on the scene, some of the cube map sides might contain no shadow casters at all, in which case you only need to clear them.
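
To show why the tessellation matters: the paraboloid warp is applied per vertex, and the rasterizer then interpolates linearly along each edge, so long, coarse edges end up in the wrong place. A minimal caster vertex shader sketch might look like the following (placeholder names such as u_lightView, u_nearClip and u_farClip; an illustration under those assumptions, not the exact shader discussed here):

```glsl
// Dual paraboloid shadow caster, vertex shader sketch (GLSL 1.10-era).
// The warp happens here per vertex; everything in between is interpolated
// linearly, which is where the tessellation-dependent artifacts come from.
uniform mat4 u_lightView;    // light's view matrix for this hemisphere
uniform float u_nearClip;
uniform float u_farClip;

varying float v_depth;       // normalized distance from the light

void main()
{
    vec4 pos = u_lightView * gl_Vertex;    // position in light space
    float len = length(pos.xyz);           // distance from the light

    // Paraboloid projection: normalize, then warp onto the paraboloid facing +Z.
    pos.xyz /= len;
    pos.xy /= (pos.z + 1.0);               // breaks down as pos.z approaches -1 (the seam)

    v_depth = (len - u_nearClip) / (u_farClip - u_nearClip);
    gl_Position = vec4(pos.xy, v_depth * 2.0 - 1.0, 1.0);
}
```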

I understand that GPU performance would be roughly the same, considering that only a very small fraction of all objects need to be duplicated across multiple sides/paraboloids, so I guess the biggest win comes from the reduced number of draw calls and culling costs; CPU performance should be at least slightly better.

Ugh, it seems like there's no perfect solution here. The more I look it up, the less attractive dual paraboloid shadow mapping looks. I have no way of guaranteeing that my objects are sufficiently tessellated, and the seam between the paraboloids seems hard to avoid. Sadly, emulating cube maps has its fair share of problems too, like worse performance and texture size limits, though the size issue only matters on older hardware that is so slow it won't be able to use larger shadow maps at realtime framerates anyway.

For now I've decided to just create an OGL2-compatible emulated cube map shadow mapper and keep my current solution for OGL3 hardware. I intend to check out dual paraboloid shadow mapping in the future, but for now I just need to get this working with the least amount of work, and it sounds like tweaking dual paraboloid shadow mapping to minimize the artifacts would take more time than rewriting an existing working solution.

Thanks for your input, AgentC! Our usernames are a funny coincidence... =D

EDIT:

- How do I determine if a graphics card supports hardware shadow mapping? Is it safe to assume that sampler2DShadow is available on OGL2 hardware (Intel, Nvidia and Radeon)?

- I understand how to select a face from a direction vector in a cube map, but how do I calculate texture coordinates after that?

The shadow samplers are part of the GLSL 1.10 spec, so if the driver supports at least OpenGL 2.0, they are available. Note however that it took a long time for Intel's OpenGL drivers to get up to scratch, so an older Intel GPU might do hardware shadow mapping on D3D, but fail to support OpenGL 2 properly.
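
For reference, using the hardware shadow sampler can look like the sketch below (GLSL 1.10-era; u_shadowMap and v_shadowCoord are placeholder names, and the bound depth texture needs GL_TEXTURE_COMPARE_MODE set to GL_COMPARE_R_TO_TEXTURE for the comparison to happen):

```glsl
// Fragment shader sketch using a hardware shadow sampler.
uniform sampler2DShadow u_shadowMap;
varying vec4 v_shadowCoord;   // projective shadow-map coordinate from the vertex shader

void main()
{
    // shadow2DProj performs the perspective divide and the depth comparison;
    // on hardware with "free" PCF the result is already filtered.
    float lit = shadow2DProj(u_shadowMap, v_shadowCoord).r;
    gl_FragColor = vec4(vec3(lit), 1.0);
}
```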

When you scale the direction vector so that the largest (face-selecting) coordinate has an absolute value of 1, the remaining two coordinates indicate the point on the face. Because you need to take into account whether X, Y, or Z is the largest, the calculation can become complicated, so instead you can encode the coordinates in a cube map and look them up from there.

We don't target Intel cards that old, so I think we'll be okay as long as people update their drivers to the newest version.

Wouldn't the dependent texture read be expensive on older hardware? Other than that, the technique seems a lot less complicated than I thought it would be, since I could let the lookup cube map automatically do the mapping to the correct face in the depth texture too.

I'm a bit unsure how to store the needed data in the lookup cube map. I'd need to store the (s, t) texture coordinates, but I believe there might be precision problems. I'd like to support depth map resolutions of up to 1024x1024 per face, packed into a 3072x2048 texture. Thanks to interpolation, I suppose I can get by with a pretty low-resolution lookup cube map, but would I need a 16-bit RG texture to store the (s, t) texture coordinates, or would 8-bit channels suffice?

Using a 16-bit lookup cube map will be more straightforward.

However, I currently use a 256x256 8-bit RGBA lookup map, where the RG channels are the per-face coords ranging from 0 to 1 and the BA channels add the X and Y face offsets. The RG channel values have to be divided by 3 and 2 respectively, as the faces are laid out in a 3x2 formation.
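
Reading it back might look roughly like this (uniform names are placeholders, and the exact channel encoding depends on how the lookup map was generated, so this is only a sketch of the idea):

```glsl
// Sampling a 3x2 shadow atlas through a 256x256 RGBA8 lookup cube map:
// RG hold the per-face coordinates (0..1), BA hold the face offset within
// the 3x2 layout.
uniform samplerCube u_faceLookup;   // 8-bit RGBA lookup cube map
uniform sampler2D u_shadowAtlas;    // e.g. 3072x2048 atlas of six 1024x1024 faces

float sampleAtlasShadow(vec3 lightToFragment, float fragmentDepth)
{
    vec4 lookup = textureCube(u_faceLookup, lightToFragment);

    // Per-face coords scaled into one cell of the 3x2 grid, plus the cell offset.
    vec2 uv = lookup.rg / vec2(3.0, 2.0) + lookup.ba;

    float storedDepth = texture2D(u_shadowAtlas, uv).r;
    return (fragmentDepth <= storedDepth + 0.001) ? 1.0 : 0.0;  // small constant bias
}
```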

I see. Thank you very much!

