360 degrees shadow speed up

9 comments, last by MJP 13 years, 11 months ago
Hello,

This is my first post on this forum, and it probably won't be my last :) I have visited this forum several times to gather information about problems I've had, and now I think it is time to show myself :) Well, back to the problem.

I am writing my own HLSL shader in Direct3D 9 and just heard of a technique called deferred rendering. I found it rather easy to add to the shader I was creating, and the number of lights it can handle is extreme. But one problem bothers me a bit: I've just finished creating lights with shadows that work in all directions, which means I have to render each light's environment 6 times, once for every direction (up, down, left, right, back, forward). This takes a lot out of the graphics card, not to render the depth buffer for the shadows, but to erase the depth buffer over and over again.

I was wondering if there is any faster way to erase textures than clear or fill? Or is there some way to cheat on the shadows and not create them in every direction, depending on where the camera is? I can set the projection to more than 90 degrees, but it still can't project a view bigger than 45 degrees in ALL directions from the center of the view.

Here is a screenshot of my work. I have kept it simple so far, without parallax or bump mapping: http://i955.photobucket.com/albums/ae36/ralleman/DMT2Shadows1.jpg
------------------------------Check out me work at www.dmtribute.webs.com
Clearing a depth buffer is extremely fast in most cases. ATI actually encourages you to clear it as often as possible.

How are you determining that this operation is slow? Perhaps when you issue the clear command it causes a stall where the driver has to wait for all previous commands to complete first.

When you say "to erase the depthbuffer over and over", do you mean that you're only using one depth buffer, shared for all the shadows?
Thanks for your quick answer :)
Well, I may have discovered one disadvantage here: I don't use a depth buffer. I use a 32-bit texture with one color channel that the shader writes the z-depth to, and I use Clear or ColorFill with color 1.0f to erase it. Maybe it will go faster if I use a depth buffer instead and clear that. But then again, can I instruct the HLSL shader to write only to the z-depth buffer and not to any texture? It would be a speed-up if that worked...

The code is something like this:

tex1 = 32 bit depthbuffer for shadow;
tex2 = 8 bit texture for writing the shadow results to;
tex3 = 8 bit rgba texture for writing the lights to;

1. Erase tex3.
2. Erase tex2.
3. Erase tex1.
4. Render the shadow depth for one of the 6 directions to tex1.
5. Compare each light pixel's depth with tex1 and, if it passes, write color 1.0f to tex2.
6. Go to step 3 until all 6 directions have been looped.
7. Render each light pixel's color / specular to tex3 if the tex2 pixel color is 1.0f.
8. Go to step 2 until all the lights have been looped.
9. Multiply tex3 lights with the diffuse color from a previous rendering result and output it to the backbuffer.

As you can see, tex1 gets erased six times for every light and tex2 once for every light.
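
To make the clear counts concrete, here is a rough C++ sketch of the loop structure in steps 1 to 8. It only counts the clears; the `ClearCounts` struct and `CountClears` function are made-up names for illustration, not real Direct3D 9 calls, and the actual rendering steps are left out.

```cpp
#include <cassert>

// Hypothetical counters standing in for the real clear calls.
struct ClearCounts {
    int tex1 = 0; // shadow depth texture
    int tex2 = 0; // shadow result texture
    int tex3 = 0; // light accumulation texture
};

// Walks the loop structure of steps 1-8 and counts only the clears.
ClearCounts CountClears(int lightCount) {
    assert(lightCount >= 0);
    const int kDirections = 6;
    ClearCounts c;
    c.tex3 += 1;                      // step 1: once per frame
    for (int light = 0; light < lightCount; ++light) {
        c.tex2 += 1;                  // step 2: once per light
        for (int dir = 0; dir < kDirections; ++dir) {
            c.tex1 += 1;              // step 3: once per direction
            // steps 4 and 5 (render depth, compare) omitted
        }
        // step 7 (accumulate this light into tex3) omitted
    }
    return c;
}
```

With 4 lights this gives 24 clears of tex1, 4 of tex2 and 1 of tex3 per frame, so the cost of clearing tex1 grows fastest as lights are added.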

I checked how much CPU/GPU time erasing the textures costs by erasing them twice instead of once (doing step 3 two times before proceeding to step 4), just to see how big the difference was. The frame rate dropped from 50 to 40 fps.

The reason I use one depth buffer per light is that I want a more realistic result when two lights come close to each other. If I only use one buffer for all shadows, the shadows will be black or missing entirely, and not gray when other lights are nearby.
Certain older GPUs can't natively clear a floating-point surface, so the driver does it by essentially drawing a full-screen quad and outputting the clear value. So doing that isn't cheap.

However I think your bigger problem here is that you're doing an unnecessary step: you don't need to do the shadow depth comparison for all 6 faces. Instead what you can do is render depth to all 6 faces of a cubemap, and then in your light shader you sample the cubemap and do the depth comparison right there. This should be much quicker, and will save you memory as well.
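
As a rough illustration of that comparison, here is a CPU-side C++ sketch of what the light shader would do per pixel. It assumes the cubemap stores the linear distance from the light to the nearest occluder; `IsLit`, `storedDistance` and `bias` are hypothetical names, and the bias is just a small offset commonly used to avoid self-shadowing.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float Distance(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// storedDistance: value read from the cubemap along (pixelPos - lightPos).
// Assumption: the cubemap holds linear light-to-occluder distance.
// bias: hypothetical small offset to reduce self-shadowing artifacts.
bool IsLit(const Vec3& lightPos, const Vec3& pixelPos,
           float storedDistance, float bias) {
    assert(storedDistance >= 0.0f && bias >= 0.0f);
    // The pixel is lit if nothing closer to the light was recorded.
    return Distance(lightPos, pixelPos) <= storedDistance + bias;
}
```

Because the comparison happens once per pixel in the light pass, there is no separate shadow-result texture to fill and clear for each of the 6 directions.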

Also if you can determine that one of the 6 faces of the depth map won't be used, you can skip it. The simplest method for doing this is to determine a frustum shape for each of the 6 faces, and test that frustum for intersection with the frustum for your camera.
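
A simplified C++ sketch of that face test is below. It is a conservative shortcut rather than a full frustum-frustum intersection: each cube face is treated as an infinite 90-degree pyramid from the light (the light's range is ignored), and a face is skipped only if all 8 corners of the camera frustum lie outside one of its four side planes. `FaceMayBeVisible` and the face ordering are assumptions made for this example.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

// Faces are assumed to be ordered +X, -X, +Y, -Y, +Z, -Z.
// Returns false only when the face can safely be skipped; it may
// return true for some faces that are not actually needed.
bool FaceMayBeVisible(int face, const Vec3& lightPos,
                      const std::array<Vec3, 8>& cameraCorners) {
    assert(face >= 0 && face < 6);
    const int major = face / 2;
    const float sign = (face % 2 == 0) ? 1.0f : -1.0f;
    const int other[2] = { (major + 1) % 3, (major + 2) % 3 };
    for (int i = 0; i < 2; ++i) {
        for (int s = -1; s <= 1; s += 2) {
            // Side plane: sign * p[major] + s * p[other] >= 0 is "inside".
            bool allOutside = true;
            for (const Vec3& corner : cameraCorners) {
                const float p[3] = { corner.x - lightPos.x,
                                     corner.y - lightPos.y,
                                     corner.z - lightPos.z };
                if (sign * p[major] + static_cast<float>(s) * p[other[i]] >= 0.0f) {
                    allOutside = false;
                    break;
                }
            }
            if (allOutside) return false; // a separating plane exists
        }
    }
    return true;
}
```

The 8 corners would come from un-projecting the corners of the camera's clip volume into world space. Adding the light's range as a far plane would let more faces be skipped.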
Thanks for your reply :)
I haven't looked into cubemaps that much, but what I gathered from skimming a tutorial is that I still have to render in all 6 directions. I was just wondering how to get the texture coordinates. Is it done with a normal? If so, that would be great :) But is it accurate? Does the projection used when rendering to the cubemap need any specific values?
Here's some more options for speeding it up.

- Use dual paraboloids instead of the six cube faces.

- Read the z-buffer directly ( Hardware permitting ).
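
For the dual paraboloid option, the usual mapping can be sketched in C++ roughly as follows. This assumes the two paraboloids face along +Z and -Z in light space and ignores any texture V flip; `ToParaboloid` and `ParaboloidCoord` are illustrative names only. In a real shader the same math would run per vertex when rendering the two maps and per pixel when sampling them.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

struct ParaboloidCoord {
    bool front; // true: +Z hemisphere map, false: -Z hemisphere map
    float u, v; // texture coordinates in [0, 1]
};

// dir: vector from the light to the point, in light space (any length).
ParaboloidCoord ToParaboloid(Vec3 dir) {
    const float len = std::sqrt(dir.x * dir.x + dir.y * dir.y + dir.z * dir.z);
    assert(len > 0.0f);
    dir.x /= len; dir.y /= len; dir.z /= len;
    const bool front = dir.z >= 0.0f;
    // Paraboloid projection: divide xy by (1 + |z|) to land in [-1, 1].
    const float denom = 1.0f + (front ? dir.z : -dir.z);
    const float u = dir.x / denom * 0.5f + 0.5f;
    const float v = dir.y / denom * 0.5f + 0.5f;
    return { front, u, v };
}
```

A direction straight along +Z lands in the centre of the front map, and directions near the Z = 0 plane land on the rim of the unit circle, which is where the two maps meet.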
To sample a cubemap you use a normalized direction vector. So if you orient your cubemap faces so that they line up with your world-space axes, then all you do is get a vector pointing from the light position to the pixel you're shading and normalize it to sample your shadow map. It's just as accurate as sampling a regular 2D texture.
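
The hardware does the face lookup for you when you sample with a direction, but a CPU-side C++ sketch may make it clearer what happens. The per-face sign table below follows the cube map face-selection table from the OpenGL specification; that Direct3D 9 lays its faces out exactly the same way is an assumption here, and `LookupCube` is a made-up name.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

struct CubeLookup {
    int face;   // 0..5 = +X, -X, +Y, -Y, +Z, -Z
    float u, v; // coordinates within that face, in [0, 1]
};

// The light-to-pixel vector does not need to be normalized for the lookup.
CubeLookup LookupCube(const Vec3& lightPos, const Vec3& pixelPos) {
    const float rx = pixelPos.x - lightPos.x;
    const float ry = pixelPos.y - lightPos.y;
    const float rz = pixelPos.z - lightPos.z;
    const float ax = std::fabs(rx), ay = std::fabs(ry), az = std::fabs(rz);
    assert(ax + ay + az > 0.0f);
    int face; float sc, tc, ma;
    if (ax >= ay && ax >= az) {        // X is the major axis
        ma = ax;
        if (rx > 0.0f) { face = 0; sc = -rz; tc = -ry; }
        else           { face = 1; sc =  rz; tc = -ry; }
    } else if (ay >= az) {             // Y is the major axis
        ma = ay;
        if (ry > 0.0f) { face = 2; sc = rx; tc =  rz; }
        else           { face = 3; sc = rx; tc = -rz; }
    } else {                           // Z is the major axis
        ma = az;
        if (rz > 0.0f) { face = 4; sc =  rx; tc = -ry; }
        else           { face = 5; sc = -rx; tc = -ry; }
    }
    return { face, 0.5f * (sc / ma + 1.0f), 0.5f * (tc / ma + 1.0f) };
}
```

The largest component of the direction picks the face, and the other two components divided by it give the position inside the face.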

For rendering to the faces of a cubemap, typically you use a perspective projection with an aspect ratio of 1.0 and a FOV of PI/2. This makes it so that the 6 frustums line up exactly and there's no gaps.
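
As a small C++ sketch of that projection: the matrix below uses the same layout as a left-handed perspective matrix in the style of `D3DXMatrixPerspectiveFovLH`, written out by hand so the example stands alone. `Matrix4`, `PerspectiveFovLH` and `CubeFaceProjection` are names chosen for this example, and the near/far values are up to you.

```cpp
#include <cassert>
#include <cmath>

struct Matrix4 { float m[4][4]; };

// Left-handed perspective projection, row-vector convention.
Matrix4 PerspectiveFovLH(float fovY, float aspect, float zn, float zf) {
    assert(aspect > 0.0f && zn > 0.0f && zf > zn);
    const float yScale = 1.0f / std::tan(fovY * 0.5f); // cot(fovY / 2)
    const float xScale = yScale / aspect;
    Matrix4 r = {};                                     // all zeros
    r.m[0][0] = xScale;
    r.m[1][1] = yScale;
    r.m[2][2] = zf / (zf - zn);
    r.m[2][3] = 1.0f;
    r.m[3][2] = -zn * zf / (zf - zn);
    return r;
}

// Cube face projection: FOV of PI/2 and an aspect ratio of 1.0.
Matrix4 CubeFaceProjection(float zn, float zf) {
    const float kHalfPi = 1.57079632679f;
    return PerspectiveFovLH(kHalfPi, 1.0f, zn, zf);
}
```

With a FOV of PI/2 both scale terms come out as cot(PI/4) = 1, so each face covers exactly the region where its major axis is the largest component, and the six frustums tile all directions.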
I must say that I really like this forum; there are a lot of different ideas from different people helping me decide on the best choice :)

I had never heard of paraboloid textures, but I read a bit about them here:
http://diaryofagraphicsprogrammer.blogspot.com/2008/12/dual-paraboloid-shadow-maps.html

I guess the quality is a bit lower than with cubemaps. But the best thing is that I only have to render the scene two times instead of six. That would help me a lot, so I think I will give paraboloid textures a try. By the way, are there any good tutorials online on this? It may save me some time ;)

I was also wondering about writing to and reading from the depth buffer directly. Is this possible without rendering to any texture, and without going through the HLSL shader?
Sorry MJP, I didn't see your post before replying to the one before yours.
Well, I will keep this as a second option if the paraboloids don't work out. But I guess the technique is the same for both.
@MJP In the end I used the cubemap for the shadow texture.

Thank you all for all your help :)

Here's a screenshot of the result:
http://i955.photobucket.com/albums/ae36/ralleman/DMT2Shadows2.jpg

