Rasmus

360 degrees shadow speed up

Hello,

This is my first post on this forum, and it will probably not be my last :) I have visited this forum several times to gather information about problems I've run into, and now I think it is time for me to show myself :) Well, back to the problem.

I am writing my own HLSL shader in Direct3D 9 and just heard of a technique called deferred rendering. I found that it was rather easy to add to the shader I was creating, and the number of lights it can handle is extreme. But one problem bothers me a bit. I've just finished creating lights with shadows that work in all directions, meaning I have to render the light's environment 6 times, once for every direction (up, down, left, right, back, forward). This takes a lot out of the graphics card, not because of rendering the depth buffer for the shadows, but because of erasing the depth buffer over and over again.

I was wondering if there is any faster way to erase textures than clear or fill? Or is there some way to cheat on the shadows and not create them in every direction, depending on where the camera is? I can set the projection to more than 90 degrees, but it still can't project a view bigger than 45 degrees in ALL directions from the center of the view.

Here is a screenshot of my work. I have kept it simple so far, without parallax or bump mapping: http://i955.photobucket.com/albums/ae36/ralleman/DMT2Shadows1.jpg

Clearing a depth buffer is extremely fast in most cases. ATI actually encourages you to clear it as often as possible.

How are you determining that this operation is slow? Perhaps when you issue the clear command it causes a stall where the driver has to wait for all previous commands to complete first.

When you say "to erase the depthbuffer over and over", do you mean that you're only using one depth buffer, shared for all the shadows?

Thanks for your quick answer :)
Well, I may have discovered one disadvantage here: I don't use a depth buffer. I use a 32-bit texture with one color channel, which the shader writes the z-depth to, and I use Clear or ColorFill with color 1.0f to erase it. Maybe it will go faster if I use a depth buffer instead and clear that one. But then again, can I instruct the HLSL shader to write only to the z-depth buffer and not to any texture? It would be a speed-up if that worked...

The code is something like this:

tex1 = 32 bit depthbuffer for shadow;
tex2 = 8 bit texture for writing the shadow results to;
tex3 = 8 bit rgba texture for writing the lights to;

1. Erase tex3.
2. Erase tex2.
3. Erase tex1.
4. Render the shadow depth for one of the 6 directions to tex1.
5. Compare each light pixel's depth with tex1 and, if it passes, write color 1.0f to tex2.
6. Go to step 3 until all 6 directions have been processed.
7. Render each light pixel's color / specular to tex3 if the tex2 pixel color is 1.0f.
8. Go to step 2 until all the lights have been processed.
9. Multiply the lights in tex3 with the diffuse color from a previous rendering result and output it to the backbuffer.

As you can see, tex1 gets erased six times for every light and tex2 once for every light.

I checked how much CPU/GPU time it takes to erase all the textures by erasing them twice instead of once (doing step 3 two times before proceeding to step 4), just to see how big the difference was. The frame rate dropped from 50 to 40 fps.

The reason I use one depth buffer per light is that I want a more realistic result when two lights come close to each other. If I only use one buffer for all shadows, the shadows will be either black or absent, and not gray when other lights are nearby.

Certain older GPUs can't natively clear a floating-point surface, so the driver does it by essentially drawing a full-screen quad and outputting the clear value. So doing that isn't cheap.

However I think your bigger problem here is that you're doing an unnecessary step: you don't need to do the shadow depth comparison for all 6 faces. Instead what you can do is render depth to all 6 faces of a cubemap, and then in your light shader you sample the cubemap and do the depth comparison right there. This should be much quicker, and will save you memory as well.

Also if you can determine that one of the 6 faces of the depth map won't be used, you can skip it. The simplest method for doing this is to determine a frustum shape for each of the 6 faces, and test that frustum for intersection with the frustum for your camera.
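A rough sketch of that face test in plain C++, under a few assumptions: the cube faces are aligned with the world axes, the faces are numbered in the D3D order (+X, -X, +Y, -Y, +Z, -Z), and you already have the 8 world-space corners of the camera frustum. Vec3, boxCorners and faceMayBeVisible are names made up for this example. The test is conservative: it only checks the four side planes of each face's 90-degree pyramid, so it may keep a face that a full frustum-frustum test would reject, but it should never skip one that is needed.

```cpp
#include <array>

struct Vec3 { float x, y, z; };

// Corners of an axis-aligned box. It stands in here for the 8 world-space
// corners of the camera frustum (hypothetical helper for the example).
std::array<Vec3, 8> boxCorners(const Vec3& lo, const Vec3& hi) {
    return {{ {lo.x, lo.y, lo.z}, {hi.x, lo.y, lo.z},
              {lo.x, hi.y, lo.z}, {hi.x, hi.y, lo.z},
              {lo.x, lo.y, hi.z}, {hi.x, lo.y, hi.z},
              {lo.x, hi.y, hi.z}, {hi.x, hi.y, hi.z} }};
}

// Returns false only when all camera corners lie outside one side plane of
// the face's pyramid, which means the face cannot contribute to the view.
// face: 0:+X 1:-X 2:+Y 3:-Y 4:+Z 5:-Z
bool faceMayBeVisible(int face, const Vec3& lightPos,
                      const std::array<Vec3, 8>& cameraCorners) {
    const int axis = face / 2;                        // major axis of the face
    const float sign = (face % 2 == 0) ? 1.0f : -1.0f;
    const float sides[2] = {1.0f, -1.0f};

    for (int other = 0; other < 3; ++other) {
        if (other == axis) continue;
        for (float side : sides) {
            // Side plane through the light; the normal points into the pyramid.
            float n[3] = {0.0f, 0.0f, 0.0f};
            n[axis] = sign;
            n[other] = side;

            bool allOutside = true;
            for (const Vec3& c : cameraCorners) {
                const float p[3] = {c.x - lightPos.x,
                                    c.y - lightPos.y,
                                    c.z - lightPos.z};
                const float d = n[0] * p[0] + n[1] * p[1] + n[2] * p[2];
                if (d >= 0.0f) { allOutside = false; break; }
            }
            if (allOutside) return false;             // separating plane found
        }
    }
    return true;
}
```

For a light at the origin and a camera volume spanning (5, -1, -1) to (7, 1, 1), only the +X face survives this test, so the other five depth renders could be skipped for that light.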

Thanks for your reply :)
I haven't looked into cubemaps that much, but what I gathered from skimming a tutorial is that I still have to render in all 6 directions. I was just wondering how to get the texture coordinates. Is it by using a normal? If so, that would be great :) But is it accurate? Does the projection used when rendering to the cubemap need specific values?

To sample a cubemap you use a normalized direction vector. So if you orient your cubemap faces so that they line up with your world-space axes, then all you do is get a vector pointing from the light position to the pixel you're shading and normalize it to sample your shadow map. It's just as accurate as sampling a regular 2D texture.
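To make the lookup concrete, here is a CPU-side sketch in C++ of what the light shader ends up doing. It assumes the shadow pass stored the linear distance from the light in the cubemap (the thread does not say which depth value is stored) and that a small bias is used against self-shadowing. All names are made up for the example; in HLSL the face selection happens inside the cube sampler, so you only supply the direction.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Which cube face a direction lands on, in D3D order:
// 0:+X 1:-X 2:+Y 3:-Y 4:+Z 5:-Z. The face is picked by the component
// with the largest magnitude, which is what the sampler does for you.
int cubeFaceForDirection(const Vec3& d) {
    const float ax = std::fabs(d.x);
    const float ay = std::fabs(d.y);
    const float az = std::fabs(d.z);
    if (ax >= ay && ax >= az) return d.x >= 0.0f ? 0 : 1;
    if (ay >= az)             return d.y >= 0.0f ? 2 : 3;
    return d.z >= 0.0f ? 4 : 5;
}

// Shadow test for one pixel. storedDistance is the value read from the
// cubemap along (pixelPos - lightPos); ASSUMED to be the linear distance
// from the light to the nearest occluder in that direction.
bool isLit(const Vec3& lightPos, const Vec3& pixelPos,
           float storedDistance, float bias) {
    const float dx = pixelPos.x - lightPos.x;
    const float dy = pixelPos.y - lightPos.y;
    const float dz = pixelPos.z - lightPos.z;
    const float distance = std::sqrt(dx * dx + dy * dy + dz * dz);
    return distance <= storedDistance + bias;
}
```

This replaces the six separate comparison passes with a single lookup per pixel, which is the saving described above.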

For rendering to the faces of a cubemap, you typically use a perspective projection with an aspect ratio of 1.0 and a FOV of PI/2. This makes the 6 frustums line up exactly, so there are no gaps.
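As a sketch of what that projection looks like, the C++ below builds the matrix by hand using the layout documented for D3DXMatrixPerspectiveFovLH (left-handed, row vectors). Mat4 and cubeFaceProjection are names made up for the example, and the near/far values are whatever suits your light's range.

```cpp
#include <cmath>

struct Mat4 { float m[4][4]; };

// Left-handed perspective projection with the same element layout as
// D3DXMatrixPerspectiveFovLH.
Mat4 perspectiveFovLH(float fovY, float aspect, float zn, float zf) {
    const float yScale = 1.0f / std::tan(fovY * 0.5f);
    const float xScale = yScale / aspect;
    Mat4 r = {};                          // all elements start at zero
    r.m[0][0] = xScale;
    r.m[1][1] = yScale;
    r.m[2][2] = zf / (zf - zn);
    r.m[2][3] = 1.0f;
    r.m[3][2] = -zn * zf / (zf - zn);
    return r;
}

// Projection for one cube face: FOV of PI/2 and aspect ratio 1.0.
Mat4 cubeFaceProjection(float zn, float zf) {
    const float halfPi = 1.57079632679f;
    return perspectiveFovLH(halfPi, 1.0f, zn, zf);
}
```

With these values both scale factors come out as 1, so each frustum's side planes sit at x = ±z and y = ±z in view space. Those are exactly the planes shared with the neighbouring faces, which is why the six views tile without gaps or overlap.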

I must say that I really like this forum; there are a lot of different ideas from different people helping me decide on the best choice :)

I have never heard of paraboloid textures, but I read a bit about them here:
http://diaryofagraphicsprogrammer.blogspot.com/2008/12/dual-paraboloid-shadow-maps.html

I guess the quality is a bit lower than with cubemaps. But the best thing is that I only have to render the scene two times instead of six. This would help me a lot, so I think I will give paraboloid textures a try. By the way, are there any good tutorials online on this? It may save me some time ;)
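For reference, the core of the paraboloid mapping is small. Below is a minimal C++ sketch of the commonly used formula, assuming the direction is already in the light's space and the front map looks down +Z; the struct and function names are made up for the example. In a real shadow pass this math would run in the vertex shader.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

struct ParaboloidCoords {
    int   map;   // 0 = front (+Z) map, 1 = back (-Z) map
    float x, y;  // position on that map, each in [-1, 1]
};

// Map a light-space direction onto one of the two paraboloid maps.
ParaboloidCoords paraboloidLookup(Vec3 d) {
    const float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    d.x /= len; d.y /= len; d.z /= len;      // normalize the direction

    ParaboloidCoords c;
    c.map = (d.z >= 0.0f) ? 0 : 1;
    const float z = std::fabs(d.z);          // mirror the back hemisphere
    c.x = d.x / (1.0f + z);
    c.y = d.y / (1.0f + z);
    return c;                                // remap to [0, 1] before sampling
}
```

A direction straight down the axis lands in the middle of its map, and directions at 90 degrees to the axis land on the unit circle, which is where the two maps meet.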

I was also wondering about writing to and reading from the depth buffer directly. Is this possible without having to render to any texture and without going through the HLSL shader?

Sorry MJP, I didn't see your post before writing my previous one.
Well, I will keep this as a second option if the paraboloids don't work out. But I guess the technique is the same for both...

@MJP I ended up using the cubemap for the shadow texture.

Thank you all for all your help :)

Here's a screenshot of the result:
http://i955.photobucket.com/albums/ae36/ralleman/DMT2Shadows2.jpg
