DEMO: Deferred Shading

Hi everyone, I decided to give deferred shading a try under Direct3D recently, and here is the result of my work: Demo Site.

[Screenshot: deferred_shading_1.jpg]


I'd be glad if you could give me some feedback on whether it runs at all and what the rendering speed was on your hardware. Unfortunately it requires a GeForce 6 class card (or higher) or a Radeon 9500 (or higher). I'd expect it to work on a GeForce FX as well, but switching to another rendering mode should fail (no support for 64-bit render targets).

Note: By default there are 14 light sources in the demo. Disabling a few of them should make things smoother.
Thanks!

[edit: Now tested on Radeon cards as well]
Maciej Sawitus
my blog | my games
Oh sorry, I've got a GeForce 4. Too bad, it looks really nice.
Hope I was helpful. And thank you if you were!
When I tried, it said it's missing d3dx9d_24.dll. I may even have this lying around, but I didn't feel like hunting for it.

Some responses to your NVIDIA-related questions. I just resigned from NVIDIA, but I can help with some of them; contact developer relations for more specific help.

From your readme.txt:

Issues found when implementing the deferred renderer (probably
very GeForce-specific):

- an optimization using the stencil buffer to mask pixels not lit
by the light turned out to be no optimization at all (but it's
still necessary to correctly determine lit pixels); you can
see the stencil test in action by pressing <J> and disabling a
few lights (just to see more clearly)

GeForce 6 cards use a very fast, but somewhat picky, stencil culling algorithm. Unless you perform your stencil test within certain parameters, the entire shader will be run before stencil is tested. It IS possible to get it fast; it just requires a few tweaks.

- a cube normalization texture didn't give any speed-up

Not surprising, as the GeForce 6 has a very fast normalize (especially in half precision), and you are most likely bound by other things.

- rendering to 4 render targets in the pre-processing step is very
expensive (but it's only done once)

The sweet spot for GF6 cards is 3 MRTs. Most deferred shading algorithms can be squeezed into 3 MRTs, so this is not a big deal in practice.
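As a rough illustration of what a 3-MRT geometry pass looks like in Direct3D 9, here is a minimal sketch of creating and binding three G-buffer targets. The formats, names and layout below are assumptions for the example, not taken from the demo:

#include <d3d9.h>

// Minimal sketch: create and bind a 3-MRT G-buffer for the geometry pass.
// Formats and names are illustrative; error handling is kept to a bare minimum.
bool BindGBuffer(IDirect3DDevice9* dev, UINT width, UINT height)
{
    IDirect3DTexture9* albedoTex = NULL;  // diffuse color
    IDirect3DTexture9* normalTex = NULL;  // biased world-space normal
    IDirect3DTexture9* depthTex  = NULL;  // position stored as depth

    if (FAILED(dev->CreateTexture(width, height, 1, D3DUSAGE_RENDERTARGET,
            D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT, &albedoTex, NULL)) ||
        FAILED(dev->CreateTexture(width, height, 1, D3DUSAGE_RENDERTARGET,
            D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT, &normalTex, NULL)) ||
        FAILED(dev->CreateTexture(width, height, 1, D3DUSAGE_RENDERTARGET,
            D3DFMT_R32F, D3DPOOL_DEFAULT, &depthTex, NULL)))
        return false;

    // Bind all three as simultaneous render targets; the geometry pass
    // pixel shader then writes to COLOR0..COLOR2 in a single pass.
    IDirect3DSurface9 *albedoSurf, *normalSurf, *depthSurf;
    albedoTex->GetSurfaceLevel(0, &albedoSurf);
    normalTex->GetSurfaceLevel(0, &normalSurf);
    depthTex->GetSurfaceLevel(0, &depthSurf);

    dev->SetRenderTarget(0, albedoSurf);
    dev->SetRenderTarget(1, normalSurf);
    dev->SetRenderTarget(2, depthSurf);
    // ... draw the scene geometry here, then release the surface references ...
    return true;
}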

- fastest (and good quality) deferred renderer mode for me was:
R32F for position (as depth; stored in clip space)
A8R8G8B8 for normals (biased; stored in world space)

- best quality deferred renderer mode was obviously:
R16G16B16F for position (stored in world space)
R16G16B16F for normal (stored in world space)
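As a side note, Direct3D 9 exposes no three-channel 16-bit float surface format, so "R16G16B16F" in practice means the four-channel D3DFMT_A16B16G16R16F. A hedged sketch of how the two modes above might map onto D3D9 formats (names are illustrative, not from the demo source):

#include <d3d9.h>

// Illustrative mapping of the two G-buffer modes described above onto D3D9
// formats. "Fast": clip-space depth in a single 32-bit float channel plus
// biased world-space normals at 8 bits per channel. "Quality": world-space
// position and normal in 16-bit float channels (four-channel format, since
// D3D9 has no RGB-only half format).
struct GBufferFormats
{
    D3DFORMAT position;
    D3DFORMAT normal;
};

enum GBufferMode { GBUFFER_FAST, GBUFFER_QUALITY };

GBufferFormats SelectGBufferFormats(GBufferMode mode)
{
    GBufferFormats fmt;
    if (mode == GBUFFER_FAST)
    {
        fmt.position = D3DFMT_R32F;            // depth only, clip space
        fmt.normal   = D3DFMT_A8R8G8B8;        // normal biased into [0, 255]
    }
    else // GBUFFER_QUALITY
    {
        fmt.position = D3DFMT_A16B16G16R16F;   // full world-space position
        fmt.normal   = D3DFMT_A16B16G16R16F;   // full world-space normal
    }
    return fmt;
}

It's also worth checking CheckDeviceFormat for D3DUSAGE_RENDERTARGET support before committing to a mode, since float render target support varies across the cards mentioned in this thread.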

- the speed of rendering with the deferred renderer varied
depending on when (yes, when) the render target textures were
allocated; e.g. for me the R16G16B16 (non-float) mode, when
switched on for the first time, was usually about 2 times
slower than when switched on for the second time (every time
I switch, I recreate all required render targets); it looks
like the card drivers are doing some unpredictable work when
allocating / deallocating render target textures

The GF6 has some limited hardware resources for non-power-of-2 render targets. If you fall outside of the limits, speed can suffer. Typically the current drivers rely on the order of allocation to decide who gets the resources. Future drivers will address this more automatically.
I can't run it because I don't have d3dx9d_24.dll... I have the _25 and _26 DLLs from the April and June 2005 DX9.0c SDKs, but that won't help. I tried recompiling, and I got this error:

deferredrenderer.cpp(369) : error C2552: 'quad' : non-aggregates cannot be initialized with initializer list
'D3DHelper::SimpleVertex' : Types with user defined constructors are not aggregate

I'm not sure how to fix it, but it looks like it's something to do with the D3DHelper class.
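If it helps, C2552 here usually means the quad array is being brace-initialized while D3DHelper::SimpleVertex has a user-defined constructor, so it isn't an aggregate in C++03. A typical workaround, sketched with a made-up vertex layout, is to construct each element explicitly:

// Hypothetical vertex type with a user-defined constructor (not an aggregate).
struct SimpleVertex
{
    float x, y, z;   // position
    float u, v;      // texture coordinates
    SimpleVertex(float px, float py, float pz, float tu, float tv)
        : x(px), y(py), z(pz), u(tu), v(tv) {}
};

// Brace-initializing such a type triggers C2552 on VC7/VC8:
//   SimpleVertex quad[4] = { {-1,-1,0, 0,1}, ... };   // error C2552
// Calling the constructor for each element compiles fine:
SimpleVertex quad[4] =
{
    SimpleVertex(-1.0f, -1.0f, 0.0f, 0.0f, 1.0f),
    SimpleVertex( 1.0f, -1.0f, 0.0f, 1.0f, 1.0f),
    SimpleVertex(-1.0f,  1.0f, 0.0f, 0.0f, 0.0f),
    SimpleVertex( 1.0f,  1.0f, 0.0f, 1.0f, 0.0f),
};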

If you can fix it (maybe by upgrading to the June 2005 SDK), I'll test it for you. I'm running an Athlon XP 2500+ (1.8 GHz) and a GeForce 6600GT AGP.

Ok, I've just included the missing d3dx9d_24.dll. I'll try to recompile it with a newer D3D version soon too.

BTW, thanks for the feedback, SimmerD.

> GeForce 6 cards use a very fast, but somewhat picky, stencil culling algorithm.
> Unless you perform your stencil test within certain parameters, the entire shader
> will be run before stencil is tested. It IS possible to get it fast; it just
> requires a few tweaks.

I admit I was a bit surprised to see I didn't get a speed-up from the stencil test here. What I do in the demo is, for each light, render its bounding sphere twice:
1. stencil-"tag" the pixels lit by the light (disable depth and color writes; enable the depth test)
2. if the stencil test passes, light the pixel using the pixel shader; for each pixel set the stencil value to 0 (just to clear the stencil)
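In Direct3D 9 render-state terms those two passes look roughly like the sketch below (my own hedged illustration of the procedure as described, not the demo's code; the cull mode and the pass-2 depth state are omitted because they depend on whether the camera is inside the light's bounding sphere):

#include <d3d9.h>

// Sketch of the two per-light passes described above.
void DrawLightWithStencilMask(IDirect3DDevice9* dev /*, light, sphere mesh */)
{
    // Pass 1: "tag" lit pixels in the stencil buffer.
    // Color and depth writes off, depth test on; stencil is written to 1
    // wherever the depth test passes.
    dev->SetRenderState(D3DRS_COLORWRITEENABLE, 0);
    dev->SetRenderState(D3DRS_ZENABLE, TRUE);
    dev->SetRenderState(D3DRS_ZWRITEENABLE, FALSE);
    dev->SetRenderState(D3DRS_STENCILENABLE, TRUE);
    dev->SetRenderState(D3DRS_STENCILFUNC, D3DCMP_ALWAYS);
    dev->SetRenderState(D3DRS_STENCILREF, 1);
    dev->SetRenderState(D3DRS_STENCILPASS, D3DSTENCILOP_REPLACE);
    dev->SetRenderState(D3DRS_STENCILZFAIL, D3DSTENCILOP_KEEP);
    // ... draw the light's bounding sphere ...

    // Pass 2: run the lighting pixel shader only where stencil == 1, and
    // reset stencil to 0 on shaded pixels (so no separate clear is needed).
    dev->SetRenderState(D3DRS_COLORWRITEENABLE,
        D3DCOLORWRITEENABLE_RED | D3DCOLORWRITEENABLE_GREEN |
        D3DCOLORWRITEENABLE_BLUE | D3DCOLORWRITEENABLE_ALPHA);
    dev->SetRenderState(D3DRS_STENCILFUNC, D3DCMP_EQUAL);
    dev->SetRenderState(D3DRS_STENCILREF, 1);
    dev->SetRenderState(D3DRS_STENCILPASS, D3DSTENCILOP_ZERO);
    // ... draw the bounding sphere again with the lighting shader bound ...
}

(One guess: pass 2 both tests and writes stencil in the same draw, which might be the kind of thing that keeps the hardware from using its fast stencil path, but I'm not sure.)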

Do you suggest I might somehow not benefit from stencil culling (that is, the pixel shader is executed for culled-away pixels too)? What parameters and tweaks do you mean?

Thanks :)

[edit: Changed "culled" to "culled away"]

[Edited by - MickeyMouse on July 19, 2005 1:57:41 PM]
Maciej Sawitus
my blog | my games
I've downloaded the new version, but it looks like you've included the wrong file; the program wants "d3dx9d_24.dll", not "d3dx9_24.dll". It works if you just rename the DLL to the right name, but it's not a great idea, as the program will be expecting to open the debug DLL and it's getting the release DLL.

edit: You might also want to change the way the fps is displayed, as currently it changes a bit too fast to read easily.
Oh yes, I'm sorry, it should be ok now.
Maciej Sawitus
my blog | my games
Quote:Original post by MickeyMouse
Do you suggest I might somehow not benefit from stencil culling (that is the pixel shader is executed for culled pixels too)? What parameters and tweaks do you mean?


I can't try the demo, but judging from your screenshot the test scene is not going to benefit from the stencil test, since even with a simple z-buffer based rejection the number of false hits will stay fairly low. That's assuming you shade if greater than Z when outside the volume and shade when less than Z when inside it.
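In render-state terms that's just a matter of flipping the depth comparison when drawing the light volume; a rough sketch of what I mean (which faces of the volume you rasterize, and therefore the exact comparison, depends on your renderer's conventions):

#include <d3d9.h>

// Depth-test-only light volume rejection following the rule above:
// GREATER when the camera is outside the volume, LESS when inside.
// Interpretation only; the matching cull mode is left to the renderer.
void SetLightVolumeDepthTest(IDirect3DDevice9* dev, bool cameraInsideVolume)
{
    dev->SetRenderState(D3DRS_ZENABLE, TRUE);
    dev->SetRenderState(D3DRS_ZWRITEENABLE, FALSE);  // don't touch scene depth
    dev->SetRenderState(D3DRS_ZFUNC,
        cameraInsideVolume ? D3DCMP_LESS : D3DCMP_GREATER);
}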
Praise the alternative.
Hi,

I renamed the d3dx DLL and got ~85 fps with the default options.

Specs:
Sempron 2600+
512 MB DDR2 Memory/333 MHz
Radeon 9800 Pro 128 MB.
Well, I've tried it with the newest version and here are my results:

AMD64 3200+, 1.5GB RAM, 6800GT

With the information display off, the framerate changes very quickly and it's hard to get an exact reading; the lowest I can see is about 102 fps, the highest about 126 fps.

If I overclock my 6800GT to the speed of a 6800 Ultra, the highest is still 126 fps, but it doesn't drop below 120 fps.

