Modern Graphics Engine Design

Started by
28 comments, last by SimmerD 19 years, 11 months ago
hmm..

"Stencil shadow (on CPU or GPU). Comment: Limited to 3 lights per surface."

Wtf? Since when? Shadow volumes just mean you need another rendering pass per light. Where'd you pluck this limitation from?
The lighting I use is semi-dynamic. Occlusion is precalculated for world geometry, so lights can't move around. They can decrease in intensity and change color for free at runtime.

The occlusion is calculated in a preprocess and stored per-vertex in the world geometry, via 5-17 raycasts per vertex.
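As an illustrative sketch only (the engine's actual raycast API is unknown), averaging raycast results into a per-vertex occlusion factor could look like this, with the visibility test supplied by the caller:

```cpp
#include <cstddef>

// Occlusion = fraction of sample rays that reach the light unblocked.
// 'blocked' is a caller-supplied visibility test (a stand-in for a real
// raycast against world geometry); 1.0 means fully visible.
float computeVertexOcclusion(const float* vertexPos,
                             const float (*samples)[3], std::size_t n,
                             bool (*blocked)(const float*, const float*))
{
    std::size_t visible = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (!blocked(vertexPos, samples[i]))
            ++visible;
    return static_cast<float>(visible) / static_cast<float>(n);
}
```

The 5-17 sample points per vertex would be jittered positions around the light, traded off against preprocess time.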

For moving entities ( which are fairly small ), 9 raycasts are done per light for occlusion, and the vertex lighting is done in the normal way.

Re: Lights per polygon. Only 7 lights can be touching each 3d grid cell ( plus ambient ). So each cell has a list of which 7 lights are touching it.
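A fixed per-cell light list like the one described could be sketched as follows (not SimmerD's actual code; names and index types are hypothetical):

```cpp
#include <cstdint>

// Each grid cell tracks at most 7 lights (plus implicit ambient), as a
// fixed-size list of indices into a global light array.
const int kMaxLightsPerCell = 7;

struct GridCell {
    std::uint16_t lightIndex[kMaxLightsPerCell]; // which lights touch this cell
    std::uint8_t  lightCount;                    // slots in use
};

// Returns false (light dropped) once the cell's budget is exhausted.
bool addLightToCell(GridCell& cell, std::uint16_t light)
{
    if (cell.lightCount >= kMaxLightsPerCell)
        return false;
    cell.lightIndex[cell.lightCount++] = light;
    return true;
}
```

A fixed cap like this keeps the per-cell data small and makes worst-case lighting cost predictable.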

Re: KD trees. I actually coded these up, and there were so many splits generated that it became infeasible for collision purposes. Because I wanted to use the same tessellation for collision and rendering, kd-trees wouldn't be the best approach. I started to change it to a loose kd-tree, and then realized this is essentially an AABB tree.

The engine only splits to the 3d grid cells, and the AABB Tree never splits.


Re: "Stencil shadow (on CPU or GPU). Comment: Limited to 3 lights per surface."

This is where the difference between hearing a talk and reading the .ppt comes in. I meant for performance reasons, you need to limit the # of shadowed lights that touch each surface. My understanding is that many stencil shadow engines in development today go to considerable trouble to limit the # of light/surface interactions for speed reasons.




Hey, thanks for the previous answers. I've been thinking over some concepts, and have another question (open to everyone, really).

How do you design your engine to handle the multiple types of materials that exist? Say, the extra processing required by a BRDF, vs. an object with a reflective property requiring a cube map, vs. your standard one-texture material. I've been trying to develop a generalized answer for this problem, but the only thing I can come up with is "you have to tailor-make the engine to the game", which basically comes down to reducing the types of materials that can exist in the engine. I'm wondering if there's another / better way around this?

~Main

==
Colt "MainRoach" McAnlis
Programmer
www.badheat.com/sinewave
You have to decide if you're making an engine or a game.

It's fun on an engineering level to create a general engine, but it can become hard later on to ship the game with good perf if there are too many material switches, etc.

One way to look at it is that you get about 250 draw calls per frame, depending on your CPU and frame time budget. By switching materials or shaders in ways that can''t be batched, you have to spend one of your draw calls for each object with that material or shader.

That''s fine, if the material or shader really adds something to the scene.

It''s not fine if it could have been reasonably collapsed to a similar material and done all in one go.

My test engine is designed around a game concept, so that I can make the tradeoffs I need for good speed. Currently it gets ~160 fps on a GeForce4 Ti 4400, and 100 fps on a GeForce FX 5200 at 800x800 resolution.

What I do is to separate the things to be drawn into groups : opaque world geometry, entities ( characters, etc. ), particles, layers ( fog & mist & water ), and decals ( shadows and blood and scorchmarks ).

The opaque world is drawn by material in world grid cells ( ~250-500 polys per batch ). These do the averaged L and H bump mapping, emissive, gloss, vertex lighting and diffuse texturing in one pass. Water or fog density is stored in DestAlpha.

The alpha layers are drawn as 2 or 4 texture layers moving against one another, blended into dest alpha from 1st pass.

The characters are vertex lit only, but similarly to the way the world is lit, to make them match. They are skinned on the CPU into a dynamicVB.

The particles will be drawn a system at a time. All particles in the system will use the same texture page or texture cubemap.

The decals will be drawn all at once from a dynamicVB, using a cubemap to store 6 different decal textures.

By doing things in few batches, you can have more things in your scene for the same or better performance.
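The batching idea above can be sketched as sorting render items by a material key, so runs of the same material collapse into one draw call. `RenderItem` and `countBatches` are illustrative names, not from the engine:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Minimal render item: sorting on materialId groups compatible draws.
struct RenderItem {
    std::uint32_t materialId;
    std::uint32_t meshId;
};

// Count draw calls needed after sorting: one per run of equal materials.
std::size_t countBatches(std::vector<RenderItem> items)
{
    std::sort(items.begin(), items.end(),
              [](const RenderItem& a, const RenderItem& b) {
                  return a.materialId < b.materialId;
              });
    std::size_t batches = 0;
    for (std::size_t i = 0; i < items.size(); ++i)
        if (i == 0 || items[i].materialId != items[i - 1].materialId)
            ++batches;
    return batches;
}
```

A real engine would build a wider sort key (pass, shader, material, texture) and submit each run as one batch; the principle is the same.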

So, I am recommending you create your engine more tailored for your game. You really pay for generality in terms of performance and scene complexity. Just be sure the tradeoff is worth it.

SimmerD: there is a discussion going on in this thread http://www.gamedev.net/community/forums/topic.asp?topic_id=185226 that is discussing using Genetic Algorithms to generate an efficient poly grouping hierarchy (specifically for culling, but I don't see why it couldn't be extended to collision and AI). It is being compared specifically to your approach with AABB-trees. I would love to hear your thoughts on this approach (either in this thread or the other one).
For those who are curious, here is an update on where I am with this engine since the talk.

I now have particle systems, where each particle does collision response with the world - bouncing sparks, smoke that hugs walls, etc.

I also implemented swept-sphere collision using Igor's library, for characters and boulders, etc. I do simpler Euler integration now, to a non-fixed frame rate. I move each object through the entire time-step, do collisions & bounces for it, then move on to the next object. I'm threatening to put in Verlet, and handle convex objects, ropes, and cloth, but not yet.
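The per-object update described above could be sketched like this, using the semi-implicit variant of Euler (velocity first). `Body` and its fields are hypothetical, and collision response is omitted:

```cpp
// One object moved through the full frame time-step with Euler integration.
struct Body {
    float pos[3];
    float vel[3];
};

void integrateEuler(Body& b, const float accel[3], float dt)
{
    for (int i = 0; i < 3; ++i) {
        b.vel[i] += accel[i] * dt;   // v' = v + a*dt
        b.pos[i] += b.vel[i] * dt;   // x' = x + v'*dt (semi-implicit)
    }
}
```

Updating velocity before position costs nothing extra and is noticeably more stable than plain explicit Euler for game-style simulation.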

I use fmod for sound, right now only ambient sounds.

I switched from a 3D grid to an octree. The main reason is to handle things falling out of the world nicely without wasting memory. Adding falling objects exposed this flaw in the grid.

I removed the averaged-L bump mapping and per-vertex shadowing. Mainly, I wanted proper shadows for trees, etc., and to get higher-detail shadows. My worlds were pretty highly tessellated as it was, and to have vertex shadows would have required clipping the shadows into the geometry. Also, I wanted opening and closing doors to be able to block lights, and moving trees would have been a no-no.

So, instead, I now do lightmap-like occlusion maps. There can be any # of lights touching a tile. I perform one raycast per lightmap texel, and light about a 2 texel border, with position & normal extrapolated out from the nearest triangle. I also modulate in distance attenuation. After each occlusion map is lit, I do a 3x3 box filter blur.
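The 3x3 box filter blur step could look like this minimal single-channel version; it is a sketch, not the engine's code, and it handles borders by simply skipping out-of-range taps:

```cpp
#include <vector>

// 3x3 box blur over a single-channel occlusion map of size w x h.
// Border texels average only the taps that fall inside the map.
std::vector<float> boxBlur3x3(const std::vector<float>& src, int w, int h)
{
    std::vector<float> dst(src.size());
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            float sum = 0.0f;
            int count = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    int sx = x + dx, sy = y + dy;
                    if (sx < 0 || sx >= w || sy < 0 || sy >= h)
                        continue;  // skip taps outside the map
                    sum += src[sy * w + sx];
                    ++count;
                }
            dst[y * w + x] = sum / static_cast<float>(count);
        }
    return dst;
}
```

Run once after baking, this softens the hard one-ray-per-texel occlusion edges into penumbra-like gradients.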

For dynamic shadows, I do a per-light, per-object shadow frustum. Each frustum finds world tiles that it is touching.

At runtime the loop is

1) draw z, ambient & store sunlight occlusion in alpha
2) Draw all sunlight shadow casters into 63x63 chunk of 512x512 shadow atlas
3) draw sunlight shadows into dest alpha for moving objects
4) render sunlight diffuse & specular bump mapping, blend with dest alpha shadow term

for each additional light

1) If light doesn't cast shadows, or no shadow casters nearby, render light diffuse & specular bump mapping, blend with occlusion map shadow in one pass

else

1) draw occlusion map into dest alpha
2) Draw all light shadow casters into 63x63 chunk of 512x512 shadow atlas
3) draw light shadows into dest alpha for moving objects
4) render light diffuse & specular bump mapping, blend with dest alpha shadow term
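The per-light branch above can be condensed into a tiny decision helper, purely to illustrate the pass counts; the function name is made up:

```cpp
// A light that casts no shadows (or has no casters nearby) collapses to a
// single lighting pass; a shadowed light needs the 4-step path described
// above (occlusion map to dest alpha, atlas render, moving shadows, lighting).
int passesForLight(bool castsShadows, int nearbyCasterCount)
{
    if (!castsShadows || nearbyCasterCount == 0)
        return 1;
    return 4;
}
```

The cheap branch is what makes baked occlusion maps pay off: most lights in a scene take the one-pass path most of the time.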

I did a few more optimizations, like creating a per-light index buffer of just the triangles that are partially lit by the light. This saves in the lighting & shadowing passes.
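Building a per-light index buffer of only the lit triangles might look like this sketch, where `litMask` stands in for whatever light-vs-triangle test the engine actually uses:

```cpp
#include <cstdint>
#include <vector>

// Copy out only the triangles flagged as touched by the light, so the
// lighting and shadowing passes skip unlit geometry entirely.
std::vector<std::uint16_t> buildLitIndexBuffer(
    const std::vector<std::uint16_t>& indices,  // 3 indices per triangle
    const std::vector<bool>& litMask)           // 1 flag per triangle
{
    std::vector<std::uint16_t> lit;
    const std::size_t triCount = indices.size() / 3;
    for (std::size_t t = 0; t < triCount; ++t)
        if (litMask[t]) {
            lit.push_back(indices[3 * t + 0]);
            lit.push_back(indices[3 * t + 1]);
            lit.push_back(indices[3 * t + 2]);
        }
    return lit;
}
```

Built once per static light, this trades a little memory for skipping most of the mesh in every additive lighting pass.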

Right now I'm getting ~90 fps with 2 shadowed lights on a GeForce FX 5700 (roughly GeForce4 Ti 4200 speed on this) in a large level with about 8 materials and 3 dynamic shadowing objects in view.

quote: Original post by SimmerD


Thanks a lot. Sounds like a flexible enough design. So, you trade dest alpha for the sunlight occlusion term, instead of depth for fog and water layers...?
Re: software vertex shaders on FFP hw (apologies for revisiting this again)
Doesn't using SW vshaders mean all T&L is done on the CPU, leaving the card's T&L unit idle?

Re: low-end hw in general
A *very* large percentage (80-90%) of the market in our region (poor south east asia) uses MX cards (GF2MX, GF4MX). I'd appreciate any suggestion on doing things optimally (looks as good as possible, runs fast) on these cards.
Yes, I took away the depth in dest-alpha for now. I may add it later, but on balance, fast sunlight shadows are more visual bang/buck compared to true distance fog.

What I may do instead is just another pass for foggy levels.

Although, since I'm just doing additive blending, I can probably use black fog, and just fog the 1st ambient pass.

If you are doing bump mapping, then using sw vertex shaders is almost certainly the way to go on these cards.

If you are not doing bump mapping, and using fixed-function lighting, it may be faster to use hw t&l with no vertex shaders.

I can't remember, are you planning on releasing the source to your engine?

