Dynamic lighting & sorting geometry to reduce switches

Hi, I've asked here several times about reducing texture switches and things like that. In most cases I just shouldn't worry that much, but since my scenes are getting more complex, I still wonder if I'm doing it right. In the ideal situation I'd have 1 VBO and a minimum of texture switches (by the way, does switching other parameters, such as vectors, also count as relatively "heavy"?). The problem is that my polygons can differ quite a lot:

* different shader
* different texture(s) / parameters for that shader
* different cubemap for reflections
* different assignment of dynamic lights

First I split the world up into "material groups". For example, everything that uses texture "wood01" is placed in group 1. Each group had a list of geometry, which could be a VBO as well. But later on I added cubemaps for reflections. I can have wood everywhere, but each polygon could potentially have a different cubemap for its reflections. So within each material group I made subgroups, sorted on cubemap. The number of polygons in each subgroup is now so small (2 polygons on average, sometimes more) that a VBO hardly seems worthwhile anymore, I suppose. So far everything runs: quite a lot of switching and no VBOs, but it runs.

But now yet another difficulty arrives: dynamic lighting. I've never done this before, so correct me if I say stupid things, but I suppose you have to tell each polygon/face which (nearby) lights it uses. Some faces might use only 1 light, others maybe 4, and so on. The lighting pass can differ as well: some materials use parallax mapping, others don't, so I get different lighting shaders too. As far as I know it's still not really possible to write a shader that loops through a variable number of lights. And how do you handle the different light types (omni, spot, directional, ...)?

Now I have the following sorting/render flow:

	< for each "room" (I do portal culling) >
	* First render the opaque stuff
	* Then render the transparent stuff

	1. Apply textures that are used for the entire room (the lightMap, for example).
	2. Geometry is sorted on shader program (normal mapping yes/no, cubeMap reflections yes/no, and so on).
		- The shader program is now activated for all upcoming subgroups.
	3. Each shader group has 1 or more "material groups". Each material is a set of parameters for the shader program (texture maps, vector parameters, ...).
		- The (texture) parameters are now activated in the current shader for all coming "faces".
	4. Each material group is divided into "faces" (pairs of polygons, such as a floor, wall or ceiling). Each face can have its own cubeMap texture, and is assigned to a couple of dynamic lights.
	5. Each face is rendered. Since the polygon count is low (often only 2), no VBOs are used; just simple (OpenGL) calls to draw the geometry.
		- Polygons use the current shader/material settings. This usually means lightMapping (with normals), diffuse texture mapping, maybe a cubeMap reflection, and possibly parallax mapping.
I don't know if this is the wisest way to do it, and the dynamic lighting hasn't been added yet. I could make it a step 6. In most cases the lighting pass(es) will need the same parameters (normal/specular/parallax map), but then I'd have to deactivate the current shader for each face, apply the lighting shader, and then switch back to the previous one. Another option would be to do the lighting afterwards: first do the basic (static) rendering, then do the whole thing again, but sorted on the lighting parameters per face/group.

Pfff... hard to explain. What I really want to know is: is this a proper way to do it? For example, how do games such as Doom3 or Far Cry handle all the sorting? I think I'm switching too much between shaders/textures, and I also don't use faster methods such as VBOs, or at least vertex arrays, because the sub-sub-subgroups only hold a few polygons in most cases...

Greetings,
Rick
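P.S. To make the sorting concrete, this is roughly what my grouping boils down to if I write it as one packed sort key (just a sketch; the bit widths and names are made up):

// Hypothetical packed sort key: the more expensive a state switch is,
// the higher its bits sit, so sorting/grouping by this key once gives
// the shader -> material -> cubeMap ordering from steps 2-4 above.
typedef unsigned long long SortKey;

SortKey MakeSortKey(unsigned shaderId,    // step 2: shader program
                    unsigned materialId,  // step 3: parameter set
                    unsigned cubeMapId)   // step 4: per-face cubeMap
{
    return ((SortKey)shaderId   << 40) |
           ((SortKey)materialId << 20) |
           ((SortKey)cubeMapId);
}

Since the level geometry is static, I'd only have to sort the faces by this key once.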
OK, here is what I do:

I sort the geometry by material (your material group), but not by texture. The reason is that materials are quite unique, while textures vary a lot. Another problem with textures is that most geometry uses more than one texture (multitexturing).

I'm not a hardware expert, but reducing the number of texture switches (not texture bind commands) by grouping your geometry into classes of texture 'sets' seems like a good compromise. Example:

You've got terrain, trees, characters, particles and water, each with their own set of textures. By rendering the terrain first, then trees, water, characters and particles, you only ever deal with a 'small' set of textures at a time, avoiding frequent texture uploads and benefiting the texture cache.

That is my way to reduce the number of texture switches.

A fast way of sorting is bucket/radix sort, perfectly suitable for sorting geometry by materials.

--
Ashaman
Well, I basically do that already. I sort on shader first, then on the "parameter set". Most of the shaders indeed require multiple textures, but the combination of those is the same in 99% of the cases; a diffuse texture won't be used with 3 different normal maps, or with really different number/vector parameters. Textures that are used for a lot of surfaces (a lightmap, for example) are activated before the whole thing, so that's not a real problem either.

It becomes nasty when adding dynamic lights. Luckily, most of the "dynamic" lights won't really move around, but I still need to tell each face/group/whatever which lights to use.

I was hoping SM3.0 would support looping through a variable number of lights. Then I could just do the normal (lightmapping à la Half-Life 2) rendering pass, and then do the lighting into the same texture. But it seems I need at least 2 passes, unless I want to make hundreds of different shaders... I could "auto-generate" those, but even with everything in 1 shader, I'd still need to change parameters quite a lot. Each face has different lighting settings, and therefore possibly also another shader that needs to be activated/deactivated.
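Something like this is what I mean by "auto-generating", by the way. Just a sketch; the #define names and the entry point are made up, but Cg lets you pass compiler arguments when creating a program:

// Sketch: compile one Cg source into many variants via #defines,
// instead of hand-writing hundreds of separate shaders.
#include <Cg/cg.h>
#include <cstdio>

CGprogram CompileVariant(CGcontext ctx, const char* source,
                         CGprofile profile, int numLights, bool parallax)
{
    char lightDef[32], parallaxDef[32];
    sprintf(lightDef,    "-DNUM_LIGHTS=%d", numLights);
    sprintf(parallaxDef, "-DPARALLAX=%d",   parallax ? 1 : 0);

    const char* args[] = { lightDef, parallaxDef, 0 };
    return cgCreateProgram(ctx, CG_SOURCE, source, profile, "main", args);
}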

Greetings,
Rick
I suggest that you don't SORT your geometry by material but that you GROUP it by material. The difference is that sorting is at best an O(n * log n) operation, while grouping is an O(n) operation.

Here's how grouping can be done very efficiently:
Give each material an ID and increment it with each material you create, something like:

class Material
{
public:  // public here only for brevity; the snippets below read lastId directly
  static int lastId;  // define once in a .cpp file: int Material::lastId = 0;
  int id;

  Material()
  {
    id = Material::lastId++;
  }
};


Separate your geometry into chunks of the same material and then give each of these chunks a next-pointer as well as the material id:

class GeometryChunk
{
public:  // public for brevity; the render loop below pokes at these directly
  int materialId;
  GeometryChunk* next;
};


Also, once all material-objects have been created, create a new array like this:

GeometryChunk** chunksByMaterial = new GeometryChunk*[Material::lastId + 1];



Now when you render your geometry do something like this (heavily simplified)

for (int i = 0; i <= Material::lastId; i++)
{
  chunksByMaterial[i] = NULL;
}

for (int i = 0; i < numChunks; i++)
{
  if (IsVisible(allChunks[i]))
  {
    int matId = allChunks[i]->materialId;
    allChunks[i]->next = chunksByMaterial[matId];
    chunksByMaterial[matId] = allChunks[i];
  }
}

for (int i = 0; i <= Material::lastId; i++)
{
  if (chunksByMaterial[i] == NULL)
  {
    continue;
  }
  StartMaterial(i);
  GeometryChunk* chunk = chunksByMaterial[i];
  while (chunk != NULL)
  {
    RenderChunk(chunk);
    chunk = chunk->next;
  }
  EndMaterial(i);
}


This is quite fast and doesn't require you to sort anything in advance. Plus, it reduces shader/material switches just like sorting would but with far less overhead.
Uhm, I think I already do that :) Sorting, grouping... everything is indeed grouped once. First everything with material "wood" is rendered, then "metal", and so on. I don't sort at runtime, since nothing in the geometry changes. So far it has worked. Maybe it's not the best grouping in all situations (sometimes everything uses the same shader, other times really not), but it has worked so far.

But now I want to add dynamic lighting as well, and that makes it difficult. Each polygon/face could have a different parameter set/shader/cubemap, and now also different light(s). All these possible combinations make it hard to sort, and I think I need to render multi-pass now; so far I did everything in 1 pass (static lighting with a lightMap is predictable). Although my lights might not even really "move around" (except for a flashlight, maybe), I'd get hundreds of different shaders: 1 light, 2 lights, with parallax, without parallax, omni/directional/spotlights, fog enabled/disabled, lightMap on/off, other specific effects...

Thanks for the tips guys,
Rick
There doesn't seem to be much more you can do to sort out fundamentally different shaders (parallax, normal mapping, etc.). I think the way you have that set up is about as good as you can get. At least, I'm pretty sure you don't want to flood your shaders with if statements that select the material...

As far as lighting with multiple dynamic lights goes, it really depends on your target hardware. If you're targeting SM3 or SM4, then I personally think you'd be better off making a "supershader" for each of your different materials. For example, you compute all the lights that affect an object on the CPU, then send down an array of these lights as shader constants and simply loop through them, accumulating their contribution. For complex shaders especially this is a plus because, in the case of parallax mapping, you avoid the costly raytrace being performed more than once.
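On the CPU side that could look roughly like this (just a sketch using the Cg runtime, since that's what you're working with; the parameter names and the 4-light limit are made up):

// Sketch: upload the lights affecting the current object as constant
// arrays; the shader then loops lightCount times and accumulates.
#include <Cg/cg.h>
#include <Cg/cgGL.h>

const int MAX_LIGHTS = 4;  // must match the array size in the shader

void UploadLights(CGprogram prog,
                  const float lightPos[MAX_LIGHTS][4],
                  const float lightColor[MAX_LIGHTS][4],
                  int lightCount)
{
    CGparameter pos   = cgGetNamedParameter(prog, "lightPos");
    CGparameter col   = cgGetNamedParameter(prog, "lightColor");
    CGparameter count = cgGetNamedParameter(prog, "lightCount");

    cgGLSetParameterArray4f(pos, 0, lightCount, &lightPos[0][0]);
    cgGLSetParameterArray4f(col, 0, lightCount, &lightColor[0][0]);
    cgGLSetParameter1f(count, (float)lightCount);
}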

The problem with the supershader idea is that 1) it won't work well on SM2 and SM1 hardware, and 2) it becomes difficult when you start using shadow maps. My solution for shadow maps is to not store each shadow map in a separate texture, but to tile them all in one big texture; then you don't have the problem of needing to bind a variable number of textures, you just bind the big tiled one and read from wherever you need. In my engine I limit the total number of lights done in one pass to 4, since the shadow map texture can only be so large; however, if you're using DirectX 10, you could use texture arrays and have all your lights and shadows done in a single pass.
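The tiling itself is cheap to set up. A sketch, assuming a 2048x2048 depth atlas holding four 1024x1024 tiles and rendered through an already-bound FBO (the scene-rendering helper is made up):

// Sketch: render each light's shadow map into one quadrant of the
// atlas, and return a scale/offset so the shader can remap its
// shadow UVs into that tile: uvAtlas = uvTile * scale + offset.
#include <GL/gl.h>

const int TILE = 1024;  // per-light shadow resolution (assumed)

struct TileRect { float scaleU, scaleV, offsetU, offsetV; };

TileRect RenderShadowTile(int lightIndex)  // 0..3
{
    int tx = lightIndex % 2;  // 2x2 layout
    int ty = lightIndex / 2;

    glViewport(tx * TILE, ty * TILE, TILE, TILE);
    RenderSceneDepthForLight(lightIndex);  // hypothetical helper

    TileRect r = { 0.5f, 0.5f, tx * 0.5f, ty * 0.5f };
    return r;
}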
If you like trying new things, you could create a huge texture atlas and do megatexturing; you'd get rid of texture switches for good. There is a thread about it nearby (http://www.gamedev.net/community/forums/topic.asp?topic_id=460053).



There is one more thing I'm thinking about: what if you stored some material properties in a texture? I mean the specular term and the like. In most cases it wouldn't consume much memory, because if you have, for example, a wooden mesh, you need a single specular value for the whole mesh and could put a 1x1 texture on it (stored in the texture atlas, of course). Then you'd only have to do shader switches, like normal mapping / parallax mapping and so on. If you're thinking about using more than four lights, I think deferred shading can solve the problem.
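Writing such a 1x1 value into an existing atlas is one call per material; a minimal sketch (the cell coordinates and channel layout are assumptions):

// Sketch: store one material's constants (e.g. specular term in the
// alpha channel) in its own 1x1 cell of an RGBA atlas texture.
#include <GL/gl.h>

void StoreMaterialConstant(GLuint atlasTex, int cellX, int cellY,
                           unsigned char r, unsigned char g,
                           unsigned char b, unsigned char specular)
{
    unsigned char texel[4] = { r, g, b, specular };
    glBindTexture(GL_TEXTURE_2D, atlasTex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, cellX, cellY, 1, 1,
                    GL_RGBA, GL_UNSIGNED_BYTE, texel);
}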
I wish I had 20 hands and 10 brains, then I could try all the possible solutions :)

I don't have that many dynamic lights: 0 to 3 or 4 (per face), I think. I also use a lightMap, so it's some sort of mixture (is that a smart idea? I don't know how else to get the nice radiosity effect). I don't aim for SM2 either. If I ever get this finished, I think SM23.0 will be out :). For now, it's just for me and my current card.

I like the atlas idea. I have no idea how megatexturing works, but does it mean I'd be able to put "as many textures as I want" into 1 atlas (if there were no memory limit)? That would certainly make life easier, although I don't know the additional cost of accessing those textures in a shader. And how about tiling? Most of my textures are tiled. Some info is indeed already combined in 1 texture; I use the alpha channel for heights or specular terms in many cases.

Deferred shading is also something I like. Since it's a post effect, it doesn't affect the sorting or multipassing. But I've never tried it... I heard you can get nasty problems with transparent surfaces...

About the looping... is that possible, then? It would indeed be really nice if I could use a "dynamic lighting loop". I could put it after all the basic shader programs and do everything in 1 pass. Reading the textures once, and doing the somewhat more expensive stuff like parallax only once, would indeed save energy as well. I recently tried some things with the newest version of Cg, but it did not support real loops. I could check per light whether it's enabled, but that's probably not a very fast approach. What I want is just "for (i=0; i<lightCount...)" where lightCount is a variable parameter. Unsized arrays weren't supported either, although I could just say I'll use at most 4 lights in a shader, for example (which still gives me fixed-size arrays). Am I doing something wrong here, is Cg not up to date with SM3.0, or is this just not (yet) possible?

Greetings,
Rick
I haven't much experience with the new Cg, but I know that when you're working with HLSL, you have to do a few hacks in the shader to get the loop to compile. Basically, instead of for (int i = 0; i < lightCount; i++), you fix the count at some high value like 8 and put an if inside the loop, like this:
for (int i = 0; i < 8; i++)
{
    if (lightEnabled[i])
    {
        // do lighting for light i and accumulate the result
    }
}

I think the reason that works while the regular loop doesn't is that the hacked for loop unrolls the if statement 8 times and does dynamic branching on the 8 lights, instead of dynamic branching inside the loop; I suppose shader model 3 cards don't like to branch dynamically inside a loop, or something...
But I'm no expert here; hopefully someone will step in and give a better answer.

As far as I know, though, the real unhacked loop works in SM4.
"Unrolled loops", that was the word I was looking for. Yeah, the Shader Model really should support that. But so far, I need to use your solution, if I want to do everything in 1 shader. And maybe a little bit more, I also need to check the type (spotlight, omnin, etc.) per source. That would be 4 if's or something in my case... A waste probably, I think in most cases at least 50% is used. I could also disable lights just by using a black color (which means no contribution), but that probably costs even more energy than a couple if's.

Well, the if is not that much of a problem. But it's another "heavy" technique that piles on top of so many other heavy techniques (HDR, DoF, complex shaders, ...). On the other hand, only 1 pass. I think I will use either this technique or deferred shading (if there are some tricks for the transparency).

But I still wonder how games such as Doom3 sort/group it. In those days, multipassing was necessary. So what was the "flow" of rendering a frame? I'd like to know that.

[edit]
Oh, and by the way: when doing multiple lights in 1 shader, how do you pass all the data from the vertex shader to the fragment shader? I mean, I was always limited to 4 x 8 numbers (8 texcoords available to put the data in). The 4 light directions will fit, but I need those slots for other stuff as well, especially when building that "uber shader". I haven't tried it lately, but I assume SM3.0 cards are still limited to 8 texture channels and 8 texcoords, right? I believe my GeForce 6600 was...

Thanks for the tips,
Rick

