Medo Mex

Shader Unlimited Lights

Medo Mex    891

I don't know how many lights will be used in the scene, so I'm looking for a way to pass any number of lights.

I thought that I could create an array with a high number of lights in the shader and then tell the shader how many lights were actually passed for the light calculation, for example:

// Shader
#define MAX_LIGHTS 1000 // *** I don't know how many lights, so I will assume 1000 is the maximum possible ***
PointLight pointLights[MAX_LIGHTS];
DirectionalLight directionalLights[MAX_LIGHTS];
SpotLight spotLights[MAX_LIGHTS];

float numOfLights;

// Then loop in the shader
for (int i = 0; i < numOfLights; i++)
{
     // Code to calculate light here...
}

Is this way a good idea?

Edited by Medo3337

MJP    19754

D3D9 pixel shaders have a major limitation: they can't dynamically index into shader constants. This means the compiler can't use an actual loop construct in assembly to implement your for loop; instead it has to unroll it and do something like this:

 

if(numLights > 0)
    CalcLight(Light0);
 
if(numLights > 1)
    CalcLight(Light1);
 
if(numLights > 2)
    CalcLight(Light2);
 
// ...and so on

The only way to dynamically index into your light data would be to use textures to store your light properties. This is a major reason why shader permutations were very popular for this era of hardware.

 

Now even if you get this working, you have to keep in mind that it's very important for performance to not just blindly apply every light to every mesh that you draw. Even a dozen or so lights can be a huge strain on the GPU if those lights affect every pixel on the screen. This means either doing work on the CPU to selectively bind lights to certain meshes/draw calls, or switching to a deferred approach that uses the GPU to figure out which lights affect which pixels.
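As a rough illustration of the CPU-side option, here is a minimal C++ sketch that keeps only the lights whose range touches a mesh and then takes the nearest few. The `Sphere` type, the function names, and the "nearest first" ranking are assumptions made for this example, not part of any particular engine.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical bounding volume: used for both the mesh bounds and a light's range.
struct Sphere {
    float x, y, z;
    float radius;
};

// Squared distance between two sphere centres.
inline float centerDistanceSq(const Sphere& a, const Sphere& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Two spheres overlap when the centre distance is at most the sum of the radii.
inline bool intersects(const Sphere& a, const Sphere& b) {
    const float r = a.radius + b.radius;
    return centerDistanceSq(a, b) <= r * r;
}

// Returns the indices of at most maxLights lights that touch the mesh bounds,
// ordered nearest first.
std::vector<std::size_t> selectLights(const Sphere& meshBounds,
                                      const std::vector<Sphere>& lightBounds,
                                      std::size_t maxLights) {
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < lightBounds.size(); ++i) {
        if (intersects(meshBounds, lightBounds[i])) hits.push_back(i);
    }
    const std::size_t keep = std::min(maxLights, hits.size());
    std::partial_sort(hits.begin(),
                      hits.begin() + static_cast<std::ptrdiff_t>(keep),
                      hits.end(),
                      [&](std::size_t a, std::size_t b) {
                          return centerDistanceSq(meshBounds, lightBounds[a]) <
                                 centerDistanceSq(meshBounds, lightBounds[b]);
                      });
    hits.resize(keep);
    return hits;
}
```

The returned indices would then be uploaded as the light constants for that draw call. Ranking by distance alone is a simplification; a real engine would probably also weigh brightness and on-screen size.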

 

Medo Mex    891

@belfegor: That wouldn't be efficient. Some important lights might end up not affecting the mesh, since I wouldn't be able to tell with 100% certainty which ones are important. What if more than 4 light sources affect the mesh?

Medo Mex    891

@Migi0027: Don't you think that I can just calculate the distance between the light source and the mesh? I guess that's easier and faster...

 

What if more than 4 lights affect the mesh while the shader only allows 4 lights per pass?

Medo Mex    891
MJP wrote: "D3D9 pixel shaders have a major limitation, which is that they can't dynamically index into shader constants. This means that it can't use an actual loop construct in assembly to implement your for loop, instead it has to unroll it and do something like this:"

It's possible to use a loop with D3D9, take a look at this sample:

http://www.dhpoware.com/demos/d3d9NormalMappingWithManyLights.html

Migi0027    4628

 

 

Medo Mex wrote: "What if more than 4 lights affect the mesh while the shader only allows 4 lights per pass?"

If you don't want to go the deferred route, this might suit you (very rough code):

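void RenderMyMesh(Mesh *m, PointLights *p, int pointCount)
{
   // Base pass: draw the mesh once with the diffuse shader
   ApplyDiffuseShader();

   m->Prepare();
   diffuseShader.Render();

   // Light passes: each one is blended additively on top of the base pass
   EnableAdditiveBlending();
   ApplyPointLightShader();
   for (int i = 0; i < pointCount; i++)
   {
      // Skip lights whose range does not reach the mesh
      if (intersects(m->boundingSphere, p[i].boundingSphere))
      {
          pointShader.setLight(p[i]);
          pointShader.Render();
      }
   }
}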

What I do here:

  • Render the mesh in diffuse mode
  • Enable additive blending for the lights
  • Loop over all lights
  • Check if they matter
  • If they do, render them.

 

I think this is how it works, but I've never used this method myself, so this is just from memory.
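If the light shader can take several lights per pass (say 4) instead of one, the same idea extends to batches. Below is a minimal C++ sketch of how the lights that affect a mesh could be split into additive passes; `LightBatch` and `buildLightPasses` are hypothetical names for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One additive pass: covers lights [first, first + count) in the list of
// lights that affect the mesh.
struct LightBatch {
    std::size_t first;
    std::size_t count;
};

// Splits lightCount lights into passes of at most lightsPerPass lights each.
std::vector<LightBatch> buildLightPasses(std::size_t lightCount,
                                         std::size_t lightsPerPass) {
    std::vector<LightBatch> passes;
    if (lightsPerPass == 0) return passes;  // no sensible split exists
    for (std::size_t first = 0; first < lightCount; first += lightsPerPass) {
        const std::size_t count = std::min(lightsPerPass, lightCount - first);
        passes.push_back({first, count});
    }
    return passes;
}
```

With 10 affecting lights and 4 lights per pass this gives three passes of 4, 4 and 2 lights, each drawn with additive blending after the base pass. Fewer, fatter passes cost fewer draw calls, but the shader then has to handle a partially filled last batch.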

 

 

 

Medo Mex wrote: "Don't you think that I can just calculate the distance between the light source and the mesh? I guess that's easier and faster..."

You're moving a bit too fast there. Think about this (I made it in 60 seconds, so there may be errors):

[Image: 30uvdhz.png]

Medo Mex    891
Okay, how about I use 8 lights for each type of light?
 
Is that good?
#define MAX_LIGHTS 8
PointLight pointLights[MAX_LIGHTS];
DirectionalLight directionalLights[MAX_LIGHTS];
SpotLight spotLights[MAX_LIGHTS];

In 99% of cases, the number of lights affecting a single mesh would never exceed 8.

Edited by Medo3337

Adam_42    3629

Do you really need that many directional lights? I suspect you can probably get away with just one for sunlight.

 

Does 8 lights per object look any better than 4 lights per object in the scenes you'll be rendering?

 

What's the performance like with 8 lights on your target hardware spec?

 

Essentially what I'm saying is that whatever number you come up with may have to change later. Don't worry too much about that yet, other than to make sure you can easily adjust it later when you know what your actual requirements are.

Chris_F    3030

Unless I'm mistaken, you've never actually stated you are working exclusively with D3D9, so if you are so concerned with the number of lights you can support, then why not use deferred shading or some type of tiled forward renderer? Even if you are limited to D3D9, you could still use deferred shading.

Medo Mex    891

@Chris_F: I'm not planning to get into deferred shading right now, and I would run into some problems with it (for example, transparent meshes).

 

@Adam_42: Hmm... I could set the maximum to 2 for directional lights and 8 for the other light types. Do you think that's okay?

 

I think even 10 should be fine, because I will only calculate the lights that I passed to the shader.

MJP    19754

 

Medo Mex wrote: "It's possible to use a loop with D3D9, take a look at this sample: http://www.dhpoware.com/demos/d3d9NormalMappingWithManyLights.html"

 

Have you looked at the generated assembly? It looks like this:

ps_3_0
def c46, -4, -5, -6, -7
def c47, 0, 1, 2, 3
dcl_texcoord v0.xyz
dcl_texcoord1 v1.xy
dcl_texcoord2 v2.xyz
dcl_texcoord3 v3.xyz
dcl_2d s0
nrm r0.xyz, v3
dp3 r0.w, v2, v2
rsq r0.w, r0.w
mov r1, c47.x
mov r2.x, c47.x
rep i0
  add r3, r2.x, -c47
  add r4, r2.x, c46
  mov r5.x, c47.x
  cmp r2.yzw, -r3_abs.x, c0.xxyz, r5.x
  cmp r2.yzw, -r3_abs.y, c5.xxyz, r2
  cmp r2.yzw, -r3_abs.z, c10.xxyz, r2
  cmp r2.yzw, -r3_abs.w, c15.xxyz, r2
  cmp r2.yzw, -r4_abs.x, c20.xxyz, r2
  cmp r2.yzw, -r4_abs.y, c25.xxyz, r2
  cmp r2.yzw, -r4_abs.z, c30.xxyz, r2
  cmp r2.yzw, -r4_abs.w, c35.xxyz, r2
  add r2.yzw, r2, -v0.xxyz
  cmp r5.y, -r3_abs.x, c4.x, r5.x
  cmp r5.y, -r3_abs.y, c9.x, r5.y
  cmp r5.y, -r3_abs.z, c14.x, r5.y
  cmp r5.y, -r3_abs.w, c19.x, r5.y
  cmp r5.y, -r4_abs.x, c24.x, r5.y
  cmp r5.y, -r4_abs.y, c29.x, r5.y
  cmp r5.y, -r4_abs.z, c34.x, r5.y
  cmp r5.y, -r4_abs.w, c39.x, r5.y
  rcp r5.y, r5.y
  mul r2.yzw, r2, r5.y
  dp3 r5.y, r2.yzww, r2.yzww
  add r5.z, -r5.y, c47.y
  max r6.x, r5.z, c47.x
  rsq r5.y, r5.y
  mul r2.yzw, r2, r5.y
  mad r5.yzw, v2.xxyz, r0.w, r2
  nrm r7.xyz, r5.yzww
  dp3_sat r2.y, r0, r2.yzww
  dp3_sat r2.z, r0, r7
  pow r5.y, r2.z, c44.x
  cmp r7, -r3_abs.x, c1, r5.x
  cmp r7, -r3_abs.y, c6, r7
  cmp r7, -r3_abs.z, c11, r7
  cmp r7, -r3_abs.w, c16, r7
  cmp r7, -r4_abs.x, c21, r7
  cmp r7, -r4_abs.y, c26, r7
  cmp r7, -r4_abs.z, c31, r7
  cmp r7, -r4_abs.w, c36, r7
  mad r7, r6.x, r7, c45
  cmp r8, -r3_abs.x, c2, r5.x
  cmp r8, -r3_abs.y, c7, r8
  cmp r8, -r3_abs.z, c12, r8
  cmp r8, -r3_abs.w, c17, r8
  cmp r8, -r4_abs.x, c22, r8
  cmp r8, -r4_abs.y, c27, r8
  cmp r8, -r4_abs.z, c32, r8
  cmp r8, -r4_abs.w, c37, r8
  mul r8, r8, c41
  mul r8, r2.y, r8
  mul r8, r6.x, r8
  mad r7, c40, r7, r8
  cmp r8, -r3_abs.x, c3, r5.x
  cmp r8, -r3_abs.y, c8, r8
  cmp r8, -r3_abs.z, c13, r8
  cmp r3, -r3_abs.w, c18, r8
  cmp r3, -r4_abs.x, c23, r3
  cmp r3, -r4_abs.y, c28, r3
  cmp r3, -r4_abs.z, c33, r3
  cmp r3, -r4_abs.w, c38, r3
  mul r3, r3, c43
  mul r3, r5.y, r3
  cmp r3, -r2.y, c47.x, r3
  mad r3, r3, r6.x, r7
  add r1, r1, r3
  add r2.x, r2.x, c47.y
endrep
texld r0, v1, s0
mul oC0, r0, r1

Because of the constant indexing limitation it has to do a compare and select for every single constant register. It's just a different variant of what I mentioned. Basically it's like doing this:

for(uint i = 0; i < NumLights; ++i)
{
    float3 LightPos = Lights[0].Position;
    if(i == 1)
        LightPos = Lights[1].Position;
    else if(i == 2)
        LightPos = Lights[2].Position;
    else if(i == 3)
        LightPos = Lights[3].Position;
    ...
    else if(i == 7)
        LightPos = Lights[7].Position;
        
    float3 LightColor = Lights[0].Color;
    if(i == 1)
        LightColor = Lights[1].Color;
    else if(i == 2)
        LightColor = Lights[2].Color;
    else if(i == 3)
        LightColor = Lights[3].Color;
    ...
    else if(i == 7)
        LightColor = Lights[7].Color;
        
    // and so on
}
Edited by MJP

unbird    8336

One can break the constant limit - and force the use of a proper loop - in SM 3.0 by encoding the data in textures, using dynamic textures and uploading your data with LockRectangle (LockRect in the native D3D9 API).

 

*digging up my D3D9 stuff*

 

This is a quick check with 100 point lights. The texture is 4-channel float, 3 texels are needed per point light. 

 

[Image: 83d765266027189.jpg]

 

Is this a good idea? I doubt it. Though I'm surprised it still runs smoothly, I wonder how it scales (this was just one model). I'd rather go with a deferred approach and cull the lights by distance/visibility, as others suggested.
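For anyone curious how the "3 texels per point light" packing might look on the CPU side, here is a minimal C++ sketch. The field layout (position plus radius, diffuse, specular) is only an assumption for illustration; the post does not say what the three texels hold. `Float4`, `PointLight`, `packLights` and `texelCenterU` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// One RGBA32F texel (D3DFMT_A32B32G32R32F in D3D9).
struct Float4 { float x, y, z, w; };

// Hypothetical point light; the choice of fields is an assumption.
struct PointLight {
    float position[3];
    float radius;
    float diffuse[4];
    float specular[4];
};

constexpr std::size_t kTexelsPerLight = 3;

// Flattens the lights into consecutive texels:
//   texel 0 = (position.xyz, radius), texel 1 = diffuse, texel 2 = specular.
// The result is what would be copied into the locked texture memory.
std::vector<Float4> packLights(const std::vector<PointLight>& lights) {
    std::vector<Float4> texels;
    texels.reserve(lights.size() * kTexelsPerLight);
    for (const PointLight& l : lights) {
        texels.push_back({l.position[0], l.position[1], l.position[2], l.radius});
        texels.push_back({l.diffuse[0], l.diffuse[1], l.diffuse[2], l.diffuse[3]});
        texels.push_back({l.specular[0], l.specular[1], l.specular[2], l.specular[3]});
    }
    return texels;
}

// U coordinate of the centre of texel `index` in a row `width` texels wide,
// which is what the shader would sample to read that texel exactly.
inline float texelCenterU(std::size_t index, std::size_t width) {
    return (static_cast<float>(index) + 0.5f) / static_cast<float>(width);
}
```

In the pixel shader, light i would then read texels 3*i, 3*i+1 and 3*i+2, most likely with tex2Dlod and point filtering so the fetch works inside a dynamic loop.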

Migi0027    4628

It may not be a good idea, but it's definitely interesting!

unbird    8336

Off topic:

 

It has its uses, e.g. here it's used for skinning a massive number of characters (vertex texture fetch, not pixel shader this time). This approach is also useful if you have a model with a hell of a lot of bones.

I'm posting this since you are using D3D11: the constant limit is considerably higher, and even if you do hit it, you don't need to use this approach directly. Have a look at structured buffers. They are quite convenient to use: you just define the struct in both the shader and C++ and access it with array syntax. Under the hood, something similar to the above is happening: a "texture buffer" is used (HLSL register t#).
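To make the "define the struct in both shader and C++" part concrete, here is a minimal C++ sketch of the CPU side. The light fields and the names `PointLightGpu`, `StructuredBufferSize` and `sizeFor` are assumptions for this example; the HLSL counterpart is shown only as a comment.

```cpp
#include <cstddef>
#include <cstdint>

// CPU-side mirror of a hypothetical HLSL declaration:
//   struct PointLight { float3 position; float radius; float3 color; float intensity; };
//   StructuredBuffer<PointLight> gPointLights : register(t0);
struct PointLightGpu {
    float position[3];
    float radius;
    float color[3];
    float intensity;
};

// Both sides must agree on the element size and member offsets.
static_assert(sizeof(PointLightGpu) == 32, "must match the HLSL struct size");
static_assert(offsetof(PointLightGpu, color) == 16, "color must start at byte 16");

// The two size fields that would be copied into D3D11_BUFFER_DESC
// (ByteWidth and StructureByteStride) when creating the buffer with the
// D3D11_RESOURCE_MISC_BUFFER_STRUCTURED misc flag.
struct StructuredBufferSize {
    std::uint32_t byteWidth;
    std::uint32_t structureByteStride;
};

inline StructuredBufferSize sizeFor(std::size_t lightCount) {
    const std::uint32_t stride = static_cast<std::uint32_t>(sizeof(PointLightGpu));
    return {static_cast<std::uint32_t>(lightCount) * stride, stride};
}
```

For 100 lights this works out to a 3200-byte buffer with a 32-byte stride. In the shader the light count would still have to be passed separately (or queried), since the array syntax alone does not bound the loop.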

Medo Mex    891

@Steve_Segreto: So, if I don't want to run into performance issues on most graphics cards out there, and I don't want to run into constant limit problems, what is the maximum number that I should use for point, spot, and directional lights in the shader arrays?

Edited by Medo3337

Steve_Segreto    2080

I believe for most graphics cards out there you should try to stay under 256 float constant registers (the vertex shader minimum in shader model 2.0). Pixel shaders have fewer: 32 in ps_2_0 and 224 in ps_3_0.

Medo Mex    891

@Steve_Segreto: Even if there are many variables?

 

Example:

LightDirectional directionalLight[8];
PointLight pointLight[100];
SpotLight spotLight[100];

Basically, I'm trying to set the maximum number possible.

osmanb    2082

How complex is your lighting model? How much math are you planning to do per light? How many grains of sand are there on a beach? These questions can't be answered in a vacuum, which is why people are trying to steer you to figure out the right answer (or create scalable solutions) for yourself. If I say: "You can support exactly 8 point lights in a SM3 shader", while assuming my particular lighting equation, that doesn't mean anything. Plus, you need to budget the rest of the frame (shadows, AO, defocus, particles, whatever...). How much GPU time do you want to spend rendering lights? What's your target hardware spec? More importantly - how many lights do you actually need to make your target content look good? Everything else is irrelevant noise.

Medo Mex    891

@osmanb: I'm creating a game engine, so I can't know how many lights each game or scene will include.

 

So basically, I'm trying to set the maximum possible. I believe 8 lights per mesh should be enough for most games, but if I could allow more, that would be better.

What I'm concerned about is that I could have a huge building (one mesh) that is affected by many lights (even more than the shader's maximum).

Edited by Medo3337

Steve_Segreto    2080

Medo Mex wrote: "@Steve_Segreto: Even if there are many variables?

Example:

LightDirectional directionalLight[8];
PointLight pointLight[100];
SpotLight spotLight[100];

Basically, I'm trying to set the maximum number possible."

Hi Medo,

 

First convert the maximum number of floating-point constant registers available for your shader model from a count of float4s to a count of bytes (16 per register). Then take sizeof( LightDirectional ) in bytes multiplied by 8, plus sizeof( PointLight ) in bytes multiplied by 100, plus sizeof( SpotLight ) in bytes multiplied by 100 (matching the array sizes in your declaration). Subtract that total from the number of bytes available for variables/constants; if the result is not negative, the declaration should compile within that shader model. Or you could just run fxc from the command line and try it out :)

 

Also if you were confused because I said 256 const float registers, I was just using some terminology I've seen in DX9 docs, I believe those same registers are used for uniform externs that you can set from a C++ program (i.e. variables)
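The arithmetic above can be sketched in a few lines of C++. The struct sizes here are made up for illustration (every member is assumed to be padded to a full float4), so plug in the real sizes of your own structs; the names `registersFor`, `registersNeeded` and `fitsInBudget` are hypothetical.

```cpp
#include <cstddef>

constexpr std::size_t kBytesPerRegister = 16;  // one float4 constant register

// Registers used by an array of `count` elements of `elementBytes` each.
// Each element is rounded up to whole registers before multiplying.
constexpr std::size_t registersFor(std::size_t elementBytes, std::size_t count) {
    return ((elementBytes + kBytesPerRegister - 1) / kBytesPerRegister) * count;
}

// Assumed struct sizes in bytes (hypothetical; measure your own structs).
constexpr std::size_t kDirectionalBytes = 48;  // 3 float4 members
constexpr std::size_t kPointBytes = 80;        // 5 float4 members
constexpr std::size_t kSpotBytes = 112;        // 7 float4 members

// Array lengths for the three light types.
struct LightArraySizes {
    std::size_t directional;
    std::size_t point;
    std::size_t spot;
};

// Total constant registers consumed by the three light arrays.
constexpr std::size_t registersNeeded(const LightArraySizes& n) {
    return registersFor(kDirectionalBytes, n.directional) +
           registersFor(kPointBytes, n.point) +
           registersFor(kSpotBytes, n.spot);
}

// True when the light arrays alone fit in the given register budget.
constexpr bool fitsInBudget(const LightArraySizes& n, std::size_t availableRegisters) {
    return registersNeeded(n) <= availableRegisters;
}
```

With these assumed sizes, arrays of 8, 100 and 100 lights need 1224 registers, far beyond the 224 available to a ps_3_0 pixel shader, while 2, 8 and 8 need 102 and fit. Keep in mind the budget is shared with every other constant the shader uses (matrices, material values and so on), so the compiler remains the final judge.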
