Shader Model 3 compiler optimisation fail? [Solved]

This topic is 2662 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

As you all know, shader model 3.0 has 16 texture addresses and SM2 has 8. As soon as I found that out I got excited!

I started modifying my vertex data so I could pump more point lights into one pass. Then I hit a catch-22 that is confusing me.

I figured out that there is an #o buffer with a limit of approximately 28 values that can be stored in one array, so 9 float3s (3 x 9 = 27 values) is the most lights I can hold in one array. However, if I do what is shown below, it compiles successfully.

float3 LightDir[9] : TEXCOORD1;
float3 ViewDir : TEXCOORD10;
float3 LightDir2[5] : TEXCOORD11;

I assume that is because the memory is broken up happily. However, if I do this:

float3 LightDir[9] : TEXCOORD1;
float3 LightDir2[5] : TEXCOORD10;
float3 ViewDir : TEXCOORD15;

I get: Error X5629: Invalid register number: 16. Max allowed for #o register is 11. That is similar to the error I get when dumping all 14 lights into one array. Is this optimisation failing?

I haven't set up a mechanism to read from LightDir2 when the light index goes above 9, so I cannot guarantee it would work the first way, but all the green lights are there. Can anyone tell me what is happening here, or validate my theory that it is trying to group the data into one storage component on the GPU and dislikes that?

Until I am sure what is going wrong here I don't want to use a second array to push all my point lights through in one go, especially if it is avoidable or dangerous! So I am going to wait for some info on what is going on here.

[Edited by - EnlightenedOne on September 2, 2010 12:03:34 PM]

The same is true of my projective texture shader. The declaration below fails, but if I make the arrays hold 4 elements it works fine.

float4 Position : POSITION;
float2 TexCoord0 : TEXCOORD0;
float4 TexCoordProj[6] : TEXCOORD1;
float3 ProjVec[6] : TEXCOORD7;
float3 ViewDir : TEXCOORD14;
float4 ScreenCoord : TEXCOORD15;

So this verifies that the problem is with arrays declared one after the other; I get it even if they are different sizes.

float4 * 6 = 24 values, within the estimated 28-value limit
float3 * 6 = 18 values, within the estimated limit

But what it must be doing is combining those two totals into 24 + 18 = 42, the meaning of life.

With 4 elements per array:

4 * 4 = 16
3 * 4 = 12

16 + 12 = 28, the estimated limit for values in a row!

Together these two stay within the limit of 28 values in arrays next to one another. So my first question is: why on earth does it care about arrays next to each other? My second question is: why can't the GPU split the stored array data across two places so it can hold more than 28 values per array? Working around these things is going to make my shaders very convoluted! If I am honest, I am just stoked to have more generic textures to pass values into.
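
For what it is worth, the numbers also line up if you count registers rather than values. This is only a hedged sketch, assuming that every semantic index (including each array element, whether float3 or float4) takes one whole vs_3_0 output register and that there are 12 of them (o0 to o11, as the X5629 message in the first post suggests). Under that assumption the 6-element version needs 1 + 1 + 6 + 6 + 1 + 1 = 16 registers, while the 4-element version needs exactly 12. The struct name and the renumbered TEXCOORD indices below are illustrative, not the original shader.

```hlsl
// Hypothetical 4-element variant of the projective texture output struct.
// Assumption: one output register per semantic index, 12 registers in total.
struct VSOutputProj
{
    float4 Position        : POSITION;   // 1 register
    float2 TexCoord0       : TEXCOORD0;  // 1 register
    float4 TexCoordProj[4] : TEXCOORD1;  // 4 registers (TEXCOORD1..4)
    float3 ProjVec[4]      : TEXCOORD5;  // 4 registers (TEXCOORD5..8); a float3 still uses a full register
    float3 ViewDir         : TEXCOORD9;  // 1 register
    float4 ScreenCoord     : TEXCOORD10; // 1 register
};                                       // total: 12 registers
```

If this reading is right, the size of each array only matters through the number of registers it consumes, not through a 28-value cap.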

Unfortunately for me, the total number of things you can store in an array also applies to global values.

float4 LightPos[6]; //Light position in world space
float4 LightPos2[3]; //Light position in world space
float4 LightColor[6]; //Color of the light
float4 LightColor2[3]; //Color of the light

This will fly, but if I have two arrays with 9 elements in them I am stuck. The ironic thing is that I can get around this really easily by creating secondary arrays. That does not change the fact that I am confused as to why I have to in the first place.

Fortunately, in global declarations I do not need a buffer between arrays to stop them being combined. So I don't actually gain any extra point lights without making secondary arrays. Until I know more about this oddity I won't gamble on it. If you know what the deal is here, please let me know :)

I mean, I am not wrong here, am I? Passing all the geometry over to the GPU again to draw a technique is much more expensive than working around the compiler's rules for arrays, given that it will accept more data as long as I feed it in painfully. Better that two arrays make up the data for one pass than that I run two renders to the screen one after the other.

The only exception to the 28-component rule is this massive array:
float4x4 TexTransform[4]; //Texture transformation matrix

If I have to, I will load all my data into these and then "decode" it in my pixel shader to bypass this limitation, now that I have the generic texture buffers to do it. But why do I have to?

It's definitely a limit of around 27-odd values. So I am just going to split the lights into two [7] arrays to use up all the spare textures available and "optimise" my passes for point lights.

OK, that didn't work, so I tried it with 6 in the arrays and that failed too. So I tried to rearrange things to get what I wanted. It seems that however I try to get things past the compiler, it fails if I intend to use more than 8 textures to draw my object.

float3 LightDir[6] : TEXCOORD1;
float3 ViewDir : TEXCOORD8; //Splits up lights data to stop optimisation crash.
float3 LightDir2[3] : TEXCOORD9;

Ah, just found this: http://msdn.microsoft.com/en-us/library/bb205573(VS.85).aspx

It seems the limit is not 8 as I originally thought, or 16, but in fact 11. The failures were not related to array size; I was trying to break that limit, and in some instances the compiler only picked up that I was attempting something illegal when I actually tried to use the extra registers :p
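
One way to squeeze more lights through a fixed number of registers is to stop wasting the unused w component of each float3 slot. The following is a minimal sketch, not the method used in this thread: it assumes each TEXCOORD index occupies one four-component register, and the struct and helper names (`VSOutputPacked`, `PackLightDirs`, `UnpackFourthDir`) are hypothetical. Four light directions are stored in three float4 registers, so eight lights need only TEXCOORD1 to TEXCOORD6.

```hlsl
// Hypothetical packed layout: 8 light directions in 6 registers instead of 8.
struct VSOutputPacked
{
    float4 Position     : POSITION;
    float2 TexCoord0    : TEXCOORD0;
    float4 PackedDir[6] : TEXCOORD1; // TEXCOORD1..6, two groups of 3 registers
    float3 ViewDir      : TEXCOORD7;
};

// Vertex shader side: d0..d2 fill the xyz of three registers,
// and the components of d3 are spread across their w channels.
void PackLightDirs(float3 d0, float3 d1, float3 d2, float3 d3,
                   out float4 r0, out float4 r1, out float4 r2)
{
    r0 = float4(d0, d3.x);
    r1 = float4(d1, d3.y);
    r2 = float4(d2, d3.z);
}

// Pixel shader side: the first three directions are simply r0.xyz, r1.xyz
// and r2.xyz; the fourth is rebuilt from the three w channels.
float3 UnpackFourthDir(float4 r0, float4 r1, float4 r2)
{
    return float3(r0.w, r1.w, r2.w);
}
```

Because interpolation is applied per component, the packed values arrive in the pixel shader the same as if they had been passed in separate registers. Note that the pixel shader has its own input register limit (I believe ps_3_0 exposes 10 input registers, but check the register reference for your target profile), so the vertex shader output count is not the only constraint.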
