Sinaster

Converting 2.0 to 3.0 shaders


I have tried to convert 2.0 shaders to 3.0 shaders so that they use fewer instructions. They work. Is it possible to save more instructions here?

Refraction Shader 2.0:

// Src0 s0 1
// Src1 s1 1
ps_2_x
def c0, -0.5, 2, 0, 1
dcl t0.xy
dcl t1.xy
dcl_2d s0
dcl_2d s1
texld r0, t1, s1
add r0.w, r0.x, c0.x
mul r1.w, r0.z, r0.w
add r0.w, r0.y, c0.x
mad r1.x, c0.y, -r1.w, t0.x
mul r0.w, r0.z, r0.w
mad r1.y, c0.y, r0.w, t0.y
texld r0, r1, s1
texld r1, r1, s0
texld r2, t0, s0
mul r0.w, r0.w, r0.w
cmp r3.w, -r0.w, c0.z, c0.w
lrp r0, r3.w, r1, r2
mov oC0, r0
// approximately 15 instruction slots used (4 texture, 11 arithmetic)

------------------------------------

// texRatio0 c6 1
// texRatio1 c7 1
vs_1_1
dcl_position v0
dcl_texcoord v1
mad oT0.xy, v1, c6, c6.zwzw
mad oT1.xy, v1, c7, c7.zwzw
mov oPos, v0
// approximately 3 instruction slots used

------------------------------------

Refraction Shader 3.0:

// Src0 s0 1
// Src1 s1 1
ps_3_0
def c0, -0.5, 2, 0, 1
dcl_texcoord v0.xy
dcl_texcoord1 v1.xy
dcl_2d s0
dcl_2d s1
texld r0, v1, s1
add r0.w, r0.x, c0.x
mul r1.w, r0.z, r0.w
add r0.w, r0.y, c0.x
mad r1.x, c0.y, -r1.w, v0.x
mul r0.w, r0.z, r0.w
mad r1.y, c0.y, r0.w, v0.y
texld r0, r1, s1
texld r1, r1, s0
texld r2, v0, s0
mul r0.w, r0.w, r0.w
cmp r3.w, -r0.w, c0.z, c0.w
lrp oC0, r3.w, r1, r2
// approximately 13 instruction slots used (4 texture, 9 arithmetic)

-------------------------------

// texRatio0 c6 1
// texRatio1 c7 1
vs_3_0
dcl_position v0
dcl_texcoord v1
dcl_position o0
dcl_texcoord o1.xy
dcl_texcoord1 o2.xy
mad o1.xy, v1, c6, c6.zwzw
mad o2.xy, v1, c7, c7.zwzw
mov o0, v0
// approximately 3 instruction slots used

-------------------------------------------------

It would be nice if someone could help me.


Hi,

Have you considered using a high-level shader language and letting the compiler do the optimizations?

Cheers!

Well, no. I only have the 2.0 and 1.1 shaders in this language (ASM?).
I compared 2.0 and 3.0 shaders and optimized mine that way. I use GPU Shader Analyzer to check for errors, and then I test them. ;)

I'd recommend you use HLSL to write the 3.0 shaders. The optimizer works very well and will probably save you a few instructions over hand-written assembly. To show you an example, I converted that first shader to HLSL and compiled it.

Unless I made a silly mistake somewhere, it has saved you 2 instructions (13 slots instead of the 15 of the 2.0 version).

Command line:

fxc /Tps_3_0 /O3 filename.fx


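const float4 c0 = float4(-0.5f, 2.0f, 0.0f, 1.0f);

struct PS_OUTPUT
{
    float4 oC0 : COLOR0;
};

// Line-by-line translation of the ps_2_x shader. Each statement is
// annotated with the assembly instruction it mirrors.
PS_OUTPUT main(float2 t0 : TEXCOORD0, float2 t1 : TEXCOORD1, sampler s0, sampler s1)
{
    PS_OUTPUT result;
    float4 r0, r1, r2, r3;

    r0 = tex2D(s1, t1);                     // texld r0, t1, s1
    r0.w = r0.x + c0.x;                     // add r0.w, r0.x, c0.x
    r1.w = r0.z * r0.w;                     // mul r1.w, r0.z, r0.w
    r0.w = r0.y + c0.x;                     // add r0.w, r0.y, c0.x
    r1.x = c0.y * -r1.w + t0.x;             // mad r1.x, c0.y, -r1.w, t0.x
    r0.w = r0.z * r0.w;                     // mul r0.w, r0.z, r0.w
    r1.y = c0.y * r0.w + t0.y;              // mad r1.y, c0.y, r0.w, t0.y

    r0 = tex2D(s1, r1.xy);                  // texld r0, r1, s1
    r1 = tex2D(s0, r1.xy);                  // texld r1, r1, s0
    r2 = tex2D(s0, t0);                     // texld r2, t0, s0

    r0.w *= r0.w;                           // mul r0.w, r0.w, r0.w
    r3.w = (-r0.w >= 0.0f) ? c0.z : c0.w;   // cmp r3.w, -r0.w, c0.z, c0.w
    // lrp r0, r3.w, r1, r2 computes r3.w * (r1 - r2) + r2,
    // which is lerp(r2, r1, r3.w) in HLSL argument order.
    r0 = lerp(r2, r1, r3.w);

    result.oC0 = r0;                        // mov oC0, r0

    return result;
}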




ps_3_0
dcl_texcoord v0.xy
dcl_texcoord1 v1.xy
dcl_2d s0
dcl_2d s1
texld r0, v1, s1
add r0.xy, r0, c0.x
add r0.x, r0.z, r0.x
mad r0.y, r0.z, r0.y, c0.y
add r0.x, -r0.x, c0.y
mul r0.xz, r0.xyyw, v0.xyyw
texld r1, r0.xzzw, s1
texld r0, r0.xzzw, s0
mul r1.x, r1.w, r1.w
cmp r1.x, -r1.x, c0.z, c0.w
texld r2, v0, s0
add r2, -r0, r2
mad oC0, r1.x, r2, r0

// approximately 13 instruction slots used (4 texture, 9 arithmetic)
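When matching a hand translation against the original assembly, the operand order of `mad`, `lrp` and `cmp` is the easiest thing to get wrong. Note, for instance, that the compiled listing above contains `add r0.x, r0.z, r0.x` where the original shader has `mul r1.w, r0.z, r0.w`, so it is worth re-checking each line and recompiling before relying on the instruction count. The following is a minimal Python sketch, not part of the thread, that emulates the scalar semantics of those three instructions as they are documented for D3D9 shader assembly. The helper names `hlsl_lerp`, `refraction_offset` and `blend_weight` are made up for illustration.

```python
def mad(a, b, c):
    """mad dst, a, b, c  ->  dst = a * b + c"""
    return a * b + c


def lrp(s, a, b):
    """lrp dst, s, a, b  ->  dst = s * (a - b) + b"""
    return s * (a - b) + b


def cmp(s, a, b):
    """cmp dst, s, a, b  ->  dst = a if s >= 0 else b"""
    return a if s >= 0.0 else b


def hlsl_lerp(x, y, s):
    """HLSL lerp(x, y, s) = x + s * (y - x)"""
    return x + s * (y - x)


def refraction_offset(nx, ny, nz, u, v, bias=-0.5, scale=2.0):
    """Coordinate built by the add/mul/mad sequence of the pixel shader.

    (nx, ny, nz) stands for the .xyz of the first Src1 sample, (u, v) for t0;
    bias and scale are c0.x and c0.y.
    """
    dx = nz * (nx + bias)  # add r0.w, r0.x, c0.x ; mul r1.w, r0.z, r0.w
    dy = nz * (ny + bias)  # add r0.w, r0.y, c0.x ; mul r0.w, r0.z, r0.w
    return mad(scale, -dx, u), mad(scale, dy, v)


def blend_weight(w):
    """mul r0.w, r0.w, r0.w ; cmp r3.w, -r0.w, c0.z, c0.w (c0.z = 0, c0.w = 1)"""
    return cmp(-(w * w), 0.0, 1.0)


if __name__ == "__main__":
    # lrp(s, a, b) matches lerp(b, a, s), not lerp(a, b, s).
    print(lrp(0.25, 1.0, 3.0), hlsl_lerp(3.0, 1.0, 0.25), hlsl_lerp(1.0, 3.0, 0.25))
    print(refraction_offset(1.0, 0.5, 0.5, 0.5, 0.5))
    print(blend_weight(0.0), blend_weight(0.5))
```

The blend weight is 0 only where the alpha of the second Src1 sample is exactly zero, so `lrp` then returns the unrefracted sample `r2`. Everywhere else it returns the refracted sample `r1`.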

You forgot:
def c0, -0.5, 2, 0, 1

Does the conversion to HLSL have to be done by hand, or did you use a program?
Is there a reference or tutorial for converting ASM shaders to HLSL? ^^ (Did I understand you correctly?)

And thank you for the example! ;)

I would also go with the "use HLSL" comments. Shader assembly has been ditched with SM4 and D3D10, so you'll have to make the jump at some point unless you want to stick with D3D8/D3D9 forever!

Jack

Yep, but I only need 3.0 shaders, and the shaders I have are in ASM. ^^

So do I have to convert ASM to HLSL by hand, and is there a shader reference? :)


Hi,

Most likely you'll need to convert your shaders to HLSL by hand.

Anyway, if you want to do some assembly-level analysis, I think the shader compiler can export the generated assembly in readable form (fxc has an /Fc option that writes an assembly listing to a file).

Regards!

Yep, I did that conversion by hand. It's actually quite simple - I just did the conversion line by line, using variables with the same names as the registers. You should be able to match up lines in the original source to the lines in my HLSL. It shouldn't be too much work to write a simple tool to do the conversion, but it'd only be worth it if you have lots and lots of shaders to convert.

The assembly in my post above is a copy and paste of what the compiler output when compiling that HLSL. I'm not sure why it left out the constant definition.
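As a rough idea of what such a "simple tool" could look like, here is a minimal Python sketch, not taken from the thread. It assumes each register name is reused as an HLSL variable name, as described above, and it only knows the handful of opcodes that appear in the refraction pixel shader. The names `TEMPLATES`, `translate_line` and `translate` are hypothetical. Declarations, version tokens, instruction modifiers and anything else it does not recognize are emitted as `// TODO` comments for a human to handle, and write masks and swizzles are copied verbatim without checking vector widths.

```python
"""Sketch of a line-by-line ps_2_x / ps_3_0 assembly to HLSL translator."""

# opcode -> (operand count, HLSL template); d = destination, a/b/c = sources.
TEMPLATES = {
    "mov":   (2, "{d} = {a};"),
    "add":   (3, "{d} = {a} + {b};"),
    "mul":   (3, "{d} = {a} * {b};"),
    "mad":   (4, "{d} = {a} * {b} + {c};"),
    "cmp":   (4, "{d} = ({a} >= 0.0f) ? {b} : {c};"),
    # lrp dst, s, x, y is s * (x - y) + y, so x and y swap places in lerp().
    "lrp":   (4, "{d} = lerp({c}, {b}, {a});"),
    "texld": (3, "{d} = tex2D({b}, {a});"),
}


def _source(token):
    """Parenthesize negated sources so '-r1.w' stays a single operand."""
    return f"({token})" if token.startswith("-") else token


def translate_line(line):
    """Return an HLSL statement for one assembly line, or None if it is empty."""
    code = line.split("//", 1)[0].strip()  # drop trailing comments
    if not code:
        return None
    parts = code.split(None, 1)
    opcode = parts[0]
    tokens = [t.strip() for t in parts[1].split(",")] if len(parts) > 1 else []
    entry = TEMPLATES.get(opcode)
    if entry is None or len(tokens) != entry[0]:
        # Declarations, version tokens and unsupported opcodes are not translated.
        return f"// TODO: {code}"
    dest = tokens[0]
    sources = [_source(t) for t in tokens[1:]]
    if opcode == "texld" and "." not in sources[0]:
        sources[0] += ".xy"  # tex2D takes a 2D coordinate
    return entry[1].format(d=dest, **dict(zip("abc", sources)))


def translate(asm_text):
    """Translate a whole listing, skipping blank and comment-only lines."""
    statements = (translate_line(line) for line in asm_text.splitlines())
    return [s for s in statements if s is not None]


if __name__ == "__main__":
    sample = """
    texld r0, t1, s1
    mad r1.x, c0.y, -r1.w, t0.x
    cmp r3.w, -r0.w, c0.z, c0.w
    lrp r0, r3.w, r1, r2
    """
    print("\n".join(translate(sample)))
```

The output still has to be wrapped in a function by hand, with the registers declared as `float4` variables and `oC0` mapped to the return value, so for a single shader the manual route is probably quicker.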

The translation looks good, but I would change:
const float4 c0 = float4(-0.5f, 2.0f, 0.0f, 1.0f);

to:
static const float4 c0 = float4(-0.5f, 2.0f, 0.0f, 1.0f);

This makes the compiler actually use the values instead of assuming they can change externally. Without static, a global is a shader constant that the application can set, so the initializer is only a default value that the application is free to override later.

Making the variable static means it is not visible externally and so cannot be changed, so the compiler can replace any reference to it with the literal value.
