Converting 2.0 to 3.0 shaders

12 comments, last by Jalibr 16 years, 2 months ago
I have tried to convert 2.0 shaders to 3.0 shaders so that they use fewer instructions. They are working. Is it possible to save more instructions here?

Refraction Shader 2.0:

    // Src0 s0 1
    // Src1 s1 1
    ps_2_x
    def c0, -0.5, 2, 0, 1
    dcl t0.xy
    dcl t1.xy
    dcl_2d s0
    dcl_2d s1
    texld r0, t1, s1
    add r0.w, r0.x, c0.x
    mul r1.w, r0.z, r0.w
    add r0.w, r0.y, c0.x
    mad r1.x, c0.y, -r1.w, t0.x
    mul r0.w, r0.z, r0.w
    mad r1.y, c0.y, r0.w, t0.y
    texld r0, r1, s1
    texld r1, r1, s0
    texld r2, t0, s0
    mul r0.w, r0.w, r0.w
    cmp r3.w, -r0.w, c0.z, c0.w
    lrp r0, r3.w, r1, r2
    mov oC0, r0
    // approximately 15 instruction slots used (4 texture, 11 arithmetic)

------------------------------------

    // texRatio0 c6 1
    // texRatio1 c7 1
    vs_1_1
    dcl_position v0
    dcl_texcoord v1
    mad oT0.xy, v1, c6, c6.zwzw
    mad oT1.xy, v1, c7, c7.zwzw
    mov oPos, v0
    // approximately 3 instruction slots used

------------------------------------

Refraction Shader 3.0:

    // Src0 s0 1
    // Src1 s1 1
    ps_3_0
    def c0, -0.5, 2, 0, 1
    dcl_texcoord v0.xy
    dcl_texcoord1 v1.xy
    dcl_2d s0
    dcl_2d s1
    texld r0, v1, s1
    add r0.w, r0.x, c0.x
    mul r1.w, r0.z, r0.w
    add r0.w, r0.y, c0.x
    mad r1.x, c0.y, -r1.w, v0.x
    mul r0.w, r0.z, r0.w
    mad r1.y, c0.y, r0.w, v0.y
    texld r0, r1, s1
    texld r1, r1, s0
    texld r2, v0, s0
    mul r0.w, r0.w, r0.w
    cmp r3.w, -r0.w, c0.z, c0.w
    lrp oC0, r3.w, r1, r2
    // approximately 13 instruction slots used (4 texture, 9 arithmetic)

------------------------------------

    // texRatio0 c6 1
    // texRatio1 c7 1
    vs_3_0
    dcl_position v0
    dcl_texcoord v1
    dcl_position o0
    dcl_texcoord o1.xy
    dcl_texcoord1 o2.xy
    mad o1.xy, v1, c6, c6.zwzw
    mad o2.xy, v1, c7, c7.zwzw
    mov o0, v0
    // approximately 3 instruction slots used

------------------------------------

It would be nice if someone could help me.
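One observation on the last three arithmetic instructions, offered as a hedged sketch rather than a tested optimization: `cmp` only ever produces 0 or 1 here, so the `lrp` that follows acts as a plain select between the two `s0` samples. The Python below emulates the per-component semantics of `cmp` and `lrp` for a single colour channel and checks that one `cmp` with swapped sources gives the same result. The helper names and sample values are made up for illustration, and the shorter instruction in `select_tail` is hypothetical; whether it really saves a slot would need checking with the compiler or GPU Shader Analyzer.

```python
def cmp_(src0, src1, src2):
    """D3D9 asm cmp: src1 where src0 >= 0, otherwise src2."""
    return src1 if src0 >= 0 else src2


def lrp(src0, src1, src2):
    """D3D9 asm lrp: src0 * src1 + (1 - src0) * src2."""
    return src0 * src1 + (1 - src0) * src2


def original_tail(alpha, distorted, plain):
    """mul / cmp / lrp as in the posted 3.0 shader, for one colour channel."""
    a2 = alpha * alpha                     # mul r0.w, r0.w, r0.w
    factor = cmp_(-a2, 0.0, 1.0)           # cmp r3.w, -r0.w, c0.z, c0.w
    return lrp(factor, distorted, plain)   # lrp oC0, r3.w, r1, r2


def select_tail(alpha, distorted, plain):
    """Hypothetical shorter tail: 'cmp oC0, -r0.w, r2, r1' after the mul."""
    a2 = alpha * alpha
    return cmp_(-a2, plain, distorted)


# Both tails agree for zero and non-zero alpha on these sample values.
for alpha in (0.0, 0.25, -0.5, 1.0):
    for distorted, plain in ((0.2, 0.9), (1.0, 0.0)):
        assert original_tail(alpha, distorted, plain) == select_tail(alpha, distorted, plain)
```

Read this way, the undistorted sample is used only where the alpha fetched from `s1` at the distorted coordinate is exactly zero.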

Hi,

Have you considered using a high-level shading language and letting the compiler do the optimizations?

Cheers!
Well, no. I only have the 2.0 and 1.1 shaders in this language... ASM?
I compared 2.0 and 3.0 shaders and optimized them that way. I also use GPU Shader Analyzer to check for errors, and then I test them. ;)
I'd recommend you use HLSL to write the 3.0 shaders. The optimizer works very well, and will probably save you a few instructions over hand-written assembly. To show you an example, I converted that first shader to HLSL and compiled it.

Unless I made a silly mistake somewhere, it's saved you 2 instructions.

Command line:

fxc /Tps_3_0 /O3 filename.fx

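    const float4 c0 = float4(-0.5f, 2.0f, 0.0f, 1.0f);

    struct PS_OUTPUT
    {
        float4 oC0 : COLOR0;
    };

    PS_OUTPUT main(float2 t0 : TEXCOORD0, float2 t1 : TEXCOORD1, sampler s0, sampler s1)
    {
        PS_OUTPUT result;
        float4 r0, r1, r2, r3;
        r0 = tex2D(s1, t1);                      // texld r0, t1, s1
        r0.w = r0.x + c0.x;                      // add r0.w, r0.x, c0.x
        r1.w = r0.z + r0.w;                      // mul r1.w, r0.z, r0.w
        r0.w = r0.y + c0.x;                      // add r0.w, r0.y, c0.x
        r1.x = (c0.y - r1.w) * t0.x;             // mad r1.x, c0.y, -r1.w, t0.x
        r0.w = r0.z * r0.w;                      // mul r0.w, r0.z, r0.w
        r1.y = (c0.y + r0.w) * t0.y;             // mad r1.y, c0.y, r0.w, t0.y
        r0 = tex2D(s1, r1);                      // texld r0, r1, s1
        r1 = tex2D(s0, r1);                      // texld r1, r1, s0
        r2 = tex2D(s0, t0);                      // texld r2, t0, s0
        r0.w *= r0.w;                            // mul r0.w, r0.w, r0.w
        r3.w = (-r0.w >= 0.0f) ? c0.z : c0.w;    // cmp r3.w, -r0.w, c0.z, c0.w
        r0 = lerp(r1, r2, r3.w);                 // lrp r0, r3.w, r1, r2
        result.oC0 = r0;                         // mov oC0, r0
        return result;
    }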


    ps_3_0
    dcl_texcoord v0.xy
    dcl_texcoord1 v1.xy
    dcl_2d s0
    dcl_2d s1
    texld r0, v1, s1
    add r0.xy, r0, c0.x
    add r0.x, r0.z, r0.x
    mad r0.y, r0.z, r0.y, c0.y
    add r0.x, -r0.x, c0.y
    mul r0.xz, r0.xyyw, v0.xyyw
    texld r1, r0.xzzw, s1
    texld r0, r0.xzzw, s0
    mul r1.x, r1.w, r1.w
    cmp r1.x, -r1.x, c0.z, c0.w
    texld r2, v0, s0
    add r2, -r0, r2
    mad oC0, r1.x, r2, r0
    // approximately 13 instruction slots used (4 texture, 9 arithmetic)
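A quick way to check a hand translation like this is to emulate both versions on a sample value. The sketch below does that in Python for the coordinate computation only; the function names and the sample inputs are made up here. If I read the listings correctly, the asm `mul` and `mad` lines multiply first and then add (`mad dst, a, b, c` is `a * b + c`), while the HLSL as posted adds first and then multiplies, so the two disagree even for a neutral value of the `s1` sample.

```python
def asm_coords(n, t0):
    """Distorted coordinate as the asm computes it; n = (x, y, z) sampled from s1."""
    nx, ny, nz = n
    r0w = nx + -0.5             # add r0.w, r0.x, c0.x
    r1w = nz * r0w              # mul r1.w, r0.z, r0.w
    r0w = ny + -0.5             # add r0.w, r0.y, c0.x
    r1x = 2.0 * -r1w + t0[0]    # mad r1.x, c0.y, -r1.w, t0.x
    r0w = nz * r0w              # mul r0.w, r0.z, r0.w
    r1y = 2.0 * r0w + t0[1]     # mad r1.y, c0.y, r0.w, t0.y
    return (r1x, r1y)


def posted_hlsl_coords(n, t0):
    """The same steps as written in the HLSL listing above."""
    nx, ny, nz = n
    r0w = nx + -0.5             # r0.w = r0.x + c0.x;
    r1w = nz + r0w              # r1.w = r0.z + r0.w;
    r0w = ny + -0.5             # r0.w = r0.y + c0.x;
    r1x = (2.0 - r1w) * t0[0]   # r1.x = (c0.y - r1.w) * t0.x;
    r0w = nz * r0w              # r0.w = r0.z * r0.w;
    r1y = (2.0 + r0w) * t0[1]   # r1.y = (c0.y + r0.w) * t0.y;
    return (r1x, r1y)


# x = y = 0.5 is the neutral value, so the asm leaves t0 untouched.
n, t0 = (0.5, 0.5, 0.5), (0.25, 0.75)
print(asm_coords(n, t0))          # prints (0.25, 0.75)
print(posted_hlsl_coords(n, t0))  # prints (0.375, 1.5)
```

By the same reading, `lrp r0, r3.w, r1, r2` computes `r3.w * r1 + (1 - r3.w) * r2`, which would correspond to `lerp(r2, r1, r3.w)` in HLSL rather than `lerp(r1, r2, r3.w)`. The instruction count of a corrected translation may therefore differ from the listing above.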
You forgot:

    def c0, -0.5, 2, 0, 1

Does the conversion to HLSL have to be done by hand, or did you use a program?
Is there a reference or tutorial for converting ASM shaders to HLSL? ^^ (Have I understood you correctly?)

And thank you for the example! ;)
I would also go with the "use HLSL" comments. Assembly shaders have been ditched with SM4 and D3D10, so you'll have to make the jump at some point unless you want to stick with D3D8/D3D9 for ever more!

Jack


Yep, but I only need 3.0 shaders, and the shaders I have are in ASM. ^^

So do I have to convert the ASM to HLSL by hand, and is there a shader reference? :)

Hi,

Most likely you'll need to convert your shaders to HLSL by hand.

Anyway, if you want to do some assembly level analysis, I think that the shader compiler is able to export the generated assembly in readable form.

Regards!
Yep, I did that conversion by hand. It's actually quite simple: I just did the conversion line by line, using variables with the same names as the registers. You should be able to match up lines in the original source to the lines in my HLSL. It shouldn't be too much work to write a simple tool to do the conversion, but it'd only be worth it if you have lots and lots of them to convert.

That assembly in my post above is a copy and paste of what the compiler output from compiling that HLSL. I'm not sure why it left out the constant definition.
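As a rough idea of what such a "simple tool" might look like, here is a minimal Python sketch that translates one arithmetic or `texld` instruction at a time, reusing the register names as variable names. Everything in it (`convert_line`, `tex_coord`, the `TEMPLATES` table) is invented for this illustration. It only knows the opcodes used in this thread, and it assumes write masks and swizzles can be copied across unchanged, which holds for the scalar masks in these shaders but not for every mask (a `.zw` destination with unswizzled sources, for example, would need more care).

```python
def tex_coord(operand):
    """Use .xy when the asm operand has no swizzle (tex2D takes a float2)."""
    return operand if "." in operand else operand + ".xy"


# One HLSL template per opcode; {d} is the destination, {0}..{2} the sources.
TEMPLATES = {
    "mov": "{d} = {0};",
    "add": "{d} = {0} + {1};",
    "mul": "{d} = {0} * {1};",
    "mad": "{d} = {0} * {1} + {2};",
    "lrp": "{d} = lerp({2}, {1}, {0});",      # lrp: s0 * s1 + (1 - s0) * s2
    "cmp": "{d} = ({0} >= 0) ? {1} : {2};",   # cmp: s0 >= 0 ? s1 : s2
}


def convert_line(line):
    """Translate one ps_2_x / ps_3_0 instruction; return None for anything else."""
    line = line.split("//")[0].strip()        # drop trailing comments
    if not line:
        return None
    opcode, _, rest = line.partition(" ")
    operands = [op.strip() for op in rest.split(",")]
    if opcode == "texld":
        dst, coord, sampler = operands
        return f"{dst} = tex2D({sampler}, {tex_coord(coord)});"
    if opcode in TEMPLATES:
        return TEMPLATES[opcode].format(*operands[1:], d=operands[0])
    return None  # version token, dcl, def etc. need separate handling


for asm in ("texld r0, t1, s1",
            "mad r1.x, c0.y, -r1.w, t0.x",
            "lrp r0, r3.w, r1, r2"):
    print(convert_line(asm))
```

A table of templates keeps each opcode's operand order visible in one place, which is where hand conversions tend to slip (`mad` and `lrp` in particular).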
The translation looks good, but I would change:
const float4 c0 = float4(-0.5f, 2.0f, 0.0f, 1.0f);

to:
static const float4 c0 = float4(-0.5f, 2.0f, 0.0f, 1.0f);

This will make the compiler actually use the values, instead of assuming they can change externally. Without static, the initializer is just a hint to the application that it might want to set the constant to these values; the compiler assumes that the application may want to change them later.

Making the variable static means it's not visible externally and so can't be changed, so the compiler will assume any references to the value can be replaced by the literal value.
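For anyone scripting the conversion suggested earlier in the thread, the asm `def` line is the natural place to apply this advice. The Python sketch below is only an illustration under that assumption: `convert_def` is a made-up helper that turns a `def` instruction into a `static const float4` declaration, so the constant stays inside the shader instead of becoming an external one.

```python
def convert_def(line):
    """Turn 'def c0, -0.5, 2, 0, 1' into a static const float4 declaration."""
    opcode, _, rest = line.strip().partition(" ")
    if opcode != "def":
        raise ValueError("not a def instruction: " + line)
    name, *values = [part.strip() for part in rest.split(",")]
    if len(values) != 4:
        raise ValueError("def expects four components")
    return f"static const float4 {name} = float4({', '.join(values)});"


print(convert_def("def c0, -0.5, 2, 0, 1"))
# prints: static const float4 c0 = float4(-0.5, 2, 0, 1);
```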
