#### Archived

This topic is now archived and is closed to further replies.

# Is blending faster with a vertex shader?


## Recommended Posts

hello coders.. I do mesh interpolation of positions, normals and UVs in my own code ( x = x1*weight + x2*(1-weight) ). is it faster to do that in the vertex shader instead? and thank you all.

##### Share on other sites
Yes.

1) With hardware vertex processing, it's fast because the GPU performs specialised SIMD processing; it takes the work off the CPU and also means you can usually avoid locking the vertex buffer, uploading new data to the chip, etc. (i.e. use a static buffer and perform any modifications in the shader).

2) With software vertex processing, when you create the shader, the D3D runtime generates native x86 SIMD code (SSE, SSE2, 3DNow!) for that shader to perform the operations in parallel. That should be faster than plain C/C++/x86 ASM code; the only thing faster would be your own well-pipelined, hand-written, specialised SIMD.


--
Simon O'Connor
Creative Asylum Ltd
www.creative-asylum.com

##### Share on other sites
I suspect that if you have hardware support for vertex shaders, it would be faster than doing it in software. I'm not positive, but logically, you would be taking those operations off the CPU and giving them to the GPU instead, which should give a performance increase. Even if you don't have support for hardware vertex shaders it still might be faster, because of the optimized assembly code that is used to run vertex shaders in software. I haven't tested this myself though, so I'm not entirely sure, but I assume vertex shaders would be faster in this case.

I use a vertex shader for a similar thing in my programs and using hardware vertex shaders is WAY faster than software shaders.

##### Share on other sites
thank you Mr. Simon and Mr. Steel for your interest
that was the answer I'm looking for.


##### Share on other sites
could someone please explain how you would do this.

How would you give the vertex shader the 2nd vertex and weight?

would you have to upload them to the gpu shader constants for every vertex of every frame ?

say you had a mesh (just a bunch of triangles, not the d3d class), would you draw it with 1 DrawPrimitives() call ? if so, how would you give the vertex shader the next vertex and weight for the interpolating?

or would you call DrawPrimitives() for every triangle, and set the shader constants just before you called it? (this way seems fundamentally flawed to me?)

(btw, in case you haven't noticed, I know very little about vertex shader theory, so the more details you give the better it will help me)

[edited by - Smurfwow on February 8, 2003 2:44:53 AM]


##### Share on other sites
quote:
How would you give the vertex shader the 2nd vertex and weight?

Generally, you pass the 2 positions in the same vertex, say:
```cpp
struct T_Vertex
{
    T_Vec3 pos1, pos2;
    T_Vec2 tex;
};
```

And then you create the appropriate vertex declaration that matches your vertex structure (a vertex declaration tells the shader what to expect in the input registers).

A DX8 decl would look like:
```cpp
DWORD dwDecl[] = {
    D3DVSD_STREAM( 0 ),
    D3DVSD_REG( 0, D3DVSDT_FLOAT3 ), // first pos
    D3DVSD_REG( 1, D3DVSDT_FLOAT3 ), // second pos
    D3DVSD_REG( 2, D3DVSDT_FLOAT2 ), // tex coords
    D3DVSD_END()
};
```

V0: First position
V1: Second position
V2: Texture coordinates

For the weight: If it's per-mesh or something (for animation interpolation, for example), you can pass it as a constant. Otherwise (i.e. if it's per-vertex), you'd put it in the vertex structure.

Hope this helps.

[edited by - Coder on February 9, 2003 5:21:48 AM]

##### Share on other sites
So this is the "tweening" feature of DirectX?

It seems that this would be a good thing to implement on landscape LOD in order to avoid the popping between different levels of detail. And you don’t really need to know anything about writing vertex shaders, which I don’t.

##### Share on other sites
ok... I've implemented vertex blending with an "effect" using HLSL

the animations seem to be slightly more jerky than when I was doing them manually.

if anyone's interested... this is the "effect":

```hlsl
float t;
float4x4 World : WORLD;

struct VS_OUTPUT
{
    float4 Pos : POSITION0;
    float2 Tex : TEXCOORD0;
};

VS_OUTPUT main(float3 Pos1 : POSITION0, float3 Pos2 : POSITION1,
               float3 Norm : NORMAL0, float2 Tex : TEXCOORD0)
{
    VS_OUTPUT Out = (VS_OUTPUT)0;

    Pos1 = mul(World, Pos1);
    Pos2 = mul(World, Pos2);

    float4 newPos1;
    float4 newPos2;

    newPos1.x = Pos1.x;
    newPos1.y = Pos1.y;
    newPos1.z = Pos1.z;
    newPos1.w = 0.0f;

    newPos2.x = Pos2.x;
    newPos2.y = Pos2.y;
    newPos2.z = Pos2.z;
    newPos2.w = 0.0f;

    Out.Pos = lerp(newPos1, newPos2, t);
    Out.Tex = Tex;
    return Out;
}

technique T0
{
    pass P0
    {
        VertexShader = compile vs_1_1 main();
        PixelShader  = NULL;
    }
}
```

in the game... I just set the t value once a frame, then just put an effect.Begin(0) and effect.End() around my drawing code.

does anyone know what could possibly cause the "jerky" animations... given that they were smoother when I was doing them in C#, I'd expect them to be at least as smooth when done on the GPU...

##### Share on other sites
I've not written HLSL/Cg before, so just bear with me

This code

```hlsl
newPos1.x = Pos1.x;
newPos1.y = Pos1.y;
newPos1.z = Pos1.z;
newPos1.w = 0.0f;

newPos2.x = Pos2.x;
newPos2.y = Pos2.y;
newPos2.z = Pos2.z;
newPos2.w = 0.0f;

Out.Pos = lerp(newPos1, newPos2, t);
```

seems to be modifying the w component (setting it to 0) for some reason; what's that reason?

In the Direct3D pipeline, the homogeneous divide (by w) occurs AFTER the vertex shader execution. Your output vertex contains a w of 0?

##### Share on other sites
It has to be set to something; if you don't set it, Direct3D has a cry and throws an exception. (one of those mystical managed exceptions... the ones that give you anywhere between -1% and 0% useful information)

and there didn't seem to be any difference between setting it to 1.0 or setting it to 0.0.

so... What *should* it be set to ?

##### Share on other sites
Input vertices generally have a w of 1, meaning they're in Euclidean space.

You don't manually modify the w yourself; it's modified by the multiplication of your vertices by the combined worldViewProjection matrix.

I can think of 2 things to do:
1. Lerp your positions, then multiply by the matrix: I think this would work fine. Why don't you try it first?

2. Multiply your positions by the matrix, then lerp:
I'm a bit rusty on the homogeneous thingy, and I don't know whether it'd be valid to 'linearly' interpolate in homogeneous space. Try it; if it works, then you can!

Setting w to 0 is likely to give undefined behavior, I think (unless the docs define some behavior corresponding to this).

##### Share on other sites
Doh! I wrote some weird things last time. I think I'd better think before I write, next time.

quote:
I can think of 2 things to do:
1. Lerp your positions, then multiply by the matrix: I think this would work fine. Why don't you try it first?

Should set w to 1 after lerp, before multiplication.

quote:
2. Multiply your positions by the matrix, then lerp:
I'm a bit rusty on the homogeneous thingy, and I don't know whether it'd be valid to 'linearly' interpolate in homogeneous space. Try it; if it works, then you can!

In short: After thinking, the 2nd would work just fine too.

Proof:
Using method (1) would give us (after lerp and w modification, before matrix mul) :
lerped.x = x1 + t( x2 - x1 )
lerped.y = y1 + t( y2 - y1 )
lerped.z = z1 + t( z2 - z1 )
lerped.w = 1

Let the 4th column in the matrix be
| A |
| B |
| C |
| D |

Then the w component of any vertex (x,y,z,1) multiplied by this matrix is:
Ax + By + Cz + D

Multiplying lerped by the matrix would give the following w:
result1.w = A * [ x1 + t( x2 - x1 ) ] + B * [ y1 + t( y2 - y1 ) ] + C * [ z1 + t( z2 - z1 ) ] + D
          = Ax1 + By1 + Cz1 + D + t * [ A( x2 - x1 ) + B( y2 - y1 ) + C( z2 - z1 ) ]

With the 2nd method, after multiplying pos1, pos2 by the matrix (and storing the results in newPos1/2):
newPos1.w = Ax1 + By1 + Cz1 + D
newPos2.w = Ax2 + By2 + Cz2 + D

A simple lerp would do:
result2.w = newPos1.w + t * ( newPos2.w - newPos1.w )
          = Ax1 + By1 + Cz1 + D + t * [ A( x2 - x1 ) + B( y2 - y1 ) + C( z2 - z1 ) + D - D ] // the D terms cancel
          = Ax1 + By1 + Cz1 + D + t * [ A( x2 - x1 ) + B( y2 - y1 ) + C( z2 - z1 ) ]

result2.w == result1.w

Hopefully, I didn't blow up anything in the math, and it all made sense

##### Share on other sites
thanks...

I'm still getting the "jerky animations" problem tho :/

is there any way to debug shaders/effects?

I'm thinking the 2nd vertex isn't getting passed in. this is the FVF I'm using (with the shader input listed above):

```csharp
public struct Md2Vertex
{
    public Vector3 Position0;
    public Vector3 Position1;
    public Vector3 Normal;
    public Vector2 Texture;

    public static readonly VertexFormats Format =
        VertexFormats.Position | VertexFormats.PositionBlend1 |
        VertexFormats.Normal | VertexFormats.Texture1;
}
```

what's the best way to pass the 2nd vertex into the shader? (keeping in mind I don't think effects can use the "vertex declaration")

##### Share on other sites
I had a second stream of the same VB but one position ahead in my MD2 class.

Neil

WHATCHA GONNA DO WHEN THE LARGEST ARMS IN THE WORLD RUN WILD ON YOU?!?!

##### Share on other sites
if possible, could you please show me some code.

how did you manage the different animations? the indices moving all around the buffer?

##### Share on other sites
quote:
is there any way to debug shaders/effects?

Sure, the shader debugger. You'd need VS.NET and WinXP Pro to be able to install it, though. It comes with the DX9 SDK.
A Win2K release is being considered, by the way.

quote:
I'm thinking the 2nd vertex isn't getting passed in

You can always test that. Just copy the 2nd pos into the output register and check the output (Make sure you pass some identifiable value in second position for all verts)

quote:
what's the best way to pass the 2nd vertex into the shader? (keeping in mind I don't think effects can use the "vertex declaration")

You can always use another stream. But the method mentioned before (passing 2 positions in a single struct) should work, so you've got to be doing something wrong somewhere (a useless note, I guess. It's just meant to urge you not to give up and switch to another stream right away. When you get the 1st method up and running, try the 2nd. That way you're learning more stuff)

I'm not sure I understand what you mean by the "I don't think effects can use the vertex declaration" thing

##### Share on other sites
quote:

A DX8 decl would look like:
```cpp
DWORD dwDecl[] = {
    D3DVSD_STREAM( 0 ),
    D3DVSD_REG( 0, D3DVSDT_FLOAT3 ), // first pos
    D3DVSD_REG( 1, D3DVSDT_FLOAT3 ), // second pos
    D3DVSD_REG( 2, D3DVSDT_FLOAT2 ), // tex coords
    D3DVSD_END()
};
```

this is what I meant by the "vertex declaration". I know with effects there is a "SetValue()" method, but I don't think you can use the above code?

The problem I'm having is that I can't figure out how to pass the 2nd vertex in. It may be my vertex declaration, or it may be the shader struct I use for input into the shader function.

The Managed VertexFormats enumeration is different from the unmanaged one, and I can't figure out the equivalents, and as you may be aware there is no Managed documentation (just an auto-generated help file made from XML comments)

##### Share on other sites
quote:
I know with effects there is a "SetValue()" method, but i dont think you can use the above code?

In DX8.1, you can simply type the following into an effect file (copied from one of the examples in the dx8 docs):

```hlsl
technique tec0
{
    pass p0
    {
        // Load matrices
        // [...]
        VertexShader =
        decl
        {
            stream 0;
            float v0[3]; // Position
            float v3[3]; // Normal
            float v7[3]; // Texture coord1
            float v8[3]; // Texture coord2
        }
        asm
        {
            vs.1.1             // Version number
            m4x4 oPos, v0, c4  // Transform point to projection space
            m4x4 r0, v0, c0    // Transform point to world space
            add r0, -r0, c24   // Get a vector toward the camera position
                               // This is the negative of the camera direction

            // Normalize
            dp3 r11.x, r0.xyz, r0.xyz   // Load the squared length into r11
            rsq r11.xyz, r11.x          // Get the reciprocal square root
            mul r0.xyz, r0.xyz, r11.xyz // Multiply, r0 = -(camera vector)

            add r2.xyz, r0.xyz, -c16    // Get half angle

            // Normalize
            dp3 r11.x, r2.xyz, r2.xyz   // Load the squared length into r11
            rsq r11.xyz, r11.x          // Get the reciprocal square root
            mul r2.xyz, r2.xyz, r11.xyz // Multiply, r2 = HalfAngle

            m3x3 r1, v3, c0    // Transform normal to world space, put in r1

            // r2 = half angle, r1 = normal, r3 (output) = intensity
            dp3 r3.xyzw, r1, r2
            // Now raise it several times
            mul r3, r3, r3     //  2nd
            mul r3, r3, r3     //  4th
            mul r3, r3, r3     //  8th
            mul r3, r3, r3     // 16th

            // Compute diffuse term
            dp3 r4, r1, -c16

            // Blend it in
            mul r3, c20, r3    // Kd
            mul r4, r4, c21    // Ks
            mul r4, r4, c10    // Specular
            mad r4, r3, c9, r4 // Diffuse

            mov oD0, r4        // Put into Diffuse Color
        };
    }
}
```
