Jump to content

  • Log In with Google      Sign In   
  • Create Account

Steve_Segreto

Member Since 26 Jun 2009
Offline Last Active Today, 08:20 AM

Posts I've Made

In Topic: Skeletal animation optimization

13 July 2016 - 11:57 PM

Your debug configuration may have iterator debugging turned on which is ridiculously slow compared to how useful it is (in addition to the fact that all memory allocations use the debug heap which is also slow, but at least useful).

 

Follow this mess and turn all of the iterator debugging off and see if Debug configuration is acceptably fast again.

 

https://msdn.microsoft.com/en-us/library/aa985982.aspx


In Topic: D3DXCompileShader Failing on User Computer

13 July 2016 - 11:50 PM

You're using D3DXCompileShader and hence D3DX, which is an optional addition to DirectX and thus requires the redist (https://www.microsoft.com/en-us/download/details.aspx?id=8109)

 

Basically your users need the suite of D3DX9_**.dll's (but probably d3dx9_43.dll and d3dx9_31.dll would be good enough, depending if you passed the legacy 9_31 flag to the D3DXCompileShader call). Also as mentioned you should log both the HR of the call and the LPD3DXBUFFER for the error message.

 

d3dx9_24.dll
d3dx9_25.dll
d3dx9_26.dll
d3dx9_27.dll
d3dx9_28.dll
d3dx9_29.dll
d3dx9_30.dll
d3dx9_31.dll
d3dx9_32.dll
d3dx9_33.dll
d3dx9_34.dll
d3dx9_35.dll
d3dx9_36.dll
D3DX9_37.dll
D3DX9_38.dll
D3DX9_39.dll
D3DX9_40.dll
D3DX9_41.dll
D3DX9_42.dll
D3DX9_43.dll

In Topic: Relation between 2 different sets of coordinates

13 July 2016 - 10:10 PM

Your "Marien_text" looks like it's nearly in the same position, are you sure you're taking into account the scale and rotation specified in the original world transform for the car?


In Topic: Skeletal animation shader questions

04 January 2016 - 09:53 PM

Hi this is a great list of questions!

 

In my experience constants buffer are the best way in dx9, 10 and 11 to send the updated bone transforms to the shader for the vertex skinning. A buffer of about 50 or less 4x4 matrices should be able to handle most (> 90%) of the low to mid class game engine skinned actors in one draw call (I can confirm this for Dark Age of Camelot, Unreal Tournament, Mortal Kombat Armageddon, Zelda Twilight Princess and TitanQuest skinned actors).

 

As for the 4 bone weights per vertex limit, this does seem to be an arbitrary legacy based limit from shader model 2 and the fixed function before it when the blend indices were packed into 4 dwords. I think I saw some research paper that concluded for bipedal meshes 4 blend indices is sufficient for a certain fidelity of realistic motion, but I can't find the paper at hand to link to it.

 

As for the last question of whether you should always blend in 4 bone weights, I actually use another number to tell me how many of the 4 blend indices are needed. Here is some example shader model 2 code for skinning:

 

CRenderableMesh 0F7B78E8:
=================================================================================
Vertex Declaration: 0B98C860
=================================================================================
 8 Vertex Elements
{ Stream = 0, Offset = 0, Type = D3DDECLTYPE_FLOAT3, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_POSITION, UsageIndex = 0 },
{ Stream = 0, Offset = 12, Type = D3DDECLTYPE_FLOAT3, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_NORMAL, UsageIndex = 0 },
{ Stream = 0, Offset = 24, Type = D3DDECLTYPE_FLOAT3, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_BLENDWEIGHT, UsageIndex = 0 },
{ Stream = 0, Offset = 36, Type = D3DDECLTYPE_D3DCOLOR, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_BLENDINDICES, UsageIndex = 0 },
{ Stream = 0, Offset = 40, Type = D3DDECLTYPE_FLOAT4, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_TEXCOORD, UsageIndex = 0 },
{ Stream = 0, Offset = 56, Type = D3DDECLTYPE_FLOAT4, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_TEXCOORD, UsageIndex = 1 },
{ Stream = 0, Offset = 72, Type = D3DDECLTYPE_FLOAT4, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_TEXCOORD, UsageIndex = 2 },
{ Stream = 0, Offset = 88, Type = D3DDECLTYPE_FLOAT4, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_TEXCOORD, UsageIndex = 3 }
Vertex Shader 0FA01450:
=================================================================================
//--------------------------------------------------------------------------------------
// Automatically generated Vertex Shader.
//
// Copyright (c) Steve Segreto. All rights reserved.
// Shader Flags = 887f9
// Shader Type = Linear-Based Quaternion Skinning
// Shader Quality = PHONG_LIGHTING
//--------------------------------------------------------------------------------------
 
struct DirLight
{
    float4 ambient;
    float4 diffuse;
    float4 spec;
    float3 dirW;
    float4 fogColor;
    float3 lightPosW;
};
 
struct Mtrl
{
    float4 ambient;
    float4 diffuse;
    float4 spec;
    float  specPower;
    float4 emissive;
};
 
//--------------------------------------------------------------------------------------
// Macro defines
//--------------------------------------------------------------------------------------
#define MATRIX_PALETTE_SIZE (13)
 
//--------------------------------------------------------------------------------------
// Global variables
//--------------------------------------------------------------------------------------
uniform extern DirLight gLight;
uniform extern Mtrl gMtrl;
uniform extern float4x4 gWorld;
uniform extern float4x4 gWVP;
uniform extern float4x4 gInvWorld;
uniform extern float4x4 gView;
uniform extern float3 gEyePosW;
uniform extern float gFarClipDist;
uniform extern float gAlphaRef = 0.29f;
uniform extern float gFogRange = 250.0f;
uniform extern float gFogStart = 1.0f;
uniform extern matrix amPalette[ MATRIX_PALETTE_SIZE ];
uniform extern float gNumBones;
 
//----------------------------------------------------------------------------
// Shader body - VS_Skin
//----------------------------------------------------------------------------
 
//
// Define the inputs -- caller must fill this, usually right from the VB.
//
struct VS_SKIN_INPUT
{
    float4 vPos;
    float3 vNor;
    float3 vBlendWeights;
    float4 vBlendIndices;
};
 
//
// Return skinned position and normal
//
struct VS_SKIN_OUTPUT
{
    float4 vPos;
    float3 vNor;
};
 
//
// Call this function to skin VB position and normal.
//
VS_SKIN_OUTPUT VS_Skin( const VS_SKIN_INPUT vInput, int iNumBones )
{
    VS_SKIN_OUTPUT vOutput = (VS_SKIN_OUTPUT) 0;
 
    float fLastWeight = 1.0;
    float afBlendWeights[ 3 ] = (float[ 3 ]) vInput.vBlendWeights;
    int aiIndices[ 4 ]        = (int[ 4 ])   D3DCOLORtoUBYTE4( vInput.vBlendIndices );
 
    for( int iBone = 0; (iBone < 3) && (iBone < iNumBones - 1); ++ iBone )
    {
        float fWeight = afBlendWeights[ iBone ];
        fLastWeight -= fWeight;
        vOutput.vPos.xyz += mul( vInput.vPos, amPalette[ aiIndices[ iBone  ] ] ) * fWeight;
        vOutput.vNor     += mul( float4(vInput.vNor, 0.0f), amPalette[ aiIndices[ iBone  ] ] ) * fWeight;
    }
 
    vOutput.vPos.xyz += mul( vInput.vPos, amPalette[ aiIndices[ iNumBones - 1 ] ] ) * fLastWeight;
    vOutput.vNor     += mul( float4(vInput.vNor, 0.0f), amPalette[ aiIndices[ iNumBones - 1 ] ] ) * fLastWeight;
 
    return vOutput;
}
struct VS_in
{
    float3 posL         : POSITION0;
    float3 normalL      : NORMAL0;
    float3 BlendWeights : BLENDWEIGHT;
    float4 BlendIndices : BLENDINDICES;
    float4 tex0_tex1    : TEXCOORD0;
    float4 tex2_tex3    : TEXCOORD1;
    float4 tex4_tex5    : TEXCOORD2;
    float4 tex6_tex7    : TEXCOORD3;
};
 
struct VS_out
{
    float4 posH         : POSITION0;
    float4 tex0_tex1    : TEXCOORD0;
    float4 tex2_tex3    : TEXCOORD1;
    float4 tex4_tex5    : TEXCOORD2;
    float4 tex6_tex7    : TEXCOORD3;
    float3 normalW      : TEXCOORD4;
    float4 posVS        : TEXCOORD5;
    float4 color        : COLOR0;
    float  fogLerpParam : COLOR1;
};
 
VS_out VS_Scene( VS_in i )
{
    //
    // Zero out our output.
    //
    VS_out o = (VS_out)0;
 
    //
    // Skin VB inputs
    //
    VS_SKIN_INPUT  vsi = { float4( i.posL, 1.0f ), i.normalL, i.BlendWeights, i.BlendIndices };
    VS_SKIN_OUTPUT vso = VS_Skin( vsi, gNumBones );
    i.posL = vso.vPos.xyz;
    i.normalL = vso.vNor;
 
    //
    // Transform normal to world space and pass along
    // to be interpolated by rasterizer.
    //
    o.normalW = mul( gInvWorld, float4(i.normalL, 0) ).xyz;
 
    //
    // Pass along per-vertex color to be interpolated by rasterizer.
    //
    o.color = gMtrl.diffuse;
 
    //
    // Transform position to homogeneous clip space.
    //
    float4 vPositionVS = mul(float4(i.posL, 1.0f), mul(gWorld, gView));
    o.posH = mul(float4(i.posL, 1.0f), gWVP);
 
    //
    // This position will be used to output view space depth.
    //
    o.posVS = vPositionVS;
    o.posVS.z = max(o.posVS.z, 0.0f);
 
    //
    // Pass on texture coordinates to be interpolated in rasterization.
    //
    o.tex0_tex1.xy = i.tex0_tex1.xy;
    o.tex0_tex1.zw = i.tex0_tex1.zw;
    o.tex2_tex3.xy = i.tex2_tex3.xy;
    o.tex2_tex3.zw = i.tex2_tex3.zw;
    o.tex4_tex5.xy = i.tex4_tex5.xy;
    o.tex4_tex5.zw = i.tex4_tex5.zw;
    o.tex6_tex7.xy = i.tex6_tex7.xy;
    o.tex6_tex7.zw = i.tex6_tex7.zw;
 
    //
    // Compute vertex distance from camera in world
    // space for fog calculation.
    //
    float dist = distance(mul(float4(i.posL, 1.0f), gWorld).xyz, gEyePosW);
    o.fogLerpParam = saturate((dist - gFogStart) / gFogRange);
 
    //
    // Done--return the output.
    //
    return o;
}
Pixel Shader 0FA11C80:
=================================================================================
//--------------------------------------------------------------------------------------
// Automatically generated Pixel Shader.
//
// Copyright (c) Steve Segreto. All rights reserved.
// Shader Flags = 827e8
// Shader Type = Linear-Based Quaternion Skinning
// Shader Quality = PHONG_LIGHTING
//--------------------------------------------------------------------------------------
 
struct DirLight
{
    float4 ambient;
    float4 diffuse;
    float4 spec;
    float3 dirW;
    float4 fogColor;
    float3 lightPosW;
};
 
struct Mtrl
{
    float4 ambient;
    float4 diffuse;
    float4 spec;
    float  specPower;
    float4 emissive;
};
 
//--------------------------------------------------------------------------------------
// Macro defines
//--------------------------------------------------------------------------------------
 
//--------------------------------------------------------------------------------------
// Global variables
//--------------------------------------------------------------------------------------
uniform extern DirLight gLight;
uniform extern Mtrl gMtrl;
uniform extern float4x4 gInvWorld;
uniform extern float4x4 gView;
uniform extern float3 gEyePosW;
uniform extern float gFarClipDist;
uniform extern float gAlphaRef = 0.29f;
uniform extern float3 gFogColor;
uniform extern texture gTex0;
 
struct PS_in
{
    float4 tex0_tex1    : TEXCOORD0;
    float4 tex2_tex3    : TEXCOORD1;
    float4 tex4_tex5    : TEXCOORD2;
    float4 tex6_tex7    : TEXCOORD3;
    float3 normalW      : TEXCOORD4;
    float4 posVS        : TEXCOORD5;
    float4 color        : COLOR0;
    float  fogLerpParam : COLOR1;
};
 
struct PS_out
{
    float4 vMaterial    : COLOR0;
    float4 vWorldNrm    : COLOR1;
    float4 vEmittance   : COLOR2;
    float4 vDepth       : COLOR3;
};
 
sampler TexS0 = sampler_state
{
    Texture   = <gTex0>;
    MinFilter = Linear;
    MagFilter = Linear;
    MipFilter = Point;
    AddressU  = Wrap;
    AddressV  = Wrap;
};
 
PS_out PS_Scene( PS_in i )
{
    //
    // Zero out our output.
    //
    PS_out o = (PS_out)0;
 
    //
    // Interpolated normals can become unnormal.
    //
    i.normalW   = normalize(i.normalW);
 
    //
    // VERT_MODE_SRC_IGNORE
    //
    float3 matAmbient  = gMtrl.ambient.rgb;
    float4 matDiffuse  = gMtrl.diffuse;
    float3 matEmissive = gMtrl.emissive.rgb;
 
    //
    // Incoming colors.
    //
    float3 color_stage0 = saturate((matAmbient * gLight.ambient) + matDiffuse + matEmissive);
    o.vEmittance.y = gMtrl.spec.r;
    o.vEmittance.z = gMtrl.specPower;
    float  alpha_stage0 = matDiffuse.a;
 
    //
    // Sample textures.
    //
    float4 color0 = tex2D(TexS0, i.tex0_tex1.xy);
 
    //
    // Apply texturing stages
    //
 
    //
    // Diffuse map.
    //
    float3 color_stage1  = color_stage0 * color0.rgb;
 
    //
    // Final (pre-fog) color.
    //
    float4 texColor = float4( color_stage1.rgb, alpha_stage0 );
 
    //
    // Add fog
    //
    o.vMaterial = texColor;
    o.vEmittance.w = i.fogLerpParam;
    // convert normal to texture space [-1;+1] -> [0;1]
    o.vWorldNrm.xyz = i.normalW * 0.5 + 0.5;
 
    // post-perspective z/w depth
    o.vDepth = i.posVS.z / gFarClipDist;
 
    //
    // Done--return the output.
    //
    return o;
}

In Topic: Memory Leaks on shutdown of QT application

03 January 2016 - 10:42 AM

That's the funny thing about chasing heap leaks, when you examine the code everything looks like its being cleaned up correctly. You need to enlist the help of tools like the CRT debug heap or gflags debug heap to help you understand which allocations are being made without being freed, because this can provoke new thought and cause you to see what you are missing. I just caught a heap leak in my own source code last night that resulted from some scene graph nodes not being parented as I would have expected so even though the de-allocation code was rock solid and correct, there was still a heap leak because for a certain class of model a certain group of scene graph nodes were created as orphans without parents and thus a top-down de-allocation algorithm couldn't find them.


PARTNERS