KaiserJohan

Member Since 08 Apr 2011
Offline Last Active Today, 04:14 AM

Posts I've Made

In Topic: Packing vertex attributes

05 January 2016 - 06:50 AM

Sorry, I totally forgot about input layout and formats. :) Never mind the question!


In Topic: Skeletal animation shader questions

05 January 2016 - 04:33 AM

1) That one is probably a little hardware specific. Constant buffers are geared more towards access patterns that involve all the threads accessing the same piece of data on the same instruction. So gBoneMatrices[7] would be fine, but gBoneMatrices[input.mBoneIndices.z] is not, since the compiler(s) can't resolve what the bone index will be at compile time.

 

Strictly speaking I believe 'tbuffers' are designed more for non-coherent random access, however I've never actually seen anyone use one (except in the Skinning10 sample, referenced here). In practice I'd probably opt for a Buffer<float4> or StructuredBuffer<float4x4> instead. Buffer<float4> would let you compress your matrices to a more compact format if you can get away with it. Also note that you may only need 3 float4s per bone unless you're doing non-standard things: check the last row and see if it's always the same across all bones.

 

2) With a constant buffer you're constrained to 1024 bones (4096 float4s per CB), but with a Buffer/StructuredBuffer/tbuffer you effectively have no limit. If you opt for any of the last three options you can just create the buffer to the exact size for how many bones you have.

 

3) I think it's just a reasonable trade-off between quality, performance and the fact that a single vertex attribute can only store 4 indices. If you want more you add another attribute and can now index 8, which is probably more than necessary. I expect there's legacy reasons for it, but you should/can support as many as you feel necessary.

 

4) Whether branching over a single matrix multiplication is worth it or not is again hardware and situation specific. These days I can certainly imagine that skipping zero-weight bones would be worth it, but the easiest thing to do is try it on 2 (or ideally all 3) vendors' recent hardware and see what results you get with and without the branches.
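
A rough CPU-side sketch of points 1) and 2) above, in C++. It checks whether the last row of every bone matrix is the constant (0, 0, 0, 1), keeps only three float4s per bone if so, and works out how many bones fit in the 4096-float4 constant buffer mentioned above. All names (Float4, PackBones, and so on) are made up for illustration, and "last row" follows the wording of the post; with the transposed matrix convention it would be the last column instead.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Float4    = std::array<float, 4>;
using Matrix4x4 = std::array<Float4, 4>;  // four rows of four floats
using Matrix3x4 = std::array<Float4, 3>;  // first three rows only

// Limit quoted in the post: 4096 float4s per constant buffer.
constexpr std::size_t kFloat4sPerConstantBuffer = 4096;

// How many bones fit if each bone takes `float4sPerBone` float4s.
constexpr std::size_t MaxBonesPerConstantBuffer(std::size_t float4sPerBone)
{
    return kFloat4sPerConstantBuffer / float4sPerBone;
}

// True if every matrix ends in the row (0, 0, 0, 1), so that row
// carries no information and can be reconstructed in the shader.
bool LastRowIsRedundant(const std::vector<Matrix4x4>& bones)
{
    const Float4 expected = {0.0f, 0.0f, 0.0f, 1.0f};
    for (const Matrix4x4& bone : bones)
    {
        if (bone[3] != expected)
            return false;
    }
    return true;
}

// Drop the last row of each matrix: 3 float4s per bone instead of 4.
// Only valid when LastRowIsRedundant(bones) is true.
std::vector<Matrix3x4> PackBones(const std::vector<Matrix4x4>& bones)
{
    assert(LastRowIsRedundant(bones));
    std::vector<Matrix3x4> packed;
    packed.reserve(bones.size());
    for (const Matrix4x4& bone : bones)
    {
        Matrix3x4 m;
        m[0] = bone[0];
        m[1] = bone[1];
        m[2] = bone[2];
        packed.push_back(m);
    }
    return packed;
}
```

With full 4x4 matrices the constant buffer holds 4096 / 4 = 1024 bones; with the packed form it would hold 4096 / 3 = 1365. A Buffer or StructuredBuffer sized to the actual bone count sidesteps the limit either way.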

 

1) and 2): It does indeed seem StructuredBuffer would be better due to random access.

 

Some additional references I found on the subject:

https://developer.nvidia.com/content/redundancy-and-latency-structured-buffer-use

http://www.gamedev.net/topic/624529-structured-buffers-vs-constant-buffers/

 

 

4): see below

 

 

Hi, this is a great list of questions!

 

In my experience constant buffers are the best way in DX9, 10 and 11 to send the updated bone transforms to the shader for vertex skinning. A buffer of about 50 or fewer 4x4 matrices should be able to handle most (> 90%) of the low- to mid-class game engine skinned actors in one draw call (I can confirm this for Dark Age of Camelot, Unreal Tournament, Mortal Kombat Armageddon, Zelda Twilight Princess and TitanQuest skinned actors).

 

As for the 4 bone weights per vertex limit, this does seem to be an arbitrary, legacy-based limit from shader model 2 and the fixed-function pipeline before it, when the blend indices were packed into 4 dwords. I think I saw a research paper that concluded that for bipedal meshes 4 blend indices are sufficient for a certain fidelity of realistic motion, but I can't find the paper at hand to link to it.
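
To make the "4 indices per attribute" point concrete, here is a small C++ sketch of packing four 8-bit bone indices into a single 32-bit value, which is the size of the D3DDECLTYPE_D3DCOLOR element used for BLENDINDICES in the vertex declaration in this post. The function names and the byte order (index 0 in the low byte) are assumptions for illustration only; this is not meant to reproduce the exact D3DCOLOR channel order or what D3DCOLORtoUBYTE4 does.

```cpp
#include <array>
#include <cstdint>

// Pack four 8-bit bone indices into one 32-bit value.
// Byte order is an illustrative assumption: index 0 goes in the low byte.
std::uint32_t PackBoneIndices(const std::array<std::uint8_t, 4>& indices)
{
    return  static_cast<std::uint32_t>(indices[0])
         | (static_cast<std::uint32_t>(indices[1]) << 8)
         | (static_cast<std::uint32_t>(indices[2]) << 16)
         | (static_cast<std::uint32_t>(indices[3]) << 24);
}

// Inverse of PackBoneIndices.
std::array<std::uint8_t, 4> UnpackBoneIndices(std::uint32_t packed)
{
    std::array<std::uint8_t, 4> indices = {{
        static_cast<std::uint8_t>( packed        & 0xFFu),
        static_cast<std::uint8_t>((packed >> 8)  & 0xFFu),
        static_cast<std::uint8_t>((packed >> 16) & 0xFFu),
        static_cast<std::uint8_t>((packed >> 24) & 0xFFu)
    }};
    return indices;
}
```

One byte per index means each index can address at most 256 palette entries, and a second 32-bit attribute would be needed to go from 4 to 8 influences per vertex.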

 

As for the last question of whether you should always blend in 4 bone weights, I actually use another number to tell me how many of the 4 blend indices are needed. Here is some example shader model 2 code for skinning:

CRenderableMesh 0F7B78E8:
=================================================================================
Vertex Declaration: 0B98C860
=================================================================================
 8 Vertex Elements
{ Stream = 0, Offset = 0, Type = D3DDECLTYPE_FLOAT3, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_POSITION, UsageIndex = 0 },
{ Stream = 0, Offset = 12, Type = D3DDECLTYPE_FLOAT3, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_NORMAL, UsageIndex = 0 },
{ Stream = 0, Offset = 24, Type = D3DDECLTYPE_FLOAT3, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_BLENDWEIGHT, UsageIndex = 0 },
{ Stream = 0, Offset = 36, Type = D3DDECLTYPE_D3DCOLOR, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_BLENDINDICES, UsageIndex = 0 },
{ Stream = 0, Offset = 40, Type = D3DDECLTYPE_FLOAT4, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_TEXCOORD, UsageIndex = 0 },
{ Stream = 0, Offset = 56, Type = D3DDECLTYPE_FLOAT4, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_TEXCOORD, UsageIndex = 1 },
{ Stream = 0, Offset = 72, Type = D3DDECLTYPE_FLOAT4, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_TEXCOORD, UsageIndex = 2 },
{ Stream = 0, Offset = 88, Type = D3DDECLTYPE_FLOAT4, Method = D3DDECLMETHOD_DEFAULT, Usage = D3DDECLUSAGE_TEXCOORD, UsageIndex = 3 }
Vertex Shader 0FA01450:
=================================================================================
//--------------------------------------------------------------------------------------
// Automatically generated Vertex Shader.
//
// Copyright (c) Steve Segreto. All rights reserved.
// Shader Flags = 887f9
// Shader Type = Linear-Based Quaternion Skinning
// Shader Quality = PHONG_LIGHTING
//--------------------------------------------------------------------------------------
 
struct DirLight
{
    float4 ambient;
    float4 diffuse;
    float4 spec;
    float3 dirW;
    float4 fogColor;
    float3 lightPosW;
};
 
struct Mtrl
{
    float4 ambient;
    float4 diffuse;
    float4 spec;
    float  specPower;
    float4 emissive;
};
 
//--------------------------------------------------------------------------------------
// Macro defines
//--------------------------------------------------------------------------------------
#define MATRIX_PALETTE_SIZE (13)
 
//--------------------------------------------------------------------------------------
// Global variables
//--------------------------------------------------------------------------------------
uniform extern DirLight gLight;
uniform extern Mtrl gMtrl;
uniform extern float4x4 gWorld;
uniform extern float4x4 gWVP;
uniform extern float4x4 gInvWorld;
uniform extern float4x4 gView;
uniform extern float3 gEyePosW;
uniform extern float gFarClipDist;
uniform extern float gAlphaRef = 0.29f;
uniform extern float gFogRange = 250.0f;
uniform extern float gFogStart = 1.0f;
uniform extern matrix amPalette[ MATRIX_PALETTE_SIZE ];
uniform extern float gNumBones;
 
//----------------------------------------------------------------------------
// Shader body - VS_Skin
//----------------------------------------------------------------------------
 
//
// Define the inputs -- caller must fill this, usually right from the VB.
//
struct VS_SKIN_INPUT
{
    float4 vPos;
    float3 vNor;
    float3 vBlendWeights;
    float4 vBlendIndices;
};
 
//
// Return skinned position and normal
//
struct VS_SKIN_OUTPUT
{
    float4 vPos;
    float3 vNor;
};
 
//
// Call this function to skin VB position and normal.
//
VS_SKIN_OUTPUT VS_Skin( const VS_SKIN_INPUT vInput, int iNumBones )
{
    VS_SKIN_OUTPUT vOutput = (VS_SKIN_OUTPUT) 0;
 
    float fLastWeight = 1.0;
    float afBlendWeights[ 3 ] = (float[ 3 ]) vInput.vBlendWeights;
    int aiIndices[ 4 ]        = (int[ 4 ])   D3DCOLORtoUBYTE4( vInput.vBlendIndices );
 
    for( int iBone = 0; (iBone < 3) && (iBone < iNumBones - 1); ++ iBone )
    {
        float fWeight = afBlendWeights[ iBone ];
        fLastWeight -= fWeight;
        vOutput.vPos.xyz += mul( vInput.vPos, amPalette[ aiIndices[ iBone  ] ] ) * fWeight;
        vOutput.vNor     += mul( float4(vInput.vNor, 0.0f), amPalette[ aiIndices[ iBone  ] ] ) * fWeight;
    }
 
    vOutput.vPos.xyz += mul( vInput.vPos, amPalette[ aiIndices[ iNumBones - 1 ] ] ) * fLastWeight;
    vOutput.vNor     += mul( float4(vInput.vNor, 0.0f), amPalette[ aiIndices[ iNumBones - 1 ] ] ) * fLastWeight;
 
    return vOutput;
}
struct VS_in
{
    float3 posL         : POSITION0;
    float3 normalL      : NORMAL0;
    float3 BlendWeights : BLENDWEIGHT;
    float4 BlendIndices : BLENDINDICES;
    float4 tex0_tex1    : TEXCOORD0;
    float4 tex2_tex3    : TEXCOORD1;
    float4 tex4_tex5    : TEXCOORD2;
    float4 tex6_tex7    : TEXCOORD3;
};
 
struct VS_out
{
    float4 posH         : POSITION0;
    float4 tex0_tex1    : TEXCOORD0;
    float4 tex2_tex3    : TEXCOORD1;
    float4 tex4_tex5    : TEXCOORD2;
    float4 tex6_tex7    : TEXCOORD3;
    float3 normalW      : TEXCOORD4;
    float4 posVS        : TEXCOORD5;
    float4 color        : COLOR0;
    float  fogLerpParam : COLOR1;
};
 
VS_out VS_Scene( VS_in i )
{
    //
    // Zero out our output.
    //
    VS_out o = (VS_out)0;
 
    //
    // Skin VB inputs
    //
    VS_SKIN_INPUT  vsi = { float4( i.posL, 1.0f ), i.normalL, i.BlendWeights, i.BlendIndices };
    VS_SKIN_OUTPUT vso = VS_Skin( vsi, gNumBones );
    i.posL = vso.vPos.xyz;
    i.normalL = vso.vNor;
 
    //
    // Transform normal to world space and pass along
    // to be interpolated by rasterizer.
    //
    o.normalW = mul( gInvWorld, float4(i.normalL, 0) ).xyz;
 
    //
    // Pass along per-vertex color to be interpolated by rasterizer.
    //
    o.color = gMtrl.diffuse;
 
    //
    // Transform position to homogeneous clip space.
    //
    float4 vPositionVS = mul(float4(i.posL, 1.0f), mul(gWorld, gView));
    o.posH = mul(float4(i.posL, 1.0f), gWVP);
 
    //
    // This position will be used to output view space depth.
    //
    o.posVS = vPositionVS;
    o.posVS.z = max(o.posVS.z, 0.0f);
 
    //
    // Pass on texture coordinates to be interpolated in rasterization.
    //
    o.tex0_tex1.xy = i.tex0_tex1.xy;
    o.tex0_tex1.zw = i.tex0_tex1.zw;
    o.tex2_tex3.xy = i.tex2_tex3.xy;
    o.tex2_tex3.zw = i.tex2_tex3.zw;
    o.tex4_tex5.xy = i.tex4_tex5.xy;
    o.tex4_tex5.zw = i.tex4_tex5.zw;
    o.tex6_tex7.xy = i.tex6_tex7.xy;
    o.tex6_tex7.zw = i.tex6_tex7.zw;
 
    //
    // Compute vertex distance from camera in world
    // space for fog calculation.
    //
    float dist = distance(mul(float4(i.posL, 1.0f), gWorld).xyz, gEyePosW);
    o.fogLerpParam = saturate((dist - gFogStart) / gFogRange);
 
    //
    // Done--return the output.
    //
    return o;
}
Pixel Shader 0FA11C80:
=================================================================================
//--------------------------------------------------------------------------------------
// Automatically generated Pixel Shader.
//
// Copyright (c) Steve Segreto. All rights reserved.
// Shader Flags = 827e8
// Shader Type = Linear-Based Quaternion Skinning
// Shader Quality = PHONG_LIGHTING
//--------------------------------------------------------------------------------------
 
struct DirLight
{
    float4 ambient;
    float4 diffuse;
    float4 spec;
    float3 dirW;
    float4 fogColor;
    float3 lightPosW;
};
 
struct Mtrl
{
    float4 ambient;
    float4 diffuse;
    float4 spec;
    float  specPower;
    float4 emissive;
};
 
//--------------------------------------------------------------------------------------
// Macro defines
//--------------------------------------------------------------------------------------
 
//--------------------------------------------------------------------------------------
// Global variables
//--------------------------------------------------------------------------------------
uniform extern DirLight gLight;
uniform extern Mtrl gMtrl;
uniform extern float4x4 gInvWorld;
uniform extern float4x4 gView;
uniform extern float3 gEyePosW;
uniform extern float gFarClipDist;
uniform extern float gAlphaRef = 0.29f;
uniform extern float3 gFogColor;
uniform extern texture gTex0;
 
struct PS_in
{
    float4 tex0_tex1    : TEXCOORD0;
    float4 tex2_tex3    : TEXCOORD1;
    float4 tex4_tex5    : TEXCOORD2;
    float4 tex6_tex7    : TEXCOORD3;
    float3 normalW      : TEXCOORD4;
    float4 posVS        : TEXCOORD5;
    float4 color        : COLOR0;
    float  fogLerpParam : COLOR1;
};
 
struct PS_out
{
    float4 vMaterial    : COLOR0;
    float4 vWorldNrm    : COLOR1;
    float4 vEmittance   : COLOR2;
    float4 vDepth       : COLOR3;
};
 
sampler TexS0 = sampler_state
{
    Texture   = <gTex0>;
    MinFilter = Linear;
    MagFilter = Linear;
    MipFilter = Point;
    AddressU  = Wrap;
    AddressV  = Wrap;
};
 
PS_out PS_Scene( PS_in i )
{
    //
    // Zero out our output.
    //
    PS_out o = (PS_out)0;
 
    //
    // Interpolated normals can become unnormal.
    //
    i.normalW   = normalize(i.normalW);
 
    //
    // VERT_MODE_SRC_IGNORE
    //
    float3 matAmbient  = gMtrl.ambient.rgb;
    float4 matDiffuse  = gMtrl.diffuse;
    float3 matEmissive = gMtrl.emissive.rgb;
 
    //
    // Incoming colors.
    //
    float3 color_stage0 = saturate((matAmbient * gLight.ambient) + matDiffuse + matEmissive);
    o.vEmittance.y = gMtrl.spec.r;
    o.vEmittance.z = gMtrl.specPower;
    float  alpha_stage0 = matDiffuse.a;
 
    //
    // Sample textures.
    //
    float4 color0 = tex2D(TexS0, i.tex0_tex1.xy);
 
    //
    // Apply texturing stages
    //
 
    //
    // Diffuse map.
    //
    float3 color_stage1  = color_stage0 * color0.rgb;
 
    //
    // Final (pre-fog) color.
    //
    float4 texColor = float4( color_stage1.rgb, alpha_stage0 );
 
    //
    // Add fog
    //
    o.vMaterial = texColor;
    o.vEmittance.w = i.fogLerpParam;
    // convert normal to texture space [-1;+1] -> [0;1]
    o.vWorldNrm.xyz = i.normalW * 0.5 + 0.5;
 
    // linear view-space depth, normalized by the far clip distance
    o.vDepth = i.posVS.z / gFarClipDist;
 
    //
    // Done--return the output.
    //
    return o;
}
uniform extern float gNumBones;

// ...

VS_SKIN_OUTPUT VS_Skin( const VS_SKIN_INPUT vInput, int iNumBones )
{
    VS_SKIN_OUTPUT vOutput = (VS_SKIN_OUTPUT) 0;
 
    float fLastWeight = 1.0;
    float afBlendWeights[ 3 ] = (float[ 3 ]) vInput.vBlendWeights;
    int aiIndices[ 4 ]        = (int[ 4 ])   D3DCOLORtoUBYTE4( vInput.vBlendIndices );
 
    for( int iBone = 0; (iBone < 3) && (iBone < iNumBones - 1); ++ iBone )
    {
        float fWeight = afBlendWeights[ iBone ];
        fLastWeight -= fWeight;
        vOutput.vPos.xyz += mul( vInput.vPos, amPalette[ aiIndices[ iBone  ] ] ) * fWeight;
        vOutput.vNor     += mul( float4(vInput.vNor, 0.0f), amPalette[ aiIndices[ iBone  ] ] ) * fWeight;
    }
 
    vOutput.vPos.xyz += mul( vInput.vPos, amPalette[ aiIndices[ iNumBones - 1 ] ] ) * fLastWeight;
    vOutput.vNor     += mul( float4(vInput.vNor, 0.0f), amPalette[ aiIndices[ iNumBones - 1 ] ] ) * fLastWeight;
 
    return vOutput;
}

// ...

VS_SKIN_OUTPUT vso = VS_Skin( vsi, gNumBones );
gNumBones is a uniform, so it is set once per draw call. What if only a few vertices use four bones and the majority use fewer? The shader would still compute up to gNumBones influences for every vertex. Would it make sense to pack the bone count as a vertex attribute instead?
 
Also, would a per-vertex count avoid the branching problem? Or is a fixed iteration count better, since the compiler could then just unroll the loop?
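
To reason about this, here is a CPU-side C++ sketch that mirrors the VS_Skin loop quoted above (three stored weights, the last used weight implied as 1 minus the others). It only skins positions, uses 3x4 bone matrices applied to column vectors, and all type and function names are invented for illustration. It shows that a per-vertex count and a fixed count of 4 give the same result as long as the unused weights are stored as zero, so the choice between them is purely about performance.

```cpp
#include <array>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

// 3x4 affine bone transform applied to column vectors: p' = R * p + t.
using Bone = std::array<std::array<float, 4>, 3>;

// Helper for building a simple test palette: identity rotation plus translation.
Bone MakeTranslation(float tx, float ty, float tz)
{
    Bone b{};
    b[0] = {1.0f, 0.0f, 0.0f, tx};
    b[1] = {0.0f, 1.0f, 0.0f, ty};
    b[2] = {0.0f, 0.0f, 1.0f, tz};
    return b;
}

Vec3 TransformPoint(const Bone& m, const Vec3& p)
{
    return Vec3{
        m[0][0] * p.x + m[0][1] * p.y + m[0][2] * p.z + m[0][3],
        m[1][0] * p.x + m[1][1] * p.y + m[1][2] * p.z + m[1][3],
        m[2][0] * p.x + m[2][1] * p.y + m[2][2] * p.z + m[2][3]
    };
}

struct SkinVertex
{
    Vec3 position;
    std::array<float, 3> weights;  // first three weights, zero-padded when unused
    std::array<int, 4>   indices;  // palette indices; unused slots may hold any valid index
};

// Same structure as VS_Skin: blend numBones - 1 explicit weights,
// then give the remaining weight to bone numBones - 1.
Vec3 SkinPosition(const SkinVertex& v, const std::vector<Bone>& palette, int numBones)
{
    assert(numBones >= 1 && numBones <= 4);
    Vec3 out{0.0f, 0.0f, 0.0f};
    float lastWeight = 1.0f;
    for (int i = 0; i < numBones - 1; ++i)
    {
        const float w = v.weights[i];
        lastWeight -= w;
        const Vec3 p = TransformPoint(palette[v.indices[i]], v.position);
        out.x += p.x * w;
        out.y += p.y * w;
        out.z += p.z * w;
    }
    const Vec3 p = TransformPoint(palette[v.indices[numBones - 1]], v.position);
    out.x += p.x * lastWeight;
    out.y += p.y * lastWeight;
    out.z += p.z * lastWeight;
    return out;
}
```

For a vertex with weights (0.25, 0.75, 0), calling SkinPosition with numBones = 2 or numBones = 4 yields the same position; the extra iterations just multiply by zero. Whether a per-vertex count (a divergent loop) beats a fixed, unrollable count of 4 is the hardware-specific part, so it is probably something to measure rather than assume.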

In Topic: Texturing tools you should know about

28 October 2015 - 03:23 AM

They are aimed at speeding up texturing processes. It has nothing to do with where the models came from. As long as you have a model with UVs and a normal map, it is a good tool.

 

If you look at just about any hard surface model, people spend time going around every edge to create "edge wear". In one of the links I posted you can see how it does this automatically with a button and a slider, creating edge wear on the entire model, which is something you would otherwise have to do manually. You also have the option to select brushes and paint directly onto the 3D model using projection, rather than working in 2D in Photoshop.

 

Typically you have grunge brushes, scratches, etc. that go on the surfaces, and it can place these for you as well.

 

Also, for PBR it supports painting materials, so the result has the proper metal/reflective properties. It also helps that all of this updates in real time, so you get to see your work immediately.

 

So it also supports creating textures entirely from scratch? I've never done it before so I'm curious how it actually works - you just apply color on a mesh in 3D and the end result is a 2D texture?


In Topic: Texturing tools you should know about

27 October 2015 - 05:41 AM

Are these specifically aimed at AAA studios? Any suitable for indie devs?

 

I've never used a texturing tool before. How does it work with models from, say, 3ds Max? What's the workflow like?


In Topic: Recomputing AABBs for animated actors

23 October 2015 - 07:14 AM

 

Precomputing the bounds of each bone, and then applying the bone transforms to that data to get per-frame animated bounds, is pretty standard AFAIK.

It's the common method, but a lot of engines also support (or only support) a fixed AABB, to avoid recomputing it every frame for every animated actor, because that costs a lot.

Animation is one of the heaviest things in a real-time application; one small optimisation here gives a big performance win.

 

 

So you use the fixed/"bind"-pose AABB for frustum culling? Won't that cause pop-in if, say, the animation stretches a character's arm but your AABB is for the bind pose, where the arm is close to the body?

 

Also, does applying a rotation/translation matrix to an AABB return a valid AABB? Or does it put it in some other space?
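
For what it's worth, here is a C++ sketch of one common way to handle that second question. Rotating the eight corners of an AABB generally gives an oriented box, not an axis-aligned one, so the usual approach is to compute a new AABB that encloses the rotated box. The center is transformed like a point, and the half-extents are multiplied by the absolute value of the rotation matrix. The types and names below are invented for illustration, and the transform is assumed to be rotation plus translation only, applied to column vectors.

```cpp
#include <array>
#include <cmath>

struct Vec3 { float x, y, z; };

// Axis-aligned box stored as center and half-extents.
struct AABB
{
    Vec3 center;
    Vec3 halfExtents;
};

// Rigid transform applied to column vectors: p' = r * p + t.
struct Transform
{
    std::array<std::array<float, 3>, 3> r;  // rotation, row-major
    Vec3 t;                                 // translation
};

// Returns the AABB (in the destination space of `m`) that encloses
// the transformed source box. The result is conservative: it can be
// larger than the tightest box around the transformed geometry.
AABB TransformAABB(const AABB& box, const Transform& m)
{
    const float c[3] = {box.center.x, box.center.y, box.center.z};
    const float e[3] = {box.halfExtents.x, box.halfExtents.y, box.halfExtents.z};

    float newC[3] = {m.t.x, m.t.y, m.t.z};
    float newE[3] = {0.0f, 0.0f, 0.0f};

    for (int i = 0; i < 3; ++i)
    {
        for (int j = 0; j < 3; ++j)
        {
            newC[i] += m.r[i][j] * c[j];             // center moves like a point
            newE[i] += std::fabs(m.r[i][j]) * e[j];  // extents use |r|
        }
    }

    return AABB{ {newC[0], newC[1], newC[2]}, {newE[0], newE[1], newE[2]} };
}
```

The result is a valid AABB in whatever space the matrix maps into (for a bone-to-world matrix, world space). For per-bone bounds, each bone's box would be transformed this way and the results merged by taking the component-wise min and max.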

