

Member Since 21 Jun 2009
Offline Last Active Oct 20 2014 10:21 AM

Topics I've Started

StructuredBuffer and matrix layout

14 September 2014 - 02:40 AM

Not sure if this is a bug or a feature, but apparently this code always generates column-major matrices:


This happens regardless of whether I use D3DCOMPILE_PACK_MATRIX_ROW_MAJOR or #pragma pack_matrix(row_major).

Does anyone have an elegant way to fix it? It's a really irritating 'feature'.
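
One possible workaround, assuming the CPU side keeps its matrices row-major: transpose each matrix before writing it into the buffer, so that the column-major read in the shader sees the intended values. This is only a sketch; `Mat4`, `Transposed` and `PrepareForUpload` are names made up for illustration, and it sidesteps the compiler behavior rather than fixing it.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// 4x4 matrix stored as 16 contiguous floats (hypothetical helper type).
using Mat4 = std::array<float, 16>;

// Returns the transpose: element (r, c) of the input becomes element (c, r).
// A row-major matrix transposed this way has the same memory layout as the
// original matrix stored column-major.
Mat4 Transposed(const Mat4& m)
{
    Mat4 t{};
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            t[c * 4 + r] = m[r * 4 + c];
    return t;
}

// Builds the array that would be copied into the structured buffer.
std::vector<Mat4> PrepareForUpload(const std::vector<Mat4>& rowMajor)
{
    std::vector<Mat4> out;
    out.reserve(rowMajor.size());
    for (const Mat4& m : rowMajor)
        out.push_back(Transposed(m));
    return out;
}
```

If the framework already uses DirectXMath, XMMatrixTranspose does the same job as `Transposed` here.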

HLSL compiler weird performance behavior

08 September 2014 - 08:13 AM

I have this animation demo, which takes quite a few seconds to load. I initially thought that it was related to the model, but apparently the culprit is the skinning VS, and specifically the bone matrices.

The VS looks like this:

cbuffer cbPerMesh : register(b1)
{
	matrix gBones[256];
}

struct VS_IN
{
	float4 PosL : POSITION;
	float3 NormalL : NORMAL;
	float2 TexC : TEXCOORD;
	float4 BonesWeights[2] : BONE_WEIGHTS;
	uint4  BonesIDs[2]    : BONE_IDS;
};

struct VS_OUT
{
	float4 svPos : SV_POSITION;
	float2 TexC : TEXCOORD;
	float3 NormalW : NORMAL;
};

// Blends up to eight bone matrices (two groups of four IDs and weights).
float4x4 CalculateWorldMatrixFromBones(float4 BonesWeights[2], uint4  BonesIDs[2], float4x4 Bones[256])
{
	float4x4 WorldMat = { float4(0, 0, 0, 0), float4(0, 0, 0, 0), float4(0, 0, 0, 0), float4(0, 0, 0, 0) };

	for(int i = 0; i < 2; i++)
	{
		WorldMat += Bones[BonesIDs[i].x] * BonesWeights[i].x;
		WorldMat += Bones[BonesIDs[i].y] * BonesWeights[i].y;
		WorldMat += Bones[BonesIDs[i].z] * BonesWeights[i].z;
		WorldMat += Bones[BonesIDs[i].w] * BonesWeights[i].w;
	}

	return WorldMat;
}

// The entry-point signature is missing from the post; the name VS is assumed.
// gVPMat is declared elsewhere and is not shown in the post.
VS_OUT VS(VS_IN vIn)
{
	VS_OUT vOut;
	float4x4 World = CalculateWorldMatrixFromBones(vIn.BonesWeights, vIn.BonesIDs, gBones);
	vOut.svPos = mul(mul(vIn.PosL, World), gVPMat);
	vOut.TexC = vIn.TexC;
	vOut.NormalW = mul(float4(vIn.NormalL, 0), World).xyz;
	return vOut;
}

As you can see, this shader supports up to 256 bones per model. Compiling this shader takes around 5 seconds on my Core-i7 CPU.

If I reduce the number of supported bones to 16, it compiles almost immediately.

Funny thing is that the generated assembly is exactly the same (except for the CB declaration).


I find it weird - the code doesn't depend on the matrix count in any way.
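
To make that point concrete, here is a CPU-side sketch of the same blend in C++. It is only an illustration of the arithmetic, not the shader itself; `Mat4` and `BlendBones` are hypothetical names. Per vertex it reads at most eight palette entries, whatever the palette size is.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 4x4 matrix stored as 16 contiguous floats (hypothetical helper type).
using Mat4 = std::array<float, 16>;

// Weighted sum of the palette entries selected by ids, mirroring the
// two groups of four IDs and weights used by the vertex shader.
Mat4 BlendBones(const std::array<float, 8>& weights,
                const std::array<std::uint32_t, 8>& ids,
                const std::vector<Mat4>& palette)
{
    Mat4 world{}; // zero-initialized, like WorldMat in the shader
    for (std::size_t i = 0; i < 8; ++i)
    {
        assert(ids[i] < palette.size()); // an ID must index into the palette
        const Mat4& bone = palette[ids[i]];
        for (std::size_t e = 0; e < 16; ++e)
            world[e] += bone[e] * weights[i];
    }
    return world;
}
```

Calling it with a 256-entry palette or a 16-entry palette does the same amount of work, as long as the IDs stay in range.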

Does anyone have any idea what causes the slowdown?

Anttweakbar direction widget

31 August 2014 - 01:44 PM

It seems that by default the widget uses a right-handed coordinate system (RHS; +Z points toward me). That's a problem, since my framework is left-handed (LHS). I couldn't find a way in the documentation to define the widget as LHS.


Before I dive into the ATB code - does anyone know if there's a way to programmatically set the widget to work in LHS?
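
A stopgap on the application side, assuming the two conventions share X and Y and differ only in the sign of Z: mirror the Z component whenever a direction crosses between the widget and the framework. `Dir3` and `FlipHandedness` are made-up names for this sketch, and it does not touch the widget itself.

```cpp
// Three-component direction (hypothetical helper type).
struct Dir3 { float x, y, z; };

// Mirrors the Z axis. Applying it twice returns the original vector,
// so the same function converts RHS to LHS and LHS back to RHS.
Dir3 FlipHandedness(const Dir3& d)
{
    return { d.x, d.y, -d.z };
}
```

The value read from the widget would go through `FlipHandedness` before the framework uses it, and any value the framework writes back would go through it again so the widget stays in sync.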

Best way to render text with DirectX11

19 August 2014 - 11:46 PM

So, there's a bunch of methods for text rendering with DX11. The ones I know of are:

  1. Sprite rendering. Pretty straightforward and well-known. The only con I see is that the dynamic VB has to be recalculated on each draw.
  2. Use D2D. You get a simple, HW-accelerated API, but on Win7 it means creating another DX10.1 device, sharing the back buffer and synchronizing access, which wraps the simple API in not-so-nice-looking code. Not sure if it has a real performance gain over sprites.
  3. Use GDI+ to render text directly to the back buffer. I assume performance will be bad, though I haven't tried it.
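
For option 1, the per-draw VB calculation roughly amounts to the following C++ sketch. It assumes a pre-baked glyph atlas with per-character metrics; `Glyph`, `Vertex` and `BuildTextQuads` are hypothetical names, and uploading the result to the dynamic VB is left out.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Atlas entry for one character: texture rectangle, size in pixels, pen advance.
struct Glyph { float u0, v0, u1, v1, width, height, advance; };

// Screen-space position plus texture coordinate.
struct Vertex { float x, y, u, v; };

// Emits two triangles (six vertices) per glyph, moving the pen to the right.
std::vector<Vertex> BuildTextQuads(const std::string& text,
                                   const std::unordered_map<char, Glyph>& font,
                                   float startX, float startY)
{
    std::vector<Vertex> verts;
    verts.reserve(text.size() * 6);
    float penX = startX;
    for (char ch : text)
    {
        auto it = font.find(ch);
        if (it == font.end())
            continue; // skip characters that are missing from the atlas
        const Glyph& g = it->second;
        const float x0 = penX, y0 = startY;
        const float x1 = penX + g.width, y1 = startY + g.height;
        verts.push_back({x0, y0, g.u0, g.v0});
        verts.push_back({x1, y0, g.u1, g.v0});
        verts.push_back({x0, y1, g.u0, g.v1});
        verts.push_back({x0, y1, g.u0, g.v1});
        verts.push_back({x1, y0, g.u1, g.v0});
        verts.push_back({x1, y1, g.u1, g.v1});
        penX += g.advance;
    }
    return verts;
}
```

The cost is linear in the string length, so for text that rarely changes the vertices could be cached and rebuilt only when the string changes.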


In terms of performance and code complexity, which one is better?

Is there another option I'm unaware of?


(And MS, why did you leave D2D/D3D11 interop out of Win7!?!?)

Engine for 3D sidescrolling game

24 November 2013 - 02:25 AM

Does anyone know a good free/open-source game engine for a 3D side-scrolling game? Something similar to Little Big Planet.