
Extract/Unpack position from UBYTE4 (D3DDECLTYPE_UBYTE4) ???

This topic is 2659 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Hey guys,

I'm currently trying to build a proxy DLL for Direct3D9 to extract the mesh data from some of my favourite games. I know there are tools out there that can do the same (like 3D Ripper DX), but they don't work for me :(

Anyhow...I've managed to create a simple frame grabber. I can get all the stuff: vertex buffers, index buffers, shaders, textures...

But here comes the problem:
I've checked the vertex declaration and it looks quite strange to me.


VertexDeclaration:
{
--- ELEMENT 0---
Stream: 0
Offset: 0
D3DDECLTYPE: D3DDECLTYPE_UBYTE4
D3DDECLMETHOD: D3DDECLMETHOD_DEFAULT
D3DDECLUSAGE : D3DDECLUSAGE_POSITION

--- ELEMENT 1---
Stream: 0
Offset: 4
D3DDECLTYPE: D3DDECLTYPE_UBYTE4
D3DDECLMETHOD: D3DDECLMETHOD_DEFAULT
D3DDECLUSAGE : D3DDECLUSAGE_TANGENT

--- ELEMENT 2---
Stream: 0
Offset: 8
D3DDECLTYPE: D3DDECLTYPE_UBYTE4
D3DDECLMETHOD: D3DDECLMETHOD_DEFAULT
D3DDECLUSAGE : D3DDECLUSAGE_NORMAL

--- ELEMENT 3---
Stream: 0
Offset: 12
D3DDECLTYPE: D3DDECLTYPE_UBYTE4
D3DDECLMETHOD: D3DDECLMETHOD_DEFAULT
D3DDECLUSAGE : D3DDECLUSAGE_PSIZE

--- ELEMENT 4---
Stream: 0
Offset: 16
D3DDECLTYPE: D3DDECLTYPE_FLOAT16_2
D3DDECLMETHOD: D3DDECLMETHOD_DEFAULT
D3DDECLUSAGE : D3DDECLUSAGE_TEXCOORD

--- ELEMENT 5---
Stream: 0
Offset: 20
D3DDECLTYPE: D3DDECLTYPE_FLOAT16_2
D3DDECLMETHOD: D3DDECLMETHOD_DEFAULT
D3DDECLUSAGE : D3DDECLUSAGE_TEXCOORD
}


What bothers me most is the fact that the normal and position data is stored as D3DDECLTYPE_UBYTE4 and I don't really know how to unpack these values.

I tried something like this, after locking the vertex buffer (pseudo-code!):


BYTE * positionData = pVertex + position_offset;

if ( (D3DDECLTYPE)decl[i].Type == D3DDECLTYPE_UBYTE4 )
{
    // Read the packed 32-bit value once, then mask out one byte per component.
    unsigned int packed = *((unsigned int*)positionData);
    float X = (float) (  packed & 0x000000FF );
    float Y = (float) ( (packed & 0x0000FF00) >> 8 );
    float Z = (float) ( (packed & 0x00FF0000) >> 16 );
    float W = (float) ( (packed & 0xFF000000) >> 24 );
}


After pushing all the values from the vertex and index buffers to an *.OBJ file, I imported them into Maya 2011, but everything seems to be messed up.
I think the problem lies within my unpacking code, but I am a noob with all these byte conversions.

Maybe someone could give me a hint!?

Thank you!
Stefan

Oh yeah, I took that into account...

I tried XYZW and WZYX. I think any other order would be stupid, as it has to be read back within the shader. No good programmer would mess up the order to something like XWZY, huh?!

But the values seem strange in any order. Here is some output:


# OBJ FILE...
g Object

# (( ((unsigned int)*((unsigned int*)positionData)) & 0xFF000000) >> 24)
# (( ((unsigned int)*((unsigned int*)positionData)) & 0x00FF0000) >> 16)
# (( ((unsigned int)*((unsigned int*)positionData)) & 0x0000FF00) >> 8)
# (( ((unsigned int)*((unsigned int*)positionData)) & 0x000000FF) )

# vertex: << UINT: 955219247
v 56 239 125 47
# vertex: << UINT: 963918384
v 57 116 58 48
# vertex: << UINT: 0
v 0 0 0 0
# vertex: << UINT: 16744319
v 0 255 127 127
# vertex: << UINT: 16721
v 0 0 65 81
# vertex: << UINT: 955234098
v 56 239 183 50
# vertex: << UINT: 963918384
v 57 116 58 48
# vertex: << UINT: 0
v 0 0 0 0
# vertex: << UINT: 16744319
v 0 255 127 127
# vertex: << UINT: 16721
v 0 0 65 81
....
..





There must be an error...I think the W value should be the same in every vertex, shouldn't it!?

Maybe some more code:


IDirect3DVertexBuffer9 * pVertexBuffer = NULL;
UINT offset = 0;
UINT stride = 0;
this->pIDirect3DDevice9->GetStreamSource( 0, &pVertexBuffer, &offset, &stride );

if (pVertexBuffer)
{
    BYTE * pVertices = NULL;
    // SizeToLock = 0 locks the entire buffer; sizeof(pVertexBuffer) would only be the size of the pointer.
    pVertexBuffer->Lock( 0, 0, (void**)&pVertices, 0 );

    for(unsigned int i = 0; i < numElements; ++i)
    {
        // Did we reach D3DDECL_END() { 0xFF, 0, D3DDECLTYPE_UNUSED, 0, 0, 0 } ?
        if( decl[i].Stream == 0xff )
            break;

        if ( (D3DDECLUSAGE)decl[i].Usage == D3DDECLUSAGE_POSITION )
        {
            unsigned short position_offset = decl[i].Offset;

            for (unsigned int v = (unsigned int)BaseVertexIndex; v < (unsigned int)NumVertices; v++) // BaseVertexIndex is 0 here!
            {
                // Honour the stream offset reported by GetStreamSource.
                BYTE * pVertex = pVertices + offset + v * stride;
                BYTE * positionData = pVertex + position_offset;

                if ( (D3DDECLTYPE)decl[i].Type == D3DDECLTYPE_UBYTE4 )
                {
                    ...
                }
            }
        }
    }

    pVertexBuffer->Unlock();
    pVertexBuffer->Release();
}
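
The address arithmetic in the inner loop can be isolated in a small helper, which makes it easier to check that the stream offset, the stride and the element offset are all applied. This is only a sketch: `VertexElementPtr` is a hypothetical name, and it assumes the pointer comes from a lock of the whole buffer and that the offset and stride are the values reported by GetStreamSource.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: address of one element of vertex `vertexIndex` in a locked stream.
// lockedBase    : pointer returned by locking the whole vertex buffer
// streamOffset  : OffsetInBytes reported by GetStreamSource
// stride        : vertex stride reported by GetStreamSource
// elementOffset : Offset field of the vertex declaration element
inline const std::uint8_t* VertexElementPtr(const std::uint8_t* lockedBase,
                                            std::size_t streamOffset,
                                            std::size_t stride,
                                            std::size_t vertexIndex,
                                            std::size_t elementOffset)
{
    assert(lockedBase != nullptr && stride > 0 && elementOffset < stride);
    return lockedBase + streamOffset + vertexIndex * stride + elementOffset;
}
```

For an indexed draw call, the vertices actually referenced start at BaseVertexIndex + MinVertexIndex and span NumVertices entries, so a loop from BaseVertexIndex up to NumVertices only covers the right range when both start values are 0.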




Looks like positions are encoded in some kind of fixed-point format. W may be a scale or an offset or both. Take a look at the vertex shader code to see how positions get reconstructed from UBYTE4.

Okay here is the disassembled vertex shader code:


vs_3_0
def c4, 128, 1, 256, 3
dcl_position v0
dcl_tangent v1
dcl_blendindices v2
dcl_blendweight v3
dcl_position o0
dcl_texcoord o1
sge r0.x, v1.y, c4.x
mad r0.x, r0.x, -c4.x, v1.y
mov r0.yzw, c4.xyzy
mul r1, r0.yzyz, c56.w
mul r0.x, r0.x, r1.w
mad r0.z, v1.x, r1.z, r0.x
mul r1, r1, v0
add r0.xy, r1.ywzw, r1.xzzw
add r0.xyz, r0, c56
mul r1, c4.w, v2
mova a0, r1
mul r1, v3.y, c60[a0.y]
mad r1, v3.x, c60[a0.x], r1
mad r1, v3.z, c60[a0.z], r1
mad r1, v3.w, c60[a0.w], r1
dp4 r1.x, r1, r0
mul r2, v3.y, c61[a0.y]
mad r2, v3.x, c61[a0.x], r2
mad r2, v3.z, c61[a0.z], r2
mad r2, v3.w, c61[a0.w], r2
dp4 r1.y, r2, r0
mul r2, v3.y, c62[a0.y]
mad r2, v3.x, c62[a0.x], r2
mad r2, v3.z, c62[a0.z], r2
mad r2, v3.w, c62[a0.w], r2
dp4 r1.z, r2, r0
add r0.xyz, r1, -c9
mov r0.w, c4.y
dp4 o0.x, c0, r0
dp4 o0.y, c1, r0
dp4 o0.z, c2, r0
dp4 r0.x, c3, r0
mul o1, r0.x, c5.w
mov o0.w, r0.x

// approximately 34 instruction slots used



As one can see, dcl_position v0 declares the shader input register for the position.
It is used only once, in the line


mul r1, r1, v0



as a whole register, without a swizzle. I think the shader itself will unpack it automatically to a float4; that's what I read in other posts you can find if you search for "ubyte4".

Yes, UBYTE4 will be converted to a float4 in the shader. The conversion is a simple cast to float, which is what you are doing in your unpacking code. The shader you posted does some funky math to convert this encoded value to a model-space position and eventually to screen space. Try stuffing an identity matrix into c0-c3, capture a frame with PIX, and look at how positions get converted to model space. You can also try the PIX shader debugger.
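
A minimal sketch of that cast on the CPU, reading the element one byte at a time instead of masking a 32-bit integer. `Float4` and `UnpackUByte4` are hypothetical names, not part of Direct3D.

```cpp
#include <cassert>
#include <cstdint>

// Four floats, as the vertex shader would see a D3DDECLTYPE_UBYTE4 element.
struct Float4 { float x, y, z, w; };

// Hypothetical helper: expands the four unsigned bytes at `data` to floats in [0, 255].
// Byte 0 becomes x, byte 1 becomes y, byte 2 becomes z, byte 3 becomes w.
// Reading byte by byte avoids the pointer cast and does not depend on host endianness.
inline Float4 UnpackUByte4(const std::uint8_t* data)
{
    assert(data != nullptr);
    return Float4{ static_cast<float>(data[0]),
                   static_cast<float>(data[1]),
                   static_cast<float>(data[2]),
                   static_cast<float>(data[3]) };
}
```

Fed the bytes of the first logged value, 955219247 (0x38EF7D2F, stored in memory as 2F 7D EF 38), this gives x = 47, y = 125, z = 239, w = 56, which are the same numbers as the "v 56 239 125 47" line written in W Z Y X order.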

OK, let's see if I can write pieces of this shader in HLSL. Might not be totally correct but you should get the gist:


vs_3_0
def c4, 128, 1, 256, 3
dcl_position v0
dcl_tangent v1
dcl_blendindices v2
dcl_blendweight v3
dcl_position o0
dcl_texcoord o1
sge r0.x, v1.y, c4.x // float offset = input.tangent.y >= 128 ? 1.0f : 0.0f;
mad r0.x, r0.x, -c4.x, v1.y // offset = input.tangent.y - offset * 128;
mov r0.yzw, c4.xyzy // float2 scale0 = float2(1, 256); the .yzw write mask picks c4.y, c4.z, c4.y
mul r1, r0.yzyz, c56.w // float4 scale1 = scale0.xyxy * c56.w;
mul r0.x, r0.x, r1.w // offset *= scale1.w;
mad r0.z, v1.x, r1.z, r0.x // float posZ = input.tangent.x * scale1.z + offset;
mul r1, r1, v0 // float4 pos0 = input.position * scale1;
add r0.xy, r1.ywzw, r1.xzzw // pos0.xy = pos0.yw + pos0.xz; pos0.z = posZ; pos0.w = 1;
add r0.xyz, r0, c56 // pos0.xyz += c56.xyz;
mul r1, c4.w, v2 // float4 indices = 3 * input.blendindices;
mova a0, r1
mul r1, v3.y, c60[a0.y] // float4 row0 = input.blendweights.y * c60[indices.y];
mad r1, v3.x, c60[a0.x], r1 // row0 += input.blendweights.x * c60[indices.x];
mad r1, v3.z, c60[a0.z], r1 // row0 += input.blendweights.z * c60[indices.z];
mad r1, v3.w, c60[a0.w], r1 // row0 += input.blendweights.w * c60[indices.w];
dp4 r1.x, r1, r0 // float4 pos1; pos1.x = dot( row0, pos0 );
mul r2, v3.y, c61[a0.y] // float4 row1 = input.blendweights.y * c61[indices.y];
mad r2, v3.x, c61[a0.x], r2 // row1 += input.blendweights.x * c61[indices.x];
mad r2, v3.z, c61[a0.z], r2 // row1 += input.blendweights.z * c61[indices.z];
mad r2, v3.w, c61[a0.w], r2 // row1 += input.blendweights.w * c61[indices.w];
dp4 r1.y, r2, r0 // pos1.y = dot( row1, pos0 );
mul r2, v3.y, c62[a0.y] // float4 row2 = input.blendweights.y * c62[indices.y];
mad r2, v3.x, c62[a0.x], r2 // row2 += input.blendweights.x * c62[indices.x];
mad r2, v3.z, c62[a0.z], r2 // row2 += input.blendweights.z * c62[indices.z];
mad r2, v3.w, c62[a0.w], r2 // row2 += input.blendweights.w * c62[indices.w];
dp4 r1.z, r2, r0 // pos1.z = dot( row2, pos0 );
add r0.xyz, r1, -c9 // pos1.xyz -= c9.xyz;
mov r0.w, c4.y // pos1.w = 1;
dp4 o0.x, c0, r0 // output.position.x = dot( c0, pos1 );
dp4 o0.y, c1, r0 // output.position.y = dot( c1, pos1 );
dp4 o0.z, c2, r0 // output.position.z = dot( c2, pos1 );
dp4 r0.x, c3, r0 // output.position.w = dot( c3, pos1 );
mul o1, r0.x, c5.w
mov o0.w, r0.x



I am sorry, I can't get it to work using PIX. The app I am talking about is Mafia 2 on Steam. When I start PIX on the Mafia2.exe a message pops up saying it could not find "detoured.dll" on my system.
So I downloaded Detours 2.1 Express and tried to build it, but I had no success.
I also tried to download the DLL directly and place it in C:\Windows\System32, but that did not work either (I am running Win7 x64 Ultimate).

So I think I will have to try another way to fix this problem. :(

EDIT: Okay, I moved detoured.dll to C:\Windows\SysWOW64\, but now when I try to use PIX it says: "The process "Mafia2.exe" exited unexpectedly while PIX was analyzing it." It also does not create any debug log. :(

[Edited by - brush87 on September 2, 2010 4:24:13 PM]

Here is the complete shader written in HLSL; it compiles into assembly that looks close to the one you posted. Follow the math that reconstructs the local-space position from the input UBYTE4 position and tangent.


#pragma pack_matrix( row_major )

struct VertexOut
{
float4 Position : POSITION;
float4 W : TEXCOORD;
};

struct VertexIn
{
float4 Position : POSITION;
float4 Tangent : TANGENT;
int4 Blendindices : BLENDINDICES;
float4 Blendweight : BLENDWEIGHT;
};

float3x4 Bones[64] : register( c60 );

float4 c56 : register( c56 );

float4 c5 : register( c5 );
float4 c9 : register( c9 );

float4x4 WorldViewProjection : register( c0 );

VertexOut main( VertexIn IN )
{
VertexOut OUT = (VertexOut)0;

float4 r0, r1;

r0.x = IN.Tangent.y >= 128 ? 1 : 0; // sge r0.x, v1.y, c4.x
r0.x = IN.Tangent.y - r0.x * 128; // mad r0.x, r0.x, -c4.x, v1.y

r0.yzw = float3(1, 256, 1); // mov r0.yzw, c4.xyzy (the write mask picks c4.y, c4.z, c4.y)
r1 = r0.yzyz * c56.w; // mul r1, r0.yzyz, c56.w
r0.x *= r1.w; // mul r0.x, r0.x, r1.w

r0.z = IN.Tangent.x * r1.z + r0.x; // mad r0.z, v1.x, r1.z, r0.x

r1 = r1 * IN.Position; // mul r1, r1, v0
r0.xy = r1.yw + r1.xz; // add r0.xy, r1.ywzw, r1.xzzw
r0.xyz += c56.xyz; // add r0.xyz, r0, c56

float4 pos0 = r0;

int4 indices = IN.Blendindices;

float3x4 blendMatrix = IN.Blendweight.x * Bones[indices.x];
blendMatrix += IN.Blendweight.y * Bones[indices.y];
blendMatrix += IN.Blendweight.z * Bones[indices.z];
blendMatrix += IN.Blendweight.w * Bones[indices.w];

float4 pos1;
pos1.xyz = mul( blendMatrix, pos0 );

pos1.xyz -= c9.xyz;
pos1.w = 1;

OUT.Position = mul( WorldViewProjection, pos1 );

OUT.W.xyzw = OUT.Position.w * c5.w;

return OUT;
}
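
The decode at the top of the shader can also be run on the CPU when exporting the mesh. The C++ sketch below ports only that step, assuming the write mask on `mov r0.yzw, c4.xyzy` selects c4.y, c4.z, c4.y, i.e. (1, 256, 1). Under that reading X and Y are 16-bit integers built from position.xy and position.zw, and Z is a 15-bit integer built from tangent.xy with the top bit of tangent.y stripped. This is an interpretation of the disassembly, not something confirmed in this thread. `Float3` and `DecodeLocalPosition` are hypothetical names, and the c56 values have to be captured from the game at draw time.

```cpp
#include <cassert>
#include <cstdint>

struct Float3 { float x, y, z; };

// Hypothetical CPU-side port of the position decode at the start of the vertex shader.
// position, tangent : raw UBYTE4 bytes of the POSITION and TANGENT elements
// c56               : captured shader constant c56 (xyz = offset, w = scale)
inline Float3 DecodeLocalPosition(const std::uint8_t position[4],
                                  const std::uint8_t tangent[4],
                                  const float c56[4])
{
    assert(position != nullptr && tangent != nullptr && c56 != nullptr);
    const float scale = c56[3];
    // mul r1, r1, v0 / add r0.xy: low byte + 256 * high byte for X and Y
    const float x = (position[0] + 256.0f * position[1]) * scale;
    const float y = (position[2] + 256.0f * position[3]) * scale;
    // sge / mad strip bit 7 of tangent.y; the remaining 7 bits are the high part of Z
    const float z = (tangent[0] + 256.0f * (tangent[1] & 0x7F)) * scale;
    // add r0.xyz, r0, c56
    return Float3{ x + c56[0], y + c56[1], z + c56[2] };
}
```

The result is the position before bone blending and before the c9 subtraction. The shader multiplies by the scale before summing the two bytes, so results may differ from the GPU in the last bits of the float.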

Why would anybody specify a model space position using XYZW?

I think you need to try treating the 4 bytes as XYZ only, spreading however many bits you can across each of X, Y, and Z and using a few bits for the scale or the position of the decimal point. Look up 32-bit fixed point on Google.

There shouldn't be a 'W' stored in those 4 bytes.

Thanks for the shader code!

Could it be that the math in this shader looks that funky,
because it is for some shadow rendering or something like this...?!

Within the single frame I am logging, there are 40 calls to SetRenderTarget(..).

I wish I could use PIX for debugging. :(
But maybe I can dump all necessary inputs to a file,
build an XNA sample application and then run PIX on it.

I mean, I can get all the shader constants, shader code,
and vertex information, you know...it must be possible, huh?
