Passing vertex positions to the Vertex Shader

5 comments, last by MJP 6 years, 8 months ago

I have two questions with regard to passing vertices to the Vertex Shader.

 

1) I have seen code samples (e.g. DirectXTK) using the semantic SV_Position (instead of POSITION/POSITION0) for passing the position of a vertex to the Vertex Shader.

Why would you want to do that?

 

2) A vertex structure could look like this:

struct VertexPosition {
    XMFLOAT3 p;
};

The corresponding input element descriptor looks like this:

const D3D11_INPUT_ELEMENT_DESC desc[] = {
        { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, D3D11_APPEND_ALIGNED_ELEMENT, D3D11_INPUT_PER_VERTEX_DATA, 0 }
};

The GPU vertex shader input structure looks like this:

struct VSInputPosition {
    float4 p : POSITION0;
};

So how is it possible that affine transformations (especially the translation component) work in the following Vertex Shader:

PSInputPosition Transform_VS(VSInputPosition input) {
    PSInputPosition output;
    output.p = mul(input.p,  g_local_to_projection);
    return output;
}

It makes more sense to use the following code:

struct VSInputPosition {
    float3 p : POSITION0;
};

PSInputPosition Transform_VS(VSInputPosition input) {
    PSInputPosition output;
    output.p = mul(float4(input.p, 1.0f),  g_local_to_projection);
    return output;
}

🧙

1 hour ago, matt77hias said:

1) I have seen code samples (e.g. DirectXTK) using the semantic SV_Position (instead of POSITION/POSITION0) for passing the position of a vertex to the Vertex Shader.

You have to use SV_Position if you want your data directed to the rasterizer. Back in D3D9, you had to use POSITION/POSITION0.

https://msdn.microsoft.com/en-us/library/windows/desktop/bb205073(v=vs.85).aspx

1 hour ago, matt77hias said:

So how is it possible that affine transformations (especially the translation component) work in the following Vertex Shader

If the IA is filling in a missing x/y/z component it will insert a 0.0f, but if it's filling in a missing w component it will insert a 1.0f.
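To make that rule concrete, here is a small CPU-side sketch in C++. It only mimics the fill behaviour described above and is not based on any documentation; ExpandToFloat4 is a hypothetical helper written for illustration, not a Direct3D API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a Direct3D API): expands the 1 to 4 floats that a
// vertex element actually provides into a full float4, using the defaults
// described above: missing x/y/z become 0.0f, a missing w becomes 1.0f.
std::array<float, 4> ExpandToFloat4(const float* data, std::size_t count) {
    assert(count >= 1 && count <= 4);
    std::array<float, 4> v = {0.0f, 0.0f, 0.0f, 1.0f};  // default fill values
    for (std::size_t i = 0; i < count; ++i) {
        v[i] = data[i];  // overwrite only the components that were supplied
    }
    return v;
}
```

With three floats in (as with DXGI_FORMAT_R32G32B32_FLOAT) and a float4 declared in the shader, the result has w = 1.0f, which is why the translation row of the matrix still gets applied.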

17 minutes ago, Hodgman said:

You have to use SV_Position if you want your data directed to the rasterizer. Back in D3D9, you had to use POSITION/POSITION0.

Indeed, the last stage (VS, DS or GS) before the RS needs to output an SV_Position, and you always need a VS (I presume). So why not just let the VS output an SV_Position? The only reason I can think of is to reduce the number of different input/output structs in the code by passing the input directly through as output when no operations are required. That would not, however, result in any performance increase or reduction in resource usage, would it?

25 minutes ago, Hodgman said:

If the IA is filling in a missing x/y/z component it will insert a 0.0f, but if it's filling in a missing w component it will insert a 1.0f.

Could you show me where MSDN mentions this explicitly?

🧙

On 8/25/2017 at 4:37 AM, matt77hias said:

Could you show me where MSDN mentions this explicitly?

I took a quick look around, and I wasn't able to find where this is documented. However it's been this way since D3D9, and maybe even D3D8. 

Personally I prefer to always add the w component myself in the shader code instead of relying on the IA to fill it in, because I like being explicit. It also lets the compiler optimize the code a bit better, since it can strip out code where that 1.0 is multiplied with another value.

9 hours ago, MJP said:

Personally I prefer to always add the w component myself in the shader code instead of relying on the IA to fill it in, because I like being explicit. It also lets the compiler optimize the code a bit better, since it can strip out code where that 1.0 is multiplied with another value.

That's a very good point you made. Time to start refactoring some things. (Although, I am a bit sceptical of the HLSL compiler included in Visual Studio. For example: I expect the compiler to eliminate statements such as SomeStruct s = (SomeStruct)0; in case all the fields are initialized in the following statements. HLSL structs could not introduce side-effects anyway.)

9 hours ago, MJP said:

I took a quick look around, and I wasn't able to find where this is documented. However it's been this way since D3D9, and maybe even D3D8. 

Btw: I was also expecting that this was explicitly mentioned somewhere in "Practical Rendering and Computation with Direct3D 11" (but this could also be due to the Direct3D 1 ;) ). Excellent read anyway.

🧙

In general FXC is pretty good at dead-stripping no-ops like multiply-by-1 or adding 0. It will also aggressively remove branches that can be statically evaluated, and strip out dead code that has no effect on the shader outputs. It can do this because the shader language is so simple: it can always see all of the code used for a program (since there's no linking), and so there's no chance of unknown side effects. But to be sure, we can try it out real quick. Here are two versions of a dead-simple vertex shader, both compiled with the latest version of FXC from the latest Windows 10 SDK (10.0.15063.0):


cbuffer VSConstants
{
    row_major float4x4 WorldViewProj;
}

float4 VSMain1(in float4 pos : POSITION) : SV_Position
{
    return mul(pos, WorldViewProj);
}

// vs_5_0
// dcl_globalFlags refactoringAllowed
// dcl_constantbuffer CB0[4], immediateIndexed
// dcl_input v0.xyzw
// dcl_output_siv o0.xyzw, position
// dcl_temps 1
// mul r0.xyzw, v0.yyyy, cb0[1].xyzw
// mad r0.xyzw, v0.xxxx, cb0[0].xyzw, r0.xyzw
// mad r0.xyzw, v0.zzzz, cb0[2].xyzw, r0.xyzw
// mad o0.xyzw, v0.wwww, cb0[3].xyzw, r0.xyzw
// ret

float4 VSMain2(in float3 pos : POSITION) : SV_Position
{
    return mul(float4(pos, 1.0f), WorldViewProj);
}

// vs_5_0
// dcl_globalFlags refactoringAllowed
// dcl_constantbuffer CB0[4], immediateIndexed
// dcl_input v0.xyz
// dcl_output_siv o0.xyzw, position
// dcl_temps 1
// mul r0.xyzw, v0.yyyy, cb0[1].xyzw
// mad r0.xyzw, v0.xxxx, cb0[0].xyzw, r0.xyzw
// mad r0.xyzw, v0.zzzz, cb0[2].xyzw, r0.xyzw
// add o0.xyzw, r0.xyzw, cb0[3].xyzw
// ret

So you can see that the second shader skips multiplying the W component by the 4th row of the matrix, and instead does a normal add. In this case this doesn't actually buy us anything since most GPUs can do a single-cycle MAD, but you can imagine how this would extend to more complex scenarios.
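For anyone who wants to check the equivalence on the CPU, here is a rough C++ sketch of the two variants. The names Float4, Float4x4, MulFull and MulPoint are made up for this example, and the code only mirrors the arithmetic of the two listings; it says nothing about how a GPU schedules it.

```cpp
#include <array>

using Float3 = std::array<float, 3>;
using Float4 = std::array<float, 4>;
using Float4x4 = std::array<Float4, 4>;  // row-major: m[row][column]

// Mirrors VSMain1: all four components, including w, are multiplied by
// their matrix row (one mul followed by three mads in the disassembly).
Float4 MulFull(const Float4& v, const Float4x4& m) {
    Float4 r{};
    for (int c = 0; c < 4; ++c) {
        r[c] = v[0] * m[0][c] + v[1] * m[1][c] + v[2] * m[2][c] + v[3] * m[3][c];
    }
    return r;
}

// Mirrors VSMain2: w is known to be 1 at compile time, so the last
// multiply disappears and row 3 is simply added.
Float4 MulPoint(const Float3& p, const Float4x4& m) {
    Float4 r{};
    for (int c = 0; c < 4; ++c) {
        r[c] = p[0] * m[0][c] + p[1] * m[1][c] + p[2] * m[2][c] + m[3][c];
    }
    return r;
}
```

Feeding MulFull a vector with w = 1 gives the same result as MulPoint with the same xyz, which is the substitution FXC can only make when the 1.0f is visible in the shader source.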

By the way, my co-workers still make fun of me for the Direct3D 1 thing. :)

