Compute Shader Output To Stencil Buffer

Started by
7 comments, last by MoeTM 7 years, 9 months ago

Hello,

I want to write into a stencil mask from a compute shader. At the moment I generate a vertex buffer with the locations I want to write to and simply render it to the stencil mask. But I would like to write to the stencil buffer directly, or indirectly via CopyResource. My problem is that I cannot find a depth-stencil format for which I can create a compatible resource format to use with CopyResource.

Thanks for any help.

You definitely can't directly write into a stencil buffer. Depth-stencil buffers can't be used as UAV's or RTV's, so the only way to write to them is through copies or normal depth/stencil operations. I don't think that you can do it through a copy either. Copying to a resource requires using a format from the same family, and none of the formats in the same family as the depth/stencil formats support UAV's or RTV's.

There is the new SV_StencilRef semantic that lets a pixel shader directly specify the stencil ref value, which you could use to write specific values into a stencil buffer. But it's only available in D3D11.3 and D3D12 (Windows 10-only), and I believe it's only supported by AMD hardware at the moment.

Thank you for the fast answer, that was what I expected.

The old school way of copying arbitrary data into the stencil buffer requires one pass per bit that you'd like to copy. If you need to copy an 8 bit value, you can render 8 fullscreen quads. Each quad is rendered with the stencil-state configured to write bit#N into the stencil buffer. The shader on the quad then reads the source data, checks if bit#N is set in the source data, and uses clip(-1) if the bit is not set (which stops the stencil test from writing that bit).
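To make the per-bit idea concrete, here is a minimal CPU-side sketch in C++ that models what the eight passes do. It is not D3D code: the function names are made up for illustration, and it assumes the stencil buffer was cleared to 0, a stencil function of ALWAYS, a pass op of REPLACE, a reference value of 0xFF and a write mask of 1 << N for pass N.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU model of one fullscreen pass with a stencil write mask of
// 1 << bit, stencil function ALWAYS, pass op REPLACE and a reference of 0xFF.
void StencilBitPass(const std::vector<std::uint8_t>& source,
                    std::vector<std::uint8_t>& stencil, int bit)
{
    const std::uint8_t writeMask = static_cast<std::uint8_t>(1u << bit);
    const std::uint8_t stencilRef = 0xFF;
    for (std::size_t i = 0; i < source.size(); ++i) {
        // Shader side: clip(-1) when the bit is not set, so nothing is written.
        if ((source[i] & writeMask) == 0) continue;
        // Output-merger side: only the bits selected by the write mask change.
        stencil[i] = static_cast<std::uint8_t>((stencil[i] & ~writeMask) |
                                               (stencilRef & writeMask));
    }
}

// Copies 8-bit values into a stencil buffer, one pass per bit.
std::vector<std::uint8_t> CopyToStencil(const std::vector<std::uint8_t>& source)
{
    std::vector<std::uint8_t> stencil(source.size(), 0); // assumed clear to 0
    for (int bit = 0; bit < 8; ++bit) StencilBitPass(source, stencil, bit);
    return stencil;
}
```

Calling CopyToStencil on any array of bytes should give back the same bytes, which is the whole point of the trick. The clear to 0 matters: a clipped pixel leaves its bit untouched, so stale bits would survive otherwise.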

This is obviously stupidly inefficient, but I know of one AAA PC port that did actually use this technique, because the original console version contained graphics code that did write to the stencil buffer, and it seemed impossible to port this code to D3D9... until the above work-around was implemented.

and I believe it's only supported by AMD hardware at the moment.

Yep, only AMD GCN and Intel's Skylake / 9th gen hardware support SV_StencilRef.

Alright, as you pointed out, it is faster to render the pixels to be marked into the stencil mask. But I have one problem doing that: with this technique the pixels are marked in two ways. The first is the compute shader, which writes the data needed later to a UAV with the same dimensions as the viewport; the second is the masking through the stencil buffer. My problem is that the two do not match 100%. Here is some code for clarification:

RWTexture2DArray<uint> pointer : register(u1);
AppendStructuredBuffer<float4> pixelMark : register(u3);

...

// Select the projection / array slice first, since z indexes the matrix array below.
int z = LightDirectionDetermineView(depth, g_fLightDirection_BasePow, g_fLightDirection_Far, g_Camera_vProps.w, NUM_PROJECTION);

float4 projected = mul(float4(position, 1.0), mCameraWorldProjectionGS[z]);
projected /= projected.w; // perspective divide
projected = Frustum2TextureSpace(projected);

int2 xy = projected.xy * textureDimensions.xy; // float-to-int conversion truncates

pixelMark.Append(positionIndex); // store the pixel for building the stencil mask later
pointer[int3(xy, z)] = 1; // the texture has a single uint channel

I am not sure that xy is calculated correctly. Here you can see the result: there is one pixel that does not match. The upper image is the data written by the compute shader and the other is the stencil buffer. I render to the stencil buffer using DrawInstancedIndirect.

The frustum coordinates of the error pixel from the projection, after division by w, are: -0.372919900 -0.142580700 0.305269900 1.0

The index the compute shader writes to is therefore 321.0650112 585.0013184, and thus int3(321, 585, 1).

The frustum coordinates are written to pixelMark and later rendered by DrawInstancedIndirect, but the pixel 321 585 1 never gets written. Is that a rounding error, or something in the viewport transform I have not accounted for?
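The numbers in this post can be re-derived on the CPU. The following C++ sketch assumes that Frustum2TextureSpace maps x to x * 0.5 + 0.5 and y to y * -0.5 + 0.5 (its body is not shown in the thread) and that the texture is 1024 x 1024, which is inferred from the quoted values. The helper names are made up for illustration.

```cpp
#include <cmath>

struct PixelCoord { float x; float y; };

// Assumed form of Frustum2TextureSpace followed by the scale to texels:
// x in [-1, 1] maps to [0, width]; y is flipped so that +1 is the top row.
PixelCoord NdcToPixelSpace(float ndcX, float ndcY, float width, float height)
{
    return { (ndcX * 0.5f + 0.5f) * width, (ndcY * -0.5f + 0.5f) * height };
}

// Distance from a coordinate to the nearest pixel boundary (0 = on the boundary).
float DistanceToPixelEdge(float coord)
{
    float frac = coord - std::floor(coord);
    return std::fmin(frac, 1.0f - frac);
}
```

With the values above, NdcToPixelSpace(-0.3729199f, -0.1425807f, 1024.0f, 1024.0f) lands at roughly (321.065, 585.0013), so truncation gives (321, 585). The y coordinate sits only about 0.0013 of a pixel from the boundary between rows 584 and 585. That is close enough that the two paths could plausibly disagree: the compute shader truncates a float, while the rasterizer snaps the point position to fixed-point subpixel precision before deciding which pixel it covers.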

example.png

In your code snippet, what is positionIndex?

In your code snippet, what is positionIndex?

Ah, sorry, it is float4(projected.xyz, float(z)), where projected is taken just after the division by w.

Maybe if you store integers in your AppendBuffer too, both data sets will have the same rounding errors?

Do you apply the AppendBuffer contents to the stencil buffer by rendering point primitives, or something like that?

Maybe if you store integers in your AppendBuffer too, both data sets will have the same rounding errors?

Yes, but how do I render them using integers? Is it possible to output the 'target pixel index' from the vertex shader? I thought I could only output SV_Position, which is in the range -1 to 1 for xy and 0 to 1 for z.

Okay, thanks, I could solve the problem by using integers in the append buffer, thanks for that :) But anyway, is there another way than outputting frustum (-1 to 1) coordinates, e.g. maybe setting the viewport to (-1, 1)?
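One way to go from a stored integer pixel index back to a position the vertex shader can output is to aim at the centre of that pixel. The C++ sketch below shows only the arithmetic, which would translate directly to HLSL. It assumes a viewport that starts at (0, 0) and covers the whole render target, w = 1, and row 0 at the top; PixelCenterToNdc is a made-up name, not a D3D function.

```cpp
struct Ndc { float x; float y; };

// Maps an integer pixel index to the clip-space position of that pixel's
// centre (w = 1 assumed). y is flipped because row 0 is at the top.
Ndc PixelCenterToNdc(int px, int py, int width, int height)
{
    float x = (static_cast<float>(px) + 0.5f) / static_cast<float>(width) * 2.0f - 1.0f;
    float y = 1.0f - (static_cast<float>(py) + 0.5f) / static_cast<float>(height) * 2.0f;
    return { x, y };
}
```

For pixel (321, 585) in a 1024 x 1024 target this gives (-0.3720703125, -0.1435546875). Because the point sits half a pixel away from every boundary, small rounding differences in the viewport transform should no longer be able to move it into a neighbouring pixel.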

Do you apply the AppendBuffer contents to the stencil buffer by rendering point primitives, or something like that?

Yes exactly

And another small question: is it a problem to use an AppendStructuredBuffer to insert the locations, and then during rendering just bind it as a StructuredBuffer and read from the location given by SV_VertexID, instead of using a ConsumeStructuredBuffer and calling Consume?

This topic is closed to new replies.