How to read DXGI_FORMAT_NV12 or DXGI_FORMAT_P010 texture in Direct3D shader

8 comments, last by SoldierOfLight 7 years, 12 months ago

I have a 2D texture in DXGI_FORMAT_NV12 or another chroma-subsampled format (4:2:0 here) in DirectX. How can I read the texture values in a shader to retrieve the corresponding YUV value? The data is laid out planar: the Y component in one plane, and the UV components interleaved in another.


You create two views of your data: one SRV with format R8_UNORM (for example), which lets you read the Y value, and a second SRV with format R8G8_UNORM (again, for example), which lets you read the UV data. The texture coordinates you use in the shader to sample the UV plane should be properly subsampled.

For P010, you'd use 16bpc formats. The padding bits are properly ignored by the sampling hardware.

Example:


Texture2D <float4> Data : register(t0);
Texture2D <float2> DataPlane1 : register(t1);
cbuffer SubsampleData : register(b0)
{
    float g_SubsampleFactorX : packoffset(c0.x);
    float g_SubsampleFactorY : packoffset(c0.y);
};
// DXGI_FORMAT_NV12/NV11/P010/P016/P208/420_OPAQUE -> t0.r = Y, t1.rg = UV (4:2:0, 4:2:2, or 4:1:1 subsampled)
float4 PS2PlaneYUV(float4 input : SV_POSITION) : SV_TARGET
{
    int3 coordsY = int3(input.xy, 0);
    int3 coordsUV = int3(coordsY.xy * float2(g_SubsampleFactorX, g_SubsampleFactorY), 0);
    float4 YUVA = float4(Data.Load(coordsY).r, DataPlane1.Load(coordsUV).rg, 1);
    return YUVToRGB(YUVA);
}

Bonus: My YUVToRGB function:


float4 YUVToRGB(float4 YUVA)
{
    // UNORM [0,1] -> 8-bit code values, then the BT.601 limited-range integer constants.
    float C = (YUVA.r * 255) - 16;
    float D = (YUVA.g * 255) - 128;
    float E = (YUVA.b * 255) - 128;

    float R = clamp(( 298 * C           + 409 * E + 128) / 256, 0, 255);
    float G = clamp(( 298 * C - 100 * D - 208 * E + 128) / 256, 0, 255);
    float B = clamp(( 298 * C + 516 * D           + 128) / 256, 0, 255);

    return float4(R / 255, G / 255, B / 255, YUVA.a);
}
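For reference, here is the same limited-range BT.601 integer conversion applied to plain 8-bit code values, as a C++ sketch (the function name and 0xRRGGBB packing are my own, not from the thread):

```cpp
#include <algorithm>
#include <cstdint>

// BT.601 limited-range YUV -> RGB with the same integer constants as the
// shader above. Inputs are 8-bit code values (Y in [16,235], U/V in [16,240]).
// Returns the result packed as 0xRRGGBB.
uint32_t yuv_to_rgb_601(int y, int u, int v)
{
    int c = y - 16;
    int d = u - 128;
    int e = v - 128;

    auto clamp8 = [](int x) { return std::min(std::max(x, 0), 255); };

    int r = clamp8((298 * c           + 409 * e + 128) / 256);
    int g = clamp8((298 * c - 100 * d - 208 * e + 128) / 256);
    int b = clamp8((298 * c + 516 * d           + 128) / 256);

    return (uint32_t(r) << 16) | (uint32_t(g) << 8) | uint32_t(b);
}
```

With these constants, nominal black (16, 128, 128) maps to (0, 0, 0) and nominal white (235, 128, 128) maps to (255, 255, 255).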

Thank you very much for the code. On a side note, how can I output 10-bit data to the back buffer? The back buffer can be set to R10G10B10A2_UNORM format, but not R16G16B16A16_UNORM format. In the shader the output is float4. If I just output float4, will Direct3D convert it to R10G10B10A2 automatically?

Correct. You should also be able to use R16G16B16A16_FLOAT if you'd like.


My question was that I could not use R16G16B16A16_FLOAT as the back buffer format; it is not supported. But R10G10B10A2_UNORM is OK. In the shader I simply write the output as float4, and the result seems fine. Does DirectX magically convert it to R10G10B10A2_UNORM at the output? I want to make sure the output is 10-bit.

Your question was about R16G16B16A16_UNORM, not _FLOAT. The latter should work.

Either way, yes, floats that are output from the shader will be converted to the render target data format by the hardware.
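To illustrate that conversion: a UNORM render target quantizes the clamped float to n bits, roughly round(v * (2^n - 1)); the exact rounding tolerance is left to the hardware by the D3D spec. A small C++ sketch with an illustrative helper name:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Approximates the hardware's float -> n-bit UNORM conversion:
// clamp to [0, 1], scale by 2^n - 1, round to the nearest integer.
uint32_t float_to_unorm(float v, unsigned bits)
{
    float clamped = std::min(std::max(v, 0.0f), 1.0f);
    float scale = float((1u << bits) - 1);   // 1023 for a 10-bit channel
    return uint32_t(std::lround(clamped * scale));
}
```

So a shader writing 1.0f to an R10G10B10A2_UNORM target stores 1023 in that channel; the 2-bit alpha channel quantizes the same way with a scale of 3.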

Could you share the SRV creation code? If there are two SRVs, how does each specify where the Y and UV data start? I don't see a way in CreateShaderResourceView to describe the offset. In the P010 case, I suppose the Y format should be R16_UNORM, and the UV format R16G16_UNORM.

For 4:2:0, g_SubsampleFactorX=1.f, g_SubsampleFactorY=0.5f, is it correct?

The SRV creation code is not really centralized to the point that I can share it from this particular source.

The two SRVs are distinguishable just by the format of the view. One channel indicates Y, while two channels indicate UV. For 8-bit planar YUV, you're only allowed to specify formats of R8 and R8G8. For 10 or more bits, the only allowed formats are R16 and R16G16.
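That said, a minimal sketch of what the two view creations might look like for NV12 (pDevice, pTexture, and the output SRV pointers are assumed to exist; for P010 you'd swap in R16_UNORM and R16G16_UNORM):

```cpp
// Sketch only: two SRVs over the same NV12 texture, distinguished by format.
D3D11_SHADER_RESOURCE_VIEW_DESC desc = {};
desc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D;
desc.Texture2D.MostDetailedMip = 0;
desc.Texture2D.MipLevels = 1;

desc.Format = DXGI_FORMAT_R8_UNORM;       // one channel -> the Y plane (t0)
pDevice->CreateShaderResourceView(pTexture, &desc, &pSrvY);

desc.Format = DXGI_FORMAT_R8G8_UNORM;     // two channels -> the UV plane (t1)
pDevice->CreateShaderResourceView(pTexture, &desc, &pSrvUV);
```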

For the subsample factors:

The pitch is the same between the Y and UV planes, but the UV plane has two components where Y has only one. So while the byte width of a UV row is the same as a Y row, the logical width (number of pixels) of the UV plane is half.

g_SubsampleFactorX = 0.5f, g_SubsampleFactorY = 0.5f.

https://en.wikipedia.org/wiki/Chroma_subsampling#Sampling_systems_and_ratios has a description. 4:2:0 is half vertical and half horizontal resolution.
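In other words, with both factors at 0.5, each 2×2 block of luma pixels maps to one chroma sample. A C++ sketch of the same coordinate math the shader does, including the truncation that the int conversion performs (names are mine):

```cpp
struct Coord { int x, y; };

// Maps a luma (Y-plane) pixel coordinate to its 4:2:0 chroma coordinate,
// mirroring the shader's multiply-by-subsample-factor and int truncation.
Coord luma_to_chroma_420(int x, int y)
{
    const float kSubsampleX = 0.5f;   // half horizontal resolution
    const float kSubsampleY = 0.5f;   // half vertical resolution
    return Coord{ int(x * kSubsampleX), int(y * kSubsampleY) };
}
```

Luma pixels (6,4), (7,4), (6,5), and (7,5) all land on chroma sample (3,2).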

What I don't understand is: the underlying texture (NV12 or P010) is one object, so if the two SRVs point to the same data, shouldn't the UV load include an offset past the Y plane?

When coordsY = (0, 0), coordsUV will be (0, 0) per your equation. How can that be correct when the UV data actually starts width × height bytes into the layout?

I understand your confusion. The offset is implied by the format of the view. A view whose format has one channel means you're looking at the Y plane (plane 0, at offset zero). When your view format has two channels, it means you're looking at the UV plane (plane 1).

You can also take a look at ID3D11Device3::CreateShaderResourceView1, where the texture2D descs for the view have an additional parameter for a plane index. This is the offset you're looking for, but it can be inferred just based on the view format.
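For intuition about that implied plane offset: in a linear NV12 layout with row pitch `pitch`, the Y plane occupies pitch × height bytes and the interleaved UV plane starts immediately after it. A C++ sketch of the byte addressing (helper names are mine, and real GPU textures are usually swizzled, so this describes only the linear/staging layout):

```cpp
#include <cstddef>

// Byte offset of the luma sample at (x, y) in a linear NV12 buffer.
size_t nv12_y_offset(size_t pitch, size_t x, size_t y)
{
    return y * pitch + x;
}

// Byte offset of the U byte for chroma coordinate (cx, cy); V follows at +1.
// The UV plane begins at pitch * height, which is the "plane 1" offset
// that the two-channel view implies.
size_t nv12_uv_offset(size_t pitch, size_t height, size_t cx, size_t cy)
{
    return pitch * height + cy * pitch + 2 * cx;
}
```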

