Smeared edges pattern using HLSL for color space conversion


I've asked the same question on Stack Overflow, but I doubt I'll get a good answer there.

I'm trying to write a YUV to RGB shader in HLSL. Specifically, it converts the YUV420p format, which consists of an N*M plane of Y values, followed by an (N/2)*(M/2) plane of U values and then an (N/2)*(M/2) plane of V values. For example, this 1280x720 picture:

[image: lnnKWnt.png - the source 1280x720 frame]

looks like this in YUV format when interpreted as an 8-bit, 1280x1080 texture (720 rows of Y, then 180 rows holding the U plane and 180 rows holding the V plane, since each 1280-byte texture row packs two 640-byte chroma rows):

[image: ndYpPuK.png - the same frame's raw YUV data viewed as an 8-bit 1280x1080 texture]

In Direct3D11, I'm loading this as a Texture2D with format R8_UNorm and dimensions 1280x1080. The tricky part is reconstituting the U and V planes, because, as you can see, each texture row holds two chroma rows side by side: one in the left half and the next in the right half (this is simply due to how Direct3D views the data as a texture; in memory the lines are one after the other). In the shader, I do this like so:


    struct PS_IN
    {
        float4 pos : SV_POSITION;
        float2 tex : TEXCOORD;
    };

    Texture2D picture;
    SamplerState pictureSampler;
    
    float4 PS(PS_IN input) : SV_Target
    {
        int pixelCoord = input.tex.y * 720;       // output pixel row (0..719)
        bool evenRow = (pixelCoord / 2) % 2 == 0; // true when the chroma row for this pixel is even
    
        //(...) illustrating U values:
        float ux = input.tex.x / 2.0;               // chroma rows are half the texture width
        float uy = input.tex.y / 6.0 + (4.0 / 6.0); // the U plane occupies rows 720..900 of the 1080-row texture
        if (!evenRow)
        {
            ux += 0.5;
        }
    
        float u = picture.Sample(pictureSampler, float2(ux, uy)).r;
        u *= 255.0;
    
        // for debug purposes, display just the U values
        float4 rgb;
        rgb.r = u;//y + (1.402 * (v - 128.0));
        rgb.g = u;//y - (0.344 * (u - 128.0)) - (0.714 * (v - 128.0));
        rgb.b = u;//y + (1.772 * (u - 128.0));
        rgb.a = 255.0;
    
        return rgb / 255.0;
    }

However, for some strange reason, this seems to produce a weird horizontal pattern of smeared edges:

[image: YeJ1C56.png - shader output showing the smeared horizontal pattern]

Note that if I put either true or false as the condition (so that ux is either always or never incremented by 0.5), the pattern doesn't appear, although of course we get half the resolution. Also note that I made a basically copy-paste C# translation of the HLSL code and it doesn't produce this effect:

[image: rYkP1gE.png - output of the C# reference translation, without the artifact]

FWIW, I'm using WPF to create the window and initialize Direct3D using its HWND via WindowInteropHelper. The size of the window is set to exactly 1280x720.


What kind of filtering do you have set up on the YUV texture you pass to your pixel shader? Linear, nearest, max anisotropic? And also, does your YUV texture have mipmaps? My first guess is that if you have mipmaps or anisotropic filtering then you might get some weird results because your texture coordinate derivative will be really large between rows, which will cause a super small mip to be selected.


The size of the window is set to exactly 1280x720.

First - I'm completely ignorant of WPF, etc., so this may be an inane comment. However, is that the size of the window, or the size of the client area? I.e., does the size of your backbuffer match the size of the presentation area?



The size of the window is set to exactly 1280x720.

First - I'm completely ignorant of WPF, etc., so this may be an inane comment. However, is that the size of the window, or the size of the client area? I.e., does the size of your backbuffer match the size of the presentation area?

Yes, it matches exactly; I've verified this by having the shader output a pattern of alternating lines, which end up perfectly aligned with physical pixels.

What kind of filtering do you have set up on the YUV texture you pass to your pixel shader? Linear, nearest, max anisotropic? And also, does your YUV texture have mipmaps? My first guess is that if you have mipmaps or anisotropic filtering then you might get some weird results because your texture coordinate derivative will be really large between rows, which will cause a super small mip to be selected.

This is the texture description (in C# using SharpDX):


            var textureDescription = new Texture2DDescription {
                ArraySize = 1,
                BindFlags = BindFlags.ShaderResource,
                CpuAccessFlags = CpuAccessFlags.Write,
                Format = Format.R8_UNorm,
                Width = 1280,
                Height = 720 + 720 / 2, // 1080 rows: the Y plane plus the two packed chroma planes
                MipLevels = 1,
                OptionFlags = ResourceOptionFlags.None,
                SampleDescription = new SampleDescription(1, 0),
                Usage = ResourceUsage.Dynamic
            };

And the sampler state:


            var sampler = new SamplerState(device, new SamplerStateDescription {
                Filter = Filter.ComparisonAnisotropic,
                AddressU = TextureAddressMode.Wrap,
                AddressV = TextureAddressMode.Wrap,
                AddressW = TextureAddressMode.Wrap,
                BorderColor = Color.Black,
                ComparisonFunction = Comparison.Never,
                MaximumAnisotropy = 16,
                MipLodBias = 0,
                MinimumLod = 0,
                MaximumLod = 16,
            });

What do you suggest I change?

Ahh, I would suggest changing MaximumAnisotropy in your sampler state from 16 to 1.

EDIT: And possibly the filter to Filter.MinMagMipPoint (not sure if that's what it's called in C#, it's D3D11_FILTER_MIN_MAG_MIP_POINT in C++), if all you want to do is convert an image then you don't really need (or want) texture filtering.

EDIT2: Some context on why I think anisotropic filtering is your problem: anisotropic filtering is used to filter out aliasing artifacts that occur when a surface is viewed at an oblique angle. The GPU detects an oblique angle when one of the texture coordinate derivatives is large and the other is small. In your case, shifting the .x coordinate by 0.5 every other row tricks the GPU into thinking the surface is at a really oblique angle, so it applies heavy anisotropic filtering (essentially blurring) to remove high-frequency data that it expects to show up as aliasing artifacts. But you aren't actually viewing the surface at an oblique angle, so the filtering is unnecessary and counterproductive.

By far the simplest option to get this right is to split the data into three R8 textures - one for each channel.

This lets you use hardware texture filtering, which should look better than point filtering. It also means you can easily apply the result to arbitrary geometry.
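
For illustration, here's a minimal sketch of what the pixel shader could look like with the three planes bound as separate R8_UNorm textures. The texture and sampler names and register assignments are just placeholders, and the conversion mirrors the coefficients from the commented-out lines in the first post, rescaled to the 0..1 range that R8_UNorm sampling returns:


    // Sketch: Y at full resolution, U and V at half resolution, all three
    // addressed with the same normalized coordinates. Names/registers are
    // placeholders - bind whatever your app actually uses.
    struct PS_IN // same vertex output as in the first post
    {
        float4 pos : SV_POSITION;
        float2 tex : TEXCOORD;
    };

    Texture2D yTex : register(t0);
    Texture2D uTex : register(t1);
    Texture2D vTex : register(t2);
    SamplerState linearSampler : register(s0); // e.g. MIN_MAG_MIP_LINEAR

    float4 PS(PS_IN input) : SV_Target
    {
        // R8_UNorm samples come back in 0..1.
        float y = yTex.Sample(linearSampler, input.tex).r;
        float u = uTex.Sample(linearSampler, input.tex).r - 0.5; // 128/255 approximated as 0.5
        float v = vTex.Sample(linearSampler, input.tex).r - 0.5;

        // Same coefficients as the commented-out conversion in the first post,
        // just working in 0..1 instead of 0..255.
        float4 rgb;
        rgb.r = y + 1.402 * v;
        rgb.g = y - 0.344 * u - 0.714 * v;
        rgb.b = y + 1.772 * u;
        rgb.a = 1.0;
        return saturate(rgb);
    }

The app side would then create three R8_UNorm textures (1280x720 for Y, 640x360 each for U and V) instead of the one packed 1280x1080 texture.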

Ahh, I would suggest changing MaximumAnisotropy in your sampler state from 16 to 1.

EDIT: And possibly the filter to Filter.MinMagMipPoint (not sure if that's what it's called in C#, it's D3D11_FILTER_MIN_MAG_MIP_POINT in C++), if all you want to do is convert an image then you don't really need (or want) texture filtering.

EDIT2: Some context on why I think anisotropic filtering is your problem: anisotropic filtering is used to filter out aliasing artifacts that occur when a surface is viewed at an oblique angle. The GPU detects an oblique angle when one of the texture coordinate derivatives is large and the other is small. In your case, shifting the .x coordinate by 0.5 every other row tricks the GPU into thinking the surface is at a really oblique angle, so it applies heavy anisotropic filtering (essentially blurring) to remove high-frequency data that it expects to show up as aliasing artifacts. But you aren't actually viewing the surface at an oblique angle, so the filtering is unnecessary and counterproductive.

You're my hero! That was it, and it's a logical explanation of the phenomenon. :)

By far the simplest option to get this right is to split the data into three R8 textures - one for each channel.

This lets you use hardware texture filtering, which should look better than point filtering. It also means you can easily apply the result to arbitrary geometry.

Someone on Stack Overflow said I should use Texture.Load instead, which doesn't perform any filtering. That indeed sounds more like what I really need for this kind of processing; what do you think?

Texture.Load is essentially the same as Texture.Sample when you're using a point filter.

If you split the data into three separate textures then you can use Texture.Sample as normal, and whatever filtering you want. It also makes the pixel shader much simpler and faster as you don't need to mess about calculating weird texture coordinates.

It should also look better if you use bilinear filtering on the chroma channels, instead of making them blocky by using point sampling.
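
For reference, if you did stick with the single packed 1280x1080 texture and went the Texture.Load route, fetching a U value with integer texel coordinates might look roughly like this (a sketch hard-coded to the 1280x720 layout from the first post, reusing the picture texture and PS_IN input declared there):


    // Sketch: Load takes integer texel coordinates plus a mip level and does
    // no filtering at all, so no sampler state is involved.
    int cx = (int)(input.tex.x * 640.0);   // chroma column, 0..639
    int cy = (int)(input.tex.y * 360.0);   // chroma row, 0..359
    // Two 640-texel chroma rows share each 1280-wide texture row:
    // even chroma rows land in the left half, odd rows in the right half.
    int tx = cx + 640 * (cy % 2);
    int ty = 720 + cy / 2;                 // the U plane starts at texture row 720
    float u = picture.Load(int3(tx, ty, 0)).r;

The V plane works the same way, except that it starts at texture row 900.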


It should also look better if you use bilinear filtering on the chroma channels, instead of making them blocky by using point sampling.

This is a definite advantage over the point sampling I suggested and the Texture.Load that the person on Stack Overflow suggested.

