
Member Since 18 Sep 2009
Offline Last Active Jun 09 2014 11:15 AM

Posts I've Made

In Topic: Smeared edges pattern using HLSL for color space conversion

09 June 2014 - 08:38 AM

Texture.Load is essentially the same as Texture.Sample when you're using a point filter.


If you split the data into three separate textures then you can use Texture.Sample as normal, and whatever filtering you want. It also makes the pixel shader much simpler and faster as you don't need to mess about calculating weird texture coordinates.


It should also look better if you use bilinear filtering on the chroma channels, instead of making them blocky by using point sampling.
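A sketch of what the three-texture shader could look like, assuming full-range BT.601 coefficients and illustrative texture/sampler names (none of these names are from the original post):

```hlsl
// Y is a full-resolution R8 texture; U and V are half-resolution R8 textures.
// Bilinear filtering on the chroma samplers gives the smooth chroma upscaling
// described above.
Texture2D YTex : register(t0);
Texture2D UTex : register(t1);
Texture2D VTex : register(t2);
SamplerState LinearSampler : register(s0);

float4 PS(float4 pos : SV_Position, float2 uv : TEXCOORD0) : SV_Target
{
    float y = YTex.Sample(LinearSampler, uv).r;
    float u = UTex.Sample(LinearSampler, uv).r - 0.5;
    float v = VTex.Sample(LinearSampler, uv).r - 0.5;

    // Full-range BT.601 YCbCr -> RGB (video-range scaling omitted for brevity)
    float3 rgb;
    rgb.r = y + 1.402 * v;
    rgb.g = y - 0.344 * u - 0.714 * v;
    rgb.b = y + 1.772 * u;
    return float4(saturate(rgb), 1.0);
}
```

Because each plane is a plain texture, there is no coordinate arithmetic left in the shader; the hardware does all the filtering.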

I'm wondering whether the cost of splitting the data into three textures would offset the benefit of simplifying the pixel shader. Also, I will need shaders that convert packed YUV formats (i.e. Y0 U0 Y1 V0 etc.), and it will be impossible to split that data cost-effectively before conversion; I might as well do the conversion in software in that case.


As a more general solution, I'm thinking of rendering the texture into a render target of exactly the same size, so as to avoid any artifacts related to point filtering, and then drawing that render target onto the backbuffer, which will apply filtering to the final result.

In Topic: Smeared edges pattern using HLSL for color space conversion

07 June 2014 - 09:17 PM

By far the simplest option to get this right is to split the data into three R8 textures - one for each channel.


This lets you use hardware texture filtering, which should look better than point filtering. It also means you can easily apply the result to arbitrary geometry.

Someone at Stack Overflow said I should use Texture.Load instead, which doesn't perform any filtering. That indeed sounds more like what I really need for this kind of processing; what do you think?
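For reference, Texture.Load takes integer texel coordinates plus an explicit mip level and performs no filtering at all, which is why it behaves like Texture.Sample with a point-filtered sampler. A minimal sketch, assuming a 1:1 texture-to-screen mapping and an illustrative texture name:

```hlsl
Texture2D<float> YTex : register(t0);

float PS(float4 pos : SV_Position) : SV_Target
{
    // SV_Position.xy holds pixel centres (x + 0.5, y + 0.5); truncating to int
    // gives the texel address when the texture matches the render target size.
    int3 texel = int3(int2(pos.xy), 0);  // .z is the mip level
    return YTex.Load(texel);
}
```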

In Topic: Smeared edges pattern using HLSL for color space conversion

07 June 2014 - 09:10 PM

Ahh, I would suggest changing MaximumAnisotropy in your sampler state from 16 to 1.


EDIT: And possibly the filter to Filter.MinMagMipPoint (not sure if that's what it's called in C#; it's D3D11_FILTER_MIN_MAG_MIP_POINT in C++). If all you want to do is convert an image, you don't really need (or want) texture filtering.


EDIT2: Some context on why I think anisotropic filtering is your problem. Anisotropic filtering exists to remove the aliasing artifacts that occur when a surface is viewed at an oblique angle, and the GPU detects an oblique angle by noticing that one of the texture-coordinate derivatives is large while the other is small. In your case, shifting the .x coordinate by .5 every other row tricks the GPU into thinking your surface is at a really oblique angle, so it applies a lot of anisotropic filtering (essentially blurring) to remove high-frequency data that it expects to show up as aliasing artifacts. But you aren't actually viewing the surface at an oblique angle, so the filtering is unnecessary and counterproductive.
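Another way to sidestep derivative-driven filtering entirely, besides fixing the sampler state, is Texture.SampleLevel, which takes an explicit LOD and therefore ignores the texture-coordinate derivatives altogether. A sketch with illustrative names:

```hlsl
Texture2D YUVTex : register(t0);
SamplerState PointSampler : register(s0);

float4 PS(float4 pos : SV_Position, float2 uv : TEXCOORD0) : SV_Target
{
    // With an explicit LOD of 0, the large per-row derivative produced by the
    // shifted .x coordinate can no longer trigger anisotropic or mip filtering.
    float s = YUVTex.SampleLevel(PointSampler, uv, 0).r;
    return float4(s, s, s, 1.0);
}
```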

You're my hero! That was it, and it's a logical explanation of the phenomenon. :)

In Topic: Smeared edges pattern using HLSL for color space conversion

07 June 2014 - 02:30 PM


The size of the window is set to exactly 1280x720.


First - I'm completely ignorant of WPF, etc., so this may be an inane comment. However, is that the size of the window, or the size of the client area? I.e., does the size of your backbuffer match the size of the presentation area?

Yes it matches exactly, I've verified this by having the shader output a pattern of alternating lines which end up perfectly aligned with physical pixels.


What kind of filtering do you have set up on the YUV texture you pass to your pixel shader? Linear, nearest, max anisotropic? And also, does your YUV texture have mipmaps? My first guess is that if you have mipmaps or anisotropic filtering then you might get some weird results because your texture coordinate derivative will be really large between rows, which will cause a super small mip to be selected.

This is the texture description (in C# using SharpDX):

            var textureDescription = new Texture2DDescription {
                ArraySize = 1,
                BindFlags = BindFlags.ShaderResource,
                CpuAccessFlags = CpuAccessFlags.Write,
                Format = Format.R8_UNorm,
                Width = 1280,
                Height = 720 + 720 / 2,
                MipLevels = 1,
                OptionFlags = ResourceOptionFlags.None,
                SampleDescription = new SampleDescription(1, 0),
                Usage = ResourceUsage.Dynamic
            };

And the sampler state:

            var sampler = new SamplerState(device, new SamplerStateDescription {
                Filter = Filter.ComparisonAnisotropic,
                AddressU = TextureAddressMode.Wrap,
                AddressV = TextureAddressMode.Wrap,
                AddressW = TextureAddressMode.Wrap,
                BorderColor = Color.Black,
                ComparisonFunction = Comparison.Never,
                MaximumAnisotropy = 16,
                MipLodBias = 0,
                MinimumLod = 0,
                MaximumLod = 16,
            });

What do you suggest I change?

In Topic: Fast copying to rendertarget in D3D9

30 July 2013 - 08:27 AM

What you're doing is simply not fast because it stresses bus bandwidth without reaping the benefits of using a GPU. By the way, check whether the surfaces were created with the dynamic flag (and whether your locks use the discard flag); if they weren't, fixing that would help performance a lot.

I tried to pass Usage.Dynamic to CreateRenderTargetEx, however this fails with D3DERR_INVALIDCALL. Is it possible to create a "dynamic" render target, or should I use an intermediate surface, or...?

The intrigue lies in what you mean by "arbitrary image data".
If "arbitrary image data" means (for example) you're using libcairo to render nice & complex 2D graphics and then sending the result to multiple D3D surfaces, then that's not going to be fast, and you're wasting your time trying to use the GPU. Just combine them on the CPU and send the final result to only one D3D surface.
If by "arbitrary image data" you mean loading a few icons or pictures from a file, then you should do the update only once, not every frame.
If by "arbitrary image data" you mean images created through compositing (e.g. static images or rectangles layered on top of each other with different alpha blending operations, such as Photoshop-like blend modes), then your method is not the right way to do it; you should upload the static data once, and then use pixel shaders to do the operations you were doing on the CPU to achieve the same result.
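For the compositing case, a Photoshop-style blend is typically a one-liner in a pixel shader. A sketch in D3D9-style HLSL of the "screen" blend mode, assuming the two layers are bound as textures (the names are illustrative):

```hlsl
sampler2D BaseTex : register(s0);
sampler2D LayerTex : register(s1);

float4 PS(float2 uv : TEXCOORD0) : COLOR
{
    float4 a = tex2D(BaseTex, uv);
    float4 b = tex2D(LayerTex, uv);
    // "Screen" blend: 1 - (1 - a) * (1 - b), computed on the GPU instead of
    // being baked into the surface on the CPU.
    return 1.0 - (1.0 - a) * (1.0 - b);
}
```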
Think sequence of pre-generated images, i.e. video. Every tile renders a different sequence of images. These cannot be computed on the GPU. The composition is done by WPF so it cannot be done beforehand either.
Sending all as one D3D surface looks like an interesting optimisation, but they may all be of different sizes and they're all potentially rather large so I'm not sure there's a way of always efficiently combining them.