Texture.Load is essentially the same as Texture.Sample with a point filter: it fetches a single texel by integer coordinate with no filtering applied.
If you split the data into three separate textures then you can use Texture.Sample as normal, and whatever filtering you want. It also makes the pixel shader much simpler and faster as you don't need to mess about calculating weird texture coordinates.
It should also look better if you use bilinear filtering on the chroma channels, instead of making them blocky by using point sampling.
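To make the three-texture approach concrete, here's a minimal sketch of the pixel shader it would allow. The texture/register layout, the sampler state, and the BT.601 conversion constants are my assumptions for illustration, not something from the posts above:

```hlsl
// Planar YUV: one single-channel texture per plane.
Texture2D texY : register(t0);
Texture2D texU : register(t1);
Texture2D texV : register(t2);
SamplerState linearSampler : register(s0); // bilinear, so chroma upscales smoothly

float4 PS(float2 uv : TEXCOORD0) : SV_Target
{
    float y = texY.Sample(linearSampler, uv).r;
    float u = texU.Sample(linearSampler, uv).r - 0.5;
    float v = texV.Sample(linearSampler, uv).r - 0.5;

    // Assumed BT.601 full-range YUV -> RGB conversion
    float3 rgb;
    rgb.r = y + 1.402 * v;
    rgb.g = y - 0.344 * u - 0.714 * v;
    rgb.b = y + 1.772 * u;
    return float4(rgb, 1.0);
}
```

Note there's no coordinate arithmetic at all: the hardware handles the chroma upscaling via the sampler, which is the simplification being argued for.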
I'm wondering whether the cost of splitting the data into three textures won't outweigh the benefit of a simpler pixel shader. I will also need shaders that convert packed YUV formats (i.e. Y0 U0 Y1 V0 etc.), and for those it's impossible to cost-effectively split the data before conversion; I might as well do the conversion in software in that case.
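For reference, the "weird texture coordinate" math for a packed format looks something like this sketch for YUY2 (Y0 U0 Y1 V0), assuming the frame is uploaded as a width/2 × height RGBA8 texture where each texel holds one macropixel; the register layout and constants are assumptions, not taken from this thread:

```hlsl
// Packed YUY2: one texel = one macropixel covering two output pixels.
Texture2D texYUY2 : register(t0);

float4 PS(float4 pos : SV_Position) : SV_Target
{
    // Load avoids any filtering bleeding across macropixel boundaries.
    int x = (int)pos.x;
    float4 m = texYUY2.Load(int3(x / 2, (int)pos.y, 0)); // r=Y0 g=U b=Y1 a=V
    float y = (x & 1) ? m.b : m.r;                       // pick Y0 or Y1
    float u = m.g - 0.5;
    float v = m.a - 0.5;

    // Assumed BT.601 full-range YUV -> RGB conversion
    float3 rgb;
    rgb.r = y + 1.402 * v;
    rgb.g = y - 0.344 * u - 0.714 * v;
    rgb.b = y + 1.772 * u;
    return float4(rgb, 1.0);
}
```

The per-pixel even/odd selection is exactly the kind of coordinate juggling that the three-texture approach avoids, but for packed input there's no way around it on the GPU without a separate unpacking pass.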
As a more general solution, I'm thinking of rendering the texture into a render target of exactly the same size, so the YUV-to-RGB conversion runs with point sampling and no filtering artifacts, and then drawing that render target onto the backbuffer, which applies filtering to the final RGB result.