freka586

HLSL, non-specified length of array arg?

I am experimenting a bit with some basic image filtering in HLSL. However, I have yet to find a suitable way to pass an array to the shader program without first specifying the number of elements. This might be a strict requirement in order for the HLSL compiler to do its work properly. But I'd still like to check this with the people here, as it would be nice to have the shader completely generic when it comes to filter size...

OK
-----
float filterWeight3x3[9];
float filterWeight5x5[25];

NOT OK
-----
float filterWeights[];

Um, it sounds so simple now that you say it... Will definitely try that one!

And I guess for scheduling etc. it should make no difference if the array is declared with 49 elements, but only 9 are provided and accessed?

It probably shouldn't, but it could depend on what shader model you're targeting, and what the driver does with the shader once it gets it.

Personally I like to declare Effect techniques that set the value of a uniform int which controls the filter size... this causes the Effect compiler to auto-generate a separate pixel shader for each filter width. But this does require you to hard-code your filter widths, which is what you're trying to avoid here.
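
As a rough sketch of how the two ideas in this thread could be combined (an array declared at its maximum size, plus a tap count fixed per technique), something like the following might work. All names here (MAX_TAPS, g_vTaps, ArrayFilter, the technique names) are made up for illustration, PostProcessVS is a stand-in pass-through vertex shader, and the vs_3_0/ps_3_0 targets are an assumption chosen to leave room for the 49-tap case. It has not been tested against any particular compiler version.

```hlsl
#define MAX_TAPS 49   // room for up to a 7x7 kernel

// xy = texture coordinate offset of the tap, z = filter weight.
// Filled in by the application; only the first iTapCount entries are read.
float3 g_vTaps[MAX_TAPS];

sampler2D PointSampler0;   // assumed sampler for the source texture

// Pass-through vertex shader, assuming a full-screen quad already in clip space.
void PostProcessVS(in  float4 in_vPosition  : POSITION,
                   in  float2 in_vTexCoord  : TEXCOORD0,
                   out float4 out_vPosition : POSITION,
                   out float2 out_vTexCoord : TEXCOORD0)
{
    out_vPosition = in_vPosition;
    out_vTexCoord = in_vTexCoord;
}

// iTapCount is a literal at each "compile" site below, so the loop bound
// is known at compile time even though the array is declared at full size.
float4 ArrayFilter(in float2 in_vTexCoord : TEXCOORD0,
                   uniform int iTapCount) : COLOR0
{
    float4 vColor = 0;

    for (int i = 0; i < iTapCount; i++)
    {
        float4 vSample = tex2D(PointSampler0, in_vTexCoord + g_vTaps[i].xy);
        vColor += vSample * g_vTaps[i].z;
    }

    return vColor;
}

technique Filter3x3
{
    pass p0
    {
        VertexShader = compile vs_3_0 PostProcessVS();
        PixelShader = compile ps_3_0 ArrayFilter(9);
    }
}

technique Filter7x7
{
    pass p0
    {
        VertexShader = compile vs_3_0 PostProcessVS();
        PixelShader = compile ps_3_0 ArrayFilter(49);
    }
}
```

Packing the offset and weight into one float3 is a design choice rather than a requirement: as far as I know, each array element occupies a full constant register in Direct3D 9, so separate weight and offset arrays would use twice as many registers.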

Hey, I'm open to all kinds of ideas! Your Effect approach sounds interesting, care to elaborate a bit more? Auto-generation sounds like it might also prevent the code duplication I am looking to avoid...

At least as an initial approach, my filter kernels would all be square (3x3, 5x5, 7x7, perhaps larger), and executed in a single pass.

Any additional comments, suggestions, ideas or experiences on the topic would be greatly appreciated!

Thanks in advance,

Fredrik

Sure, no problem. Let's take a Gaussian blur pixel shader, which you could set up like this:


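// Assumed to be declared elsewhere in the effect file:
//   g_vSourceDimensions - float2, source texture size in texels
//   PointSampler0       - point-filtered sampler for the source texture

static const float SIGMA = 0.5f;

// 1D Gaussian weight for a sample iSamplePoint texels from the centre
float CalcGaussianWeight(int iSamplePoint)
{
    float g = 1.0f / sqrt(2.0f * 3.14159f * SIGMA * SIGMA);
    return (g * exp(-(iSamplePoint * iSamplePoint) / (2.0f * SIGMA * SIGMA)));
}

float4 GaussianBlur ( in float2 in_vTexCoord : TEXCOORD0,
                      uniform int iRadius,
                      uniform bool bVertical ) : COLOR0
{
    float4 vColor = 0;
    float2 vTexCoord = in_vTexCoord;

    // Inclusive upper bound so the kernel is symmetric about the centre texel
    for (int i = -iRadius; i <= iRadius; i++)
    {
        float fWeight = CalcGaussianWeight(i);

        // Step along one axis only, converting the texel offset to texture coordinates
        if (bVertical)
            vTexCoord.y = in_vTexCoord.y + (i / g_vSourceDimensions.y);
        else
            vTexCoord.x = in_vTexCoord.x + (i / g_vSourceDimensions.x);

        float4 vSample = tex2D(PointSampler0, vTexCoord);
        vColor += vSample * fWeight;
    }

    return vColor;
}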




So the pixel shader has two uniform parameters: an int that controls the radius of the blur (and therefore the number of samples), and a bool that controls whether we're doing a vertical or horizontal pass. You can then set values for these in your technique declaration:


// Horizontal pass, radius 2
technique GaussianBlurH2
{
    pass p0
    {
        VertexShader = compile vs_2_0 PostProcessVS();
        PixelShader = compile ps_2_0 GaussianBlur(2, false);
    }
}

// Vertical pass, radius 2
technique GaussianBlurV2
{
    pass p0
    {
        VertexShader = compile vs_2_0 PostProcessVS();
        PixelShader = compile ps_2_0 GaussianBlur(2, true);
    }
}




Then for each of these techniques, the Effect compiler will automatically generate a separate pixel shader with the 2 parameters set as if you hard-coded them yourself (which means the resulting assembly won't use branches or loops or anything like that, since the values are known at compilation time). This also means that the bit of code for generating the Gaussian filter can be entirely evaluated at compilation time.
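
One consequence of the weights being known at compile time is that normalising them should cost nothing at runtime either. The variant below is only a sketch of that idea: the name GaussianBlurNormalized and the global declarations are assumptions, and it relies on the compiler folding fTotal into a constant, which seems likely when iRadius is a literal but is not something verified here.

```hlsl
// Assumed globals, named after the ones used in the post above.
float2 g_vSourceDimensions;   // source texture size in texels
sampler2D PointSampler0;      // sampler for the source texture

static const float SIGMA = 0.5f;

// Unnormalised Gaussian; the constant factor cancels in the division below.
float CalcRawGaussianWeight(int iSamplePoint)
{
    return exp(-(iSamplePoint * iSamplePoint) / (2.0f * SIGMA * SIGMA));
}

float4 GaussianBlurNormalized(in float2 in_vTexCoord : TEXCOORD0,
                              uniform int iRadius,
                              uniform bool bVertical) : COLOR0
{
    // One-texel step along the axis being blurred
    float2 vStep = bVertical ? float2(0.0f, 1.0f / g_vSourceDimensions.y)
                             : float2(1.0f / g_vSourceDimensions.x, 0.0f);

    float4 vColor = 0;
    float fTotal = 0;

    for (int i = -iRadius; i <= iRadius; i++)
    {
        float fWeight = CalcRawGaussianWeight(i);
        fTotal += fWeight;   // depends only on compile-time values
        vColor += fWeight * tex2D(PointSampler0, in_vTexCoord + i * vStep);
    }

    // Dividing by the weight sum keeps overall brightness unchanged
    return vColor / fTotal;
}
```

It would be compiled the same way as the original function, for example with `compile ps_2_0 GaussianBlurNormalized(2, false)` in a technique.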

That's really nice!

I had no idea that this kind of construction could be handled at compile time, and have been avoiding it until now because of branching penalties.

So far I have been focusing mainly on single-pass filtering, to avoid having intermediate render target texture(s). Are there any neat shortcuts here as well that could be of use?

My end goal for this area is to create an edge enhancement filter. The main challenges are performance (the filter must be re-applied every frame, as the result cannot be reused) and texture resolution (typically 2k x 2k to 4k x 4k). There are also memory constraints due to large textures being kept for a long time.

I have considered a multipass approach, such as separate Laplacian of Gaussian and unsharp mask steps, but to avoid render target textures I have for now used a composite filter kernel of the LoG filter.
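
For what it's worth, a single-pass LoG enhancement built on the compile-time parameter trick from earlier in the thread might be sketched as follows. Everything named here is hypothetical (LOG_SIGMA, g_fStrength, EdgeEnhance, the technique names), PostProcessVS is a stand-in pass-through vertex shader, and the vs_3_0/ps_3_0 targets are an assumption made to leave headroom for the 5x5 case. The weight uses the standard LoG shape (r² - 2σ²)/σ⁴ · exp(-r²/(2σ²)) with the constant scale dropped, and the mean is subtracted so that the truncated kernel sums to zero.

```hlsl
float2 g_vSourceDimensions;   // source texture size in texels (assumed global)
float  g_fStrength;           // enhancement amount set by the application (hypothetical)
sampler2D PointSampler0;      // assumed sampler for the source texture

static const float LOG_SIGMA = 1.0f;

// Pass-through vertex shader, assuming a full-screen quad already in clip space.
void PostProcessVS(in  float4 in_vPosition  : POSITION,
                   in  float2 in_vTexCoord  : TEXCOORD0,
                   out float4 out_vPosition : POSITION,
                   out float2 out_vTexCoord : TEXCOORD0)
{
    out_vPosition = in_vPosition;
    out_vTexCoord = in_vTexCoord;
}

// Laplacian of Gaussian at integer texel offset (x, y), up to a constant scale.
float CalcLoGWeight(int x, int y)
{
    float s2 = LOG_SIGMA * LOG_SIGMA;
    float r2 = x * x + y * y;
    return (r2 - 2.0f * s2) / (s2 * s2) * exp(-r2 / (2.0f * s2));
}

float4 EdgeEnhance(in float2 in_vTexCoord : TEXCOORD0,
                   uniform int iRadius) : COLOR0
{
    float2 vTexel = 1.0f / g_vSourceDimensions;
    int x, y;

    // Mean of the truncated kernel; depends only on compile-time values.
    float fSum = 0;
    for (y = -iRadius; y <= iRadius; y++)
        for (x = -iRadius; x <= iRadius; x++)
            fSum += CalcLoGWeight(x, y);
    int iSide = 2 * iRadius + 1;
    float fMean = fSum / (iSide * iSide);

    // Accumulate the zero-mean LoG response in a single pass.
    float4 vLoG = 0;
    for (y = -iRadius; y <= iRadius; y++)
    {
        for (x = -iRadius; x <= iRadius; x++)
        {
            float fWeight = CalcLoGWeight(x, y) - fMean;
            vLoG += fWeight * tex2D(PointSampler0, in_vTexCoord + float2(x, y) * vTexel);
        }
    }

    // The kernel centre is negative, so subtracting the response sharpens edges.
    float4 vCentre = tex2D(PointSampler0, in_vTexCoord);
    return vCentre - g_fStrength * vLoG;
}

technique EdgeEnhance3x3
{
    pass p0
    {
        VertexShader = compile vs_3_0 PostProcessVS();
        PixelShader = compile ps_3_0 EdgeEnhance(1);
    }
}

technique EdgeEnhance5x5
{
    pass p0
    {
        VertexShader = compile vs_3_0 PostProcessVS();
        PixelShader = compile ps_3_0 EdgeEnhance(2);
    }
}
```

Because the result is the original colour minus a scaled LoG response, this gives an unsharp-mask-like effect without an intermediate render target, at the cost of (2 * iRadius + 1)² texture reads per pixel.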
