HLSL - Problem with Global Variables

Started by
3 comments, last by Afritus 7 years, 2 months ago

Hello,
I am new to this forum and new to HLSL. My plan is to implement a compute shader that performs a fast Fourier transform to calculate a displacement map, which I want to use to simulate an ocean surface.
To do this, a 2D-iFFT has to be performed at the end. I want to do it all in one shader call, so I decided to create a few global RWTexture2D variables for calculations across multiple functions.
My problem is the following: The global variable OutputSurface seems to be persistent and synchronized across all functions (like the global variables I know from other languages). The other two global RWTexture2D variables, though (basefield_real and basefield_imag), lose their assigned values once execution leaves the scope in which they were written.
I've detailed the problem in MainComputeShader in the code fragment below, where values stored in basefield_real in the first loop are lost in the second loop (all 0s again).
This is a very odd problem that I haven't encountered before. Did I miss anything crucial? A dirty workaround would be to bind more variables by modifying the C++ code that executes the shader. On the other hand, these are output textures, and I don't think that would be the appropriate way to do it.
I would really like to know what's causing this behavior. Thank you!

Regards

PS: The code


#pragma once

#include "Common.usf"

#define FFTDIM 512

RWTexture2D<float> OutputSurface; // The output surface, bound from outside by Unreal Engine 4
RWTexture2D<float> basefield_real; // A global temp variable for storing real parts
RWTexture2D<float> basefield_imag; // A global temp variable for storing imaginary parts
// Remark: I can't load from a >32 bit value array, so I had to split in real/imag

uint global_seed;
uint RandSeed;
float Time;

// Some functions and FFT implementation
// ...

[numthreads(FFTDIM, 1, 1)]
void MainComputeShader(uint3 ThreadId : SV_DispatchThreadID)
{
    // Set up some variables we are going to need
    uint thread_id = ThreadId.x;

    // Get parameters with:
    global_seed = 0; // Global variable above
    RandSeed = CSVariables.RandSeed; // Global variable above
    Time = CSVariables.Time; // Global variable above

    float buf_test[512];

    // Prepare data for FFT and store in basefield_real and basefield_imag
    for(int i = 0; i < FFTDIM; i++)
    {
        float2 p = float2(i, thread_id);
        float2 ht_spectrum = HtSpectrum(p);
        basefield_real[p] = ht_spectrum.x; // basefield_real[p] contains ht_spectrum.x after this line
        basefield_imag[p] = ht_spectrum.y;
        // Doing OutputSurface[p] = basefield_real[p] here copies data from basefield_real to OutputSurface correctly
    }

    // ROW pass
    //FFTrows(thread_id);

    // Wait for all rows to finish
    //AllMemoryBarrierWithGroupSync();

    // COL pass
    //FFTcols(thread_id);

    // Write displacement data to output (only real parts)
    for(int i = 0; i < FFTDIM; i++)
    {
        float2 p = float2(i, thread_id);
        OutputSurface[p] = basefield_real[p]; // basefield_real[p] is all empty (0) here
    }

}


A dirty workaround would be to bind more variables by modifying the C++ code that executes the shader. On the other hand, these are output textures, and I don't think that would be the appropriate way to do it.
I would really like to know what's causing this behavior.

Are you saying that the C++ code doesn't bind any texture resources to these RWTexture variables? An RWTexture<T> in HLSL is similar to a T* in C++. If it's "unbound", it's basically a NULL pointer.
C++ will crash if you use a NULL pointer, but HLSL just ignores any reads/writes to them (reads return 0 and writes become nops).

If the temp data is local to your thread, you can use a global "static float"; if it's a few KB and local to your thread group, you can use a global "groupshared float ...[n]" array.
If it's shared among all groups in the dispatch or larger than a few dozen KB, then you've got to allocate some memory and bind that memory to your texture variable.
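To make the three options concrete, here is a minimal sketch of how they might look side by side. All names, the group size, and the register slot are hypothetical and not taken from the original shader; in Unreal Engine the binding would normally go through the engine's shader parameters rather than an explicit register.

```hlsl
#define GROUP_SIZE 64   // hypothetical thread-group size

// 1) Per-thread scratch: 'static' makes the global writable and private to each thread.
static float ThreadScratch;

// 2) Per-group scratch: shared by all threads of one group (at most 32 KB in cs_5_0).
groupshared float GroupScratch[GROUP_SIZE];

// 3) Dispatch-wide storage: the application has to create a texture and bind it as a UAV.
//    If nothing is bound, reads return 0 and writes are dropped.
RWTexture2D<float> BoundScratch : register(u0);   // hypothetical name and slot

[numthreads(GROUP_SIZE, 1, 1)]
void StorageSketchCS(uint3 dispatchId : SV_DispatchThreadID,
                     uint3 groupThreadId : SV_GroupThreadID)
{
    // Visible to this thread only.
    ThreadScratch = (float)dispatchId.x;

    // Visible to the whole group once every thread has passed the barrier.
    GroupScratch[groupThreadId.x] = ThreadScratch;
    GroupMemoryBarrierWithGroupSync();

    // Read a neighbour's value from group memory and store it in the bound texture.
    uint neighbour = (groupThreadId.x + 1) % GROUP_SIZE;
    BoundScratch[uint2(dispatchId.x, 0)] = GroupScratch[neighbour];
}
```

Only the third variant survives beyond a single thread group, which is why a full 512x512 scratch field needs memory allocated and bound by the application.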

This makes sense, thank you for the explanation! Unfortunately, I've already tried using things like float something[...][...], but I need a 512x512 buffer, and apparently that is too large at 1 MB. I have now used two different RWTexture2D variables and allocated memory for them in C++. It seems to work now; at least the values of the RWTexture2D variables are no longer lost between scopes. So allocating a couple of megabytes every frame (the final output texture alone will be a float4 of 512x512) isn't a huge problem?
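For reference, the sizes mentioned here work out as follows, assuming 4 bytes per float and the 32 KB group-shared memory limit of compute shader model 5.0:

```latex
\begin{aligned}
\text{one float texture:}\quad  & 512 \times 512 \times 4\,\text{B}  = 1{,}048{,}576\,\text{B} = 1\,\text{MiB} \\
\text{float4 output:}\quad      & 512 \times 512 \times 16\,\text{B} = 4{,}194{,}304\,\text{B} = 4\,\text{MiB} \\
\text{groupshared limit:}\quad  & 32\,\text{KiB} = 32{,}768\,\text{B} \;\Rightarrow\; \frac{1\,\text{MiB}}{32\,\text{KiB}} = 32
\end{aligned}
```

So a single real or imaginary field is already 32 times larger than what fits in group-shared memory, and the two scratch textures plus the float4 output add up to about 6 MiB.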

Regards

You don't have to reallocate the surfaces each frame, just reuse the existing ones that you allocate at the beginning.

Niko Suni

You don't have to reallocate the surfaces each frame, just reuse the existing ones that you allocate at the beginning.

That's exactly what I am trying to do now. I still have to find out how to do it with the Unreal Engine, though. Furthermore, my FFT implementation is still a bit slow and there seems to be a mistake somewhere in my calculations :/
