neroziros

DirectCompute Sorting Problem


Hi all! I'm posting because I don't understand a problem I ran into while porting Microsoft's DirectCompute sorting example to my particle system. Link: http://code.msdn.microsoft.com/windowsdesktop/DirectCompute-Basic-Win32-7d5a7408

 

When I tried to adapt the shader example, I got the error "X3020: type mismatch between conditional values" on this line:

Particle result = ((shared_data[GI & ~j].position.z <= shared_data[GI | j].position.z) == (bool)(g_iLevelMask & DTid.x)) ? shared_data[GI ^ j] : shared_data[GI];

But when I changed it to this:

 

Particle result;
if ((shared_data[GI & ~j].position.z <= shared_data[GI | j].position.z) == (bool)(g_iLevelMask & DTid.x))
  result = shared_data[GI ^ j];
else
  result = shared_data[GI];

 

The error was gone. I thought the two forms were equivalent. Am I misunderstanding ternary operators, or is this something shader related? Thanks in advance for your time.

 

Below is my current code, in case anyone else is interested:

 

//--------------------------------------------------------------------------------------
// File: ComputeShaderSort11.hlsl
//
// This file contains the compute shaders to perform GPU sorting using DirectX 11.
// 
// Copyright (c) Microsoft Corporation. All rights reserved.
//--------------------------------------------------------------------------------------


#define BITONIC_BLOCK_SIZE 512


#define TRANSPOSE_BLOCK_SIZE 16


// Particle Structure (relevant for the simulation)
struct Particle
{
    float3 position;
    float3 velocity;
    float  time;
};


//--------------------------------------------------------------------------------------
// Constant Buffers
//--------------------------------------------------------------------------------------
cbuffer CB : register(b0)
{
    unsigned int g_iLevel;
    unsigned int g_iLevelMask;
    unsigned int g_iWidth;
    unsigned int g_iHeight;
};


//--------------------------------------------------------------------------------------
// Structured Buffers
//--------------------------------------------------------------------------------------
StructuredBuffer<Particle> Input : register(t0);
RWStructuredBuffer<Particle> Data : register(u0);


//--------------------------------------------------------------------------------------
// Bitonic Sort Compute Shader
//--------------------------------------------------------------------------------------
groupshared Particle shared_data[BITONIC_BLOCK_SIZE];


[numthreads(BITONIC_BLOCK_SIZE, 1, 1)]
void BitonicSort(uint3 Gid : SV_GroupID,
                 uint3 DTid : SV_DispatchThreadID,
                 uint3 GTid : SV_GroupThreadID,
                 uint GI : SV_GroupIndex)
{
    // Load shared data
    shared_data[GI] = Data[DTid.x];
    GroupMemoryBarrierWithGroupSync();

    // Sort the shared data
    for (unsigned int j = g_iLevel >> 1; j > 0; j >>= 1)
    {
        // Original ternary form, which fails with X3020:
        //Particle result = ((shared_data[GI & ~j].position.z <= shared_data[GI | j].position.z) == (bool)(g_iLevelMask & DTid.x)) ? shared_data[GI ^ j] : shared_data[GI];
        Particle result;
        if ((shared_data[GI & ~j].position.z <= shared_data[GI | j].position.z) == (bool)(g_iLevelMask & DTid.x))
            result = shared_data[GI ^ j];
        else
            result = shared_data[GI];

        // Wait until every thread has read its inputs before overwriting shared memory
        GroupMemoryBarrierWithGroupSync();
        shared_data[GI] = result;
        GroupMemoryBarrierWithGroupSync();
    }

    // Store shared data
    Data[DTid.x] = shared_data[GI];
}


//--------------------------------------------------------------------------------------
// Matrix Transpose Compute Shader
//--------------------------------------------------------------------------------------
groupshared Particle transpose_shared_data[TRANSPOSE_BLOCK_SIZE * TRANSPOSE_BLOCK_SIZE];


[numthreads(TRANSPOSE_BLOCK_SIZE, TRANSPOSE_BLOCK_SIZE, 1)]
void MatrixTranspose(uint3 Gid : SV_GroupID,
                     uint3 DTid : SV_DispatchThreadID,
                     uint3 GTid : SV_GroupThreadID,
                     uint GI : SV_GroupIndex)
{
    transpose_shared_data[GI] = Input[DTid.y * g_iWidth + DTid.x];
    GroupMemoryBarrierWithGroupSync();
    uint2 XY = DTid.yx - GTid.yx + GTid.xy;
    Data[XY.y * g_iHeight + XY.x] = transpose_shared_data[GTid.x * TRANSPOSE_BLOCK_SIZE + GTid.y];
}

 

Edited by neroziros


I don't know why the compiler is so picky about this. I ran into the same problem with that sample when I was hacking together my own 3D fluids variation.

 

http://www.shaderplay.com/sandbox/superfluids2/superfluids2.html

 

I did something like this:

 

bool Test = ((shared_data[GI & ~j].position.z <= shared_data[GI | j].position.z) == (bool)(g_iLevelMask & DTid.x));

Particle result = shared_data[GI ^ j];

 

GroupMemoryBarrierWithGroupSync();
if ( Test )
    shared_data[GI] = result;
GroupMemoryBarrierWithGroupSync();
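
One plausible explanation for the pickiness (this is my understanding of the legacy fxc compiler, not something the sample documents): the ?: operator is defined for scalar, vector and matrix operands, and does not accept a user-defined struct such as Particle. The original sample sorts plain unsigned ints, so it never hits this. If you want to keep the ternary form, a minimal sketch is to select member by member. SelectParticle is a hypothetical helper name, not part of the sample:

```hlsl
// Particle layout as in the original post.
struct Particle
{
    float3 position;
    float3 velocity;
    float  time;
};

// Hypothetical helper: returns a when takeA is true, otherwise b.
// The ternary is applied to each member (float3 / float) rather than
// to the struct as a whole, which is the form the compiler rejected.
Particle SelectParticle(bool takeA, Particle a, Particle b)
{
    Particle r;
    r.position = takeA ? a.position : b.position;
    r.velocity = takeA ? a.velocity : b.velocity;
    r.time     = takeA ? a.time     : b.time;
    return r;
}
```

Inside the sort loop this would be used as Particle result = SelectParticle(Test, shared_data[GI ^ j], shared_data[GI]); which should behave the same as the if/else version.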

 

I'd also suggest that you don't sort your actual particle buffer, but a simple buffer that holds only the particle index and depth. That leaves less memory to shift around and access in the 20 or so bitonic sort passes that get called. I just integrated this sort into another of my GPU particle systems and it works great.
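
A rough sketch of what sorting keys instead of particles could look like, adapted from the sort shader posted above. The Keys buffer, the uint2 layout (x = depth bits, y = particle index) and the BitonicSortKeys name are my own assumptions for illustration:

```hlsl
#define BITONIC_BLOCK_SIZE 512

cbuffer CB : register(b0)
{
    uint g_iLevel;
    uint g_iLevelMask;
    uint g_iWidth;
    uint g_iHeight;
};

// Assumed key layout: x = asuint(depth), y = index into the particle buffer.
RWStructuredBuffer<uint2> Keys : register(u0);

groupshared uint2 shared_keys[BITONIC_BLOCK_SIZE];

[numthreads(BITONIC_BLOCK_SIZE, 1, 1)]
void BitonicSortKeys(uint3 DTid : SV_DispatchThreadID,
                     uint GI : SV_GroupIndex)
{
    // Load this thread's key into shared memory
    shared_keys[GI] = Keys[DTid.x];
    GroupMemoryBarrierWithGroupSync();

    for (uint j = g_iLevel >> 1; j > 0; j >>= 1)
    {
        // Compare the depths of the two elements in this thread's pair
        float depthA = asfloat(shared_keys[GI & ~j].x);
        float depthB = asfloat(shared_keys[GI | j].x);
        bool takePartner = (depthA <= depthB) == (bool)(g_iLevelMask & DTid.x);

        // uint2 is a vector type, so the ternary is accepted here
        uint2 result = takePartner ? shared_keys[GI ^ j] : shared_keys[GI];

        // Wait until every thread has read its inputs before overwriting
        GroupMemoryBarrierWithGroupSync();
        shared_keys[GI] = result;
        GroupMemoryBarrierWithGroupSync();
    }

    // Write the sorted key back
    Keys[DTid.x] = shared_keys[GI];
}
```

The transpose shader would need the same change from Particle to uint2, and the particles themselves stay where they are: at draw time you read the particle through Keys[i].y.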

 

Also, it looks like you're sorting on your world-space particle z. You might want to run a pre-compute pass that calculates the actual distance to the camera, "length( cameraPos - particlePos )", and sort on that.
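
Something along these lines would do for the pre-compute pass. It is only a sketch: the CameraCB constant buffer, g_cameraPos, g_particleCount, the Keys buffer and the thread group size of 256 are all assumptions of mine, not names from the sample:

```hlsl
// Particle layout as in the original post.
struct Particle
{
    float3 position;
    float3 velocity;
    float  time;
};

// Hypothetical constant buffer holding the camera position and particle count.
cbuffer CameraCB : register(b1)
{
    float3 g_cameraPos;
    uint   g_particleCount;
};

StructuredBuffer<Particle> Particles : register(t0);

// Assumed key layout: x = asuint(depth), y = index into the particle buffer.
RWStructuredBuffer<uint2> Keys : register(u0);

[numthreads(256, 1, 1)]
void BuildSortKeys(uint3 DTid : SV_DispatchThreadID)
{
    uint i = DTid.x;

    // Skip threads past the end of the particle buffer.
    // Padding the key buffer up to the sort's element count is not handled here.
    if (i >= g_particleCount)
        return;

    // Distance from the camera to the particle, as suggested above
    float depth = length(g_cameraPos - Particles[i].position);

    // Store the raw float bits next to the particle index
    Keys[i] = uint2(asuint(depth), i);
}
```

The sort then only has to compare the x component (converted back with asfloat) and carry the index along with it.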

 

Hope this helps, but I'm sure you figured it out by now.

Edited by gfxbean
