Navier-Stokes Equations Solver - Compute Shader!
After a couple of weeks of engine development and testing, it was really nice to spend a few days on an actual algorithm to implement in D3D11. When I originally heard about the Compute Shader, one of the first techniques that I thought of was the NSE water solver that was popularized a few years ago. Up until this point, I only had support for the water solver in my engine on the CPU. From today onward, I have support for it on the GPU - completely on the GPU! (I know, I know, there isn't any D3D11 GPUs out there yet - but it will run when they are available [rolleyes]).
So without further ado - here is my fully GPU water solver in still action:
I still have some optimizations to make (specifically to increase the parallelism by the way I store the current state of the simulation), but it is pretty cool to see it running. I also uploaded a youtube video as well, where I initialized the state of the water to very wavy at one end of a 32x32 grid of water. You can see the waves propagating over to the near side of the water [cool].
">You can view the video here.
I'll post the details of the algorithm once I have the final version of the shader ready, but here is the current state of the shader for you guys to try out:
//-----------------------------------------------------------------------------// NSE_Solver_CS.hlsl//// The NSE Solver takes as input a shader resource view representing the // current state of the system in a texture2D, and the current flow values in // another texture2D. The calculated output is stored in a texture2D bound to // the compute shader with an unordered access view to allow all of the various// threads to write their output simultaneously.//// Copyright (C) 2003-2009 Jason Zink. All rights reserved.//-----------------------------------------------------------------------------Texture2D<float> InputHeight : register( t0 ); Texture2D InputFlow : register( t1 ); RWTexture2D<float> OutputHeight : register( u0 );RWTexture2D OutputFlow : register( u1 );#define size_x 32#define size_y 32groupshared float loadedheights[size_x * size_y];groupshared float4 loadedflows[size_x * size_y];[numthreads(size_x, size_y, 1)]void CSMAIN( uint3 GroupID : SV_GroupID, uint3 DispatchThreadID : SV_DispatchThreadID, uint3 GroupThreadID : SV_GroupThreadID, uint GroupIndex : SV_GroupIndex ){ //---------------------------------------- // Perform all texture accesses here and load the group shared memory with last frame's height and flow loadedheights[GroupIndex] = InputHeight.Load( uint3(DispatchThreadID.x, DispatchThreadID.y, 0 ) ); loadedflows[GroupIndex] = InputFlow.Load( uint3(DispatchThreadID.x, DispatchThreadID.y, 0 ) ); GroupMemoryBarrierWithGroupSync(); //---------------------------------------- // Update all flow calculations // // Programmer artwork: o--o x // /|\ // / | \ // o o o // w z y // Initialize the flow variable to the last frames flow values float4 NewFlow = float4( 0.0f, 0.0f, 0.0f, 0.0f );//loadedflows[GroupIndex]; // Check for 'not' right edge if ( DispatchThreadID.x < size_x - 1 ) { NewFlow.x = ( loadedheights[GroupIndex+1] - loadedheights[GroupIndex] ); // Check for 'not' bottom edge if ( DispatchThreadID.y < size_y - 1 ) { NewFlow.y = ( loadedheights[(GroupIndex+1) + size_x] - loadedheights[GroupIndex] ); } } // Check for 'not' bottom edge if ( DispatchThreadID.y < size_y - 1 ) { NewFlow.z = ( loadedheights[GroupIndex+size_x] - loadedheights[GroupIndex] ); // Check for 'not' left edge if ( DispatchThreadID.x > 0 ) { NewFlow.w = ( loadedheights[GroupIndex + size_x - 1] - loadedheights[GroupIndex] ); } } const float TIME_STEP = 0.05f; const float PIPE_AREA = 0.0001f; const float GRAVITATION = 10.0f; const float PIPE_LENGTH = 0.2f; const float FLUID_DENSITY = 1.0f; const float COLUMN_AREA = 0.05f; const float DAMPING_FACTOR = 0.9998f; const float fAccelFactor = ( TIME_STEP * PIPE_AREA * GRAVITATION ) / ( PIPE_LENGTH * COLUMN_AREA ); NewFlow = ( NewFlow * fAccelFactor + loadedflows[GroupIndex] ) * DAMPING_FACTOR; GroupMemoryBarrierWithGroupSync(); //---------------------------------------- // Calculate the new height for each column, then store the height and modified flow values. // The updated height values are now stored in the loadedheights shared memory. // Out of the current column... loadedheights[GroupIndex] = loadedheights[GroupIndex] + NewFlow.x + NewFlow.y + NewFlow.z + NewFlow.w; GroupMemoryBarrierWithGroupSync(); // Into the right columns if ( DispatchThreadID.x < size_x - 1 ) loadedheights[GroupIndex+1] = loadedheights[GroupIndex+1] - NewFlow.x; GroupMemoryBarrierWithGroupSync(); // Into the bottom right columns if ( DispatchThreadID.x < size_x - 1 ) if ( DispatchThreadID.y < size_y - 1 ) loadedheights[GroupIndex+size_x+1] = loadedheights[GroupIndex+size_x+1] - NewFlow.y; GroupMemoryBarrierWithGroupSync(); // Into the bottom columns if ( DispatchThreadID.y < size_y - 1 ) loadedheights[GroupIndex+size_x] = loadedheights[GroupIndex+size_x] - NewFlow.z; GroupMemoryBarrierWithGroupSync(); // Into the bottom left columns if ( DispatchThreadID.y < size_y - 1 ) if ( DispatchThreadID.x > 0 ) loadedheights[GroupIndex+size_x-1] = loadedheights[GroupIndex+size_x-1] - NewFlow.w; GroupMemoryBarrierWithGroupSync(); OutputHeight[DispatchThreadID.xy] = loadedheights[GroupIndex]; OutputFlow[DispatchThreadID.xy] = NewFlow;}
There's plenty more D3D11 goodness to come - this compute shader functionality really opens up the doors for trying new algorithms...