Million Particle System

24 comments, last by neroziros 10 years ago

Hi Jason,

Yes, I am reading it here:


// Get current amount of particles in the InputBuffer
RefreshCurrentParticleAmount();

And the function itself is:


// Refresh the current amount of particles in the system
void Particleclass::RefreshCurrentParticleAmount()
{
m_D3D->GetDeviceContext()->CopyStructureCount(ParticleCountSTBuffer, 0, InputState);
D3D11_MAPPED_SUBRESOURCE _subresource;
// Transfer the current amount of particles to a local variable
m_D3D->GetDeviceContext()->Map(ParticleCountSTBuffer, 0, D3D11_MAP_READ, 0, &_subresource);
unsigned int* pCount = (unsigned int*)(_subresource.pData);
mNumElements = 0;
for (int i = 0; i < 8; i++)
    mNumElements += pCount[i];
m_D3D->GetDeviceContext()->Unmap(ParticleCountSTBuffer, 0);
}

Never mind, the swap is actually correct; my assumption was wrong. If the UpdateCS isn't executed, the particles aren't consumed, so what I am actually doing is splitting the particle insertion between the two buffers. (I checked: with the buffer swap, the particle count goes up at half the speed it does without the swap.)

On the other hand, I can finally pinpoint the real problem with my update! As initially guessed, the constant buffer is not being mapped properly. To test that, I replaced the line "if (myID < NumParticles)" in the compute shader with "if (myID < 8)", and the particles were properly consumed. This confirms my guess: if NumParticles is not properly mapped, its value is probably 0 inside the shader, so no thread ever enters the conditional and no particle is ever consumed.

Update Compute Shader


//-----------------------------------------------------------------------------
// Compute shader (Hieroglyph based)
//-----------------------------------------------------------------------------
// Particle Structure (relevant for the simulation)
struct Particle 
{
float3 position;
float3 velocity;
float  time;
};


cbuffer SimulationParameters : register(b0)
{
float4 TimeFactors;
float4 EmitterLocation;
float ParticlesLifeTime;
uint NumParticles;
};


// Compute shaders buffers (Entry Buffer and Output buffer)
AppendStructuredBuffer<Particle> NewSimulationState : register(u0);
ConsumeStructuredBuffer<Particle>   CurrentSimulationState  : register(u1);


[numthreads(512, 1, 1)]
void CSMAIN(uint3 DispatchThreadID : SV_DispatchThreadID)
{
// Compute a flat thread index to decide whether this thread should run.
uint myID = DispatchThreadID.x + DispatchThreadID.y * 512 + DispatchThreadID.z * 512 * 512;


// Only threads whose index is below the live particle count may consume a particle
if (myID < NumParticles)
{
// Get the current particle
Particle p = CurrentSimulationState.Consume();


// Calculate the new position, accounting for the new velocity value
// over the current time step.
p.position += p.velocity * TimeFactors.x;


// Update the life time left for the particle.
p.time = p.time + TimeFactors.x;


// Only keep the particle alive IF its life time has not expired
if (p.time < ParticlesLifeTime)
{
NewSimulationState.Append(p);
}
}
}

And the code related to the mapping process:

Map Buffer Structure


// For aligning to float4 boundaries
#define Float4Align __declspec(align(16))

struct SimulationParameters 
{
Float4Align float TimeFactors;
Float4Align D3DXVECTOR3 EmitterLocation; 
Float4Align float ParticlesLifeTime;
Float4Align UINT NumParticles; 
};

Buffer Creation


// Constant buffers (UPDATE)
m_desc.ByteWidth = sizeof(SimulationParameters);
m_desc.Usage = D3D11_USAGE_DYNAMIC;
m_desc.BindFlags = D3D11_BIND_CONSTANT_BUFFER;
m_desc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
m_desc.MiscFlags = 0;
m_desc.StructureByteStride = 0;
result = device->CreateBuffer(&m_desc, NULL, &cs_pUpdateCB);
if (FAILED(result))return false;

Mapping


// Create this frame data
SimulationParameters s;
s.EmitterLocation = position;
s.ParticlesLifeTime = lifeTime;
s.TimeFactors = elapsedSeconds/1000;
s.NumParticles = currentParticles;


// Pass the new frame data to the shader
D3D11_MAPPED_SUBRESOURCE UMapped;
hr = m_D3D->GetDeviceContext()->Map(cs_pUpdateCB, 0, D3D11_MAP_WRITE_DISCARD, 0, &UMapped);
memcpy_s(UMapped.pData, sizeof(SimulationParameters), &s, sizeof(s));
m_D3D->GetDeviceContext()->Unmap(cs_pUpdateCB, 0);


// Send the updated constant buffers to the CS shader
m_D3D->GetDeviceContext()->CSSetConstantBuffers(0, 1, &cs_pUpdateCB);


// Set append and consume buffer
UINT counters[2] = { (UINT)-1, (UINT)-1 }; // -1 keeps each UAV's current hidden counter
ID3D11UnorderedAccessView* uavs[2] = { OutputState, InputState };
m_D3D->GetDeviceContext()->CSSetUnorderedAccessViews(0, 2, uavs, counters);


// Dispatch the particle updater
m_D3D->GetDeviceContext()->Dispatch(maxParticles / 512, 1, 1);

As always, any help will be greatly appreciated!

Cheers

When you map the buffer to get the number of particles, do you see the correct numbers if you step through the RefreshCurrentParticleAmount() method? If so, then the issue is in how you are copying the value to the constant buffer for use.

One other thing to check - are you enabling the D3D11 debug device? This would emit debug messages if you try doing things like mapping buffers that aren't accessible to the CPU...

Thanks! As you said, it was a value copy error. It seems that the constant buffer contents must follow 16-byte alignment in order to work properly. Here are my new buffer structures.


// SPAWN
struct SpawnConstantBuffer
{
D3DXVECTOR4 EmmiterPosAndLife;
D3DXVECTOR4 randomVector;
};
// UPDATE
struct SimulationParameters
{
D3DXVECTOR4 TimeFactors;
D3DXVECTOR3 EmitterLocation;
UINT NumParticles;
};

and their GPU counterpart:

CSINSERT


cbuffer ParticleInsertParameters : register(b0)
{
float4 EmmiterPosAndLife; // xyz -> pos ; w -> lifetime
float4 RandomVector;
};

CSUPDATE


cbuffer SimulationParameters : register(b0)
{
float4 TimeFactors;
float3 EmitterLocation;
uint NumParticles;
};

I am now looking for an efficient way to spawn more than 8 particles per frame.

Cheers!

Hi all, it seems that I have hit another wall. I can't see the particles at all :(

I think I have properly initialized all the related buffers, but even though the current particle count behaves correctly, I can't see any particles on screen. As a side note, I tried using the Graphics Diagnostics tool (VS 2013) and RenderDoc (Crytek) to debug the shaders, but the app crashes when I try to capture a frame. Both tools work properly if I don't execute the particle system, so the problem is probably caused by the compute shaders.

As always, any help is greatly appreciated.

Related Structures


// Render
struct Transforms
{
D3DMATRIX WorldViewMatrix;
D3DMATRIX ProjMatrix;
};

Render Function


// Command the particle system calculations (update and render)
void Particleclass::Draw(float elapsedMiliSeconds,D3DXMATRIX worldMatrix, D3DXMATRIX viewMatrix,D3DXMATRIX projectionMatrix)
{
// Check if the PS has started 
if (State == ParticleSystemState::UNSTARTED)
return;


// Timescale 
float elapsedSeconds = elapsedMiliSeconds / 1000.0f;


///////////////////////////////////////////////////////////////////
// Simulation
///////////////////////////////////////////////////////////////////
if (State == ParticleSystemState::PLAYING)
Update(elapsedSeconds);


///////////////////////////////////////////////////////////////////
// Render
///////////////////////////////////////////////////////////////////
// Check if there are enough particles in the system 
if (currentParticles <= 0) return;


// To render, we need to only select the particles that exist after the update.
m_D3D->GetDeviceContext()->CopyStructureCount(g_pParticlesToRender, 0, InputState);


// Set input layout
m_D3D->GetDeviceContext()->IASetInputLayout(pInputLayout);


// Set the type of primitive that should be rendered from this vertex buffer, in this case, the info is stored as point particles.
m_D3D->GetDeviceContext()->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_POINTLIST);


// Set the vertex, pixel and geometry shaders that will be used to render the particles.
m_D3D->GetDeviceContext()->VSSetShader(vs_pVertex, NULL, 0);
m_D3D->GetDeviceContext()->PSSetShader(ps_pPixel, NULL, 0);
m_D3D->GetDeviceContext()->GSSetShader(gs_pGeometry, NULL, 0);
// m_D3D->GetDeviceContext()->DSSetShader(NULL, NULL, 0); // Domain Shader (Not Yet)


// Set blend and stencil model
// Bind blend state
float blendFactor[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
UINT sampleMask = 0xffffffff;
m_D3D->GetDeviceContext()->OMSetBlendState(pBlendState, blendFactor, sampleMask);
// Bind stencil depth
m_D3D->GetDeviceContext()->OMSetDepthStencilState(pDepthState, 0);


// Set this frame information
Transforms transforms;
transforms.ProjMatrix = projectionMatrix;
transforms.WorldViewMatrix = worldMatrix * viewMatrix;


// Map this frame info
D3D11_MAPPED_SUBRESOURCE mappedResource;
m_D3D->GetDeviceContext()->Map(m_matrixBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &mappedResource);
memcpy_s(mappedResource.pData, sizeof(Transforms), &transforms, sizeof(transforms));
m_D3D->GetDeviceContext()->Unmap(m_matrixBuffer, 0);


// Draw the particles
m_D3D->GetDeviceContext()->DrawInstancedIndirect(g_pParticlesToRender, 0); // Draw arguments are read from g_pParticlesToRender
}

Render Program


//--------------------------------------------------------------------------------
// Resources
//--------------------------------------------------------------------------------


cbuffer Transforms
{
matrix WorldViewMatrix;
matrix ProjMatrix;
};


cbuffer ParticleRenderParameters
{
float4 EmitterLocation;
float4 ConsumerLocation;
};


static const float scale = 0.5f;


static const float4 g_positions[4] =
{
float4(-scale, scale, 0, 0),
float4(scale, scale, 0, 0),
float4(-scale, -scale, 0, 0),
float4(scale, -scale, 0, 0),
};


static const float2 g_texcoords[4] =
{
float2(0, 1),
float2(1, 1),
float2(0, 0),
float2(1, 0),
};




struct Particle
{
float3 position;
float3 direction;
float  time;
};


StructuredBuffer<Particle> SimulationState;
Texture2D       ParticleTexture : register(t0);
SamplerState    LinearSampler : register(s0);


//--------------------------------------------------------------------------------
// Inter-stage structures
//--------------------------------------------------------------------------------
struct VS_INPUT
{
uint vertexid : SV_VertexID;
};
//--------------------------------------------------------------------------------
struct GS_INPUT
{
float3 position : Position;
};
//--------------------------------------------------------------------------------
struct PS_INPUT
{
float4 position : SV_Position;
float2 texcoords : TEXCOORD0;
float4 color : Color;
};
//--------------------------------------------------------------------------------
GS_INPUT VSMAIN(in VS_INPUT input)
{
GS_INPUT output;


output.position.xyz = SimulationState[input.vertexid].position;


return output;
}
//--------------------------------------------------------------------------------
[maxvertexcount(4)]
void GSMAIN(point GS_INPUT input[1], inout TriangleStream<PS_INPUT> SpriteStream)
{
PS_INPUT output;


float4 color = float4(1.0f, 1.0f, 1.0f, 0.8f);


// Transform to view space
float4 viewposition = mul(float4(input[0].position, 1.0f), WorldViewMatrix);


// Emit two new triangles
for (int i = 0; i < 4; i++)
{
// Transform to clip space
output.position = mul(viewposition + g_positions[i], ProjMatrix);
output.texcoords = g_texcoords[i];
output.color = color;


SpriteStream.Append(output);
}


SpriteStream.RestartStrip();
}
//--------------------------------------------------------------------------------
float4 PSMAIN(in PS_INPUT input) : SV_Target
{
//float4 color = ParticleTexture.Sample(LinearSampler, input.texcoords);
//color = color * input.color;
float4 color = input.color;


return(color);
}
//--------------------------------------------------------------------------------

It seems the graphics debuggers crash due to this line:

m_D3D->GetDeviceContext()->CSSetUnorderedAccessViews(0, 1, &InputState, counts); // HLSL debuggers crash due this (Insert CS).


// Command the particle system calculations (update and render)
void Particleclass::Draw(float elapsedMiliSeconds,D3DXMATRIX worldMatrix, D3DXMATRIX viewMatrix,D3DXMATRIX projectionMatrix)
{
// Check if the PS has started 
if (State == ParticleSystemState::UNSTARTED)
return;


// Timescale 
float elapsedSeconds = elapsedMiliSeconds / 1000.0f;


///////////////////////////////////////////////////////////////////
// Simulation
///////////////////////////////////////////////////////////////////
if (State == ParticleSystemState::PLAYING)
Update(elapsedSeconds);


///////////////////////////////////////////////////////////////////
// Render
///////////////////////////////////////////////////////////////////
// Check if there are enough particles in the system 
if (currentParticles <= 0) return;


// Set blend and stencil model
// Bind blend state
/*
float blendFactor[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
UINT sampleMask = 0xffffffff;
m_D3D->GetDeviceContext()->OMSetBlendState(pBlendState, blendFactor, sampleMask);
// Bind stencil depth
m_D3D->GetDeviceContext()->OMSetDepthStencilState(pDepthState, 0);
*/
// Set the vertex and pixel shaders that will be used to render this frame.
m_D3D->GetDeviceContext()->VSSetShader(vs_pVertex, NULL, 0);
m_D3D->GetDeviceContext()->PSSetShader(ps_pPixel, NULL, 0);
m_D3D->GetDeviceContext()->GSSetShader(gs_pGeometry, NULL, 0);


// Set the type of primitive that should be rendered from this vertex buffer, in this case, the info is stored as point particles.
m_D3D->GetDeviceContext()->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_POINTLIST);


// To render, we need to only select the particles that exist after the update.
m_D3D->GetDeviceContext()->CopyStructureCount(g_pParticlesToRender, 0, OutputBuffer->GetUnorderedAccessView());
m_D3D->GetDeviceContext()->VSSetShaderResources(0, 1, OutputBuffer->GetShaderResourceViewAddress());


// Set input layout
m_D3D->GetDeviceContext()->IASetInputLayout(pInputLayout);


// Set this frame information
Transforms transforms;
transforms.ProjMatrix = projectionMatrix;
transforms.WorldViewMatrix = worldMatrix * viewMatrix;


// Map this frame info
D3D11_MAPPED_SUBRESOURCE mappedResource;
m_D3D->GetDeviceContext()->Map(m_matrixBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &mappedResource);
memcpy_s(mappedResource.pData, sizeof(Transforms), &transforms, sizeof(transforms));
m_D3D->GetDeviceContext()->Unmap(m_matrixBuffer, 0);


// Draw the particles
m_D3D->GetDeviceContext()->DrawInstancedIndirect(g_pParticlesToRender, 0); // Draw arguments are read from g_pParticlesToRender


// Swap the two buffers in between frames to allow multithreaded access
// during the rendering phase for the particle buffers..
AppendConsumeBuffer *TempState = OutputBuffer;
OutputBuffer = InputBuffer;
InputBuffer = TempState;
}

Any help is welcome.

PS: Even stranger, both the Insert CS and the Update CS seem to be working properly (the particle count is correct in each frame).
