Million Particle System

24 comments, last by neroziros 9 years, 12 months ago
Geo-shaders are basically the shader stage best avoided if you can; by their nature they tend to serialise the hardware a bit.

For a particle system you'd be better off using compute shaders as you can pre-size your buffers and then have them read from one buffer and write to the other. You also only need one shader stage to be invoked; usage of the geo-shader implies a VS must be run first even as a 'pass-thru' due to how the logical software pipeline is arranged - a compute shader wouldn't have this.

That's not to say compute doesn't come with its own set of potential pitfalls, but it is better suited to this task.

Since you are unlikely to set the positions of 1 million particles explicitly, I guess you have a function that derives their positions from a scalar or two. If that is the case, you could get away with a mesh of 1 million quads, where each quad sits on top of the previous one at a fixed distance, let's say in the z direction. Every quad's vertices would then share x and y but differ in z, so z can be an input to your procedural function, along with some other factors (time, seed...). The buffer for this 1 million "quad pillar" would be static and never altered; you would process its vertices based on the vertex z value and the other factors. Imagine you could easily fan them out like cards just by doing (x,y+z*10.0,z). This solution of course limits you to procedural positioning only. Usually, there are fewer than 50 particles used.

There are actual use cases for million-particle systems, and they have been feasible for many years already. Here is a 5-year-old blog post about how to do it with DX9: http://directtovideo.wordpress.com/2009/10/06/a-thoroughly-modern-particle-system/

With a modern API and clever coding it should not be any problem.

@JohnnyCode: Thanks for the suggestion! Though I should have mentioned earlier that I am also looking forward to making an interactive particle system (interacting both with itself and with the environment), so I cannot use procedural algorithms and must rely on a state-preserving PS.

@Kalle_h: Heh, actually using textures to handle the particles was one of my first ideas for the particle system, but the lack of examples and the fear that dynamic particle creation and death would be too complex with that scheme made me turn down the idea. It is an excellent way to make a PS with a fixed number of particles though, thanks!

I will keep looking into the DirectCompute idea. As before, I will post my results here so anyone else interested in making a highly interactive particle system with millions of particles can use them.

Cheers

In general, if you are going to be using DX11 then you really should consider the compute shader. The article that you are referencing is actually from before DX11 was even released, so it wouldn't have considered compute shaders at all.

If you are interested in trying it out quickly, the Hieroglyph 3 framework has a ParticleStorm application in it that shows the basic pieces needed.

Hi all!

I have been following the Hieroglyph example. And though I am suffering with the CPU side of the work, I think I am moving forward :D

I have a question: I have already created the append and consume buffers

BUFFER CREATION


// Create consume and append buffer
D3D11_BUFFER_DESC desc;
desc.ByteWidth = sizeof(PARTICLE_VERTEX) * maxParticles;
desc.BindFlags = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_UNORDERED_ACCESS;
desc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
desc.StructureByteStride = sizeof(PARTICLE_VERTEX);
desc.Usage = D3D11_USAGE_DEFAULT;
desc.CPUAccessFlags = 0;


result = device->CreateBuffer(&desc, NULL, &appendBuffer);
if (FAILED(result))return false;
result = device->CreateBuffer(&desc, NULL, &consumeBuffer);
if (FAILED(result))return false;

However, I am not sure how to send them to the shader. I already bind one during the particle insertion step, but I am not sure how to bind both of them at the same time during the Update step.

SPAWN PARTICLES FUNCTION


// Create new particles (through the compute shader)
void Particleclass::SpawnNewParticles(float elapsedSeconds)
{
// Control variable
HRESULT hr;


// Update timer
newParticlesTimer += elapsedSeconds;


// Check if spawn must happen
if (newParticlesTimer >= spawnParticlesInterval)
{
// Set Compute Shader
m_D3D->GetDeviceContext()->CSSetShader(cs_pInsertParticles, nullptr, 0);


// Create this frame data
SpawnConstantBuffer c;


// Set emitter location
c.emmiterPos = position;


// Create random vector
static const float scale = 2.0f; // Random Variance
float fRandomX = ((float)rand() / (float)RAND_MAX * scale - scale / 2.0f);
float fRandomY = ((float)rand() / (float)RAND_MAX * scale - scale / 2.0f);
float fRandomZ = ((float)rand() / (float)RAND_MAX * scale - scale / 2.0f);
D3DXVECTOR3 normalized = D3DXVECTOR3(fRandomX, fRandomY, fRandomZ);
// Normalize the random vector
float magnitude = (float)sqrt(normalized.x * normalized.x + normalized.y * normalized.y + normalized.z * normalized.z);
if (magnitude == 0.0)magnitude = 0.000001f;
normalized = D3DXVECTOR3(normalized.x / magnitude, normalized.y / magnitude, normalized.z / magnitude);
// Set random vector
c.randomVector = D3DXVECTOR3(normalized.x, normalized.y, normalized.z);


// Copy the new vector and position in the due buffer 
D3D11_MAPPED_SUBRESOURCE mapped;
hr = m_D3D->GetDeviceContext()->Map(cs_pInsertCB, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
// Error Check
if (FAILED(hr)){
std::stringstream stream; stream << "Error Code::" << HRESULT_CODE(hr); MessageBox(NULL, stream.str().c_str(), "InsertCS Buffer Mapping Error", MB_OK); return;
}
memcpy_s(mapped.pData, sizeof(SpawnConstantBuffer), &c, sizeof(c));
m_D3D->GetDeviceContext()->Unmap(cs_pInsertCB, 0);

// Send the updated vector to the CS shader
m_D3D->GetDeviceContext()->CSSetConstantBuffers(0, 1,&cs_pInsertCB);

// Set append buffer
UINT counts[1] = { (UINT)-1 }; // -1 keeps the buffer's current hidden counter
m_D3D->GetDeviceContext()->CSSetUnorderedAccessViews(0, 1, &prevState, counts);

// Spawn New Particles
m_D3D->GetDeviceContext()->Dispatch(1,1,1);

// Reset timer
newParticlesTimer = 0;
}
}
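
A side note on the random direction: normalising a point picked from a cube tends to favour the cube's corners, so the emitted directions are not evenly spread. One common alternative is rejection sampling inside the unit sphere. A minimal sketch using the standard `<random>` header instead of `rand()`; `randomUnitVector` is a made-up helper, not part of the original code:

```cpp
#include <array>
#include <cmath>
#include <random>

// Hypothetical helper: returns a unit-length direction vector.
// Samples the cube [-1, 1]^3 and rejects points outside the unit sphere
// (and points too close to the origin), so the division is always safe.
std::array<float, 3> randomUnitVector(std::mt19937& rng)
{
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
    for (;;)
    {
        const float x = dist(rng);
        const float y = dist(rng);
        const float z = dist(rng);
        const float lenSq = x * x + y * y + z * z;
        if (lenSq > 1e-8f && lenSq <= 1.0f)
        {
            const float inv = 1.0f / std::sqrt(lenSq);
            return { x * inv, y * inv, z * inv };
        }
    }
}
```

The result could be copied into `c.randomVector` in place of the manually normalised vector.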

Problem solved! To bind both buffers you must do:


UINT counters[2] = { mNumElements, 0 };
ID3D11UnorderedAccessView* uavs[2] = { prevState, currentState };
m_D3D->GetDeviceContext()->CSSetUnorderedAccessViews(0, 2, uavs, counters);

Though I am not sure if mNumElements is the current number of particles in the system OR the maximum number of particles that can be in the system.

Cheers
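
On the question of what the initial counts mean: as far as the D3D11 documentation describes it, the value passed in `pUAVInitialCounts` sets the hidden counter of an append/consume UAV, so for the consume side it should be the number of particles currently stored, not the buffer capacity (and -1 keeps whatever counter the buffer already has). The following is only a CPU-side model of that counter behaviour, not Direct3D code; `CounterBuffer` and `updatePass` are made-up names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side stand-in for a structured buffer with a hidden counter.
// It only mimics the counter behaviour; it is not Direct3D code.
struct CounterBuffer
{
    std::vector<float> data;  // storage sized for the maximum particle count
    std::size_t counter = 0;  // hidden counter: number of live elements

    explicit CounterBuffer(std::size_t capacity) : data(capacity) {}

    // Append: write at the counter position, then increment the counter.
    void append(float v) { data[counter++] = v; }

    // Consume: decrement the counter, then read that element.
    float consume() { return data[--counter]; }
};

// One update pass: consume every live particle and append the survivors.
// Each float is a particle age; returns the number of survivors in `out`.
std::size_t updatePass(CounterBuffer& in, std::size_t liveCount,
                       CounterBuffer& out, float maxAge)
{
    assert(liveCount <= in.data.size());
    in.counter = liveCount;  // consume side starts at the live particle count
    out.counter = 0;         // append side starts empty
    for (std::size_t i = 0; i < liveCount; ++i)
    {
        const float age = in.consume() + 1.0f;  // advance the age by one step
        if (age < maxAge) out.append(age);      // keep only unexpired particles
    }
    return out.counter;
}
```

If the consume counter were set to the capacity instead, the pass would read slots that never held a live particle.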

Ok, I am almost done with the Compute Shader step. However, I am having one last problem that I can't understand.

For some reason, the particle count stays at 8 if I dispatch the UpdateCS. The only explanation I can think of is that the compute shader is killing all the particles created in the InsertCS shader. (If I don't execute the UpdateCS, the particle count goes up as it should.)

Here is the Update function which calls the Update CS


// Update the particle system (advance the particle system)
void Particleclass::Update(float elapsedSeconds)
{
// Control variable
HRESULT hr;


// Update total elapsed time
m_TotalTimeElapsed += elapsedSeconds*1000; // Milliseconds, so the RNG functions inside the GPU get diverse values


// Create new particles if needed
SpawnNewParticles(elapsedSeconds);


// Get current amount of particles in the InputBuffer
RefreshCurrentParticleAmount();


// If there are zero particles, don't execute the updater
if (mNumElements <= 0) return;


// Update particles
// Set Compute Shader
m_D3D->GetDeviceContext()->CSSetShader(cs_pUpdateParticles, nullptr, 0);


// Create this frame data
SimulationParameters s;
s.EmitterLocation = position;
s.ParticlesLifeTime = lifeTime;
s.TimeFactors = 0;


// Pass the new frame data to the shader
D3D11_MAPPED_SUBRESOURCE UMapped;
hr = m_D3D->GetDeviceContext()->Map(cs_pUpdateCB, 0, D3D11_MAP_WRITE_DISCARD, 0, &UMapped);
memcpy_s(UMapped.pData, sizeof(SimulationParameters), &s, sizeof(s));
m_D3D->GetDeviceContext()->Unmap(cs_pUpdateCB, 0);


// Send this frame particle amount to the Update CS
D3D11_MAPPED_SUBRESOURCE PCMapped;
hr = m_D3D->GetDeviceContext()->Map(cs_pParticleCount, 0, D3D11_MAP_WRITE_DISCARD, 0, &PCMapped);
memcpy_s(PCMapped.pData, sizeof(UINT), &mNumElements, sizeof(mNumElements));
m_D3D->GetDeviceContext()->Unmap(cs_pParticleCount, 0);


// Send the updated constant buffers to the CS shader
ID3D11Buffer *Buffers[2] = {cs_pUpdateCB, cs_pParticleCount};
m_D3D->GetDeviceContext()->CSSetConstantBuffers(0,2,Buffers);


// Set append and consume buffer
UINT counters[2] = {0, mNumElements };
ID3D11UnorderedAccessView* uavs[2] = { OutputState, InputState };
m_D3D->GetDeviceContext()->CSSetUnorderedAccessViews(0, 2, uavs, counters);


// Dispatch the particles's updater 
m_D3D->GetDeviceContext()->Dispatch(maxParticles / 512, 1, 1);


// Swap the two buffers in between frames to allow multithreaded access
// during the rendering phase for the particle buffers.
ID3D11UnorderedAccessView *TempState = InputState;
InputState = OutputState;
OutputState = TempState;
}
And this is the UpdateCS

//-----------------------------------------------------------------------------
// Compute shader (Hieroglyph based)
//-----------------------------------------------------------------------------
// Particle Structure (relevant for the simulation)
struct Particle 
{
float3 position;
float3 velocity;
float  time;
};


cbuffer SimulationParameters : register(b0)
{
float4 TimeFactors;
float4 EmitterLocation;
float ParticlesLifeTime;
};


cbuffer ParticleCount : register(b1)
{
uint4 NumParticles;
};




// Compute shaders buffers (Entry Buffer and Output buffer)
AppendStructuredBuffer<Particle> NewSimulationState : register(u0);
ConsumeStructuredBuffer<Particle>   CurrentSimulationState  : register(u1);


[numthreads(512, 1, 1)]
void CSMAIN(uint3 DispatchThreadID : SV_DispatchThreadID)
{
// Check for if this thread should run or not.
uint myID = DispatchThreadID.x + DispatchThreadID.y * 512 + DispatchThreadID.z * 512 * 512;


// The statement must check if there are no more particles than it should
if (myID < NumParticles.x)
{
// Get the current particle
Particle p = CurrentSimulationState.Consume();


// Calculate the new position, accounting for the new velocity value
// over the current time step.
p.position += p.velocity * TimeFactors.x;


// Update the life time left for the particle.
p.time = p.time + TimeFactors.x;


// Only keep the particle alive IF its life time has not expired
if (p.time < ParticlesLifeTime)
{
NewSimulationState.Append(p);
}
}
}
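
One detail worth double-checking with the `[numthreads(512, 1, 1)]` setup: `maxParticles / 512` rounds down, so if `maxParticles` is not a multiple of 512 the last partial group is never dispatched. A minimal sketch of a rounded-up group count, assuming the 512-thread group size from the shader above (`groupCount` and `kThreadsPerGroup` are hypothetical names, not part of the original code):

```cpp
#include <cstdint>

// Assumed to match [numthreads(512, 1, 1)] in the compute shader.
constexpr std::uint32_t kThreadsPerGroup = 512;

// Number of thread groups needed to cover `particles` threads, rounded up.
// Written as quotient plus remainder check to avoid overflow on large counts.
constexpr std::uint32_t groupCount(std::uint32_t particles,
                                   std::uint32_t threadsPerGroup = kThreadsPerGroup)
{
    return particles / threadsPerGroup +
           (particles % threadsPerGroup != 0 ? 1u : 0u);
}
```

The value would be passed as the first argument of `Dispatch`; the `if (myID < NumParticles.x)` guard in the shader then takes care of the surplus threads in the last group.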

I suspect the problem is in either the "cs_pParticleCount" -> "cbuffer ParticleCount" mapping or when I send the NumParticles's UAV reference to the Compute Shader but I am really not sure (if the NumParticles is not being received in the GPU, then it will stay at zero and it will kill all the particles) . Any help is really appreciated!

Hmm, it seems that the problem is in the buffer swapping, since I tried not executing the Update compute shader and the particle count still went up. (If the swap were correct, the empty OutputState should have overwritten the InputState, and the number of particles should have stayed at 8.)


// Swap the two buffers in between frames to allow multithreaded access
// during the rendering phase for the particle buffers.
ID3D11UnorderedAccessView *TempState = OutputState; 
OutputState = InputState;
InputState = TempState;

Any idea about what I am doing wrong?

Cheers.
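
For reference, a swap like this can also be written with `std::swap`, which removes the manual temporary and the chance of assigning the wrong variable. A small sketch, where `PingPong` is a hypothetical wrapper and `View` stands in for `ID3D11UnorderedAccessView*`:

```cpp
#include <utility>

// Hypothetical holder for the two views used in a ping-pong simulation.
// `View` stands in for any handle or pointer type, e.g. a UAV pointer.
template <typename View>
struct PingPong
{
    View input;   // consumed by the update pass
    View output;  // appended to by the update pass

    // Exchange the two roles after each update pass.
    void flip() { std::swap(input, output); }
};
```

Calling `flip()` once per frame after the update dispatch keeps the roles alternating; calling it twice restores the original assignment.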

So you didn't do an update, but the particle count still increased? How could that be? I couldn't see in your code above where the number of particles is calculated - are you reading it back from the buffers with CopyStructureCount somewhere?

This topic is closed to new replies.
