What I need is:
- dedicated particle systems whose parameters can be set individually and programmatically
- emitter parameters throughout a system need to be accessible from a script, which requires them to be named
- independence of the rendering code from the iteration/stepping code
- post-fact extensibility (adding new particle system classes after the code is compiled)
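A minimal sketch of the named-parameter requirement, assuming a string-to-slot map sitting on top of a flat float array. NamedParams and its methods are hypothetical names made up for illustration, not existing code. A script would look parameters up by name, while the update loop would keep using the slot index to avoid string lookups per particle.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: stores emitter parameters by name so a script
// can read and write them without knowing their index.
class NamedParams {
public:
    // Registers a parameter (or overwrites an existing one) and
    // returns its slot index.
    std::size_t Set(const std::string& name, float value) {
        auto it = index_.find(name);
        if (it != index_.end()) {
            values_[it->second] = value;
            return it->second;
        }
        values_.push_back(value);
        index_.emplace(name, values_.size() - 1);
        return values_.size() - 1;
    }

    // Returns a pointer to the value, or nullptr if the name is unknown.
    // The pointer is only valid until the next Set() that adds a new name.
    float* Find(const std::string& name) {
        auto it = index_.find(name);
        return it == index_.end() ? nullptr : &values_[it->second];
    }

    // Fast path for the update loop: index lookup, no string hashing.
    float At(std::size_t slot) const {
        assert(slot < values_.size());
        return values_[slot];
    }

private:
    std::vector<float> values_;
    std::unordered_map<std::string, std::size_t> index_;
};
```

The slot index returned by Set() stays stable for the lifetime of the object, so it can be cached once when the configuration is loaded.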
Right now I'm doing everything in a simple way (e.g. I have a particle system class that manages a list of particles), so I need to extend this a fair bit. My idea is to implement the particle system as a self-referential emitter class that accepts an emitter configuration class as a parameter.
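#include <vector>

class Shader;            // defined elsewhere
class ParticleEmitter;

// This is loaded from an XML file.
class EmitterConfiguration {
public:
    float args[256];
    char* names[256];        // names[i] labels args[i] for script access
    int iNumArgs;
    Shader* updateShader;

    void EnableUpdateShader(bool enable);
    void UpdateParticle(ParticleEmitter& p);
};

class ParticleEmitter {
public:
    void Update();
private:
    std::vector<ParticleEmitter> particles;  // every particle is itself an emitter
    EmitterConfiguration* cfg;
    bool active;                             // assumed per-particle "alive" flag
};

void ParticleEmitter::Update()
{
    cfg->EnableUpdateShader(true);
    for (ParticleEmitter& p : particles)     // for all active particles
        if (p.active)
            cfg->UpdateParticle(p);
    cfg->EnableUpdateShader(false);
}

ParticleEmitter* i_am_a_particle_system;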
This is all fine and dandy and should cover the extensibility and flexibility parts. However, I can see a number of speed bottlenecks here, which lead me to the following questions:
1) is moving particle updating off the CPU a good idea in a general sense? I mean, CPU cores are a dime a dozen on many newer systems and can be expected to only become more widespread.
2) how much of the iteration/updating should I move to the GPU? Everything? Everything except spawning?
3) how should I go about writing the GPU side of the system? The only solution I can see for storage is textures, but is getting rid of the streaming cost enough to justify them?
4) how should I go about syncing between the CPU and the GPU? If most work is done on the GPU, I still need pretty detailed information about the system on the CPU and texture read-backs are probably the worst idea to opt for.
5) I'm getting no perceptible speed increase from using geometry shader billboarding - is it worth it to tie up additional GPU resources with it?
6) in systems with several particle textures, which would you recommend: sort the particles into individual lists (might require sorting in all cases as the particles are no longer drawn sequentially); suck it up and draw the particles individually (uh oh); store them in a single array, but parse the array once for all textures (same problem as in the first case); something else? (actually, come to think of it, the easiest and cheapest way is probably to instead build a texture atlas for each system when it is first created)
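For the texture atlas idea in 6), a rough sketch of how each particle could pick its sub-image. It assumes the atlas is a uniform grid of equally sized tiles with no padding between them; UvRect and AtlasTileUv are hypothetical names, and a real atlas would likely need padding to avoid bleeding between tiles when filtering or mipmapping.

```cpp
#include <cassert>

// Texture coordinates of one tile: (u0, v0) is the top-left corner,
// (u1, v1) the bottom-right corner, both in the 0..1 range.
struct UvRect { float u0, v0, u1, v1; };

// Hypothetical helper: the atlas is assumed to be a uniform grid of
// cols x rows equally sized tiles, numbered row by row from the top-left.
inline UvRect AtlasTileUv(int tile, int cols, int rows) {
    assert(cols > 0 && rows > 0);
    assert(tile >= 0 && tile < cols * rows);
    const float w = 1.0f / static_cast<float>(cols);  // tile width in UV space
    const float h = 1.0f / static_cast<float>(rows);  // tile height in UV space
    const int col = tile % cols;
    const int row = tile / cols;
    return UvRect{ col * w, row * h, (col + 1) * w, (row + 1) * h };
}
```

With this, every particle in a system only carries a tile index, and the whole system can be drawn with one texture bound and no per-texture sorting.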
There are probably a number of other questions that I can't think of right off the bat, but I'm really most curious about how people have managed to pull off the updating bit. Right now I'm getting a drop of 5-20 FPS in debug mode for a handful of particles (a few hundred to a few thousand). I'm not sure whether I'm fill-rate limited (as the particles are relatively large) or whether the bottleneck is that I'm rendering them from a linked list instead of a fixed array.
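To compare against the linked list, the fixed-array alternative could look roughly like the sketch below. Particle, ParticlePool and the member names are hypothetical, and the layout is only an assumption about what a particle needs. Dead particles are removed by overwriting them with the last live one, so the live particles always form one contiguous block that can be handed to the renderer in a single call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Particle {
    float x, y, z;
    float life;   // seconds remaining; the particle dies at <= 0
};

// Hypothetical fixed-capacity pool: live particles always occupy
// the contiguous range [0, Count()).
class ParticlePool {
public:
    explicit ParticlePool(std::size_t capacity)
        : particles_(capacity), count_(0) {}

    // Returns false when the pool is full and the particle was dropped.
    bool Spawn(const Particle& p) {
        if (count_ == particles_.size()) return false;
        particles_[count_++] = p;
        return true;
    }

    // Ages every live particle once and removes dead ones in O(1) each
    // by overwriting them with the last live particle.
    void Update(float dt) {
        std::size_t i = 0;
        while (i < count_) {
            particles_[i].life -= dt;
            if (particles_[i].life <= 0.0f) {
                // Do not advance i: the particle moved into this slot
                // has not been aged yet.
                particles_[i] = particles_[--count_];
            } else {
                ++i;
            }
        }
    }

    const Particle* Data() const { return particles_.data(); }
    std::size_t Count() const { return count_; }

private:
    std::vector<Particle> particles_;
    std::size_t count_;
};
```

One caveat: removing by overwrite changes the order of the particles, so this only works as-is if draw order doesn't matter (e.g. additive blending) or the particles get sorted afterwards anyway.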