Having optimised the adding and removal of the particles as far as I could, the final bottleneck was contact resolution/reaction, based on the numbers coming from Farseer; trying to process 600+ collisions per frame was just too much.
There are two logical steps which can be taken from this point:
- threading
- GPU processing
Pushing Farseer onto a second thread wouldn't be that hard; there would be some sync issues as drawing requires all simulation to be completed, however that's a minor point. What it does allow, on the PC at least, is game logic and particles to run at the same time and still have a chance of hitting 60fps.
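To make the sync point concrete, here is a minimal sketch of one frame, written in C++ rather than C#/XNA so that it stands alone. Every name in it (`World`, `Particle`, `stepSimulation`, `updateGameLogic`, `draw`, `runFrame`) is a hypothetical stand-in and not part of Farseer or XNA, and it assumes game logic does not touch particle state while the step is in flight.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <vector>

// Hypothetical stand-ins: none of these names come from Farseer or XNA.
struct Particle { float x = 0.0f; float vx = 1.0f; };

struct World {
    std::vector<Particle> particles;
    int logicTicks = 0;   // touched only by the game-logic side
    int framesDrawn = 0;  // touched only by the draw side
};

// Simulation step: the only code that writes particle state during a frame.
void stepSimulation(std::vector<Particle>& particles, float dt) {
    for (Particle& p : particles) p.x += p.vx * dt;
}

// Game logic: assumed not to read or write particle state.
void updateGameLogic(World& world) { ++world.logicTicks; }

// Drawing reads particle state, so it may only run after the step is done.
void draw(World& world) { ++world.framesDrawn; }

void runFrame(World& world, float dt) {
    assert(dt >= 0.0f);
    // Kick the simulation off on another thread...
    std::future<void> sim = std::async(std::launch::async, stepSimulation,
                                       std::ref(world.particles), dt);
    // ...run game logic alongside it...
    updateGameLogic(world);
    // ...and block here: this is the sync point before drawing.
    sim.get();
    draw(world);
}
```

The frame still takes as long as the slower of the two halves, so this only helps when the simulation step alone fits inside the frame budget.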
However, while pushing the sim onto another thread does solve the problem of not having enough time for game logic, it doesn't really buy us anything else. The big problem is the 360; simply put, the power isn't there. It takes ~9x longer to update the particles without collisions on the 360 than it does on my PC, which has a lower clock speed to start with. Throw collisions into the mix and, even on a different core, we are going to have major problems.
I did consider pushing the threading deeper into Farseer. However, while on the surface it might look a lot more threadable, there are issues with collision response again: if an object is colliding with 2 or more others in the world and it is serviced by two threads at once, things are going to go wrong.
That's not to say such a solution isn't possible; however, using locks would be expensive, and Interlocked.CompareExchange() only works on the 360 for int, object and reference types T. While it would be possible to box a float into a reference type, I suspect this wouldn't be much of a win performance-wise anyway.
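For illustration, a different workaround from boxing is to keep the float's raw bit pattern in an integer and run the compare-exchange loop on that integer. The sketch below shows the pattern in C++ with `std::atomic`, not in C# with Interlocked, and all of the function names are hypothetical. Whether the equivalent bit reinterpretation is cheap in C# on the 360 is an open question, so treat this as the shape of the technique and not a measured win.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <thread>
#include <vector>

static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Reinterpret a float as its raw 32-bit pattern and back again.
inline std::uint32_t floatToBits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
inline float bitsToFloat(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Adds `delta` to the float stored (as raw bits) in `target` without a lock.
// Returns the value that was written.
float atomicAddFloat(std::atomic<std::uint32_t>& target, float delta) {
    std::uint32_t oldBits = target.load();
    for (;;) {
        std::uint32_t newBits = floatToBits(bitsToFloat(oldBits) + delta);
        // On failure, compare_exchange_weak refreshes oldBits with the
        // current stored value, so the next pass recomputes from it.
        if (target.compare_exchange_weak(oldBits, newBits))
            return bitsToFloat(newBits);
    }
}

// Hypothetical stress helper: several threads accumulate into one value.
float accumulateFromThreads(int threadCount, int addsPerThread, float delta) {
    assert(threadCount > 0 && addsPerThread >= 0);
    std::atomic<std::uint32_t> total{floatToBits(0.0f)};
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t)
        workers.emplace_back([&] {
            for (int i = 0; i < addsPerThread; ++i) atomicAddFloat(total, delta);
        });
    for (std::thread& w : workers) w.join();
    return bitsToFloat(total.load());
}
```

Under heavy contention the loop retries often, so this guards a single value such as one accumulated impulse component; it does not make a whole multi-field collision response atomic.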
At which point I turned to the GPU, and after reading some things in GPU Gems 3 and ShaderX 6 I thought I had come up with a way to do the collision detection/response quickly on the GPU. For the 360 this is a win, as it would allow me to use the GPU to get around the lack of floating point maths power on the CPU (due to the lack of SIMD instruction exposure).
However, it occurred to me earlier today that even this solution, for the game I wish to use it with, has a drawback: I need to know when the particles collide with the main player in order to kill them. While this detection is easy (it's a matter of writing out what the particle collided with), getting at the data isn't.
It would require a readback from the GPU and then processing; the initial readback would be slow and the processing would be an O(N) operation.
You could read back the last frame's data, which in theory should be fast on the 360 due to the unified memory system. However, this relies on there being a frame to read back, and it probably wouldn't work well in the PC world, where you can have up to 3 frames of latency.
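The O(N) processing step after a readback would look roughly like the sketch below, again in C++ for the sake of a standalone example. It assumes the shader writes one "collided with" ID per particle into a buffer; that buffer layout, `kPlayerId` and `particlesThatHitPlayer` are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical ID the shader writes out when a particle hits the player.
constexpr std::uint32_t kPlayerId = 1;

// One linear pass over the read-back buffer: entry i holds the ID of
// whatever particle i collided with this frame. Returns the indices of
// the particles that hit the player.
std::vector<std::size_t> particlesThatHitPlayer(
        const std::vector<std::uint32_t>& collidedWith) {
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < collidedWith.size(); ++i)
        if (collidedWith[i] == kPlayerId) hits.push_back(i);
    return hits;
}
```

The scan itself is cheap next to the readback that feeds it; the cost that matters is the stall while waiting for the GPU to hand the buffer over.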
I think for now I might have to go in a new direction for this:
- scale the number of particles down
- push Farseer into its own thread to at least get it out of the way
The other option is to forget the physics for the particles and just let them act as free agents. This would get back CPU time and, while it wouldn't look like the vision in my head... well, there is always release 2 I guess *chuckles*
I do now have the books "Real Time Collision Detection" and "Game Physics Engine Development" to read through, so who knows, maybe in a few months I can have a crack at making a game which will cause even my 4 core/8 thread i7 to cry.. now that would be something *chuckles*
For now, I think I need to refocus on getting a game done and worry about something really pretty and over the top later.