Algorithmic improvements to particle-based fluid physics

3 comments, last by LemonScented 14 years, 9 months ago
Hi, I'm looking for a way to optimise the algorithm used by a particle-based fluid simulation described in a paper, and I guess I'm looking for a hardcore mathematician who is up for a challenge :) The algorithm is the one described in this paper: Particle-based Viscoelastic Fluid Simulation (PDF). I'm not doing anything with viscoelasticity, so I'm only looking at the parts of the algorithm up to section 4.4 of the paper. Specifically, "Algorithm 2" seems like fertile ground for speed improvements - I can see where things could be better, but I can't grasp the changes to the maths that would be needed to keep it stable whilst doing so. So I could use some help. Here's what I've seen:

1 - In steps 8 & 9, density and near-density are accumulated for each particle i by considering every particle j in i's list of neighbours. It seems reasonable to also add i's contribution to the density values of particle j at the same time, so that each particle pair only needs to be considered once (i.e. avoiding doing the work for pair ji because we've already done all the work when considering ij). However, applying this optimisation, although it leads to broadly the same results, seems to have an adverse effect on the repulsive forces between some particles, and I don't know why.

2 - "q" is calculated for a particle pair in step 6, and again in step 15. My code caches this result from step 6 for re-use in step 15, since calculating q involves a square root to get the magnitude of ij, which is expensive. However, the storage involved can become problematic. Since q is the same for particle pair ij as it is for ji, I'd like to store it when considering ij rather than recalculate it for ji. Unfortunately, when the number of particles runs into the thousands, storage for this becomes an issue: the best lookup table I can think of needs potentially ((N^2)/2)-N entries, even though in most cases most of that space would go unused. I was wondering if anyone had some thoughts about an efficient data structure for caching this information.

3 - Steps 19 and 20 seem to contain an explicit "hack" to account for the fact that each particle pair is considered twice, by only applying half of the displacement to each particle every time a pair is visited. Again, I'd like to see each pair considered only once - however, the force appears not to be symmetric, so the algorithm seems to rely on this behaviour. As in point 1, I'd like to find a way to only consider particle pair ij once and apply suitable displacements, but simply omitting the /2 doesn't work properly. Is there a way I could change the calculations to make this possible and still maintain the same behaviour?

Perhaps this is all a bit esoteric, but I thought I'd put the question out there to see if anyone is interested enough to have a go. The optimisations would be immensely useful from a code perspective, but the tweaks which I suspect are needed to the equations need a more mathematical brain than I possess.
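To make points 1 and 2 concrete, here's roughly the shape of the single-visit density pass I've been experimenting with. It's only a sketch in C++ - the names, the "j > i only" neighbour lists and the flat per-pair list are mine, not anything from the paper:

```cpp
// Rough sketch of the single-visit pass from points 1 and 2.
// Assumes each neighbour list only holds indices j > i, so every pair
// appears exactly once. Names/layout are mine, not the paper's.
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

struct Pair {        // one record per interacting pair (i < j)
    int   i, j;
    float q;         // |x_j - x_i| / h, cached for re-use in later steps
};

void accumulateDensities(const std::vector<Vec2>& pos,
                         const std::vector<std::vector<int>>& neighbours, // j > i only
                         float h,
                         std::vector<float>& rho,      // pre-sized to pos.size()
                         std::vector<float>& rhoNear,  // pre-sized to pos.size()
                         std::vector<Pair>& pairs)
{
    pairs.clear();
    std::fill(rho.begin(), rho.end(), 0.0f);
    std::fill(rhoNear.begin(), rhoNear.end(), 0.0f);

    for (int i = 0; i < (int)pos.size(); ++i) {
        for (int j : neighbours[i]) {
            float dx = pos[j].x - pos[i].x;
            float dy = pos[j].y - pos[i].y;
            float q  = std::sqrt(dx * dx + dy * dy) / h;   // the sqrt, once per pair
            if (q < 1.0f) {
                float a = 1.0f - q;
                rho[i]     += a * a;      rho[j]     += a * a;      // ij and ji together
                rhoNear[i] += a * a * a;  rhoNear[j] += a * a * a;
                pairs.push_back({ i, j, q });                        // cache q per pair
            }
        }
    }
}
```

The pair list only holds entries for pairs that actually interact, which is why I'd love to reuse it for the displacement step as well - if only I could make the maths in point 3 work out for a single visit per pair.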
My experience doing physics for games is that most of the optimization comes from processor cache optimization, not from changing the maths. If you haven't already tried that, you may want to give it a try. It's amazing the quantity of calculations you can do during a single cache miss.
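Just to illustrate what I mean (this isn't your code, obviously, just a made-up example), the classic case is array-of-structs versus struct-of-arrays - a pass that only needs positions and densities shouldn't be dragging velocities and everything else through the cache with it:

```cpp
#include <vector>

// Array-of-structs: a pass that touches one field still pulls the whole
// struct (and its padding) through the cache for every particle it reads.
struct ParticleAoS {
    float px, py;
    float vx, vy;
    float vxPrev, vyPrev;
    float rho, rhoNear;
};
// std::vector<ParticleAoS> particles;

// Struct-of-arrays: each pass streams only the arrays it actually reads
// or writes, so far more of every cache line is useful data.
struct ParticlesSoA {
    std::vector<float> px, py;
    std::vector<float> vx, vy;
    std::vector<float> rho, rhoNear;
};
```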
You have a point about cache optimisations - I've made some advances in this area by rejigging the arrangement of data in memory and repeatedly running a performance profiler to see whether I was making improvements. I had some degree of success, and I imagine more could be done, but I'm not really sure how to rigorously and accurately test where the biggest bottlenecks are, or what sort of techniques I should employ to reduce them. Any general advice in this area would be welcome, certainly. Attack the problem on all fronts, so to speak.

The reason I wanted help looking into the algorithm itself is that, to my eyes, it's doing fully double the work it really needs to do. Perhaps there's a good mathematical reason for that (if so, it'd be great if someone could explain it), but eliminating that duplicated work would mean a speed increase of up to 100%, which seems like a pretty significant win to me.
I took a look at the paper. If you're really worried about the square root, note that the choice of kernel is empirical - see section 4.1 - so it should be easy to find a kernel that works well on the squared distance instead. I think the biggest optimization win, though, is ordering your particles so that finding neighbours is as fast as possible.
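For example (a minimal sketch, not anything from the paper - the names are made up), binning the particles into a uniform grid with cell size h means each particle only has to test the particles in its own cell and the eight cells around it:

```cpp
// Minimal sketch of grid-based neighbour finding, with cell size equal to
// the interaction radius h.
#include <cmath>
#include <unordered_map>
#include <vector>

struct Vec2 { float x, y; };

using CellKey = unsigned long long;

static CellKey cellKey(int cx, int cy)
{
    // Pack the two (possibly negative) cell coordinates into one 64-bit key.
    return (static_cast<CellKey>(static_cast<unsigned int>(cx)) << 32)
         |  static_cast<unsigned int>(cy);
}

using Grid = std::unordered_map<CellKey, std::vector<int>>;

Grid buildGrid(const std::vector<Vec2>& pos, float h)
{
    Grid grid;
    for (int i = 0; i < (int)pos.size(); ++i)
        grid[cellKey((int)std::floor(pos[i].x / h),
                     (int)std::floor(pos[i].y / h))].push_back(i);
    return grid;
}

// Candidate neighbours of a particle at p: everything in its cell and the
// eight surrounding cells. The exact distance test (q < 1) still happens in
// the density pass; this just keeps the candidate set small.
std::vector<int> neighbourCandidates(const Grid& grid, const Vec2& p, float h)
{
    std::vector<int> out;
    int cx = (int)std::floor(p.x / h);
    int cy = (int)std::floor(p.y / h);
    for (int dx = -1; dx <= 1; ++dx)
        for (int dy = -1; dy <= 1; ++dy) {
            auto it = grid.find(cellKey(cx + dx, cy + dy));
            if (it != grid.end())
                out.insert(out.end(), it->second.begin(), it->second.end());
        }
    return out;
}
```

Rebuilding the grid each frame is cheap next to the pair loops, and if you also sort the particle arrays by cell index you tend to get better cache behaviour in those loops as well.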

Seems like a well-written paper - I'll keep it around and maybe try it someday. The authors are from Montreal, too :)
The square root isn't of particular concern above other things - I've already reworked the implementation so it's only calculated once per particle pair per frame, rather than twice. My profiler tells me there's no particular single point where performance is suffering, which implies to me that either (as you suggested) every other line contains some kind of cache problem, or (as I suspect is the bigger issue) the code is just "doing too much math" in general - at least partially by considering each particle pair twice.

