[Flocking] How to improve performance?

Started by
17 comments, last by noisecrime 9 years, 11 months ago

I'm doing a simulation of flying birds for my thesis.

I am using the usual steering behaviors: Cohesion, Separation, Velocity Match and Obstacle Avoidance (to avoid terrain).

The program is written in C# and uses Unity3D. I have disabled physics for the boids.

I'm also using alglib to calculate K-nearest neighbors using AKNN. alglib uses k-d trees for this.

So here are the fps I get:

100 boids = 60 fps

200 boids = 35 fps

300 boids = 23 fps

500 boids = 13 fps

1000 boids = 5 fps

As the number of boids increases, performance degrades rapidly. I'd like to do something to improve it, but sadly I am using the free version of unity3D, which does not have a profiler.

Is the performance I'm getting good? I haven't found results online for this.

Should I find another library instead of alglib?

By the way, you can see the program online here. Requires unity3d plugin.

Advertisement
First, do yourself a favor and start thinking of how long it takes to do something, not how many times you can do something per second. It's easier to think about the former than the latter.

It looks like your performance is approximately linear in the number of boids, which is about the best you can hope for. If you were using a more naive algorithm, you could easily see the time it takes to complete a frame growing with the square of the number of boids.

So the issue is then how you can get more than 6000 boid updates per second. I don't know anything about unity3D, but I am sure there are ways to profile your code. If nothing else, one can insert timing calls into the code to generate a log that tells you how long each step is taking.

If everything else fails, post your code so we can see if you are doing something obviously inefficient.

One thing I have heard (but have not confirmed) is that Unity does not support multi-threading, but .Net/Mono does. Therefore you may be able to do all the boid calculations in a separate thread using primitive types, then pass the results back to Unity to update the boid GameObjects. Note that none of the Unity provided types are thread-safe AFAIK. Vector3 should be fine, I hope! ;)

I have made some measurements using the StopWatch class of the .net framework.

When using the following steering behaviors I had the results below - using 500 boids.

Time to build K-D tree: 2.34ms. So this is not the problem.

Time to update boid steerings (all 500 boids, average time of N frames)

Behaviors used: Cohesion, Separation, VelocityMatch, ObstacleAvoidance => 46.77ms

Behaviors used: Cohesion, Separation, VelocityMatch, ObstacleAvoidance => 39.55ms

Behaviors used: Cohesion, Separation, VelocityMatch, ObstacleAvoidance => 24.44ms

So by not using some behaviors the time dropped significantly. The difference from not using VelocityMatch to not using Cohesion was bigger because Cohesion uses k = 30 for aknn while velocity match uses k = 10. When I set k = 10 for cohesion the time dropped as well.

This is the code that behaviors use to get k-nearest neighbors:


public class Flock
{
	List<Entity> boids;
	alglib.kdtree kdt;

	public Flock(params Entity[] _boids)
	{
		boids = new List<Entity>();

		Add(_boids);
	}
	
	public int GetNeighborhoodInfo(Entity character, int maxNeighbors, float maxDistance, float minDotProduct, double aknnApproxVal, out Vector3 center, out Vector3 avgVeL) 
	{
		var point = new double[3]{character.position.x, character.position.y, character.position.z};

		var results = new double[0, 0]; // the array containing the results (points of the k nearest neighbors)

		int neighbors = (maxNeighbors == 0 ? 0 : alglib.kdtreequeryaknn(kdt, point, maxNeighbors, aknnApproxVal));

		// get the results
		alglib.kdtreequeryresultsxy(kdt, ref results);

		center = avgVeL = Vector3.zero;

		// where is the character looking
		var look = character.transform.forward;

		int count = 0;
		for (int i = 0; i < neighbors; ++i)
		{
			var pos = new Vector3((float)results[i, 0], (float)results[i, 1], (float)results[i, 2]);
			var vel = new Vector3((float)results[i, 3], (float)results[i, 4], (float)results[i, 5]);

			if (pos == character.position)
				continue;


			if( Vector3.Distance(character.position, pos) <= maxDistance )
			{
				// check for angle
				if( minDotProduct > -1.0f )
				{
					var offset = pos - character.position;
					if( Vector3.Dot(look, offset.normalized) < minDotProduct )
						continue;
				}
				
				center += pos;
				avgVeL += vel;
				++count;
			}
		}

		center /= count;
		avgVeL /= count;


		return count;
	}
}

I think it's pretty straight-forward... So I guess the only solution is to search for another library for aknn? I read somewhere that akkn could be made more efficient when using octrees instead of k-d trees.

Normally you don't need to run an AI system every frame for every agent. I assume your above Behaviours are on every single agent entity?

One way to optimize is to use agent handlers that calculate behaviour for a bunch of agents. That way you can reuse many calculations.

Another way (these are combinable) is to use Coroutines (Threads are a really bad idea in Unity) to run your algorithms over multiple frames. This can boost performance really good.

The last not so obvious problem you might have is colliders. You wrote above you "disabled" physics on your agents. Does this mean you removed the RigidBody component or do you set it to isKinematic? Do you still have colliders attached to the objects?

The problem is: Unity handles every Collider that is not on the same object as a Rigidbody like it was static. Moving a static collider means for Unity to recalculate EVERY other static collider, which is performance hell. So the way to go with moving colliders is 'Collider + Rigidbody (isKinematic)' if you do want to disable physics on that object. Or you remove the colliders, too.

I am a programming dinosaur, so I only use dynamic memory allocation when I need it. For a simulation like this, nothing other than perhaps the k-d tree implementation should use dynamic memory allocation at all. Your code has `new' all over the place, and this is my first guess as to why it is so slow. Perhaps there is no alternative to using `new' everywhere in C#, but my C/C++ brain is disgusted. smile.png

Is there any way you can switch to C++ for this project?
Álvaro is right about the `new`s. The ones on the Vector3s don't cost anything but the 2 array allocations could be hurting performance and are completely unnecessary. Move the allocation of those 2 arrrays into the constructor and just set the values in GetNeighborhoodInfo.

Álvaro is right about the `new`s. The ones on the Vector3s don't cost anything but the 2 array allocations could be hurting performance and are completely unnecessary. Move the allocation of those 2 arrrays into the constructor and just set the values in GetNeighborhoodInfo.

That is true, but I don't think that is the most pressing performance issue there. Making the arrays fields, will help nonetheless. Also you should use jagged arrays, not twodimensional arrays, as they are faster to fill in the mono version Unity is using, than the latter ones. (Jagged arrays: int[x][y] instead of int [x,y])

I see what you guys mean, but the problem is that alglib expects the array to be of this type. And even If allocate earlier it will still be re-allocated by alglib (that's why it takes the array by reference) as you can see below:

// get the results
alglib.kdtreequeryresultsxy(kdt, ref results);

Also as far as I'm aware dynamic memory allocation in C# is fast (I read it's just a pointer increment, not sure how it works).

I can't switch to C++, it's too late now :P Also Unity doesn't support C++ so...

The last not so obvious problem you might have is colliders. You wrote above you "disabled" physics on your agents. Does this mean you removed the RigidBody component or do you set it to isKinematic? Do you still have colliders attached to the objects?

The agents do have rigidbodies attatched to them because I use them to set the velocity of the agents. However they do not have colliders attatched to them.

The fact that it takes it by reference should actually be proof of the opposite. It's allowing you to handle the allocation. Also pre-allocating it in the constructor doesn't mean you have to change the array type. You can keep the same type, all you have to do is move the allocation:

List<Entity> boids;
alglib.kdtree kdt;
double[] point
double[,] results;

public Flock(params Entity[] _boids)
{
	boids = new List<Entity>();
	point = new double[3];
	results = new double[INSERT, SIZE];

	Add(_boids);
}

public int GetNeighborhoodInfo(Entity character, int maxNeighbors, float maxDistance, float minDotProduct, double aknnApproxVal, out Vector3 center, out Vector3 avgVeL) 
{
	point[0] = character.position.x;
	point[1] = character.position.y;
	point[2] = character.position.z;

	int neighbors = (maxNeighbors == 0 ? 0 : alglib.kdtreequeryaknn(kdt, point, maxNeighbors, aknnApproxVal));

	// get the results
	alglib.kdtreequeryresultsxy(kdt, ref results);
	
	
	...
	
}
Edit: Sorry I misread the code, I fixed it up a bit. I'm not sure what the size of the results array should be. The documentation for alglib says this:


        XY      -   possibly pre-allocated buffer. If XY is too small to store
                    result, it is resized. If size(XY) is enough to store
                    result, it is left unchanged.
So if you provide an array that's properly sized, there won't be any allocation

This topic is closed to new replies.

Advertisement