# Fastest detection algorithm?

## Recommended Posts

For a real-time strategy game, I use a grid to calculate collision. Units don't have a rigid body, they only occupy a cell of the physics grid and if a unit is trying to move on a cell that is occupied, a collision is triggered. The performance is pretty good.

To detect if an enemy is close enough to do a melee attack, I use the same physics grid and I parse the 8 cells around a unit to find any enemies. The problem is that 8 cell scan per unit is too slow. The biggest bottleneck in my game is that units are constantly scanning the cells around them to detect enemies. This might not seem like a lot at a first glance, but it adds up really fast when this process is repeated tens of thousands of times.

How can I improve my physical detection? Would running a rigid body simulation be faster than scanning cells?

##### Share on other sites

I think you should show your existing implementation first, because what you have should, in theory, be lightning fast.

##### Share on other sites

It's hard to tell as you didn't post your implementation.
You're right in that GPU is generally faster than CPU. But I wouldn't implement rigidbody just for the sake of that.
So, if you have grid, why not simply cache which cells are occupied? Would that help?

##### Share on other sites

Yeah, I think you need to post your current implementation, directly indexing to see if something is in a grid should be very cheap.  ObjCells[myX +1 + (myY+1) * mapWidth].indexToObjArray != -1 or what have you.

Though, uh, one cheesy speed-up already comes to mind, maybe only do 4 directions instead of 8 =P

And considering you want to do a Total War-alike, with lots and lots of units, you really want to approach things as a struct of arrays.

Edited by ferrous

##### Share on other sites

Just looking for around cells shouldn't be expensive. But you can do some logical optimizations.

If you find an enemy for current unit, you can inform the enemy about current unit as an enemy. This reduce cost by %50.

Also, check only moving units, stationary unit doesn't need to check for enemy, moving enemy will inform itself...

##### Share on other sites

1 hour ago, ptietz said:

So, if you have grid, why not simply cache which cells are occupied? Would that help?

This might help a lot because I realised I might have a very serious case of cache misses.

45 minutes ago, ferrous said:

Though, uh, one cheesy speed-up already comes to mind, maybe only do 4 directions instead of 8 =P

I'm not sure that would work because my units can move diagonally. I imagine if two units are in diagonal neighbour cells, they might ignore each other. Or not because they'd keep moving in their adjacent cells ? I'm definitely going to try that!

33 minutes ago, wisecode said:

If you find an enemy for current unit, you can inform the enemy about current unit as an enemy. This reduce cost by %50.

Also, check only moving units, stationary unit doesn't need to check for enemy, moving enemy will inform itself...

This sounds incredibly simple and ingenious at the same time.

5 hours ago, Kylotan said:

I think you should show your existing implementation first, because what you have should, in theory, be lightning fast.

This is pseudo code of my enemy detection function. I think I would rather not post the real thing because it might be offensive to some (it's not very pretty).

Entity detect_enemy(Entity me)
{
for each cell in 8 cells around me
{
if (cell.is_empty())
continue;

Entity other = cell.get_entity();
if (other.isEnemy(me))
return other;
}
return no one;
}

I'll try all of your advice and let you know of the result.

##### Share on other sites

Don't you want to return a list of all entities around them?  Otherwise, you only ever get the first one found.

##### Share on other sites
7 hours ago, Michael Aganier said:

To detect if an enemy is close enough to do a melee attack, I use the same physics grid and I parse the 8 cells around a unit to find any enemies. The problem is that 8 cell scan per unit is too slow. The biggest bottleneck in my game is that units are constantly scanning the cells around them to detect enemies. This might not seem like a lot at a first glance, but it adds up really fast when this process is repeated tens of thousands of times.

This makes me wonder about a few things.

Why is every unit constantly scanning? Shouldn't they only scan when their current action is complete, which typically means once every 1-2 seconds?

How many units have you got in order for this to be repeated tens of thousands of times?  Each unit doing the scan should only scan the four neighboring cells for a list, and choose one from the list if the list is not empty. Even if you had a relatively high number of 1000 characters in the world, with roughly 1-2 second actions only about 10 would be scanning their neighbors any given frame, and 10 queries of either 5 cells or 9 cells should complete in under a microsecond.

If you're doing those things already and still have "tens of thousands of times" every second, it would mean you've got 20,000+ units out there all looking for actions.   At that point you're talking about perhaps a full frame of 15 or 20 milliseconds doing that work, but you're going to have other serious bottlenecks with that many objects in the world.

##### Share on other sites
2 hours ago, Michael Aganier said:

This is pseudo code of my enemy detection function. I think I would rather not post the real thing because it might be offensive to some (it's not very pretty).

You say it's not pretty, so maybe that's the problem? I think you should show the mess  Maybe bad memory layout or too much indirection. Maybe you should sort entities so they match grid memory order to improve caching.

2 hours ago, Michael Aganier said:
3 hours ago, wisecode said:

If you find an enemy for current unit, you can inform the enemy about current unit as an enemy. This reduce cost by %50.

Also, check only moving units, stationary unit doesn't need to check for enemy, moving enemy will inform itself...

This sounds incredibly simple and ingenious at the same time.

With your current code, each entity would detect the top left neighbour first (if there is one and top left is first cell to look for). So using the suggestion would break this behavior. (Maybe that does not matter, just to mention. Maybe you want this more random anyways.)

##### Share on other sites
15 hours ago, Michael Aganier said:

This is pseudo code of my enemy detection function.

The problem with the pseudo code is that it leaves out all the important information:

• How do you determine which the 8 surrounding grid squares are? These should be O(1) operations but it's possible to get this wrong.
• How is cell.get_entity implemented? If there's only one entity possible there, this too should be O(1), but again, maybe this is wrong.
• How is isEnemy implemented? Again, see above.

##### Share on other sites
On 04/12/2017 at 1:59 PM, frob said:

How many units have you got in order for this to be repeated tens of thousands of times?  Each unit doing the scan should only scan the four neighboring cells for a list, and choose one from the list if the list is not empty.

It's a 2D game with a very basic art style because I want to push a maximum number of units. As of now, I'm only picking the first target and ignoring the others to save on CPU cycles.

On 04/12/2017 at 1:59 PM, frob said:

If you're doing those things already and still have "tens of thousands of times" every second, it would mean you've got 20,000+ units out there all looking for actions.

Yes. The initial prototype of the game with nothing but unit movement would have 500 000 units at ~20fps (only a fraction are on screen, but they're all moving). As I started to add combat and collision, it quickly started going down. Right now, I'm trying to find the balance between simulation accuracy and unit number.

On 04/12/2017 at 1:59 PM, frob said:

Why is every unit constantly scanning? Shouldn't they only scan when their current action is complete, which typically means once every 1-2 seconds?

That's gonna sound ridiculous, but my game stutters if my units aren't scanning constantly. I think it has to do with my threaded code because it stops stuttering with multi thread disabled, but my framerate drops by a lot. Right now, I'd rather keep the constant scanning + multi thread because it gives me both higher framerate and no stuttering which can be improved by reducing the scanning time without reducing its occurrence.

21 hours ago, JoeJ said:

Maybe bad memory layout or too much indirection. Maybe you should sort entities so they match grid memory order to improve caching.

My grid contains cells that own lists of pointers to entities that are in the cell. Cells are packed in the grid, but I can't pack entities in cells because they're can be many entities in one cell.

9 hours ago, Kylotan said:

How do you determine which the 8 surrounding grid squares are? These should be O(1) operations but it's possible to get this wrong.

I dynamically calculate the bounding box coordinates around the unit and iterate through these as indexes inside the grid. I did it that way so that if I want units to have a longer range than 1 cell, I can just give it a radius in cells and it gives the bounding box around the unit which can be scanned for enemies.

9 hours ago, Kylotan said:

How is cell.get_entity implemented? If there's only one entity possible there, this too should be O(1), but again, maybe this is wrong.

Many enemies can be in a cell, but you'll always only see one. The collision guarantees that if there's an enemy in the cell, then there can be no friendly.

Entity cell.get_entity()
{
return this->entities.first(); // entities being a list of pointers
}     
9 hours ago, Kylotan said:

How is isEnemy implemented?

By team id comparison.

bool Entity.is_enemy(Entity other) {
return this.team_id != other.team_id;
}

Edited by Michael Aganier

##### Share on other sites
51 minutes ago, Michael Aganier said:
22 hours ago, JoeJ said:

Maybe bad memory layout or too much indirection. Maybe you should sort entities so they match grid memory order to improve caching.

My grid contains cells that own lists of pointers to entities that are in the cell. Cells are packed in the grid, but I can't pack entities in cells because they're can be many entities in one cell.

Probably that's it. The lists will be dynamically resized, so dynamic allocation. Lists will be scattered through memory, so bad cache utilization, and probably garbage collection does the rest.

cell.ptr -> entity55.next_ptr -> entity87.next_ptr -> null

Inserting is fast (just insert at the cell), and removal too (faster with double linked list - worth if lists are large).

Downside: Each entity can only be linked to a single cell. This works if the entity bounding box is not larger than the cell size. For each cell you need to check neighbours to find all entities that intersect it, but i assume you get a big speed up. (Same concept as in a loose quadtree, which would solve the max size limitation.)

##### Share on other sites
1 hour ago, Michael Aganier said:

That's gonna sound ridiculous, but my game stutters if my units aren't scanning constantly. I think it has to do with my threaded code because it stops stuttering with multi thread disabled, but my framerate drops by a lot. Right now, I'd rather keep the constant scanning + multi thread because it gives me both higher framerate and no stuttering which can be improved by reducing the scanning time without reducing its occurrence.

You should really consider detaching your logic and rendering timesteps from one another. There's no reason that logic updates should have any impact on the framerate.

Ideally, you'd perform logic updates (for the moment continuously) on a background thread. Copy data for only units which (a) are within the camera bounds, and (b) have taken an action recently into a separate data structure to be used by the main rendering thread.

##### Share on other sites
1 hour ago, Michael Aganier said:
On 4.12.2017 at 7:59 PM, frob said:

Why is every unit constantly scanning? Shouldn't they only scan when their current action is complete, which typically means once every 1-2 seconds?

That's gonna sound ridiculous, but my game stutters if my units aren't scanning constantly. I think it has to do with my threaded code because it stops stuttering with multi thread disabled, but my framerate drops by a lot. Right now, I'd rather keep the constant scanning + multi thread because it gives me both higher framerate and no stuttering which can be improved by reducing the scanning time without reducing its occurrence.

If you don't want to deal with threading now, you could distribute the work equally over multiple frames e.g.:

frame 1: update entities 0-999

frame 2: update entities 1000-1999

etc.

##### Share on other sites
2 hours ago, Michael Aganier said:

I dynamically calculate the bounding box coordinates around the unit and iterate through these as indexes inside the grid. I did it that way so that if I want units to have a longer range than 1 cell, I can just give it a radius in cells and it gives the bounding box around the unit which can be scanned for enemies.

You explaining how the code works is not the same as posting the exact code. If you want precise feedback instead of vague guesses every time you add more information that someone randomly asks for, post the exact code.

---

The is_enemy function should probably take an entity by (potentially const) reference or pointer -- might be cheaper than copying it around. At least if it's a big class/struct.

##### Share on other sites

I'm not really familiar with this type of game, so take this with a healthy pinch of salt...

But why do you need to have pointers to each individual unit? Surely you only need to keep track of the number of units of a given type (infantry, cavalry, artillery men, medics etc.) within the cell - it's not like each soldier has a name and personality you need to take into account. You could handle health by just keeping track of the average health of each type within a cell, rather than individually. If the player moves some troops out of the cell, just subtract the numbers from the original cell, and add them to a new one.

This would massively reduce the number of checks you need to do, and combined with a quad tree for your cells would reduce it even further.

##### Share on other sites
9 hours ago, Michael Aganier said:

500 000 units at ~20fps (only a fraction are on screen, but they're all moving).

That's a large number.  It takes significant care to reach that kind of number, about 10M updates per second.  There are some CPU-based particle systems and heavy computing systems that handle it, but it requires attention to details like cache effects and data sizes.

If you're randomly hopping around in memory you won't hit that on modern systems. You'll likely be running into memory bandwidth and timing issues unless you're carefully organizing data to properly stream to the CPU.

##### Share on other sites

I don't understand the question. You're using a grid for collision detection, so you can't really do better than that. Most generic physics engines divide collision detection in two phases, the first just reduces to a possible set of collisions, the second does the proper collision detection... and on top of that, they mostly use BVHs and SweepAndPrune algorithms that scale "logarithmically" instead of "linearly" (a grid).

Edited by RnzCpp

##### Share on other sites
2 hours ago, RnzCpp said:

I don't understand the question. You're using a grid for collision detection, so you can't really do better than that.

That was basically the question. I'm not very familiar to how other games do it so I wanted to know if there was a better way to do it. Since it seems like the fastest say, the next point was about how to improve my implementation.

10 hours ago, Lactose said:

You explaining how the code works is not the same as posting the exact code. If you want precise feedback instead of vague guesses every time you add more information that someone randomly asks for, post the exact code.

You are right. I'm reluctant to posting the exact code because it calls many sub functions who themselves call more functions. I would have to copy bits of code from everywhere to get the whole thing. I know I'm being vague, but everyone's advice has been helpful nonetheless.

##### Share on other sites

Having followed this thread and your other on sprite rendering.. the bigger question that comes to mind is why you want to run AI / physics / individual rendering on 500K units? Granted, considering such large numbers can be an interesting optimization and theoretical exercise, but there may be unforeseen issues in valmorphanizing this into a game.

Aside from the psychology aspects of this, I think one of the key points is that games often don't do what you think they do. Games are all about smoke and mirrors. A good game programmer will make you think they are individually processing 500K units, but really be processing maybe 50 groups, and using a bunch of statistics / randomization / precalculated results, to give you something that *looks like* that number.

##### Share on other sites
1 hour ago, Michael Aganier said:

I'm not very familiar to how other games do it

Other games don't try and simulate 500,000 units - more like a thousandth of that, which means a million times fewer potential collisions to check for. That's basically the only really important fact here.

##### Share on other sites
1 hour ago, lawnjelly said:

the bigger question that comes to mind is why you want to run AI / physics / individual rendering on 500K units?

It's really a matter of pushing the optimisation as far as I can. I don't want to lose the physical feeling of each individual unit. In the end, if I can get 5fps at 500k, it means I can probably get 30+ at 100k. If my game normally starts with 1k units, it means I won't have to worry that it becomes unplayable in the late game when the player builds much bigger armies.

Edited by Michael Aganier

##### Share on other sites

If it can help, here are pictures of the game with a grid showing the collision cells. Blue cells are occupied by one or more entities. Some stats are in the green area of the window. The framerate is 60 because the game is paused when there is no physics calculation happening. It drops about half of that when unpaused.

##### Share on other sites

You can increase the number of threads and increase the minimum hardware?

##### Share on other sites

Can you link to a youtube video that show's the kind of thing you are trying to achieve?

## Create an account

Register a new account