# Out of ideas for optimizing A* search, got any?

## Recommended Posts

Crayz92    173

Here are the two main files:

A few things I've done to optimize the search:

1. Rather than a dictionary to keep track of scores and parents, I use a 2d array of objects that are mapped to the grid i.e. [x,y] represents grid point (x,y).  Objects have F,G,H and Parent fields, these fields are reset to default values during each search.

2. Cheap weighted heuristic

3. Orthogonal neighbors (searching 4 neighbors rather than 8)

Paths don't have to be beautiful because the point reduction smooths them out quite nice.

In Path.cs you can see the two functions to reduce points, Incremental and Stretch.  Stretch is much faster because it has the potential to reduce points all the way to the end right off the bat or early on thus reducing calls to the LineOfSight function, but it isn't so reliable around small areas/corners.  The incremental function is reliable but it calls to the LineOfSight function for each and every point along the path, this causes the path reduction to be just about as slow as calculating the path itself.

Line of Sight is determined by getting all points between 2 points and checking that they are all flagged as Walkable.

My grid is dynamic and roughly 250x250.  I tried out Theta* and JPS, both were slower than my current A*.  I'm trying to avoid using a navmesh or worker thread, I suppose mostly just looking for ideas to tighten up my code

Edited by Crayz92

##### Share on other sites
ApochPiQ    23062

Have you run a profiler against your code?

##### Share on other sites
Crayz92    173
50 minutes ago, ApochPiQ said:

Have you run a profiler against your code?

I have.  The two biggest bottlenecks are in point reduction and the priority queue (using BlueRaja's which claims to be the fastest)

Here's a screenshot from the Unity profiler:

Profiling alone adds a lot of overhead but I think it gives a good idea of where things sit.  I've never used a profile other than Unity's

##### Share on other sites
ApochPiQ    23062

Actually, the profiler is showing you something slightly different.

Keep in mind that what you care about in this context is not the sum total cost of every call to a function - just the average cost of a single call. If you divide the time spent in a function by its call count, you'll see that the priority queue is dirt cheap per-call, you just call it a whole lot.

This is important data on two fronts. First, you should focus your effort on your algorithm primarily. Look for ways to avoid calling code that has no effect on the final results of the path search. Second, if you look at the actual places your code is burning a lot of time, you'll get much more visible performance wins.

Unity has a "deep" profiler option that you should also look into.

##### Share on other sites
Nypyren    12074

I took a look at some of the code:

- Heuristic.  Is distance-squared admissible for A*?  I've never seen it used for A* before and that is a warning sign to me.

- FastPriorityQueue is being called an extremely high number of times, but each operation should be EXTREMELY short, so the profiler overhead is likely to be polluting the numbers to the point where you should no longer trust them.  You'll have to move your A* code into a standalone .Net project in order to use a sampling profiler with it, as far as I know.

- I looked at the github repo for FastPriorityQueue, and there are some obvious microoptimizations that can be made to it if you determine that it really is worth it.  Change the Node to a struct, change those properties to plain old fields, perhaps change the binary heap to a cache-friendly heap (after the first few layers of a binary heap you will cache miss on every single layer).

- JPS+ should be tremendously faster than A*, so if it wasn't then there's something significantly wrong with the implementation that you tried.

Edited by Nypyren

##### Share on other sites
ericrrichards22    2434

I might be wrong here, but it looks like you might be enqueueing more nodes than you really ought to need to, unless you're hitting a really pathological layout.  250x250 is 62500 total nodes, and your profiler result is showing 59600 calls to Enqueue(), which appears to indicate that it is considering a pretty hefty fraction of the total nodes available.  Assuming no obstacles and an appropriate heuristic, A* ought to pretty much go in a straight line from the start to the goal, and only have to put the neighboring nodes of the ideal path on the open (priority queue) set.

You might want to try tweaking your code so that you can step through the algorithm one iteration at a time, and visualize how nodes are getting added to your open and closed sets at each step, so you can see if it is clearly doing something weird.  Five seconds is really far too long for this to take.

Where your connections all appear to be orthogonal, you might want to look at something like Manhattan Distance as a heuristic, which tends to work very well with A* in such situations.  That would tend to look something like this, lifted from my pathfinding code:

public static float ManhattanDistance(Point start, Point goal) {
var dx = Math.Abs(start.X - goal.X);
var dy = Math.Abs(start.Y - goal.Y);
var h = dx + dy;

return h;
}

##### Share on other sites
Crayz92    173
1 hour ago, Nypyren said:

I took a look at some of the code:

- Heuristic.  Is distance-squared admissible for A*?  I've never seen it used for A* before and that is a warning sign to me.

- FastPriorityQueue is being called an extremely high number of times, so the profiler overhead is likely to be polluting the numbers to the point where you should no longer trust them.  You'll have to move your A* code into a standalone .Net project in order to use a sampling profiler with it, as far as I know.

- I looked at the github repo for FastPriorityQueue, and there are some obvious microoptimizations that can be made to it if you determine that it really is worth it.  Change the Node to a struct, change those properties to plain old fields, perhaps change the binary heap to a cache-friendly heap (after the first few layers of a binary heap you will cache miss on every single layer).

- JPS+ should be tremendously faster than A*, so if it wasn't then there's something significantly wrong with the implementation that you tried.

I've played around with a variety of heuristics, this is the fastest I've come across and it produces acceptable results.  I'll give JPS another try, it's very possibly my implementation is messed up.

1 hour ago, ericrrichards22 said:

I might be wrong here, but it looks like you might be enqueueing more nodes than you really ought to need to, unless you're hitting a really pathological layout.  250x250 is 62500 total nodes, and your profiler result is showing 59600 calls to Enqueue(), which appears to indicate that it is considering a pretty hefty fraction of the total nodes available.  Assuming no obstacles and an appropriate heuristic, A* ought to pretty much go in a straight line from the start to the goal, and only have to put the neighboring nodes of the ideal path on the open (priority queue) set.

You might want to try tweaking your code so that you can step through the algorithm one iteration at a time, and visualize how nodes are getting added to your open and closed sets at each step, so you can see if it is clearly doing something weird.  Five seconds is really far too long for this to take.

Where your connections all appear to be orthogonal, you might want to look at something like Manhattan Distance as a heuristic, which tends to work very well with A* in such situations.  That would tend to look something like this, lifted from my pathfinding code:


public static float ManhattanDistance(Point start, Point goal) {
var dx = Math.Abs(start.X - goal.X);
var dy = Math.Abs(start.Y - goal.Y);
var h = dx + dy;

return h;
}

Ah I didn't mention that profiler screenshot is the result of 100 paths in a single frame.  Here's a better shot for a single path:

I fixed up some code and gave Manhattan distance a test run.  Here's a  comparison between my current heuristic & Manhattan

Manhattan without weight puts it at about 2,200ms.  While Manhattan makes for a better looking path, post-processing takes care of that so I'm mostly interested in speed

Edited by Crayz92

##### Share on other sites
Nypyren    12074

Use an instance of the Stopwatch class to time SetDestination with and without deep profiling enabled, and see how much it differs.  I still think the instrumenting profiler is causing too much overhead on the FastPriorityQueue and it would be good to find out how much there is.  It's the biggest offender if the profiler is telling the truth.

##### Share on other sites
Crayz92    173
3 minutes ago, Nypyren said:

Use an instance of the Stopwatch class to time SetDestination with and without deep profiling enabled, and see how much it differs.  I still think the instrumenting profiler is causing too much overhead on the FastPriorityQueue and it would be good to find out how much there is.  It's the biggest offender if the profiler is telling the truth.

I have my debug script set up just like that

The results in the two screenshots above are from this Stopwatch snippet with deep profile & profiler disabled completely

##### Share on other sites
Nypyren    12074

Ok, so with profiling one call takes 31ms, and without you have 1-2ms.  Definitely too much overhead from Unity's instrumenting profiler.

If you can, take all of your pathfinding code and make a standalone .Net app out of it, and use a sampling profiler on it.  I think newer versions of Visual Studio should have a profiler built into them.  I can't tell if you're on Windows or not from the screenshots, though.  If not, there should also be some freely available profilers available that will run on other OSes.

##### Share on other sites
ApochPiQ    23062

Just a handful of random suggestions:

• Get your algorithm correct first. Micro-optimization is a waste of time if your code is solving the problem wrong.
• In case you aren't already aware, benchmark in Release builds, not Debug builds. This can make a tremendous difference.
• Start with a tiny test case, like 10x10, and walk through the algorithm start to finish in a debugger. Pay attention to the behavior of the code. You will almost certainly find inefficiencies this way.

##### Share on other sites
Crayz92    173
On 9/9/2017 at 11:27 AM, ApochPiQ said:

Just a handful of random suggestions:

• Get your algorithm correct first. Micro-optimization is a waste of time if your code is solving the problem wrong.
• In case you aren't already aware, benchmark in Release builds, not Debug builds. This can make a tremendous difference.
• Start with a tiny test case, like 10x10, and walk through the algorithm start to finish in a debugger. Pay attention to the behavior of the code. You will almost certainly find inefficiencies this way.

I went over my code and found some issues - kind of funny how the heuristic I was using showed better performance with a bug in the A* algorithm.  It's overall better performance now using the diagonal distance heuristic.

On 9/9/2017 at 11:02 AM, Nypyren said:

Ok, so with profiling one call takes 31ms, and without you have 1-2ms.  Definitely too much overhead from Unity's instrumenting profiler.

If you can, take all of your pathfinding code and make a standalone .Net app out of it, and use a sampling profiler on it.  I think newer versions of Visual Studio should have a profiler built into them.  I can't tell if you're on Windows or not from the screenshots, though.  If not, there should also be some freely available profilers available that will run on other OSes.

Yea, definitely a lot of overhead from the profiler.  I use it just to get an idea of where things are at.  I am on Visual Studio, I wonder if its possible for it to hook into the Unity editor and profile that way?  I fixed up my JPS algorithm as well, it shows about 15-20% increased performance but in some cases performed much worse, so I'll have to do some more digging.

##### Share on other sites
Nypyren    12074
1 hour ago, Crayz92 said:

Yea, definitely a lot of overhead from the profiler.  I use it just to get an idea of where things are at.  I am on Visual Studio, I wonder if its possible for it to hook into the Unity editor and profile that way?  I fixed up my JPS algorithm as well, it shows about 15-20% increased performance but in some cases performed much worse, so I'll have to do some more digging.

Unity's using a different .Net runtime (Mono) and I haven't been able to figure out how to get Visual Studio to profile that.

##### Share on other sites
LorenzoGatti    4450
On 9/9/2017 at 6:27 PM, ApochPiQ said:

Just a handful of random suggestions:

• Get your algorithm correct first. Micro-optimization is a waste of time if your code is solving the problem wrong.
• ...

Regarding correctness, if two heuristics give different results and not only different counts of expanded nodes it is crushing evidence there are bugs in the heuristics and/or in the A* search.
Judging from the screenshots the so-called "current" heuristic is inadmissible or worse and the Manhattan distance heuristic is OK.

##### Share on other sites
Kylotan    10007

Manhattan distance is not an admissible heuristic if the character can move diagonally. It's not just sub-optimal, it's wrong. Same goes for distance-squared. Overestimating heuristics will give incorrect results. The direct straight-line distance is the best baseline heuristic for almost all applications, if you don't know for sure that you have a better one.

##### Share on other sites
hplus0603    11356

Note that inadmissible heuristics are totally fine, as long as you want "a" path rather than "the best" path.

Sometimes, inadmissible heuristics will be just the ticket to make pathfinding run well.

In this particular case, I'd first look at why PathQuery.LineOfSight() takes so much time. A simple greedy path simplification algorithm should basically run in linear time. Then, I'd look at whether there are calls to functions that don't need to be made. Then, I'd look at functions that are virtual, and see if they could be made non-virtual. Or maybe even inline.

If all of that still doesn't work, then also look at whether you can re-use previous path finding from other entities, whether you can remember cached segments and re-use them, and whether you can implement a hierarchical finder. Split the 256 grid in 16x16 cells, and classify connectivity between edges as a pre-processing step. Run pathfinding through this coarse grid. Once that works, you can run a single refine step on the found path, and end up with a fine result.

Or you can build passable segments by "inflating" interior geometry (boxes, or spheres) until they hit non-passable areas, and know which pieces connect to which other pieces, and just path-find between pieces rather than cells. Again, a final pass of smoothing will fix up the path.

##### Share on other sites
frob    44972
On 9/8/2017 at 5:21 PM, Crayz92 said:

Here's a screenshot from the Unity profiler

Since you seem to be using Unity, is there a reason you aren't using Unity's own navmesh functionality instead of re-inventing the thing?  It's auto-generated navmeshes are pretty good, and you can create your own to fine-tune anything you need.

##### Share on other sites
Crayz92    173
4 hours ago, frob said:

Since you seem to be using Unity, is there a reason you aren't using Unity's own navmesh functionality instead of re-inventing the thing?  It's auto-generated navmeshes are pretty good, and you can create your own to fine-tune anything you need.

I decided early on to avoid Unity as much as possible and keep all my game's logic within its own boundaries.  Reason for it is so that my server can run headless and/or standalone with no dependencies to Unity.  And reinventing the wheel isn't such a bad thing, if I had decided to stick to Unity's physics and pathfinding I would only have a little bit more knowledge about Unity and a lot less knowledge about pathfinding and physics.

##### Share on other sites
Wyrframe    2426

Try having your "result" debug tool (the one showing the path line) also show each grid node that became opened during the search, and colour each node by a function of its (best found) G cost, so you can visually track the open set and the best path found to each cell in the open set. Like others are saying, after reviewing your code, I think you're massively over-opening the search space.

##### Share on other sites
ntroPi    449

If you want 8 direction movement, your heuristic should support that. Creating an "8 direction manhattan style estimate" between two points should be no big deal. (dist = min(xDiff, yDiff) * sqrt(2) + abs(xDiff - yDiff))

Now to make A* really lightning fast there is an counter intuitive tip: Make the heuristic under/overestimate the distance to the target so it doesn't backtrack as much to go around an obstacle. This is exactly what everyone says the A* heuristic must never do, because now it will no longer be guaranteed to calculate a correct shortest path at all. But usually in games you don't care about the path being 100% the shortest one.

The upside are crazy performance gains up to a factor 10! In extreme cases like labyrinth style maps the results can be very wrong though.

You'll have to experiment a little as I did that a long time ago and don't remember what the best factor was exactly. You'll know when you've hit the sweet spot. CPU time will drop like a stone and the paths should still look mostly optimal.

Hope this helps :-)

Edit: Postprocessing/smoothing a found path can reduce the introduced error, especially in the more obvious suboptimal cases like running into a dead end.

Edited by ntroPi

## Create an account

Register a new account

• ### Similar Content

• When my program was generating navigation meshes for my geometry, and the extent is 5000x5000 (m), with a cell size of 5.0 meters, it has to generate 250,000 cells, which makes my computer halt, the target object is a truck, which has a radius of about 20 meters, should I give the cell size a value of 20, because I don't really want to use this navigation mesh on one type of vehicle only. I want to use it for vehicles like cars, which has only about 5 meters in radius... the generation speed is a problem while the memory consumption is another.
Any ideas how to improve this?
Thanks
Jack

• 3D Rigger Mixamo Is Making Some Changes
Originally founded in 2008, Mixamo has been a go-to source for 3D animation and rigging for nearly a decade. On August 28th, 2013, Mixamo launched a platform called Face Plus. It was designed to allow developers or animators turn facial expressions recorded with even a webcam into top-notch 3D animation. This is not unlike what Holotech Studios’ “FaceRig” does for streamers and video creators as well.
Back this past May 2017, Adobe (publishers of the Mixamo platform since 2015) announced the end of an era. Mixamo’s Face Plus will be removed on August 22nd, 2017, as well as several other free features.
According to the Adobe post on the Mixamo website, they lay out what is changing and what is staying the same.

Set out as bullet points in the post, they seem to be cleaning house on the “free” features. But, they are keeping a small handful of their main key points, such as:
Select and use characters from Mixamo’s Character Collection
Upload 3D characters to get them ready to animate using the Auto-Rigger
Apply animations to characters
Noticeably, the list that’s getting 86’ed includes major tools and the ability to save assets from their site:
The ability to save and retrieve assets in “My Assets”.
The Face Plus facial animation plugin
The Decimator polygon reduction tool
Download types for .bvh and .biped that streamline integration with 3rd party applications
Mixamo forums will close and all help articles will be moved to Adobe’s HelpX
Don't Worry! There's still time!

They’re allowing people to still download the tools and plug-ins until August 22nd, as well as utilizing their easy-to-download features of saving Assets directly. Those that use Mixamo, utilize Face Plus, or even just do animation are highly encouraged to take advantage and download the tools for the next week.
UPDATE (8/15/17): Adobe would like to stress that Mixamo, itself, is not going anywhere, just the Face Plus features, the tools mentioned above, and a couple other features to make the platform more "streamlined and a little modernized". Once the dust is cleared next week from the changes, I'll reach out to Adobe from a more guided tour of the new set-up. Additionally, I have changed the title of this article to reflect the clarification. Apologies for the inconvenience to those that have already read the article.

• By suliman
Hi
I have a A* gridbased pathfind for a city builder I'm planning (think banished or stronghold).
Units can walk in 8 directions from each tile. But I want more free movement. Look at the example pic below. The pathfind finds the path in step 0 to 19.
But now I want to "simplify" the path so the unit goes straight from node marked 0 to 10 (and obviously also from 12 to 19; i forgot to mark this in the picture).
Any good algoritm to "reduce" the path in this way? I understand I need to check if there is a unbroken "line of walk" between two nodes but if i start with checking 0 to 1, 0 to 2, 0 to 3 until i find that 0 to 12 doesnt work and stop there (so I throw away nodes marked 1 through 8) this would be VERY slow.
Any idea?

• By suliman
Hi
In a game like stronghold with many workers going back and forth from a workplace to a dropoff-point for example, would I store the path and then use it backwards to not have to redo pathfinding all the time? If the moving units in the city doesnt block movement, very often I could just reuse the old path. Only if i run into a wall (if the player constructs a new building in the path of the old workers) will I recalculate the path.
Workers in old age of empires for example, do they pathfind every time they go back to drop off gold in the storage?
Thanks!
Erik

• 25
• 13
• 28
• 10
• 13