# A* - doesn't it check for unlimited ways?

## Recommended Posts

Hi, I've read the tutorial about A* and I've build a version into my game. It works quite well, but when I correctly follow what it does, it will return a path (even the shortest of the ones it finds (every node that has the same distance (and thus needs to be chosen randomly) I go through, so noone would be left out)). However, it can be possible that it is not the shortest of all ways. The point is, that when I try to improve it, that I find out it had unlimited ways of getting from A to B. That is true, there are, in theory, unlimited ways of going from A to B... However, I don't want my script to find all ways (which is unlimited) ofcourse. What am I doing wrong? What feature in A* have I not implemented? I do roughly this: It gets a node, checks for the surrounding nodes and chooses the smallest approx. distance. Of that node it checks for surrounding nodes too...picks the one with the smallest approx. distance to its goal...etc etc etc. For every node I check for surrounding nodes, I get maximal 8 new nodes which have to be checked (or less, if it is not walkable). This will make for every handled node 8 new nodes...that increases quite fast into unlimited nodes. What am I doing wrong? Thanks in advanced. Decrius

##### Share on other sites
If I understand you correctly, your current approach is to select the "next node" as being the neighbor of the "previous node" which minimizes the "distance to origin + heuristic" total.

The correct approach for A* is to select the "next node" as being the neighbor of any previously explored node and minimizing the "distance to origin + heuristic" total. Which means you will usually have to test more than just eight nodes. However, each node is only tested once, and the heuristic, if chosen correctly, should help prune the search tree even further.

##### Share on other sites
Quote:
 Original post by DecriusI go through, so noone would be left out)). However, it can be possible that it is not the shortest of all ways.

That is true, A* does not necessarily guarantee that the path returned will be the shortest; however in most cases it does a pretty good job at generating an optimal path.

If you want to guarantee the shortest path then you could use Dijkstras Algorithm. Only problem with that algorithm is that it is much more brute force and not as convergent as A*. Depends what you want really, good paths that are quick to find (but not unnecessarily the best) or the best possible paths that are really slow to find.

##### Share on other sites
Quote:
Original post by Darragh
Quote:
 Original post by DecriusI go through, so noone would be left out)). However, it can be possible that it is not the shortest of all ways.

That is true, A* does not necessarily guarantee that the path returned will be the shortest; however in most cases it does a pretty good job at generating an optimal path.

Actually A* is intended to use an admissible heuristic, so it is guaranteed to find the optimal path. That is, the heuristic that estimates the cost from a given node to the goal node should return an optimistic value compared to the actual cost.
In the case of path-finding an optimistic estimate would be a distance which is less than or equal to the actual distance; the length of a straight line between the two points would typically satisfy this condition.

##### Share on other sites
Quote:
Original post by dmatter
Quote:
Original post by Darragh
Quote:
 Original post by DecriusI go through, so noone would be left out)). However, it can be possible that it is not the shortest of all ways.

That is true, A* does not necessarily guarantee that the path returned will be the shortest; however in most cases it does a pretty good job at generating an optimal path.

Actually A* is intended to use an admissible heuristic, so it is guaranteed to find the optimal path. That is, the heuristic that estimates the cost from a given node to the goal node should return an optimistic value compared to the actual cost.
In the case of path-finding an optimistic estimate would be a distance which is less than or equal to the actual distance; the length of a straight line between the two points would typically satisfy this condition.

No. It is not guaranteed to find the optimal path for A*. Don't confuse "it's very likely/probably" with "guaranteed". A* will return the optimal path if the true path cost matched exactly to the heuristic. Which is very hard, since a convex obstacle will definately make the heuristic not match the cost (in most cases).
Like been said, Dijkstras will return optimal path always. While A* will often return the optimal one.

Quote:
 For every node I check for surrounding nodes, I get maximal 8 new nodes which have to be checked (or less, if it is not walkable). This will make for every handled node 8 new nodes...that increases quite fast into unlimited nodes.

ToohrVyk already explained your mistake here. But I'll repeat something that might not be very clear. The A* formula is F = G + H, where G is the true cost and H is the heuristic. You're talking about comparing H values, while you need to compare F values. Plus that, ToohrVyk explained that you're also comparing against the wrong nodes.

Edit: G is the cost travelled so far. So if making one step costs "1", then the second parsed node will contain G = (G from it'sparent) + G from this node. In the example the second node (assuming they're both related) will have a G = 2:

---
| 2 |
--- --- ---
| 1 | 2 | 3 |
--- --- ---
| 2 |
---

Hope this helps
Dark Sylinc

##### Share on other sites
Thanks guys, I see my mistake. I'm rewriting the code completely now :)

As for dijsktra's method: I want the shortest path, but if it is not exactly the shortest one it should be no problem. And since dijkstra's method is much more CPU consuming, I think its better to use the A* method.

Sometimes 2 nodes around one parent-node have the same H value, then you can randomly take one of the nodes OR you could try both. The second method would result in 2 possible paths. At the end I could choose for the shortest.
Would that be correct? Would that higher the chance of getting the shortest path?

Thanks for your help se far,
Decrius

##### Share on other sites
Quote:
 Original post by Matias GoldbergNo. It is not guaranteed to find the optimal path for A*. Don't confuse "it's very likely/probably" with "guaranteed".

I didn't [smile]
Like I said, if the heuristic function is admissible, i.e. it gives an underestimate (Edit: More accurately, it does not give an over-estimate) of the distance, then the A* search is admissible also.

Quote:
 A* will return the optimal path if the true path cost matched exactly to the heuristic.

Correct, although in this case the A* search will examine the most optimal set of nodes, that is it will only examine those nodes which are actually on the optimal path.

Quote:
 Like been said, Dijkstras will return optimal path always. While A* will often return the optimal one.

No, A* will return the optimal path if the heuristic is admissible. Look in any good state space search algorithms book.

[Edited by - dmatter on March 8, 2008 12:01:50 PM]

##### Share on other sites
Quote:
 Original post by dmatterif the heuristic is admissible

There we go, when is the heuristic admissible? I use the method of counting width and height differences...works ok, but could be better I guess.

##### Share on other sites
Quote:
Original post by Decrius
Quote:
 Original post by dmatterif the heuristic is admissible

There we go, when is the heuristic admissible?

I answered this in a post above.
For spatial pathfinding, an admissible heuristic might be one that returns an "as the crow flies" distance from the current node's position to the goal node's position, this is the straight line distance between the two points and is regardless of whether such a straight line path is walkable or not.

Quote:
 I use the method of counting width and height differences...works ok, but could be better I guess.

I'm not sure I follow.

##### Share on other sites
Quote:
 Original post by DecriusThanks guys, I see my mistake. I'm rewriting the code completely now :)As for dijsktra's method: I want the shortest path, but if it is not exactly the shortest one it should be no problem. And since dijkstra's method is much more CPU consuming, I think its better to use the A* method.

As *correctly* staten above, if you use an admissible heuristic with A*, you will always get the shortest path (if it exist). You can also use an non-admissible heuristic. Then you are not guaranteed a shortest-path BUT, if you are clever about your heuristic, you can sometime reduce the search time.

Quote:
 Original post by DecriusSometimes 2 nodes around one parent-node have the same H value, then you can randomly take one of the nodes OR you could try both. The second method would result in 2 possible paths. At the end I could choose for the shortest.Would that be correct? Would that higher the chance of getting the shortest path?

Again, if you use an admissible heuristic, it wont matter which of these two nodes you try first.

It is a good exercise to do the proof that A* will always return the shortest-path with an admissible heuristic. Typical first-semester assignement in CS :)

##### Share on other sites
Quote:
Original post by dmatter
Quote:
 I use the method of counting width and height differences...works ok, but could be better I guess.

I'm not sure I follow.

I think he means the Manhattan distance.

An admissible heuristic is an heuristic that is always smaller or equal to the *real* cost. You need to "underestimate" the distance to the goal. For a simple 2D pathfinding, the classical choice is to use the Euclidean distance. Remember that A* can be applied to any positively weighted graph, it doesnt have to be pathfinding...

##### Share on other sites
Quote:
Original post by dmatter
Quote:
 Original post by Matias GoldbergNo. It is not guaranteed to find the optimal path for A*. Don't confuse "it's very likely/probably" with "guaranteed".

I didn't [smile]
Like I said, if the heuristic function is admissible, i.e. it gives an underestimate (Edit: More accurately, it does not give an over-estimate) of the distance, then the A* search is admissible also.

Sorry then, but didn't seem like you did. [cool]

Quote:
Original post by dmatter
Quote:
 Like been said, Dijkstras will return optimal path always. While A* will often return the optimal one.

No, A* will return the optimal path if the heuristic is admissible. Look in any good state space search algorithms book.

[/quote]
Now I guess I was missunderstood. That's what I tried to say. Though, doesn't seem like hehe
My mistake, sorry, but I've been working on variable-sized grids in A* since a long time and that statement is not totally true. Actually it is still true, but it's harder to calculate an exact heuristic (and even cost) in such grids.

Dark Sylinc

##### Share on other sites
Quote:
Quote:
Original post by dmatter
Quote:
 I use the method of counting width and height differences...works ok, but could be better I guess.

I'm not sure I follow.

I think he means the Manhattan distance.

Ah, I see, yes that makes sense now.

In that case, what I'm talking about is a calculation of the Euclidean distance as the heuristic, which as Steadtler mentions, is the classical choice for spatial pathfinding.

I guess Manhattan distance should work too provided that there is no cost considered in changing direction, which there might be if you were actually more interested in finding the quickest path rather than the shortest. This would be analagous with traversing a positively weighted graph, which, as Steadtler again mentions, A* works with too. [smile]

##### Share on other sites
Quote:
 Original post by Matias GoldbergNo. It is not guaranteed to find the optimal path for A*.

Yes, it is.

Proof:

Let d(x,y) be the shortest distance from x to y. Let L(x) = d(start,x).

By definition, the value associated to each node x is f(x) = g(x) + h(x), where g(x) is intended to be L(x), and h(x) is a heuristic that verifies that 0 ≤ h(x) ≤ d(x,end) and h(x) ≤ h(y) + d(x,y).

We show that A* correctly computes g(x) = L(x) for any explored node—if this is the case, then one can find a minimal path by backtracking from the destination based on the value of g(x).

Initially g(start) = L(start) = 0, so we only need to prove that when x is added to the explored-node list, L(x) = g(x) = d(x,y) + g(y), where y is the explored node adjacent to x which minimizes d(x,y) + g(y).

Let us suppose this is not the case: thus, since g(y) = L(y) as an inductive hypothesis, g(x) > L(x). Since y minimizes d(x,y) + g(y) in the explored node list, there exists a shorter path from the start to x which moves through at least one non-explored node. Let z be the first non-explored node of a minimal path from the start to x: then z is accessible from an explored node, and since it has not been added to the set of explored nodes instead of x, f(z) ≥ f(x).

However, g(z) = L(z), because z is the first non-explored node of the minimal path from the start to x, and so the minimal path to z is entirely comprised of explored nodes. Thus g(z) < g(x). Furthermore, since h is an admissible heuristic function, it necessarily follows that h(z) ≤ d(x,z) + h(x) ≤ h(x). And thus, g(z) + h(z) < g(x) + h(x), which contradicts the result that f(z) ≥ f(x).

Therefore, at every explored node, L(x) = g(x), including g(end) = L(end), from where one easily deduces that the shortest path can be extracted.

##### Share on other sites
Quote:
Original post by ToohrVyk
Quote:
 Original post by Matias GoldbergNo. It is not guaranteed to find the optimal path for A*.

Yes, it is.

Proof:

Let d(x,y) be the shortest distance from x to y. Let L(x) = d(start,x).

By definition, the value associated to each node x is f(x) = g(x) + h(x), where g(x) is intended to be L(x), and h(x) is a heuristic that verifies that 0 ≤ h(x) ≤ d(x,end) and h(x) ≤ h(y) + d(x,y).

We show that A* correctly computes g(x) = L(x) for any explored node—if this is the case, then one can find a minimal path by backtracking from the destination based on the value of g(x).

Initially g(start) = L(start) = 0, so we only need to prove that when x is added to the explored-node list, L(x) = g(x) = d(x,y) + g(y), where y is the explored node adjacent to x which minimizes d(x,y) + g(y).

Let us suppose this is not the case: thus, since g(y) = L(y) as an inductive hypothesis, g(x) > L(x). Since y minimizes d(x,y) + g(y) in the explored node list, there exists a shorter path from the start to x which moves through at least one non-explored node. Let z be the first non-explored node of a minimal path from the start to x: then z is accessible from an explored node, and since it has not been added to the set of explored nodes instead of x, f(z) ≥ f(x).

However, g(z) = L(z), because z is the first non-explored node of the minimal path from the start to x, and so the minimal path to z is entirely comprised of explored nodes. Thus g(z) < g(x). Furthermore, since h is an admissible heuristic function, it necessarily follows that h(z) ≤ d(x,z) + h(x) ≤ h(x). And thus, g(z) + h(z) < g(x) + h(x), which contradicts the result that f(z) ≥ f(x).

Therefore, at every explored node, L(x) = g(x), including g(end) = L(end), from where one easily deduces that the shortest path can be extracted.

I haven't fully read it yet, but I have already apoligized for saying such statement.
I appreciate you for taking the time to explain it. Although I already know that explanation.

Dark Sylinc

##### Share on other sites
Quote:
 Posted by DecriusSometimes 2 nodes around one parent-node have the same H value, then you can randomly take one of the nodes OR you could try both. The second method would result in 2 possible paths. At the end I could choose for the shortest.Would that be correct? Would that higher the chance of getting the shortest path?

It doesn't matter which node you pick. If you picked the one that was longer, at some point the other node would become more favourable (because the longer path *must* have strayed from the heuristic estimate (otherwise it would be the shortest path). A* will then start expanding that node instead of continuing on the longer path.

##### Share on other sites
Quote:
Original post by ToohrVyk
Quote:
 Original post by Matias GoldbergNo. It is not guaranteed to find the optimal path for A*.

Yes, it is.

Proof:

Let d(x,y) be the shortest distance from x to y. Let L(x) = d(start,x).

By definition, the value associated to each node x is f(x) = g(x) + h(x), where g(x) is intended to be L(x), and h(x) is a heuristic that verifies that 0 ≤ h(x) ≤ d(x,end) and h(x) ≤ h(y) + d(x,y).

We show that A* correctly computes g(x) = L(x) for any explored node—if this is the case, then one can find a minimal path by backtracking from the destination based on the value of g(x).

Initially g(start) = L(start) = 0, so we only need to prove that when x is added to the explored-node list, L(x) = g(x) = d(x,y) + g(y), where y is the explored node adjacent to x which minimizes d(x,y) + g(y).

Let us suppose this is not the case: thus, since g(y) = L(y) as an inductive hypothesis, g(x) > L(x). Since y minimizes d(x,y) + g(y) in the explored node list, there exists a shorter path from the start to x which moves through at least one non-explored node. Let z be the first non-explored node of a minimal path from the start to x: then z is accessible from an explored node, and since it has not been added to the set of explored nodes instead of x, f(z) ≥ f(x).

However, g(z) = L(z), because z is the first non-explored node of the minimal path from the start to x, and so the minimal path to z is entirely comprised of explored nodes. Thus g(z) < g(x). Furthermore, since h is an admissible heuristic function, it necessarily follows that h(z) ≤ d(x,z) + h(x) ≤ h(x). And thus, g(z) + h(z) < g(x) + h(x), which contradicts the result that f(z) ≥ f(x).

Therefore, at every explored node, L(x) = g(x), including g(end) = L(end), from where one easily deduces that the shortest path can be extracted.

That proof is way too long and complicated.

Suppose that A* (the * implies an admissible heuristic h(x)) returns a non-shortest path P1. That implies it still exist in the open list a node n2 leading to the shortest path P2. Because the heuristic is admissible, the g(x) + h(x) of n2 is smaller or equal than the g(x) of the shortest path P1, and thus also smaller than the g(x) of the returned path P2, and thus would have been returned instead of P2. So A* cannot return a non-shortest path. CQFD.

Ok I suck with formal syntax but thats the gist of it...

##### Share on other sites
Quote:
 Original post by SteadtlerBecause the heuristic is admissible, the g(x) + h(x) of n2 is smaller or equal than the g(x) of the shortest path P1

Nope. The fact that the heuristic is admissible merely ensures that h(x) ≤ d(x,end), and in itself has nothing to do with g(x). Your proof relies on the special property that g(x) = L(x) for all explored nodes, which remains unproven (and is the main reason why my proof is so long)—besides, if you've actually managed to prove that g(x) = L(x) for all nodes, then you can QED right away without need for your proof.

##### Share on other sites
Quote:
Original post by ToohrVyk
Quote:
 Original post by SteadtlerBecause the heuristic is admissible, the g(x) + h(x) of n2 is smaller or equal than the g(x) of the shortest path P1

Nope. The fact that the heuristic is admissible merely ensures that h(x) ≤ d(x,end), and in itself has nothing to do with g(x). Your proof relies on the special property that g(x) = L(x) for all explored nodes, which remains unproven (and is the main reason why my proof is so long)—besides, if you've actually managed to prove that g(x) = L(x) for all nodes, then you can QED right away without need for your proof.

Let me rephrase: if A* returns a non-optimal path P1, there exist at least one node on the open list with g(x) = L(x) and h(x) <= d(x,end) and thus g(x) + h(x) < L(P1).

##### Share on other sites
Quote:
 Original post by Steadtlerthere exist at least one node on the open list with g(x) = L(x) and h(x) <= d(x,end)

Why?

• Perhaps the algorithm explores the destination node last (and so the open list is empty) but has failed somewhere during processing, making the path suboptimal.

• Even if the open list does contain nodes, how can you know for sure that the one on the optimal path satisfies g(x) = L(x) ? In fact, most of the nodes in the open list do not satisfy this property, otherwise Dijsktra would be an O(N) algorithm. This property is true, but it's not obvious and needs to be proven.

##### Share on other sites
Quote:
Original post by ToohrVyk
Quote:
 Original post by Steadtlerthere exist at least one node on the open list with g(x) = L(x) and h(x) <= d(x,end)

Why?

• Perhaps the algorithm explores the destination node last (and so the open list is empty) but has failed somewhere during processing, making the path suboptimal.

Well any proof for an algorithm will assume that the algorithm is correctly executed. If the open list is empty, then you explored all possible paths and thus the algorithm must have returned the optimal path by brute force.

Quote:
 Original post by ToohrVykEven if the open list does contain nodes, how can you know for sure that the one on the optimal path satisfies g(x) = L(x) ? In fact, most of the nodes in the open list do not satisfy this property, otherwise Dijsktra would be an O(N) algorithm. This property is true, but it's not obvious and needs to be proven.

Such a node exist on the open list because the algorithm returned a non-optimal path. Thus it must exist on the open list a node that you should have choosen to lead to the optimal path but did not. Such a node will have g(x) = L(x) (by definition, else it would be another, "earlier" node) and h(x) < d(x,end) (because the heuristic is admissible).

##### Share on other sites
Quote:
 Original post by SteadtlerWell any proof for an algorithm will assume that the algorithm is correctly executed. If the open list is empty, then you explored all possible paths and thus the algorithm must have returned the optimal path by brute force.

Your sentence contains two arguments: the fact that the algorithm is correctly executed and the fact that all nodes have been explored. Both these arguments apply to any search algorithm, not only A*. Therefore, if they were sufficient to prove that A* is correct, then they are sufficient to prove that, say, breadth-first search also finds the shortest path in a weighted graph. Consider this graph:

          B - C - D         /         \  start - A           F - G - end         \___ E ___/

Let's assume that the lengths guarantee AE + EF > AB + BC + CD + DF (as such, the shortest path is ABCDFG). A breadth-first search will explore node F before node D, and thus the path found by it will be AEFG, which is suboptimal. This happens even if breadth-first has explored all nodes, and even if it has executed correctly.

This is akin to saying, "Fish have eyes, therefore they can breathe underwater". Fish can breathe underwater, but this isn't because they have eyes—humans have eyes too, but I don't really enjoy lungfuls of water.

In essence, there is a specific property of the A* algorithm which lets it find the shortest path, and this is what you need to mention in any correctness proof.

Quote:
 Such a node will have g(x) = L(x) (by definition, else it would be another, "earlier" node)

Nonsense. Also, the presence of quotes around "earlier" hints at overzealous hand-waving. You're trying to shorten the proof as much as you can, but you're dropping out fundamental arguments.

You're trying to show that g(x) = L(x) for a given node in the open list. Given the way g(x) is computed, this implies that you expect the closest explored neighbor of x, which we'll call y, to verify g(y) = L(y), so that g(x) = g(y) + d(x,y) = L(y) + d(x,y), which you hope is equal to L(x) but haven't formally proven either (and you won't be able to prove it, since you haven't really defined 'x' yet beyond its being in the open list, and that you hope g(x) = L(x) is true). And if you wish to prove g(y) = L(y), then you need a lot more words than "by definition", since it has nothing to do about definitions and everything to do about proving an algorithm invariant.

To prove the correctness of A*, you need to argue the fact that at any point during execution, for any explored node y so far, g(y) = L(y). This is your algorithm's invariant, and without it you cannot assume that the value of g(x) makes sense—because if g(y) wasn't the intended value for the explored nodes, then A* will compute an unintended value of g(x) for open list nodes as well, which makes g(x) = L(x) an awfully difficult assumption to make.

To prove that g(x) = L(x) for any explored node, you need to reason on the shortest path from start to x, since this is what L(x) represents. However, since your entire reasoning works around the shortest path from start to destination, you simply cannot prove that g(x) = L(x), which is why I consider your proof insufficient.

##### Share on other sites
For those still wondering what the hell is going on and whether A* returns an optimal path always/sometimes/never...

A* is optimally efficient

That means that it guarantees to return the shortest cost path (if it exists) and in addition, guarantees to search no more nodes than any other algorithm on the same problem space.

In addition, A* guarantees to expand (search) every node within a given cost contour before expanding any node in a higher cost contour.

A* does this only if the following holds. For a cost-to-go function,
f(.) = g(.) + h(.)
being an additive sum of the expended cost, g(.), plus the estimated remaining cost, h(.), then for any two nodes x and y such that y is a child of x, A* is optimally efficient i.f.f.:

f(y) ≥ f(x) (so costs are monotonically increasing with search depth)
h(y) ≤ h(x) (estimated costs are monotonically decreasing with search depth)
h(y) ≤ h*(y) (estimated costs never over-estimate the actual future cost of any path through y)

For those that are interested the A* algorithm follows from Bellman's dynamic programming principle.

If you want a proof of these claims, see above... or I can post a shorter proof if needed.

Cheers,

Timkin

##### Share on other sites
Maybe we can explain the role of the admissible heuristic in a more intuitive way.
A search algorithm can only produce a suboptimal result if after finding some portion of the optimal path, up to node n, decides to expand other nodes rather than continuing with n's neighbours (where the optimal path continues).
A* doesn't risk doing this mistake, because when it "blesses" a result it can guarantee that continuing from every node that is still "open" would be at least as expensive as the heuristic indicates and more expensive than the actual cost of the result; rather than ending with the first complete path, search is supposed to continue with candidates that might turn out better than the first expanded.
With an inadmissible heuristic, discarded paths are merely likely to be worse than the selected result, and/or guaranteed not to be much better; extending them to the destination would be required to be sure of the solution's optimality.

##### Share on other sites
Quote:
Original post by ToohrVyk
Quote:
 Original post by SteadtlerWell any proof for an algorithm will assume that the algorithm is correctly executed. If the open list is empty, then you explored all possible paths and thus the algorithm must have returned the optimal path by brute force.

Your sentence contains two arguments: the fact that the algorithm is correctly executed and the fact that all nodes have been explored. Both these arguments apply to any search algorithm, not only A*. Therefore, if they were sufficient to prove that A* is correct, then they are sufficient to prove that, say, breadth-first search also finds the shortest path in a weighted graph. Consider this graph:

You're loosing yourself, and are taking this way too seriously. First, we were not proving that A* is correct, but proving that A* is admissible. Then no, its not sufficient to prove that A* is either complete nor admissible, but thats not what I was saying either.

Quote:
 Original post by ToohrVykYou're trying to show that g(x) = L(x) for a given node in the open list. .

No, *again*, thats not what I wrote. More reading, less typing.

## Create an account

Register a new account

• ## Partner Spotlight

• ### Forum Statistics

• Total Topics
627658
• Total Posts
2978474

• 10
• 12
• 22
• 13
• 33