How much is too much

19 comments, last by LorenzoGatti 4 months ago

JoeJ said:
Say you pick one random source vertex on the mesh, and 10 random target vertices. You start at the source, and keep Dijkstra running until all 10 targets have been found. No changes or extensions to the algorithm are needed, and all 10 paths will be the shortest possible as expected. I guess with A star this can't work, because the heuristic involves a target.

The heuristic can be anything that “optimistically estimates the remaining path”. “Optimistically” here means less than or equal to the actual remaining cost. Best performance is obtained when the estimate is exactly the remaining distance. If the estimate is smaller, performance goes down but you keep getting optimal paths. Estimates higher than the remaining cost can be faster still, at the cost of NOT always getting the optimal path.

So you could pick the constant 0 as the estimate, and it will work (it is less than or equal to any remaining distance). The constant 1 as suggested by frob is the same idea, except that it may lead to non-optimal paths if you have remaining distances smaller than 1.

The effect of picking a fixed value like 0 is that A* loses all steering towards a destination, and will spread out in all directions just like Dijkstra does. As frob already said, Dijkstra and A* with estimate 0 are equivalent. As for the stop condition, as long as you don't terminate the process, A* with estimate 0 will happily continue to expand the searched area until you've had enough or until it runs out of vertices.
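For reference, a minimal sketch of that idea (the Graph type, vertex ids and the zeroHeuristic function below are my own illustrative names, not code from this thread): the heuristic is passed in as a function, and plugging in one that always returns 0 makes the search expand vertices in exactly the order Dijkstra would.

#include <vector>
#include <queue>
#include <functional>
#include <limits>

struct Edge { int to; float cost; };
using Graph = std::vector<std::vector<Edge>>;        // adjacency list per vertex

// Returns the real cost g(n) for every reached vertex; dist[goal] is the answer.
std::vector<float> aStar(const Graph& g, int start, int goal,
                         const std::function<float(int)>& h)
{
    const float INF = std::numeric_limits<float>::infinity();
    std::vector<float> dist(g.size(), INF);          // best known real cost g(n)
    using Entry = std::pair<float, int>;             // (priority f = g + h, vertex)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;

    dist[start] = 0.0f;
    open.push({h(start), start});
    while (!open.empty()) {
        auto [f, v] = open.top();
        open.pop();
        if (v == goal) break;                        // target reached
        if (f > dist[v] + h(v)) continue;            // stale queue entry, skip
        for (const Edge& e : g[v]) {
            float nd = dist[v] + e.cost;
            if (nd < dist[e.to]) {
                dist[e.to] = nd;                     // store only the real length
                open.push({nd + h(e.to), e.to});     // heuristic affects ordering only
            }
        }
    }
    return dist;
}

// With this heuristic the priority is just the real distance, i.e. Dijkstra.
float zeroHeuristic(int) { return 0.0f; }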

JoeJ said:
For games this would be useful if you want to find the closest target out of many options. But if that's 4 nearby targets for example, and each can be found by a straight path, running A star 4 times is likely still faster.

Pretty much all A* code on the Internet assumes a single target that you want to reach, so the extremely common estimate is the distance to that target. The algorithm itself however doesn't demand that; any estimate less than or equal to the real remaining path cost will do. So it's fine to write an estimator that computes the remaining cost for each of your 4 targets and returns the smallest value. The disadvantage is that the estimator gets more complicated, and so does deciding whether you have reached a target.
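A rough sketch of such an estimator (Vec2, euclidean and the target lists are illustrative assumptions, with straight-line distance as the per-target estimate):

#include <vector>
#include <algorithm>
#include <limits>
#include <cmath>

struct Vec2 { float x, y; };

float euclidean(const Vec2& a, const Vec2& b) {
    return std::hypot(a.x - b.x, a.y - b.y);         // never overestimates path cost
}

// The minimum over per-target under-estimates is itself an under-estimate,
// so the combined heuristic stays admissible.
float multiTargetHeuristic(const Vec2& node, const std::vector<Vec2>& targets) {
    float best = std::numeric_limits<float>::infinity();
    for (const Vec2& t : targets)
        best = std::min(best, euclidean(node, t));
    return best;
}

// The stop test also has to check against every target.
bool isGoal(int vertex, const std::vector<int>& targetIds) {
    return std::find(targetIds.begin(), targetIds.end(), vertex) != targetIds.end();
}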

You can however avoid those disadvantages if you modify the problem a bit. Say the initial starting vertex is S, and your four target vertices are D1 to D4 (I use Dx as shorthand for “one of D1, D2, D3, or D4”). Add a new vertex T, and edges with cost 0 from D1..D4 to T. Next, swap the direction of all edges but keep the costs as they were. Also swap the target vertex with the starting vertex, so the search starts at T and ends at S. This is a classic single-target A* search. The first step in the path moves to one of the D1..D4 vertices, and from there it gives the shortest path to S. This is optimal, since the edge from T to any Dx vertex doesn't cost anything.

Also, while you swapped the direction of the edges, you kept the original costs. Those costs are thus still valid when you follow the same path from S to Dx in the unmodified graph.
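A small sketch of that construction (Edge/Graph and the function name are illustrative): reverse every edge while keeping its cost, then append a virtual vertex T with cost-0 edges to each target.

#include <vector>

struct Edge { int to; float cost; };
using Graph = std::vector<std::vector<Edge>>;        // adjacency list per vertex

Graph buildReversedWithSuperTarget(const Graph& g, const std::vector<int>& targets)
{
    const int n = static_cast<int>(g.size());
    Graph rev(n + 1);                                // one extra slot for T
    for (int v = 0; v < n; ++v)
        for (const Edge& e : g[v])
            rev[e.to].push_back({v, e.cost});        // swap direction, keep cost
    const int T = n;
    for (int d : targets)
        rev[T].push_back({d, 0.0f});                 // free edges T -> Dx
    return rev;                                      // search runs from T toward S
}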

A few practical notes: you don't need to literally swap the direction of the edges in the graph. If you have a way to find the incoming edges at a vertex, you can simply follow incoming edges rather than outgoing edges in the implementation. Also, if you look closely at the first iteration of the A* algorithm, you'll find that it pops the T node from the queue and adds all Dx nodes. You can instead write code that does exactly that, and then start the A* algorithm. The search will be one iteration shorter and starts directly from the Dx vertices.
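A sketch of that shortcut, again with illustrative names: instead of materialising T, seed the open set with every Dx at real cost 0 and then run the ordinary single-target A* loop toward S over the incoming edges.

#include <vector>
#include <queue>
#include <limits>
#include <functional>

struct Edge { int to; float cost; };
// Adjacency lists holding the INCOMING edges of each vertex; walking them is
// the same as literally reversing every edge of the original graph.
using ReverseGraph = std::vector<std::vector<Edge>>;
using Entry = std::pair<float, int>;                 // (priority f, vertex)
using OpenSet = std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>>;

// Set up the search as if the virtual vertex T had already been expanded:
// every target Dx enters the open set with real cost 0 (the T->Dx edges are
// free). The ordinary single-target A* loop toward S then takes over.
void seedMultiTarget(const ReverseGraph& g, const std::vector<int>& targets,
                     const std::function<float(int)>& estimateToS,
                     std::vector<float>& dist, OpenSet& open)
{
    dist.assign(g.size(), std::numeric_limits<float>::infinity());
    for (int d : targets) {
        dist[d] = 0.0f;                              // real cost so far is zero
        open.push({estimateToS(d), d});              // priority = 0 + heuristic
    }
}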


Alberth said:
So you could pick the constant 0 as the estimate, and it will work (it is less than or equal to any remaining distance). The constant 1 as suggested by frob is the same idea, except that it may lead to non-optimal paths if you have remaining distances smaller than 1.

Because the value is added to every evaluated node, it doesn't matter what the constant is if you want A* to degrade. Big or small, positive or negative, if it is constant ultimately the value vanishes as weights are compared against each other. Any constant works; it will become a memory-heavy version of Dijkstra's pairwise search.
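In the usual A* notation, with real cost g and a constant heuristic h(n) = c, the priority of any frontier node a is f(a) = g(a) + c. Comparing two frontier nodes a and b:

f(a) - f(b) = (g(a) + c) - (g(b) + c) = g(a) - g(b)

so the queue orders nodes purely by their real distance g, which is exactly Dijkstra's expansion order, regardless of the value of c.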

frob said:
Big or small, positive or negative, if it is constant ultimately the value vanishes as weights are compared against each other.

Ah, this is where I went wrong. You apply the same value to all paths so it doesn't matter what value you use as long as it's always the same. Thanks!

Alberth said:
You can however avoid those disadvantages if you modify the problem a bit. Say the initial starting vertex is S, and your four target vertices are D1 to D4 […]

Best read I've had so far about A star. I did not consider that you can do such interesting tricks with A star too. : )

I'll update my claim to ‘Dijkstra is still efficient for a dense set of multiple targets, and ideal if you want to fill the whole graph with a tree of shortest paths from the source.’

Though, I wonder what's the actual performance win if we use your proposed algorithm for 4 targets, vs. running the standard single-target algorithm 4 times. I assume they both would do almost the same amount of work?

Calin said:
The problem of using a timer is that you get different results on different computers. All I’m saying is that it’s good to stay alert when you’re repeating an operation tens of thousands of times at one time.

Notice that this problem of varying HW power is much smaller than the problem of making assumptions about performance.

Calculating time complexity is useful, and not just an assumption. It shows us a curve relating costs to N, and you will see roughly the same curve also in your actual measurements of the algorithm in practice. You can rely on that, and you can often use time / memory complexity to pick one algorithm over the other before implementing and testing both of them.

But knowing how the curve looks does not tell you anything about the actual runtime in practice. You need to do measurements to find an approximate factor relating the cost from the curve to the expected real-world runtime on actual hardware.
But such an attempt only works for a certain range of N. If we increase N, hardware details such as finite cache size can cause a sudden big change of our factor at a certain N. So the real measured curve always depends on HW and is hard to predict.
If you want to know real-world performance costs, there is no way around measurements, and using your own computer is the only thing you can do. Knowledge from other, possibly more experienced programmers is not better than your measurements and can't replace the need for them.
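For completeness, a minimal timing sketch (findPath is a hypothetical stand-in for whatever call you want to measure); averaging many runs on the actual target hardware is the whole point:

#include <chrono>
#include <cstdio>

void findPath();                                     // function under test, defined elsewhere

void benchmark()
{
    using clock = std::chrono::steady_clock;
    const int runs = 1000;                           // average out noise and clock resolution
    const auto t0 = clock::now();
    for (int i = 0; i < runs; ++i)
        findPath();
    const auto t1 = clock::now();
    const double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    std::printf("avg per call: %.4f ms\n", ms / runs);
}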

frob said:
Any constant works; it will become a memory-heavy version of Dijkstra's pairwise search.

A star needs more memory than Dijkstra? Why? Is there something like two costs to store, possibly only one involving the heuristic?

This might help me figure out what was wrong with my quick attempt to upgrade Dijkstra to A star. I added the heuristic, but the memory requirements stayed the same. And the found paths were not the shortest.

Edit:

Yes, looking here I see A star adds the heuristic only to the priority used for the queue, but it's ignored for the path length stored per node.
This is what I got wrong and what caused all my confusion. I should be able to make it work now… : )
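In code, the relax step would look roughly like this (names are illustrative): the real length g is what gets stored per node, and the heuristic is only mixed into the priority f pushed onto the queue.

#include <vector>
#include <queue>
#include <functional>

struct Edge { int to; float cost; };
using Entry = std::pair<float, int>;                 // (f = g + h, vertex)
using OpenSet = std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>>;

void relax(int current, const Edge& e, const std::function<float(int)>& h,
           std::vector<float>& dist, std::vector<int>& parent, OpenSet& open)
{
    float g = dist[current] + e.cost;                // real path length, no heuristic
    if (g < dist[e.to]) {
        dist[e.to] = g;                              // store ONLY the real length
        parent[e.to] = current;
        open.push({g + h(e.to), e.to});              // heuristic affects ordering only
    }
}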

JoeJ said:
Though, I wonder what's the actual performance win if we use your proposed algorithm for 4 targets, vs. running the standard single-target algorithm 4 times. I assume they both would do almost the same amount of work?

Depending on how competitive the choices are and how much explored space they share, somewhere between the same effort and the effort of a single shortest-path search.

Assuming the graph with the swapped edges. After one iteration you have reached D1, D2, D3, and D4. They all have 0 length traveled (since the cost from the initial starting point to the Dx points is 0). They all get a heuristic value added to order them in the queue.

One of the points with the smallest heuristic gets picked and its path P gets expanded. Depending on the heuristic, the total cost (path length + heuristic) stays the same or grows a little bit. To see this, consider the path T → Dx → … → C → N → … → S, the path from the initial starting point T to the final destination S. We're at C and decided to take the edge to N. The total cost at C is TC_R (real cost from T to C) + CN_H (heuristic cost of the edge CN) + NS_H (heuristic cost from N to S). At N we have TC_R + CN_R (real cost of the edge CN) + NS_H. The effective difference is CN_R - CN_H. If the heuristic cost is equal to the real cost of the edge, the total costs are the same. If the heuristic is a proper under-estimate (CN_H < CN_R), the total cost rises by CN_R - CN_H.

As long as the other 3 Dx-es have a sufficiently higher estimated cost w.r.t. the updated cost of P, A* has no reason to start exploring from another Dx, and thus continues to expand the same path P again (and again, and …). Thus with one of the Dx having a sufficiently lower estimate (i.e. it's a sufficiently better choice), the other 3 targets are never expanded (their entries do of course still get compared every iteration when the queue picks the next node).

If all Dx are at the same distance from S then sure, all paths get expanded since they are basically equally good. Do note however that in that case the explored positions are shared. So if paths from different Dx end up at the same intermediate point, one of them will die: the second one arriving at the point will find it already explored, and if it doesn't bring an improvement, A* doesn't make a new entry in that case.

Yeah, makes sense. But I meant we need all 4 completed paths to all 4 goals, no matter which is the closest. (Thinking again about geometry processing, where that's often the case.)

Ah ok. Not sure if you can do that at all with A*; at least I wouldn't know a solution for that just now.

If you start from S, you may be able to re-use the real lengths of already explored vertices of the previous search, so you only need to re-compute the heuristic value and build a new priority queue. Whether this is worth the effort, no idea; probably it varies a lot between cases.

There is also the Floyd Warshall algorithm for computing the distance between all pairs of nodes in a graph: https://stackoverflow.com/questions/40430571/is-there-an-algorithm-to-find-distance-of-every-node-from-all-other-nodes#40430661 Never looked at it though.
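For reference, the core of Floyd-Warshall is just a triple loop over a distance matrix (a sketch, not from the thread); it gives all-pairs distances in O(V^3), so it only pays off for fairly small graphs:

#include <vector>
#include <algorithm>

using Matrix = std::vector<std::vector<float>>;      // dist[i][j] = edge cost, infinity if none

Matrix floydWarshall(Matrix dist)
{
    const int n = static_cast<int>(dist.size());
    for (int i = 0; i < n; ++i)
        dist[i][i] = 0.0f;                           // distance of a vertex to itself
    for (int k = 0; k < n; ++k)                      // allow k as an intermediate vertex
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                dist[i][j] = std::min(dist[i][j], dist[i][k] + dist[k][j]);
    return dist;
}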

Alberth said:

If you start from S, you may be able to re-use the real lengths of already explored vertices of the previous search, so you only need to re-compute the heuristic value and build a new priority queue. Whether this is worth the effort, no idea; probably it varies a lot between cases.

There is also the Floyd Warshall algorithm for computing the distance between all pairs of nodes in a graph: https://stackoverflow.com/questions/40430571/is-there-an-algorithm-to-find-distance-of-every-node-from-all-other-nodes#40430661 Never looked at it though.

The form ate my comment.


This topic is closed to new replies.
