My a* pathfinding is very slow... (c++)

Started by
28 comments, last by ApochPiQ 7 years ago

VERY slow (unusable) right now.

Profiling it I get:

20x20 (400 nodes) = less than sec
30x30 (900 nodes) = 2s
40x40 (1600 nodes) = 5s
50x50 (2500 nodes) = 20s
60x60 (3600 nodes) = 40s

Can someone see what I did wrong? I've tried to follow the pseudo-code of what I find on the internet. I use simple Manhattan distance for H. Maybe I use priority queue wrongly?

This is the path-find function as it looks now:


int SimplePather::FindPath(const int nStartX, const int nStartY, const int nTargetX, const int nTargetY,
                         const unsigned char* pMap, const int nMapWidth, const int nMapHeight, int* pOutBuffer, const int nOutBufferSize)
{
    static std::priority_queue<simplePatherNode> pq[2]; // list of open nodes. Open nodes are not yet tried
    static int pqi; // pq index (0 or 1)
    simplePatherNode n2(0,0,0,0); // used when checking the neighbors of n
    static int i, j, x, y, x2, y2;
    pqi=0;
    int pathLength=0;

    int mapSize = nMapWidth * nMapHeight;
    std::vector<int> closedMap(mapSize, 0); // closed nodes have been checked
    std::vector<int> openMap(mapSize, 0); // open nodes have not been checked
    std::vector<int> dirMap(mapSize);

        // create the start node and push into map of open nodes
        simplePatherNode n(nStartX, nStartY, 0, 0);
        n.setF(nTargetX, nTargetY);
        pq[pqi].push(n);
        openMap[nStartX+nStartY*nMapWidth]=n.getF(); // mark it on the open nodes map

        // Main path-find loop
        while(!pq[pqi].empty())
        {
            // get the current node w/ the highest priority
            // from the list of open nodes
            n.setAll(pq[pqi].top().getxPos(), pq[pqi].top().getyPos(), pq[pqi].top().getG(), pq[pqi].top().getF());

            x=n.getxPos();
            y=n.getyPos();

            pq[pqi].pop(); // remove the node from the open list
            openMap[x+y*nMapWidth]=0;
            // mark it on the closed nodes map
            closedMap[x+y*nMapWidth]=1;

            // the goal state is reached = finish searching
            if(x==nTargetX && y==nTargetY)
            {
                //hold the temporary node indexes
                std::vector<int> tempNodes(nOutBufferSize);

                // generate the path from target to start
                // by following the directions back to the start
                while(!(x==nStartX&&y==nStartY))
                {
                    // not enough buffer to write our path = fail path-finding
                    if(pathLength>=nOutBufferSize)
                        return -1;

                    j=dirMap[x+y*nMapWidth];
                    tempNodes[pathLength]=x+y*nMapWidth;
                    pathLength++;

                    x+=dx[j];
                    y+=dy[j];
                }

                //invert the buffer since we want to return a buffer with nodes from start to target
                for(int i=0;i<pathLength;i++){
                    pOutBuffer[i]=tempNodes[pathLength-1-i];
                }

                // garbage collection
                for(int i=0;i<2;i++)
                while(!pq[i].empty())
                    pq[i].pop();

                return pathLength;
            }

            // If we are not at the target location, check neighbors in all possible directions
            for(i=0;i<directions;i++)
            {
                x2=x+dx[i]; y2=y+dy[i];

                //check if node is outside map-scope, impassable "terrain type" or already checked
                if(!(x2<0 || x2>nMapWidth-1 || y2<0 || y2>nMapHeight-1 || pMap[x2+y2*nMapWidth]==IMPASSABLE_NODE_ID    || closedMap[x2+y2*nMapWidth]==1))
                {
                    // generate a neighbor node
                    n2.setAll( x2, y2, n.getG(), n.getF());
                    n2.increaseG(i, diagonalAllowed);
                    n2.setF(nTargetX, nTargetY);

                    // if it is not in the open map then add into that
                    if(openMap[x2+y2*nMapWidth]==0)
                    {
                        openMap[x2+y2*nMapWidth]=n2.getF();
                        pq[pqi].push(n2);
                        // mark its parent node direction
                        dirMap[x2+y2*nMapWidth]=(i+directions/2)%directions;
                    }
                    else if(openMap[x2+y2*nMapWidth]>n2.getF())
                    {
                        // update the priority info
                        openMap[x2+y2*nMapWidth]=n2.getF();
                        // update the parent direction info
                        dirMap[x2+y2*nMapWidth]=(i+directions/2)%directions;

                        // replace the node
                        // by emptying one pq to the other one
                        // except the node to be replaced will be ignored
                        // and the new node will be pushed in instead
                        while(!(pq[pqi].top().getxPos()==x2 &&
                            pq[pqi].top().getyPos()==y2))
                        {
                            pq[1-pqi].push(pq[pqi].top());
                            pq[pqi].pop();
                        }
                        pq[pqi].pop(); // remove the wanted node

                        // empty the larger size pq to the smaller one
                        if(pq[pqi].size()>pq[1-pqi].size())
                            pqi=1-pqi;
                        while(!pq[pqi].empty())
                        {
                            pq[1-pqi].push(pq[pqi].top());
                            pq[pqi].pop();
                        }
                        pqi=1-pqi;
                        pq[pqi].push(n2); // add the better node instead
                    }
                }
            }
        }
        // garbage collection
        for(int i=0;i<2;i++)
        while(!pq[i].empty())
            pq[i].pop();

        return -1; // no route was found (couldn't reach the target node), indicate this with -1
}
Advertisement

Have you tried running your code with a profiler attached to find any glaring hotspots? Are you compiling your code in release mode?

  • Why do you use two priority queues, and do complex maintenance to them? A queue backed by a std::vector grows automatically as much as your graph requires.
  • What is a simplePatherNode exactly, and how are they compared? Post code.
    I would expect it to contain a "primary key" (in this case, x and y coordinates are ok) and one path length, but it has at least four fields. What are F and G?
  • If you store, for each map cell, the cost of reaching it from the starting location (defaulting to infinitely high, and repeatedly lowered when search finds the first path to that node and then shorter paths) you don't need dirMap: when you reach the destination and finish the open queue, go from the far end of the path to the neighbour that is closest to the starting node, breaking ties as you please.

Omae Wa Mou Shindeiru

That first 'n.setF(nTargetX, nTargetY);' line looks wrong. That first node can't have an F value until you've calculated the G value (which is zero) and the H value (which is the estimate to the target). But, if setF is actually setting the H value and then calculating F, fine, but it's a misleadingly-named function.

The other culprit might be that you're sorting the priority queue wrongly. But we don't have that code here.

The obvious debugging step for you should have been to log each node that gets pulled off the open queue, so you can see if it's doing what you expect or not. Run it on a trivial 4x4 grid with logging and you should be able to draw it on graph paper and see whether your expectations are being met or not.

Not sure if this is related to the performance, but one thing I noticed is creation of nodes as local stack variables and adding them to the priority queue. The PQ stores references to objects, not copies of the object. So, if you insert multiple copies of n2 (containing different values at each insert) into a PQ, the PQ would just contain multiple references to the same n2 object (or multiple identical objects, from the PQ's view), which I suspect interferes with the behaviour of the PQ. Can you test this by creating node objects on the heap?

It should be fine for the priority queue to contain actual nodes (if they're small), and that should make for reasonably fast comparisons, but this code is very confusing in having an 'openMap' that is actually a vector that in turn actually wants to be a bitset, and in having other properties of the node stored outside the node (e.g. in dirmap). There's also no reason for 2 priority queues. Or for variables to be static.

Usually slow A* is due to poor data structure choice, but your data structures look good to me.

The only thing that seems like it might be problematic is your update-cost-and-fix-the-priority-queue code. How often is that being hit? In the A* pathfinder I wrote at work, I found that reprioritization after updating a node cost only occurs 0.2% of the time, so I didn't bother optimizing the operation to rebalance the binary heap. If it's happening all the time for you, then that could lead to a lot of performance degradation.

I would definitely run a profiler on it, though.

Some discussion about fixing the order in a priority_queue to change costs here: http://stackoverflow.com/questions/5810190/how-to-tell-a-stdpriority-queue-to-refresh-its-ordering

I also agree with lisphacker that the priority queue elements themselves need to be as small as possible to reduce memory access when the queue is modified. Ideally I try to keep just two values - the sort key and an index/pointer to the larger node struct/class. It looks like your priority queue is keeping a copy of X,Y,G, and F. All you should need are the F and a pointer/index to the X,Y,G (which can then remain stationary in memory).

I leave the actual cost in the priority queue itself so that modifying the queue is slightly more cache-friendly.

The last thing to note is that if you are running in debug mode, I've noticed that std's binary heap code does a LOT of extra checks which slow things down. I haven't used priority_queue, but it could be doing it as well. Definitely make sure you're testing with a fully optimized build.

The last thing to note is that if you are running in debug mode, I've noticed that std's binary heap code does a LOT of extra checks which slow things down. I haven't used priority_queue, but it could be doing it as well. Definitely make sure you're testing with a fully optimized build.

If you're not on Windows using the Microsoft compiler, this isn't a thing. If you are in that environment, it's a Very Serious thing (ie. gives you lots of wonderful protection at the cost of really slow and inefficient implementation in a totally non-standards-conforming way that might, for example, turn a guaranteed O(log n) operation into an O(n²) operation with a large constant).

Stephen M. Webb
Professional Free Software Developer

Seriously, stop using static. Especially around pathfinding. Even if it doesn't cause a bug it's already costing you performance and it means you can't safely thread the function.

The PQ stores references to objects, not copies of the object.


This is not correct. STL containers in general cannot contain references. std::priority_queue is not an exception to this.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

Even if it doesn't cause a bug it's already costing you performance

Why does static incur a performance hit?

Hello to all my stalkers.

This topic is closed to new replies.

Advertisement