
# [C++] Vector faster than priority-queue in A*

18 replies to this topic

### #1gnomgrol  Members   -  Reputation: 698


Posted 11 March 2014 - 04:21 PM

Well met!

I'm still going on with optimising my A* pathing. Everything is working fine now; the only thing I need to do is make it faster.

After a lot of reading and testing, I found the only bottleneck was the usage of a std::vector for the openlist, making the retrieval of the lowest-cost node expensive. So I went ahead and replaced it with a priority-queue, being sure that everything would be fine afterwards.

I was fairly surprised when I noticed that the speed had not changed by much. So I tried profiling again, and found that the bottleneck has shifted to one line:

```cpp
openList.push(*map[x][y]);
```

My priority-queue is defined as

```cpp
priority_queue<node> openList;
```

and the operator

```cpp
bool operator>(const node i) const {
    return costF > i.costF;
}
```

I can't really see why the push exactly could be the problem, other than the operator being bad. I also tried pushing not the nodes themselves but a struct holding only the important info, but that did nothing.

Note that I also push nodes again when their F-cost changes, because priority-queues are hard to reorder (and reordering is O(n)). When popping the top, I ignore it if it isn't actually on the openlist (a flag in the node).
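That push-again-and-skip-stale scheme (sometimes called lazy deletion) can be sketched like this. `Entry`, `bestF`, and `popBest` are hypothetical names; a flat array stands in for the map's current F-costs:

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

struct Entry {
    int x, y;
    int costF;  // F-cost at the time this entry was pushed
    bool operator>(const Entry& o) const { return costF > o.costF; }
};

// min-heap: the entry with the smallest costF sits on top
using OpenList = std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>>;

// stands in for map[x][y]->costF, the node's *current* best F-cost
int bestF[2][2] = {{10, 20}, {30, 40}};

// Pop entries until one matches the node's current cost; stale
// duplicates (pushed before a later improvement) are discarded.
bool popBest(OpenList& open, Entry& out) {
    while (!open.empty()) {
        Entry e = open.top();
        open.pop();
        if (e.costF == bestF[e.x][e.y]) { out = e; return true; }
    }
    return false;
}
```

The trade-off: duplicates inflate the heap, but every operation stays O(log N) and no in-heap reordering is ever needed.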

Edited by gnomgrol, 11 March 2014 - 04:30 PM.

### #2ApochPiQ  Moderators   -  Reputation: 20439


Posted 11 March 2014 - 04:56 PM

Any particular reason your operator > doesn't take the parameter as a const reference (versus a const value, as posted)?

Wielder of the Sacred Wands

### #3KnolanCross  Members   -  Reputation: 1856


Posted 11 March 2014 - 05:39 PM

I'm no C++ pro, but I believe that not using a reference in your operator causes your program to copy the node.

Also, have you tried using a vector with make_heap from algorithm?
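For reference, the `<algorithm>` heap helpers look roughly like this; a sketch over plain ints, using std::greater so the *smallest* element sits at the front:

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Keep a std::vector ordered as a min-heap by hand with make_heap,
// push_heap and pop_heap from <algorithm>.
int demoMinHeap() {
    std::vector<int> open = {7, 3, 9};
    std::make_heap(open.begin(), open.end(), std::greater<int>());

    open.push_back(1);  // insert a new cost, then restore the heap
    std::push_heap(open.begin(), open.end(), std::greater<int>());

    std::pop_heap(open.begin(), open.end(), std::greater<int>());
    int best = open.back();  // smallest cost, moved to the back
    open.pop_back();
    return best;
}
```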

Currently working on a scene editor for ORX (http://orx-project.org), using kivy (http://kivy.org).

### #4TiagoCosta  Crossbones+   -  Reputation: 3494


Posted 11 March 2014 - 06:13 PM

Storing the nodes in a separate array and putting only pointers or indices (to the objects stored in the array) in the priority_queue might improve performance, since pushing values into the priority_queue requires moving them around, and copying pointers is faster than copying the whole structure. Prefer indices if the array might be reallocated.

Edited by TiagoCosta, 11 March 2014 - 06:15 PM.

### #5Bregma  Crossbones+   -  Reputation: 7826


Posted 11 March 2014 - 07:14 PM

> Also, have you tried using a vector with make_heap from algorithm?

Did you know that's how std::priority_queue is implemented?

Stephen M. Webb
Professional Free Software Developer

### #6KnolanCross  Members   -  Reputation: 1856


Posted 12 March 2014 - 10:50 AM

> > Also, have you tried using a vector with make_heap from algorithm?
>
> Did you know that's how std::priority_queue is implemented?

He said:

"I found the only bottleneck was the usage of a std::vector for the openlist, making the retrieval of the lowest-cost node expensive. So I went ahead and replaced it with a priority-queue, being sure that everything would be fine afterwards."

What I meant was that he could try using make_heap instead of replacing the vector with a priority_queue (probably I wasn't clear enough, though).

Currently working on a scene editor for ORX (http://orx-project.org), using kivy (http://kivy.org).

### #7lunkhound  Members   -  Reputation: 1106


Posted 12 March 2014 - 06:35 PM

I did a similar comparison a few years back, only I wasn't using any STL, my code was running on Cell SPUs.  What I found was that keeping the open nodes in plain old unsorted arrays (and optimizing the search for the lowest cost node using SIMD intrinsics) was faster than using a priority queue (without SIMD optimization).

It depends on how many open nodes you tend to have and how frequently the cost of nodes needs to be updated.  In my case I found that node costs were updated much more frequently than removing a node from the open list, so the overhead of maintaining the order in the priority queue was outweighing the benefit of having it.  I seem to remember having around ~200 nodes on the open list on average.  Those SPUs were really good at brute force.

In most cases though I would expect the priority queue to win out.

### #8ferrous  Members   -  Reputation: 5075


Posted 12 March 2014 - 07:10 PM

> It depends on how many open nodes you tend to have and how frequently the cost of nodes needs to be updated.  In my case I found that node costs were updated much more frequently than removing a node from the open list, so the overhead of maintaining the order in the priority queue was outweighing the benefit of having it.  I seem to remember having around ~200 nodes on the open list on average.  Those SPUs were really good at brute force.
>
> In most cases though I would expect the priority queue to win out.

I'm curious, but why were you updating the cost of nodes so often? And as a workaround, did you try just adding the node again with a different cost?

### #9lunkhound  Members   -  Reputation: 1106


Posted 13 March 2014 - 12:31 AM

> > It depends on how many open nodes you tend to have and how frequently the cost of nodes needs to be updated.  In my case I found that node costs were updated much more frequently than removing a node from the open list, so the overhead of maintaining the order in the priority queue was outweighing the benefit of having it.  I seem to remember having around ~200 nodes on the open list on average.  Those SPUs were really good at brute force.
> >
> > In most cases though I would expect the priority queue to win out.
>
> I'm curious, but why were you updating the cost of nodes so often? And as a workaround, did you try just adding the node again with a different cost?

Well, as I said, this is from a few years ago, so I can't recall all of the details. It may simply be that node costs were updated much more often than I had expected, and not more often than node removals. With the priority queue, it needs to be reordered any time a node is added, removed, or has its cost updated, so all three of those operations are O(log N). With a simple unsorted array, adding a node or updating a cost is O(1), and removing the best node is O(N).
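That unsorted-array variant can be sketched as follows (no SIMD; `Node` and `popCheapest` are hypothetical names). Updating a cost is just a write to the node, and only extract-min pays the linear scan:

```cpp
#include <algorithm>
#include <vector>

struct Node { int costF; };

// Unsorted open list of pointers: updating a node's cost is a plain
// write (O(1), no reordering); extracting the cheapest node is O(N).
std::vector<Node*> open;

Node* popCheapest() {
    auto it = std::min_element(open.begin(), open.end(),
        [](const Node* a, const Node* b) { return a->costF < b->costF; });
    if (it == open.end()) return nullptr;
    Node* best = *it;
    *it = open.back();  // swap-and-pop removal; order doesn't matter
    open.pop_back();
    return best;
}
```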

### #10gnomgrol  Members   -  Reputation: 698


Posted 13 March 2014 - 04:20 AM

When I saw ApochPiQ's answer, I hit my head against a wall. Of course that was the problem!
... At least I thought so. For some reason, when the operator is defined like this

```cpp
bool node::operator<(const node *right) const {
    return (costF < right->costF);
}
```

it is being completely ignored by the priority queue. Deleting it doesn't even give me an "operator> not defined" error.

So, I switched back to using

```cpp
struct nodeCostCompare {
    bool operator()(const node *a, const node *b) const {
        return a->costF > b->costF;
    }
};

priority_queue<node*, vector<node*>, nodeCostCompare> openList;
```

which is actually being called.

The problem is that this is not faster than my old code. I guess it is because, every time I push, a new nodeCostCompare is created, right?
So I have to find a way of using the overloaded operator>, but as I stated, I can't figure out why it isn't being called when comparing by reference instead of by value (and what the hell it is using instead).

Storing the nodes as pointers leaves me with the problem that I can't update their values after I've pushed them into the queue. Before, I bypassed that by just pushing the node again and, when popping, checking whether it was still on the openlist.
But I can't do that with pointers - as soon as I try to update a value on a node that I pushed onto the openlist, I end up with the "invalid heap" runtime error.

For the moment, I just took out the code that was updating nodes - now the path is slightly incorrect.

If you have some more thoughts on this, I'd highly appreciate hearing them!

EDIT: Okay, I figured out the problem with this one: the pointer addresses themselves are being compared and sorted. The only way to avoid that is a comparator class that dereferences the pointers.
So this is no option either.

Edited by gnomgrol, 13 March 2014 - 05:35 AM.

### #11Mona2000  Members   -  Reputation: 1777


Posted 13 March 2014 - 05:02 AM

... Atleast I thought so. For some reason, when the operator is defined like this

bool node::operator<(const node *right) const{
return (costF < right->costF);
}

it is beeing completly ignored by the priority queue. Deleting it doesn't give me ">operator not defined".

`const node *` is a pointer, not a reference. `const node &` is a reference.
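With a const-reference parameter, the queue's default std::less<node> comparator does find the overload. A minimal sketch, reversing the comparison on purpose so the *lowest* F-cost lands on top (std::priority_queue is a max-heap by default):

```cpp
#include <queue>

struct node {
    int costF;
    // const reference, not const pointer: std::less<node> (the
    // priority_queue's default comparator) calls exactly this overload
    bool operator<(const node& right) const {
        // reversed on purpose so top() is the lowest F-cost
        return costF > right.costF;
    }
};

int lowestCost() {
    std::priority_queue<node> open;
    open.push({3});
    open.push({1});
    open.push({2});
    return open.top().costF;
}
```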

### #12gnomgrol  Members   -  Reputation: 698


Posted 13 March 2014 - 05:11 AM

You are right ... I shouldn't try coding this late at night.

But changing it didn't increase the speed of openList.push() at all.

So, we are back where we began.

EDIT: I also tried what TiagoCosta mentioned, but it didn't increase the speed by a noticeable amount.

Edited by gnomgrol, 13 March 2014 - 06:34 AM.

### #13KnolanCross  Members   -  Reputation: 1856


Posted 13 March 2014 - 11:34 AM

Maybe your testing method is flawed.

How fast/slow is it? Have you compared with other implementations? What are you using to represent the graph (a grid, a navigation mesh, something else)?
Have you tried with several sizes of maps?

Edited by KnolanCross, 13 March 2014 - 11:43 AM.

Currently working on a scene editor for ORX (http://orx-project.org), using kivy (http://kivy.org).

### #14ankhd  Members   -  Reputation: 2224


Posted 13 March 2014 - 06:51 PM

Hello.

Have you tried a std::list for your priority queue? That way you can insert at the position of the lowest cost. It works fast for me; my maps are grids of 1000 x 1000 plus cells.

```cpp
std::list<Cell *> Priorityqueue; // ordered list
```

### #15gnomgrol  Members   -  Reputation: 698


Posted 14 March 2014 - 11:21 AM

For testing, I use `clock()`. Not the best, I know, but it does the job and doesn't seem to have a big negative impact on the performance.
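As an aside, std::chrono gives a monotonic, higher-resolution alternative to clock(); a small hypothetical helper:

```cpp
#include <chrono>

// Time a callable with std::chrono::steady_clock. Unlike clock(),
// steady_clock is monotonic and offers sub-millisecond resolution.
template <class F>
long long measureMicros(F&& work) {
    auto t0 = std::chrono::steady_clock::now();
    work();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
}
```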

I did compare with other implementations, but they were all much slower than mine. The graph is represented by a 2D array of nodes.

I went through several map sizes; currently 256x256 is the only one that performs at close to an acceptable speed, and even that is still somewhat too slow when a player wants to path over the full map in one go (2-3 sec). On short- or mid-range paths it seems OK though.

I'm not exactly sure how NPC movement is handled in online games. They have to path too, so the server would have to compute their paths - and rather frequently. That would really impact the server's performance, right?

@ankhd

I don't really understand what you're trying to say. Do you mean a std::list is faster than a priority-queue in terms of pushing? How fast can you path over a 1000x1000 grid from edge to edge?

### #16tobiasjs  Members   -  Reputation: 148


Posted 14 March 2014 - 04:07 PM

Vectors have more cache-coherent access patterns than heaps. Also, if your heap contains objects rather than pointers, then swaps are expensive operations. Unless your nodes are simple, your heap should definitely consist of pointers to nodes. With a bit of magic you could also wrap the pointers so that they update the nodes to keep track of the position of their pointer in the heap, making updates more efficient.

To properly understand whether a heap is justified you need to collect some statistics about the distribution of counts of open nodes at each exploration step.
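That position-tracking idea amounts to a binary heap with a decrease-key operation. A compact sketch under assumed names (`Node` and its `heapIndex` field are hypothetical; your node type will differ):

```cpp
#include <cstddef>
#include <vector>

struct Node { int costF; int heapIndex = -1; };

// Min-heap of Node pointers that writes each node's position back into
// node->heapIndex, so a cost decrease can be repaired in O(log N).
struct IndexedHeap {
    std::vector<Node*> h;

    void place(Node* n, std::size_t i) { h[i] = n; n->heapIndex = (int)i; }

    void siftUp(std::size_t i) {
        Node* n = h[i];
        while (i > 0) {
            std::size_t p = (i - 1) / 2;          // parent index
            if (h[p]->costF <= n->costF) break;
            place(h[p], i);
            i = p;
        }
        place(n, i);
    }

    void siftDown(std::size_t i) {
        Node* n = h[i];
        for (;;) {
            std::size_t c = 2 * i + 1;            // left child
            if (c >= h.size()) break;
            if (c + 1 < h.size() && h[c + 1]->costF < h[c]->costF) ++c;
            if (n->costF <= h[c]->costF) break;
            place(h[c], i);
            i = c;
        }
        place(n, i);
    }

    void push(Node* n) { h.push_back(n); siftUp(h.size() - 1); }

    Node* popMin() {
        if (h.empty()) return nullptr;
        Node* best = h[0];
        best->heapIndex = -1;
        Node* last = h.back();
        h.pop_back();
        if (!h.empty()) { place(last, 0); siftDown(0); }
        return best;
    }

    // Call after *lowering* n->costF; heapIndex says where n sits.
    void decreaseKey(Node* n) { siftUp((std::size_t)n->heapIndex); }
};
```

This avoids both duplicate pushes and the "invalid heap" problem, at the cost of maintaining the index bookkeeping yourself.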

### #17Pink Horror  Members   -  Reputation: 2447


Posted 16 March 2014 - 06:34 PM

> But I can't do that with pointers - as soon as I try to update a value on a node that I pushed onto the openlist, I end up with the "invalid heap" runtime error.

I'm sorry if this is a stupid question, but are you measuring performance in a debug build? The push function is going to be much more expensive in a build with the valid heap check than without it.

Also, how do you find the node you want to update in the priority queue? Are you still just pushing it twice?

Unfortunately, I wouldn't use std::priority_queue because it isn't really built for updating anything in it. However, you should be able to program your own heap without much work with help from the STL.

Also, how are you storing your nodes? What does that class/struct look like? I see it might be the same thing as in your map. You might want to streamline it as much as possible. I see you said you've tried that, but it's hard to know exactly what you tried and if there were any flaws based on this limited amount of information.

Do you have the ability to use a profiler instead of clock()?

### #18gnomgrol  Members   -  Reputation: 698


Posted 21 March 2014 - 03:16 AM

Hello, sorry for my late reply.

I was profiling in a debug build; I didn't think of that being able to influence the performance. Switching improved the speed by about 25%.

I never used a profiler before, but if you are able to point me to a good one, I'd be willing to give it a try.

I'm still pushing twice instead of updating the nodes. I take it you suggest updating instead?

I'm storing the nodes in a simple 2D array:

```cpp
node *map[size][size];
```

Here is the .h of the node class:

```cpp
class node {
public:
    node *parent;
    int2 pos;
    D3DXVECTOR2 posInNode;
    bool isOpen;

    // not using the closedlist; isOnOpenList lets me skip searching
    // through the openlist when I want to see if a node is on it
    bool isOnClosedList, isOnOpenList;
    int costF, costG, costH;

    bool pushedOntoQueue;

public:
    node() {
        resetData();
        isOpen = true;
        pos = int2(0, 0);
    }
    node(int2 posIn) {
        pos = posIn;
        isOpen = true;
        resetData();
    }
    ~node() {}

    // reset for reusing
    void resetData() {
        posInNode = D3DXVECTOR2(-1, -1);
        parent = NULL;
        costF = costG = costH = 0;
        isOnClosedList = false;
        isOnOpenList = false;

        pushedOntoQueue = false;
    }
};
```



This is what I use in the priority queue instead of the full node:

```cpp
struct nodePH {
    int2 pos;
    int costF;
    bool operator<(const nodePH &right) const {
        return (costF > right.costF); // reversed so the lowest F-cost ends up on top
    }
};
```


### #19Waterlimon  Crossbones+   -  Reputation: 4318


Posted 21 March 2014 - 07:30 AM

I use Very Sleepy for my profiling. It essentially samples which function you are in (via the stack) many times a second to get an approximation of where you spend the most time. You then get percentages and times for the most visited code.

o3o
