• Announcements

    • khawk

      Download the Game Design and Indie Game Marketing Freebook   07/19/17

      GameDev.net and CRC Press have teamed up to bring a free ebook of content curated from top titles published by CRC Press. The freebook, Practices of Game Design & Indie Game Marketing, includes chapters from The Art of Game Design: A Book of Lenses, A Practical Guide to Indie Game Marketing, and An Architectural Approach to Level Design. The GameDev.net FreeBook is relevant to game designers, developers, and those interested in learning more about the challenges in game development. We know game development can be a tough discipline and business, so we picked several chapters from CRC Press titles that we thought would be of interest to you, the GameDev.net audience, in your journey to design, develop, and market your next game. The free ebook is available through CRC Press by clicking here. The Curated Books The Art of Game Design: A Book of Lenses, Second Edition, by Jesse Schell Presents 100+ sets of questions, or different lenses, for viewing a game’s design, encompassing diverse fields such as psychology, architecture, music, film, software engineering, theme park design, mathematics, anthropology, and more. Written by one of the world's top game designers, this book describes the deepest and most fundamental principles of game design, demonstrating how tactics used in board, card, and athletic games also work in video games. It provides practical instruction on creating world-class games that will be played again and again. View it here. A Practical Guide to Indie Game Marketing, by Joel Dreskin Marketing is an essential but too frequently overlooked or minimized component of the release plan for indie games. A Practical Guide to Indie Game Marketing provides you with the tools needed to build visibility and sell your indie games. With special focus on those developers with small budgets and limited staff and resources, this book is packed with tangible recommendations and techniques that you can put to use immediately. As a seasoned professional of the indie game arena, author Joel Dreskin gives you insight into practical, real-world experiences of marketing numerous successful games and also provides stories of the failures. View it here. An Architectural Approach to Level Design This is one of the first books to integrate architectural and spatial design theory with the field of level design. The book presents architectural techniques and theories for level designers to use in their own work. It connects architecture and level design in different ways that address the practical elements of how designers construct space and the experiential elements of how and why humans interact with this space. Throughout the text, readers learn skills for spatial layout, evoking emotion through gamespaces, and creating better levels through architectural theory. View it here. Learn more and download the ebook by clicking here. Did you know? GameDev.net and CRC Press also recently teamed up to bring GDNet+ Members up to a 20% discount on all CRC Press books. Learn more about this and other benefits here.
Sign in to follow this  
Followers 0
Anand Baumunk

[C++] Vector faster than priority-queue in A*

18 posts in this topic

Well met!

 

I'm still going on with optimising my A*-pathing. Everything is working fine now, only thing I need to do is make it faster.

After a lot of reading and testing, I found the only bottleneck was the useage of a std::vector for the openlist, making the retrieval of the lowest cost node expensive. So I went ahead and replaced it with a priority-queue, beeing sure that everything would be fine afterwards.

 

I was fairly suprised when I noticed that the speed had not changed by much. So I tried profiling again, and found that the bottleneck has shifted to one line:

openList.push(*map[x][y]);

My priority-queue is defined as

priority_queue<node> openList;

and the operator

bool operator>(const node i) const{
    return costF > i.costF;
}  

I can't really see why the push exactly could be the problem, others than the operator beeing bad. I also tried not pushing the nodes itself, but a struct holding only the important infos, but that did nothing.

 

Note that I am pushing nodes when they change their F-cost, too. Because priority-queues are hard to reorder (and in O(n)). When poping the top, ignore it if it isn't actually on the openlist (flag in node). 

 

I have no idea what could be the problem with it - please help me out smile.png

Thanks in advance!

Edited by gnomgrol
0

Share this post


Link to post
Share on other sites

I'm no C++ pro but I believe that not using a reference in your operator would cause your program to copy the node.

 

Also, have you tried using a vector with make_heap from algorithm?

0

Share this post


Link to post
Share on other sites

Storing the nodes in a separate array and storing pointers or indices because the vector might be rellocated (to the objects stored in the vector) in the priority_queue might improve performance since pushing values into the priority_queue will require moving them around and copying pointers is faster than copying the whole structure.

Edited by TiagoCosta
0

Share this post


Link to post
Share on other sites

Also, have you tried using a vector with make_heap from algorithm?

Did you know that's how std::priority_queue is implemented?

1

Share this post


Link to post
Share on other sites

 

Also, have you tried using a vector with make_heap from algorithm?

Did you know that's how std::priority_queue is implemented?

 

 

He said:

 

"I found the only bottleneck was the useage of a std::vector for the openlist, making the retrieval of the lowest cost node expensive. So I went ahead and replaced it with a priority-queue, beeing sure that everything would be fine afterwards."

 

What I meant was that he could try using the make_heap, instead of replacing it with a priority_queue (probably I wasn't clear enough though).

0

Share this post


Link to post
Share on other sites

I did a similar comparison a few years back, only I wasn't using any STL, my code was running on Cell SPUs.  What I found was that keeping the open nodes in plain old unsorted arrays (and optimizing the search for the lowest cost node using SIMD intrinsics) was faster than using a priority queue (without SIMD optimization).

 

It depends on how many open nodes you tend to have and how frequently the cost of nodes needs to be updated.  In my case I found that node costs were updated much more frequently than removing a node from the open list, so the overhead of maintaining the order in the priority queue was outweighing the benefit of having it.  I seem to remember having around ~200 nodes on the open list on average.  Those SPUs were really good at brute force.

 

In most cases though I would expect the priority queue to win out.

 

0

Share this post


Link to post
Share on other sites

It depends on how many open nodes you tend to have and how frequently the cost of nodes needs to be updated.  In my case I found that node costs were updated much more frequently than removing a node from the open list, so the overhead of maintaining the order in the priority queue was outweighing the benefit of having it.  I seem to remember having around ~200 nodes on the open list on average.  Those SPUs were really good at brute force.

 

In most cases though I would expect the priority queue to win out.

 

I'm curious, but why were you updating the cost of nodes so often?  And as a workaround, did you try just adding the node again, with a different cost.

0

Share this post


Link to post
Share on other sites

 

It depends on how many open nodes you tend to have and how frequently the cost of nodes needs to be updated.  In my case I found that node costs were updated much more frequently than removing a node from the open list, so the overhead of maintaining the order in the priority queue was outweighing the benefit of having it.  I seem to remember having around ~200 nodes on the open list on average.  Those SPUs were really good at brute force.

 

In most cases though I would expect the priority queue to win out.

 

I'm curious, but why were you updating the cost of nodes so often?  And as a workaround, did you try just adding the node again, with a different cost.

 

 

Well, as I said this is from a few years ago, so I can't recall all of the details.  It may be that the frequency of updating node costs was simply much more than I had expected, and not more frequent than node removals.  With the priority queue, it needs to be reordered any time a node is added, removed or has it's cost updated.  So all three of those operations are O( log N ).  With a simple unsorted array, adding a node or updating the cost is O( 1 ), and removing the best node is O( N ).

0

Share this post


Link to post
Share on other sites
Thanks for all the answers!
When I saw ApochPIQs answer, I hit my head against a wall. Of corse that was the problem! 
... Atleast I thought so. For some reason, when the operator is defined like this
 
bool node::operator<(const node *right) const{
return (costF < right->costF);
}
it is beeing completly ignored by the priority queue. Deleting it doesn't give me ">operator not defined".
 
So, I switched back to using 
struct nodeCostCompare{
bool operator() (const node *a, const node *b)const
{
return a->costF > b->costF;
}
};
priority_queue<node*, vector<node*>, nodeCostCompare> openList;  
which is actually beeing called. 
 
The  problem is that this is not faster than my old code. I guess it is because, everytime I push, a new nodeCostCompare is created, right?
So I have to find a way of using the overloaded >operator, but as I stated, I can't figure out why it is not beeing called when using reference instead of value to compare (and what the hell else it is using).
 
Storing the nodes as pointers leaves me with the problem that I can't update the values of them after I pushed them into the queue - I bypassed the problem by just pushing the node again, and when poping, checked if they were on the openlist. 
But I can't do that with pointers - as soon as I try to update a value on a node that I pushed onto the openlist, I end up with the "invalid heap" runtime error. 
 
For the moment, I just took out the code that was updateing nodes - now the path is slightly incorrect.
 
If you had some more thoughts on this, I'd highly apprechiate it to hear them!
 
 
EDIT: Okey, I figured out the problem with this one: The problem is, the pointer addresses are sorted. The only way to avoid this is a class that 'compares the pointers'.
So this is no option either.
Edited by gnomgrol
0

Share this post


Link to post
Share on other sites

... Atleast I thought so. For some reason, when the operator is defined like this

 
bool node::operator<(const node *right) const{
return (costF < right->costF);
}
it is beeing completly ignored by the priority queue. Deleting it doesn't give me ">operator not defined".

const node * is a pointer, not a reference. const node & is a reference.

1

Share this post


Link to post
Share on other sites

You are right ... I shouldn't try coding this late at night.

But changing it didn't increase the speed of openList.push() at all.

So, we are back were we began sad.png

 

EDIT: I also tried what TiagoCosta mentioned, but it didn't increase the speed by a noticable amount.

Edited by gnomgrol
1

Share this post


Link to post
Share on other sites

Maybe your testing method is flawed.

 

How fast/slow is it? Have you compared with other implementations? What are you using to represent the graph (a grid, a navigation mesh, something else)?
Have you tried with several sizes of maps?

Edited by KnolanCross
1

Share this post


Link to post
Share on other sites

Hello.

Have you tryed a std::list for your Prioityqueue this way you can incert the lowest cost it works fast for me, my maps are a grid 1000 X 1000 plus cells

std::list<Cell *> Priorityqueue;//ordered list

0

Share this post


Link to post
Share on other sites

For testing, I use

clock();
... Not the best, I know, but it does the job and doesn't seem to have a big negative impact on the performance. 

I did compare with other implementations, but they were all much slower than mine. The graph is represented by a 2D array of nodes.

I went through several mapsizes, currently 256x256 is the only one that performs close to an acceptable speed, but still is kinda to slow when a player wants to path over the full map in one go(2-3 sec). On short or midrange paths it seems ok though.

 

I'm not exactly sure how NPC-movement is handled in onlinegames. They have to path too, so the server would have to comptue their paths - and that rather frequently. That really would impact that servers performance, right?

 

@ankhd 

I don't really understand what you want to say. You suppose a std::list is faster than a priority-queue in terms of pushing? How fast do you manage to path over a 1kx1k grid from edge to edge?

0

Share this post


Link to post
Share on other sites
Vectors have more cache coherent access patterns than heaps. Also if your help contains objects rather than pointers, then swaps are expensive operations. Unless your nodes are simple, your heap should definitely consist of pointers to nodes. With a bit of magic you could also wrap the pointers so that they updated the nodes to keep track of their position of their pointer in the heap, to make updates more efficient.

To properly understand whether a heap is justified you need to collect some statistics about the distribution of counts of open nodes at each exploration step.
1

Share this post


Link to post
Share on other sites

But I can't do that with pointers - as soon as I try to update a value on a node that I pushed onto the openlist, I end up with the "invalid heap" runtime error.


I'm sorry if this is a stupid question, but are you measuring performance in a debug build? The push function is going to be much more expensive in a build with the valid heap check than without it.

Also, how do you find the node you want to update in the priority queue? Are you still just pushing it twice?

Unfortunately, I wouldn't use std::priority_queue because it isn't really built for updating anything in it. However, you should be able to program your own heap without much work with help from the STL.

Also, how are you storing your nodes? What does that class/struct look like? I see it might be the same thing as in your map. You might want to streamline it as much as possible. I see you said you've tried that, but it's hard to know exactly what you tried and if there were any flaws based on this limited amount of information.

Do you have the ability to use a profiler instead of clock()?
1

Share this post


Link to post
Share on other sites

Hello, sorry for my late reply.

 

I was profiling in a debug build, I didn't think of that beeing able to influence the performance. Switching it improved the speed by like 25%. 

I never used a profiler before, but if you are able to point me to a good one, I'd be willing to give it a try.

 

I'm still pushing twice instead of updateing the nodes. I take it you suggest updating rather than doing that?

 

I'm storing the nodes in a simple 2D array

node *map[size][size];

Here is the .h of the node class:

 
 
class node{
public:
node *parent;
int2 pos;
D3DXVECTOR2 posInNode;
bool isOpen;
 
bool isOnClosedList, isOnOpenList; // not using closedlist, and isOnOpenList lets me skip the sorting through the openlist when I want to see if a node is on it
int costF, costG, costH;
 
bool pushedOntoQueue;
 
public:
node(){
resetData();
isOpen = true;
pos = int2(0, 0);
}
node(int2 posIn){
pos = posIn;
isOpen = true;
resetData();
}
~node(){}
 
// reset for reusing
void resetData(){
posInNode = D3DXVECTOR2(-1, -1);
parent = NULL;
costF = costG = costH = 0;
isOnClosedList = false;
isOnOpenList = false;
 
pushedOntoQueue = false;
}
};
 

This is what I use in the priorityqueue instead of the full Node:

struct nodePH{
int2 pos;
int costF;
bool operator<(const nodePH &right) const{
return (costF > right.costF);
}
};
0

Share this post


Link to post
Share on other sites

I use Very Sleepy for my profiling, it essentially looks what function you are in (using the stack?) many times a second to get an approximate of where you spend the most time. You then get percentages and times for the most visited code.

0

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0