Binary heap vs list A* - my humble performance comparison [updated with JPS]

Fernando Coelho · 2014-02-15T18:07:55

*** UPDATED *** Added JPS information. I wrote up some details about the JPS implementation in another post, so be sure to check it.

Quite some time ago I released a C implementation of the A* algorithm that uses a grid as the graph and a linked list for the open list. The topic can be found here: http://www.gamedev.net/topic/632734-a-c-implementation-code/

After that I got really busy with a lot of different work and couldn't touch the code for quite some time. Recently I managed to put some work into it and implemented a version that uses a binary heap for the open list, as well as a version that uses memory pooling. The pool grows by 4096 * struct size each time it runs out of elements.

My objective with this post is to show the performance gain you can expect from optimizing an A* implementation, so that people can decide whether they should optimize their own.

I won't release the code for now because it is still a bit ugly and it depends on a container library that I use (it has no package yet, so people would need to compile it themselves).

Tests info: The tests were run on a Core 2 Duo 8400 3.00 GHz; the code runs on a single thread. The OS was Ubuntu 12.04 32-bit. All times are in microseconds (10^6 microseconds = 1 s).

Test on a randomly generated 64x64 map:

List implementation: 235.
Heap implementation: 161.
Pool + Heap implementation: (first path) 124 | (following paths) 105.
JPS + Pool + Heap: (first path) 118 | (following paths) 117.

Test on a randomly generated 640x640 map:

List implementation: 322823.
Heap implementation: 2006.
Pool + Heap implementation: (first path) 1823 | (following paths) 1652.
JPS + Pool + Heap: (first path) 1724 | (following paths) 1542.

Test on a randomly generated 3600x1800 map:

List implementation: 8075664.
Heap implementation: 131072.
Pool + Heap implementation: (first path) 126942 | (following paths) 124395.
JPS + Pool + Heap: (first path) 7218 | (following paths) 6321.

The results are pretty much what I expected: as the path size increases, the heap implementation gets much faster than the list implementation. The advantage of pooling is comparatively small as the path size increases, but it does improve performance. My conclusion is that optimization only matters if your paths can get long. I used the list-based version in a prototype dungeon game (something like Gauntlet) where monsters would only aggro players within a distance of 20 squares, and performance was not an issue at all. Comments, questions and suggestions are always welcome.
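
For readers who have not written one before, here is a minimal sketch of a binary min-heap used as an open list. It is not the code measured above (that code was not released); the names OpenHeap, HeapItem, heap_push, heap_pop and the fixed capacity OPEN_CAP are assumptions made for this illustration. The point is that both inserting a node and extracting the node with the lowest f cost take O(log n) steps, whereas a linked list typically needs an O(n) scan for one of the two.

```c
#include <assert.h>
#include <stddef.h>

#define OPEN_CAP 1024   /* fixed capacity, chosen only for this sketch */

typedef struct {
    int node;   /* hypothetical node id, e.g. y * width + x */
    int f;      /* priority: g + h, lower is better */
} HeapItem;

typedef struct {
    HeapItem items[OPEN_CAP];
    size_t   count;
} OpenHeap;

void heap_swap(HeapItem *a, HeapItem *b) {
    HeapItem t = *a; *a = *b; *b = t;
}

/* Insert: append at the end, then sift up while the parent has a larger f. */
void heap_push(OpenHeap *h, int node, int f) {
    assert(h->count < OPEN_CAP);
    size_t i = h->count++;
    h->items[i].node = node;
    h->items[i].f = f;
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (h->items[parent].f <= h->items[i].f) break;
        heap_swap(&h->items[parent], &h->items[i]);
        i = parent;
    }
}

/* Remove and return the item with the smallest f: move the last item
   to the root, then sift it down towards the smaller child. */
HeapItem heap_pop(OpenHeap *h) {
    assert(h->count > 0);
    HeapItem top = h->items[0];
    h->items[0] = h->items[--h->count];
    size_t i = 0;
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < h->count && h->items[l].f < h->items[m].f) m = l;
        if (r < h->count && h->items[r].f < h->items[m].f) m = r;
        if (m == i) break;
        heap_swap(&h->items[m], &h->items[i]);
        i = m;
    }
    return top;
}
```

In an A* loop you would call heap_push for every newly discovered neighbor and heap_pop to pick the next node to expand. A real implementation would also need a way to handle a node whose cost improves while it is already in the heap (for example by re-inserting it and skipping stale entries), which this sketch leaves out.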


Started by KnolanCross January 06, 2014 09:32 PM

14 comments, last by Pink Horror 10 years, 2 months ago

samoth

9,833

January 11, 2014 01:18 PM

On a 640x640 map your method was 195 times faster. But on a 3600x1800 it was only 65 times faster. Do you know why?

See my above post.

Definitely related to the memory access pattern (cache effects). This is visible not only in the discrepancy on the 3600x1800 map but also on the tiny 64x64 map.

For the tiny map, everything fits into the cache, so the list approach is still very fast (only a 2x difference, almost certainly due to the algorithmic difference). The large map benefits both from the algorithmic difference and from a more cache-friendly access pattern, which allows most or all of the data to be cached when it is accessed in the heap implementation. For the huge map, the heap has too many "holes" in its access pattern to be very cache friendly, so this degenerates to the same "kind of random access" factor as with the list implementation. There is still an algorithmic advantage, however, so it still runs a lot faster (just not that much faster).

KnolanCross

1,974

Author

January 14, 2014 05:27 PM

Just want to add this info in case anyone gets here through search in the future:

I said you can't use a custom container, but I was wrong. If you are using C++, you can use a std::vector as the open list by using std::make_heap (together with std::push_heap and std::pop_heap) from <algorithm>. It works, but I have no idea about the performance.

Currently working on a scene editor for ORX (http://orx-project.org), using kivy (http://kivy.org).

wodinoneeye

1,691

January 15, 2014 04:52 AM

I remember helping some guy on USENET something like 10 years ago improve his A*, and he came up with a customized HeapQ that optimized it significantly. It was similarly used with a fixed grid network. I recall recommending doing pointer math directly (with the list 'nodes' in a statically allocated pool) in several of the operations and making the data as tight as possible (packing, smaller data types, combined flags, etc.) to help with the caching issues, plus other things like marking the map border as closed to simplify the if-then logic for neighbor candidates.
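
To make the "tight data in a static pool" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the field choice, the names CompactNode, node_alloc, pool_reset and path_steps, and the pool size. Nodes refer to their parent by a 32-bit pool index instead of a pointer, the open/closed state shares one byte of flags, and f is recomputed as g + h rather than stored, which gives 12 bytes per node on common ABIs.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES (640u * 640u)   /* one node per cell of a 640x640 map */
#define NO_NODE   UINT32_MAX      /* index value meaning "no parent" */

enum { NODE_OPEN = 1u << 0, NODE_CLOSED = 1u << 1 };

typedef struct {
    uint32_t parent;  /* pool index of the previous node, not a pointer */
    uint32_t g;       /* cost from the start */
    uint16_t h;       /* heuristic estimate; f is computed as g + h */
    uint8_t  flags;   /* NODE_OPEN / NODE_CLOSED combined in one byte */
    uint8_t  unused;  /* explicit padding so the size stays at 12 bytes */
} CompactNode;

static CompactNode pool[MAX_NODES];  /* statically allocated, never freed */
static uint32_t pool_used;

/* Hand out the next free slot and return its index. */
uint32_t node_alloc(void) {
    assert(pool_used < MAX_NODES);
    CompactNode *n = &pool[pool_used];
    n->parent = NO_NODE; n->g = 0; n->h = 0; n->flags = 0; n->unused = 0;
    return pool_used++;
}

/* Start a new search: all slots become available again in O(1). */
void pool_reset(void) { pool_used = 0; }

/* Count the steps from node i back to the start by following parent indices. */
unsigned path_steps(uint32_t i) {
    unsigned steps = 0;
    while (pool[i].parent != NO_NODE) { i = pool[i].parent; steps++; }
    return steps;
}
```

Unlike the pool in the original post, this one has a fixed size and does not grow; the trade-off is that no allocation ever happens during a search.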

I haven't worked on this stuff for a while, but I recall that most of the Unix compilers DON'T have a data 'pack' implementation (or one that does anything), which would lose some of the significant optimizations.

Caches are bigger now (?), but with maps of the size mentioned (3600 x 1800), which is much larger than the ones we were playing with (~1Kx1K), you will still run into a lot of misses.

I wonder at what use there would be with maps of that size without also considering implementing a hierarchical system.

Ratings are Opinion, not Fact

KnolanCross

1,974

Author

January 20, 2014 11:01 AM

I adapted the JPS from Jumper ( https://github.com/Yonaba/Jumper/blob/master/jumper/search/jps.lua ) to my C implementation. First I must say that this is not a 100% fair comparison, because Jumper's algorithm allows diagonal movement even if there is a wall next to the current node.

Here is a portion of the small map to illustrate the difference:


000....00000........x000000......000000000....0000000000......00
0000..000000........x00000........000000000....0000000000.....00
000000000000.........x0000.........00..00000...00000000000....00
000000000000.........x00000..............000...000000000000...00

In my original implementation the movement from the second to the third line (a diagonal step past the corner of a wall) is not allowed. That is a pretty small detail, considering I am more interested in the performance times.
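
The two diagonal rules can be sketched as follows. This is only an illustration under my own assumptions: the Grid type, the function names, and the cell encoding ('#' for a wall, anything else walkable) are hypothetical and are not taken from Jumper or from the C implementation discussed here.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    int w, h;
    const char *cells;   /* row-major; '#' = wall, anything else = walkable */
} Grid;

/* Cells outside the map count as walls. */
bool walkable(const Grid *g, int x, int y) {
    return x >= 0 && y >= 0 && x < g->w && y < g->h
        && g->cells[y * g->w + x] != '#';
}

/* Permissive rule: only the target cell has to be walkable, so a diagonal
   step may pass the corner of a wall. */
bool can_step_permissive(const Grid *g, int x, int y, int dx, int dy) {
    assert(dx >= -1 && dx <= 1 && dy >= -1 && dy <= 1 && (dx || dy));
    return walkable(g, x + dx, y + dy);
}

/* Strict rule: a diagonal step additionally requires both orthogonally
   adjacent cells to be walkable, so paths never cut a wall corner. */
bool can_step_strict(const Grid *g, int x, int y, int dx, int dy) {
    if (!can_step_permissive(g, x, y, dx, dy)) return false;
    if (dx != 0 && dy != 0)
        return walkable(g, x + dx, y) && walkable(g, x, y + dy);
    return true;
}
```

On a 3x3 grid with a single wall at (1, 0), the step from (0, 0) to (1, 1) is accepted by the permissive rule and rejected by the strict one. Plain A* only needs the strict check when generating neighbors; adding it to JPS is harder because the jump and pruning rules also assume a particular diagonal policy.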

So here they are. All the results are for JPS with memory pooling and a heap for the open list. Again, times are in microseconds (10^6 microseconds = 1 s).

Test on a randomly generated 64x64 map:

First path: 118. Following paths: 117.

Test on a randomly generated 640x640 map:

First path: 1724. Following paths: 1542.

Test on a randomly generated 3600x1800 map:

First path: 7218. Following paths: 6321.

So there it is. JPS is, indeed, incredibly fast, but I still need to figure out how to add the diagonal move restriction.


ferrous

6,164

January 31, 2014 06:23 AM

Another major limitation of Jump Point Search is its inability to deal with different move costs per tile. For some games that's perfectly acceptable (like BioWare's Infinity Engine games), but for other games, like the Civ series with its mountains and whatnot, it is not.

KnolanCross

1,974

Author

January 31, 2014 01:44 PM

@ferrous

The Jumper implementation actually comes with a "clearance" that allows you to ignore nodes of a given cost. If you have only a few cost variations, you can use different clearances to calculate a few paths and pick the cheapest one.


polyfrag

2,504

February 15, 2014 05:02 AM

The only thing left now is to write it in assembler (heh), multithread it, or do it on GPU.

Pink Horror

2,459

February 15, 2014 06:07 PM

I haven't worked on this stuff for a while, but I recall that most of the Unix compilers DON'T have a data 'pack' implementation (or one that does anything), which would lose some of the significant optimizations.

gcc has packed structs.
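
As a small illustration of that, GCC (and Clang) accept `__attribute__((packed))` on a struct definition. The node layouts below are hypothetical and exist only to show the size difference; bytes_saved is likewise a made-up helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural layout: the compiler inserts padding so each field is aligned
   (typically 12 bytes in total on x86). */
typedef struct {
    uint8_t  flags;
    uint32_t g;
    uint8_t  terrain;
    uint16_t h;
} LooseNode;

/* Packed layout: no padding between fields, 1 + 4 + 1 + 2 = 8 bytes. */
typedef struct __attribute__((packed)) {
    uint8_t  flags;
    uint32_t g;
    uint8_t  terrain;
    uint16_t h;
} PackedNode;

/* Memory saved by packing when node_count nodes are stored. */
size_t bytes_saved(size_t node_count) {
    assert(sizeof(LooseNode) >= sizeof(PackedNode));
    return node_count * (sizeof(LooseNode) - sizeof(PackedNode));
}
```

Packing can leave fields misaligned, which may make individual accesses slower on some architectures, so it is worth measuring. Reordering the fields from largest to smallest often removes most of the padding without any attribute at all.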
