Binary heap vs list A* - my humble performance comparison [updated with JPS]


On a 640x640 map your method was 195 times faster, but on a 3600x1800 map it was only 65 times faster. Do you know why?

See my above post.

Definitely related to the memory access pattern (cache effects). This is visible not only in the discrepancy on the 3600x1800 map but also on the tiny 64x64 map.

For the tiny map, everything fits into the cache, so the list approach is still very fast (only a 2x difference, almost certainly down to the algorithmic difference). The large map benefits both from the algorithmic difference and from a more cache-friendly access pattern, which lets most or all of the data stay cached when the heap implementation touches it. For the huge map, the heap's access pattern has too many "holes" to stay cache friendly, so it degenerates to the same kind of random access as the list implementation. The algorithmic advantage is still there, so it still runs a lot faster (just not as much faster).
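To put rough numbers on that (assuming a hypothetical 16-byte node record, which may not match the layouts actually used in this thread):

#include <cstddef>

// Back-of-envelope footprint of the node array, assuming a hypothetical
// 16-byte node (costs, parent index, flags); real layouts will differ.
constexpr std::size_t node_bytes = 16;
constexpr std::size_t tiny_map = 64ull * 64 * node_bytes;       //   64 KB - fits in L2
constexpr std::size_t big_map  = 640ull * 640 * node_bytes;     // ~6.5 MB - roughly L3-sized
constexpr std::size_t huge_map = 3600ull * 1800 * node_bytes;   // ~100 MB - far beyond any cache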


Just want to add this in case anyone gets here through a search in the future:

I said you can't use a custom container, but I was wrong. If you are using C++ you can use a std::vector as the open list by using make_heap from <algorithm>. It works, but I have no idea about the performance.
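For anyone landing here later, a minimal sketch of that approach; the Node layout and function names below are placeholders, not the code from this thread:

#include <algorithm>
#include <vector>

struct Node { int index; float f; };   // placeholder: grid index plus f-cost

// make_heap and friends build a max-heap by default, so compare with > to
// keep the cheapest node on top (a min-heap on f).
bool worse(const Node& a, const Node& b) { return a.f > b.f; }

std::vector<Node> open;                 // the open list

void push_open(Node n) {
    open.push_back(n);
    std::push_heap(open.begin(), open.end(), worse);
}

Node pop_cheapest() {
    std::pop_heap(open.begin(), open.end(), worse);  // best node moves to the back
    Node n = open.back();
    open.pop_back();
    return n;
}

void rebuild_open() {
    // if f-costs were changed in place, re-heapify the whole vector
    std::make_heap(open.begin(), open.end(), worse);
}

The usual caveat is that there is no cheap decrease-key; the simplest workaround is to push a duplicate entry and skip nodes that are already closed when they get popped.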

Currently working on a scene editor for ORX (http://orx-project.org), using kivy (http://kivy.org).

I remember helping some guy on USENET something like 10 years ago improve his A*, and he ended up using a customized HeapQ to optimize it significantly. It was similarly used with a fixed grid network. I recall recommending doing pointer math directly (the list 'nodes' in a statically allocated pool) in several of the operations and making the data as tight as possible (packing, smaller data types, combined flags, etc.) to help with the caching issues (and other things, like using a closed-marked border to simplify the neighbor-candidate if-then logic).
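For illustration, a rough sketch of the statically allocated pool plus the pre-closed border trick described above; the sizes and field choices are just assumptions, not the original code:

#include <cstdint>

enum { W = 64 + 2, H = 64 + 2 };   // playfield plus a one-cell border all around

struct Cell {
    std::uint16_t g;        // cost from start, kept deliberately small
    std::uint16_t parent;   // index of the predecessor in the pool, not a pointer
    std::uint8_t  flags;    // bit 0 = closed, bit 1 = blocked, ...
};

static Cell pool[W * H];    // statically allocated node pool

// Mark the border cells closed once, up front. Neighbor expansion can then
// index straight into pool[i - 1], pool[i + 1], pool[i - W], pool[i + W]
// (and the diagonals) without any "is this off the map?" checks.
void init_border() {
    for (int x = 0; x < W; ++x) {
        pool[x].flags |= 1;
        pool[(H - 1) * W + x].flags |= 1;
    }
    for (int y = 0; y < H; ++y) {
        pool[y * W].flags |= 1;
        pool[y * W + (W - 1)].flags |= 1;
    }
}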

I haven't worked on this stuff for a while, but I recall that most of the Unix compilers DON'T have a data 'pack' implementation (or one that does anything), which would lose some of the significant optimizations.

Caches are bigger now (?), but with maps of the size mentioned (3600 x 1800), which is much larger than the ones we were playing with (~1Kx1K), you will still run into a lot of misses.

I wonder what use maps of that size would be without also considering a hierarchical system.

Ratings are Opinion, not Fact

I adapted the JPS from jumper ( https://github.com/Yonaba/Jumper/blob/master/jumper/search/jps.lua ) to my C implementation. First I must say that this is not a 100% fair comparison, because jumper's algorithm allows diagonal movement even if there is a wall next to the current node.

Here is a portion of the small map to illustrate the difference:


000....00000........x000000......000000000....0000000000......00
0000..000000........x00000........000000000....0000000000.....00
000000000000.........x0000.........00..00000...00000000000....00
000000000000.........x00000..............000...000000000000...00

In my original implementation the movement from the second to the third line is not allowed. A pretty small detail, considering I am more interested in the performance times.
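For reference, the usual way to express that restriction looks roughly like this, with walkable() standing in for whatever the actual map query is:

bool walkable(int x, int y);   // the actual map query, assumed to exist elsewhere

// A diagonal step (dx, dy) is only allowed when both adjacent orthogonal
// cells are open too, so a path can't squeeze past the corner of a wall.
bool diagonal_allowed(int x, int y, int dx, int dy) {
    return walkable(x + dx, y + dy)   // the diagonal target itself
        && walkable(x + dx, y)        // its horizontal neighbor
        && walkable(x,      y + dy);  // its vertical neighbor
}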

So here they are. All the results are for JPS with memory pooling and a heap for the open list. Again, times are in microseconds (10^6 µs = 1 s).

Test on a randomly generated 64x64 map:
First path: 118. Following paths: 117.

Test on a randomly generated 640x640 map:
First path: 1724. Following paths: 1542.

Test on a randomly generated 3600x1800 map:
First path: 7218. Following paths: 6321.

So there it is. JPS is indeed incredibly fast; I still need to figure out how to add the diagonal move restriction.

Currently working on a scene editor for ORX (http://orx-project.org), using kivy (http://kivy.org).

Another major limitation of Jump Point Search is its inability to deal with different move costs per tile. For some games that's perfectly acceptable (like Bioware's Infinity Engine games), but for other games, like the Civ series with its mountains and whatnot, it is not.

@ferrus

The jumper implementation actually comes with a "clearance" that allows you to ignore nodes of a given cost. If you have little variation, you can use different clearances to calculate a few paths and pick the cheapest one.
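In other words, something along these lines; find_path and the cost thresholds below are hypothetical, just to show the idea of running a few searches and keeping the cheapest result:

#include <limits>
#include <vector>

struct Path { std::vector<int> nodes; float cost; bool found; };

// Hypothetical search entry point: tiles whose terrain cost is above
// max_tile_cost are treated as blocked for this run.
Path find_path(int start, int goal, int max_tile_cost);

// With only a handful of distinct tile costs, run one search per threshold
// and keep whichever complete path is cheapest overall.
Path cheapest_path(int start, int goal) {
    const int thresholds[] = { 1, 2, 4 };   // assumed terrain cost levels
    Path best;
    best.cost = std::numeric_limits<float>::max();
    best.found = false;
    for (int t : thresholds) {
        Path p = find_path(start, goal, t);
        if (p.found && p.cost < best.cost) best = p;
    }
    return best;
}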

Currently working on a scene editor for ORX (http://orx-project.org), using kivy (http://kivy.org).

The only thing left now is to write it in assembler (heh), multithread it, or do it on the GPU.


I haven't worked on this stuff for a while, but I recall that most of the Unix compilers DON'T have a data 'pack' implementation (or one that does anything), which would lose some of the significant optimizations.

gcc has packed structs.
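For the record, in gcc/clang it looks something like this (the fields are just an example node layout, not anything from this thread):

#include <cstdint>

// __attribute__((packed)) removes padding between members, shrinking the
// per-node footprint at the cost of potentially unaligned accesses.
struct __attribute__((packed)) PackedNode {
    std::uint32_t g;       // cost from start
    std::uint16_t parent;  // index of the predecessor
    std::uint8_t  flags;   // open/closed/blocked bits
};                         // 7 bytes packed, 8 with default alignment

// MSVC's equivalent is #pragma pack(push, 1) ... #pragma pack(pop).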

