Norman Barrows

towards a faster A*


so i've finally begun the A* implementation for Caveman 3.0. i first played around with A* almost a decade ago. this time around, with my typical outside-the-box approach to things, i decided to try things a little differently.

 

to begin with, i'm working with a 300x300 grid type search area. once the code is complete this will most likely be changed to a variable sized search area.

 

instead of open and closed lists, i keep an A* map: a 300x300 array with f,g,h, and state: open | closed | neither.

 

for the moment, a separate 300x300 collision map has the obstacle data in it. but the two maps might be combined.

 

the only slow part is getting lowest f, which requires searching the entire grid.

 

so the idea to speed up that is to use an index system.

 

when a node is marked as open (added to the open list), it would be bucket sorted on f into the index system.

 

with a 300x300 area, the longest straight path is only about 424 (the diagonal), and a convoluted path might max out at 600 or so. so you would need perhaps 600 buckets. since multiple open nodes can have the same f value (working with ints here), each bucket would be a list.

 

for the lists i was thinking an unordered list with active fields. i figured insert at the first inactive slot, and set active to false to remove, would be faster than insert at end and copy higher array elements down on remove.

 

when adding a node to the open list, you'd have to iterate through that bucket's list to insert at the first inactive slot.

 

when getting lowest f, you'd have to iterate through the buckets to the first non-empty one, then iterate through its list of nodes to the first active one.

 

storing the number of nodes in each bucket will speed up determining empty buckets. the same idea might apply for the first active and inactive node in a bucket, but maybe not; you have to sort or search at some point. the same might also be true for the first non-empty bucket.
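a minimal sketch of the bucket index idea described above, assuming illustrative names and sizes (Entry, Bucket, MAX_PER_BUCKET are not from the original post); a per-bucket count makes the empty-bucket test a single compare, and taking the last entry makes removal O(1):

```c
#define MAX_F 600            /* upper bound on f for a 300x300 grid, per the estimate above */
#define MAX_PER_BUCKET 1024  /* assumed size; worst case is the max frontier size */

typedef struct { int x, z; } Entry;

typedef struct {
    int   count;                     /* number of entries; 0 means bucket is empty */
    Entry entry[MAX_PER_BUCKET];
} Bucket;

static Bucket bucket[MAX_F + 1];

/* add an open node: append at end, O(1) */
void index_add(int f, int x, int z)
{
    Bucket *b = &bucket[f];
    b->entry[b->count].x = x;
    b->entry[b->count].z = z;
    b->count++;
}

/* get lowest f: scan buckets for the first non-empty one, O(MAX_F) worst case */
int index_get_lowest(int *x, int *z)
{
    for (int f = 0; f <= MAX_F; f++) {
        if (bucket[f].count > 0) {
            Entry *e = &bucket[f].entry[bucket[f].count - 1]; /* take last entry: O(1) removal */
            *x = e->x;
            *z = e->z;
            bucket[f].count--;
            return f;
        }
    }
    return -1; /* open set is empty */
}
```

note this takes the last entry in a bucket rather than scanning for the first active one, which sidesteps the active-field bookkeeping entirely.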

 

nodes in the a* map and the index system would be doubly linked, IE a map node would have a reference to its index system entry, and its index system entry would have the coordinates of the node in the A* map.  this would make operations like removal very fast.

 

 

 

if i can track the first non-empty bucket instead of searching for it, then the search for the first active node in the bucket list is about the only slow part left.

 

it's my understanding that the open list is the "frontier" nodes, and the closed list is the "interior" nodes.

 

the max number of frontier nodes would be the max size for a bucket list (worst case).

 

might take a bit of ram. i'm thinking have one A* map and index system, and just keep reusing them for each entity. entities would store the paths that were the output of the A* system.

 

 

thoughts? comments? suggestions?

The usual implementation uses a heap for the open list, which allows insertions and retrieval of the lowest f node in time O(log(number_of_open_nodes)). A node can be added to the list multiple times, which is fine as long as you check if the lowest-f open node is closed; in that case simply skip it.
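a minimal sketch of the binary min-heap open list suggested here, with illustrative names (HeapNode, HEAP_MAX are assumptions, not from any particular implementation):

```c
#define HEAP_MAX 90000   /* enough for every node of a 300x300 grid */

typedef struct { int f, x, z; } HeapNode;

static HeapNode heap[HEAP_MAX];
static int heap_size = 0;

/* insert: place at the end, then sift up, O(log n) */
void heap_push(int f, int x, int z)
{
    int i = heap_size++;
    heap[i].f = f; heap[i].x = x; heap[i].z = z;
    while (i > 0) {
        int parent = (i - 1) / 2;
        if (heap[parent].f <= heap[i].f) break;
        HeapNode t = heap[parent]; heap[parent] = heap[i]; heap[i] = t;
        i = parent;
    }
}

/* remove lowest f: take the root, move the last node up, sift down, O(log n).
   caller must check heap_size > 0 first. */
HeapNode heap_pop(void)
{
    HeapNode top = heap[0];
    heap[0] = heap[--heap_size];
    int i = 0;
    for (;;) {
        int l = 2*i + 1, r = 2*i + 2, smallest = i;
        if (l < heap_size && heap[l].f < heap[smallest].f) smallest = l;
        if (r < heap_size && heap[r].f < heap[smallest].f) smallest = r;
        if (smallest == i) break;
        HeapNode t = heap[smallest]; heap[smallest] = heap[i]; heap[i] = t;
        i = smallest;
    }
    return top;
}
```

duplicates are fine, as noted above: when popping, just skip any node already marked closed on the grid (lazy deletion) instead of updating entries in place.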

I would start by implementing that. I would try the other stuff only if that's not fast enough.

Good luck!


I second Alvaro's suggestion re: the heap. I remember reading about A* and, not knowing how a heap worked, it seemed pretty scary, but once I had a crack at implementing it and it started to make sense, my A* implementation became several orders of magnitude faster.


my choice of a grid for the open and closed lists was inspired by Amit's A* pages, where he combines the two lists into one.

 

he goes into a fair amount of detail re: possible implementations, with lots of big O notation. but he also points out that different algos are superior depending on the value of N. specifically, he mentions the O(log n) algos and their poor performance with N < 200,000.

 

>> Start with the bare-bones dead-simple A* implementation. Then profile it in actual usage with real data from your game to discover where it is slow and where things could be optimized.

 

i'm still in the implementation selection stage here. profiling of prototypes of different implementations would be about the only use for profiling at this point.


>> my choice of a grid for the open and closed lists was inspired by Amit's A* pages, where he combines the two lists into one.

>> he goes into a fair amount of detail re: possible implementations, with lots of big O notation. but he also points out that different algos are superior depending on the value of N. specifically, he mentions the O(log n) algos and their poor performance with N < 200,000.


I have used a priority queue to compute the distance to goal in the game of Quoridor, which has a 9x9 board. The code is very simple and it was pretty fast: The alpha-beta search went through 1 to 2 million nodes per second --depending on the stage of the game-- while calling A* twice per leaf.


You are lucky if you can fix your window overlay grid dimensions into constants.

Then you can do pointer arithmetic on a fixed-size map array (ie ptr+1, ptr-1, ptr+span, ptr+span+1, ptr-span+1, etc.). For example, to access the usual 8 neighbors when finding open list candidates, you can do this with a single offset value (an array of the 8 offsets and a for loop) if you make your map a 1-dimensional array [90000].

 

Fixed size HeapQ for the open list (also doing pointer math to walk nodes of the tree -- a systematic power of 2 progression for the binary tree). Max depth shouldn't need to be sized to worst case, as you can have them fall off the bottom, and IF they ever become candidates again they can be re-added to the open list.

 

use flags ON the window A* grid map (overlay) nodes for your closed list

 

If you are still doing linked lists then absolutely use node pools instead of costly allocate/deallocate overhead

 

crush your map/open list data as much as possible to maximize cache hits

(a 300-coord size calls for 16 bit ints...) use bitmaps for flags, avoid floats if you can for weight values, and try to use bytes/int16.

all needed A* data is extracted from the real world map this 300x300 window is overlaid on (yanking it from chunks at that time)

 

Eliminate the if-then X Y edge-of-map tests done when getting open list candidates by oversizing the map by +2 and pre-marking the boundary lines as CLOSED (flagged) - the map edges will now be excluded as part of the normal closed node test
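a sketch of that border trick, with assumed names (state, SPAN, the state enum are illustrative): oversize the grid by 2 and pre-close the ring of border cells once, so neighbor expansion never needs range checks:

```c
#define W 300
#define SPAN (W + 2)   /* 302x302: one closed cell of padding on every side */

enum { NEITHER = 0, OPEN = 1, CLOSED = 2 };

static unsigned char state[SPAN * SPAN];

/* pre-mark the boundary ring as CLOSED; done once, before any search */
void init_border(void)
{
    for (int i = 0; i < SPAN; i++) {
        state[i] = CLOSED;                      /* top row */
        state[(SPAN - 1) * SPAN + i] = CLOSED;  /* bottom row */
        state[i * SPAN] = CLOSED;               /* left column */
        state[i * SPAN + (SPAN - 1)] = CLOSED;  /* right column */
    }
}
```

after this, a neighbor that would fall off the real 300x300 map just looks like a closed node and is skipped by the ordinary membership test.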

 

Finish the current pathfind task - don't timeslice multiple A* pathfindings

 

if precalc'd edge cost values can be stored in the real world map nodes, then do that (memory is cheap), and make them contiguous so they can be copied/extracted in a block into the A* working data

 

IF additional map data needs to be extracted/examined BUT the number of nodes this is required for is largely culled down, THEN that data doesn't need to be extracted and put into the A* working data up front (less initial copying AND smaller A* working data...)

 

figure out how to maximize reuse of the same A* working data for multiple objects

 

---

 

Something they rarely include in A* speed calculations is the setup time of the working data you will use (interpreting/extracting from the world map, and resetting between A* runs)

 

 

 


Thanks wodinoneeye, some helpful tips in there for everyone.

 

One of the other ones I like is using an integer instead of a boolean for closed.  Each A* search has an integer, a simple counter, and if the current search != the number on the tile, then it's open.  It means you never have to go back and do a clear; you just increment the search counter.
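a sketch of that search-counter trick (names like closed_in_search and begin_search are illustrative): instead of wiping a closed flag for all 90,000 nodes before each run, each node stores the id of the search that closed it, and a new search just bumps the counter:

```c
#define W 300

static int closed_in_search[W * W];  /* zero-initialized once at program start */
static int current_search = 0;       /* bumped before each A* run */

void begin_search(void) { current_search++; }

void mark_closed(int node) { closed_in_search[node] = current_search; }

/* a node is closed only if it was closed during THIS search */
int is_closed(int node) { return closed_in_search[node] == current_search; }
```

with a 32-bit counter the wrap-around mentioned below is billions of searches away, so the periodic marker wipe is mostly a concern for byte or int16 markers.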


>> Something they rarely include in A* speed calculations is the setup time of the working data you will use (interpreting/extracting from the world map, and resetting between A* runs)

 

that's where i'm at now.

 

nested for loops to find the lowest open in the grid clocked in at 11K ms or 6K ms, depending on whether x or z was the outer loop. odd thing was, "for x, for z..." was faster than "for z, for x...", when accessing node[x][z]. i thought it was supposed to be the other way around.

 

needless to say, floats for f, g, and h ran much slower than ints.

 

implementing the bucket sort on f cut that down to about 300 ms.

 

but now i'm getting times of around 300 ms whether i limit the search to a range of 10 or 300 (the goal is 271 nodes away).

 

so i expand 100 nodes (max range 20) and it's around 300 ms, or i expand 7000 nodes (max range 300) and it's still around 300 ms.

 

so it's looking like the setup is the major overhead.

 

i made the lists in the bucket sort insert at end, and swap with last on remove. so no more iterating through lists to add and remove.

 

get lowest open still iterates through the buckets to find the first bucket with num_entries > 0, then returns the first entry in that bucket's list. i'm currently using 400 buckets. f values are ranging from 0 to a little over 300 at the moment, but theoretically could go higher (400-500). the list size in each bucket is a whopping 10,000 entries! so the index is tens of meg in size overall.

 

so add to open consists of setting the astar map node to state=open, calculating f, g, and h, and adding it to the index. adding to the index is simply: entry[num_entries] = new data, num_entries++.

 

add to closed: just set the state to closed, and remove from the index. remove from index requires iterating through the bucket list to find the entry. using an active field, with no swap-with-last on remove and insert at first inactive, the index and astar map could be doubly linked (the map holds bucket & entry #, and the bucket entry holds the astar map coords). this would mean the iteration for remove-from-index could be avoided - replaced with an iteration for insert-at-first-inactive. not sure if that would help though....
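there is a variant worth sketching: keep the swap-with-last removal AND the map-to-index back-pointers, which makes removal O(1) with no iteration at all (this combines two of the ideas above rather than following either exactly; Entry, Bucket, MapNode and the tiny sizes are illustrative):

```c
typedef struct { int x, z; } Entry;

typedef struct { int count; Entry entry[16]; } Bucket;

typedef struct { int bucket_f, slot; } MapNode;  /* back-pointer into the index */

static Bucket  bucket[8];
static MapNode map[4][4];                        /* tiny grid, just for the sketch */

/* append at end and record where the entry landed */
void index_add(int f, int x, int z)
{
    Bucket *b = &bucket[f];
    b->entry[b->count].x = x;
    b->entry[b->count].z = z;
    map[x][z].bucket_f = f;
    map[x][z].slot = b->count++;
}

/* O(1) removal: swap the last entry into the hole, fix its back-pointer */
void index_remove(int x, int z)
{
    MapNode *m = &map[x][z];
    Bucket *b = &bucket[m->bucket_f];
    Entry last = b->entry[--b->count];   /* pull the last entry */
    b->entry[m->slot] = last;            /* fill the removed entry's slot */
    map[last.x][last.z].slot = m->slot;  /* repair the moved entry's back-pointer */
}
```

the cost is one extra write per add and per remove, instead of any list iteration.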

 

get lowest open iterates through the buckets til it finds the first non-empty one, then returns that bucket's first entry.

 

so the next step is to put the timers back on it, and see where that 300 ms setup overhead is coming from.

 

it may turn out that while an all-in-one grid type list for open/closed/neither runs fast, it's slow to set up. you have to mark all 90,000 nodes as neither before starting.

 

i'm already checking min and max f values to get an idea of the number of buckets i'll need. i plan to add metrics so i can track just how big the entry lists for buckets get. i sure didn't expect 10,000 nodes at the same f value.

 

i have a feeling that it's one of those algos where one of the steps is just plain slow. you can switch around which of the half dozen or so steps that is, but in the end, one will run slow (iteration) and the others fast (simple assignment).

 

if, as i suspect, the grid implementation runs fast but has high setup costs, i will then look for an implementation with lower overhead, such as the traditional two-list method. the grid basically keeps three lists, which requires everything to be put in the "neither open nor closed" list as part of setup.

 

eventually i'll get down to the implementation that has the lowest overall cost, or some number of possible implementations with relatively equal costs.

 

 

one thing i don't know is what kind of performance i should expect. right now my best performance is something like 7000+ open nodes processed in ~300+ ms (about 23 open nodes per ms). if you consider each open node processed to actually process the node plus its eight neighbors, i'm processing 63,000+ total nodes in 300+ ms (about 210 total nodes per ms).

 

another thing i found interesting was that the first time i run the test i'll get high times, like 11K ms. but if i then run it again, i get 300 ms. i assume this is due to L2 cache load times for the first run.

 

this testing has also pointed out that collision map generation could stand some optimization. i no longer include that in the astar time required. under normal circumstances, the map would already be generated and sitting in the cache, ready to be used.


>> nested for loops to find the lowest open in the grid clocked in at 11K ms or 6K ms, depending on whether x or z was the outer loop. odd thing was, "for x, for z..." was faster than "for z, for x...", when accessing node[x][z]. i thought it was supposed to be the other way around.

 

why? a[3][3] would be laid out like this afaik:

 

a[0][0], a[0][1], a[0][2], a[1][0], a[1][1], a[1][2], a[2][0], a[2][1], a[2][2]

 

So if you are incrementing the second coordinate in the inner loop you are more likely to reuse data from the cache; if you are altering the first coordinate in the inner loop you will get cache misses all the time with an array as big as yours, no?
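to make the layout argument concrete: in C, node[x][z] is row-major, so "for x outer, for z inner" walks memory sequentially, which is consistent with the faster timing reported above (it is not actually the odd result). a minimal sketch, with assumed names:

```c
#define N 300

static int node[N][N];

/* inner loop walks contiguous memory: node[x][0], node[x][1], ... */
long sum_cache_friendly(void)
{
    long sum = 0;
    for (int x = 0; x < N; x++)
        for (int z = 0; z < N; z++)
            sum += node[x][z];
    return sum;
}

/* inner loop strides N ints per step: node[0][z], node[1][z], ... */
long sum_cache_hostile(void)
{
    long sum = 0;
    for (int z = 0; z < N; z++)
        for (int x = 0; x < N; x++)
            sum += node[x][z];
    return sum;
}
```

both compute the same result; only the memory access pattern (and thus the cache behavior) differs.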


Thanks wodinoneeye, some helpful tips in there for everyone.

 

One of the other ones I like is using an integer instead of a boolean for closed.  Each A* search has an integer, a simple counter, and if the current search != the number on tile, than it's open.  It means you never have to go back and do a clear, you just increment the search counter.

 

I forgot about that one - you have to do a marker wipe across the whole map grid once in a long while (after 255 searches if a byte, or 64K-1 for int16. need to speed test whether a byte is better because of the smaller mem footprint, or worse because of the extra instructions to insert the byte value into a word - i would hope it's not the same for 16-bit vs int32)

(you eventually loop back around to the same marker value again - you cannot depend on intervening activity to always change the map array markers...)


 

>> nested for loops to find the lowest open in the grid clocked in at 11K ms or 6K ms, depending on whether x or z was the outer loop. odd thing was, "for x, for z..." was faster than "for z, for x...", when accessing node[x][z]. i thought it was supposed to be the other way around.

>> why? a[3][3] would be laid out like this afaik:

>> a[0][0], a[0][1], a[0][2], a[1][0], a[1][1], a[1][2], a[2][0], a[2][1], a[2][2]

>> So if you are incrementing the second coordinate in the inner loop you are more likely to reuse data from the cache; if you are altering the first coordinate in the inner loop you will get cache misses all the time with an array as big as yours, no?

 

of course when you use 2 nested loops you now need to skip the center 0,0 offset, which is an extra if test in the inner loop

 

 

 

 

// Single loop offset + pointer math method:

const int span = 300 + 2;        // maparray will have the outer edges pre-flagged as CLOSED
NODE maparray[span*span];        // the 302x302 array flattened, with the extra boundary edge
const int offset[8] = { -1-span, -span, 1-span, -1, 1, -1+span, span, 1+span };  // 8 neighbors of a grid square

// open list HeapQ stores the candidates as this single index value

uint32 pos = getTopOfHeapQ();    // pos is x + y*span, the current node checking for open list candidates

NODE *ptr = &maparray[pos];      // single indexed offset (this is now the current node ptr)

for (int d = 0; d < 8; d++)
{
    NODE *candidate = ptr + offset[d];   // pointer arithmetic

    // NOTE - no edge test logic is required, as no open list index will ever be on the pre-closed edges of maparray

    if (candidate->closedflag) continue; // skip candidates already closed

    // you may have to adjust edge cost here, as 45 degree (NW NE SE SW) distance is more than 90 degree (N S E W) distance

    insertHeapQ(pos + offset[d]);        // sorted insert of candidate into the open list

} // end loop

 

 

you can order that offset array any way that best gets cache hits (probably y major and x minor, matching the layout of your grid maparray).


where's selective quote?!?!?!?!?  <g>

 

>> of course when you use 2 nested loops you now need to skip the center 0,0 offset, which is an extra if test in the inner loop

 

i use the nested for loops for the non-indexed get_lowest_open routine.

 

at first i was using them for the expand_neighbors routine as well. and yes, i forgot to omit x+0,z+0. weird thing was, it seemed to run faster when you expanded the parent as well as the neighbors, although off the top of my head that doesn't make sense. it would just skip it, or re-open it with a higher g value - i think. now i explicitly call expand_neighbor(x,z,...) for each neighbor.

 

>> // NOTE - no edge test logic is required as no open list index will ever be on the pre closed edges of maparray

 

that will save me x and z range checks. 4 if statements. every little bit helps.

 

>> if (candidate->closedflag)   continue;  // skip candidates already closed

 

don't closed nodes have to be removed from the closed list if a shorter path to them is found? i doubt i'll be dealing with the ideal conditions (a consistent heuristic) that let you omit that step.

 

>> // you may have to adjust edge cost here  as 45 degree (NW NE SE SW) distance is more then 90 degree (N S E W) directions

 

i've already tried both ints with a cost of 1 for all directions, and floats with costs of 1.0 (h and v) and 1.414 (diag). 1's and ints work fine. i've tried BBox distance, euclidean distance, and distance squared for h (with no scaling correction for g, such as g squared). so far BBox and ints with a cost of 1 in all directions is working fine, and running the fastest.


ok, i slapped some timers on the setup code.

 

the results are quite interesting:

 

test #1: path from the SE to SW corner of the collision map one collision map north of the campsite of the long term play test game's band:

1271 nodes (calls to get lowest open)

time for A* main loop: 4 ms

f ranges from 0 to 245

setup time: 1056 ms (1st run). 353 ms (subsequent runs).

path length 245

317 nodes/ms

 

test #2    path from NE to SW corner of the collision map with the campsite of the long term play test game's band.

4778 nodes

17 ms

f= 0-306

path len = 305

setup = 341 ms (1st time) 332 ms (subsequent runs).

281 nodes per ms.

 

note that the second test was run after the first, so setup times were lower, since the astar map and/or index  was already at least partially in the L2 cache.

 

looking at the setup code.....   ( c = call <func>  <parms>.   cr = call with return <func>  <ret_value>  <params>.   )

 

c Zstarttimer 9
c init_astar
= astar_path_length 0
= astarmap[start_x][start_z].state ASTAR_OPEN
= astarmap[start_x][start_z].g 0
cr astarmap[start_x][start_z].h astar_dist start_x start_z goal_x goal_z
= astarmap[start_x][start_z].f astarmap[start_x][start_z].g+astarmap[start_x][start_z].h
c astar_index_add astarmap[start_x][start_z].f start_x start_z
= quit 0
= count 0
= endx goal_x
= endz goal_z
cr t2 Zelapsedtime 9
 

 

init_astar is two calls to ZeroMemory, one for the map, and one for the index.

 

astar_dist is just an int BBox range function. IE: greater of abs(dx), abs(dz).

 

astar_index_add is just index[f].list[num_entries]=new data, num_entries++.

 

and everything else is simple assignment statements.

 

the map is 300x300 x 6 ints x 4 bytes per int = 2.16M bytes.

 

the index is 400 buckets x 1 int x 4 bytes + 400 buckets x 10,000 entries x 2 or 3 ints (don't think i need active any more) x 4 bytes, for a total of 1600 + 32M bytes or 48M bytes.

 

so as expected, the bucket sort is very fast, and takes huge amounts of ram. one unforeseen consequence of this huge size appears to be the cost of initializing that memory.

i have yet to bracket it with timers, but it's already rather obvious which calls are to blame (the memsets of the 2.16 meg map and the 48 meg index). but of course, i'll bracket it to make sure.

 

so it would appear that a map and bucket sorted index implementation which is designed to speed up the individual operations (add, remove, get lowest) works well, at the cost of very high ram requirements, and unexpectedly high setup costs.  while a game like Caveman with a relatively small working set (750M ?) might be able to handle the ram hit, the setup costs are prohibitive.

 

i'll take a look and see if setup costs can be reduced, but i doubt it.  

 

some other approach is most likely called for. perhaps the combined open and closed list as described by Amit. if you start with (an) empty list(s), instead of a full list of everything (2d map implementation), you don't have to init anything, other than perhaps num_entries=0.

 

so i'll probably take a second look at the algo, taking setup time into account, and see what data structures look best. if nothing jumps out at me, i'll just go with the traditional 2 list or Amit's one list implementation as the next thing to prototype. as i recall, a binary heap for the open list is part of the usual optimizations for the traditional two list implementation.

 

one thing i'm not accounting  for yet is ties. i'm just going with first found at a given f.

 

from running the tests it also looks like jump point search (jump point A*) might be applicable. originally i didn't think there would be large enough open spaces, but i wasn't thinking in terms of diagonals.


 

>> Thanks wodinoneeye, some helpful tips in there for everyone.

>> One of the other ones I like is using an integer instead of a boolean for closed. Each A* search has an integer, a simple counter, and if the current search != the number on the tile, then it's open. It means you never have to go back and do a clear; you just increment the search counter.

>> I forgot about that one - you have to do a marker wipe across the whole map grid once in a long while (after 255 searches if a byte, or 64K-1 for int16)

>> (you eventually loop back around to the same marker value again - you cannot depend on intervening activity to always change the map array markers...)

 

 

Yeah, good point, though easily remedied with a check in the same place the increment is done. Not sure what is best size-wise; probably depends a lot on how big your node already is.


Well, i'm close to a full working implementation in the game. still need to grow obstacles by the critter's collision radius, and add the code to move from one collision map to the next once you path to an edge node. the code is not fully optimized yet. good enough for Wolfgang to follow Shana around, but may not be fast enough for 500 critters without at least round-robin. i'll post a complete postmortem later. surprisingly few optimizations got things running quite fast. nothing complex like heaps or anything like that. but i did use one or two of the suggestions here.

 

really, i just wanted to say that ApochPiQ nailed it in the first place, when he made this call:

 

>> You're not talking about vast data sets here nor are you talking about having extremely complex cost functions. A vanilla A* implementation done correctly is more than fast enough to handle your case.

 

a big +1 for that.

 

and you know i don't give away my +1's lightly.


http://www.codeproject.com/Articles/118015/Fast-A-Star-D-Implementation-for-C

This guy here wrote an extremely fast heap priority queue.

Let's try it.

It runs very fast in my program

 

Looking at the large grid map (in that article) with all those zigzags: who besides a computer scientist would think something real has as detailed a set of terrain information to be able to employ such detailed pathing? (a Mars Rover, where it's GAME OVER if the thing gets stuck/rolls over - and even then the info is quite spotty to be so precise)

 

Anyway, animals have paths they repeat, and repeated exploration to find their 'good enough' paths to things they routinely do.

Really, coarse pathfinding around blocking rivers and mountain ridges/swamps, done at a fairly coarse scale (for long range movements), is the closest they do for such distances, with close range exactness for immediate (next 5 minutes) movement being where any precision exists. At the large scale, in their tiny brains it is more an irregular network of interconnected node adjacencies of regions having certain properties (and more so the resources in them - the goal info)

 

So really it's an organic mapping with very rough costs (more 'can I get through' or not, or 'do I want to even attempt it'), and then relying on fairly short range movements for stepwise precision. Solving a maze when you can't even really see the maze doesn't need to be attempted.

 

For the 'out of sight' beasty AI nature of the game being described it can be very roughly done.

 

---

 

SO with fairly short paths still needing efficient pathing (particularly at close range, and likely in a dynamic environment requiring frequent repathings over short intervals): the HeapQ i used for the open list used pointer math, because of the tree relations between the nodes, where every parent or left/right sibling was a fixed mathematical offset within the data structure (so when you add new candidates at the top and displace downward, and pull the top and promote upwards, it was very fast processing)

 

 

---

 

I do also remember at that time thinking that for multitudes of pathing objects using similar paths between resources, developing common routes by some monte carlo samplings between resource areas would allow reuse of a much simpler individual system of paths. These are already determined within the local area via A* to lead to a particular resource (like a waterhole), so once an object reached a node on that precanned route, it was obviously part of an open path to that resource (which could be followed with MUCH simpler logic - drastically cutting down on the pathfinding processing).


postmortem:

 

the open and closed lists are static arrays of structs allocated on the stack (IE local variables).

the number of open and closed nodes in each list is tracked.

init_astar_lists is simply setting num_open_nodes and num_closed_nodes to zero.

the lists use insert at end, and swap and dec count to remove.

the open list stores both x,z, and an id=z*collision_map_size+x

the main function is:

 

init_astar_lists

while !quit

     get_lowest_open &id

     move_to_closed id

     expand_neighbors id

     .

extract_path

 

get_lowest_open does a simple for loop search of the open list array for the index with lowest f
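a minimal sketch of that linear scan (OpenNode, open_list, and the array size are illustrative names, not the game's actual code); it is O(n), but n is only the current frontier size, not the whole grid:

```c
typedef struct { int f, x, z, id; } OpenNode;

static OpenNode open_list[4096];   /* assumed capacity */
static int num_open_nodes = 0;

/* simple for-loop search of the open list array for the index with lowest f;
   returns -1 if the open list is empty */
int get_lowest_open(void)
{
    int best = -1;
    for (int i = 0; i < num_open_nodes; i++)
        if (best < 0 || open_list[i].f < open_list[best].f)
            best = i;
    return best;
}
```

combined with insert-at-end and swap-with-last removal, every open list operation except this scan is O(1).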

 

in expand_neighbors i unrolled the nested for loops into eight explicit calls to expand_neighbor.

 

in expand_neighbor i combined the atomic operations such as add, remove, membership check, etc.

 

i left in the check for neighbor off edge of collision map, as opposed to using a border of closed nodes. it seemed to me that a border of closed nodes would increase the size of the closed list and therefore slow down membership tests. but this is only due to my implementation method. for some other method with faster membership tests, a border of closed nodes may be faster.

 

so expand_neighbor ended up looking like this:

 

if off_edge return

if in_closed_and_not_removed return

if in_open_and_not_removed return

add_to_open
 
in_closed_and_not_removed does the membership test. if the test succeeds and the node should be removed, it does the remove. it then returns a result, saying what it did.
in_open_and_not_removed works in a similar manner.
 
ints are used for f, g, and h, and h is just BBox distance to goal. i tried floats with sqrt(2) for diagonal cost and true 2d distance; it's just slower.
 
the parents are stored in a 2d array the same size and scale as the collision map. and there's no need to initialize it.
 
extract_path pulls the path from the 2d array and stores it in a 1D path array, which is then copied to the entity's path array.  it could be modified to extract directly to an entity's path.
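a sketch of extraction from a parent array (Cell, parent, and extract_path are assumed names, not the game's actual code); walking parents from the goal back to the start is why the extracted path comes out in reverse order:

```c
#define W 300

typedef struct { int x, z; } Cell;

static Cell parent[W][W];   /* written during expansion; no init needed */

/* walk the parent links from goal back to start, writing into path[];
   returns the number of cells written: goal first, start last */
int extract_path(Cell start, Cell goal, Cell *path, int max_len)
{
    int n = 0;
    Cell c = goal;
    while (n < max_len) {
        path[n++] = c;
        if (c.x == start.x && c.z == start.z) break;
        c = parent[c.x][c.z];
    }
    return n;
}
```

the caller can either reverse the array, or simply index it from the end when following the path.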
 
the collision map system was modified to handle collision maps at arbitrary scales. collision maps are now generated at different scales, based on the diameter of the critter in question. so critters are always one square in size. obstacle radii are grown by 1 to account for critter radius when going around corners.
 
Astar uses the collision maps for passable/impassable node checks.
 
before astar actually runs, it does a number of checks:
 
1. if the start is in impassable terrain, it finds the closest open node and does LOS movement to there. this check may be unnecessary.
 
2. if the goal is in impassable, it finds the closest open node to the goal and does Astar movement to there. this can occur if the goal is a critter who seems to be in impassable terrain due to the collision map scale vs their size. IE a human of diameter 2 can move to what appears to be inside a rock on a collision map at a scale of 6.
 
3. it then gets the goal for the current collision map, based on the ultimate goal. this is basically a raycast to the edge, followed by an outwardly expanding edge search for open nodes on both this map's edge, and the adjacent map's edge. if the ultimate goal is in the current collision map, it simply returns the ultimate goal as the current goal.
 
4. if it cannot find an open edge node for the current goal, it tries LOS. this case will only occur if the entire edge of a collision map is impassable, IE it's ocean or impassable mountains. in such cases you're more or less screwed anyway. i mean, if your target takes off on a raft across the ocean, there's not much you can do about it. especially if you're an animal that can't build a raft and give chase.
 
5. if the start == the current goal, either it's at the ultimate goal or at the edge of a collision map; either way LOS is used to get the rest of the way, or to the next collision map.
 
if none of the above conditions is true, it runs astar out to a range of 50 feet. this is far enough to get the critter moving in the correct direction around large obstacles, even if it doesn't path all the way past them.
 
when following the path, if they are more than one node distant from node zero, they goto node zero, else they goto node 1.
 
optimization of real-time collision map generation at arbitrary scales may be called for.
 
further optimization of astar itself may be called for with a large number of active critters at once.
 
i'm currently running A* every update, which is probably unnecessary.  it should only have to be run when the goal changes, or when they exhaust the nodes in their current path, or when they deviate drastically from the path for some reason.
 
notes:
 
1. don't forget that extracted paths are in reverse order.
 
2. collision maps at arbitrary scales meant i was dealing with three coordinate systems: world, collision map, and scaled collision map.  this led to a bit of confusion as different parts used different coordinate systems. in the end i converted everything from world to scaled collision map coordinates, did all the astar stuff, then converted the results back to world coordinates.
 
3. i was quite surprised how just a few things like insert-at-end, swap and dec, unrolling the loops in expand neighbors, and especially combining the atomic calls in expand neighbor really made a big difference in performance. all my times are zero milliseconds now.  i'm down to having to measure astar execution time in clock ticks.

postmortem update:

 

i switched collision checks back to using high resolution collision maps (1 square is 1 foot across). Astar continues to use collision maps where width of a square = critter diameter.

 

i suspect a map with a minimum resolution of "width of one square = critter radius" is required for "pixel perfect" BBox collisions.

