A* Implementation

Started by
8 comments, last by kirkd 16 years, 7 months ago
Below is a basic implementation of A* that I've put together. I would appreciate any critiques. I haven't included all the details - just those associated with the A* search itself. -Kirk

bool AStar::Search(SearchNode *root)
{
    bool found = false;
    
    SearchNode *currentNode;
    unsigned int ID;  //each searchNode has a UID that I can use to track it in the open/closed lists
        
    vector<SearchNode *> v, children;  //v is used to keep the current nodes of interest -
					//effectively an open_list that I can easily sort
					//priority_queue did not seem to work
					//children is used to retrieve the next layer of the search     
    
    map<unsigned int, SearchNode *> closed_list, open_list;  //I used a map since each SearchNode has a UID
      
    v.push_back(root); //the root is the starting node
    open_list[root->GetID()]=root;

    while (v.size() !=0 && found==false)
    {
        //sort the nodes available
        sort(v.begin(), v.end(), CompareAStar);  //My compare function sorts by F()

        //get the cheapest node so far
        currentNode = v.at(0);

        //is this it??
        if ( currentNode->Goal() ) //returns true if we're at the goal node
        {
            found=true; //yes!!
            cout << "Found it!" << endl;
        }
        else
        {
            //move this node from the open_list to the closed_list
            ID = currentNode->GetID();
            closed_list[ID] = currentNode;
            open_list.erase(ID); //logarithmic complexity in the size of the map - not sure how to get around this
            v.erase(v.begin());

            //no - get the next layer of nodes
            children = currentNode->GetChildren();

            for (unsigned int x=0; x<children.size(); x++)
            {
                ID = children.at(x)->GetID();
                //make sure it isn't already in the open/closed list
                if ( open_list.find(ID) == open_list.end() &&
                     closed_list.find(ID) == closed_list.end() )
                {
                    //not in either list, so add it to the open list and push it into the stack
                    open_list[ID] = children.at(x);
                    v.push_back(children.at(x));
                }
                else
                {
                    //it was already in one of the lists, so we don't care
                    delete children.at(x);
                }
            }
        }
    }

    if ( !found )        
        cout << "No solution found.";

    return found;
}



[Edited by - kirkd on September 26, 2007 10:12:13 PM]
Advertisement
1)"found==false" ?

Come on, use "!found". Using the == operator on booleans is just silly.

2) I suspect that "GetChildren" performs dynamic allocation of nodes, and I see that you dynamically delete them. This is a no-no: you want all your memory to be pre-allocated.

3) An improvement would be to make your AStar class a template class. A good A* template class can be a powerfull tool for all sorts of problems. It even helped me win a programming contest.

4) I would ditch the "found" variable, and loop against a maximum number of iterations instead. Break out when you find a solution, and fail inside the loop if v is empty (or the opposite).
Erasing the first element of a vector is an O(n) operation. You could easily sort the open list in the reverse order and then erase the last element, or use a deque instead of a vector. I don't see any reason why a priority queue wouldn't work, as long as it used your sorting function to sort the pointers appropriately.

To ensure optimality, when a node is found that is already on the open list the costs must be compared. It is possible to find a lower cost path to the node than was found when it was first placed on the open list. Locating the original node in the priority queue can be tricky; an easy solution is to flag the node to mark that it is invalid and then examine this flag when getting the lowest cost node.

Instead of using UIDs, the nodes can implement comparisons and/or hashing so they can be directly used as the keys in maps and sets. Internally the nodes can compare UIDs, if that makes sense.

The memory management is inconsistent. The function sometimes deletes nodes when they aren't needed, but otherwise they aren't deleted.

How does the caller retrieve the solution that was found?
Quote:Original post by Steadtler
1)"found==false" ?

Come on, use "!found". Using the == operator on booleans is just silly.


I'm sure that'll boost the overall performance dramatically.

Quote:
2) I suspect that "GetChildren" performs dynamic allocation of nodes, and I see that you dynamically delete them. This is a no-no: you want all your memory to be pre-allocated.


Excellent point. The problem is that in this case, for the search I'm doing, I cannot expand the entire search tree up front. The test case I'm working on if finding optimal solutions for the Rubik's cube, and those have to expand in process.

Quote:
3) An improvement would be to make your AStar class a template class. A good A* template class can be a powerfull tool for all sorts of problems. It even helped me win a programming contest.


Good idea. As I mentioned, this is a basic implementation. Templates or class hierarchies will come later.

Quote:
4) I would ditch the "found" variable, and loop against a maximum number of iterations instead. Break out when you find a solution, and fail inside the loop if v is empty (or the opposite).


Also a good idea.


Quote:Original post by kirkd

Excellent point. The problem is that in this case, for the search I'm doing, I cannot expand the entire search tree up front. The test case I'm working on if finding optimal solutions for the Rubik's cube, and those have to expand in process.



You still can allocate a sizable chunk at first, and allocate new blocks on demand if you run out of nodes. Especially if you go with a maximum number of iterations, then you should be able to put an upper bound on the number of nodes. And about that !found, dont underestimate the power of beautiful code to run faster. Ugly code tend to rust over time...
Quote:Original post by Vorpy
Erasing the first element of a vector is an O(n) operation. You could easily sort the open list in the reverse order and then erase the last element, or use a deque instead of a vector. I don't see any reason why a priority queue wouldn't work, as long as it used your sorting function to sort the pointers appropriately.


I like the reverse sort idea. I may try that one. As for the priority_queue, I could not get it to work with my comparison function. I tried a number of variations, and even cut-and-pasted code from the internet, but I could never get priority_queue to accept any sort of comparison function. (I'm using g++ 3.4.5, so surely it isn't the implementation.)

Quote:
To ensure optimality, when a node is found that is already on the open list the costs must be compared. It is possible to find a lower cost path to the node than was found when it was first placed on the open list. Locating the original node in the priority queue can be tricky; an easy solution is to flag the node to mark that it is invalid and then examine this flag when getting the lowest cost node.


You are very correct. In this case that will not happen - optimal solutions for Rubik's Cube. If I find a node further down in the search tree, it is always higher cost.

Quote:
Instead of using UIDs, the nodes can implement comparisons and/or hashing so they can be directly used as the keys in maps and sets. Internally the nodes can compare UIDs, if that makes sense.


Not sure if I follow you there. How would that be different from an explicity UID?


Quote:
The memory management is inconsistent. The function sometimes deletes nodes when they aren't needed, but otherwise they aren't deleted.

How does the caller retrieve the solution that was found?


Both of those details are in the set of "left out" details. I do take care of cleaning up all the references at the end of my real function. As Steadtler mentioned, children are dynamically allocated by a different function - a no no.

For the solution, I construct the solution at the end of the search and store it within the object. The caller can use a GetSolution() call.

-Kirk
Quote:Original post by kirkd
Quote:
Instead of using UIDs, the nodes can implement comparisons and/or hashing so they can be directly used as the keys in maps and sets. Internally the nodes can compare UIDs, if that makes sense.


Not sure if I follow you there. How would that be different from an explicity UID?


The UIDs are either stored in the node or they are generated when needed. Either way, unless they are the representation used internally by the nodes or they would be generated anyway for some other purpose, they are adding memory and/or processing overhead. Implementing comparisons to define an order based on the internal state of the node could be more efficient.
Quote:Original post by Vorpy
Quote:Original post by kirkd
Quote:
Instead of using UIDs, the nodes can implement comparisons and/or hashing so they can be directly used as the keys in maps and sets. Internally the nodes can compare UIDs, if that makes sense.


Not sure if I follow you there. How would that be different from an explicity UID?


The UIDs are either stored in the node or they are generated when needed. Either way, unless they are the representation used internally by the nodes or they would be generated anyway for some other purpose, they are adding memory and/or processing overhead. Implementing comparisons to define an order based on the internal state of the node could be more efficient.



Yes, that makes much more sense. Thank you.

The UIDs that I'm using are essentially the internal state of the cube converted to an unsigned integer. The UID can be used to reproduce the exact state of the cube associated with that node, or for the purposes of comparison (any particular state only produces one UID - symmetries, etc. are taken into account). They are certainly more than just simple identifiers.
So in the actual code, are the UIDs stored as strings or as integers? Storing them as integers (possibly more than one, if they don't fit in 32 or 64 bits) would be more efficient than as strings.
They are stored as unsigned integers. The line "string ID" is wrong. I added it when I stripped down the code for posting here. I'll go back and fix it now. I'm sorry for the confusion, and I've fixed the code.

This topic is closed to new replies.

Advertisement