# Planning Methods

This topic is 3007 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

##### Share on other sites
I always recommend Steve LaValle's Planning Book (free online).

##### Share on other sites
STRIPS is always a good place to begin.

##### Share on other sites
Hi, thanks. I've been looking into STRIPS but I'm not sure how to represent tasks or the goal state, as there will be lots of numbers and not many predicates. There is no special goal state, just to maximise the reward value. How do I represent things such as time using STRIPS?

##### Share on other sites
Sounds like a job for dynamic programming. Sort of like a knapsack problem, but the knapsack size is the amount of time left. If new tasks can be added arbitrarily, you could assume some sort of probability distribution on new tasks and incorporate the expected value into the problem.

##### Share on other sites
Quote:
 Original post by CzarKirkHi, thanks. I've been looking into STRIPS but I'm not sure how to represent tasks or the goal state, as there will be lots of numbers and not many predicates. There is no special goal state, just to maximise the reward value. How do I represent things such as time using STRIPS?

Usually goal states are just those for which a certain predicate evaluates to true.

I'm actually not used to thinking of reward values in STRIPS, but of course if you want to think of things this way, one way to encode the goal states is to just make the reward in non-goal states "minus-infinity."

Once you include the "frame axioms" STRIPS essentially become a state-space search which you can solve with BFS/DFS/A*, with each state being the binary vector of the different values of predicates (every predicate evaluates to either true or false). Traditionally, this vector is represented as a list from which things are added or removed, but from my point of view, that's just a sparse vector representation.

##### Share on other sites
Here is some crazy idea I've been mulling over. The most successful modern computer-go programs use something called Monte Carlo Tree Search, and I think this could be applied to planning in most games. The basic idea is to have some very fast model of what agents would do in any given circumstance. These models should involve some randomness. So if you have a symbolic representation of your game (which you probably need for any type of planning), you can just simulate forward what's going to happen, Monte Carlo style. At the end of the simulation (you have to decide somehow when that is), you look at some utility function that tells you how happy each agent is with the result. You will then run many such simulations and, among the actions that an agent has available right now, you'll pick the one that is at the beginning of simulations with the highest expected utility for the agent. Well, so far I just described the Monte Carlo part. The "Tree Search" part comes from accumulating statistics about the utility of actions in positions early in the simulations, which we will visit many times if we run thousands of simulations. You can then use that information to do better than using your simplistic probabilistic model for reacting in that particular situation.

I love this technique because it allows the agents to reason about what adversaries might do in reaction to their own actions, it allows for collaboration (if agents have similar goals), it allows for random outcomes (if I shoot at the guard from a distance, some fraction of the time I'll kill him, and if I don't he'll be alerted, and the simulations can capture that)... It just seems really flexible and it has proven to be very powerful in computer go, which is a really hard problem.

EDIT: Too bad I can't show you a working example of a video game where this technique works great. My job gets in the way of my hobbies... ;)

[Edited by - alvaro on March 24, 2010 10:44:34 AM]

##### Share on other sites
Alvaro, that sounds interesting and definitely feasible. The down side is that simulation/prediction can be very expensive...

Do you have any links to papers about applying the Monte Carlo tree search to Go?

##### Share on other sites
Actually, searching for `Monte Carlo Tree Search' you get some decent papers, including this one that shows that applying this technique to video games is not as original an idea as I thought. :)

The people who created MoGo wrote a good explanation of what it does. That's where I learned a lot of how UCT (the original MCTS algorithm) works.

##### Share on other sites
Alvaro, if I understand you, by doing that, you would turn the AI into a reactive planner. And then you would lose one of the main advantages of using planning, which is consistency of actions. Because you dont "keep" a plan, there is no way to make sure that chained actions are consistent. Total war games have a big problem with this on their world map AI...

1. 1
2. 2
Rutin
19
3. 3
4. 4
5. 5

• 14
• 12
• 9
• 12
• 37
• ### Forum Statistics

• Total Topics
631428
• Total Posts
3000027
×