Decision theory, high risk scenarios, etc


I have a system where several agents have to be arranged into groups to attempt certain abstract tasks, for which a fixed reward is on offer to be shared among the group members if successful. The mathematics of this are chosen so that each additional person contributes less towards the chance of success than the previous one. e.g. If one person attempted Task A, he'd have a 50% chance of success; if two people attempted it, the chance would rise to 75%; if three attempted it, the chance would become 87.5%; etc. Since the expected value for an agent participating in the group drops as the group grows bigger, agents would always opt for smaller groups - a 25% chance of $10 is better than a 90% chance of $1, for example.

One factor not yet mentioned though is that these tasks carry risk; failed tasks can be considered dangerous and an agent may be destroyed as a result of that task. Similar to the conclusions of Pascal's Wager, one could consider that any potential reward, no matter how high, compares unfavourably to the chance of death - arguably an infinitely negative penalty - and that therefore the rational decision swings back the other way; all agents should opt for the largest group possible to reduce the risk (assuming they can't just avoid attempting the task altogether). Obviously that is an extreme position, as we daily perform acts with a minuscule but non-zero chance of fatality. But how do we rationalise this?

For gameplay reasons, I don't want the agents to always form the smallest group (to earn the most reward) nor always form the biggest group (to minimise risk), but to come to tradeoffs based on the amount of reward and the amount of risk. Intuitively, this is how I think humans act, but mathematically it's hard to see where those two curves cross over.

Basically my question is: is there a good way to model this sort of decision-making process, where people decide that the chance of a high penalty has been reduced to an acceptable level? Can the penalty in this case be adequately measured, and thus compared directly against the reward?

(Edit: reworded first sentence to remove emphasis on each agent and more on the group.)

[Edited by - Kylotan on February 15, 2008 5:42:59 PM]

Quote:
Original post by Kylotan
Obviously that is an extreme position, as we daily perform acts with a minuscule but non-zero chance of fatality. But how do we rationalise this?

We don't. But if you want a game-theoretical explanation, we place a finite negative utility on death.

Now, I'm a little confused as to the exact mechanics of this game. Do all agents announce their decision to participate or not participate simultaneously, do they announce one at a time, or do they have multiple (or infinite) rounds in which to change their mind? That last category can become extremely hairy, as there's no guarantee that the agents will ever make up their minds.

Apologies if some of the initial post is vague or lacking in information. I wasn't sure what was relevant and what was not, since I'm really learning as I explore this myself. I have a game design and am trying to work out the mechanics to fit it.

A finite negative utility sounds ok, and is what I expected really. But is there some way of picking a decent value for this, possibly based on predicted future rewards or something?

Currently, the manner in which the agents decide to participate in groups is not finalised. At a given point, there are N agents available and M tasks available to perform (though it is possible that none will be performed). The use of the term agent is perhaps misleading, as I had envisaged having an external omniscient system coordinating the grouping of the agents. Assume that this is a genetic algorithm that creates random candidate groups, where the fitness of a group is how much each agent wishes to be part of it. Thus the agents will veto proposed groupings that they do not think are worthwhile, based on their own assessment of the chance of success and the relative costs of success or failure. I just need to work out how to determine those values on a per-agent basis.
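As a rough illustration, here is a minimal random-search stand-in for that coordinator in Python (not a full genetic algorithm; agent.expected_value(group) is an assumed interface, and a negative expected value stands in for a veto):

import random

def propose_group(agents, max_size=10, candidates=50):
    # Generate random candidate groups; a group's fitness is how much its
    # members want to be in it, and any unhappy member vetoes it outright.
    best, best_fitness = None, float("-inf")
    for _ in range(candidates):
        size = random.randint(1, min(max_size, len(agents)))
        group = random.sample(agents, size)
        values = [agent.expected_value(group) for agent in group]
        if min(values) < 0:  # someone vetoes this grouping
            continue
        fitness = sum(values)
        if fitness > best_fitness:
            best, best_fitness = group, fitness
    return best  # None if every candidate was vetoed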

Well, let's put some exactness in, then. Here are the formulae I'm using:

$ = total reward (to be split) from success
-D = punishment for failure (death)
-C = cost of participation, regardless of outcome

p_fail(n) = 1/2^n = probability of failure with n agents
p_succ(n) = 1 - 1/2^n = probability of success with n agents
r_fail(n) = -D - C = reward from failure with n agents
r_succ(n) = $/n - C = reward from success with n agents

E_part(n) = p_succ(n)*r_succ(n) + p_fail(n)*r_fail(n) = expected reward from being the nth agent to participate
E_abst(n) = 0 = expected reward from abstaining

Now, assume that in round 1, each agent in turn announces his decision to participate or not participate. After all have announced, all have a chance to change their minds (again in order). If any do, it can be assumed that the game diverges and there is no solution. If nobody changes their mind, the game succeeds.

From some Matlab scribbling, it looks relatively easy to come up with constants that produce reasonable results. $=200, D=100, C=10, for instance, make it a good bet to be the 19th agent and a marginally bad bet to be the 20th: the 20th agent's share, $/20 = 10, exactly cancels C, leaving only the small expected failure penalty. In this case, the game does not diverge; the first 19 agents sign on, nobody else does, and nobody changes their mind (note that by that time, each participating agent's expected reward is small but positive). If C=0, the solution is again non-divergent, but boring; either everybody signs on, or nobody wants to be the first to sign on.
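A minimal Python sketch of these formulae (the function and argument names are mine, not from the original Matlab):

def expected_reward(n, prize=200.0, D=100.0, C=10.0):
    # E_part(n): expected reward for each member of a group of n agents
    p_fail = 0.5 ** n
    return (1.0 - p_fail) * (prize / n - C) + p_fail * (-D - C)

for n in range(1, 25):
    print(n, round(expected_reward(n), 4))

The printed values turn negative between n = 19 and n = 20, matching the break-even point above.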

If already-participating agents can veto other agents from signing on, the signup simply stops near the maximum. For the values above, that's 1 or 2 agents.

Ok, that's quite helpful. I hadn't thought about a constant cost of participation, but I suppose it makes sense that it's needed, since the expected reward function curve doesn't have any local maxima and will therefore, as you add participants, either always grow, always shrink, or always remain unchanged. (Is that right? Or am I making this up? My maths is not too hot.)

I don't have an actual constant cost of participation, but I can come up with some arbitrary value based on the fact that each task has a duration, during which you are unavailable for other tasks. I would assume that a constant proportional to this duration would be sufficient.

Ideally I want the groups to have between 1 and 10 members. Given that the probability of success/failure will vary depending on the members and the task (i.e. the 2 in the formula is liable to change, though the success probability will always converge on 1 as the number of agents increases), I assume I will have to tweak that participation cost constant to yield groups of this size.

One thing in your prototype that doesn't seem to ring true with what I want to do, however, is that I would expect most tasks to be so risky that no individual agent would choose to do them alone. Hence the auto-generation of potential groups, based on some heuristic I'd come up with: if a proposed group is vetoed, it's thrown away and another is chosen, until either a group for the task is formed or it becomes apparent that one won't be. Does increasing D significantly affect this?

It's worth noting that optimising the reward per agent is desirable but not necessary. Since there is a massive number of permutations, I don't expect or require anything resembling the best solution. Mainly I'm just looking for a method that creates groups of a size appropriate to the task's reward and risk, where each member of the group feels that it is in their benefit to be there. It has to look like a decision that each group member might reasonably make.

Quote:
Original post by Kylotan
Ok, that's quite helpful. I hadn't thought about a constant cost of participation, but I suppose it makes sense that it's needed, since the expected reward function curve doesn't have any local maxima and will therefore, as you add participants, either always grow, always shrink, or always remain unchanged. (Is that right? Or am I making this up? My maths is not too hot.)
The curve does have exactly one maximum for reasonable values. It also has a horizontal asymptote to the right: as n grows, $/n tends to 0 and p_fail tends to 0, so E_part(n) tends to -C. The purpose of C here is to put that asymptote at a negative value (so not everybody plays).
Quote:
I don't have an actual constant cost of participation, but I can come up with some arbitrary value based on the fact that each task has a duration, during which you are unavailable for other tasks. I would assume that a constant proportional to this duration would be sufficient.
You can alternatively have a cost of success only, which produces a curve that looks about the same.

Quote:
One thing in your prototype that doesn't seem to ring true with what I want to do however, is that I would expect most tasks to be so risky that no individual agent would choose to do them alone. Hence, the auto-generation of potential groups, based on some heuristic I'd come up with. If it's vetoed, it's thrown away and another chosen, a group for the task is formed or it becomes apparent that one won't be formed. Does increasing D significantly affect this?

Yes, with a D much higher than $, the expected reward for the first agent is far, far negative. If a few agents can form a cabal which agrees to participate together then this is circumvented (this is an area of game theory which I'm not familiar with, though).

Quote:
Mainly I'm just looking for a method that creates groups of a size appropriate to the task's reward and risk, where each member of the group feels that it is in their benefit to be there. It has to look like a decision that each group member might reasonably make.

Note that, as with many game theory tasks, the self-interest solution I've posed results in a bunch of agents whose expected utility is barely greater than zero. If you aren't interested in dry game theory stuff and instead just want that sort of a grouping, I do suggest you simply find the maximum of the curve (the solution where participating agents can veto other agents).

IMHO, in real life we do not place an infinite negative value on death, since we do come across suicides and acts of self-sacrifice (a mother for her children, etc.).

But the risks we routinely take despite the possibility of death may not be as high as we think. You take a risk of death while crossing a street or driving a car, but what is the probability of dying? IMHO quite low - maybe one in ten million, or less. (Some people feel it is higher on a plane and refuse to fly.)

Would you do something with a higher death probability - say, 10%? Suppose a group of 10 people is formed, one of them will be chosen and killed, and the remaining ones will each get, say, $1000. Would you accept this for $1,000,000? Some people tend to accept the second offer (and I would give it serious thought too). And would you accept if it were not 10 people (10%) but 1000 (0.1%) or 10,000 (0.01%)?

So, what am I trying to say?
1. The cost of death is not infinite.
2. If you mimic real life, your agents will try to form groups that decrease the risk to an acceptable (very low) level. How to solve it? Maybe use a logarithmic or quadratic calculation so the failure probability drops to an acceptable level with fewer agents.

Quote:
one could consider that any potential reward, no matter how high, compares unfavourably to the chance of death - arguably an infinitely negative penalty -

Only if you are otherwise immortal. But if you're going to die later on in any case, then an agent can just try to maximise his rewards over his expected lifetime.

If the chance of death is directly related to the expectation value of a task, then they can just pick tasks randomly. If it isn't, and I'm immortal, then I wait for the best possible ratio every time. If it isn't, and I'm not immortal, then the problem becomes interesting. Model this as a small chance to die when not accepting a task?
Quote:
Obviously that is an extreme position, as we daily perform acts with a minuscule but non-zero chance of fatality. But how do we rationalise this?

We rationalise this because:
a. we're likely to die in the future anyway.
b. since we don't have perfect information, not performing those acts may also bring us closer to death.

Quote:
Original post by Sneftel
The curve does have exactly one maximum for reasonable values. It also has a horizontal asymptote to the right: as n grows, $/n tends to 0 and p_fail tends to 0, so E_part(n) tends to -C. The purpose of C here is to put that asymptote at a negative value (so not everybody plays).


Ok, I think my problem is that I don't have an intuitive picture of what this curve looks like, and I have no tool to plot it on. Is it highest where N=1, dropping down as N increases, approaching -C in the limit(N->inf)?

Quote:
Yes, with a D much higher than $, the expected reward for the first agent is far, far negative. If a few agents can form a cabal which agrees to participate together then this is circumvented (this is an area of game theory which I'm not familiar with, though).


This implies that the curve rises though, so I think it's clear I don't quite understand what is going on. :(

Either way, I want to keep things simple by eliminating any iterative aspect of the decision making. Basically, the character should see the group offered and be able to say how much it suits him (i.e. report his expected value, I think). He doesn't have to weigh it against the same group with one fewer or one more person, or against other groups that might be offered to him if he refuses this one.

Perhaps if I retract my original statement of "arrange themselves in groups" and replace it with "be arranged into groups", it will make things clearer. In other aspects of the game, they will sense and act autonomously, but at this point, I'm not interested in having them maximise their reward, just in forming groups of reasonable sizes where they all anticipate a positive reward, taking into account that chance of a massive cost.

Unfortunately this seems to leave me back at the point where the optimum group size is either 1 or infinity, depending on whether the cost or the benefit diminishes more quickly. If I could work out how to use SciPad, I'd plot things for myself and work something out. :)

Quote:
Original post by Argus2
Quote:
one could consider that any potential reward, no matter how high, compares unfavourably to the chance of death - arguably an infinitely negative penalty -

Only if you are otherwise immortal. But if you're going to die later on in any case, then an agent can just try to maximise his rewards over his expected lifetime.


I think the idea is that since you don't know how many rewards you will get in your future lifetime, you don't know how much you could lose by dying. Therefore death is modelled as an infinitely high cost, since it always has to be greater than your potential future rewards. Obviously as soon as you know your lifespan or know the limit to the rewards you can obtain within it, the cost of death can become proportional to that.

Quote:
If the chance of death is directly related to the expectation value of a task, then they can just pick tasks randomly. If it isn't, and I'm immortal, then I wait for the best possible ratio every time. If it isn't, and I'm not immortal, then the problem becomes interesting. Model this as a small chance to die when not accepting a task?


I'm not sure what you're suggesting here. The chance of death is directly related to the expected value of a task, but the task's reward is also directly related, as is the number of people participating in the task. So you can't just pick them randomly, as there are more factors than just the risk. Either way, I don't need to find the best ratio for a given person. I need to find curves such that the best group sizes come out between 1 and 10 or 15 or so, and this needs to vary based on the task's reward and the task's risk.

Some people value the highs of living on the edge of death, or saving a life, or glory. In days past it was an honour to die in battle. Modern times include such people as rescue workers, boxers fighting against doctors' warnings, 'daredevils', David Blaine, etc.

So there should be some agents in there who will go for smaller groups. They will all likely be of similar 'dispositions'.

Quote:
Original post by Kylotan
Ok, I think my problem is that I don't have an intuitive picture of what this curve looks like, and I have no tool to plot it on. Is it highest where N=1, dropping down as N increases, approaching -C in the limit(N->inf)?
It's highest at some arbitrary point determined by the constants, which is not necessarily 1.

Quote:
Perhaps if I retract my original statement of "arrange themselves in groups" and replace it with "be arranged into groups", it will make things clearer. In other aspects of the game, they will sense and act autonomously, but at this point, I'm not interested in having them maximise their reward, just in forming groups of reasonable sizes where they all anticipate a positive reward, taking into account that chance of a massive cost.
Then just do the curve maximum.

Here's a Matlab plot with $=30, D=50, C=1:
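(The image itself hasn't survived the archive; here is a rough Python/matplotlib equivalent that reproduces the curve, with my own variable names:)

import matplotlib.pyplot as plt

def expected_reward(n, prize=30.0, D=50.0, C=1.0):
    p_fail = 0.5 ** n
    return (1.0 - p_fail) * (prize / n - C) + p_fail * (-D - C)

ns = list(range(1, 31))
plt.plot(ns, [expected_reward(n) for n in ns], marker="o")
plt.axhline(0, color="grey", linewidth=0.5)
plt.xlabel("group size n")
plt.ylabel("expected reward per agent")
plt.title("$ = 30, D = 50, C = 1")
plt.show()

With these constants the curve starts negative, peaks at around n = 5, and then decays towards the asymptote at -C.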


Quote:
I think the idea is that since you don't know how many rewards you will get in your future lifetime, you don't know how much you could lose by dying. Therefore death is modelled as an infinitely high cost ...

Again, only if you are immortal. Notice that Sneftel puts a cost on participation for every task. His agents effectively believe they will die after (D / C) tasks (with the constants above, 100/10 = 10 tasks), and choose accordingly.

Of course, if you can opt-out of tasks without cost, then it's always best to wait until you can max out the curve by joining. What rational agent would be the first to join a task, without knowing how many (if any) other agents will join him?


Quote:
Original post by Daerax
So there should be some agents who will go for smaller groups in there. They will all likely be of similar 'dispositions'.


Yeah, the characters in this game will have traits that affect their perception of risk and glory. Similarly, some will prefer smaller groups to larger ones. I'll apply these as modifiers after I fix the baseline for the average character.

Quote:
Original post by Sneftel
Quote:
Original post by Kylotan
Perhaps if I retract my original statement of "arrange themselves in groups" and replace it with "be arranged into groups", it will make things clearer. In other aspects of the game, they will sense and act autonomously, but at this point, I'm not interested in having them maximise their reward, just in forming groups of reasonable sizes where they all anticipate a positive reward, taking into account that chance of a massive cost.
Then just do the curve maximum.


Ok, I'll play around with SciPad and try to recreate what you did there. And then I suppose I'll have to dig out the calculus book. :) The problem for me was always trying to find (or indeed create) this curve, since it's not very intuitive to me.

Quote:
Original post by Argus2
Of course, if you can opt-out of tasks without cost, then it's always best to wait until you can max out the curve by joining. What rational agent would be the first to join a task, without knowing how many (if any) other agents will join him?


I thought I explained above that there isn't any sort of iterative process - you're presented with the option of joining a task with N people in it, and that's it. There's no point where an agent takes the decision to be member N+1. (It's possible that if one of the N rejects the group, I'll present it again to the rest, but that's just an optimisation to avoid having to pick another permutation of characters.)

I think the part that makes the game a bit boring/bland at the moment is that all agents know the actual chance of winning and losing. Making the agents self-contained objects with their own estimates of what they think the chances are will affect the result a bit more. Add a small amount of randomness, and make them adjust their expectations based on what they've experienced and seen happen to others.

It also feels like you should include the opportunity cost of being unable to do anything else while involved in an accepted task - so you could effectively lose out by joining a large group.

I wrote up a big reply to this thread yesterday but decided not to post it as I wanted to think about it more... and now I'm looking at things slightly differently, so much of what I wrote yesterday may not be the best approach. Here's my current thinking...

In principle, you need to determine if your agents are risk-averse, risk-neutral or risk-taking and then each agent can make a decision, per task, as to whether to accept the task or not. The problem you have is that in order to solve this problem each agent needs to know how many other agents are going to be involved so that they can reasonably estimate the probabilities for success/failure and subsequently the expected rewards. This creates a problem whereby the desired result is an equilibrium state of a dynamic decision problem, which is not a trivial problem to solve. It has some broad relationships with problems in game theory and economic systems, but I won't go into those here.

Ultimately, if you want an information theoretic solution your agents should NOT be limited to making their decisions based only on the current proposed task.

If you just want an ad hoc method for determining reasonable assignments of players to tasks, I'd go with a bidding system. Each agent bids an amount it is prepared to pay to take part in the task. Add a small amount of noise to bids for more non-deterministic results. Then, for a given task, select the bid level at which you will accept participants (you can do this randomly, or by looking at the actual bids and working out how many agents you want to participate). All agents offering a bid >= this level are accepted and take part.

Now, the interesting part. Agents should make a bid relative to their risk behaviour. Risk-taking agents will bid more than the expected payoff (but less than the maximum reward that could be obtained... that wouldn't be rational... unless, of course, you want irrational agents to take part as well ;) ). Risk-neutral agents will bid exactly the expected payoff. Risk-averse agents will bid less than the expected payoff.

You can extend this model by actually forcing the agents to pay their bid. If they are accepted and succeed, they'll get some reward back. If they fail, they have paid a cost to participate. If players are losing money, they should adjust their strategy to increase their future expected rewards. The basic method of doing this would be for an agent to 'play it safe' by being more risk-averse and offering less money to participate... but they'll end up taking part in fewer tasks. By adding a taxation system you can force agents to become more risk-taking over time. If you get the parameter balance right, you can end up with a very nice little system.
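A toy sketch of one round of such a bidding system (the names, the noise model, and the average-bid cutoff are my own illustrative choices, not a definitive implementation of Timkin's scheme):

import random

def bidding_round(agents, expected_payoff, noise=0.05):
    # agents: dict of name -> risk_factor, where risk_factor > 1 is
    # risk-taking, == 1 risk-neutral, and < 1 risk-averse.
    bids = {}
    for name, risk_factor in agents.items():
        bid = expected_payoff * risk_factor
        bid *= 1.0 + random.uniform(-noise, noise)  # a little noise for variety
        bids[name] = bid
    cutoff = sum(bids.values()) / len(bids)  # one possible rule: the average bid
    accepted = [name for name, bid in bids.items() if bid >= cutoff]
    return accepted, cutoff

Charging accepted agents their bids, and taxing idle ones, then gives the feedback loop described above.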

I implemented a system like this many years ago - although the application was slightly different, it had similar aims - and it worked very well (particularly with the noise and taxation).

Cheers,

Timkin

Quote:
Original post by dascandy
I think the part that makes the game a bit boring/bland at the moment is that all agents know the actual chance of winning and losing.


No, in the actual game they will be more than capable of misjudging the risk! It's just that while I tweak the parameters to get the basic system working, I have to work with median values to keep the number of variables manageable.

Quote:
Original post by Timkin
Ultimately, if you want an information theoretic solution your agents should NOT be limited to making their decisions based only on the current proposed task.


I was hoping to simplify things by presenting tasks one by one and picking groups for them in turn. I'm not committed to any particular brand of solution, just one that appears to demonstrate some appreciation of the risk/reward payoff for any given group.

Quote:
If you just want an ad hoc method for determining reasonable assignments of players to tasks, I'd go with a bidding system. Each agent bids an amount it is prepared to pay to take part in the task. Add a small amount of noise to bids for more non-deterministic results. Then, for a given task, select the bid level at which you will accept participants (you can do this randomly, or by looking at the actual bids and working out how many agents you want to participate). All agents offering a bid >= this level are accepted and take part.


I don't understand how that deals with situations where tasks are only safe for larger groups. How do I select the bid level at which I accept participants? I can't do it randomly, because the group size must make sense. And I don't think I can just keep adding participants one by one until some perceived safety threshold is reached, because that obviously affects the payoff from the task.

Quote:
Original post by Kylotan
Quote:
If you just want an ad hoc method for determining reasonable assignments of players to tasks, I'd go with a bidding system. Each agent bids an amount it is prepared to pay to take part in the task. Add a small amount of noise to bids for more non-deterministic results. Then, for a given task, select the bid level at which you will accept participants (you can do this randomly, or by looking at the actual bids and working out how many agents you want to participate). All agents offering a bid >= this level are accepted and take part.


I don't understand how that deals with situations where tasks are only safe for larger groups.


Why do you want to ensure safety? I interpreted your earlier posts to indicate that you wanted agents to be able to make a decision to join or not based on their attitude to risk. By allowing agents to bid, you are essentially asking them to declare their risk strategy (by comparing their bid to the expected payoff).

Quote:
How do I select the bid level at which I accept participants?


As to selecting a bid cutoff, you have two choices as I see it.

1) Declare the number of required agents for the task at the beginning of the bidding.

This will enable agents to estimate the expected payoff and bid accordingly. Count down from the highest bid until you have enough agents for the task.

2) Don't declare the required number of agents and simply select a bid cutoff according to some scheme.

For example, take the average bid as the cutoff. Or, starting with the highest bid and working down, accept bids until their sum equals the total payoff for the task (this will ensure a neutral economy if you don't include taxation).

You may need to iterate over the bidding process several times, culling out people who don't want to participate given the other people bidding. For example, a risk-averse player is unlikely to want to participate with risk-taking players, since the latter will prefer smaller groups, which have, given the lower probability of success, a lower expected payoff (but again, it's not linear, since the share of the rewards goes up with fewer participants).



It would be helpful if we had a little more information about the game, like: what are the important factors in its design? Is it desirable for agents to live as long as possible? Are they trying to maximise their short-term rewards? What is the aim of participation? Just to get the tasks completed? Exactly what information is available to the participants?

Cheers,

Timkin

Quote:
Original post by Kylotan
I have a system where several agents have to be arranged into groups to attempt certain abstract tasks, for which a fixed reward is on offer to be shared among the group members if successful.
...
One factor not yet mentioned though is that these tasks carry risk; failed tasks can be considered dangerous and an agent may be destroyed as a result of that task.


These agents can die, and are therefore mortal.
Can a replacement agent be purchased, and if so, how much does it cost?

Quote:
Original post by Timkin
Why do you want to ensure safety? I interpreted your earlier posts to indicate that you wanted agents to be able to make a decision to join or not based on their attitude to risk.


No, I want agents to form groups where the size of the group reflects their wish to reduce their perceived risk (while not making the payoff trivial).

Quote:
It would be helpful if we had a little more information about the game, like: what are the important factors in its design? Is it desirable for agents to live as long as possible?


Yes, it is. That's why I started off with the death analogy, to emphasise that sometimes failure is a really bad thing, from their perspective.

Quote:
What is the aim of participation?


Trivially, it's to get the reward. That reward is (mostly) a resource which they can then spend on things, and the cycle repeats.

Quote:
Exactly what information is available to the participants?


The riskiness of the task, the total reward available (which will be split evenly among participants if successful), the duration of the task (which is how long they'll be unavailable for other tasks), and in the situation I envisaged, the attributes of the other members of the proposed group.

Quote:
Original post by Kylotan
...I want agents to form groups where the size of the group reflects their wish to reduce their perceived risk (while not making the payoff trivial).


How do you define 'risk' for your agents? (Quantify it, please.)

Quote:
the duration of the task (which is how long they'll be unavailable for other tasks), and in the situation I envisaged, the attributes of the other members of the proposed group.


This clearly indicates that agents should be trying to maximise their expected future rewards (or minimise expected future losses) given the population of agents (at least in the ideal solution) and the set of tasks.

Okay... more info needed...

Can tasks run concurrently, or are all tasks sequentially ordered? If the latter then each agent has the option to participate in each task. If the former, then an agent must always choose a schedule of tasks to participate in such that this schedule maximises some quantity over this set of tasks. If you force them to choose a task whenever they are free, and to evaluate their potential risk/reward based only on that task, you will not be able to make any assurances about the long-term viability of agents (nor encode this in their solutions). They need to be able to consider what it is they are giving up by accepting the task in question in order to make rational decisions.

You should probably use a discounted future reward model of expected utility.
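For reference, the standard discounted form (a textbook definition, not something from this thread) values an agent's future as:

E[U] = sum over t >= 0 of gamma^t * E[r_t], with discount factor 0 < gamma < 1

where r_t is the reward received t tasks from now. A gamma near 0 makes agents short-sighted; a gamma near 1 makes them weigh distant rewards almost as heavily as immediate ones. Since the discounted sum is finite, death (which truncates it) acquires a finite cost automatically.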

Fundamentally you still have one problem: each agent can only make a decision after all other agents have made a decision. You can get around this by asking agents to list their preferences for tasks. Once preferences have been given (and this might be random as a first assignment, or based on some agent attribute) each agent can assess the potential risk/reward of each task more accurately and re-order their preferences. You could iterate this and hope for a stable solution, or merely limit each agent to a finite number of changes they can make to their preference list.

chance of success: 1 - 1/2^numAgents
reward: prize / numAgents
Expected Reward: (1 - 1/2^numAgents) * (prize / numAgents)

As agents are added, the difference between Reward and Expected Reward diminishes. For example, using a prize value of 1:

Agents   Reward   Expected   Difference
1        1.0      0.5        0.5
2        0.5      0.375      0.125
3        0.3333   0.2917     0.0417
4        0.25     0.2344     0.0156
5        0.2      0.1938     0.0063

So it might make sense to have the agents choose to join, based on the difference. Individual risk tolerances could be measured as the maximum difference that an agent is willing to accept.
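A small sketch of that rule in Python (the function names are mine; note the Difference column above is just the agent's share times the failure probability, (prize/n) * 1/2^n):

def difference(n, prize=1.0):
    # gap between the share on success and the expected share:
    # (prize / n) * P(failure) = (prize / n) * (1/2)**n
    return (prize / n) * 0.5 ** n

def smallest_acceptable_group(tolerance, prize=1.0, max_n=64):
    # smallest group size whose difference the agent will tolerate
    for n in range(1, max_n + 1):
        if difference(n, prize) <= tolerance:
            return n
    return None  # agent is too risk-averse to join at any size tried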

[Edited by - AngleWyrm on February 19, 2008 8:28:07 PM]

Quote:
Original post by Timkin
Quote:
Original post by Kylotan
...I want agents to form groups where the size of the group reflects their wish to reduce their perceived risk (while not making the payoff trivial).


How do you define 'risk' for your agents? (Quantify it, please.)


Consider it the probability of dying during the course of that task, which is proportional to the probability of the group failing the task. I can't give you an exact quantity because I don't know it - that's part of my problem. I mentioned Pascal's Wager as an example of why it might be hard to quantify it.

Quote:
Can tasks run concurrently, or are all tasks sequentially ordered? If the latter then each agent has the option to participate in each task. If the former, then an agent must always choose a schedule of tasks to participate in such that this schedule maximises some quantity over this set of tasks.


It's the former. At any given time, there can be a variety of tasks ongoing, each with several people assigned exclusively to that task.

Quote:
If you force them to choose a task whenever they are free, and to evaluate their potential risk/reward based only on that task, you will not be able to make any assurances about the long-term viability of agents (nor encode this in their solutions). They need to be able to consider what it is they are giving up by accepting the task in question in order to make rational decisions.


It is possible to present them with all the current tasks on offer. They can also judge or rank their own suitability for/interest in them.

They're not forced to take a task whenever one is available for them; they can ignore tasks entirely, if they don't suit. I have to balance the game so that this doesn't happen too often.

Quote:
Fundamentally you still have one problem: each agent can only make a decision after all other agents have made a decision.


I really must stress that it's not important for me to have each agent acting individually here. If a top-down system presents some sort of resolution that is considered likely to be accepted - eg. "People A, B, D, and G join Task 1" - and then those people get to accept or veto this, that's fine, providing I can come up with resolutions that have a decent chance of being accepted. I don't want any potential solutions to allocating people to groups to be held back by the notion of each agent needing to act totally individually.

Quote:
Once preferences have been given (and this might be random as a first assignment, or based on some agent attribute) each agent can assess the potential risk/reward of each task more accurately and re-order their preferences. You could iterate this and hope for a stable solution, or merely limit each agent to a finite number of changes they can make to their preference list.


Hmm. Unless the reward levels differ significantly between tasks, I would expect the individual agents' preferences to spread them out fairly evenly. But I suppose it wouldn't take much deviation from an even spread for one or two tasks to become more attractive, and on subsequent iterations maybe that would draw others in.

However, I still don't have a criterion for deciding when a group is 'good enough', since I don't know where this risk/reward crossover is (or if it even exists). I can't just form the best groups on offer - I have to allow people to join no group at all, if none meets the agents' criteria.

Quote:
No, I want agents to form groups where the size of the group reflects their wish to reduce their perceived risk (while not making the payoff trivial).


I don't think that works. If an agent believes there is any chance of death, it may not want to be involved in any group of any size that participates in that activity.

You also assume that the "threat level" applies to agents on a group basis. That may be particular to your game. But have you considered cases where that may not be so? For instance, living on a major fault-line or in Tornado Alley. It doesn't matter how many people live there already, there is no protection in being part of a group when a tornado runs over you.

If every task has a base risk and reward, you know what the expected value is from each task. If all of your agents were super-smart with perfect information, they would only go on the tasks with the highest possible expected values. If we assume that isn't the case, then we have a fairly simple algorithm:

1. Give each agent an attribute (call it 'wisdom') valued between 0 and 1.

2. Add up all of the expected values from the tasks on offer.

3. Map each task to a range of the total based on its expected value, with the lowest expected value at the start, leading up to the highest at the end.

4. Multiply the agent's wisdom by the total to find which task it picks.

The tasks with bigger expectation values then get more agents, and wiser agents will go for the tasks with bigger expectation values (see the sketch below).
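A hedged sketch of steps 2-4 in Python (the names and the (task, EV) tuple format are my own; it assumes all expected values are positive):

def pick_task(wisdom, task_evs):
    # task_evs: list of (task_name, expected_value) pairs.
    # Step 3: lay the tasks along a line, lowest EV first, each covering
    # a stretch proportional to its EV. Step 4: wisdom * total picks a point.
    ordered = sorted(task_evs, key=lambda t: t[1])
    total = sum(ev for _, ev in ordered)
    target = wisdom * total
    cumulative = 0.0
    for name, ev in ordered:
        cumulative += ev
        if target <= cumulative:
            return name
    return ordered[-1][0]  # floating-point guard for wisdom == 1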

We don't need to worry about a change in numbers per task because while the reward will go down, so will the risk. Unless the relationship is not linear of course. If the reward goes down disproportionately to risk, then you can always rearrange afterwards, biasing toward smaller groups - or larger, in the reverse case.

You do need to build the risk of death into the expected value though, like Sneftel did. It's not really the same as Pascal's Wager, because an eternity in hell is a lot worse than death, which is coming to us in any case. A life without reward should be worthless to your agents, in which case rewards can always be valued against death.
