# Evenly distributing a data set [Solved]

## Recommended Posts

Puck    374
Hi, I'm having a bit of a logical problem that my google-fu is too weak to solve. Here's the situation: What I have [0][0][0][1][2][2] Imagine you have a data set that looks something like that. You have a range of numbers, and each number shows up in the data set one or more times. To put things in context what I have is a variable number of players who each get one or more turns in a "round". Each player is added to the queue n times, where n is the number of turns they get per round. Thus, the data set varies in length, but all possible elements are known ahead of time and fall in the range {0..n}. The number of each element is also known if it's any help. What I want [0][2][1][0][2][0] I want to take this data set and distribute the elements so that they're spaced as evenly as possible, never leaving two of the same element next to each other if possible. This will become the final turn order. It seems like there must be some sort of sorting algorithm that does this, and there's probably a wealth of resources on it, but without a name to put on the concept I'm lost in searching for them. Any names, links, or ideas would be appreciated! [smile] [Edited by - Puck on December 15, 2008 10:32:11 PM]

##### Share on other sites
Zipster    2365
What I would do is put player 0 at index 0, player 1 at index 1, player 2 at index 2, etc. If you run out of a certain player value, you just skip it and keep going. Something like this (off the top of my head, can't guarantee it works [smile]):
std::vector<int> Turns;// fill in Turnsint player = 0;for(std::vector<int>::iterator it = Turns.begin(); it != Turns.end(); /*no increment*/){   std::vector<int>::iterator next_player = std::find(it, Turns.end(), player);   if(next_player != Turns.end())   {      std::swap(*it, *next_player);      ++it;   }   ++player;   if(player >= MAX_PLAYERS)      player = 0;}

##### Share on other sites
Codeka    1239
I don't think Zipster's solution would work if you have to start with [0][0][0][0][1][1][2] (say) because you'd end up with: [0][1][2][0][1][0][0] when the "ideal" distribution would be more like [0][1][0][2][0][1][0]

I don't usually like to criticize without providing a solution myself, but I'm actually on my lunch break and I don't really have time to think this through properly...

My gut tells me that what you want to do is start with a number of "buckets" and put players in each bucket, then sort the buckets separately. Start with num_buckets = the maximum number of turns for any one player (4 in the case above) and then work by putting players in each bucket so that you don't put the same player in the same bucket more than once. Try to keep the number of players per bucket approximately equal.

If nobody's come up with a proper solution by the time I get home from work, I'll have a better crack then :-)

##### Share on other sites
Zipster    2365
My post was based on how turn orders work in most board games -- each player has their turn in order, and if you can't do anything (i.e. you ran out of pieces to place on the board), it just skips you in the order until everyone is unable to do anything, at which point the next phase/round begins. The problem with the [0][1][0][2][0][1][0] distribution is that player 0 gets to go twice before player 2 goes once, and then a third time before player 1 goes twice, which might be unfair. But then again I don't know anything about the OP's game so it could be perfectly fair and valid! There wasn't enough sample data from which to infer :)

##### Share on other sites
Codeka    1239
Quote:
 Original post by ZipsterMy post was based on how turn orders work in most board games -- each player has their turn in order, and if you can't do anything (i.e. you ran out of pieces to place on the board), it just skips you in the order until everyone is unable to do anything, at which point the next phase/round begins. The problem with the [0][1][0][2][0][1][0] distribution is that player 0 gets to go twice before player 2 goes once, and then a third time before player 1 goes twice, which might be unfair. But then again I don't know anything about the OP's game so it could be perfectly fair and valid! There wasn't enough sample data from which to infer :)

That's a good point. It depends how you read the original question, I suppose :-) In my experience, I've never really seen such a huge descrepency between the number of turns per round like that anyway (e.g. play #0 gets four and player #2 gets one - you might see one player with one extra turn but that's usually the maximum). If that's the case, then Zipster's algorithm should be enough.

Puck, can you expand your requirements a bit? For example, how would you expect [0][0][0][0][1][1][2] to be sorted (or is such a thing even possible?)

##### Share on other sites
Puck    374
Thank you everyone for your responses, and I apologize for the confusion. the ideal solution for [0][0][0][0][1][1][2] would indeed be [0][1][0][2][0][1][0].

Exactly what I'm trying to do is create a rogue-like RPG with multiple players. I wanted to have some sort of "speed" variable, where different entities -- not only players as I first implied for simplification -- would be able to act at different speeds. An average player might have a speed of 4 or 5, where an explosion might expand at a speed of 10, giving players with a high enough speed value the opportunity to raise their shield or take a Hollywood style leap away from the explosion.

Originally I created a prototype for this idea where each entity got one "turn" every x seconds based on their speed, but this approach didn't really have the feel I was looking for, so now I'm looking to modify the prototype to be turn based, with each entity receiving speed "turns" per "round". Where a "turn" is one action as in a typical rogue-like -- moving one space, attacking, etc. -- and a "round" is defines as one full cycle where every entity uses up all of their turns and the whole thing starts again.

Zipster's solution is better than what I have, but not ideal. I would like for it to be possible for the explosion to get two "moves" before one of the players if their speed is low enough. Of course, I'm also open to other suggestions for how to arrange the turn order other than even distribution, though that may be creeping more into the realm of game design than programming. [wink]

Edit: Upon reviewing Codeka's advice, I believe I'll attempt to implement a similar algorithm, since no existing algorithm appears to be forthcoming. I'll make another edit later with the results of my attempt.

[Edited by - Puck on December 15, 2008 1:19:18 PM]

##### Share on other sites
Pseudokai    142
This is mostly random thoughts on the subject but it seems to have some pattern:
I tried to figure this out with a rather nasty data set.
[0][1][2][2][3][3][4][4][4][4]
It's easiest to mix a set of n elements with a set of n+1 elements so I started with [0] and combined it with the first two element set [2][2] resulting in
[2][0][2]
This new set can be combined with the [4][4][4][4] easily resulting in
[4][2][4][0][4][2][4]
Nothing more can be done with that set so drop back down to the beginning which is now [1] and combine it with [3][3] resulting in
[3][1][3]
Now it's tricky to combine what's left ([3][1][3] and [4][2][4][0][4][2][4])
It seems like the easiest way would be to group some of the elements of the larger set so that you have a four subsets and then mix those with the smaller set in the same way you would two normal sets.
In your original example for the set [0][0][0][1][2][2] this would be something like:
[1] [2][2] [0][0][0]
[2][1][2] [0][0][0]
Equal sized sets, another case to handle. Choose one and reduce it to two subsets.
[2,1][2] [0][0][0]
[0][2,1][0][2][0]->[0][2][1][0][2][0]
You'd just have to find the best way to generate the subsets to get the best distribution.

##### Share on other sites
Drigovas    509
How about this, just as an attempt that I don't have any formal proof of optimality for [though I suspect it is pretty close, if not spot on, at least with how I am understanding the nature of the problem].

You start with a set {<element, number of occurrences>} denoting obviously the set of elements paired with their associated number of occurrences. You find some element with an odd number of occurrences, and call this element the pivot, and subtract one from its number of occurrences. you then split this set into two sets of as close to equal size as you can get them that would look like {<element, number of occurrences / 2>} for each side, with ties split in the case of other elements with odd occurrences arbitrarily. Repeat the same process for each of these two sets. [if you have no odd occurrence counts, just pick an even one to get things rolling].

The return value looks something like this: <Result of set 1> + my pivot + <Result of set 2>.

It is a straight forward recursive approach. Simple to implement, and I suspect that it actually does discover *some* combination that is optimal [*some* combination instead of *the* combination, since there isn't likely *A* single optimal solution for most problems of this type].

##### Share on other sites
Codeka    1239
I had a go with Pseudokai's idea, and this is what I came up with, in C# (sorry, the code is ugly, uncommented and I'd probably work on it a bit more for production-ready, but it was just what I came up with in a few minutes):

    class CombineSets    {        public static void Main(string[] args)        {            var turns = new List<int>();            foreach (char ch in args[0])            {                int i = (ch - '0');                turns.Add(i);            }            Combine(turns);        }        class Atom { public List<int> Players = new List<int>(); }        static Random rand = new Random();        public static void Combine(List<int> turns)        {            // set up the sets so there's one set per player            var sets = new List<List<Atom>>();            foreach(int player in turns)            {                bool found = false;                foreach(var list in sets)                {                    if (list[0].Players[0] == player)                    {                        list.Add(new Atom { Players = {player}});                        found = true;                        break;                    }                }                if (!found)                {                    sets.Add(new List<Atom> { new Atom{ Players = {player}} });                }            }            while(sets.Count != 1)            {                // go through the sets and find an (n, n+1) combination                bool foundAtLeastOne = false;                for(int i = 0; i < sets.Count; i++)                {                    for(int j = i + 1; j < sets.Count; j++)                    {                        if (sets[i].Count == sets[j].Count + 1 ||                            sets[i].Count + 1 == sets[j].Count ||                            sets[i].Count == sets[j].Count)                        {                            PrintSets(sets);                            sets[i] = Combine(sets[i], sets[j], false);                            sets.RemoveAt(j);                            foundAtLeastOne = true;                        }                    }                }                if (!foundAtLeastOne && sets.Count > 1)                {                    // if we didn't find any, but there's still more than one                    // set left, we'll start to join atoms together                    List<Atom> biggest = null;                    foreach(var list in sets)                    {                        if (biggest == null || biggest.Count < list.Count)                            biggest = list;                    }                    if (biggest != null)                    {                        int index1 = rand.Next(biggest.Count-1);                        int index2 = index1 + 1;                        biggest[index1].Players = Combine(biggest[index1].Players, biggest[index2].Players, true);                        biggest.RemoveAt(index2);                    }                }                PrintSets(sets);            }            Console.WriteLine();            foreach (var list in sets)            {                foreach (var player in list)                {                    foreach (int i in player.Players)                    {                        Console.Write(i);                    }                }            }            Console.WriteLine();        }        private static void PrintSets(IEnumerable<List<Atom>> sets)        {            foreach (var list in sets)            {                foreach (var player in list)                {                    Console.Write("[");                    foreach (int i in player.Players)                    {                        Console.Write(i);                    }                    Console.Write("]");                }                Console.Write(" / ");            }            Console.WriteLine();        }        private static List<T> Combine<T>(IList<T> a, IList<T> b, bool inorder)        {            var combined = new List<T>();            if (!inorder)            {                int count = Math.Max(a.Count, b.Count);                for (int i = 0; i < count; i++)                {                    if (i < a.Count)                        combined.Add(a[i]);                    if (i < b.Count)                        combined.Add(b[i]);                }            }            else            {                foreach(T i in a)                    combined.Add(i);                foreach (T i in b)                    combined.Add(i);            }            return combined;        }    }

Some sample outputs:

> CombineSets 001122
012021

> CombineSets 000001122
010201020

> CombineSets 00000001122233
01023031002020

> CombineSets 012345678
052368174

It's obviously not very fast, but when we're talking about only a dozen or so numbers, the speed is good enough. There's also more possible improvements (like you could modify it to take into account a "reflex" property maybe, which makes your turns all appear closer to the start of the list, rather than randomly spaced out, etc...)

##### Share on other sites
Drigovas    509
A quick addendum to my recommendation. Choose one element from all elements that appear an odd number of times. If there are no elements with odd occurrences, pick nothing, and just divide the set up.

##### Share on other sites
Puck    374
I believe I've found a suitable solution. It's almost stupid simple, but relatively fast for that fact and good enough to at least fake the effect I'm looking for I believe. The pseudo code looks something like this:

turn order = select the most common element from the dataremove the most common element from the datafor each remaining element type:  increment = number of elements of this type / total current size of turn order  count = increment  for each element in the turn order:    while count > 1:      insert instance of element before current element in turn order      count -= 1    count += increment

And the simplistic C++ code I wrote to test the theory:

#include <iostream>#include <vector>#include <map>using namespace std;vector<int> dist_sort(vector<int> in) {	map<int, int> count; //The count of each element type in the original data set	vector<int> turnorder; // The final turn order	int highest = 0; //The most common element	//Count the number of occurrences of each element in the original data set	for (vector<int>::iterator it = in.begin(); it < in.end(); it++) count[*it]++;	//Find the most common element	for (map<int, int>::iterator it = count.begin(); it != count.end(); it++) {		if (it->second > count[highest]) highest = it->first;	}	//Initialize the turn order to contain all instances of the most common element...	for (int i = 0; i < count[highest]; i++) turnorder.push_back(highest);	count.erase(highest); //...and remove them from the original data	//Iterate over the remaining elements	for (map<int, int>::iterator it = count.begin(); it != count.end(); it++) {		//Calculate the increment by dividing the number of this element by the length of turnorder and initialize the counter		float increment = (float)it->second/turnorder.size(), level = increment;		//Iterate over turn order		for (vector<int>::iterator pos = turnorder.begin(); pos <= turnorder.end(); pos++) {			//Insert elements until counter is less than 1.0			while (level >= 1.0 && count[it->first] > 0) {				pos = turnorder.insert(pos, it->first);				pos++;				count[it->first]--;				level -= 1.0;			}			//Increment the counter			level += increment;		}	}		return turnorder;}int main(int argc, char* argv[]) {	vector<int> temp;	//Simple usage message for the wrong number of arguments	if (argc != 2) {		cout << "Usage: sort dataset" << endl << "  e.g. sort 0000112" << endl;		return 1;	}	//Perform a simple search and gather all digits from the input string, forming them into an input vector	for (int i = 0; argv[1][i] != 0; i++) {		if (argv[1][i] > 47 && argv[1][i] < 58) temp.push_back(argv[1][i]-48);	}	//Sort the resulting vector	temp = dist_sort(temp);	//And print the result of the sort	for (vector<int>::iterator it = temp.begin(); it < temp.end(); it++) cout << (*it); cout << endl;}

I'd like to thank everyone again for all the help. While I may not have utilized any of your suggestions directly, they were invaluable in finding a solution that did work. [smile]

##### Share on other sites
Codeka    1239
Quote:
 Original post by PuckI'd like to thank everyone again for all the help. While I may not have utilized any of your suggestions directly, they were invaluable in finding a solution that did work. [smile]

It was a fun problem, and nice diversion [smile]