Jump to content

  • Log In with Google      Sign In   
  • Create Account


#ActualAmzBee

Posted 06 March 2013 - 06:44 AM

Hey everyone, I'm currently working on some behaviours for non playable characters in a game, and I need to find a reliable and fast method for finding the mode specific average of a set of values. This is not really a mathematics specific question rather a performance once, see the code must be per-formant in various situations where creating things like hash tables or anything else that creates orphan garbage to an extent that has potential to slow the game down to a crawl.

 

So far, I've found 4 different approaches to the issue as follows:

 

static int ModeA(int[] items)
{
    var groups = items.GroupBy(t => t);
    int maxCount = groups.Max(g => g.Count());
    return groups.First(g => g.Count() == maxCount).Key;
}
 
static int ModeB(IEnumerable<int> items)
{
    int maxVal = 0, maxCount = 0, curCount;
    var iter = items.GetEnumerator();
    while (iter.MoveNext())
    {
        curCount = items.Count(t => t == iter.Current);
        if (curCount > maxCount)
        {
            maxVal = iter.Current;
            maxCount = curCount;
        }
    }
    return maxVal;
}
 
static int ModeC(int[] items)
{
    int maxVal, maxCount, curCount;
    for (int i = maxVal = curCount = maxCount = 0; i < items.Length; i++)
    {
        var curVal = items[i];
        if ((curCount = items.Count(t => t == curVal)) > maxCount)
        {
            maxVal = curVal;
            maxCount = curCount;
        }
    }
    return maxVal;
}
 
static int ModeD(int[] items)
{
    int maxVal = 0, maxCount = 0, curCount = 0, curVal = 0, i = 0, j = 0;
    for (i = 0, j = 0; i < items.Length; i++, curCount = 0)
    {
        curVal = items[i]; 
 
        for (j = 0; j < items.Length; j++)
            if (items[j] == curVal) 
                curCount++;
 
        if (curCount > maxCount)
        {
            maxVal = curVal;
            maxCount = curCount;
        }
    }
    return maxVal;
}

 

 

And here is what I get performance wise if I ask each function to return the mode for 10,000 elements (timed with the Stopwatch class):

 

Method A Took: 00:00:00.0112893.
Method B Took: 00:00:02.9916290.
Method C Took: 00:00:02.3635998.
Method D Took: 00:00:00.5677500.
 
Hopefully you'll already be able to see my problem, method A is obviously the fastest but method D is more favourable (at least from what I can see as it doesn't create an enumerable collection like A and B do). In the real-time scenario I'll only be throwing a maximum of 300 elements at this method, which gives me something more reasonable:
 
Method A Took: 00:00:00.0087793.
Method B Took: 00:00:00.0039755.
Method C Took: 00:00:00.0026175.
Method D Took: 00:00:00.0007191.
 
And if I repeat this test within a loop a few times, the results change slightly:

Method A Took: 00:00:00.0000429.
Method B Took: 00:00:00.0027317.
Method C Took: 00:00:00.0021401.
Method D Took: 00:00:00.0005269.
 
Again obviously method A is the fastest, but in this 2rd test only after iteration. So my question is am I being too obsessed over this when I could just go with method A and sod GC, or is method D truly better even though I never see that much improvement after iteration and on larger arrays it struggles.
 
Thanks everyone for any feedback on the subject.
 
Aimee

#1AmzBee

Posted 06 March 2013 - 06:43 AM

Hey everyone, I'm currently working on some behaviours for non playable characters in a game, and I need to find a reliable and fast method for finding the mode specific average of a set of values. This is not really a mathematics specific question rather a performance once, see the code must be per-formant in various situations where creating things like hash tables or anything else that creates orphan garbage to an extent that has potential to slow the game down to a crawl.

 

So far, I've found 4 different approaches to the issue as follows:

 

static int ModeA(int[] items)
{
    var groups = items.GroupBy(t => t);
    int maxCount = groups.Max(g => g.Count());
    return groups.First(g => g.Count() == maxCount).Key;
}
 
static int ModeB(IEnumerable<int> items)
{
    int maxVal = 0, maxCount = 0, curCount;
    var iter = items.GetEnumerator();
    while (iter.MoveNext())
    {
        curCount = items.Count(t => t == iter.Current);
        if (curCount > maxCount)
        {
            maxVal = iter.Current;
            maxCount = curCount;
        }
    }
    return maxVal;
}
 
static int ModeC(int[] items)
{
    int maxVal, maxCount, curCount;
    for (int i = maxVal = curCount = maxCount = 0; i < items.Length; i++)
    {
        var curVal = items[i];
        if ((curCount = items.Count(t => t == curVal)) > maxCount)
        {
            maxVal = curVal;
            maxCount = curCount;
        }
    }
    return maxVal;
}
 
static int ModeD(int[] items)
{
    int maxVal = 0, maxCount = 0, curCount = 0, curVal = 0, i = 0, j = 0;
    for (i = 0, j = 0; i < items.Length; i++, curCount = 0)
    {
        curVal = items[i]; 
 
        for (j = 0; j < items.Length; j++)
            if (items[j] == curVal) 
                curCount++;
 
        if (curCount > maxCount)
        {
            maxVal = curVal;
            maxCount = curCount;
        }
    }
    return maxVal;

}

 

And here is what I get performance wise if I ask each function to return the mode for 10,000 elements (timed with the Stopwatch class):

 

Method A Took: 00:00:00.0112893.
Method B Took: 00:00:02.9916290.
Method C Took: 00:00:02.3635998.
Method D Took: 00:00:00.5677500.
 
Hopefully you'll already be able to see my problem, method A is obviously the fastest but method D is more favourable (at least from what I can see as it doesn't create an enumerable collection like A and B do). In the real-time scenario I'll only be throwing a maximum of 300 elements at this method, which gives me something more reasonable:
 
Method A Took: 00:00:00.0087793.
Method B Took: 00:00:00.0039755.
Method C Took: 00:00:00.0026175.
Method D Took: 00:00:00.0007191.
 
And if I repeat this test within a loop a few times, the results change slightly:

Method A Took: 00:00:00.0000429.
Method B Took: 00:00:00.0027317.
Method C Took: 00:00:00.0021401.
Method D Took: 00:00:00.0005269.
 
Again obviously method A is the fastest, but in this 2rd test only after iteration. So my question is am I being too obsessed over this when I could just go with method A and sod GC, or is method D truly better even though I never see that much improvement after iteration and on larger arrays it struggles.
 
Thanks everyone for any feedback on the subject.
 
Aimee

 


PARTNERS