Back to For Beginners

Slow execution of my code ???

For Beginners

Started by signal_ October 28, 2007 01:14 AM

8 comments, last by Zahlman 16 years, 5 months ago

signal_

366

Author

October 28, 2007 01:14 AM

I have recently begun a life simulator project using SDL and C++ in order to practice and learn. My simulation takes place on a pond which houses a thriving population of toads (predator) and a swarm of flies (prey). The game loop does the following: 1. flies move (algorithm to process data & graphical representation) 2. toads move (algorithm to process data & graphical representation) 3. flies reproduce 4. each toad targets its fly victim 5. check to see if victim was eaten & award corresponding energy (calories) 6. check to see if toads mate 7. update the toads' ages/energy status Each numbered point roughly corresponds to a function. When I run the code with small numbers of flies and toads (created in C++ by vectors) the program runs swiftly (i.e. 500 flies and 50 toads). But if the creature levels rise to large numbers (i.e. 1000 flies, 200 toads or more) the program slows to a crawl and the creatures move very slowly. I thought that perhaps it was the graphical output that was slowing the code down so I commented it out. This turned out to not be the bottleneck. Then I looked at the fly reproduction function. The algorithm in pseudo code looks like:



for (cr = 0; cr <= flyNum && flyNum >= 0; cr++)
           {
           for (cr2 = 0; cr2 <= flyNum && flyNum >= 0 && birthflag <= 20; cr2++)
                 {
                     // do fly reproduction stuff here

                 }
            }

Basically, the code above checks to see if their are two flies sharing the same location on the game grid. If there are, then the flies procreate and new flies are born. But if flyNum = N and N >> 0, then there are N^2 checks (is this correct?). This slows the code down considerably. I have confirmed that when I comment out the fly reproduction code everything runs much faster. So I know this is the culprit. But, if I comment out the reproduction code and also initialize flyNum = number flies to be a very large integer, say 50,000, the code then crawls again. Is this just a limitation of the processor or am I using the processor terribly ineffectually? Please help anyone that can. Also, please note the embedded for loop above. This must be a common issue, so are there commonly known work-arounds that I am ignorant of? Thanks for any help that can be given. Edit: I guess the basic question I am asking is that is there a limitation to the number of game objects?

ToohrVyk

1,596

October 28, 2007 01:28 AM

There is an inherent limitation due to the speed of your code. You can optimize it to increase its performance. For instance, your O(n²) algorithm for flies could easily be converted to an O(n) one using simple techniques (such as storing fly positions on the grid, or using a sweeping algorithm for detecting flies on the same position). You probably have a lot of other places in your code which are not optimized.

I would suggest using a profiler to find bottlenecks (it's faster than commenting out things by hand) and optimizing these bottlenecks until you have a fast enough program.

Blog — Facebook

signal_

366

Author

October 28, 2007 01:34 AM

Thank you, ToohrVyk. I have suspected that someone would respond to my question in a manner as you did. It is very helpful. I was mostly looking to confirm my suspicions that my code is inefficient. So, I will pursue the recommendations that you gave.

However, what is a profiler? Care to elaborate? I am scouring google as we speak.

Thanks again.

shadwdrake

121

October 28, 2007 01:48 AM

Just a quick fix for your posted nested loop.

When you are looping an array over itself you can make one quick optimization.

in the second loop, change the constructor of cr2 from cr2=0 to cr2=cr+1.

Since you are just wanting to compare all the values of the array against each other, you only want to do that once.

You just need to check your initial array variable against the elements that come after it in the array. Elements before it will have already been checked against it.

Here is a breakdown of how it would compare.

Save you had a 5 element vector of the following [1,2,3,4,5].

Using the trick I posted above you'd get compares like this

the first loop would only run 4 times since you don't compare the last element
to anything else (its already been compared to all previous elements).

outer-loop 1: 1->2, 1->3, 1->4, 1->5.
outer-loop 2: 2->3, 2->4, 2->5. (2 has already been compared to 1 previously)
outer-loop 3: 3->4, 3->5. (same for 3->1 and 3->2)
outer-loop 4: 4->5 (same as above)

You can see where you'd see your code speed up since you are not doing duplicate compares anymore.

ToohrVyk

1,596

October 28, 2007 02:02 AM

A profiler is a tool to measure the execution speed of portions of your program. Common examples include gprof and DevPartner.

Blog — Facebook

Anonymous P

429

October 28, 2007 08:13 AM

Quote:
Basically, the code above checks to see if their are two flies sharing the same location on the game grid. If there are, then the flies procreate and new flies are born. But if flyNum = N and N >> 0, then there are N^2 checks (is this correct?). This slows the code down considerably.

Yep. Depending on what you're doing, quick nearest neighbor searching can be pretty involved. In fact, searching for nearest neighbors in high dimensional spaces quickly is a research topic. In 2-d, you might use a metric tree such as a kd tree or a quadtree.

But you're only looking for flies on the same grid square, if I understand correctly. This case is much simpler; you'd like a function that takes some flies (as a vector, for example), finds the set of flies in a given square (there might be several) and does something with them, like crossing them over or something. (aside: this operation could be quadratic as long as the number of flies per grid square is low).

I'm assuming, for demonstration purposes, that frogs have x, y integer coordinates denoting their grid position.

One way to this is to consider the quadrants as bins, and use a hash table to efficiently pack each frog into its corresponding bin, and later go through the bins.

In python, this is very straightforward, and fast enough:

class Frog:    def __init__(self, x, y):        self.x = x        self.y = ygrid_dim = 1000bins = dict()frogs = []for i in range(50000):    frogs.append(Frog(i, i))start = time.time()for f in frogs:    flat = f.x + (f.y * grid_dim)    if flat in bins:        bins[flat].append(f)    else:        bins[flat] = [f]print time.time() - start

If you feel the pressing need to inflict C++ upon yourself instead of basking in the warm glow of python and don't have a standard hashtable type (yet) there is another nice approach you can use.

You can sort the frogs by their bin position (any ordering will do as long as you preserve equality) and extract the consecutive ranges of frogs with equal bins from the sorted list.

You do have the STL in C++, which is pretty pleasant, and I need the practice (disclaimer: I am not a C++ programmer, so this may be horrid C++ code):

#include <iostream>#include <cstdlib>#include <vector>#include<algorithm>using namespace std;struct Fly{	//fly stuff goes here	int fly_id;	vector<int> coords;	Fly(int id, int x, int y);};Fly::Fly(int id, int x, int y){	fly_id = id;	coords = vector<int>(2);	coords[0] = x;	coords[1] = y;}inline bool fly_lt(const Fly *f1, const Fly *f2){  	return f1->coords[0] < f2->coords[0] || f1->coords[0] == f2->coords[0] && f1->coords[1] < f2->coords[1];}inline bool fly_eq(const Fly *f1, const Fly *f2){ 	return f1->coords == f2->coords;}void neighbor_flies(vector<Fly*> &flies, bool print_out) {		sort(flies.begin(), flies.end(), fly_lt);		vector<Fly*>::iterator current = flies.begin();	pair< vector<Fly*>::iterator, vector<Fly*>::iterator > bounds;		//bounds delimit flies who live in the same bin		while(current != flies.end()){		bounds = equal_range(current, flies.end(), *current, fly_lt); 		for(vector<Fly*>::iterator i = bounds.first; i < bounds.second; i++)			if(print_out) cout << (*i)->fly_id << endl;				current = bounds.second;				if(print_out) cout << endl;	}}int main(int argc, char *argv[]){	vector<Fly*> flies; //Example flies	flies.push_back(new Fly(1, 1, 1));	flies.push_back(new Fly(2, 1, 1));	flies.push_back(new Fly(3, 1, 2));	flies.push_back(new Fly(4, 1, 1));	neighbor_flies(flies, 1); //Do the bins look correct?		for(int i = 5000000; i > 0; i--)		flies.push_back(new Fly(i, i, i)); 		int e;	cout << "allocation done" << endl;	cin >> e;	neighbor_flies(flies, 0); //Let's see how long 5000000 flies take, once the flies are allocated  return EXIT_SUCCESS;}

signal_

366

Author

October 28, 2007 05:18 PM

Thanks for the help, ToohrVyk, shadwdrake, and Anonymous P. I am glad to see that there are solutions to the problems with my code. I have read yr posts and am thinking about the possible solution.

So I guess the moral of the story is to always replace N^2 type operations with operations that scale with N instead of N^2 when dealing with many game objects.

Kylotan

10,510

October 29, 2007 05:05 AM

Not always: if you can replace them with solutions that don't need to scale at all - eg. hash tables (and Python's dictionary class) - then that is better than solutions that scale with N! But yes, the first rule when dealing with large quantities is to pick the most effective data structures and algorithms.

signal_

366

Author

October 29, 2007 03:28 PM

Hi Kylotan:

Quote:Original post by Kylotan
Not always: if you can replace them with solutions that don't need to scale at all

Indeed. Even a programming dummy like myself can grasp this concept. =) I just had to shed my brute-force-stubbornness first.

A solution like the one you described is the most elegant and efficient. Nascent solutions have already been implanted in my brain (oops, that last sentence sounds like it came out of a William Gibson novel). Yesterday, I had spent most of the day, before watching the victorious Red Sox, reading about optimization and profiling, which, of course, led me to data structures and algorithms.

I was not a comp sci student at the uni so I have only been briefly exposed to some of this stuff (like when I attended a colloquium/lecture, etc). I did not realize how important and relevant this subject is because I assumed the processors would be so powerful that efficiency could be neglected.... This is not the first time I have been mistaken. =)

I am now looking for a book on this subject (data structures & algorithms) to facilitate an all-out assault on my inefficient code. I will post my solution later for critique, if you guys are game. Thanks for all the help. I'm now off to the casino....

Edit: I forgot to ask, how many game objects are possible in a game? Let's assume low-level graphics, and game objects have requisite data - like position, type, energy, age, etc. - and move around interacting with one another. Further, let's assume efficient code (a huge assumption for my programs =) ). Could we have 50,000 game objects, 100,000, more/less? I am curious since this impacts my project in terms of its eventual direction.

Zahlman

1,682

October 29, 2007 03:50 PM

Quote:Original post by Anonymous P
In python, this is very straightforward, and fast enough:

class Frog:    def __init__(self, x, y):        self.x = x        self.y = ygrid_dim = 1000bins = dict()frogs = []for i in range(50000):    frogs.append(Frog(i, i))start = time.time()for f in frogs:    flat = f.x + (f.y * grid_dim)    if flat in bins:        bins[flat].append(f)    else:        bins[flat] = [f]print time.time() - start

Just thought I'd take the opportunity to teach some Python tricks :)

from time import time# Of course, I understand that this is a placeholder for something more complex,# but I would normally just use a tuple to start and make the class later. :)class Frog:    def __init__(self, x, y):        self.x = x        self.y = ygrid_dim = 1000bins = {}frogs = [Frog(i, i) for i in range(50000)]start = time()for f in frogs:    flat = f.x + (f.y * grid_dim)    bins.setdefault(flat, []).append(f)print time() - start

Slow execution of my code ???

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Slow execution of my code ???

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines