Sign in to follow this  

need help optimising...

This topic is 4558 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

What up. I have just written some code to rotate a sprite using SDL and it works quite well...but unfortunately its a little slow. I've done my best with optimisation (with what i know so far...heh), I was just wondering if anyone can offer some tips on how to get it a bit faster. Heres the prototype SDL_Surface* rotate(int degrees, int x0=33, int y0=33) degrees- -degrees to rotate sprite x0,y0 - coordinates of axis of rotation Returns a pointer to a surface containing the rotated sprite. and the code.... ---------------------------------------------------------------------
SDL_Surface* rotate(int degrees, int x0=33,  int y0=33)
{
	SDL_Surface *temp_src,*temp_dst;
	//The variables used to store coord data
	double x2,y2,fx,fy,w1,w2,w3,w4;
	int x1,y1;
	int x2a,x2b,y2a,y2b;
	double cosined,sined;

	//Need four variables for surrounding pixels (bilinear filtering)
	Uint32	Ca=0,	Cb=0,
			Cc=0,	Cd=0;

	//Our resultant pixel
	Uint32 Pixel=0;

	//Sprite data (height and width)
	int h=65;
	int w=65;

	//Absolute axis coords
	unsigned int absol_coord_dst;

	//assign some variables for masking
	Uint32 rmask,gmask,bmask,amask;
	#if SDL_BYTEORDER == SDL_BIG_ENDIAN
	rmask = 0xff000000;
	gmask = 0x00ff0000;
	bmask = 0x0000ff00;
	amask = 0x000000ff;
	#else
	rmask = 0x000000ff;
	gmask = 0x0000ff00;
	bmask = 0x00ff0000;
	amask = 0xff000000;
	#endif

	//If there is no rotation,return original sprite
	if (degrees==0) return(basesprite);
	
	//copy the original sprite to temp
	temp_src = SDL_DisplayFormat(basesprite);
	temp_dst = SDL_DisplayFormat(basesprite);
	
	//Lock both surfaces so we can read/write individual pixels
	SDL_LockSurface(temp_src);
	SDL_LockSurface(temp_dst);
	
	
	//Precalc sin/cos values
	cosined=cos((360-degrees)*PI/180);
	sined=sin((360-degrees)*PI/180);

	//For each pixel in our rotated sprite
	for(y1=0;y1<h;y1++)
	{
		for(x1=0;x1<w;x1++)
		{

			//Find its equiv x,y location before the rotation
			x2=(cosined*(x1-x0)-sined*(y1-y0)+x0);
			y2=(sined*(x1-x0)+cosined*(y1-y0)+y0);
					
			//If it was out of bounds, set pixel to transparent
			if ((x2>64)|(x2<0)|(y2>64)|(y2<0)) 
			{
				Pixel=gmask; //should be ColourKey! Set to green for now
			}
			else
			{
				//BILINEAR FILTERING
				//-------------------
				//Now find the four pixels around x2,y2 imaginary location (ie its still a float)
				x2a=(int)x2;x2b=x2a+1;
				y2a=(int)y2;y2b=y2a+1;

				//Used for weightings
				fx=x2-x2a;
				fy=y2-y2a;
				w1=(1-fx)*(1-fy);
				w2=(fx)*(1-fy);
				w3=(1-fx)*(fy);
				w4=(fx)*(fy);

				//Assign the four surrounding pixels
				Ca=((unsigned int *)temp_src->pixels)[x2a + y2a * w];
				Cb=((unsigned int *)temp_src->pixels)[x2b + y2a * w];
				Cc=((unsigned int *)temp_src->pixels)[x2a + y2b * w];
				Cd=((unsigned int *)temp_src->pixels)[x2b + y2b * w];


				//Calculate the weighted average of each channel
				Pixel=	  (Uint32)(  ( w1*(rmask&Ca) ) + ( w2*(rmask&Cb) ) + ( w3*(rmask&Cc) ) + ( w4*(rmask&Cd) ) )
						| ((Uint32)(  ( w1*( (gmask&Ca) >>8 ) ) + ( w2*( (gmask&Cb) >>8 ) ) + ( w3*( (gmask&Cc)>>8 ) ) + ( w4*( (gmask&Cd) >>8 ) ))<<8)
						| ((Uint32)(  ( w1*((bmask&Ca)>>16) ) + ( w2*((bmask&Cb)>>16) ) + ( w3*((bmask&Cc)>>16) ) + ( w4*( (bmask&Cd)>>16 ) )) <<16);
				
				
			}

			//and store the pixel in the sprite			
			((unsigned int *)temp_dst->pixels)[x1 + y1 * w]=Pixel;


		}
	}
	
	
	//We are finished with the pixels
	SDL_UnlockSurface(temp_src);
	SDL_UnlockSurface(temp_dst);
	
	//copy the result to "sprite" surface
	sprite=SDL_DisplayFormat(temp_dst);
	
	//Dont need temps anymore
	SDL_FreeSurface(temp_src);
	SDL_FreeSurface(temp_dst);

	//Return a pointer to the surface...
	return(sprite);


}

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
sry i couldn't resist :) anyways, i heard some bad stuff with sdl thats why i said that, but it seems in the lines were you have
cosined=cos((360-degrees)*PI/180);
sined=sin((360-degrees)*PI/180);
you should already have PI/180 figured out, sry i dont know about sdl as much as i should to help you.

Share this post


Link to post
Share on other sites
Before optimizing, you need to know what parts of the code are slow. I would suggest running it through a profiler (or sprinkling some timing information through the function). With that information, you'll be able to tell whether the problem is in your calculations, the SDL surface functions, or whatever.

Share this post


Link to post
Share on other sites
Quote:
Original post by Anonymous Poster
you should already have PI/180 figured out, sry i dont know about sdl as much as i should to help you.

The compiler will already do that for you.

Share this post


Link to post
Share on other sites
Just one little thing that jumped out at me:

if ((x2>64)|(x2<0)|(y2>64)|(y2<0)) 


This is a bitwise Or operation. It forces the compiler to evaluate every single part of the conditional before deciding which branch to take. If you use a logical Or (the || operator) the compiler will short-circuit the conditional and abort as soon as it figures out what to do; this will minimize the number of checks you have to make. It won't make a massive difference, but it's important to understand the difference when writing performance-critical code.

For best results, change it to something like:


if((x2 < 0) || (y2 < 0) || (x2 > 64) || (y2 > 64))
Pixel = gmask;
else
{
// ... Regular code here
}



Chances are your slowness is coming from either repeatedly locking/unlocking the surfaces, allocating and freeing temp surfaces, or a mixture of both. If you can find ways to reduce or eliminate those operations as much as possible you'll get a noticeable speed boost. Especially if you are wanting to rotate a sprite per-frame (or many sprites per-frame) this will incur a major penalty. (And, obviously, as Konfusius said, software rotation is slow. Always optimize your algorithms and design before optimizing individual lines of code.)

Share this post


Link to post
Share on other sites
Thanks for the help guys.

I'll look into learning how to get timing information fom VC++.

One more quick question. I put the following code in to speed up the
bilinear filtering (Instead of interpolating for every pixel), and it seems to have speeded up the code slightly.

For some reason i cant figure out, If i replace Pixel=Ca; with Pixel=rmask;, the affected pixels go blue....and vice versa (bmask makes them go red). I have already copied some code to change their definitions depending on if its little endian or not (see my first post)...any ideas why this is happening?


//If its surrounded by identical pixels...dont interpolate
if((Ca==Cb)&(Cb==Cc)&(Cc==Cd))
{
Pixel=Ca;
}
else
{
//Calculate the weighted average of each channel
Pixel= (Uint32)( ( w1*(rmask&Ca) ) + ( w2*(rmask&Cb) ) + ( w3*(rmask&Cc) ) + ( w4*(rmask&Cd) ) )
| ((Uint32)( ( w1*( (gmask&Ca) >>8 ) ) + ( w2*( (gmask&Cb) >>8 ) ) + ( w3*( (gmask&Cc)>>8 ) ) + ( w4*( (gmask&Cd) >>8 ) ))<<8)
| ((Uint32)( ( w1*((bmask&Ca)>>16) ) + ( w2*((bmask&Cb)>>16) ) + ( w3*((bmask&Cc)>>16) ) + ( w4*( (bmask&Cd)>>16 ) )) <<16);
}

Share this post


Link to post
Share on other sites
Weighted average of 4 pixels?

Hmmm.
I don't know many optimisations for that.

Are you sure you can't get a simple average. (i can do normal averages really fast).

Normal average
pixel = ((((w + x) + (y + z))) & mask) >> 2); //Mask gets rid of pixels that move from red to green, and from green to blue during the shift.

Note:
If you make w and x, into one 16 bit int, you can get it to work faster.
There will be leak through. (for eg. if you have a lot of red, it may leak through to the green if theres heaps of it. But you won't see it much. At most it will change the lsb. That might cause a flow through effect if you set all channels to max tho. Be sure to hold it in a big enough variable, just in case.)

Also note, you do this one the ENTIRE PIXEL. (not per channel. This is what makes it fast).

Note, the >> and the mask do three divides. (for each channel).

All over, compare 3 adds, a bitshift and an and, with having to do (per channel,
4 adds and a divide).

Mask, just has a set of zeros where the bits would move into the next channell.

Now, weighted average, i'd have to think some....

From,
Nice coder

Share this post


Link to post
Share on other sites
x2a=(int)x2;x2b=x2a+1;
y2a=(int)y2;y2b=y2a+1;



//Used for weightings
fx=x2-x2a; //So this is how far it is from an int.

fy=y2-y2a; //Same

w1=(1-fx)*(1-fy);

w2=(fx)*(1-fy);

w3=(1-fx)*(fy);

w4=(fx)*(fy);

I don't know why your doing this. Mind enlightening me? (hey, i'm optimising your code!)

As for neibouring pixels
pixel[x + 1]
pixel[x - 1]
pixel[x + row]
pixel[x - row]

Where row is the length of one row.

Care has to be taken for the edjes of the image. (shove a border around it, and start, using the border as the neibour for the first iteration).

From,
Nice coder

Share this post


Link to post
Share on other sites
Hey nice coder,

The weighted average is a part of a process called "bilinear filtering", basically instead of taking each pixel in the original and rotating it, you take each pixel in the final image and reverse the rotation to find its colour. Most of the time the rotation doesn't give integer coordinates, so to find the equivalent colour at that (non-int) position, you take a weighted average of the pixels around it.
see http://www.flipcode.com/articles/article_bilinearfiltering.shtml

You're right in that its not the fastest way of performing a rotation, but it does give pretty nice results.

The reason i'm trying to get it faster is because i wanted to be able to rotate a little spaceship in any direction realtime. At the moment it could probably handle that, but not if i need to rotate several frames of an animated sprite.

Share this post


Link to post
Share on other sites
Are you using fast sin/cos functions?

** edit **
If you are using non-optimized sin/cos routines, what I would recommend is that you create yourself some lookup tables:
float sin_lookup[361]; // 0-360 inclusive
float cos_lookup[361]; // 0-360 inclusive

fill them in at runtime via a loop, and then you may instantly gain the sin/cos of any whole angle:
sin_lookup[angle%360];

If you need more accuracy one option is to linearly interpolate between two values in the lookup.
int i_angle = (int)angle;
float dec = angle – _floor(angle);

float sin = sin_lookup[i_angle]*(1-dec) + sin_lookup[i_angle+1]*dec;

HTH


[Edited by - CyberFox on June 24, 2005 1:00:37 PM]

Share this post


Link to post
Share on other sites
Optimization questions really shouldn't go in For Beginners. :(

Quote:

x2=(cosined*(x1-x0)-sined*(y1-y0)+x0);
y2=(sined*(x1-x0)+cosined*(y1-y0)+y0);


Since x0/y0 are constant and x1/y1 vary linearly, it should be possible to avoid the multiplications here by doing accumulation instead. Outside the loop, initialize:


x2 = sined*y0 - cosined*x0 + x0;
y2 = y0 - sined*x0 - cosined*y0;


Then adjust each by the appropriate sined/cosined on each iteration (remembering to reset at the end of each "row").

OTOH, your compiler might be good enough to do that for you.

Other than that, working indirectly through surfaces is quite possibly the biggest speed hit here. Try caching basesprite->pixels (or whatever), and working with that directly - rotate "into" a local Uint32[] of the same size, and copy the data back to the sprite's pixel data.

Share this post


Link to post
Share on other sites

This topic is 4558 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this