Here's the Deal - 2D tile based blit w/ scroll.

Started by
14 comments, last by pimstead 22 years, 8 months ago
Okay. Here's what I came up with for a scrolling algorithm that blits 32x32 tiles from a map onto a screen of any width and height. I can run two of these backgrounds full of tiles at 640x480 in 16-bit mode. Any more than that and the frame rate is cut in half. I've already gone through this code a couple of times and taken out almost all of the multiplies and all of the divides. If people could please take a look at this code and see if there is any way to speed it up, I would appreciate it. Here's the code. Width and Height are the width and height of the tile map in tiles. The map is walked with a pointer that is incremented rather than indexed (for example, Map[x][y]). As some of you may or may not know, indexing is very slow compared to stepping a pointer (just look at the asm it produces).
  
void GenericTileHandler(ObjectPtr Obj)
{
	int TileX, TileY, XStart, SrcLeftOffset, Width, Height;
	char *Map, *TempMap;
	LPDIRECTDRAWSURFACE	DDSurface;
	RECT SrcRect, DestRect;

	DDSurface = Obj->ObjectSurface;
	Width = Obj->Width;
	Height = Obj->Height;

	XStart = Obj->X_World >> 5;
	TileY = Obj->Y_World >> 5;
	TileX = XStart;
	SrcLeftOffset = Obj->X_World & 31;
	Map = (char*)Obj->Ptr1;
	Map += (TileY * Width) + TileX;
	TempMap = Map;

	SrcRect.top = Obj->Y_World & 31;
	SrcRect.bottom = 32;
	SrcRect.left = *TempMap << 5;
	SrcRect.right = SrcRect.left + 32;
	SrcRect.left += SrcLeftOffset;

	DestRect.top = 0;
	DestRect.bottom = SrcRect.bottom - SrcRect.top;
	DestRect.left = 0;
	DestRect.right = SrcRect.right - SrcRect.left;

	// y loop

	for(;;)
	{
		// x loop

		for(;;)
		{
			DDBackBufferSurface->Blt(&DestRect, DDSurface, &SrcRect, 0, NULL);
			
			// next screen column

			TileX++;
			TempMap++;
			DestRect.left = DestRect.right;
			// check if we are past the screen width or done rendering the available map

			if((DestRect.left >= ScreenWidth) || (TileX >= Width))
				break;
			// clip the right with the right edge of the screen if necessary

			if((DestRect.left + 32) >= ScreenWidth)
				DestRect.right = ScreenWidth;
			else
				DestRect.right = DestRect.left + 32;
			
			SrcRect.left = *TempMap << 5;
			// match the src clip with the dest clip

			SrcRect.right = (DestRect.right - DestRect.left) + SrcRect.left;
		}
		// next screen row

		TileY++;
		Map += Width;
		DestRect.top = DestRect.bottom;

		// check if we are past the screen height or done rendering the available map
		// (done before touching the map so we never read past its last row)

		if((DestRect.top >= ScreenHeight) || (TileY >= Height))
			break;

		TempMap = Map;
		SrcRect.top = 0;
		TileX = XStart;
		SrcRect.left = *TempMap << 5;
		SrcRect.right = SrcRect.left + 32;
		SrcRect.left += SrcLeftOffset;
		DestRect.left = 0;
		DestRect.right = SrcRect.right - SrcRect.left;

		// clip with the bottom of the screen if necessary

		if((DestRect.top + 32) >= ScreenHeight)
			DestRect.bottom = ScreenHeight;
		else
			DestRect.bottom = DestRect.top + 32;
		// match the src clip with the dest clip

		SrcRect.bottom = DestRect.bottom - DestRect.top;
	}
}
  
So again, any input on speeding this baby up would be appreciated. Realistically, a tile-based game should be able to handle 3 or 4 layers of tiles (although not necessarily a tile at every x,y), and the layers above would be blitted with transparency. So we need speed, speed, speed. For anyone reading this who is trying to figure out how to scroll and finds that the tutorials are kinda lame, please feel free to use this code. However, I am looking for some experienced people to give some feedback on speeding it up. Thanks in advance.
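To check the clipping arithmetic separately from DirectDraw, the horizontal part of the loop above can be sketched as a plain function that only produces numbers. This is a minimal sketch, not the posted routine: `ColumnSpan` and `BuildColumnSpans` are hypothetical names, and it assumes 32x32 tiles and a non-negative world X.

```cpp
#include <vector>

// Hypothetical helper type (not in the original post): one blit's worth of
// horizontal information for a single tile column.
struct ColumnSpan {
    int tileX;     // column index into the tile map
    int srcLeft;   // left edge inside the 32-pixel tile (0..31)
    int destLeft;  // left edge on the screen
    int width;     // number of pixels to copy (1..32)
};

// Computes the horizontal spans that one row of tiles occupies on screen.
// Assumes worldX >= 0 and a map that is mapWidth tiles wide.
std::vector<ColumnSpan> BuildColumnSpans(int worldX, int screenWidth, int mapWidth)
{
    std::vector<ColumnSpan> spans;
    int tileX = worldX >> 5;    // worldX / 32
    int srcLeft = worldX & 31;  // worldX % 32
    int destLeft = 0;

    while (destLeft < screenWidth && tileX < mapWidth) {
        int width = 32 - srcLeft;             // what is left of this tile
        if (destLeft + width > screenWidth)   // clip against the right edge
            width = screenWidth - destLeft;
        spans.push_back({tileX, srcLeft, destLeft, width});
        destLeft += width;
        ++tileX;
        srcLeft = 0;  // only the first column is partly scrolled off
    }
    return spans;
}
```

Because the spans are identical for every row in a frame, one possible saving is to build them once per frame and reuse them for each row, so that only the tile lookup and the vertical clip remain inside the row loop. Whether that is measurable next to the cost of the blits themselves would need profiling.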
You'd be better off just using an array containing the tile rects and another array containing the tile numbers. Don't worry about the speed, since even a Pentium 233 MMX can handle it with no slowdown.

rcTileRects[y][x]; // tile rects: 0,0,32,32  32,0,64,32  etc.
nTileLayer[y][x];  // tile numbers

for(y=0;y<480/32;y++)
{
	for(x=0;x<640/32;x++)
	{
		// Blit the tile here, starting at the XStart and YStart offset
	}
}
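The table of tile rects suggested above could be built once at load time, roughly as follows. This is a sketch under assumptions: the tile set is a single horizontal strip of 32x32 tiles (which is what `*TempMap << 5` in the original code implies), a plain struct stands in for the Win32 `RECT`, and `TileRect` and `BuildTileRects` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for the Win32 RECT so the sketch compiles anywhere.
struct TileRect {
    int left, top, right, bottom;
};

// Builds one source rect per tile id for a tile set stored as a single
// horizontal strip of 32x32 tiles: 0,0,32,32  32,0,64,32  and so on.
std::vector<TileRect> BuildTileRects(int tileCount)
{
    std::vector<TileRect> rects;
    rects.reserve(static_cast<std::size_t>(tileCount));
    for (int id = 0; id < tileCount; ++id) {
        int left = id << 5;  // id * 32
        rects.push_back({left, 0, left + 32, 32});
    }
    return rects;
}
```

A table like this only covers whole tiles. The partial tiles along the screen edges still have to be clipped each frame, because their size depends on the scroll position.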
I appreciate your response. However, this does not take several things into account. First, different screen sizes. Second, actual scrolling (when you scroll, the edges aren't whole tiles, and you will NEVER know in advance what your rect sizes are for the 4 edges). Third, speed IS an issue. You DO need to be able to run several simultaneous background layers without slowing the frame rate. Even this code (which has been somewhat optimized) isn't fast enough to run more than two full layers without losing frame rate.

If your suggestion was to precompute all the needed tile ids for the screen blit and precompute all of the rects for each tile blit, then that is a fine idea. The only problem is that it wouldn't be one iota faster, since you would still be precomputing every single frame.

More suggestions are appreciated. Also please limit the discussion to the implementation of the code. It is not necessary to go over memory implementations and surface locations, etc. I am just looking for ways to speed up the render cycle.

Thanks again.

No matter what system you use to blit the tiles, it will always slow down. The main slowdown factor is the actual ->Blt function itself, which in turn depends directly on the video card you use. Try it on a GeForce, which is capable of many layers of blits at frame rates ranging from 150 to 600, and you will find that you can have 4 layers with no noticeable slowdown at all. Just limit your FPS to the monitor's refresh rate (75 FPS).

As for the clipping, if the screen width is 640, you would blit 672 pixels' worth of tiles (one extra tile at the end) to account for the 32-pixel offset gap. Instead of calling ->Blt within your function, create your own blit routine that clips everything you throw at it. Trust me, this will save a hell of a lot of repeated code later on. Better still, create a graphics wrapper with alpha, clipping, etc. built in.
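A blit routine that clips everything thrown at it could start from a rect-clipping helper like the one below. It is only a sketch: `Rect` stands in for the Win32 `RECT`, `ClipBlit` is a hypothetical name, and it assumes the source and destination rects are the same size (no stretching). The caller would pass the clipped rects on to the real blit only when the function returns true.

```cpp
// Stand-in for the Win32 RECT so the sketch compiles anywhere.
struct Rect {
    int left, top, right, bottom;
};

// Clips dest against the screen (0,0)-(screenW,screenH) and trims src by
// the same number of pixels on each side. Returns false when nothing is
// left to draw. Assumes src and dest have the same width and height.
bool ClipBlit(Rect& dest, Rect& src, int screenW, int screenH)
{
    if (dest.left < 0) {
        src.left -= dest.left;  // dest.left is negative, so this moves right
        dest.left = 0;
    }
    if (dest.top < 0) {
        src.top -= dest.top;
        dest.top = 0;
    }
    if (dest.right > screenW) {
        src.right -= dest.right - screenW;
        dest.right = screenW;
    }
    if (dest.bottom > screenH) {
        src.bottom -= dest.bottom - screenH;
        dest.bottom = screenH;
    }
    return dest.left < dest.right && dest.top < dest.bottom;
}
```

With a helper like this, the tile loop can place whole 32x32 tiles, including ones that hang off the edges, and leave the edge cases to one routine.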

When calculating which tile goes where, start at 0:0 (minus the current 32-pixel offset) and work your way to 672:512 (or whatever your screen size is), incrementing 32 pixels at a time. Use your tile array to figure out which tile goes where using offset/32. If the current horizontal offset is, say, 1000 pixels, the first column is 1000/32 = 31, so you would read the array from nTile[0][31], then nTile[0][32], and so on until 672 is reached.
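The walk described above can be sketched as a function that lists where each whole tile would be placed, leaving the clipping to the blit routine. `TilePlacement` and `PlaceTiles` are hypothetical names, the scroll offsets are assumed to be non-negative pixel values, and map bounds are assumed to be checked by the caller.

```cpp
#include <vector>

// Hypothetical record: which map tile to draw and where its top-left
// corner lands on screen (may be negative for partly hidden tiles).
struct TilePlacement {
    int tileX, tileY;
    int destX, destY;
};

// Starts at minus the sub-tile offset and steps 32 pixels at a time until
// the screen is covered. Assumes scrollX and scrollY are >= 0.
std::vector<TilePlacement> PlaceTiles(int scrollX, int scrollY,
                                      int screenW, int screenH)
{
    std::vector<TilePlacement> out;
    const int firstTileX = scrollX / 32;
    const int firstTileY = scrollY / 32;
    const int startX = -(scrollX % 32);
    const int startY = -(scrollY % 32);

    for (int y = startY, ty = firstTileY; y < screenH; y += 32, ++ty) {
        for (int x = startX, tx = firstTileX; x < screenW; x += 32, ++tx) {
            out.push_back({tx, ty, x, y});
        }
    }
    return out;
}
```

For instance, a horizontal scroll of 40 pixels starts at map column 1 and draws it at x = -8, so a 640-pixel-wide screen needs 21 columns instead of 20.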

There was a scrolling tutorial somewhere on the web. I think it was a Dutch site (in English)...

What you should be doing is coding your routines as simply as possible so it's all working correctly, and then optimizing the code if performance is an issue, which it shouldn't be in this case, but it's up to you.
At this point the code I wrote works well. However, it does not yet take transparency and translucency into account, and it will eventually have to.

I did wait to optimize the code until it was working and then did the cleanup. I see a lot of people throwing around divides and multiplies like they were free candy. I'm not here to criticize, and I do quite appreciate the help. However, no professional company developing on a console (PS2, GameCube, AGB) will accept this coding. People need to get into the habit of using bitwise operations whenever they are dealing with powers of 2. Divide by 32 is >>5. Multiply by 64 is <<6. Mod by 8 is &7. These are very powerful coding techniques that will get you hired over another person, or get your code running faster, especially when you see the assembly that is generated by other coding.
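The three identities above can be written down and checked at compile time. This is a small sketch with hypothetical function names. It uses unsigned values on purpose: for negative signed values a shift and a divide can round differently, and `&` and `%` can disagree, so the substitution is only safe when the operand is known to be non-negative. Many optimizing compilers already make this substitution themselves when the divisor is a constant power of 2, so it is worth checking the generated asm before and after.

```cpp
#include <cstdint>

// Power-of-2 shortcuts for unsigned values (hypothetical helper names).
constexpr std::uint32_t DivideBy32(std::uint32_t v)   { return v >> 5; }
constexpr std::uint32_t MultiplyBy64(std::uint32_t v) { return v << 6; }
constexpr std::uint32_t ModBy8(std::uint32_t v)       { return v & 7u; }

// Compile-time checks that each shortcut matches the plain operator.
static_assert(DivideBy32(1000u) == 1000u / 32u, "shift matches divide");
static_assert(MultiplyBy64(3u) == 3u * 64u, "shift matches multiply");
static_assert(ModBy8(29u) == 29u % 8u, "mask matches modulo");
```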

I do understand that I can't write code to speed up the actual blit. However, all of the overhead of calculating which tile to blit where, and how to clip it for scrolling and screen size, is where you can speed up the code. I know that even this simple code, which blits a screen full of tiles and has to generate rects for clipping and scrolling, can create a performance hit. When you have a game like Starcraft that has tons of things happening at once and many objects on top of layers of tiles, a performance hit is just unacceptable. The code needs to be lean and mean. Anyone who doesn't care about performance down to the cycle and memory down to the byte will never program anything other than a PC and definitely won't develop games professionally. The AGB, for example, has 32K of internal RAM and only 256K of external RAM. Performance is just as important in the coding as the product.

If anyone can offer some insight into speeding up this code or making it more elegant (simpler, yet faster), I would appreciate it. If anyone really knows how professional companies writing 2D tiled DirectX games put their stuff together, I would appreciate knowing if I am even close. Otherwise, please don't point me to tutorials on this site. I appreciate their value for beginners, but they are nowhere near advanced enough to really dig into this subject.

Another thing that came out of this discussion is the need for an assembly section on this website. Even as Windows programmers writing DirectX code, we are not above writing assembly language routines where extra performance is required. I think it would be appropriate to have information on this site on setting up and calling DirectX and Windows functions from assembly.

Thanks again to the people who have helped out on this.
So no one can actually come up with anything?...
I'm not sure if this will help you at all, but have you tried doing this in D3D instead? I'm not good at C++ or DirectX (yet), but I noticed a big increase in speed when I coded my tile engine in D3D with textures instead of blocks of "sprites" and DirectDraw (I don't know the actual name for it in English).

Please mail me or add me on ICQ. I would be glad to discuss these techniques with you!

Best regards
Fredrich
I'm studying Computer Science in Linkoping, Sweden. I've been programming for about 8 years and I'm now out to learn how to code optimized C++ and DirectX.

I am working on a multi-layer tile engine, and something I've done that improved the speed is this: for every layer in front of the first, as I loop through my Y and X, I check whether each tile is tile zero. If it is, I skip it and go to the next tile. Since more and more of the tiles are zero/empty as the layers increase, it becomes quicker to do the check than to blit all those empty tiles.
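The empty-tile check described above might look like the following. This is a sketch only: `DrawUpperLayer` is a hypothetical name, the layer is assumed to be a flat row-major array of tile ids with 0 meaning empty, and the actual blit is passed in as a callback so the skipping logic can be tried without DirectDraw.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Walks one upper layer and calls blitTile(x, y, id) for every non-empty
// tile. Tile id 0 is treated as "empty" and skipped. Returns the number
// of blits issued. Assumes layer.size() is a multiple of width.
int DrawUpperLayer(const std::vector<unsigned char>& layer, int width,
                   const std::function<void(int, int, unsigned char)>& blitTile)
{
    int blits = 0;
    for (std::size_t i = 0; i < layer.size(); ++i) {
        const unsigned char id = layer[i];
        if (id == 0)
            continue;  // empty tile: one compare instead of one blit
        const int index = static_cast<int>(i);
        blitTile(index % width, index / width, id);
        ++blits;
    }
    return blits;
}
```

The return value makes it easy to see how sparse a layer really is, which is the condition under which the check pays for itself.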
pimstead:
Check out this site. It might help you.

Edited by - TookH on August 2, 2001 9:47:12 PM
"It tastes like burning..."
faster up with assembly ...
and you dont want 8-------D ????

This topic is closed to new replies.
