Any tricks to speed up CopyRects(), or an alternative?

Hi all, I'm writing a 2D game that uses 1024x1024 surfaces and copies map tiles onto them to render backgrounds. Here is the code I'm using. I have it set up so that a surface is only updated when the next surface in the direction your character is moving comes into view, with up to 4 surfaces in view at once. The update lags noticeably on slower machines. Are there any tricks I'm missing, or is there a better way? Would looping through with memset for each tile be any quicker? Thanks

Temp_Texture = *g_EmitterPin[m_iParentPinIndex].GetTexture( m_iFrameCount );

if ( D3D_OK != Temp_Texture->GetSurfaceLevel( 0, &pTexSurface ) )
    SetError( " Particle GetSurfaceLevel Failed " );

// Create a clean surface to clear the texture with.
LPDIRECT3DSURFACE8 pCleanSurface = 0;

if ( D3D_OK != g_pDevice->CreateImageSurface( 1024, 1024, D3DFMT_A8R8G8B8, &pCleanSurface ) )
    SetError( " Particle CreateImageSurface Failed " );

if ( D3D_OK != pCleanSurface->LockRect( &lockRect, NULL, 0 ) )
    SetError( " Particle LockRect Failed " );

memset( (BYTE*)lockRect.pBits, 0, 1024 * lockRect.Pitch );

if ( D3D_OK != pCleanSurface->UnlockRect() )
    SetError( " Particle UnlockRect Failed " );

if ( D3D_OK != g_pDevice->CopyRects( pCleanSurface, NULL, 0, pTexSurface, NULL ) )
    SetError( " Particle CopyRects Failed LTTS" );
if ( bTextureSet_2 == true )
{
    g_pDevice->CopyRects( TileSet_Trees, Rect_Source, 1024, pTexSurface, Point_Dest );
    g_pDevice->CopyRects( TileSet_Forest, Rect_Source, 1024, pTexSurface, Point_Dest );
}

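StretchRect() will take better advantage of hardware acceleration. Note that it is a Direct3D 9 device method, so using it means moving off the Direct3D 8 interfaces in your code. The downside is that it comes with more restrictions, some of which depend on the driver for your graphics hardware, so you may need multiple code paths.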
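
On the question of copying tiles by hand: that would be memcpy rather than memset, since memset only fills with a single byte value. A CPU-side copy into a locked surface has to go row by row, because each surface's pitch can be larger than its width in bytes. Here is a minimal sketch of that idea, assuming 32-bit D3DFMT_A8R8G8B8 pixels. LockedPixels and CopyTile are hypothetical names made up for illustration; in a real program the pointer and pitch would come from LockRect() on lockable surfaces. Whether this beats CopyRects() on a given machine would need measuring.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the pBits/Pitch pair that LockRect() fills in.
struct LockedPixels {
    std::uint8_t* bits;   // first byte of the top-left pixel
    std::size_t   pitch;  // bytes per row; may exceed width * 4
};

constexpr std::size_t kBytesPerPixel = 4;  // assumes D3DFMT_A8R8G8B8

// Copy a w x h pixel block from (sx, sy) in src to (dx, dy) in dst.
// One memcpy per row, so each surface's own pitch is respected.
void CopyTile(const LockedPixels& src, std::size_t sx, std::size_t sy,
              const LockedPixels& dst, std::size_t dx, std::size_t dy,
              std::size_t w, std::size_t h)
{
    const std::size_t rowBytes = w * kBytesPerPixel;

    // The block must fit inside one row of each surface.
    assert(src.bits != nullptr && dst.bits != nullptr);
    assert(sx * kBytesPerPixel + rowBytes <= src.pitch);
    assert(dx * kBytesPerPixel + rowBytes <= dst.pitch);

    const std::uint8_t* s = src.bits + sy * src.pitch + sx * kBytesPerPixel;
    std::uint8_t*       d = dst.bits + dy * dst.pitch + dx * kBytesPerPixel;

    for (std::size_t row = 0; row < h; ++row) {
        std::memcpy(d, s, rowBytes);
        s += src.pitch;  // advance by pitch, not by width
        d += dst.pitch;
    }
}
```

With real surfaces you would lock the tile set and the destination once, call CopyTile for every tile that needs drawing, and then unlock both, rather than locking per tile.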

