gamelife

memcpy using SSE in linux


I had some free time recently and wrote an SSE version of memcpy myself. I tested it on Linux using KDevelop/g++ and found it to be about 20% faster than the standard memcpy. I measured the time with clock_gettime( CLOCK_THREAD_CPUTIME_ID ).
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>

typedef unsigned char uchar;

// Copies nBytes from pSrc to pDest using 16-byte SSE loads and stores.
// Requires nBytes >= 79 (up to 15 alignment bytes plus one 64-byte block)
// and, like memcpy, buffers that do not overlap.
void* memcpy_sse( uchar* pDest, const uchar* pSrc, size_t nBytes )
{
	assert( nBytes >= (15 + 64) );
	void* pDestOrg = pDest;

	// Copy the 0..15 leading bytes needed to make pDest 16-byte aligned.
	size_t nAlignDest = (16 - (uintptr_t)pDest) & 15;
	memcpy( pDest, pSrc, nAlignDest );
	pDest += nAlignDest;
	pSrc  += nAlignDest;
	nBytes -= nAlignDest;

	size_t nLoops = nBytes >> 6; // no. of loops to copy 64 bytes
	nBytes -= nLoops << 6;       // bytes left over for the tail copy
	if( ((uintptr_t)pSrc & 15) == 0 )
	{
		// Source is 16-byte aligned as well: aligned loads, aligned stores.
		for( size_t i = nLoops; i > 0; --i )
		{
			__m128 tmp0 = _mm_load_ps( (const float*)(pSrc + 0 ) );
			__m128 tmp1 = _mm_load_ps( (const float*)(pSrc + 16) );
			__m128 tmp2 = _mm_load_ps( (const float*)(pSrc + 32) );
			__m128 tmp3 = _mm_load_ps( (const float*)(pSrc + 48) );
			_mm_store_ps( (float*)(pDest + 0 ), tmp0 );
			_mm_store_ps( (float*)(pDest + 16), tmp1 );
			_mm_store_ps( (float*)(pDest + 32), tmp2 );
			_mm_store_ps( (float*)(pDest + 48), tmp3 );
			pSrc  += 64;
			pDest += 64;
		}
	}
	else
	{
		// Source is misaligned: unaligned loads, aligned stores.
		for( size_t i = nLoops; i > 0; --i )
		{
			__m128 tmp0 = _mm_loadu_ps( (const float*)(pSrc + 0 ) );
			__m128 tmp1 = _mm_loadu_ps( (const float*)(pSrc + 16) );
			__m128 tmp2 = _mm_loadu_ps( (const float*)(pSrc + 32) );
			__m128 tmp3 = _mm_loadu_ps( (const float*)(pSrc + 48) );
			_mm_store_ps( (float*)(pDest + 0 ), tmp0 );
			_mm_store_ps( (float*)(pDest + 16), tmp1 );
			_mm_store_ps( (float*)(pDest + 32), tmp2 );
			_mm_store_ps( (float*)(pDest + 48), tmp3 );
			pSrc  += 64;
			pDest += 64;
		}
	}
	// Copy the remaining 0..63 bytes.
	memcpy( pDest, pSrc, nBytes );
	return pDestOrg;
}
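
To make the head / 64-byte blocks / tail split easier to check by hand, here is a small sketch of just the arithmetic, separated from the SSE code. The names bytes_to_align16, CopyPlan and plan_copy are made up for illustration and are not part of the function above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes needed to advance `addr` to the next 16-byte boundary
// (0 if it is already aligned). Same arithmetic as nAlignDest in the post.
inline std::size_t bytes_to_align16(std::uintptr_t addr)
{
    return static_cast<std::size_t>((16 - addr) & 15);
}

// Hypothetical helper type: how a copy of nBytes is divided up.
struct CopyPlan
{
    std::size_t head;   // bytes copied before the destination is aligned
    std::size_t blocks; // number of 64-byte SSE iterations
    std::size_t tail;   // bytes left for the final memcpy
};

inline CopyPlan plan_copy(std::uintptr_t destAddr, std::size_t nBytes)
{
    CopyPlan p;
    p.head = bytes_to_align16(destAddr);
    assert(nBytes >= p.head);            // the post guarantees this via nBytes >= 79
    std::size_t rest = nBytes - p.head;
    p.blocks = rest >> 6;                // rest / 64
    p.tail   = rest & 63;                // rest % 64
    return p;
}
```

For example, a destination address ending in 0x3 with 200 bytes to copy gives a 13-byte head, two 64-byte blocks and a 59-byte tail.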

I'm not sure whether it's suitable for a production environment. Comments welcome.
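
For anyone who wants to reproduce the measurement, below is a minimal sketch of a timing harness built on clock_gettime( CLOCK_THREAD_CPUTIME_ID ). It is not the code used for the 20% figure. The names CopyFn, copy_std, thread_cpu_seconds and time_copy are assumptions of this sketch, and the empty asm statement is a GCC/Clang-specific barrier that keeps the compiler from dropping the repeated copies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <time.h>
#include <vector>

// Hypothetical common signature for the copy routines being compared.
typedef void* (*CopyFn)(void*, const void*, std::size_t);

// Baseline: the standard library memcpy behind the common signature.
inline void* copy_std(void* dst, const void* src, std::size_t n)
{
    return std::memcpy(dst, src, n);
}

// CPU time consumed so far by the calling thread, in seconds.
inline double thread_cpu_seconds()
{
    timespec ts;
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

// Average CPU seconds per call of fn when copying nBytes, over nIters calls.
inline double time_copy(CopyFn fn, std::size_t nBytes, int nIters)
{
    assert(nBytes > 0 && nIters > 0);
    std::vector<unsigned char> src(nBytes, 0xAB), dst(nBytes, 0);

    fn(dst.data(), src.data(), nBytes); // warm-up: touch pages, fill caches

    double t0 = thread_cpu_seconds();
    for (int i = 0; i < nIters; ++i)
    {
        fn(dst.data(), src.data(), nBytes);
        // Compiler barrier (GCC/Clang syntax): treat dst as observed.
        asm volatile("" : : "r"(dst.data()) : "memory");
    }
    double t1 = thread_cpu_seconds();
    return (t1 - t0) / nIters;
}
```

To time memcpy_sse with this, wrap it in a small adapter that casts the void pointers to uchar pointers, then compare time_copy( adapter, n, iters ) against time_copy( copy_std, n, iters ) for several sizes and for both aligned and misaligned sources. The results depend heavily on buffer size relative to the cache, so a single number such as 20% may not hold across sizes.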
