
jeffakew

fast memcpy


Hello. Could anyone give me some help with improving the performance of the standard memcpy() function, please? My game runs about 25% faster without it. I know that memcpy works with bytes only, but I don't know asm, so I can't write a function to copy qwords. What can I do? Any help would be much appreciated. Thanks.

You can copy words at a time, but sorry, I don't know how.
What OS are you programming for, and what are you trying to copy memory to? If it's in DOS I can help.

I don't know if this helps, but in C you could do this:

        

void copy (long *from, long *to, int nlongs)
{
int blocks = nlongs / 100;
int remainder = nlongs % 100;

while (blocks--) {
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
*to++ = *from++;
}

// do the rest


while (remainder--)
*to++ = *from++;
}


Edited by - bishop_pass on June 19, 2000 2:03:53 AM

Hi,
Thanks a lot. I'm at work right now, but when I get home I'll try that function. I'm programming for Win32 and I'm trying to copy a system memory buffer to VRAM just by locking the VRAM. The pitch of the VRAM is the same as that of the system memory buffer. Surely once the VRAM is locked, writing to it is just the same as in DOS? Thanks again for your help.
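For a surface copy like the one described here, a minimal sketch in plain C might look as follows. It assumes the locked VRAM is reachable through an ordinary pointer and that the pitch is the byte distance between row starts. The name copy_surface and all of its parameters are hypothetical, not taken from any API. When both pitches match and the rows are contiguous, the whole surface goes in one memcpy call; otherwise each row is copied separately.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `height` rows of `row_bytes` bytes each.
   src_pitch and dst_pitch are the byte distances between row starts. */
void copy_surface(unsigned char *dst, size_t dst_pitch,
                  const unsigned char *src, size_t src_pitch,
                  size_t row_bytes, size_t height)
{
    if (dst_pitch == src_pitch && row_bytes == src_pitch) {
        /* Rows are contiguous in both buffers: one large copy is enough. */
        memcpy(dst, src, row_bytes * height);
        return;
    }
    /* Otherwise skip the padding at the end of each row. */
    for (size_t y = 0; y < height; y++) {
        memcpy(dst + y * dst_pitch, src + y * src_pitch, row_bytes);
    }
}
```

With equal pitches and no row padding this reduces to a single large copy, which is usually the cheapest case.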

Hi,

Don't use the function above. It's slow! He's just copying one byte after another...

to copy fast (paste this stuff):

----- cut here ------

void mov2scr_32(unsigned char *source,unsigned char *dest,unsigned long count)
{
__asm
{
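mov esi,source
mov edi,dest
mov ebx,count
mov edx,edi
and edx,11b          ; edx = dest address mod 4
jz m2s_memaligned    ; dest is already dword-aligned
mov ecx,4
sub ecx,edx          ; ecx = bytes needed to reach alignment
cmp ecx,ebx
jbe m2s_headok
mov ecx,ebx          ; never copy more than count bytes
m2s_headok:
sub ebx,ecx          ; adjust count first: rep movsb leaves ecx = 0
rep movsb            ; copy the unaligned leading bytes

m2s_memaligned:
mov edx,ebx
and edx,11b          ; edx = trailing bytes after the dword copy
mov ecx,ebx
shr ecx,2            ; ecx = number of whole dwords
rep movsd            ; fast dword copy
mov ecx,edx
rep movsb            ; copy the trailing bytes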
}
}

------ cut here -----

Just pass the memory pointers and the number of bytes to be copied to the function. That's all.
The function works out how many dwords to copy and how many bytes remain, so it uses the fast dword copy wherever possible.
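The split described here can be written out as plain arithmetic. The sketch below is only an illustration of that bookkeeping in C: the struct and function names are hypothetical, and it assumes 4-byte dwords with alignment decided by the destination address, as in the assembly. For a destination address ending in 1 and a count of 10, it yields 3 leading bytes, 1 dword, and 3 trailing bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
struct copy_split {
    size_t head;   /* bytes copied singly until dest is 4-byte aligned */
    size_t dwords; /* 4-byte units handled by the dword copy */
    size_t tail;   /* leftover bytes after the dword copy */
};

struct copy_split split_copy(uintptr_t dest_addr, size_t count)
{
    struct copy_split s;
    s.head = (size_t)((4 - (dest_addr & 3u)) & 3u); /* 0 when already aligned */
    if (s.head > count)
        s.head = count;                             /* never exceed the request */
    s.dwords = (count - s.head) / 4;
    s.tail = (count - s.head) % 4;
    return s;
}
```

The three fields always add up to the requested byte count (head + 4 * dwords + tail), which is a handy check when changing the logic.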

Hi, thanks, that's great. It's just what I was looking for. When I get in I'll convert my code to use it and let you know what happens. Once again, thanks a lot.

My function is slow?

I ran some tests on both functions and copied more than 100 billion bytes in the tests to ensure fairness.

My function was 6% faster on an AMD 350. It is also machine independent.
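The test setup is not shown in the thread, so the following is only a sketch of how such a comparison could be timed in portable C. The names time_copy, copy_fn and copy_with_memcpy are hypothetical, and the common function signature is an assumption made so that different copy routines can be swapped in. It uses clock() from the standard library, which measures CPU time at coarse resolution, so the repeat count has to be large for the numbers to mean anything.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Assumed common signature so different copy routines can be compared. */
typedef void (*copy_fn)(unsigned char *dst, const unsigned char *src, size_t n);

/* Hypothetical harness: returns CPU seconds spent on `repeats` copies. */
double time_copy(copy_fn fn, unsigned char *dst, const unsigned char *src,
                 size_t n, unsigned long repeats)
{
    clock_t start = clock();
    for (unsigned long i = 0; i < repeats; i++) {
        fn(dst, src, n);
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Baseline candidate: the library memcpy behind the common signature. */
void copy_with_memcpy(unsigned char *dst, const unsigned char *src, size_t n)
{
    memcpy(dst, src, n);
}
```

Results from a harness like this depend heavily on buffer size, cache state and compiler settings, so a few percent either way on one machine is weak evidence.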

I should note that my function does not do the initial verifying that the assembly version does, but this could be added. My function takes a number of long words to copy. It does not copy bytes at a time.

Also, Jacen/SE, you should note that my function incurs loop maintenance only every 400 bytes. It appears yours does maintenance every 4 bytes.

The fastest function would be a hybrid of the two.
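One way such a hybrid could look in C is sketched below. It is an assumption about what the combination would be, not code from either poster: it takes a byte count and aligns the destination like the assembly version, then copies 32-bit words in an unrolled loop like the C version (8 words per iteration here rather than 100, chosen arbitrarily). The name copy_hybrid is hypothetical. It only uses word copies when source and destination share the same offset within a 4-byte word, and it assumes the buffers may legitimately be accessed as uint32_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte-count interface with an unrolled word copy. */
void copy_hybrid(const unsigned char *src, unsigned char *dst, size_t count)
{
    /* Word copies are only attempted when both pointers share the same
       offset within a 4-byte word; otherwise everything goes bytewise. */
    if ((((uintptr_t)src ^ (uintptr_t)dst) & 3u) == 0) {
        /* Head: copy bytes until dst (and therefore src) is aligned. */
        while (count > 0 && ((uintptr_t)dst & 3u) != 0) {
            *dst++ = *src++;
            count--;
        }

        const uint32_t *s = (const uint32_t *)(const void *)src;
        uint32_t *d = (uint32_t *)(void *)dst;
        size_t words = count / 4;

        /* Body: 8 words per iteration, so loop overhead is paid
           once every 32 bytes. */
        while (words >= 8) {
            d[0] = s[0]; d[1] = s[1]; d[2] = s[2]; d[3] = s[3];
            d[4] = s[4]; d[5] = s[5]; d[6] = s[6]; d[7] = s[7];
            s += 8; d += 8; words -= 8;
        }
        /* Remaining whole words, one at a time. */
        while (words > 0) {
            *d++ = *s++;
            words--;
        }

        src = (const unsigned char *)s;
        dst = (unsigned char *)d;
        count &= 3u; /* only the trailing bytes are left */
    }
    /* Tail, or the whole buffer when the alignments differ. */
    while (count > 0) {
        *dst++ = *src++;
        count--;
    }
}
```

Whether this beats the library memcpy on a given machine is something only measurement can answer.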

Guest Anonymous Poster
memcpy only takes a byte count AS A PARAMETER. That doesn't mean it copies
the bytes one at a time internally. It most likely tries to move them
as fast as possible (i.e. as DWORDs).
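To make the byte-count point concrete, here is a small sketch in C. The wrapper name copy_longs is hypothetical. It has the same shape as the unrolled function earlier in the thread, taking a count of longs, and simply converts that count to bytes for memcpy. How memcpy moves the data internally is left to the library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: the count is given in longs, and memcpy
   receives the equivalent size in bytes. */
void copy_longs(const long *from, long *to, size_t nlongs)
{
    memcpy(to, from, nlongs * sizeof *from);
}
```

Using sizeof *from keeps the conversion correct whatever the size of long is on the target platform.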
