Copying Double Words

Started by Helicon56 June 26, 2002 10:52 AM

6 comments, last by Helicon56 21 years, 10 months ago

Helicon56

122

Author

June 26, 2002 10:52 AM

Hi, just wondering if there's a function that copies memory 32 bits at a time? memcpy works in bytes; I want one that works in double words. I used to use dosmemputl in djgpp... is there any equivalent? Thanks a lot, --Helicon56

Michalson

1,657

June 26, 2002 11:03 AM

Look up MOVSD (it's an x86 assembler instruction). It's the same as MOVSB, but moves data in 32-bit chunks rather than 8-bit chunks.

q2guy

122

June 26, 2002 12:13 PM

you can also use the FPU (coprocessor) to copy double words in memory

Vorlath

122

June 26, 2002 09:53 PM

VC++ and most other compilers' memcpy() will move memory in 32-bit chunks. The data that is not aligned at the beginning and end will be moved one byte at a time (max 3 bytes). And it's faster than anything you'll ever be able to write or find unless you use MMX, but that's only marginally faster and a waste of registers. memcpy() is written specifically for the Pentium's pipeline. Not sure about djgpp's memcpy though.

DO NOT USE MOVSD!!! It's been slower than a loop since the 486.
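
For reference, the kind of plain loop being compared against REP MOVSD might look like the sketch below. The name copy_dwords is made up for illustration, and the sketch assumes both buffers really are arrays of 32-bit words that do not overlap. It makes no claim about speed on any particular CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a library function): copies `count` 32-bit
 * words from src to dst, one word per iteration.
 * Assumes the two buffers do not overlap. */
void copy_dwords(uint32_t *dst, const uint32_t *src, size_t count)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = src[i];
}
```

Note that `count` is a number of double words, not bytes, which is the main difference from memcpy's interface.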

IndirectX

122

June 26, 2002 10:27 PM

Additionally, MSVC can inline memcpy calls (replace them with assembly code), but it's the same thing as doing REP MOVSD anyway.

---visit #directxdev on afternet <- not just for directx, despite the name

Shannon Barber

1,684

June 26, 2002 11:07 PM

quote:Original post by Helicon56
Hi,
Just wondering if there's a function that copies memory 32 bits at a time? memcpy works in bytes; I want one that works in double words.

On Win32, memcpy is optimized to copy up to 3 bytes at the beginning and end of any memory block, and 4-byte DWORDs in between. The only way to make it faster is to guarantee that the data you want to copy is DWORD aligned, skip the check & rep move the DWORDs.
Rumors have floated around that you can make a faster memcpy using MMX, but I suspect it's due to the alignment assumption/requirement.
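
The head/body/tail structure described above can be sketched in portable C roughly as follows. This is only an illustration of the idea, not the actual Win32 or VC++ memcpy source. The name copy_dwordwise is hypothetical, the alignment check is done on the destination pointer as an assumption, and each 32-bit word is moved through a fixed-size memcpy so the sketch stays well-defined for any source alignment (compilers typically turn that into a single load and store).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: byte-copies until dst is 4-byte aligned,
 * then moves 32-bit words, then byte-copies whatever is left.
 * Buffers must not overlap. */
void *copy_dwordwise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: at most 3 single-byte copies to align the destination. */
    while (n > 0 && ((uintptr_t)d & 3u) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: one DWORD per iteration. */
    while (n >= 4) {
        uint32_t w;
        memcpy(&w, s, sizeof w);   /* 4-byte load  */
        memcpy(d, &w, sizeof w);   /* 4-byte store */
        d += 4;
        s += 4;
        n -= 4;
    }

    /* Tail: at most 3 remaining bytes. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

If the caller can guarantee DWORD alignment and a length that is a multiple of 4, the head and tail loops never run and only the body remains, which is the shortcut suggested above.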

quote:
you can also use the FPU (coprocessor) to copy double words in memory

Good lord, do you know what happens when you do that?!

- The trade-off between price and quality does not exist in Japan. Rather, the idea that high quality brings on cost reduction is widely accepted.-- Tajima & Matsubara

Vorlath

122

June 27, 2002 01:20 AM

quote:Original post by Magmai Kai Holmlor
On Win32, memcpy is optimized to copy up to 3 bytes at the beginning and end of any memory block, and 4-byte DWORDs in between. The only way to make it faster is to guarantee that the data you want to copy is DWORD aligned, skip the check & rep move the DWORDs.
Rumors have floated around that you can make a faster memcpy using MMX, but I suspect it's due to the alignment assumption/requirement.

It's no rumor. I actually coded one up, but it's only slightly faster (for large amounts of data only because the setup kills it). It's not worth the effort because the speed gain is negligible. The 7 bytes before and after alignment still have to be copied the normal way.

And IndirectX, I hope you're not suggesting that REP MOVSD is the same as the memcpy() function. memcpy() will run circles around REP MOVSD any day.

IndirectX

122

June 27, 2002 01:26 AM

quote:Original post by Vorlath
And IndirectX, I hope you're not suggesting that REP MOVSD is the same as the memcpy() function. memcpy() will run circles around REP MOVSD any day.

Care to explain how?

What does this line from memcpy do:

        rep     movsd           ;N - move all of our dwords

And in speed-optimized release build,

        char x[100];
        char y[100];
        memcpy(x, y, 100);
        00401060   mov         ecx,19h
        00401065   lea         esi,[y]
        00401068   lea         edi,[x]
        0040106E   rep movs    dword ptr [edi],dword ptr [esi]

I just thought it funny that size-optimized release build produced the following code:

        char x[100];
        char y[100];
        memcpy(x, y, 100);
        0040104D   push        64h
        0040104F   lea         eax,[y]
        00401052   push        eax
        00401053   lea         eax,[x]
        00401059   push        eax
        0040105A   call        _memcpy (004010a0)
        0040105F   add         esp,0Ch

I don't see how this is size optimized. With #pragma intrinsic, I get rep movs back.
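
For anyone wanting to try this, a minimal sketch of the pragma usage follows. The function name copy_100 is hypothetical and just mirrors the 100-byte copy in the listings. The pragma is MSVC-specific, so it is guarded here so other compilers skip it. Whether the call actually becomes rep movs still depends on the compiler version and optimization settings.

```c
#include <string.h>

#ifdef _MSC_VER
/* MSVC only: request the intrinsic (inline) form of memcpy
 * instead of a call into the runtime library. */
#pragma intrinsic(memcpy)
#endif

/* Hypothetical test function: copies a fixed 100 bytes. */
void copy_100(char *dst, const char *src)
{
    memcpy(dst, src, 100);
}
```

Checking the disassembly of copy_100 in each build configuration is the only reliable way to see which form the compiler picked.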

[edited by - IndirectX on June 27, 2002 2:31:10 AM]

---visit #directxdev on afternet <- not just for directx, despite the name
