Copying Double Words
Hi,
Just wondering if there's a function that copies memory in 32-bit units? memcpy works in bytes; I want one that works in double words.
I used to use dosmemputl in djgpp... is there any equivalent?
Thanks a lot,
--Helicon56
Look up MOVSD (it's an x86 assembler instruction). It's the same as MOVSB, but moves data in 32-bit chunks rather than 8-bit chunks.
VC++ and most other compilers' memcpy() will move memory in 32-bit chunks. The data that is not aligned at the beginning and end will be moved one byte at a time (max 3 bytes at each end). And it's faster than anything you'll ever be able to write or find unless you use MMX, but that's only marginally faster and a waste of registers. memcpy() is written specifically for the Pentium's pipeline. Not sure about djgpp's memcpy though.
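For anyone who wants to see the shape of that strategy, here is a minimal C sketch of the "bytes at the edges, dwords in the middle" idea. It is not the actual VC++ memcpy source; the name copy_head_body_tail is made up for illustration, and it aligns on the destination only, which is an assumption on my part.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, NOT any compiler's real memcpy.
   Buffers must not overlap. */
void *copy_head_body_tail(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: byte copies until the destination is 4-byte aligned
       (at most 3 iterations). */
    while (n > 0 && ((uintptr_t)d & 3u) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: one 32-bit word per iteration. The fixed-size memcpy
       calls keep the load legal even if the source is unaligned;
       compilers typically reduce each one to a single move. */
    while (n >= 4) {
        uint32_t w;
        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
        d += 4;
        s += 4;
        n -= 4;
    }

    /* Tail: at most 3 leftover bytes. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

Whether something like this beats the library routine depends on the compiler and CPU, so measure before replacing memcpy with it.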
DO NOT USE MOVSD!!! It's been slower than a loop since the 486.
Additionally, MSVC can inline memcpy calls (replace them with assembly code), but it's the same thing as doing REP MOVSD anyway.
quote:Original post by Helicon56
Hi,
Just wondering if there's a function that copies memory in 32-bit units? memcpy works in bytes; I want one that works in double words.
On Win32, memcpy is optimized to copy up to 3 bytes at the beginning and end of any memory block, and 4-byte DWORDs in between. The only way to make it faster is to guarantee that the data you want to copy is DWORD aligned, skip the check, and rep move the DWORDs.
Rumors have floated around that you can make a faster memcpy using MMX, but I suspect it's due to the alignment assumption/requirement.
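As a rough illustration of the "guarantee alignment and skip the check" route, here is a small C sketch that copies a count of 32-bit words, which is close to what the original question asked for. The name copy_dwords is hypothetical, and the alignment and no-overlap requirements are the caller's responsibility; the assert only checks them in debug builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies `count` 32-bit words, not bytes.
   Assumes both pointers are 4-byte aligned and the buffers
   do not overlap. */
void copy_dwords(uint32_t *dst, const uint32_t *src, size_t count)
{
    /* Debug-only sanity check of the alignment assumption. */
    assert(((uintptr_t)dst & 3u) == 0 && ((uintptr_t)src & 3u) == 0);

    for (size_t i = 0; i < count; i++)
        dst[i] = src[i];
}
```

Note that the count is in dwords, so copying a 100-byte buffer means passing 25, the same 19h that shows up in the compiler output later in this thread.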
quote:
you can also use the FPU (coprocessor) to copy double words in memory
Good lord, do you know what happens when you do that?!
quote:Original post by Magmai Kai Holmlor
On Win32, memcpy is optimized to copy up to 3 bytes at the beginning and end of any memory block, and 4-byte DWORDs in between. The only way to make it faster is to guarantee that the data you want to copy is DWORD aligned, skip the check, and rep move the DWORDs.
Rumors have floated around that you can make a faster memcpy using MMX, but I suspect it's due to the alignment assumption/requirement.
It's no rumor. I actually coded one up, but it's only slightly faster, and only for large amounts of data, because the setup kills it. It's not worth the effort because the speed gain is negligible. The up to 7 bytes before and after alignment still have to be copied the normal way.
And IndirectX, I hope you're not suggesting that REP MOVSD is the same as the memcpy() function. memcpy() will run circles around REP MOVSD any day.
quote:Original post by Vorlath
And IndirectX, I hope you're not suggesting that REP MOVSD is the same as the memcpy() function. memcpy() will run circles around REP MOVSD any day.
Care to explain how?
What does this line from memcpy do:
rep movsd ;N - move all of our dwords
And in a speed-optimized release build:
82:       char x[100];
83:       char y[100];
84:       memcpy(x, y, 100);
00401060   mov         ecx,19h
00401065   lea         esi,[y]
00401068   lea         edi,[x]
0040106E   rep movs    dword ptr [edi],dword ptr [esi]
I just thought it funny that the size-optimized release build produced the following code:
82:       char x[100];
83:       char y[100];
84:       memcpy(x, y, 100);
0040104D   push        64h
0040104F   lea         eax,[y]
00401052   push        eax
00401053   lea         eax,[x]
00401059   push        eax
0040105A   call        _memcpy (004010a0)
0040105F   add         esp,0Ch
I don't see how this is size optimized. With #pragma intrinsic, I get rep movs back.
[edited by - IndirectX on June 27, 2002 2:31:10 AM]
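For reference, the #pragma intrinsic experiment mentioned above might look roughly like the sketch below. The wrapper name copy_100 is made up, and the pragma is guarded so other compilers simply ignore it. Whether MSVC then emits rep movs or a call to _memcpy depends on the compiler version and optimization flags, so check the disassembly rather than take it on faith.

```c
#include <string.h>

#ifdef _MSC_VER
/* MSVC only: request the intrinsic (inline) form of memcpy
   even when the build is optimized for size. */
#pragma intrinsic(memcpy)
#endif

/* Hypothetical wrapper around the same 100-byte copy used in the
   listings above. Both buffers must hold at least 100 bytes. */
void copy_100(char *x, const char *y)
{
    memcpy(x, y, 100);
}
```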