In the crtdll code memcpy does proceed one byte at a time. It's still fast. Assembly language aside, I'm curious how a void pointer could be copied any other way?
quote:Original post by fettodingo Ah.. he meant it like that. Some newbies tend to think it copies everything in one lump in some magic way, but I guess he didn't mean that.
That would actually be nice. Maybe with quantum computers...
quote:Original post by LessBread In the crtdll code memcpy does proceed one byte at a time. It's still fast. Assembly language aside, I'm curious how a void pointer could be copied any other way?
That would cause problems for 3-byte objects (e.g. char trio[3]) and for objects whose sizes are not divisible by 4 (e.g. 6, 10, 13, or 17 bytes).
This page, Creating Small Win32 Executables, has replacement functions for buffer manipulation functions (at the bottom). They all use bytes for the actual transfer.
For a generalized function, byte by byte might be the only way, with the optimizer left to supply the speed.
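As a rough illustration of what such a byte-by-byte routine looks like, here is a minimal sketch in C. The name copy_bytes is made up for this example; it is not the crtdll source or the code from the linked page, just the general shape being described.

```c
#include <stddef.h>

/* Hypothetical helper: copies n bytes from src to dest, one byte at a time.
   Like memcpy, it assumes the two regions do not overlap. */
void *copy_bytes(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--) {
        *d++ = *s++;   /* one byte per iteration, so any size works */
    }
    return dest;
}
```

Because the unit of transfer is a single byte, a 3-byte array such as char trio[3] needs no special handling.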
"I thought what I'd do was, I'd pretend I was one of those deaf-mutes." - the Laughing Man
IIRC, due to the caching on current Intel CPUs, sequential memory access and copying was essentially the same speed regardless of the size of the individual blocks transferred.
This is how you'd copy something four bytes at a time:
inline void memcpy32 ( LPVOID Dest, LPVOID Source, UINT Size )
{
    // Copies Size / 4 dwords; any trailing Size % 4 bytes are not copied.
    _asm
    {
        mov edi, Dest
        mov esi, Source
        mov ecx, Size
        shr ecx, 2      ; Divide by four
        cld
        rep movsd
    }
}
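The inline assembly above only moves whole dwords, so a size like 3, 6, or 13 loses its trailing bytes. A portable sketch of the same idea that also handles the remainder might look like this. The name copy_dwords is hypothetical, and the fixed-size memcpy calls are just the standard-conforming way to express an unaligned 4-byte load and store in C (compilers typically reduce each to a single move); this is an illustration, not a replacement for the library memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies n bytes, four at a time where possible,
   then finishes the last n % 4 bytes individually.
   Assumes the regions do not overlap. */
void *copy_dwords(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    size_t dwords = n / 4;   /* number of whole 4-byte blocks */
    size_t tail = n % 4;     /* 0 to 3 leftover bytes */

    while (dwords--) {
        uint32_t w;
        memcpy(&w, s, sizeof w);   /* 4-byte load, alignment-safe */
        memcpy(d, &w, sizeof w);   /* 4-byte store */
        s += 4;
        d += 4;
    }
    while (tail--) {
        *d++ = *s++;               /* leftover bytes, one at a time */
    }
    return dest;
}
```

With the tail loop in place, a call such as copy_dwords(out, trio, 3) copies all three bytes and touches nothing past them.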
~CGameProgrammer( );
quote:Original post by TerranFury IIRC, due to the caching on current Intel CPUs, sequential memory access and copying was essentially the same speed regardless of the size of the individual blocks transferred.
Simple benchtesting suggests that this is very much not the case.
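For anyone who wants to try this themselves, a bench test along these lines could be sketched as follows. Everything here (time_copy, bytewise_copy, the copy_fn type) is a made-up name for illustration, the timing uses the coarse standard clock() function, and no particular result is implied; numbers will vary with compiler, optimization level, buffer size, and CPU.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Signature shared by memcpy and the hand-written copy below. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Hypothetical byte-at-a-time copy to compare against memcpy. */
void *bytewise_copy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    while (n--) {
        *d++ = *s++;
    }
    return dest;
}

/* Times reps copies of n bytes with fn and returns CPU seconds,
   or -1.0 if n is zero or the buffers cannot be allocated. */
double time_copy(copy_fn fn, size_t n, int reps)
{
    if (n == 0) {
        return -1.0;
    }
    unsigned char *src = malloc(n);
    unsigned char *dst = malloc(n);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xAB, n);

    clock_t start = clock();
    for (int i = 0; i < reps; i++) {
        fn(dst, src, n);
    }
    clock_t end = clock();

    /* Read the result through a volatile so the copies are not
       trivially discarded by the optimizer. */
    volatile unsigned char sink = dst[n - 1];
    (void)sink;

    free(src);
    free(dst);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Comparing time_copy(bytewise_copy, 1 << 20, 100) against time_copy(memcpy, 1 << 20, 100) gives a first impression; an aggressive optimizer may still rewrite the byte loop, so checking the generated code is worthwhile before drawing conclusions.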