
drarem

faster way to copy...

This topic is 5708 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Below is the function I am working with: a simple strcpy(). It uses pointer math and qwords. I don't worry about alignment because the basic rule of the msvcrt lib is that strings need to be '0' terminated.. or do I need to worry about that? And is there a faster way, does it need to be asm? Using array indexing, i.e. ptrout[retg++]=ptrin[retg], is slightly slower, probably due to use of the stack.
  int StrCpyA(void *OUTDATA, void *INDATA) {
    unsigned long long *ptrout;
    unsigned long long *ptrin;

    ptrout = (unsigned long long *) OUTDATA;
    ptrin = (unsigned long long *) INDATA;

//    ptrout[0] = ptrin[0];

    /* copies 8 bytes at a time; stops only when a whole qword reads as zero */
    while (*ptrin) *ptrout++ = *ptrin++;

    return 0;
}

I fseek, therefore I fam.

Weird: when I compile with the -O1 option (GCC), it runs about 117% faster than the strcpy() version...?



I fseek, therefore I fam.

just out of curiosity (no speed difference) why do this?

unsigned long long *ptrout;
unsigned long long *ptrin;
ptrout = (unsigned long long *) OUTDATA;
ptrin = (unsigned long long *) INDATA;

instead of this:

unsigned long long *ptrout = (unsigned long long *) OUTDATA;
unsigned long long *ptrin = (unsigned long long *) INDATA;

?

I didn't think of that, but I am trying to keep the stack out of this, that is why it is so 'open', but..

I just tried that and it seemed to run negligibly slower according to the timer results on four test runs, so I put it back..



I fseek, therefore I fam.

You do have to take care of alignment issues, and as it stands now, your code is incorrect. Before trying to optimize strcpy, I suggest checking whether your favorite compiler can generate intrinsic rep movs code for it. MSVC, for example, has done so since version 6, if not before, producing code that's both fast and correct.
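To make the correctness problem concrete, here is a minimal sketch. The names `load_qword`, `qword_is_zero` and `copy_bytes` are made up for illustration and appear nowhere in the thread. A C string ends at the first zero byte, but a loop like `while (*ptrin)` over 8-byte words stops only when all eight bytes are zero at once.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load 8 bytes via memcpy, which avoids
   alignment and aliasing problems with a direct pointer cast. */
uint64_t load_qword(const char *p)
{
    uint64_t w;
    memcpy(&w, p, sizeof w);
    return w;
}

/* The condition a qword-at-a-time loop tests: is the WHOLE word zero? */
int qword_is_zero(const char *p)
{
    return load_qword(p) == 0;
}

/* Plain byte-wise copy: correct for any length and any alignment. */
char *copy_bytes(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    char *out = dst;
    while ((*out++ = *src++) != '\0')
        ;   /* the assignment copies the terminator before the loop exits */
    return dst;
}
```

For the 8 bytes `'a','b','c','\0','x','x','x','x'` the string has already ended at index 3, yet `qword_is_zero` reports false, so a word-based loop would keep copying past the terminator.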

intrinsic, you mean such as:

__rep__movs; ?

or inline assembly.. what does the syntax look like, kind of inlined stuff? I searched the include lib and found no rep or movs.. If the former, I don't think mingw/gcc supports it.

To align, I could either call strlen(), divide the string up into qwords and bytes and copy that way, or check *ptrin until null, then go to the byte level.
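The first option (measure the string, then copy in bulk) can be sketched with nothing but the standard library. This is a minimal illustration, not the poster's code, and `copy_len_then_bulk` is a made-up name. It leaves the choice of word size and any alignment handling to `memcpy`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name: "measure first, then bulk copy". */
char *copy_len_then_bulk(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t n = strlen(src);     /* length without the terminator */
    memcpy(dst, src, n + 1);    /* +1 so the terminator is copied too */
    return dst;
}
```

The destination must have room for `strlen(src) + 1` bytes and must not overlap the source. Compilers such as GCC and MSVC treat `memcpy` as a built-in and may expand it inline when optimizing, so hand-written word loops are often unnecessary.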

I fseek, therefore I fam.

Here's my new StrCpy(). It works, but what the heck is going on here?
ptrout[0] = ptrin[0]... my pointer knowledge sucks..



int StrCpyA(void *OUTDATA, void *INDATA) {
    unsigned long long *ptrout;
    unsigned long long *ptrin;

    ptrout = (unsigned long long *) OUTDATA;
    ptrin = (unsigned long long *) INDATA;

    /* copies exactly one qword: the first 8 bytes only */
    ptrout[0] = ptrin[0];

    return 0;
}


I fseek, therefore I fam.

It's not going to work like this, because it can only copy in 64-bit increments, so it will most likely write past the end of the destination rather than stop short, which is bad.
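The arithmetic behind that overrun is easy to check. The sketch below uses two made-up helpers and assumes a copy that keeps writing whole qwords until the terminator has been included. The gap between the two results is the number of bytes written beyond what the string needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes a string of length len really needs: the characters plus '\0'. */
size_t bytes_needed(size_t len)
{
    assert(len < SIZE_MAX);
    return len + 1;
}

/* Bytes touched when copying only in whole 8-byte words
   (assumption: the copy stops once the terminator is covered). */
size_t bytes_written_by_qwords(size_t len)
{
    assert(len < SIZE_MAX - 8);
    size_t needed = len + 1;
    return (needed + 7) / 8 * 8;   /* round up to a multiple of 8 */
}
```

For "hello" (length 5) the string needs 6 bytes, but a qword copy writes 8, so a destination declared as `char buf[6]` would be overrun by 2 bytes.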

Otay I was on drugs, sorry ~:\ I didn't examine my test code closely enough to see that I copied the string back in.. so ptrout[0]=ptrin[0] does absolutely nothing.. desirable



I fseek, therefore I fam.

One more time, I think this will do it.. 32% faster than the msvcrt version, two questions:

1) Can it be optimized more?

2) Why do I have to subtract 16 from the stringlength? The regular strlen() returns the same number of chars..

Not to go on about this, but how can strlen be optimized to read DWORDs or QWORDs without going past the end, when it doesn't know the length to begin with? I guess reading a bad address is ok but writing to it is not.. such as:

while (*ptr++) slen++; where ptr is counting in qwords.. OK? or NOK?

ok so I answered my own question kinda, POINTING to a null address is ok but WRITING to it is not, am I on the right track?
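One common way word-at-a-time strlen routines handle this is sketched below. It is an illustration under stated assumptions, not the msvcrt implementation, and `has_zero_byte` and `strlen_words` are made-up names. The idea is to step byte by byte until the pointer is 8-byte aligned, then read aligned words and test each for a zero byte with a bit trick. An aligned 8-byte read cannot straddle a page boundary, which is why such reads do not fault in practice. Strictly speaking, reading past the end of the array is still undefined behaviour in ISO C, so treat this as a platform-level trick.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the 8 bytes in w is zero (classic bit trick). */
int has_zero_byte(uint64_t w)
{
    return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

size_t strlen_words(const char *s)
{
    assert(s != NULL);
    const char *p = s;

    /* Step byte by byte until p is 8-byte aligned. */
    while (((uintptr_t)p & 7u) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Aligned loads: skip every word that contains no zero byte. */
    for (;;) {
        uint64_t w;
        memcpy(&w, p, sizeof w);   /* typically compiles to a single load */
        if (has_zero_byte(w))
            break;
        p += sizeof w;
    }

    /* The terminator is somewhere in this word; locate it by bytes. */
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

Note that `while (*ptr++) slen++;` with a qword pointer is a different thing: it tests whether the whole word is zero, not whether one byte in it is, so it can run well past the terminator.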


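int StrCpyA(void *OUTDATA, void *INDATA) {
    unsigned long long *ptrout;
    unsigned long long *ptrin;
    unsigned char *ptroutc;
    unsigned char *ptrinc;
    /* unsigned: wraps around if the string is shorter than 16 chars */
    unsigned int ret = StrLenA(INDATA) - 16;
    unsigned int cnt = 0;

    ptrout = (unsigned long long *) OUTDATA;
    ptrin = (unsigned long long *) INDATA;

    ptroutc = (unsigned char *) OUTDATA;
    ptrinc = (unsigned char *) INDATA;

    /* bulk copy, one qword (8 bytes) per iteration */
    for (; cnt < ret; cnt += sizeof(unsigned long long)) *ptrout++ = *ptrin++;
    /* remaining bytes, up to but not including the terminator */
    for (; ptrinc[cnt]; cnt++) ptroutc[cnt] = *(ptrinc + cnt);
    ptroutc[cnt] = '\0';   /* the two loops never copy the terminator */
    return (int) cnt;
}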
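On question 2, one plausible explanation (an assumption, since the test code isn't shown) is that a loop of the form `cnt < ret` with a step of 8 can run up to 7 bytes past `ret`, and subtracting a constant merely hides that. A sketch of the same qword-then-bytes idea that rounds the length down to a multiple of 8 and writes the terminator explicitly might look like this; `copy_qwords_then_tail` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name. Returns the number of characters copied,
   not counting the terminator. */
size_t copy_qwords_then_tail(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    /* Largest multiple of 8 that does not exceed len. */
    size_t bulk = len - (len % sizeof(uint64_t));
    size_t i = 0;

    /* Bulk part: whole qwords, moved via memcpy so neither pointer
       needs to be 8-byte aligned. */
    for (; i < bulk; i += sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, src + i, sizeof w);
        memcpy(dst + i, &w, sizeof w);
    }

    /* Tail: the remaining 0..7 characters, one byte at a time. */
    for (; i < len; i++)
        dst[i] = src[i];

    dst[len] = '\0';
    return len;
}
```

Because `bulk` never exceeds `len`, the qword loop cannot write past the end of the string, and strings shorter than 8 characters simply skip it. Whether this beats the library `strcpy` would need to be measured on the target compiler and optimization level.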


I fseek, therefore I fam.

[edited by - drarem on May 3, 2003 5:19:02 AM]

[edited by - drarem on May 3, 2003 5:19:50 AM]

[edited by - drarem on May 3, 2003 5:21:44 AM]
