memcpy needs to be faster ???!

Hi all,

I use the DJGPP C++ DOS compiler. I have looked at its memset() and memcpy() functions, and they are just basic copy routines that work byte by byte. Can anyone tell me where I can find an assembly memcpy(), or any routine that copies data from source to destination much quicker?

You see, when I write games I rely on memcpy() to copy the back buffer to the VGA screen, and I find this can be a bit slow. Well, I get a crap frame rate anyway. Is there an assembly version that is already pre-compiled, or one for DJGPP that I can link into my code? Or does anyone know how to write one? If so, do you mind if I have it?

Any help is good help. Thanks in advance,
Dark Star

I've tried Allegro, but I did not like using it. I kinda hate using libraries because of the way different people program. I get confused by their way of coding and the way their functions work. I don't really wanna go into detail about this 'cos that ain't the point. I just need a faster memcpy(). I know it's possible because I have heard of it working far faster in DJGPP, but I've got no idea how to do it.


Dark Star



You can write your own that copies 4 bytes at a time (and just hard-code the last few odd bytes, if there are any). This should give you up to a 4x speed increase straight away, since you make 1/4 as many passes through your loop and 1/4 as many memory reads and writes.
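
The idea above can be sketched roughly as follows. This is only an illustration, not a tuned routine: the name copy32 is made up, and it assumes the two buffers do not overlap and are declared as (or aligned like) uint32_t arrays. How much faster it is than a byte loop depends on the compiler and the hardware, so measure before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy nbytes from src to dst, 32 bits at a time,
   then finish the 0-3 leftover bytes one by one.
   Assumes non-overlapping buffers that are 4-byte aligned. */
static void copy32(uint32_t *dst, const uint32_t *src, size_t nbytes)
{
    size_t nwords = nbytes / 4;   /* whole 32-bit words */
    size_t tail = nbytes % 4;     /* odd bytes left at the end */
    unsigned char *d;
    const unsigned char *s;

    assert(dst != NULL && src != NULL);

    while (nwords--)
        *dst++ = *src++;

    /* Byte-wise cleanup for the last few odd bytes. */
    d = (unsigned char *)dst;
    s = (const unsigned char *)src;
    while (tail--)
        *d++ = *s++;
}
```

For a 320x200 back buffer at 8 bits per pixel the size is 64000 bytes, which is a multiple of 4, so the tail loop would not run at all in that case.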

Plus, for some really fast VGA stuff, read Abrash's Black Book (free download here: http://www.ddj.com/articles/2001/0165/0165f/0165f.htm). Chapter 24 onwards will tell you more about DOS graphics than you will ever need to know.

Edited by - Skulver on June 11, 2001 10:47:56 AM

Skulver's suggestion would work fine, but why not take it a step further and use MMX? That way you could copy 8 bytes at a time instead of 4. I don't know of any place that has this code available for download, though. For absolute speed, MMX will probably be your best bet.

Check this out:

#define blockcop(dest, src, numwords) \
__asm__ __volatile__ ( \
"cld\n\t" \
"rep\n\t" \
"movsl" \
: : "D" (dest), "S" (src), "c" (numwords) \
: "%ecx", "%esi", "%edi" )

#define blockset(value, dest, numwords) \
__asm__ __volatile__ ( \
"0:\n\t" \
"mov %%eax, (%%edi)\n\t" \
"mov %%eax, 4(%%edi)\n\t" \
"add $8, %%edi\n\t" \
"dec %%ecx\n\t" \
"jnz 0b" \
: : "a" (value), "D" (dest), "c" (numwords) \
: "%ecx", "%edi" )


Good luck.
Just use them like memcpy/memset (note that the count is in words, not bytes).

Arkon
[QSoft Systems]

Guest Anonymous Poster
This is a different poster, but Arkon, I tried your code and both DJGPP and Dev-C++ give me a "forbidden register" error with it.
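
A note on the "forbidden register" error: a likely cause is that the macros above list %ecx, %esi and %edi as clobbers while also binding those same registers as input operands, which GCC rejects. Arkon's report that the code works for him suggests that some compiler versions accept it anyway. A minimal sketch of one way around it, assuming GCC-style inline asm on x86, is to declare those registers as read-write operands and clobber only "memory" and the flags. The names blockcopy32 and blockfill32 are made up for this example, and the non-x86 fallback branches are there only so the sketch compiles elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy numwords 32-bit words from src to dest. */
static void blockcopy32(void *dest, const void *src, size_t numwords)
{
    assert(dest != NULL && src != NULL);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__(
        "cld\n\t"
        "rep movsl"
        /* rep movsl changes all three registers, so they are
           read-write operands ("+"), not inputs plus clobbers. */
        : "+D"(dest), "+S"(src), "+c"(numwords)
        :
        : "memory", "cc");
#else
    memcpy(dest, src, numwords * 4);   /* portable fallback */
#endif
}

/* Hypothetical helper: store value into numwords 32-bit words at dest. */
static void blockfill32(void *dest, uint32_t value, size_t numwords)
{
    assert(dest != NULL);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__(
        "cld\n\t"
        "rep stosl"
        : "+D"(dest), "+c"(numwords)   /* modified by rep stosl */
        : "a"(value)                   /* eax is only read */
        : "memory", "cc");
#else
    uint32_t *d = (uint32_t *)dest;    /* portable fallback */
    while (numwords--)
        *d++ = value;
#endif
}
```

Unlike the blockset macro in the post above, which stores 8 bytes per loop iteration, blockfill32 here writes exactly numwords 32-bit words, and a count of zero does nothing.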

Take a look at Denthor's tutorials (I think they're available at gamedev.net). One of those tutorials includes a "crash course" in optimizing asm code, and I think that one of the things "optimized" was in fact a blit routine.

Will code anything for free beer!

HAHAHAHAHAHAH
Of course that won't work for VC++.
It's AT&T syntax :)

Can you tell me exactly what it told you in DJGPP?
It works fine for me.

I know a lot of people have problems with asm in DJGPP.
I don't know why that is.
Maybe they are missing some files or something.
Weird.

Arkon
[QSoft Systems]

Guest Anonymous Poster
Arkon: anon didn't say that he tried it in VC, he said Dev-C++. That is an IDE made by Bloodshed that uses the MinGW32 port of GCC, so it should work.

I would suggest trying plain old C functions and seeing how many MB/s you can move before adding asm for this purpose. Unless you go to extremes in asm, I doubt you will make much of an improvement over a well-written C function. Even going to extremes, asm is unlikely to give you more than a 20% improvement. If you unroll the loop, move integers instead of characters, increment pointers instead of subscripts, and hard-code the subscripts off the pointers, you should be able to keep pace with most asm implementations.
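
As a rough sketch of what that advice might look like in plain C: the loop below moves 32-bit integers, is unrolled four times, uses fixed offsets off the pointers, and advances the pointers once per pass. The names copy_unrolled and copy_mb_per_sec are made up for this example, the unroll factor of 4 is an arbitrary choice, and the timing helper is only a crude way to get an MB/s figure with clock(); an optimizing compiler may change what is actually measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: copy nwords 32-bit integers, four per iteration.
   Assumes the buffers do not overlap. */
static void copy_unrolled(uint32_t *dst, const uint32_t *src, size_t nwords)
{
    size_t blocks = nwords / 4;   /* full groups of four words */
    size_t rest = nwords % 4;     /* leftover words */

    assert(dst != NULL && src != NULL);

    while (blocks--) {
        dst[0] = src[0];          /* hard-coded subscripts off the pointers */
        dst[1] = src[1];
        dst[2] = src[2];
        dst[3] = src[3];
        dst += 4;                 /* bump the pointers, not an index */
        src += 4;
    }
    while (rest--)
        *dst++ = *src++;
}

/* Hypothetical helper: rough throughput in MB/s over `repeats` copies.
   Returns 0.0 if the run was too short for clock() to measure. */
static double copy_mb_per_sec(uint32_t *dst, const uint32_t *src,
                              size_t nwords, int repeats)
{
    clock_t start = clock();
    double secs, mb;
    int i;

    for (i = 0; i < repeats; i++)
        copy_unrolled(dst, src, nwords);

    secs = (double)(clock() - start) / CLOCKS_PER_SEC;
    mb = (double)nwords * 4.0 * (double)repeats / (1024.0 * 1024.0);
    return secs > 0.0 ? mb / secs : 0.0;
}
```

Running copy_mb_per_sec on the same buffers with the library memcpy() swapped in gives a baseline to compare against before reaching for asm.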

Guest Anonymous Poster
So you should, Arkon. It sucks when people get all high and mighty thinking they know all, and I love it when said people realize they fscked up. Next time, think before you rip it out of people.

Angry Anon
