# memcpy


Hi, I am interested in the virtual texture algorithm for rendering large terrain. I am running some tests that update Direct3D textures every frame. But there is something strange: on my PC, it is faster to read the texture's data directly from the hard drive than from memory.
D3DLOCKED_RECT lockedRect;
m_TextureId->UnlockRect(0);
// Where m_HDTextures[0][0] is std::fstream

Is faster than:
D3DLOCKED_RECT lockedRect;
memcpy(lockedRect.pBits, m_FinalImage, imageSize*imageSize*4);
m_TextureId->UnlockRect(0);
// Where m_FinalImage has been read in the beginning of the application

Why? A copy from memory should be faster than a read from the hard drive. Is there something faster than memcpy?

##### Share on other sites
1. Is this release or debug mode?
2. Is m_FinalImage aligned properly? (Is the pointer address a multiple of 4?)
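
For reference, the alignment question in point 2 can be checked with a few lines of code. This is only a minimal sketch: `isAligned` is a hypothetical helper name, not something from the CRT or Direct3D.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true when the address stored in ptr
// is a multiple of `alignment` bytes.
inline bool isAligned(const void* ptr, std::size_t alignment)
{
    assert(alignment != 0);
    // Convert the pointer to an integer so the address can be tested with %.
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}
```

Calling `isAligned(m_FinalImage, 4)` would answer the question directly. Memory returned by `malloc` is suitably aligned for any fundamental type, so a buffer allocated that way should pass a 4-byte check.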

##### Share on other sites
1. It was in debug mode. I tried in release mode and I had the same FPS.

2. I don't understand exactly what you mean. I have used this for allocation:

m_FinalImage = (char*)malloc(imageBigSize*imageBigSize*4*sizeof(char));

I use X4 because of R,G,B,A

##### Share on other sites
looking at VS CRT source for memcpy, it is a byte-by-byte copy with a while loop counter, which is why i believe it is slow.

    void * __cdecl memcpy (
            void * dst,
            const void * src,
            size_t count
            )
    {
            void * ret = dst;

    #if defined (_M_IA64)
            {
            // ...
            }
    #else  /* defined (_M_IA64) */
            /*
             * copy from lower addresses to higher addresses
             */
            while (count--) {
                    *(char *)dst = *(char *)src;
                    dst = (char *)dst + 1;
                    src = (char *)src + 1;
            }
    #endif  /* defined (_M_IA64) */

            return(ret);
    }

##### Share on other sites
Quote:
 Original post by texel3d
 1. It was in debug mode. I tried in release mode and I had the same FPS.
 2. I don't understand exactly what you mean. I have used this for allocation:
 m_FinalImage = (char*)malloc(imageBigSize*imageBigSize*4*sizeof(char));
 I use X4 because of R,G,B,A

How are you measuring performance? FPS is next to useless for this. Are you creating the texture every frame or something? Because you really shouldn't be doing that at all anyway.

EDIT: Off topic, but you really shouldn't copy data into the texture like that. You need to pay attention to the pitch, and copy a scanline at a time. You can put in some code to copy the texture data in one chunk if (and only if) you detect that the pitch is exactly equal to the texture width * bytes per pixel. Remember that it might be different for different cards too.
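
A minimal sketch of the pitch-aware copy described in the EDIT above, written against plain pointers so it does not depend on the Direct3D headers. `copyRows` and its parameter names are hypothetical, made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies `rows` scanlines of `rowBytes` bytes each.
// dstPitch and srcPitch are the byte distances between the starts of
// consecutive rows in each buffer (they may include padding).
inline void copyRows(void* dst, std::size_t dstPitch,
                     const void* src, std::size_t srcPitch,
                     std::size_t rowBytes, std::size_t rows)
{
    assert(dstPitch >= rowBytes && srcPitch >= rowBytes);

    // Fast path: both buffers are tightly packed, so a single memcpy is safe.
    if (dstPitch == rowBytes && srcPitch == rowBytes) {
        std::memcpy(dst, src, rowBytes * rows);
        return;
    }

    // General path: copy one scanline at a time and skip the padding.
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    for (std::size_t y = 0; y < rows; ++y) {
        std::memcpy(d + y * dstPitch, s + y * srcPitch, rowBytes);
    }
}
```

With a locked Direct3D 9 texture, the destination would presumably be `lockedRect.pBits` with `lockedRect.Pitch` as `dstPitch`, and for the image in this thread `rowBytes` would be `imageSize * 4` and `rows` would be `imageSize`, assuming a 32-bit RGBA format.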

Quote:
 Original post by yadango
 looking at VS CRT source for memcpy, it is a byte-by-byte copy with a while loop counter, which is why i believe it is slow.
 *** Source Snippet Removed ***

Only in debug builds. Release builds use the implementation in a file called memcpy.asm, which copies aligned blocks of memory.

##### Share on other sites
ah cool. i never noticed that till now. sse2 optimization... yeah, should be fast :-).

##### Share on other sites
I don't care *how* slow his compiler's implementation of memcpy is, it should still beat reading from the hard drive. In the absolute best case, once the file has been cached by the OS, reading from the file will still be slightly (perhaps unnoticeably) slower.

Odds are, your method of measuring performance is horribly broken.

##### Share on other sites
I use fraps to display fps.

I create the texture just once (at the beginning of the application). After that, I just use lock and unlock.

The file with the texture's data is also opened at the beginning of the application, and is closed at the end.

I know it's not a good idea to update the texture from the hard drive every frame. I do it to measure the worst case and see how far the FPS can drop.

Are there any tutorials which explain how to load very big textures from the hard drive in Direct3D to render a large terrain in real time with a good FPS? (And not just the theory.)

##### Share on other sites
Quote:
 I use fraps to display fps.

FPS can't be used to measure efficiency of memcpy. Or anything else for that matter. FRAPS even less so.

You'll need a high-resolution timer at the very least. And in order to measure memcpy, you'll need to run it in a loop and then average the results.

If you were getting 500fps that would mean your timer has 2 ms resolution. If you're getting 60 fps, that means 16ms resolution.

memcpy, even when doing an unoptimized copy, will likely take a few microseconds to complete.

So you might as well stop right now, or change the timing methods (see QueryPerformanceCounter on Windows).
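
A rough sketch of the loop-and-average approach described above. It uses `std::chrono::steady_clock` from the standard library as a portable stand-in for QueryPerformanceCounter, which is a substitution and not what the post names. `averageMemcpyMicroseconds` is a hypothetical helper, and it times copies between two ordinary heap buffers, not into a locked texture.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: average time, in microseconds, of one memcpy of
// `bytes` bytes, measured over `iterations` back-to-back copies.
inline double averageMemcpyMicroseconds(std::size_t bytes, int iterations)
{
    assert(bytes > 0 && iterations > 0);

    std::vector<unsigned char> src(bytes, 0xAB);
    std::vector<unsigned char> dst(bytes, 0);

    // Reading the result through a volatile discourages the optimizer
    // from removing the copies altogether.
    volatile unsigned char sink = 0;

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::memcpy(dst.data(), src.data(), bytes);
        sink = dst[bytes - 1];
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;

    // Total elapsed time as floating-point microseconds, divided by the run count.
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

For example, `averageMemcpyMicroseconds(512 * 512 * 4, 1000)` would time a copy the size of a 512x512 RGBA image (the size is an assumption, the thread never states `imageSize`). Timing the fstream read the same way would give two numbers that can actually be compared.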
