Dookie

Which is faster, ZeroMemory or this?


Hey guys and gals, I have a quick question for you... Which is faster performance-wise: using ZeroMemory() to zero out a struct in my C++ program, or simply assigning 'zero' and 'NULL' to each element of the struct? This struct has about 30 elements, including characters, numbers, and other smaller structs. Thanks in advance for the info!

My uneducated guess is that ZeroMemory() (which I'm pretty sure is a macro for memset()) would be just as fast or faster (but probably barely) than individually setting the items to zero. My thought process goes like this: memset() just works on a contiguous block of memory; it doesn't care what it is. So it just does a for-loop to set each element to zero. I'm first of all assuming that it is able to use the word size of the machine it is on, typically 32-bit chunks, rather than merely doing everything in bytes. This I'm guessing would cut down on bit-shifts/bit-masks and such to isolate and set individual bytes. Secondly, I'm assuming that the compiler will be able to unroll the loop, so that it doesn't really need to do the variable incrementing. Then again, I don't know how capable the loop unrolling is. I only know enough about these technicalities to get by. It'd be a fun item to research though, I'm sure, for either of us.
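To make the word-at-a-time idea above concrete, here is a minimal sketch of a fill routine that uses byte stores only at the unaligned edges and machine-word stores for the bulk. The name zero_fill_sketch is hypothetical, and this is not how any particular memset is actually implemented; real library versions are far more heavily tuned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical illustration only; not the implementation of any real memset.
void zero_fill_sketch(void* dest, std::size_t count) {
    unsigned char* p = static_cast<unsigned char*>(dest);
    const std::uintptr_t zero = 0;

    // Byte stores until the pointer is aligned to the machine word size.
    while (count > 0 &&
           reinterpret_cast<std::uintptr_t>(p) % sizeof zero != 0) {
        *p++ = 0;
        --count;
    }

    // Word-sized stores for the bulk of the block. A fixed-size memcpy is
    // used so the store is well defined whatever type lives at dest.
    while (count >= sizeof zero) {
        std::memcpy(p, &zero, sizeof zero);
        p += sizeof zero;
        count -= sizeof zero;
    }

    // Byte stores for whatever tail is left over.
    while (count > 0) {
        *p++ = 0;
        --count;
    }
}
```

Calling zero_fill_sketch(&myStruct, sizeof myStruct) would behave like memset(&myStruct, 0, sizeof myStruct) for plain data; an optimizing compiler may well replace the loops with its own memset anyway.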

If you manually set individual items, then unless all the items are precisely integer-multiples of the size of the machine-word, you might have a minor performance hit, as it needs to do a few bit-shifts/bit-masks as I mentioned above to set each item. But it is already unrolled and there isn't any loop incrementing occurring. All the memory address offsets are already calculated. Which can probably also be done in the above case, but I can't guarantee that.
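For reference, the two approaches being compared look roughly like this. The Player and Vec2 structs are hypothetical stand-ins for the original poster's 30-element struct, and std::memset is used in place of ZeroMemory so the sketch is not tied to the Windows headers. A third option, value-initialization, is included because it zeroes every member without any per-member code.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for the struct described in the question.
struct Vec2 { float x; float y; };

struct Player {
    char name[16];
    int health;
    double score;
    Vec2 position;
    const char* label;
};

// Approach 1: treat the struct as a raw block of bytes.
void clear_with_memset(Player& p) {
    std::memset(&p, 0, sizeof p);
}

// Approach 2: assign zero / null to each member by hand.
void clear_member_by_member(Player& p) {
    for (char& c : p.name) c = '\0';
    p.health = 0;
    p.score = 0.0;
    p.position.x = 0.0f;
    p.position.y = 0.0f;
    p.label = nullptr;
}

// Approach 3: assign a value-initialized temporary, which zero-initializes
// every member of a struct that has no user-provided constructor.
void clear_with_value_init(Player& p) {
    p = Player();
}
```

The memset version relies on all-bits-zero meaning 0.0 and a null pointer, which holds on mainstream platforms but is not guaranteed by the language; the other two versions do not depend on that.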

Lastly, I doubt it matters. Write your program first. If you have a problem with performance, profile it. If the profiler says that this is a chunk of code that is run frequently and takes up time, try both methods to see which works best. It's not a major thing that'll take days to rewrite; it's a small code change that is pretty quick to test.

ZeroMemory will almost certainly be faster. The C++ compiler knows about this function and will even optimize it completely out if it can.

I wrote a SIMD zero-memory filler and it was only marginally faster than ZeroMemory across a large block of memory.

You should realize that ZeroMemory (which just ends up calling memset) cannot be used on non-POD types (POD means "plain old data", a technical term used in the C++ standard). That means quite a lot of C++ user-defined types cannot be zeroed this way without undefined behaviour, and doing so will mess things up besides. If your types are not POD classes, use constructor initializer lists to initialize the data members.
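One way to enforce this at compile time is sketched below, assuming a C++11 or later compiler. The helper zero_pod and the types Packet and Monster are hypothetical names made up for illustration; the type traits are the standard ones from <type_traits>.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical helper: refuses to compile for types that are not POD
// (trivial and standard-layout), so memset is only applied where it is safe.
template <typename T>
void zero_pod(T& value) {
    static_assert(std::is_trivial<T>::value &&
                  std::is_standard_layout<T>::value,
                  "zero_pod requires a POD type");
    std::memset(&value, 0, sizeof value);
}

// A POD struct: fine to zero as raw bytes.
struct Packet {
    int id;
    char tag[8];
    float weight;
};

// A non-POD class (virtual function, std::string member): initialize it
// through the constructor initializer list instead.
class Monster {
public:
    Monster() : name_(), health_(0), target_(nullptr) {}
    virtual ~Monster() {}

    const std::string& name() const { return name_; }
    int health() const { return health_; }
    const Monster* target() const { return target_; }

private:
    std::string name_;
    int health_;
    Monster* target_;
};
```

With this in place, zero_pod(somePacket) compiles, while zero_pod(someMonster) is rejected by the static_assert rather than silently overwriting the vtable pointer and the string's internals.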

Why not actually try it out instead of relying on the admitted guesses of others? It's not exactly difficult or time consuming to write a small app that does each one in a loop a few million times.
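A rough sketch of such a test app follows. The Record struct, the function names, and the iteration count are all assumptions made for illustration, and std::memset stands in for ZeroMemory. Micro-benchmarks like this are easy to get wrong (the optimizer may remove or merge the work), so treat the numbers as a hint and check them against a real profiler.

```cpp
#include <cassert>
#include <chrono>
#include <cstring>

// Hypothetical struct with a mix of member kinds.
struct Record {
    char name[32];
    int values[20];
    double ratio;
    void* owner;
};

void clear_memset(Record& r) {
    std::memset(&r, 0, sizeof r);
}

void clear_members(Record& r) {
    for (char& c : r.name) c = '\0';
    for (int& v : r.values) v = 0;
    r.ratio = 0.0;
    r.owner = nullptr;
}

// Runs `clear` on one Record `iterations` times and returns the elapsed
// time in nanoseconds.
template <typename ClearFn>
long long time_clear(ClearFn clear, int iterations) {
    Record r = Record();
    volatile int sink = 0;  // volatile write keeps the loop body observable
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        r.values[i % 20] = i + 1;  // dirty the struct so the clear has work
        clear(r);
        sink = r.values[i % 20];   // read back the cleared value
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
        .count();
}
```

Calling time_clear(clear_memset, 5000000) and time_clear(clear_members, 5000000) from main in an optimized build, and printing both results, gives a first comparison for your own compiler and flags.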

It also probably doesn't really matter. Do whichever way you think is easier to understand.

By the way, calling ZeroMemory on C++ objects can be a bad thing. Particularly if the class contains virtual functions. It's fine for the so-called "plain old data" types but be careful.

Setting aside the POD issue, I think it depends on the size of the struct and the kinds of member types. Offhand, I would guess that for structs less than 16 bytes in size it would be faster to zero each member. Above that size the advantage would go to ZeroMemory/memset.

As with all optimization questions, the answer is almost invariably:

"It'll vary depending on your [compiler|flags|...|phase_of_moon]."

Profile it! Worst case scenario, you'll see absolutely no difference and realize you wasted your time optimizing some code that doesn't get called that often.

Quote:
Original post by krum
ZeroMemory will almost certainly be faster. The C++ compiler knows about this function and will even optimize it completely out if it can.

I wrote a SIMD zero-memory filler and it was only marginally faster than ZeroMemory across a large block of memory.


That's because you didn't do it right ;) You need to strip or line the cache properly.

However, how often do you need to memset multi-megabyte arrays? 99.999% of the time memset, ZeroMemory, or a rep stos is fast enough.
