Which is faster, ZeroMemory or this?

Started by
7 comments, last by Name_Unknown 18 years, 7 months ago
Hey guys and gals, I have a quick question for you... Which is faster performance-wise, using ZeroMemory() to zero out a struct in my CPP program, or simply assigning 'zero' and 'NULL' to each element of the struct? This struct has about 30 elements to it, including characters, numbers, and other smaller structs. Thanks in advance for the info!
"The crows seemed to be calling his name, thought Caw"
My uneducated guess is that ZeroMemory() (which I'm pretty sure is just a macro that ends up calling memset()) would be at least as fast as individually setting the items to zero, though probably only barely faster. My thought process goes like this: memset() just works on a contiguous block of memory; it doesn't care what the memory holds. So it just runs a loop that sets each element to zero. First, I'm assuming that it can work in chunks of the machine's word size, typically 32 bits, rather than doing everything byte by byte. I'm guessing that would cut down on the bit-shifts/bit-masks needed to isolate and set individual bytes. Second, I'm assuming that the compiler will be able to unroll the loop, so that it doesn't really need to increment a loop variable. Then again, I don't know how capable the loop unrolling is. I only know enough about these technicalities to get by. It'd be a fun item to research, though, I'm sure, for either of us.
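To make the guess above concrete, here is a rough sketch of a byte-at-a-time fill next to a word-at-a-time fill. The names zero_bytes and zero_words are made up for illustration, and the 32-bit word is only the size the post assumes. This is not how any particular memset() is actually implemented; real library versions are usually tuned per platform.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: the naive version, one byte per loop iteration.
inline void zero_bytes(void* dst, std::size_t n) {
    unsigned char* p = static_cast<unsigned char*>(dst);
    for (std::size_t i = 0; i < n; ++i) {
        p[i] = 0;
    }
}

// Hypothetical helper: the same job done mostly in 32-bit chunks.
inline void zero_words(void* dst, std::size_t n) {
    unsigned char* p = static_cast<unsigned char*>(dst);
    const std::uint32_t zero = 0;

    // Head: single bytes until the pointer is aligned to the word size.
    while (n > 0 && reinterpret_cast<std::uintptr_t>(p) % sizeof zero != 0) {
        *p++ = 0;
        --n;
    }
    // Body: one word per iteration. std::memcpy is used for the store so the
    // code does not write through a mistyped pointer; compilers typically
    // turn a fixed-size memcpy like this into a single store instruction.
    while (n >= sizeof zero) {
        std::memcpy(p, &zero, sizeof zero);
        p += sizeof zero;
        n -= sizeof zero;
    }
    // Tail: whatever is left over, byte by byte.
    while (n > 0) {
        *p++ = 0;
        --n;
    }
}
```

For a 30-member struct the body loop runs only a few dozen times either way, which is part of why the difference is likely to be small.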

If you manually set individual items, then unless every item's size is an exact integer multiple of the machine word size, you might take a minor performance hit, as it needs to do a few of the bit-shifts/bit-masks I mentioned above to set each item. But the code is already unrolled, so there is no loop incrementing, and all the memory address offsets are already calculated. That can probably also be done in the memset() case, but I can't guarantee it.
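For reference, the approaches being compared might look like the sketch below. The Player and Inner structs are hypothetical stand-ins for the original poster's struct, and std::memset is used in place of ZeroMemory so the code is not Windows-only. The third option, value-initialization, was not mentioned in the thread but is a common way to get the same result.

```cpp
#include <cstring>

// Hypothetical struct standing in for the ~30-member struct in the question.
struct Inner {
    short x;
    short y;
};

struct Player {
    char        name[16];
    int         health;
    float       speed;
    Inner       pos;
    const char* label;
};

// Option 1: block fill, which is what ZeroMemory amounts to.
inline void zero_with_memset(Player& p) {
    std::memset(&p, 0, sizeof p);
}

// Option 2: assign zero / null to every member by hand.
inline void zero_by_member(Player& p) {
    for (char& c : p.name) {
        c = '\0';
    }
    p.health = 0;
    p.speed  = 0.0f;
    p.pos.x  = 0;
    p.pos.y  = 0;
    p.label  = nullptr;
}

// Option 3 (an addition, not from the thread): value-initialization.
inline void zero_by_value_init(Player& p) {
    p = Player{};
}

// Helper used to check that every member ended up zero.
inline bool is_zeroed(const Player& p) {
    for (char c : p.name) {
        if (c != '\0') return false;
    }
    return p.health == 0 && p.speed == 0.0f &&
           p.pos.x == 0 && p.pos.y == 0 && p.label == nullptr;
}
```

One practical difference: the member-wise version silently goes stale when a member is added to the struct later, and it leaves any padding bytes untouched, while the memset version covers the whole block.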

Lastly, I doubt it matters. Write your program first. If you have a problem with performance, profile it. If the profiler says that this is a chunk of code that is run frequently and takes up time, try both methods to see which works best. It's not a major thing that'll take days to rewrite; it's a small code change that is pretty quick to test.
"We should have a great fewer disputes in the world if words were taken for what they are, the signs of our ideas only, and not for things themselves." - John Locke
ZeroMemory will almost certainly be faster. The C++ compiler knows about this function and will even optimize it completely out if it can.

I wrote a SIMD zero-memory filler and it was only marginally faster than ZeroMemory across a large block of memory.

You should realize that ZeroMemory (which just ends up using memset) cannot be used on non-POD types (POD == plain old data, a technical term used in the C++ standard). That means quite a lot of C++ user-defined types cannot be zeroed this way without undefined behaviour, and it will mess things up besides. If you don't have POD class types, then use constructor initializer lists to initialize the data members.
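A minimal sketch of that advice, assuming a C++11 compiler: a hypothetical zero_pod() wrapper that refuses to compile for non-POD types, next to a hypothetical Monster class that initializes its members in the constructor initializer list instead. std::is_trivial and std::is_standard_layout together are the C++11 definition of POD; std::memset again stands in for ZeroMemory.

```cpp
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical wrapper: block-zeroes an object, but only if the type is POD.
template <typename T>
void zero_pod(T& obj) {
    static_assert(std::is_trivial<T>::value &&
                  std::is_standard_layout<T>::value,
                  "zero_pod may only be used on POD types");
    std::memset(&obj, 0, sizeof obj);
}

// POD: fine to zero as raw bytes.
struct Vec3 {
    float x, y, z;
};

// Not POD: it has a virtual function and a std::string member, so zeroing
// its bytes would wipe out internal state the program relies on.
class Monster {
public:
    // Constructor initializer list sets every data member explicitly.
    Monster() : name_(), health_(0), target_(nullptr) {}
    virtual ~Monster() = default;

    virtual int health() const { return health_; }
    const std::string& name() const { return name_; }
    const Vec3* target() const { return target_; }

private:
    std::string name_;
    int         health_;
    const Vec3* target_;
};
```

With this in place, zero_pod(someVec3) compiles, while zero_pod(someMonster) is rejected at compile time rather than corrupting the object at run time.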
Why not actually try it out instead of relying on the admitted guesses of others? It's not exactly difficult or time consuming to write a small app that does each one in a loop a few million times.
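Such a test app could be sketched roughly as follows. The Sample struct, the function names, and the timing helper are all made up for illustration, and std::memset stands in for ZeroMemory. Treat the numbers with suspicion: an optimizing compiler may remove stores whose results are never read, so a serious measurement needs more care than this.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>

// Hypothetical struct to zero; substitute the real one.
struct Sample {
    char   name[32];
    int    id;
    float  weight;
    double score;
    void*  next;
};

// Candidate 1: block fill.
inline void clear_memset(Sample& s) {
    std::memset(&s, 0, sizeof s);
}

// Candidate 2: member-by-member assignment.
inline void clear_members(Sample& s) {
    for (char& c : s.name) {
        c = '\0';
    }
    s.id     = 0;
    s.weight = 0.0f;
    s.score  = 0.0;
    s.next   = nullptr;
}

// Runs fn on a Sample `iterations` times and returns elapsed milliseconds.
template <typename Fn>
double time_ms(Fn fn, std::size_t iterations) {
    Sample s;
    volatile char sink = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        s.id = static_cast<int>(i);  // dirty the struct before each pass
        fn(s);
        sink = s.name[0];            // volatile read-back keeps part of the
                                     // work observable; other stores may
                                     // still be optimized away
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_ms(clear_memset, 5000000) and time_ms(clear_members, 5000000) from main() and printing both results gives a first impression; run it with the same compiler flags as the real project.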

It also probably doesn't really matter. Do whichever way you think is easier to understand.

By the way, calling ZeroMemory on C++ objects can be a bad thing. Particularly if the class contains virtual functions. It's fine for the so-called "plain old data" types but be careful.
-Mike
Hehehe, can anyone say "NULLed V-Table"? [grin]
Free speech for the living, dead men tell no tales, Your laughing finger will never point again... Omerta! Sing for me now!
Setting aside the POD issue, I think it depends on the size of the struct and the kinds of member types. Offhand, I would guess that for structs less than 16 bytes in size it would be faster to zero each member. Above that size the advantage would go to ZeroMemory/memset.
"I thought what I'd do was, I'd pretend I was one of those deaf-mutes." - the Laughing Man
As with all optimization questions, the answer is almost invariably:

"It'll vary depending on your [compiler|flags|...|phase_of_moon]."

Profile it! Worst case scenario, you'll see absolutely no difference and realize you wasted your time optimizing some code that doesn't get called that often.
Quote: Original post by krum
ZeroMemory will almost certainly be faster. The C++ compiler knows about this function and will even optimize it completely out if it can.

I wrote a SIMD zero-memory filler and it was only marginally faster than ZeroMemory across a large block of memory.


That's because you didn't do it right ;) You need to strip or line the cache properly.

However, how often do you need to memset multi-megabyte arrays? 99.999% of the time memset, ZeroMemory, or a rep stos is fast enough.

"It's such a useful tool for living in the city!"

This topic is closed to new replies.
