# Why is this not optimized?

Storyyeller    215
I have a class that contains two bitmaps.

    class SpaceArray
    {
        static const uint RES = 16;
        static const uint ARRAYSIZE = RES * RES * RES * RES * RES;
        std::bitset<ARRAYSIZE> sure;
    };


When I encapsulate them, performance suffers, even though the two programs are logically exactly the same.

    #pragma once

    #include <bitset>

    template <unsigned int SIZE>
    class DualBitArray
    {
        typedef unsigned int uint;
        static const uint BITSETSIZE = SIZE * SIZE * SIZE * SIZE * SIZE;

        std::bitset<BITSETSIZE> fixed;

    public:

        bool Valid(uint i) const {return i<BITSETSIZE;}
        bool IsFixed(uint i) const {return fixed.test(i);}
        uint NumSet() const {return fixed.count();}
        void SetFixed(uint i, bool val){fixed.set(i, val);}
    };


    class SpaceArray
    {
        static const uint RES = 16;
        DualBitArray<RES> mydata;
    };


Why is the second version slower? I compiled with gcc using -O2, so I assumed that the two would be optimized into the same code. What went wrong?

##### Share on other sites
Zahlman    1682
How are you determining that performance is impacted?

##### Share on other sites
Storyyeller    215
I do a bunch of calculations to fill the set and then call QueryPerformanceCounter to measure elapsed time.
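
For anyone wanting to reproduce this kind of measurement portably, here is a minimal sketch that uses std::chrono::steady_clock in place of QueryPerformanceCounter. The helper names BestTimeMicros and FillEveryThird are made up for illustration and are not from the original program, and the workload is only a stand-in for the real calculations.

```cpp
#include <bitset>
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: runs `work` `repeats` times and returns the
// smallest wall-clock duration observed, in microseconds.
template <typename Work>
double BestTimeMicros(Work work, int repeats)
{
    assert(repeats > 0);
    using Clock = std::chrono::steady_clock;
    double best = 0.0;
    for (int r = 0; r < repeats; ++r) {
        const Clock::time_point start = Clock::now();
        work();
        const std::chrono::duration<double, std::micro> elapsed = Clock::now() - start;
        if (r == 0 || elapsed.count() < best) best = elapsed.count();
    }
    return best;
}

// Hypothetical stand-in workload: clears the set, sets every third bit,
// and returns the number of bits set.
template <std::size_t N>
std::size_t FillEveryThird(std::bitset<N>& bits)
{
    bits.reset();
    for (std::size_t i = 0; i < N; i += 3) bits.set(i);
    return bits.count();
}
```

Taking the best of several runs cuts down on noise from other processes. The result of the workload should also be used somewhere (stored or printed), otherwise the optimizer may remove the very work being timed.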

##### Share on other sites
Quick question: are you comparing debug or release builds? That can have a huge impact.

##### Share on other sites
Hodgman    51338
Only your second example shows how you're manipulating the data - is there a difference between the two approaches in this respect?

##### Share on other sites
ddn3    1610
I don't know if there is any guarantee that the compiler will collapse the calls into the member objects (sure and deadly) so that they compile down to a single function call. I think it will optimize access to simple member variables of POD type, so get/set methods are usually optimized away. If that's not the case, you're incurring two function calls for each method (SetDeadly, etc.), and that would explain the increase in cost.

I'm not sure, but it might also be due to the class being templated and how robustly gcc optimizes template classes. There are too many unknowns to determine that.

You'll have to break it down and systematically test for these possibilities.
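
One way to break it down, as a rough sketch: run the identical workload through a raw std::bitset and through a thin forwarding wrapper, confirm the results match, then time each path separately and compare the assembly gcc emits (for example with `g++ -O2 -S`). BitWrapper, FillDirect and FillWrapped are hypothetical names that only mirror the shape of DualBitArray; they are not the original code.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical thin wrapper in the style of DualBitArray: every method
// simply forwards to the underlying std::bitset.
template <std::size_t N>
class BitWrapper
{
    std::bitset<N> fixed;

public:
    bool IsFixed(std::size_t i) const { return fixed.test(i); }
    void SetFixed(std::size_t i, bool val) { fixed.set(i, val); }
    std::size_t NumSet() const { return fixed.count(); }
};

// Workload against the raw bitset: sets every `stride`-th bit.
template <std::size_t N>
std::size_t FillDirect(std::bitset<N>& bits, std::size_t stride)
{
    assert(stride > 0);
    for (std::size_t i = 0; i < N; i += stride) bits.set(i, true);
    return bits.count();
}

// The same workload, routed through the wrapper's forwarding methods.
template <std::size_t N>
std::size_t FillWrapped(BitWrapper<N>& bits, std::size_t stride)
{
    assert(stride > 0);
    for (std::size_t i = 0; i < N; i += stride) bits.SetFixed(i, true);
    return bits.NumSet();
}
```

If the forwarding calls are inlined, the two functions should produce essentially the same machine code; if they differ, that narrows down where the extra cost comes from.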

Good Luck!

-ddn

##### Share on other sites
You may also be able to suggest or enforce "inline", but if you enforce it, be careful: a good compiler is usually right about optimization.

##### Share on other sites
Storyyeller    215
Aren't functions defined in a header always inline?
Also, yes it is release build, with -O2 enabled.

##### Share on other sites
Inline (AFAIK) in most compilers is an attribute given to a function or method to suggest or enforce, well, inlining. I may be wrong, but I don't think it's the kind of thing that would go in a header, generally. It would go on specific functions (possibly on the functions in the header, just to confuse things).

##### Share on other sites
Hodgman    51338
Quote:
> Original post by Storyyeller
> Aren't functions defined in a header always inline?
> Also, yes it is release build, with -O2 enabled.

Yeah, if you implement the function inside the body of the class like you have, then it's implicitly inline (i.e. it's the same as if you put the inline keyword on the front). However, the inline keyword is just a hint to the compiler that you'd like it to be inlined (and a warning to the linker that it's going to find duplicates). The compiler can still choose not to inline it if it wants to.
Some compilers have extra keywords, like __forceinline that are stronger than just "hints".
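
To make the distinction concrete, here is a small sketch of the three cases: implicitly inline (defined in the class body), explicitly inline, and a compiler-specific stronger request. The FORCE_INLINE macro, Counter and Twice are illustrative names only; `__forceinline` is the MSVC keyword and `__attribute__((always_inline))` is the gcc/clang counterpart.

```cpp
#include <cassert>

// Hypothetical portability macro wrapping the compiler-specific keywords.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // fall back to the plain hint
#endif

struct Counter
{
    unsigned value = 0;

    // Defined inside the class body: implicitly inline, still only a hint.
    void Increment() { ++value; }

    // Stronger, compiler-specific request to inline this member.
    FORCE_INLINE void Add(unsigned n) { value += n; }
};

// A free function defined in a header needs an explicit `inline` so that
// including the header in several source files does not break linking.
inline unsigned Twice(unsigned n) { return 2 * n; }
```

Whether forcing helps is something to measure rather than assume; the sketch only shows where each form of the keyword goes.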