
Why is this not optimized?


I have a class that contains two bitmaps.
#include <bitset>

typedef unsigned int uint;

class SpaceArray
{
    static const uint RES = 16;
    static const uint ARRAYSIZE = RES * RES * RES * RES * RES; // 16^5 = 1,048,576 bits
    std::bitset<ARRAYSIZE> deadly;
    std::bitset<ARRAYSIZE> sure;
};

When I encapsulate them, performance suffers, even though the two programs are logically exactly the same.
#pragma once

#include <bitset>

typedef unsigned int uint; // at namespace scope so SpaceArray below can use it too

template <unsigned int SIZE>
class DualBitArray
{
    static const uint BITSETSIZE = SIZE * SIZE * SIZE * SIZE * SIZE;

    std::bitset<BITSETSIZE> deadly;
    std::bitset<BITSETSIZE> fixed;

    public:

    bool Valid(uint i) const {return i<BITSETSIZE;}
    bool IsDeadly(uint i) const {return deadly.test(i);}
    bool IsFixed(uint i) const {return fixed.test(i);}
    uint NumSet() const {return fixed.count();}
    void SetDeadly(uint i, bool val) {deadly.set(i, val);}
    void SetFixed(uint i, bool val){fixed.set(i, val);}
};


class SpaceArray
{
    static const uint RES = 16;
    DualBitArray<RES> mydata;
};

Why is the second version slower? I compiled on gcc with -O2, so I assumed that the two would be optimized into the same code. What went wrong?

I do a bunch of calculations to fill the set and then call QueryPerformanceCounter to measure elapsed time.
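For anyone who wants to reproduce the measurement, here is a minimal sketch of that kind of timing. It is not the original poster's code: QueryPerformanceCounter is Windows-only, so this swaps in std::chrono::steady_clock as a portable stand-in, and FillAndCount and TimeMs are hypothetical names made up for the illustration.

```cpp
#include <bitset>
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical workload (not from the thread): sets every third bit,
// then returns how many bits are set.
template <std::size_t N>
std::size_t FillAndCount(std::bitset<N>& bits)
{
    for (std::size_t i = 0; i < N; i += 3)
        bits.set(i, true);
    return bits.count();
}

// Runs fn once and returns the elapsed wall-clock time in milliseconds.
// steady_clock is used here in place of QueryPerformanceCounter.
template <typename Fn>
double TimeMs(Fn fn)
{
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    fn();
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A single run is noisy, so it is probably worth calling TimeMs several times and comparing the fastest runs. Using the returned count afterwards (printing it, for example) keeps the optimizer from discarding the work being timed.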

I don't know of any guarantee that the compiler will inline the forwarding calls into the member objects (sure and deadly) so that each one compiles down to a single function call. I think compilers do optimize access to simple member variables of POD type, so get/set methods are usually optimized away. If that's not happening here, you're incurring two function calls for each method (SetDeadly, etc.), and that would explain the increase in cost.

It might also be due to the class being a template and how well gcc optimizes templated classes. There are really too many unknowns to tell.

You'll have to break it down and systematically test for these possibilities.
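One way to start breaking it down is to run the same workload through a bare bitset and through a thin wrapper, check that both give the same answer, and compare the times. The sketch below assumes a cut-down wrapper with one bitset and a smaller size than the thread's 16^5; Wrapped, Result, RunDirect and RunWrapped are hypothetical names, not anything from the original code.

```cpp
#include <bitset>
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical size: 2^16 bits, smaller than the thread's 16^5.
const std::size_t kBits = 1u << 16;

// Thin wrapper in the style of DualBitArray, cut down to one bitset.
class Wrapped
{
    std::bitset<kBits> fixed;
public:
    void SetFixed(std::size_t i, bool val) { fixed.set(i, val); }
    std::size_t NumSet() const { return fixed.count(); }
};

struct Result
{
    std::size_t count; // number of bits set by the workload
    double ms;         // elapsed wall-clock time in milliseconds
};

typedef std::chrono::steady_clock Clock;

// Milliseconds elapsed since start.
double ElapsedMs(Clock::time_point start)
{
    return std::chrono::duration<double, std::milli>(Clock::now() - start).count();
}

// Workload on a bare bitset: set every even bit, then count.
Result RunDirect()
{
    std::bitset<kBits> bits;
    const Clock::time_point start = Clock::now();
    for (std::size_t i = 0; i < kBits; i += 2)
        bits.set(i, true);
    const std::size_t count = bits.count();
    return Result{count, ElapsedMs(start)};
}

// Identical workload, but going through the wrapper's member functions.
Result RunWrapped()
{
    Wrapped w;
    const Clock::time_point start = Clock::now();
    for (std::size_t i = 0; i < kBits; i += 2)
        w.SetFixed(i, true);
    const std::size_t count = w.NumSet();
    return Result{count, ElapsedMs(start)};
}
```

If the two counts match and the times stay close over repeated runs at -O2, the wrapper itself is probably not the cost, and the difference is more likely somewhere else in the calling code.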

Good Luck!

-ddn

You may also be able to suggest or enforce "inline", but be careful if you enforce it: a good compiler is usually right about optimization.

Aren't functions defined in a header always inline?
Also, yes it is release build, with -O2 enabled.

Inline (AFAIK) is, in most compilers, an attribute given to a function or method to suggest or enforce, well, inlining. I may be wrong, but I don't think it's something that applies to a header as a whole; it goes on specific functions (possibly on the functions in the header, just to confuse things).

Quote:
Original post by Storyyeller
Aren't functions defined in a header always inline?
Also, yes it is release build, with -O2 enabled.
Yeah, if you implement the function inside the body of the class like you have, then it's implicitly inline (i.e. it's the same as if you put the inline keyword on the front). However, the inline keyword is just a hint to the compiler that you'd like the function inlined (and a warning to the linker that it's going to find duplicates); the compiler can still choose not to inline it.
Some compilers have extra keywords, like __forceinline that are stronger than just "hints".
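As a rough illustration of those stronger keywords, here is a sketch of a macro that picks one per compiler. FORCE_INLINE and the Flags class are hypothetical names for this example; __forceinline is the MSVC keyword and __attribute__((always_inline)) is the GCC/Clang attribute. Whether forcing helps in the case above is something only measurement can show.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// FORCE_INLINE is a made-up macro name for this sketch.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline // fallback: just the ordinary hint
#endif

// Small wrapper whose accessors forward to a bitset member.
class Flags
{
    std::bitset<64> bits;
public:
    FORCE_INLINE bool IsSet(std::size_t i) const { return bits.test(i); }
    FORCE_INLINE void Set(std::size_t i, bool val) { bits.set(i, val); }
};
```

Note that this only forces the wrapper's own accessors to be inlined at their call sites; the calls they make into std::bitset are still left to the compiler's judgment.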
