
Why is this not optimized?

For Beginners

Started by Storyyeller May 01, 2010 06:57 PM

8 comments, last by Hodgman 13 years, 11 months ago

Storyyeller

215

Author

May 01, 2010 06:57 PM

I have a class that contains two bitmaps.


#include <bitset>

typedef unsigned int uint;

class SpaceArray
{
    static const uint RES = 16;
    static const uint ARRAYSIZE = RES * RES * RES * RES * RES;
    std::bitset<ARRAYSIZE> deadly;
    std::bitset<ARRAYSIZE> sure;
};

When I encapsulate them in a separate class, performance drops, even though the two programs are logically identical.


#pragma once

#include <bitset>

template <unsigned int SIZE>
class DualBitArray
{
    typedef unsigned int uint;
    static const uint BITSETSIZE = SIZE * SIZE * SIZE * SIZE * SIZE;

    std::bitset<BITSETSIZE> deadly;
    std::bitset<BITSETSIZE> fixed;

    public:

    bool Valid(uint i) const {return i<BITSETSIZE;}
    bool IsDeadly(uint i) const {return deadly.test(i);}
    bool IsFixed(uint i) const {return fixed.test(i);}
    uint NumSet() const {return fixed.count();}
    void SetDeadly(uint i, bool val) {deadly.set(i, val);}
    void SetFixed(uint i, bool val){fixed.set(i, val);}
};


class SpaceArray
{
    static const unsigned int RES = 16;
    DualBitArray<RES> mydata;
};

Why is the second version slower? I compiled on gcc with -O2, so I assumed that the two would be optimized into the same code. What went wrong?

I trust exceptions about as far as I can throw them.

Zahlman

1,682

May 01, 2010 09:19 PM

How are you determining that performance is impacted?

Storyyeller

215

Author

May 01, 2010 10:32 PM

I do a bunch of calculations to fill the set and then call QueryPerformanceCounter to measure elapsed time.
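A minimal sketch of that kind of measurement, for anyone who wants to reproduce it. It assumes a C++11 compiler and swaps in std::chrono::steady_clock for QueryPerformanceCounter so it is not Windows-only. FillAndCount and BestMicroseconds are hypothetical names, and the workload is a stand-in, not the actual calculation.

```cpp
#include <bitset>
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical stand-in workload: set every third bit, then count the set bits.
template <std::size_t N>
std::size_t FillAndCount(std::bitset<N>& bits)
{
    for (std::size_t i = 0; i < N; i += 3)
        bits.set(i, true);
    return bits.count();
}

// Runs `work` `repeats` times and returns the fastest run in microseconds.
// Taking the minimum of several runs reduces noise from the OS scheduler.
template <typename Work>
double BestMicroseconds(Work work, int repeats)
{
    assert(repeats > 0);
    typedef std::chrono::steady_clock Clock;
    double best = -1.0;
    for (int r = 0; r < repeats; ++r)
    {
        Clock::time_point start = Clock::now();
        work();
        Clock::time_point stop = Clock::now();
        double us = std::chrono::duration<double, std::micro>(stop - start).count();
        if (best < 0.0 || us < best)
            best = us;
    }
    return best;
}
```

The result of the timed work should be stored somewhere observable (printed, or written to a volatile variable), otherwise the optimizer may remove the work entirely and the timing becomes meaningless.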

I trust exceptions about as far as I can throw them.

Narf the Mouse

322

May 02, 2010 12:30 AM

Quick question: Are you comparing debug or release builds? Because that can have a huge impact.

Now open: MouseProduced Games

Hodgman

52,717

May 02, 2010 12:42 AM

Only your second example shows how you're manipulating the data - is there a difference between the two approaches in this respect?

. 22 Racing Series .

ddn3

1,610

May 02, 2010 03:22 AM

I don't know whether there's any guarantee that the compiler will inline the calls into the member objects (sure and deadly) so that each wrapper compiles down to a single function call. I think compilers do optimize access to simple member variables of POD type, so get/set methods usually get optimized away. If that's not happening here, you're incurring two function calls for each method (SetDeadly, etc.), and that would explain the increase in cost.

It might also be due to the class being a template and how well gcc optimizes template classes, but I'm not sure. There are too many unknowns to say for certain.

You'll have to break it down and systematically test for these possibilities.
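One way to start breaking it down is sketched below, under stated assumptions: BitArray is a trimmed, hypothetical copy of the wrapper from the first post (one bitset instead of two), and FillDirect and FillWrapped are made-up names for the same loop written against the raw bitset and against the wrapper. If both return the same count, any timing difference comes from code generation rather than from the logic.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Trimmed, hypothetical copy of the wrapper from the original post.
template <unsigned int SIZE>
class BitArray
{
    static const std::size_t BITS = SIZE * SIZE * SIZE * SIZE * SIZE;
    std::bitset<BITS> fixed;

public:
    bool Valid(std::size_t i) const { return i < BITS; }
    void SetFixed(std::size_t i, bool val) { fixed.set(i, val); }
    std::size_t NumSet() const { return fixed.count(); }
};

// Version A: touches the bitset directly.
template <std::size_t N>
std::size_t FillDirect(std::bitset<N>& bits, std::size_t stride)
{
    assert(stride > 0);
    for (std::size_t i = 0; i < N; i += stride)
        bits.set(i, true);
    return bits.count();
}

// Version B: the same loop, but routed through the wrapper's accessors.
template <unsigned int SIZE>
std::size_t FillWrapped(BitArray<SIZE>& arr, std::size_t stride)
{
    assert(stride > 0);
    for (std::size_t i = 0; arr.Valid(i); i += stride)
        arr.SetFixed(i, true);
    return arr.NumSet();
}
```

Timing both functions on the same data and comparing the assembly from g++ -O2 -S for each should show whether the accessor calls were actually inlined.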

Good Luck!

-ddn

Narf the Mouse

322

May 02, 2010 03:59 PM

You may also be able to suggest or enforce "inline", but if you enforce it, be careful: a good compiler is usually right about optimization.

Now open: MouseProduced Games

Storyyeller

215

Author

May 02, 2010 06:10 PM

Aren't functions defined in a header always inline?
Also, yes it is release build, with -O2 enabled.

I trust exceptions about as far as I can throw them.

Narf the Mouse

322

May 02, 2010 07:59 PM

Inline (AFAIK) in most compilers is an attribute given to a function/method to suggest or enforce, well, inlining. I may be wrong, but I don't think it's the kind of thing that would go in a header, generally - It would go on specific functions (Possibly on the functions in the header, just to confuse things)

Now open: MouseProduced Games

Hodgman

52,717

May 02, 2010 08:19 PM

Quote: Original post by Storyyeller
Aren't functions defined in a header always inline?
Also, yes it is release build, with -O2 enabled.

Yeah, if you implement the function inside the body of the class like you have, then it's implicitly inline (i.e. it's the same as if you put the inline keyword on the front). However, the inline keyword is just a hint to the compiler that you'd like the function to be inlined (and a warning to the linker that it's going to find duplicates); the compiler can still choose not to inline it if it wants to.
Some compilers have extra keywords that are stronger than just "hints", like __forceinline in MSVC or __attribute__((always_inline)) in GCC.
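A small sketch of both forms, as an illustration only. FORCE_INLINE and Flags are hypothetical names made up for this example; the compiler-specific spellings underneath are __forceinline for MSVC and __attribute__((always_inline)) for GCC.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical portability macro: picks the strongest inline request
// the current compiler offers, falling back to plain `inline`.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline
#endif

class Flags
{
    std::bitset<64> bits;

public:
    // Defined inside the class body, so implicitly inline (a hint only).
    bool Test(std::size_t i) const { return bits.test(i); }

    // Explicitly asks the compiler to inline this call wherever possible.
    FORCE_INLINE void Set(std::size_t i, bool val) { bits.set(i, val); }
};
```

Forcing inlining is best treated as an experiment: measure before and after, since the optimizer's own decision is often the better one.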

. 22 Racing Series .
