Cost of Abstraction

Started by
39 comments, last by Finalspace 7 years ago

I found a forum post about someone wanting to easy iterating over a range of numbers -> normal for loops are too heavy, too error prune...

The answer from one guy was very interesting, so i want to share that with you:

This is the code he put up there:


namespace util {
    class range {
    public:
        class iterator {
            friend class range;
        public:
            long int operator *() const {
                return i_;
            }
            const iterator &operator ++() {
                ++i_; return *this;
            }
            iterator operator ++(int) {
                iterator copy(*this); ++i_; return copy;
            }

            bool operator ==(const iterator &other) const {
                return i_ == other.i_;
            }
            bool operator !=(const iterator &other) const {
                return i_ != other.i_;
            }

        protected:
            iterator(long int start) : i_(start) {
            }

        private:
            unsigned long i_;

        };

        iterator begin() const {
            return begin_;
        }
        iterator end() const {
            return end_;
        }
        range(long int  begin, long int end) : begin_(begin), end_(end) {
        }
    private:
        iterator begin_;
        iterator end_;

    };
};

So that you can easiely iterate over a range of numbers with a crap api:


auto range = util::range(firstIndex, lastIndex + 1);
    for (auto i : range) {
        std::cout << i << std::endl;
    }

I have no words for this, just compare it with a simple for loop doing the same thing.

Without any compiler optimization, its ridiculous how much instructions you pay for abstraction like this.

Crap like this is the reason why almost every software on this planet is slow - especially the web!

This is just a simple for loop, imaging that more complex stuff are build like this...

What do you think?

Advertisement

This is standard in Python, e.g. for x in range(10). It is safer, and clearer. C/C++ for-loops are overpowered for most uses, and are a source of bugs, albeit rarely. You could replicate the Python behaviour with boost::irange or something similar.

Without any compiler optimization, these things can get expensive. But if you don't have compiler optimization, you shouldn't use C++ anyway.

Has been addressed by Stroustrup (mid-2015), most recent wording (2017):

https://github.com/isocpp/CppCoreGuidelines/blob/master/docs/P0122R4.pdf

Working implementation:

https://github.com/Microsoft/GSL/blob/master/include/gsl/span

alternatively:

https://github.com/martinmoene/gsl-lite/tree/master/include/gsl

Alternative/concurrent approach: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4651.pdf

One of the two might hopefully make it into C++20 (I understand C++17 is feature-complete, and span/range is not in).

This just seems like a way to prevent things that the programmer shouldn't do in the first place, anyway (not modify i_ inside the loop). To me, it's just bad practice to prevent bad practice. Also slower.

edit:

I know these things emerge from beginners in their beginning years, trying to abstract things, because they are in the heat. What surprises me is how all this kind of convoluted pattern is passed forward around the internet.

KISS

Without any compiler optimization, its ridiculous how much instructions you pay for abstraction like this. Crap like this is the reason why almost every software on this planet is slow - especially the web!

This just seems like a way to prevent things that the programmer shouldn't do in the first place, anyway (not modify i_ inside the loop). To me, it's just bad practice to prevent bad practice. Also slower.

It should be noted that neither of these is truly correct. With a not-totally-shit compiler, the cost of abstraction is close to zero in a non-optimized build, and exactly zero in an optimized build. No, this is not the reason why software is slow. No, it is not the reason why the web is slow.

It also isn't a way of preventing things that a programmer shouldn't do. It's a way of making certain patterns which are used all the time more idiomatic and safer at no extra cost (the standardized version gives a couple of guarantees on common programming errors that cannot happen, or that fail early, if you use these). It is something that should have been standard years ago, if you ask me.

C/C++ for-loops are overpowered for most uses

What the hell?

But if you don't have compiler optimization, you shouldn't use C++ anyway.

It should be noted that neither of these is truly correct. With a not-totally-shit compiler, the cost of abstraction is close to zero in a non-optimized build, and exactly zero in an optimized build.

It's commonly said that you can tell how many C++ features a game uses by the difference in speed in its debug vs optimized build :lol:
C'mon debuggable optimized builds from modern compilers... many game platforms don't support this yet though.

Even with O2 activated you still have more cycles you pay for using this abstraction. Sure its much lesser than debug mode - but its still there and thats the thing it bugs me. I double checked that in VStudio 2015.

If i really want to abstract that, than i simply make a macro like this:


#define FOR_RANGE_NUMERIC(value, range) for (int value = 0; value < range; ++value)
FOR_RANGE_NUMERIC(index, 100) {
   // index goes from 0 to 99
}

And in the rare case i really really need to have some special thing, i do it like this:


struct forlooprange {
  int start, end;
}
#define FOR_RANGE_STRUCT(value, range) for (int value = (range).start; value <= (range).end; ++value)
FOR_RANGE_STRUCT(index, {0, 100}) {
  // Index goes from 0 to 100
}
Did you time it? Can you compile both to assembly language and post the difference?

C/C++ for-loops are overpowered for most uses

What the hell?

It's a heavily overloaded statement used for a bunch of related but different purposes - creating number sequences, iterating over arrays, iterating using arbitrary iterators, infinite loops, while loops, do-while loops, etc. It contains 3 separate expressions, each of which can (and often is) used in weird ways to get subtly different iterating behaviour, leading to code that is harder to read.

If you want to iterate over a container, a "for element in container" style loop makes more sense. The language should be able to extract the start and end position accordingly. Vectors and arrays can be included in this.

If you have some sort of iterator, then arguably a while loop fits that semantic better. C++ wedges them into the for-statement via the help of hacks like the magical default-constructed input iterator that marks an end of a sequence. It's not a natural fit.

If you want to create a number sequence, that can be a separate operation. e.g. a Range() function. Then, iterating over it is optional - which is good, because that's often suboptimal anyway, in these days of vectorisation and parallelism you don't really want explicit loops everywhere.

If you want an infinite loop then that's what 'while' is there for - not "for (;;)". Same goes for any other loop that would be a [do]-while in other languages - those constructs exist because they're clearer, semantically.

This topic is closed to new replies.

Advertisement