King_DuckZ

Member Since 16 Jan 2005
Offline Last Active Apr 03 2013 04:30 AM

#4894122 Optimizations in debug

Posted by King_DuckZ on 15 December 2011 - 03:22 AM

Recently I rewrote the matrix class we use here at work. The old one was really a collection of classes (matrix4x4, matrix4x3, etc.), one long copy-and-paste job with small changes here and there. My approach was to turn it into a template.

Because of that, the old multiplication code was a manually unrolled loop (a00 * b00 + a01 * b10, etc.), while the new one is the more classical and generic "three nested loops". The generated optimized code is many times faster than the old code, which is cool. The problem is that in debug mode all the loops are kept, no inlining is performed, and the resulting code is 6-7 times slower than the old one (which was already slow because of its many cache misses).
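To make the "three nested loops" concrete, here is a minimal sketch of what such a generic multiplication could look like. The Matrix struct and the Multiply function are hypothetical stand-ins made up for illustration; the actual class from work is not shown in this post.

```cpp
#include <cstddef>

// Hypothetical minimal matrix type, row-major, R rows by C columns.
template <typename T, std::size_t R, std::size_t C>
struct Matrix {
    T m[R][C];
};

// Classic triple loop: result(i, j) = sum over k of a(i, k) * b(k, j).
// The inner dimension K must match; the compiler enforces that through deduction.
template <typename T, std::size_t R, std::size_t K, std::size_t C>
Matrix<T, R, C> Multiply(const Matrix<T, R, K>& a, const Matrix<T, K, C>& b) {
    Matrix<T, R, C> result;
    for (std::size_t i = 0; i < R; ++i) {
        for (std::size_t j = 0; j < C; ++j) {
            T sum = T();
            for (std::size_t k = 0; k < K; ++k) {
                sum += a.m[i][k] * b.m[k][j];
            }
            result.m[i][j] = sum;
        }
    }
    return result;
}
```

With optimizations on, the loop bounds are compile-time constants, so the compiler is free to unroll everything. In an unoptimized build the loops and the function call stay exactly as written, which is where the slowdown comes from.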

I know that debug is normally slower than release, but in order to keep the framerate acceptable for the others I'd like my code to be a bit faster in debug. I thought I could surround my function with a #pragma optimize, but that would be VS-specific, and there may be other downsides I'm not aware of. Any suggestions?
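A rough sketch of the #pragma idea, guarded per compiler so that the VS-specific part does not break other builds. The function Multiply4x4 is a hypothetical example, and the option strings ("t" for MSVC, "O2" for GCC) are only one possible choice. On compilers that match neither branch, the function is simply built with the normal flags.

```cpp
// Raise the optimization level for the functions that follow.
#if defined(_MSC_VER)
#pragma optimize("t", on)          // MSVC: favor fast code
#elif defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options           // GCC: remember the current options
#pragma GCC optimize ("O2")
#endif

// Hypothetical hot function: out = a * b for 4x4 float matrices.
void Multiply4x4(const float (&a)[4][4], const float (&b)[4][4], float (&out)[4][4]) {
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) {
                sum += a[i][k] * b[k][j];
            }
            out[i][j] = sum;
        }
    }
}

// Go back to whatever the command line asked for.
#if defined(_MSC_VER)
#pragma optimize("", on)           // MSVC: reset to the /O options in effect
#elif defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif
```

Two possible downsides to check before relying on this: if I remember the MSVC documentation correctly, #pragma optimize is silently ignored when the file is compiled with any of the /RTC run-time check options, which debug configurations commonly enable. Also, the pragma only affects functions defined between the two markers, so a templated function defined in a header picks up the setting only if the pragmas are in that header too.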

I'm checking whether I can use SIMD intrinsics instead, but I'm not sure I'll be able to, due to some issues with our allocator and 16-byte alignment.
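One way around the alignment issue, as a sketch only: SSE has unaligned load and store intrinsics (_mm_loadu_ps and _mm_storeu_ps) that do not require 16-byte aligned pointers, at some possible cost in speed on older CPUs. The function name Multiply4x4Unaligned and the row-major, 16-contiguous-floats layout are assumptions made for this example, and there is a scalar fallback for targets without SSE.

```cpp
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// out = a * b for row-major 4x4 float matrices stored as 16 contiguous floats.
// No alignment requirement on any pointer; out should not alias a or b.
void Multiply4x4Unaligned(const float* a, const float* b, float* out) {
#if SKETCH_HAS_SSE
    // Load the four rows of b once, using unaligned loads.
    const __m128 b0 = _mm_loadu_ps(b + 0);
    const __m128 b1 = _mm_loadu_ps(b + 4);
    const __m128 b2 = _mm_loadu_ps(b + 8);
    const __m128 b3 = _mm_loadu_ps(b + 12);
    for (int i = 0; i < 4; ++i) {
        // Row i of the result is a linear combination of the rows of b,
        // weighted by the four elements of row i of a.
        __m128 row = _mm_mul_ps(_mm_set1_ps(a[4 * i + 0]), b0);
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a[4 * i + 1]), b1));
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a[4 * i + 2]), b2));
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a[4 * i + 3]), b3));
        _mm_storeu_ps(out + 4 * i, row);
    }
#else
    // Scalar fallback with the same result.
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) {
                sum += a[4 * i + k] * b[4 * k + j];
            }
            out[4 * i + j] = sum;
        }
    }
#endif
}
```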




#4815362 Templates and specializations

Posted by King_DuckZ on 24 May 2011 - 05:33 PM

Hello, I'm writing some templated classes, and for each of them I ran into the problem of having to copy and paste lines and lines of code for every specialization I made.

For example:


template <typename T, size_t S>
class Vector { ... };


template <typename T> class Vector<T, 2> {
public:
      T GetX();
      T GetY();
      T operator[] (size_t index) { ... }
};

template <typename T> class Vector<T, 3> {
public:
      T GetX();
      T GetY();
      T GetZ();
      T operator[] (size_t index) { ... }
};

template <typename T> class Vector<T, 4> {
public:
      T GetX();
      T GetY();
      T GetZ();
      T GetW();
      T operator[] (size_t index) { ... }
};


Already here, GetX() and GetY() are common to every specialization, as are dot(), cross(), the operators and all the bells and whistles I can add to implement vectors. Everything is generic enough that I never need to write different code for each method, and the interface only changes slightly.


The best approach I've found to solve the problem of having to add each new method's declaration 3 or more times is this:


template <typename T, size_t S, typename D>
class VectorBase {
public:
      // Note that VectorBase has no specialization, so it doesn't know how to implement operator[] - thus the cast to D*
      T dot ( const D& other) { const D& self = *static_cast<D*>(this); ... }
};


template <typename T, size_t S>
class Vector : private VectorBase<T, S, Vector<T, S> > {
      // The base is private, so it must be a friend for its static_cast to D* to compile
      friend class VectorBase<T, S, Vector<T, S> >;
public:
      T dot ( const Vector& other) { return VectorBase<T, S, Vector<T, S> >::dot(other); }
      T operator[] (size_t index) { ... } // Different size-based specializations only have to define methods that really change, like this one
};


This has the advantage that I can put all of the wrapper methods into a macro and then just invoke that macro in each specialization, and I still get specialized constructors, no public base classes and relatively clear interfaces. The drawbacks are that I have to cast this most of the time, and that when one specialization needs to be a friend of another, the base class must be given the same friendships. Also, macros are incredibly annoying and ugly.
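For reference, here is a compilable sketch of the pattern with the wrapper macro. The macro name VECTOR_COMMON_METHODS, the constructors and the v_ storage member are made up for this example. Note that the macro also declares the base class as a friend: with private inheritance, the static_cast from the base to the derived class does not compile otherwise.

```cpp
#include <cstddef>

template <typename T, std::size_t S, typename D>
class VectorBase {
public:
    // Shared implementation; element access goes through the derived class.
    T dot(const D& other) const {
        const D& self = *static_cast<const D*>(this);
        T sum = T();
        for (std::size_t i = 0; i < S; ++i) {
            sum += self[i] * other[i];
        }
        return sum;
    }
};

template <typename T, std::size_t S>
class Vector;  // primary template left undefined in this sketch

// Hypothetical macro: stamps the friend declaration and the forwarding
// wrappers into each specialization.
#define VECTOR_COMMON_METHODS(T, S)                              \
    friend class VectorBase<T, S, Vector<T, S> >;                \
    T dot(const Vector& other) const {                           \
        return VectorBase<T, S, Vector<T, S> >::dot(other);      \
    }

template <typename T>
class Vector<T, 2> : private VectorBase<T, 2, Vector<T, 2> > {
public:
    Vector(T x, T y) { v_[0] = x; v_[1] = y; }
    VECTOR_COMMON_METHODS(T, 2)
    T GetX() const { return v_[0]; }
    T GetY() const { return v_[1]; }
    T operator[](std::size_t index) const { return v_[index]; }
private:
    T v_[2];
};

template <typename T>
class Vector<T, 3> : private VectorBase<T, 3, Vector<T, 3> > {
public:
    Vector(T x, T y, T z) { v_[0] = x; v_[1] = y; v_[2] = z; }
    VECTOR_COMMON_METHODS(T, 3)
    T GetX() const { return v_[0]; }
    T GetY() const { return v_[1]; }
    T GetZ() const { return v_[2]; }
    T operator[](std::size_t index) const { return v_[index]; }
private:
    T v_[3];
};
```

Usage is the same as with hand-written specializations, for example Vector<int, 2> a(1, 2), b(3, 4); followed by a.dot(b). Each new shared method costs one implementation in VectorBase plus one wrapper line in the macro.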


What do you think of this pattern? Does anyone have a better solution to this recurring problem?

