Jump to content
  • Advertisement
Sign in to follow this  
nightz

Intrinsics bugs with msvc

This topic is 2989 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hi there =]

Im developing an in-house game engine and i wanted some external help for a small problem here. Im trying to optimize my 4x4 matrix class. For now, im using a modified version of vmath and i found some optimized algorithm for matrix multiplication. I dont know if its my class alignment or what (i already tried to align EVERYTHING on Matrix4<T> to 16 byte) but my app is crashing when multiplying matrices. If i remove _mm_store_ps, it stops crashing (but makes no sense at all to remove it - dumb). Can anyone help me? Im sending my changes to vmath back to the author but i want to make sure everything is working pretty well. Mind not my not documented changes and weird commented/modified stuff. Thanks!

vmath.h:

/* -*- C++ -*- */
/*
* vmath, set of classes for computer graphics mathemtics.
* Copyright (c) 2005-2009, Jan Bartipan < barzto at gmail dot com >
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* - Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* - Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* - Neither the names of its contributors may be used to endorse or
* promote products derived from this software without specific
* prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
* WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/


/**
* @mainpage Vector mathemtics for computer graphics
*
* @section Features
* <ul>
* <li> basic aritemetic operations - using operators </li>
* <li> basic linear algebra operations - such as transpose, dot product, etc. </li>
* <li> aliasis for vertex coordinates - it means:
* <pre>
* Vector3f v;
* // use vertex coordinates
* v.x = 1; v.y = 2; v.z = -1;
*
* // use texture coordinates
* v.s = 0; v.t = 1; v.u = 0.5;
* // use color coordinates
* v.r = 1; v.g = 0.5; v.b = 0;
* </pre>
* </li>
* <li> conversion constructor and assign operators - so you can assign a value of Vector3&lt;T1&gt; type
* to a variable of Vector3&lt;T2&gt; type for any convertable T1, T2 type pairs. In other words, you can do this:
* <pre>
*
* Vector3f f3; Vector3d d3 = f3;
* ...
* f3 = d3;
* </pre>
* </li>
* </ul>
*/


#ifndef __vmath_Header_File__
#define __vmath_Header_File__

#include <cassert>
#include <cmath>
#include <cstring>
#include <iostream>
#include <vector>

#include "logger.h"

#ifdef VMATH_NAMESPACE
namespace VMATH_NAMESPACE
{
#endif

#ifdef WIN32
#pragma warning(disable: 4244)
#include <xmmintrin.h>
#endif

// Romulo - Make the API more windows friendly
#ifdef WIN32
#ifdef MERACH_ENGINE
#define DLL_EXPORT __declspec(dllexport)
#else
#define DLL_EXPORT __declspec(dllimport)
#endif
#endif
// End Romulo

#ifndef M_PI
#define M_PI 3.14159265358979323846 /* pi */
#endif

#define DEG2RAD(x) ((x * M_PI) / 180.0f)
//#define EPSILON (4.37114e-07)
const double epsilon = 4.37114e-05;
#define EPSILON epsilon

/**
* Class for two dimensional vector.
*/

template <class T>
class DLL_EXPORT Vector2
{
public:
// Packed Data
union
{
struct
{
T x, y;
};

struct
{
T s, t;
};

T v[2];
};

//----------------[ constructors ]--------------------------
/**
* Creates and sets to (0,0)
*/

Vector2() : x(0),y(0)
{ }



/**
* Creates and sets to (x,y)
* @param nx intial x-coordinate value
* @param ny intial y-coordinate value
*/

Vector2(T nx, T ny) : x(nx), y(ny)
{ }


/**
* Copy constructor.
* @param src Source of data for new created instace.
*/

Vector2(const Vector2<T>& src)
: x(src.x), y(src.y)
{ }


/**
* Copy casting constructor.
* @param src Source of data for new created instace.
*/

template <class FromT>
Vector2(const Vector2<FromT>& src)
: x(static_cast<T>(src.x)),
y(static_cast<T>(src.y))
{ }


//----------------[ access operators ]-------------------
/**
* Copy casting operator
* @param rhs Right hand side argument of binary operator.
*/

template <class FromT>
Vector2<T>& operator=(const Vector2<FromT>& rhs)
{
x = static_cast<T>(rhs.x);
y = static_cast<T>(rhs.y);
return * this;
}

/**
* Copy operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T>& operator=(const Vector2<T>& rhs)
{
x = rhs.x;
y = rhs.y;
return * this;
}


/**
* Array access operator
* @param n Array index
* @return For n = 0, reference to x coordinate, else reference to y
* y coordinate.
*/

T& operator[](int n)
{
assert(n >= 0 && n <= 1);
if (0 == n) return x;
else
return y;
}


//---------------[ vector aritmetic operator ]--------------
/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T> operator+(const Vector2<T>& rhs) const
{
return Vector2<T> (x + rhs.x, y + rhs.y);
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T> operator-(const Vector2<T>& rhs) const
{
return Vector2<T> (x - rhs.x, y - rhs.y);
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T> operator*(const Vector2<T>& rhs) const
{
return Vector2<T> (x * rhs.x, y * rhs.y);
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T> operator/(const Vector2<T>& rhs) const
{
return Vector2<T> (x / rhs.x, y / rhs.y);
}

/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T>& operator+=(const Vector2<T>& rhs)
{
x += rhs.x;
y += rhs.y;
return * this;
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T>& operator-=(const Vector2<T>& rhs)
{
x -= rhs.x;
y -= rhs.y;
return * this;
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T>& operator*=(const Vector2<T>& rhs)
{
x *= rhs.x;
y *= rhs.y;
return * this;
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T>& operator/=(const Vector2<T>& rhs)
{
x /= rhs.x;
y /= rhs.y;
return * this;
}

//--------------[ scalar vector operator ]--------------------
/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T> operator+(T rhs) const
{
return Vector2<T> (x + rhs, y + rhs);
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T> operator-(T rhs) const
{
return Vector2<T> (x - rhs, y - rhs);
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T> operator*(T rhs) const
{
return Vector2<T> (x * rhs, y * rhs);
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T> operator/(T rhs) const
{
return Vector2<T> (x / rhs, y / rhs);
}

/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T>& operator+=(T rhs)
{
x += rhs;
y += rhs;
return * this;
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T>& operator-=(T rhs)
{
x -= rhs;
y -= rhs;
return * this;
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T>& operator*=(T rhs)
{
x *= rhs;
y *= rhs;
return * this;
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Vector2<T>& operator/=(T rhs)
{
x /= rhs;
y /= rhs;
return * this;
}

//--------------[ equality operator ]------------------------
/**
* Equality test operator
* @param rhs Right hand side argument of binary operator.
* @note Test of equality is based of threshold EPSILON value. To be two
* values equal, must satisfy this condition | lws.x - rhs.y | < EPSILON,
* same for y-coordinate.
*/

bool operator==(const Vector2<T>& rhs) const
{
return (std::abs(x - rhs.x) < EPSILON) && (std::abs(y - rhs.y) < EPSILON);
}


/**
* Inequality test operator
* @param rhs Right hand side argument of binary operator.
* @return not (lhs == rhs) :-P
*/

bool operator!=(const Vector2<T>& rhs) const { return ! (*this == rhs); }

//-------------[ unary operations ]--------------------------
/**
* Unary negate operator
* @return negated vector
*/

Vector2<T> operator-() const
{
return Vector2<T>(-x, -y);
}

//-------------[ size operations ]---------------------------
/**
* Get lenght of vector.
* @return lenght of vector
*/

T length() const
{
return (T)std::sqrt(x * x + y * y);
}

/**
* Normalize vector
*/

void normalize()
{
T s = length();
x /= s;
y /= s;
}

/**
* Return square of length.
* @return lenght ^ 2
* @note This method is faster then length(). For comparsion
* of length of two vector can be used just this value, instead
* of computionaly more expensive length() method.
*/

T lengthSq() const
{
return x * x + y * y;
}

//--------------[ misc. operations ]-----------------------
/**
* Linear interpolation of two vectors
* @param fact Factor of interpolation. For translation from positon
* of this vector to vector r, values of factor goes from 0.0 to 1.0.
* @param r Second Vector for interpolation
* @note Hovewer values of fact parameter are reasonable only in interval
* [0.0 , 1.0], you can pass also values outside of this interval and you
* can get result (extrapolation?)
*/

Vector2<T> lerp(T fact, const Vector2<T>& r) const
{
return (*this) + (r - (*this)) * fact;
}


//-------------[ conversion ]-----------------------------
/**
* Conversion to pointer operator
* @return Pointer to internaly stored (in managment of class Vector2<T>)
* used for passing Vector2<T> values to gl*2[fd] functions.
*/

operator T*(){ return (T*) this; }
/**
* Conversion to pointer operator
* @return Constant Pointer to internaly stored (in managment of class Vector2<T>)
* used for passing Vector2<T> values to gl*2[fd] functions.
*/

operator const T*() const { return (const T*) this; }

//-------------[ output operator ]------------------------
/**
* Output to stream operator
* @param lhs Left hand side argument of operator (commonly ostream instance).
* @param rhs Right hand side argument of operator.
* @return Left hand side argument - the ostream object passed to operator.
*/

friend std::ostream& operator<<(std::ostream& lhs, const Vector2<T>& rhs)
{
lhs << "[" << rhs.x << "," << rhs.y << "]";
return lhs;
}


};


//--------------------------------------
// Typedef shortcuts for 2D vector
//-------------------------------------
/// Two dimensional Vector of floats
typedef class Vector2 <float> Vector2f;
typedef class Vector2 <unsigned int> Vector2i;
/// Two dimensional Vector of doubles
// typedef class Vector2 <double> Vector2d;


/**
* Class for three dimensional vector.
*/

template <class T>
class DLL_EXPORT Vector3
{
public:
union
{
struct
{
T x, y, z;
};

struct
{
T s, t, u;
};

struct
{
T r, g, b;
};

T v[3];
};

//----------------[ constructors ]--------------------------
/**
* Creates and sets to (0,0,0)
*/

Vector3() : x(0),y(0),z(0)
{ }

/**
* Creates and sets to (x,y,z)
* @param nx intial x-coordinate value
* @param ny intial y-coordinate value
* @param nz intial z-coordinate value
*/

Vector3(T nx, T ny, T nz) : x(nx),y(ny),z(nz)
{ }

/**
* Copy constructor.
* @param src Source of data for new created Vector3 instance.
*/

Vector3(const Vector3<T>& src)
: x(src.x), y(src.y), z(src.z)
{}

/**
* Copy casting constructor.
* @param src Source of data for new created Vector3 instance.
*/

template <class FromT>
Vector3(const Vector3<FromT>& src)
: x(static_cast<T>(src.x)),
y(static_cast<T>(src.y)),
z(static_cast<T>(src.z))
{}


//----------------[ access operators ]-------------------
/**
* Copy operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T> operator=(const Vector3<T>& rhs)
{
x = rhs.x;
y = rhs.y;
z = rhs.z;
return * this;
}

/**
* Copy casting operator.
* @param rhs Right hand side argument of binary operator.
*/

template <class FromT>
Vector3<T> operator=(const Vector3<FromT>& rhs)
{
x = static_cast<T>(rhs.x);
y = static_cast<T>(rhs.y);
z = static_cast<T>(rhs.z);
return * this;
}

/**
* Array access operator
* @param n Array index
* @return For n = 0, reference to x coordinate, n = 1
* reference to y, else reference to z
* y coordinate.
*/

T & operator[](int n)
{
assert(n >= 0 && n <= 2);
if (0 == n) return x;
else if (1 == n) return y;
else
return z;
}


//---------------[ vector aritmetic operator ]--------------
/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T> operator+(const Vector3<T>& rhs) const
{
return Vector3<T> (x + rhs.x, y + rhs.y, z + rhs.z);
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T> operator-(const Vector3<T>& rhs) const
{
return Vector3<T> (x - rhs.x, y - rhs.y, z - rhs.z);
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T> operator*(const Vector3<T>& rhs) const
{
return Vector3<T> (x * rhs.x, y * rhs.y, z * rhs.z);
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T> operator/(const Vector3<T>& rhs) const
{
return Vector3<T> (x / rhs.x, y / rhs.y, z / rhs.z);
}

/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T>& operator+=(const Vector3<T>& rhs)
{
x += rhs.x;
y += rhs.y;
z += rhs.z;
return * this;
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T>& operator-=(const Vector3<T>& rhs)
{
x -= rhs.x;
y -= rhs.y;
z -= rhs.z;
return * this;
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T>& operator*=(const Vector3<T>& rhs)
{
x *= rhs.x;
y *= rhs.y;
z *= rhs.z;
return * this;
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T>& operator/=(const Vector3<T>& rhs)
{
x /= rhs.x;
y /= rhs.y;
z /= rhs.z;
return * this;
}

/**
* Dot product of two vectors.
* @param rhs Right hand side argument of binary operator.
*/

T dotProduct(const Vector3<T>& rhs) const
{
return x * rhs.x + y * rhs.y + z * rhs.z;
}

/**
* Cross product opertor
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T> crossProduct(const Vector3<T>& rhs) const
{
return Vector3<T> (y * rhs.z - rhs.y * z, z * rhs.x - rhs.z * x, x * rhs.y - rhs.x * y);
}


//--------------[ scalar vector operator ]--------------------
/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T> operator+(T rhs) const
{
return Vector3<T> (x + rhs, y + rhs, z + rhs);
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T> operator-(T rhs) const
{
return Vector3<T> (x - rhs, y - rhs, z - rhs);
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T> operator*(T rhs) const
{
return Vector3<T> (x * rhs, y * rhs, z * rhs);
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T> operator/(T rhs) const
{
return Vector3<T> (x / rhs, y / rhs, z / rhs);
}

/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T>& operator+=(T rhs)
{
x += rhs;
y += rhs;
z += rhs;
return * this;
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T>& operator-=(T rhs)
{
x -= rhs;
y -= rhs;
z -= rhs;
return * this;
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T>& operator*=(T rhs)
{
x *= rhs;
y *= rhs;
z *= rhs;
return * this;
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T>& operator/=(T rhs)
{
x /= rhs;
y /= rhs;
z /= rhs;
return * this;
}

//--------------[ equiality operator ]------------------------
/**
* Equality test operator
* @param rhs Right hand side argument of binary operator.
* @note Test of equality is based of threshold EPSILON value. To be two
* values equal, must satisfy this condition | lws.x - rhs.y | < EPSILON,
* same for y-coordinate, and z-coordinate.
*/

bool operator==(const Vector3<T>& rhs) const
{
return std::fabs(x - rhs.x) < EPSILON
&& std::fabs(y - rhs.y) < EPSILON
&& std::fabs(z - rhs.z) < EPSILON;
}

/**
* Inequality test operator
* @param rhs Right hand side argument of binary operator.
* @return not (lhs == rhs) :-P
*/

bool operator!=(const Vector3<T>& rhs) const { return !(*this == rhs); }

//-------------[ unary operations ]--------------------------
/**
* Unary negate operator
* @return negated vector
*/

Vector3<T> operator-() const
{
return Vector3<T>(-x, -y, -z);
}

/**
* Reflect this vector onto a surface normal.
*/

Vector3<T> reflect(const Vector3<T>& normal) const
{
T factor = -2 * dotProduct(normal);
return Vector3<T>(x + factor * normal.x,
y + factor * normal.y,
z + factor * normal.z);
}

//-------------[ size operations ]---------------------------
/**
* Get lenght of vector.
* @return lenght of vector
*/

T length() const
{
return (T)std::sqrt(x * x + y * y + z * z);
}

/**
* Get the distance between two vectors.
*/

T distance(const Vector3<T>& rhs) const
{
return (T)std::sqrt(distanceSq(rhs));
}

/**
* Get the squared distance between two vectors.
*/

T distanceSq(const Vector3<T>& rhs) const
{
Vector3<T> diff = rhs - *this;
return (T)(diff.dotProduct(diff));
}

/**
* Return square of length.
* @return lenght ^ 2
* @note This method is faster then length(). For comparsion
* of length of two vector can be used just this value, instead
* of computionaly more expensive length() method.
*/

T lengthSq() const
{
return x * x + y * y + z * z;
}

/**
* Normalize vector
*/

void normalize()
{
T s = length();
if (s == 0)
return;

x /= s;
y /= s;
z /= s;
}

//------------[ other operations ]---------------------------
/**
* Rotate vector around three axis.
* @param ax Angle (in degrees) to be rotated around X-axis.
* @param ay Angle (in degrees) to be rotated around Y-axis.
* @param az Angle (in degrees) to be rotated around Z-axis.
*/

void rotate(T ax, T ay, T az)
{
T a = static_cast<T>(cos(DEG2RAD(ax)));
T b = static_cast<T>(sin(DEG2RAD(ax)));
T c = static_cast<T>(cos(DEG2RAD(ay)));
T d = static_cast<T>(sin(DEG2RAD(ay)));
T e = static_cast<T>(cos(DEG2RAD(az)));
T f = static_cast<T>(sin(DEG2RAD(az)));
T nx = static_cast<T>(c * e * x - c * f * y + d * z);
T ny = static_cast<T>((a * f + b * d * e) * x + (a * e - b * d * f) * y - b * c * z);
T nz = static_cast<T>((b * f - a * d * e) * x + (a * d * f + b * e) * y + a * c * z);
x = nx; y = ny; z = nz;
}

/**
* Linear interpolation of two vectors
* @param fact Factor of interpolation. For translation from positon
* of this vector to vector r, values of factor goes from 0.0 to 1.0.
* @param r Second Vector for interpolation
* @note Hovewer values of fact parameter are reasonable only in interval
* [0.0 , 1.0], you can pass also values outside of this interval and you
* can get result (extrapolation?)
*/

Vector3<T> lerp(T fact, const Vector3<T>& r) const
{
return (*this) + (r - (*this)) * fact;
}





//-------------[ conversion ]-----------------------------

/**
* Conversion to pointer operator
* @return Pointer to internaly stored (in managment of class Vector3<T>)
* used for passing Vector3<T> values to gl*3[fd] functions.
*/

operator T*(){ return (T*) this; }

/**
* Conversion to pointer operator
* @return Constant Pointer to internaly stored (in managment of class Vector3<T>)
* used for passing Vector3<T> values to gl*3[fd] functions.
*/

operator const T*() const { return (const T*) this; }

//-------------[ output operator ]------------------------
/**
* Output to stream operator
* @param lhs Left hand side argument of operator (commonly ostream instance).
* @param rhs Right hand side argument of operator.
* @return Left hand side argument - the ostream object passed to operator.
*/

friend std::ostream& operator<<(std::ostream& lhs, const Vector3<T>& rhs)
{
lhs << "[" << rhs.x << "," << rhs.y << "," << rhs.z << "]";
return lhs;
}

};


/// Three dimensional Vector of floats
typedef Vector3 <float> Vector3f;
/// Three dimensional Vector of doubles
// typedef Vector3 <double> Vector3d;

// Romulo - Sphere
template <class T>
class DLL_EXPORT Sphere
{
public:
Vector3<T> m_position;
T m_radius;

Sphere() : m_radius(1) {}
Sphere(const Vector3<T>& pos, T radius) : m_position(pos), m_radius(radius) {}

/**
* Test sphere intersection.
*/

bool intersect(const Sphere<T>& rhs) const
{
Vector3<T> diff = rhs.m_position - m_position;
T radSum = m_radius + rhs.m_radius;
return (diff.lengthSq() <= (radSum*radSum));
}

void draw() const;
};

typedef Sphere<float> Spheref;
// typedef Sphere<double> Sphered;

// Romulo - Bounding Box
template <class T>
class DLL_EXPORT BoundingBox
{
public:

/**
* Since we will need all vertices for occlusion culling
* keep them into an array.
*/

Vector3<T> m_vertices[8];
unsigned short m_indices[36];

Vector3<T> m_mins;
Vector3<T> m_maxs;
bool m_visible;

/**
* Empty Constructor. 0 volume box.
*/

BoundingBox() : m_visible(false) {}

/**
* Constructs a bounding box from minimum and maximum vectors.
*/

BoundingBox(const Vector3<T>& mins, const Vector3<T>& maxs)
: m_mins(mins), m_maxs(maxs), m_visible(false) {}

/**
* Set bounding box visibility.
*/

void setVisible(bool visible)
{
m_visible = visible;
}

/**
* Update the bounding box if necessary.
* Return true if the bounding box changed.
*/

bool update(const Vector3<T>& mins, const Vector3<T>& maxs);

/**
* Draw this bounding box with local coordinates.
* @filled If this is true, cause the AABB to be filled with a single color
* (very useful for occlusion queries)
*/

void draw(bool filled = false);

/**
* Return the vertices of the box in a std::vector
*/

void getVertices();

private:
/**
* Called internally, draw AABB bounds only.
*/

void drawBounds();

/**
* Called internally, draw filled AABB.
*/

void drawFilled();

};

typedef BoundingBox <float> BoundingBoxf;
// typedef BoundingBox <double> BoundingBoxd;

// Romulo - Plane
template <class T>
class DLL_EXPORT Plane
{
public:
Vector3<T> m_normal;
float m_d;

enum planeSides_t
{
FRONT_SIDE,
BACK_SIDE,
IN_PLANE
};

// Concatenate plane information.
friend std::ostream& operator<<(std::ostream& lhs, const Plane<T>& rhs)
{
lhs << "[" << rhs.m_normal << "," << rhs.m_d << "]";
return lhs;
}

/**
* Default constructor
*/

Plane()
{}

/**
* Create a plane from a normal vector and a point inside the plane.
*/

Plane(const Vector3<T>& normal, const Vector3<T>& point)
{
m_normal = normal;
m_d = -(normal.dotProduct(point));
}

/**
* Create a plane from three points.
*/

Plane(const Vector3<T>& a, const Vector3<T>& b, const Vector3<T>& c)
{
Vector3<T> edge1, edge2;
edge1 = b - a;
edge2 = c - a;

m_normal = edge1.crossProduct(edge2);
m_normal.normalize();

m_d = -(m_normal.dotProduct(a));
}

/**
* Returns true if this plane intersects another plane.
*/

bool intersect(const Plane<T>& rhs) const
{
return fabs(m_normal.dotProduct(rhs.m_normal)) < 1;
}

/**
* Returns true if this plane intersects the sphere.
*/

bool intersect(const Sphere<T>& sphere) const
{
return (fabs(distance(sphere.m_position)) <= sphere.m_radius);
}

/**
* Calculate the distance of a point "p" to the plane.
* If the distance is negative, the point is behind the plane.
*/

T distance(const Vector3<T>& p) const
{
return (m_normal.dotProduct(p) + m_d);
}

/**
* Return if the point is in front or back side of this plane.
*/

planeSides_t categorizePosition(const Vector3<T>& p) const
{
if (distance(p) == 0)
return IN_PLANE;
else
if (distance(p) < 0)
return BACK_SIDE;
else
return FRONT_SIDE;
}
};

typedef Plane <float> Planef;
// typedef Plane <double> Planed;

/**
* Class for four dimensional vector.
*/

template <class T>
class DLL_EXPORT Vector4
{
public:
union
{
struct
{
T x, y, z, w;
};

struct
{
T r, g, b, a;
};

T v[4];
};


//----------------[ constructors ]--------------------------
/**
* Creates and sets to (0,0,0,0)
*/

Vector4() : x(0),y(0),z(0),w(0)
{ }


/**
* Creates and sets to (x,y,z,z)
* @param nx intial x-coordinate value (R)
* @param ny intial y-coordinate value (G)
* @param nz intial z-coordinate value (B)
* @param nw intial w-coordinate value (Aplha)
*/

Vector4(T nx, T ny, T nz, T nw) : x(nx), y(ny), z(nz), w(nw)
{ }

/**
* Copy constructor.
* @param src Source of data for new created Vector4 instance.
*/

Vector4(const Vector4<T>& src)
: x(src.x), y(src.y), z(src.z), w(src.w)
{}

/**
* Copy casting constructor.
* @param src Source of data for new created Vector4 instance.
*/

template <class FromT>
Vector4(const Vector4<FromT>& src)
: x(static_cast<T>(src.x)),
y(static_cast<T>(src.y)),
z(static_cast<T>(src.z)),
w(static_cast<T>(src.w))
{}


//----------------[ access operators ]-------------------
/**
* Copy operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T> operator=(const Vector4<T>& rhs)
{
x = rhs.x;
y = rhs.y;
z = rhs.z;
w = rhs.w;
return * this;
}

/**
* Copy casting operator
* @param rhs Right hand side argument of binary operator.
*/

template<class FromT>
Vector4<T> operator=(const Vector4<FromT>& rhs)
{
x = static_cast<T>(rhs.x);
y = static_cast<T>(rhs.y);
z = static_cast<T>(rhs.z);
w = static_cast<T>(rhs.w);
return * this;
}


/**
* Array access operator
* @param n Array index
* @return For n = 0, reference to x coordinate, n = 1
* reference to y coordinate, n = 2 reference to z,
* else reference to w coordinate.
*/

T & operator[](int n)
{
assert(n >= 0 && n <= 3);
if (0 == n) return x;
else if (1 == n) return y;
else if (2 == n) return z;
else return w;
}


//---------------[ vector aritmetic operator ]--------------
/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T> operator+(const Vector4<T>& rhs) const
{
return Vector4<T> (x + rhs.x, y + rhs.y, z + rhs.z, w + rhs.w);
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T> operator-(const Vector4<T>& rhs) const
{
return Vector4<T> (x - rhs.x, y - rhs.y, z - rhs.z, w - rhs.w);
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T> operator*(const Vector4<T>& rhs) const
{
return Vector4<T> (x * rhs.x, y * rhs.y, z * rhs.z, w * rhs.w);
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T> operator/(const Vector4<T>& rhs) const
{
return Vector4<T> (x / rhs.x, y / rhs.y, z / rhs.z, w / rhs.w);
}

/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T>& operator+=(const Vector4<T>& rhs)
{
x += rhs.x;
y += rhs.y;
z += rhs.z;
w += rhs.w;
return * this;
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T>& operator-=(const Vector4<T>& rhs)
{
x -= rhs.x;
y -= rhs.y;
z -= rhs.z;
w -= rhs.w;
return * this;
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T>& operator*=(const Vector4<T>& rhs)
{
x *= rhs.x;
y *= rhs.y;
z *= rhs.z;
w *= rhs.w;
return * this;
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T>& operator/=(const Vector4<T>& rhs)
{
x /= rhs.x;
y /= rhs.y;
z /= rhs.z;
w /= rhs.w;
return * this;
}

//--------------[ equiality operator ]------------------------
/**
* Equality test operator
* @param rhs Right hand side argument of binary operator.
* @note Test of equality is based of threshold EPSILON value. To be two
* values equal, must satisfy this condition | lws.x - rhs.y | < EPSILON,
* same for y-coordinate, z-coordinate, and w-coordinate.
*/

bool operator==(const Vector4<T>& rhs) const
{
return std::fabs(x - rhs.x) < EPSILON
&& std::fabs(y - rhs.y) < EPSILON
&& std::fabs(z - rhs.z) < EPSILON
&& std::fabs(w - rhs.w) < EPSILON;
}


/**
* Inequality test operator
* @param rhs Right hand side argument of binary operator.
* @return not (lhs == rhs) :-P
*/

bool operator!=(const Vector4<T>& rhs) const { return ! (*this == rhs); }

//-------------[ unary operations ]--------------------------
/**
* Unary negate operator
* @return negated vector
*/

Vector4<T> operator-() const
{
return Vector4<T>(-x, -y, -z, -w);
}

//--------------[ scalar vector operator ]--------------------

/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T> operator+(T rhs) const
{
return Vector4<T> (x + rhs, y + rhs, z + rhs, w + rhs);
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T> operator-(T rhs) const
{
return Vector4<T> (x - rhs, y - rhs, z - rhs, w - rhs);
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T> operator*(T rhs) const
{
return Vector4<T> (x * rhs, y * rhs, z * rhs, w * rhs);
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T> operator/(T rhs) const
{
return Vector4<T> (x / rhs, y / rhs, z / rhs, w / rhs);
}

/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T>& operator+=(T rhs)
{
x += rhs;
y += rhs;
z += rhs;
w += rhs;
return * this;
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T>& operator-=(T rhs)
{
x -= rhs;
y -= rhs;
z -= rhs;
w -= rhs;
return * this;
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T>& operator*=(T rhs)
{
x *= rhs;
y *= rhs;
z *= rhs;
w *= rhs;
return * this;
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T>& operator/=(T rhs)
{
x /= rhs;
y /= rhs;
z /= rhs;
w /= rhs;
return * this;
}

//-------------[ size operations ]---------------------------
/**
* Get lenght of vector.
* @return lenght of vector
*/

T length() const
{
return (T)std::sqrt(x * x + y * y + z * z + w * w);
}

/**
* Normalize vector
*/

void normalize()
{
T s = length();
x /= s;
y /= s;
z /= s;
w /= s;
}

/**
* Return square of length.
* @return lenght ^ 2
* @note This method is faster then length(). For comparsion
* of length of two vector can be used just this value, instead
* of computionaly more expensive length() method.
*/

T lengthSq() const
{
return x * x + y * y + z * z + w * w;
}

//--------------[ misc. operations ]-----------------------
/**
* Linear interpolation of two vectors
* @param fact Factor of interpolation. For translation from positon
* of this vector to vector r, values of factor goes from 0.0 to 1.0.
* @param r Second Vector for interpolation
* @note Hovewer values of fact parameter are reasonable only in interval
* [0.0 , 1.0], you can pass also values outside of this interval and you
* can get result (extrapolation?)
*/

Vector4<T> lerp(T fact, const Vector4<T>& r) const
{
return (*this) + (r - (*this)) * fact;
}




//-------------[ conversion ]-----------------------------

/**
* Conversion to pointer operator
* @return Pointer to internaly stored (in managment of class Vector4<T>)
* used for passing Vector4<T> values to gl*4[fd] functions.
*/

operator T*(){ return (T*) this; }


/**
* Conversion to pointer operator
* @return Constant Pointer to internaly stored (in managment of class Vector4<T>)
* used for passing Vector4<T> values to gl*4[fd] functions.
*/

operator const T*() const { return (const T*) this; }

//-------------[ output operator ]------------------------
/**
* Output to stream operator
* @param lhs Left hand side argument of operator (commonly ostream instance).
* @param rhs Right hand side argument of operator.
* @return Left hand side argument - the ostream object passed to operator.
*/

friend std::ostream& operator<<(std::ostream& lhs, const Vector4<T>& rhs)
{
lhs << "[" << rhs.x << "," << rhs.y << "," << rhs.z << "," << rhs.w << "]";
return lhs;
}
};

/// Three dimensional Vector of floats
typedef Vector4<float> Vector4f;
/// Three dimensional Vector of doubles
// typedef Vector4<double> Vector4d;





/**
* Class for matrix 3x3.
* @note Data stored in this matrix are in column major order.
*/

template <class T>
class DLL_EXPORT Matrix3
{
public:
/// Data stored in column major order
T data[9];

//--------------------------[ constructors ]-------------------------------
/**
* Creates identity matrix
*/

Matrix3()
{
for (int i = 0; i < 9; i++)
data = static_cast<T>((i % 4) ? 0 : 1);
}

/**
* Copy matrix values from array (these data must be in column
* major order!)
*/

Matrix3(const T * dt)
{
std::memcpy(data, dt, sizeof(T) * 9);
}

/**
* Copy constructor.
* @param src Data source for new created instance of Matrix3
*/

Matrix3(const Matrix3<T>& src)
{
std::memcpy(data, src.data, sizeof(T) * 9);
}

/**
* Creates a matrix from 3 vectors that
* will match rows instead of columns. The final matrix
* form will be:
*
* | a.x a.y a.z |
* | b.x b.y b.z |
* | c.x c.y c.z |
*/

Matrix3(const Vector3<T>& a, const Vector3<T>& b, const Vector3<T>& c)
{
data[0] = a.x;
data[1] = b.x;
data[2] = c.x;

data[3] = a.y;
data[4] = b.y;
data[5] = c.y;

data[6] = a.z;
data[7] = b.z;
data[8] = c.z;
}

/**
* Copy casting constructor.
* @param src Data source for new created instance of Matrix3
*/

template<class FromT>
Matrix3(const Matrix3<FromT>& src)
{
for (int i = 0; i < 9; i++)
{
data = static_cast<T>(src.data);
}
}

/**
* Resets matrix to be identity matrix
*/

void identity()
{
for (int i = 0; i < 9; i++)
data = static_cast<T>((i % 4) ? 0 : 1);
}



/**
* Creates rotation matrix by rotation around axis.
* @param a Angle (in radians) of rotation around axis X.
* @param b Angle (in radians) of rotation around axis Y.
* @param c Angle (in radians) of rotation around axis Z.
*/

static Matrix3<T> createRotationAroundAxis(T a, T b, T c)
{
Matrix3<T> ma, mb, mc;
float ac = static_cast<float>(cos(a));
float as = static_cast<float>(sin(a));
float bc = static_cast<float>(cos(b));
float bs = static_cast<float>(sin(b));
float cc = static_cast<float>(cos(c));
float cs = static_cast<float>(sin(c));

ma.at(1,1) = ac;
ma.at(2,1) = as;
ma.at(1,2) = -as;
ma.at(2,2) = ac;

mb.at(0,0) = bc;
mb.at(2,0) = -bs;
mb.at(0,2) = bs;
mb.at(2,2) = bc;

mc.at(0,0) = cc;
mc.at(1,0) = cs;
mc.at(0,1) = -cs;
mc.at(1,1) = cc;

Matrix3<T> ret = ma * mb * mc;
return ret;
}

/**
* Creates roation matrix from ODE Matrix.
*/

template <class It>
static Matrix3<T> fromOde(const It* mat)
{
Matrix3<T> ret;
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
ret.at(i,j) = static_cast<T>(mat[j * 4 + i]);
}
}
return ret;
}


//---------------------[ equiality operators ]------------------------------
/**
* Equality test operator
* @param rhs Right hand side argument of binary operator.
* @note Test of equality is based of threshold EPSILON value. To be two
* values equal, must satisfy this condition all elements of matrix
* | lws - rhs | < EPSILON,
* same for y-coordinate, z-coordinate, and w-coordinate.
*/

bool operator==(const Matrix3<T>& rhs) const
{
for (int i = 0; i < 9; i++)
{
if (std::fabs(data - rhs.data) >= EPSILON)
return false;
}
return true;
}


/**
* Inequality test operator
* @param rhs Right hand side argument of binary operator.
* @return not (lhs == rhs) :-P
*/

bool operator!=(const Matrix3<T>& rhs) const
{
return !(*this == rhs);
}


//---------------------[ access operators ]---------------------------------
/**
* Get reference to element at postion (x,y).
* @param x Number of column (0..2)
* @param y Number of row (0..2)
*/

T& at(int x, int y)
{
assert(x >= 0 && x < 3);
assert(y >= 0 && y < 3);
return data[x * 3 + y];
}

/**
* Get constant reference to element at postion (x,y).
* @param x Number of column (0..2)
* @param y Number of row (0..2)
*/

const T& at(int x, int y) const
{
assert(x >= 0 && x < 3);
assert(y >= 0 && y < 3);
return data[x * 3 + y];
}

/**
* Copy operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix3<T>& operator=(const Matrix3<T>& rhs)
{
std::memcpy(data, rhs.data, sizeof(T) * 9);
return * this;
}

/**
* Copy casting operator
* @param rhs Right hand side argument of binary operator.
*/

template<class FromT>
Matrix3<T>& operator=(const Matrix3<FromT>& rhs)
{
for (int i = 0; i < 9; i++)
{
data = static_cast<T>(rhs.data);
}
return * this;
}

/**
* Copy operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix3<T>& operator=(const T* rhs)
{
std::memcpy(data, rhs, sizeof(T) * 9);
return * this;
}


/*Matrix3<T> & operator=(const double* m)
{
for (int i = 0; i < 9; i++) data = (T)m;
return * this;
}*/


//--------------------[ matrix with matrix operations ]---------------------
/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix3<T> operator+(const Matrix3<T>& rhs) const
{
Matrix3<T> ret;
for (int i = 0; i < 9; i++)
ret.data = data + rhs.data;
return ret;
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix3<T> operator-(const Matrix3<T>& rhs) const
{
Matrix3<T> ret;
for (int i = 0; i < 9; i++)
ret.data = data - rhs.data;
return ret;
}

//--------------------[ matrix with scalar operations ]---------------------
/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix3<T> operator+(T rhs) const
{
Matrix3<T> ret;
for (int i = 0; i < 9; i++)
ret.data = data + rhs;
return ret;
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix3<T> operator-(T rhs) const
{
Matrix3<T> ret;
for (int i = 0; i < 9; i++)
ret.data = data - rhs;
return ret;
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix3<T> operator*(T rhs) const
{
Matrix3<T> ret;
for (int i = 0; i < 9; i++)
ret.data = data * rhs;
return ret;
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix3<T> operator/(T rhs) const
{
Matrix3<T> ret;
for (int i = 0; i < 9; i++)
ret.data = data / rhs;
return ret;
}


//--------------------[ multiply operators ]--------------------------------
/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T> operator*(const Vector3<T>& rhs) const
{
return Vector3<T>(
data[0] * rhs.x + data[3] * rhs.y + data[6] * rhs.z,
data[1] * rhs.x + data[4] * rhs.y + data[7] * rhs.z,
data[2] * rhs.x + data[5] * rhs.y + data[8] * rhs.z
);
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix3<T> operator*(Matrix3<T> rhs) const
{
static Matrix3<T> w;
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
T n = 0;
for (int k = 0; k < 3; k++) n += rhs.at(i, k) * at(k, j);
w.at(i, j) = n;
}
}
return w;

}


//---------------------------[ misc operations ]----------------------------
/**
* Transpose matrix.
*/

Matrix3<T> transpose()
{
Matrix3<T> ret;
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
ret.at(i,j) = at(j,i);
}
}
return ret;
}

/**
* Linear interpolation of two vectors
* @param fact Factor of interpolation. For translation from positon
* of this matrix (lhs) to matrix rhs, values of factor goes from 0.0 to 1.0.
* @param rhs Second Matrix for interpolation
* @note Hovewer values of fact parameter are reasonable only in interval
* [0.0 , 1.0], you can pass also values outside of this interval and you
* can get result (extrapolation?)
*/

Matrix3<T> lerp(T fact, const Matrix3<T>& rhs) const
{
Matrix3<T> ret = (*this) + (rhs - (*this)) * fact;
return ret;
}


//-------------[ conversion ]-----------------------------

/**
* Conversion to pointer operator
* @return Pointer to internaly stored (in managment of class Matrix3<T>)
* used for passing Matrix3<T> values to gl*[fd]v functions.
*/

operator T*(){ return (T*) data; }

/**
* Conversion to pointer operator
* @return Constant Pointer to internaly stored (in managment of class Matrix3<T>)
* used for passing Matrix3<T> values to gl*[fd]v functions.
*/

operator const T*() const { return (const T*) data; }

//----------[ output operator ]----------------------------
/**
* Output to stream operator
* @param lhs Left hand side argument of operator (commonly ostream instance).
* @param rhs Right hand side argument of operator.
* @return Left hand side argument - the ostream object passed to operator.
*/

friend std::ostream& operator << (std::ostream& lhs, const Matrix3<T>& rhs)
{
for (int i = 0; i < 3; i++)
{
lhs << "|\t";
for (int j = 0; j < 3; j++)
{
lhs << rhs.at(j,i) << "\t";
}
lhs << "|" << std::endl;
}
return lhs;
}

};

/// Matrix 3x3 of floats
typedef Matrix3<float> Matrix3f;
/// Matrix 3x3 of doubles
// typedef Matrix3<double> Matrix3d;


/**
* Class for matrix 4x4
* @note Data stored in this matrix are in column major order.
*/

template <class T>
class DLL_EXPORT Matrix4
{
public:
/// Data stored in column major order
// Stored in an union to keep alignment
ALIGN(16) T data[16];

//--------------------------[ constructors ]-------------------------------
/**
*Creates identity matrix
*/

Matrix4()
{
/**
* This original constructor was too slow.
*/


/*
for (short i = 0; i < 16; i++)
data = static_cast<T>((i % 5) ? 0 : 1);
*/


memset(data, 0, 16 * sizeof(T));
data[0] = data[5] = data[10] = data[15] = 1;
}

/**
* Copy matrix values from array (these data must be in column
* major order!)
*/

Matrix4(const T * dt)
{
std::memcpy(data, dt, sizeof(T) * 16);
}

/**
* Copy constructor.
* @param src Data source for new created instance of Matrix4.
*/

Matrix4(const Matrix4<T>& src)
{
std::memcpy(data, src.data, sizeof(T) * 16);
}

/**
* Copy casting constructor.
* @param src Data source for new created instance of Matrix4.
*/

template <class FromT>
Matrix4(const Matrix4<FromT>& src)
{
for (int i = 0; i < 16; i++)
{
data = static_cast<T>(src.data);
}
}

/**
* Resets matrix to be identity matrix
*/

void identity()
{
for (int i = 0; i < 16; i++)
data = static_cast<T>((i % 5) ? 0 : 1);
}



/**
* Creates rotation matrix by rotation around axis.
* @param a Angle (in radians) of rotation around axis X.
* @param b Angle (in radians) of rotation around axis Y.
* @param c Angle (in radians) of rotation around axis Z.
*/

static Matrix4<T> createRotationAroundAxis(T a, T b, T c)
{
Matrix4<T> ma, mb, mc;
float ac = cos(a);
float as = sin(a);
float bc = cos(b);
float bs = sin(b);
float cc = cos(c);
float cs = sin(c);

ma.at(1,1) = ac;
ma.at(2,1) = as;
ma.at(1,2) = -as;
ma.at(2,2) = ac;

mb.at(0,0) = bc;
mb.at(2,0) = -bs;
mb.at(0,2) = bs;
mb.at(2,2) = bc;

mc.at(0,0) = cc;
mc.at(1,0) = cs;
mc.at(0,1) = -cs;
mc.at(1,1) = cc;

/*std::cout << "RotVec = " << a << "," << b << "," << c << std::endl;
std::cout << "Rx = " << std::endl << ma;
std::cout << "Ry = " << std::endl << mb;
std::cout << "Rz = " << std::endl << mc;*/


Matrix4<T> ret = ma * mb * mc;
//std::cout << "Result = " << std::endl << ma * (mb * mc);

return ret;
}

/// Creates translation matrix
/**
* Creates translation matrix.
* @param x X-direction translation
* @param y Y-direction translation
* @param z Z-direction translation
* @param w for W-coordinate translation (impictily set to 1)
*/

static Matrix4<T> createTranslation(T x, T y, T z, T w = 1)
{
Matrix4 ret;
ret.at(3,0) = x;
ret.at(3,1) = y;
ret.at(3,2) = z;
ret.at(3,3) = w;

return ret;
}


//---------------------[ equiality operators ]------------------------------
/**
* Equality test operator
* @param rhs Right hand side argument of binary operator.
* @note Test of equality is based of threshold EPSILON value. To be two
* values equal, must satisfy this condition all elements of matrix
* | lws - rhs | < EPSILON,
* same for y-coordinate, z-coordinate, and w-coordinate.
*/

bool operator==(const Matrix4<T>& rhs) const
{
for (int i = 0; i < 16; i++)
{
if (std::fabs(data - rhs.data) >= EPSILON)
return false;
}
return true;
}

/**
* Inequality test operator
* @param rhs Right hand side argument of binary operator.
* @return not (lhs == rhs) :-P
*/

bool operator!=(const Matrix4<T>& rhs) const
{
return !(*this == rhs);
}


//---------------------[ access operators ]---------------------------------
/**
* Get reference to element at postion (x,y).
* @param x Number of column (0..3)
* @param y Number of row (0..3)
*/

inline T& at(int x, int y)
{
/*
assert(x >= 0 && x < 4);
assert(y >= 0 && y < 4);
*/


return data[x * 4 + y];
}

/**
* Get constant reference to element at postion (x,y).
* @param x Number of column (0..3)
* @param y Number of row (0..3)
*/

const inline T& at(int x, int y) const
{
/*
assert(x >= 0 && x < 4);
assert(y >= 0 && y < 4);
*/


return data[x * 4 + y];
}


/**
* Sets translation part of matrix.
*
* @param v Vector of translation to be set.
*/

void setTranslation(const Vector3<T>& v)
{
at(3,0) = v.x;
at(3,1) = v.y;
at(3,2) = v.z;
at(3,3) = 1;
}

Vector3<T> getTranslation()
{ return Vector3<T>(at(3,0),at(3,1),at(3,2)); }

/**
* Sets roation part (matrix 3x3) of matrix.
*
* @param m Rotation part of matrix
*/

void setRotation(const Matrix3<T>& m)
{
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
at(i,j) = m.at(i,j);
}
}
}


/**
* Copy operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix4<T>& operator=(const Matrix4<T>& rhs)
{
std::memcpy(data, rhs.data, sizeof(T) * 16);
return * this;
}

/**
* Copy casting operator
* @param rhs Right hand side argument of binary operator.
*/

template <class FromT>
Matrix4<T>& operator=(const Matrix4<FromT>& rhs)
{
for (int i = 0; i < 16; i++)
{
data = static_cast<T>(rhs.data);
}
return * this;
}

/**
* Copy operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix4<T>& operator=(const T* rhs)
{
std::memcpy(data, rhs, sizeof(T) * 16);
return * this;
}

/*Matrix4<T> & operator=(const double* m)
{
for (int i = 0; i < 16; i++) data = (T)m;
return * this;
}*/


//--------------------[ matrix with matrix operations ]---------------------
/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix4<T> operator+(const Matrix4<T>& rhs) const
{
Matrix4<T> ret;
for (int i = 0; i < 16; i++)
ret.data = data + rhs.data;
return ret;
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix4<T> operator-(const Matrix4<T>& rhs) const
{
Matrix4<T> ret;
for (int i = 0; i < 16; i++)
ret.data = data - rhs.data;
return ret;
}

//--------------------[ matrix with scalar operations ]---------------------
/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix4<T> operator+(T rhs) const
{
Matrix4<T> ret;
for (int i = 0; i < 16; i++)
ret.data = data + rhs;
return ret;
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix4<T> operator-(T rhs) const
{
Matrix4<T> ret;
for (int i = 0; i < 16; i++)
ret.data = data - rhs;
return ret;
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix4<T> operator*(T rhs) const
{
Matrix4<T> ret;
for (int i = 0; i < 16; i++)
ret.data = data * rhs;
return ret;
}

/**
* Division operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix4<T> operator/(T rhs) const
{
Matrix4<T> ret;
for (int i = 0; i < 16; i++)
ret.data = data / rhs;
return ret;
}


//--------------------[ multiply operators ]--------------------------------
/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector4<T> operator*(const Vector4<T>& rhs) const
{
return Vector4<T>(
data[0] * rhs.x + data[4] * rhs.y + data[8] * rhs.z + data[12] * rhs.w,
data[1] * rhs.x + data[5] * rhs.y + data[9] * rhs.z + data[13] * rhs.w,
data[2] * rhs.x + data[6] * rhs.y + data[10] * rhs.z + data[14] * rhs.w,
data[3] * rhs.x + data[7] * rhs.y + data[11] * rhs.z + data[15] * rhs.w
);

}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Vector3<T> operator*(const Vector3<T>& rhs) const
{
return Vector3<T>(
data[0] * rhs.x + data[4] * rhs.y + data[8] * rhs.z,
data[1] * rhs.x + data[5] * rhs.y + data[9] * rhs.z,
data[2] * rhs.x + data[6] * rhs.y + data[10] * rhs.z
);
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Matrix4<T> operator*(const Matrix4<T>& rhs) const;

//---------------------------[ misc operations ]----------------------------

/**
* Computes determinant of matrix
* @return Determinant of matrix
* @note This function does 3 * 4 * 6 mul, 3 * 6 add.
*/

T det() const
{

return
+ at(3,0) * at(2,1) * at(1,2) * at(0,3)
- at(2,0) * at(3,1) * at(1,2) * at(0,3)
- at(3,0) * at(1,1) * at(2,2) * at(0,3)
+ at(1,0) * at(3,1) * at(2,2) * at(0,3)

+ at(2,0) * at(1,1) * at(3,2) * at(0,3)
- at(1,0) * at(2,1) * at(3,2) * at(0,3)
- at(3,0) * at(2,1) * at(0,2) * at(1,3)
+ at(2,0) * at(3,1) * at(0,2) * at(1,3)

+ at(3,0) * at(0,1) * at(2,2) * at(1,3)
- at(0,0) * at(3,1) * at(2,2) * at(1,3)
- at(2,0) * at(0,1) * at(3,2) * at(1,3)
+ at(0,0) * at(2,1) * at(3,2) * at(1,3)

+ at(3,0) * at(1,1) * at(0,2) * at(2,3)
- at(1,0) * at(3,1) * at(0,2) * at(2,3)
- at(3,0) * at(0,1) * at(1,2) * at(2,3)
+ at(0,0) * at(3,1) * at(1,2) * at(2,3)

+ at(1,0) * at(0,1) * at(3,2) * at(2,3)
- at(0,0) * at(1,1) * at(3,2) * at(2,3)
- at(2,0) * at(1,1) * at(0,2) * at(3,3)
+ at(1,0) * at(2,1) * at(0,2) * at(3,3)

+ at(2,0) * at(0,1) * at(1,2) * at(3,3)
- at(0,0) * at(2,1) * at(1,2) * at(3,3)
- at(1,0) * at(0,1) * at(2,2) * at(3,3)
+ at(0,0) * at(1,1) * at(2,2) * at(3,3);


}


/**
* Computes inverse matrix
* @return Inverse matrix of this matrix.
* @note This is a little bit time consuming operation
* (16 * 6 * 3 mul, 16 * 5 add + det() + mul() functions)
*/

Matrix4<T> inverse() const
{
Matrix4<T> ret;

ret.at(0,0) =
+ at(2,1) * at(3,2) * at(1,3)
- at(3,1) * at(2,2) * at(1,3)
+ at(3,1) * at(1,2) * at(2,3)
- at(1,1) * at(3,2) * at(2,3)
- at(2,1) * at(1,2) * at(3,3)
+ at(1,1) * at(2,2) * at(3,3);

ret.at(1,0) =
+ at(3,0) * at(2,2) * at(1,3)
- at(2,0) * at(3,2) * at(1,3)
- at(3,0) * at(1,2) * at(2,3)
+ at(1,0) * at(3,2) * at(2,3)
+ at(2,0) * at(1,2) * at(3,3)
- at(1,0) * at(2,2) * at(3,3);

ret.at(2,0) =
+ at(2,0) * at(3,1) * at(1,3)
- at(3,0) * at(2,1) * at(1,3)
+ at(3,0) * at(1,1) * at(2,3)
- at(1,0) * at(3,1) * at(2,3)
- at(2,0) * at(1,1) * at(3,3)
+ at(1,0) * at(2,1) * at(3,3);

ret.at(3,0) =
+ at(3,0) * at(2,1) * at(1,2)
- at(2,0) * at(3,1) * at(1,2)
- at(3,0) * at(1,1) * at(2,2)
+ at(1,0) * at(3,1) * at(2,2)
+ at(2,0) * at(1,1) * at(3,2)
- at(1,0) * at(2,1) * at(3,2);

ret.at(0,1) =
+ at(3,1) * at(2,2) * at(0,3)
- at(2,1) * at(3,2) * at(0,3)
- at(3,1) * at(0,2) * at(2,3)
+ at(0,1) * at(3,2) * at(2,3)
+ at(2,1) * at(0,2) * at(3,3)
- at(0,1) * at(2,2) * at(3,3);

ret.at(1,1) =
+ at(2,0) * at(3,2) * at(0,3)
- at(3,0) * at(2,2) * at(0,3)
+ at(3,0) * at(0,2) * at(2,3)
- at(0,0) * at(3,2) * at(2,3)
- at(2,0) * at(0,2) * at(3,3)
+ at(0,0) * at(2,2) * at(3,3);

ret.at(2,1) =
+ at(3,0) * at(2,1) * at(0,3)
- at(2,0) * at(3,1) * at(0,3)
- at(3,0) * at(0,1) * at(2,3)
+ at(0,0) * at(3,1) * at(2,3)
+ at(2,0) * at(0,1) * at(3,3)
- at(0,0) * at(2,1) * at(3,3);

ret.at(3,1) =
+ at(2,0) * at(3,1) * at(0,2)
- at(3,0) * at(2,1) * at(0,2)
+ at(3,0) * at(0,1) * at(2,2)
- at(0,0) * at(3,1) * at(2,2)
- at(2,0) * at(0,1) * at(3,2)
+ at(0,0) * at(2,1) * at(3,2);

ret.at(0,2) =
+ at(1,1) * at(3,2) * at(0,3)
- at(3,1) * at(1,2) * at(0,3)
+ at(3,1) * at(0,2) * at(1,3)
- at(0,1) * at(3,2) * at(1,3)
- at(1,1) * at(0,2) * at(3,3)
+ at(0,1) * at(1,2) * at(3,3);

ret.at(1,2) =
+ at(3,0) * at(1,2) * at(0,3)
- at(1,0) * at(3,2) * at(0,3)
- at(3,0) * at(0,2) * at(1,3)
+ at(0,0) * at(3,2) * at(1,3)
+ at(1,0) * at(0,2) * at(3,3)
- at(0,0) * at(1,2) * at(3,3);

ret.at(2,2) =
+ at(1,0) * at(3,1) * at(0,3)
- at(3,0) * at(1,1) * at(0,3)
+ at(3,0) * at(0,1) * at(1,3)
- at(0,0) * at(3,1) * at(1,3)
- at(1,0) * at(0,1) * at(3,3)
+ at(0,0) * at(1,1) * at(3,3);

ret.at(3,2) =
+ at(3,0) * at(1,1) * at(0,2)
- at(1,0) * at(3,1) * at(0,2)
- at(3,0) * at(0,1) * at(1,2)
+ at(0,0) * at(3,1) * at(1,2)
+ at(1,0) * at(0,1) * at(3,2)
- at(0,0) * at(1,1) * at(3,2);

ret.at(0,3) =
+ at(2,1) * at(1,2) * at(0,3)
- at(1,1) * at(2,2) * at(0,3)
- at(2,1) * at(0,2) * at(1,3)
+ at(0,1) * at(2,2) * at(1,3)
+ at(1,1) * at(0,2) * at(2,3)
- at(0,1) * at(1,2) * at(2,3);

ret.at(1,3) =
+ at(1,0) * at(2,2) * at(0,3)
- at(2,0) * at(1,2) * at(0,3)
+ at(2,0) * at(0,2) * at(1,3)
- at(0,0) * at(2,2) * at(1,3)
- at(1,0) * at(0,2) * at(2,3)
+ at(0,0) * at(1,2) * at(2,3);

ret.at(2,3) =
+ at(2,0) * at(1,1) * at(0,3)
- at(1,0) * at(2,1) * at(0,3)
- at(2,0) * at(0,1) * at(1,3)
+ at(0,0) * at(2,1) * at(1,3)
+ at(1,0) * at(0,1) * at(2,3)
- at(0,0) * at(1,1) * at(2,3);

ret.at(3,3) =
+ at(1,0) * at(2,1) * at(0,2)
- at(2,0) * at(1,1) * at(0,2)
+ at(2,0) * at(0,1) * at(1,2)
- at(0,0) * at(2,1) * at(1,2)
- at(1,0) * at(0,1) * at(2,2)
+ at(0,0) * at(1,1) * at(2,2);

return ret * det();
}



/**
* Transpose matrix.
*/

Matrix4<T> transpose() const
{
Matrix4<T> ret;
for (int i = 0; i < 4; i++)
{
for (int j = 0; j < 4; j++)
{
ret.at(i,j) = at(j,i);
}
}
return ret;
}

/**
* Linear interpolation of two vectors
* @param fact Factor of interpolation. For translation from positon
* of this matrix (lhs) to matrix rhs, values of factor goes from 0.0 to 1.0.
* @param rhs Second Matrix for interpolation
* @note Hovewer values of fact parameter are reasonable only in interval
* [0.0 , 1.0], you can pass also values outside of this interval and you
* can get result (extrapolation?)
*/

Matrix4<T> lerp(T fact, const Matrix4<T>& rhs) const
{
Matrix4<T> ret = (*this) + (rhs - (*this)) * fact;
return ret;
}


//-------------[ conversion ]-----------------------------
/**
* Conversion to pointer operator
* @return Pointer to internaly stored (in managment of class Matrix4<T>)
* used for passing Matrix4<T> values to gl*[fd]v functions.
*/

operator T*(){ return (T*) data; }

/**
* Conversion to pointer operator
* @return Constant Pointer to internaly stored (in managment of class Matrix4<T>)
* used for passing Matrix4<T> values to gl*[fd]v functions.
*/

operator const T*() const { return (const T*) data; }

//----------[ output operator ]----------------------------
/**
* Output to stream operator
* @param lhs Left hand side argument of operator (commonly ostream instance).
* @param rhs Right hand side argument of operator.
* @return Left hand side argument - the ostream object passed to operator.
*/

friend std::ostream& operator << (std::ostream& lhs, const Matrix4<T>& rhs)
{
for (int i = 0; i < 4; i++)
{
lhs << "|\t";
for (int j = 0; j < 4; j++)
{
lhs << rhs.at(j,i) << "\t";
}
lhs << "|" << std::endl;
}
return lhs;
}

};


typedef Matrix4<float> Matrix4f;
// typedef Matrix4<double> Matrix4d;


/**
* Quaternion class implementing some quaternion algebra operations.
* Quaternion is kind of complex number it consists of its real part (w)
* and its complex part v. This complex part has three elements, so we
* can express it as xi + yj + zk . Note that coordinates of (x,y,z) are
* hold inside v field.
*/

template <class T>
class DLL_EXPORT Quaternion
{
public:
/**
* Real part of quaternion.
*/

T w;
/**
* Complex part of quaternion.
*/

Vector3<T> v;

/**
* Quaternion constructor, sets quaternion to (0 + 0i + 0j + 0k).
*/

inline Quaternion(): w(1), v(0,0,0){}

/**
* Copy constructor.
*/

inline Quaternion(const Quaternion<T>& q): w(q.w), v(q.v){ }

/**
* Copy casting constructor.
*/

template <class FromT>
Quaternion(const Quaternion<FromT>& q)
: w(static_cast<T>(q.w)),
v(q.v){ }


/**
* Creates quaternion object from real part w_ and complex part v_.
* @param w_ Real part of quaternion.
* @param v_ Complex part of quaternion (xi + yj + zk).
*/

inline Quaternion(T w_, const Vector3<T>& v_): w(w_), v(v_){}

/**
* Creates quaternion object from value (w_ + xi + yj + zk).
* @param w_ Real part of quaternion.
* @param x Complex cooeficinet for i complex constant.
* @param y Complex cooeficinet for j complex constant.
* @param z Complex cooeficinet for k complex constant.
*/

inline Quaternion(T w_, T x, T y, T z): w(w_), v(x,y,z){}

/**
* Copy operator
* @param rhs Right hand side argument of binary operator.
*/

Quaternion<T>& operator= (const Quaternion<T>& rhs)
{
v = rhs.v;
w = rhs.w;
return *this;
}

/**
* Copy convert operator
* @param rhs Right hand side argument of binary operator.
*/

template<class FromT>
Quaternion<T>& operator= (const Quaternion<FromT>& rhs)
{
v = rhs.v;
w = static_cast<T>(rhs.w);
return *this;
}

/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Quaternion<T> operator+ (const Quaternion<T>& rhs) const
{
const Quaternion<T>& lhs = *this;
return Quaternion<T>(lhs.w + rhs.w, lhs.v + rhs.v);
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Quaternion<T> operator* (const Quaternion<T>& rhs) const
{
const Quaternion<T>& lhs = *this;
return Quaternion<T>(
lhs.w * rhs.w - lhs.v.x * rhs.v.x - lhs.v.y * rhs.v.y - lhs.v.z * rhs.v.z,
lhs.w * rhs.v.x + lhs.v.x * rhs.w + lhs.v.y * rhs.v.z - lhs.v.z * rhs.v.y,
lhs.w * rhs.v.y - lhs.v.x * rhs.v.z + lhs.v.y * rhs.w + lhs.v.z * rhs.v.x,
lhs.w * rhs.v.z + lhs.v.x * rhs.v.y - lhs.v.y * rhs.v.x + lhs.v.z * rhs.w
);
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Quaternion<T> operator* (T rhs) const
{
return Quaternion<T>(w*rhs, v*rhs);
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Quaternion<T> operator- (const Quaternion<T>& rhs) const
{
const Quaternion<T>& lhs = *this;
return Quaternion<T>(lhs.w - rhs.w, lhs.v - rhs.v);
}

/**
* Addition operator
* @param rhs Right hand side argument of binary operator.
*/

Quaternion<T>& operator+= (const Quaternion<T>& rhs)
{
w += rhs.w;
v += rhs.v;
return *this;
}

/**
* Substraction operator
* @param rhs Right hand side argument of binary operator.
*/

Quaternion<T>& operator-= (const Quaternion<T>& rhs)
{
w -= rhs.w;
v -= rhs.v;
return *this;
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Quaternion<T>& operator*= (const Quaternion<T>& rhs)
{
Quaternion q = (*this) * rhs;
v = q.v;
w = q.w;
return *this;
}

/**
* Multiplication operator
* @param rhs Right hand side argument of binary operator.
*/

Quaternion<T>& operator*= (T rhs)
{
w *= rhs;
v *= rhs;
return *this;
}

/**
* Return if the quaternion is identity or not.
*/

bool isIdentity() const
{
return (w == 1 && v.x == 0 && v.y == 0 && v.z == 0);
}

/**
* Equality test operator
* @param rhs Right hand side argument of binary operator.
* @note Test of equality is based of threshold EPSILON value. To be two
* values equal, must satisfy this condition | lws - rhs | < EPSILON,
* for all quaternion coordinates.
*/

bool operator==(const Quaternion<T>& rhs) const
{
const Quaternion<T>& lhs = *this;
return (std::fabs(lhs.w - rhs.w) < EPSILON) && lhs.v == rhs.v;
}

/**
* Inequality test operator
* @param rhs Right hand side argument of binary operator.
* @return not (lhs == rhs) :-P
*/

bool operator!=(const Quaternion<T>& rhs) const { return ! (*this == rhs); }

//-------------[ unary operations ]--------------------------
/**
* Unary negate operator
* @return negated quaternion
*/

Quaternion<T> operator-() const
{
return Quaternion<T>(-w, -v);
}

/**
* Unary conjugate operator
* @return conjugated quaternion
*/

Quaternion<T> operator~() const
{
return Quaternion<T>(w, -v);
}

/**
* Get lenght of quaternion.
* @return Length of quaternion.
*/

T length() const
{
return (T)std::sqrt(w*w + v.lengthSq());
}

/**
* Return square of length.
* @return lenght ^ 2
* @note This method is faster then length(). For comparsion
* of length of two quaternion can be used just this value, instead
* of computionaly more expensive length() method.
*/

T lengthSq() const
{
return w * w + v.lengthSq();
}

/**
* Transform a 3D Vector, rotating it by this quaternion.
*/

Vector3<T> transformVector(const Vector3<T>& vector) const
{
Quaternion<T> inv = Quaternion<T>(w, -v); // conjugate
Quaternion<T> copy = Quaternion<T>(w, v);
Quaternion<T> vectorQuat = Quaternion<T>(0, vector);
return ((copy * vectorQuat) * inv).v;
}

/**
* Inverse transform a 3D Vector, rotating it by this quaternion.
*/

Vector3<T> inverseTransformVector(const Vector3<T>& vector) const
{
// conjugate
Quaternion<T> inv = Quaternion<T>(w, -v);
Quaternion<T> copy = Quaternion<T>(w, v);
Quaternion<T> vectorQuat = Quaternion<T>(0, vector);

return (inv * vectorQuat * copy).v;
}

/**
* Calculates quaternion W component on a unit quaternion.
*/

void calculateUnitW()
{
T t = 1.0f - (v.x * v.x) - (v.y * v.y) - (v.z * v.z);
if (t < 0.0f)
w = 0.0f;
else
w = -(std::sqrt(t));
}

/**
* Normalize quaternion
*/

void normalize()
{
T len = length();
w /= len;
v /= len;
}

/**
* Creates quaternion for eulers angles.
* @param x Rotation around x axis (in degrees).
* @param y Rotation around y axis (in degrees).
* @param z Rotation around z axis (in degrees).
* @return Quaternion object representing transoformation.
*/

static inline Quaternion<T> fromEulerAngles(T x, T y, T z)
{
Quaternion<T> ret = fromAxisRot(Vector3<T>(1,0,0), x)
* fromAxisRot(Vector3<T>(0,1,0),y) * fromAxisRot(Vector3<T>(0,0,1),z);
return ret;
}

/**
* Creates quaternion as rotation around axis.
* @param axis Unit vector expressing axis of rotation.
* @param angleDeg Angle of rotation around axis (in degrees).
*/

static inline Quaternion<T> fromAxisRot(const Vector3<T>& axis, float angleDeg)
{
double angleRad = DEG2RAD(angleDeg);
double sa2 = std::sin(angleRad/2);
double ca2 = std::cos(angleRad/2);
return Quaternion<T>( ca2, axis * sa2);
}

/**
* Converts quaternion into rotation matrix.
* @return Rotation matrix expresing this quaternion.
*/

Matrix3<T> rotMatrix()
{
Matrix3<T> ret;

T xx = v.x * v.x;
T xy = v.x * v.y;
T xz = v.x * v.z;
T xw = v.x * w;

T yy = v.y * v.y;
T yz = v.y * v.z;
T yw = v.y * w;

T zz = v.z * v.z;
T zw = v.z * w;

ret.at(0,0) = 1 - 2 * (yy + zz);
ret.at(1,0) = 2 * (xy - zw);
ret.at(2,0) = 2 * (xz + yw);

ret.at(0,1) = 2 * (xy + zw);
ret.at(1,1) = 1 - 2 * (xx + zz);
ret.at(2,1) = 2 * (yz - xw);

ret.at(0,2) = 2 * (xz - yw);
ret.at(1,2) = 2 * (yz + xw);
ret.at(2,2) = 1 - 2 * (xx + yy);


return ret;
}

/**
* Converts quaternion into transformation matrix.
* @note This method performs same operation as rotMatrix()
* conversion method. But returns Matrix of 4x4 elements.
* @return Transformation matrix expressing this quaternion.
*/

Matrix4<T> transform() const
{
Matrix4<T> ret;

// Check if we are transforming identity
// quaternion.
if (isIdentity() == true)
return ret;

T xx = v.x * v.x;
T xy = v.x * v.y;
T xz = v.x * v.z;
T xw = v.x * w;

T yy = v.y * v.y;
T yz = v.y * v.z;
T yw = v.y * w;

T zz = v.z * v.z;
T zw = v.z * w;

ret.at(0,0) = 1 - 2 * (yy + zz);
ret.at(1,0) = 2 * (xy - zw);
ret.at(2,0) = 2 * (xz + yw);
ret.at(3,0) = 0;

ret.at(0,1) = 2 * (xy + zw);
ret.at(1,1) = 1 - 2 * (xx + zz);
ret.at(2,1) = 2 * (yz - xw);
ret.at(3,1) = 0;

ret.at(0,2) = 2 * (xz - yw);
ret.at(1,2) = 2 * (yz + xw);
ret.at(2,2) = 1 - 2 * (xx + yy);
ret.at(3,2) = 0;

ret.at(0,3) = 0;
ret.at(1,3) = 0;
ret.at(2,3) = 0;
ret.at(3,3) = 1;

return ret;
}

/**
* Linear interpolation of two quaternions
* @param fact Factor of interpolation. For translation from positon
* of this vector to quaternion rhs, values of factor goes from 0.0 to 1.0.
* @param rhs Second Quaternion for interpolation
* @note Hovewer values of fact parameter are reasonable only in interval
* [0.0 , 1.0], you can pass also values outside of this interval and you
* can get result (extrapolation?)
*/

Quaternion<T> lerp(T fact, const Quaternion<T>& rhs) const
{
return Quaternion<T>((1-fact) * w + fact * rhs.w, v.lerp(fact, rhs.v));
}

/**
* Provides output to standard output stream.
*/

friend std::ostream& operator << (std::ostream& oss, const Quaternion<T>& q)
{
oss << "Re: " << q.w << " Im: " << q.v;
return oss;
}

/**
* Creates quaternion from transform matrix.
*
* @param m Transfrom matrix used to compute quaternion.
* @return Quaternion representing rotation of matrix m.
*/

static Quaternion<T> fromMatrix(const Matrix4<T>& m)
{
Quaternion<T> q;

T tr,s;
tr = m.at(0,0) + m.at(1,1) + m.at(2,2);
if (tr >= epsilon)
{
s = (T)sqrt(tr + 1);
q.w = 0.5 * s;
s = 0.5 / s;

q.v.x = (m.at(1,2) - m.at(2,1)) * s;
q.v.y = (m.at(2,0) - m.at(0,2)) * s;
q.v.z = (m.at(0,1) - m.at(1,0)) * s;
}
else
{
T d0 = m.at(0,0);
T d1 = m.at(1,1);
T d2 = m.at(2,2);

char bigIdx = (d0 > d1) ? ((d0 > d2)? 0 : 2) : ((d1 > d2) ? 1 : 2);

if (bigIdx == 0)
{
s = (T)sqrt((d0 - (d1 + d2)) + 1);

q.v.x = 0.5 * s;
s = 0.5 / s;
q.v.y = (m.at(1,0) + m.at(0,1)) * s;
q.v.z = (m.at(2,0) + m.at(0,2)) * s;
q.w = (m.at(1,2) - m.at(2,1)) * s;
}
else if (bigIdx == 1)
{
s = (T)sqrt(1 + d1 - (d0 + d2));
q.v.y = 0.5 * s;
s = 0.5 / s;
q.v.z = (m.at(2,1) + m.at(1,2)) / s;
q.w = (m.at(2,0) - m.at(0,2)) / s;
q.v.x = (m.at(1,0) + m.at(0,1)) / s;
}
else
{
s = (T)sqrt(1 + d2 - (d0 + d1));
q.v.z = 0.5 * s;
s = 0.5 / s;
q.w = (m.at(0,1) - m.at(1,0)) / s;
q.v.x = (m.at(2,0) + m.at(0,2)) / s;
q.v.y = (m.at(2,1) + m.at(1,2)) / s;
}
}

return q;
}

/**
* Creates quaternion from rotation matrix.
*
* @param m Rotation matrix used to compute quaternion.
* @return Quaternion representing rotation of matrix m.
*/

static Quaternion<T> fromMatrix(const Matrix3<T>& m)
{
Quaternion<T> q;

T tr,s;
tr = m.at(0,0) + m.at(1,1) + m.at(2,2);
// if trace is greater or equal then zero
if (tr >= epsilon)
{
s = (T)sqrt(tr + 1);
q.w = 0.5 * s;
s = 0.5 / s;

q.v.x = (m.at(1,2) - m.at(2,1)) * s;
q.v.y = (m.at(2,0) - m.at(0,2)) * s;
q.v.z = (m.at(0,1) - m.at(1,0)) * s;
}
else
{
T d0 = m.at(0,0);
T d1 = m.at(1,1);
T d2 = m.at(2,2);

// find greates diagonal number
char bigIdx = (d0 > d1) ? ((d0 > d2)? 0 : 2) : ((d1 > d2) ? 1 : 2);

if (bigIdx == 0)
{
s = (T)sqrt((d0 - (d1 + d2)) + 1);

q.v.x = 0.5 * s;
s = 0.5 / s;
q.v.y = (m.at(1,0) + m.at(0,1)) * s;
q.v.z = (m.at(2,0) + m.at(0,2)) * s;
q.w = (m.at(1,2) - m.at(2,1)) * s;
}
else if (bigIdx == 1)
{
s = (T)sqrt(1 + d1 - (d0 + d2));
q.v.y = 0.5 * s;
s = 0.5 / s;
q.v.z = (m.at(2,1) + m.at(1,2)) / s;
q.w = (m.at(2,0) - m.at(0,2)) / s;
q.v.x = (m.at(1,0) + m.at(0,1)) / s;
}
else
{
s = (T)sqrt(1 + d2 - (d0 + d1));
q.v.z = 0.5 * s;
s = 0.5 / s;
q.w = (m.at(0,1) - m.at(1,0)) / s;
q.v.x = (m.at(2,0) + m.at(0,2)) / s;
q.v.y = (m.at(2,1) + m.at(1,2)) / s;
}
}

return q;
}

/**
* Computes spherical interpolation between quaternions (this, q2)
* using coeficient of interpolation r (in [0, 1]).
*
* @param r The ratio of interpolation form this (r = 0) to q2 (r = 1).
* @param q2 Second quaternion for interpolation.
* @return Result of interpolation.
*/

Quaternion<T> slerp(T r, const Quaternion<T>& q2) const
{
Quaternion<T> ret;
T cosTheta = w * q2.w + v.x * q2.v.x + v.y *q2.v.y + v.z * q2.v.z;
T theta = (T) acos(cosTheta);
if (fabs(theta) < epsilon)
{
ret = *this;
}
else
{
T sinTheta = (T)sqrt(1.0 - cosTheta * cosTheta);
if (fabs(sinTheta) < epsilon)
{
ret.w = 0.5 * w + 0.5 * q2.w;
ret.v = v.lerp(0.5, q2.v);
}
else
{
T rA = (T)sin((1.0 - r) * theta) / sinTheta;
T rB = (T)sin(r * theta) / sinTheta;

ret.w = w * rA + q2.w * rB;
ret.v.x = v.x * rA + q2.v.x * rB;
ret.v.y = v.y * rA + q2.v.y * rB;
ret.v.z = v.z * rA + q2.v.z * rB;
}
}
return ret;
}
};

typedef Quaternion<float> Quatf;
// typedef Quaternion<double> Quatd;

// Romulo - Create some alias
typedef Vector2f vector2_t;
typedef Vector2i vector2i_t;
typedef Vector3f vector3_t;
typedef Vector4f vector4_t;
typedef Vector4f color_t;
typedef Quatf quaternion_t ;
typedef Matrix3f matrix3_t;
typedef Matrix4f matrix4_t;
typedef Planef plane_t;
typedef Spheref sphere_t;
typedef BoundingBoxf boundingBox_t;
// End Romulo

#ifdef WIN32
#pragma warning(default: 4244)
#endif


#ifdef VMATH_NAMESPACE
}
#endif

#endif // __vmath_Header_File__



vmath.cpp (the crashing part is commented out)

/*
* vmath, set of classes for computer graphics mathemtics.
* Copyright (c) 2005-2006, Jan Bartipan < barzto at gmail dot com >
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* - Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* - Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* - Neither the names of its contributors may be used to endorse or
* promote products derived from this software without specific
* prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
* WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/


#include "engine.h"
#include "logger.h"
#include "rendersystem.h"
#include "vmath.h"

#ifdef VMATH_NAMESPACE
namespace VMATH_NAMESPACE
{
#endif

template class Vector2<float>;
// template class Vector2<double>;
template class Vector3<float>;
// template class Vector3<double>;
template class Vector4<float>;
// template class Vector4<double>;
template class Matrix3<float>;
// template class Matrix3<double>;
template class Matrix4<float>;
// template class Matrix4<double>;
template class Plane<float>;
// template class Plane<double>;
template class Quaternion<float>;
// template class Quaternion<double>;
template class BoundingBox<float>;
// template class BoundingBox<double>;
template class Sphere<float>;
// template class Sphere<double>;

/*
template <>
Matrix4<float> Matrix4<float>::operator*(const Matrix4<float>& rhs) const
{
ALIGN(16) matrix4_t w;
__m128 a_line, b_line, r_line;

for (int i = 0; i < 16; i += 4)
{
// unroll the first step of the loop to avoid having to initialize r_line to zero
a_line = _mm_load_ps((const float*)data); // a_line = vec4(column(a, 0))
b_line = _mm_set1_ps(rhs.data); // b_line = vec4(b[0])
r_line = _mm_mul_ps(a_line, b_line); // r_line = a_line * b_line

for (int j = 1; j < 4; j++)
{
// a_line = vec4(column(a, j))
a_line = _mm_load_ps((const float*)&data[j * 4]);

// b_line = vec4(b[j])
b_line = _mm_set1_ps(rhs.data[i + j]);

// r_line += a_line * b_line
r_line = _mm_add_ps(_mm_mul_ps(a_line, b_line), r_line);
}

_mm_store_ps((float*)&w.data, r_line); // r = r_line
}

return w;
}
*/


template <typename T>
Matrix4<T> Matrix4<T>::operator*(const Matrix4<T>& rhs) const
{
static Matrix4<T> w;
for (int i = 0; i < 4; i++)
{
for (int j = 0; j < 4; j++)
{
T n = 0;
for (int k = 0; k < 4; k++) n += rhs.at(i, k) * at(k, j);
w.at(i, j) = n;
}
}

return w;
}

template <typename T>
void BoundingBox<T>::getVertices()
{
logTrace();

m_vertices[0] = m_mins;
m_vertices[1] = Vector3<T>(m_maxs.x, m_mins.y, m_mins.z);
m_vertices[2] = Vector3<T>(m_maxs.x, m_mins.y, m_maxs.z);
m_vertices[3] = Vector3<T>(m_mins.x, m_mins.y, m_maxs.z);
m_vertices[4] = Vector3<T>(m_mins.x, m_maxs.y, m_mins.z);
m_vertices[5] = Vector3<T>(m_maxs.x, m_maxs.y, m_mins.z);
m_vertices[6] = m_maxs;
m_vertices[7] = Vector3<T>(m_mins.x, m_maxs.y, m_maxs.z);
}

template <typename T>
void BoundingBox<T>::drawBounds()
{
logTrace();

if (!m_visible)
return;

cg::renderSystem_t* renderSystem = cg::engine_t::m_renderSystem;
BOOST_ASSERT(renderSystem != 0);

// Remove material.
renderSystem->unbindMaterial();

// Update Vertices.
getVertices();

// Generate indices.
index_t indices[24] = {
0, 1,
1, 2,
2, 3,
3, 0,

4, 5,
5, 6,
6, 7,
7, 4,

0, 4,
1, 5,
2, 6,
3, 7
};

renderSystem->draw((float*)&m_vertices[0].x, indices, 24,
cg::renderSystem_t::VERTEXMODE_LINE);
}

template <typename T>
void BoundingBox<T>::drawFilled()
{
logTrace();

cg::renderSystem_t* renderSystem = cg::engine_t::m_renderSystem;
BOOST_ASSERT(renderSystem != 0);

// Update Vertices.
getVertices();

static index_t fIndices[36] = {
0, 1, 2,
0, 2, 3,

0, 1, 5,
0, 5, 4,

1, 5, 6,
1, 6, 2,

0, 4, 7,
0, 7, 3,

3, 2, 6,
3, 6, 7,

4, 5, 6,
4, 6, 7
};

renderSystem->draw((float*)&m_vertices[0].x, fIndices, 36);
}

template <typename T>
void BoundingBox<T>::draw(bool filled)
{
logTrace();

if (filled)
drawFilled();
else
drawBounds();
}

template <typename T>
bool BoundingBox<T>::update(const Vector3<T>& mins, const Vector3<T>& maxs)
{
bool changed = false;

if (m_mins.x > mins.x)
{
m_mins.x = mins.x;
changed = true;
}

if (m_mins.y > mins.y)
{
m_mins.y = mins.y;
changed = true;
}

if (m_mins.z > mins.z)
{
m_mins.z = mins.z;
changed = true;
}

if (m_maxs.x < maxs.x)
{
m_maxs.x = maxs.x;
changed = true;
}

if (m_maxs.y < maxs.y)
{
m_maxs.y = maxs.y;
changed = true;
}

if (m_maxs.z < maxs.z)
{
m_maxs.z = maxs.z;
changed = true;
}

return changed;
}

template <typename T>
void Sphere<T>::draw() const
{
logTrace();

cg::renderSystem_t* renderSystem = cg::engine_t::m_renderSystem;
BOOST_ASSERT(renderSystem != 0);

std::vector< index_t > indices;
std::vector< Vector3<T> > vertices;

unsigned int count = 0;
for (unsigned int i = 0; i < 360; i += 10)
{
vertices.push_back(Vector3<T>(sin(DEG2RAD(i)) * m_radius, cos(DEG2RAD(i)) * m_radius, 0));
indices.push_back(count);

if ((i + 10) == 360)
indices.push_back(0);
else
indices.push_back(count + 1);

count++;
}

unsigned int initialCount = count;
for (unsigned int i = 0; i < 360; i += 10)
{
vertices.push_back(Vector3<T>(0, sin(DEG2RAD(i)) * m_radius, cos(DEG2RAD(i)) * m_radius));
indices.push_back(count);

if ((i + 10) == 360)
indices.push_back(initialCount);
else
indices.push_back(count + 1);

count++;
}

renderSystem->draw((float*)&vertices[0].x, &indices[0], indices.size(),
cg::renderSystem_t::VERTEXMODE_LINE);
}

#ifdef VMATH_NAMESPACE
}
#endif



And this is the optimization algorithm/reference:
http://fhtr.blogspot.com/2010/02/4x4-float-matrix-multiplication-using.html

Thanks for any help, i appreciate.

Share this post


Link to post
Share on other sites
Advertisement

1) That code is sub-optimal is so many regards, that vectorising the matrix multiply is going to be of next to no use. For a start, the classes are all DLL exported - which means none of those routines will ever be inlined. Secondly, even if you do start optimising the odd func here and there, jumping between the FPU and SSE units will kill your performance very quickly. Finally, that is possibly the single worst SSE tutorial I've ever seen. Ignore everything that guys written, he hasn't got a clue! Using _mm_set_ps() or equivalent is a sure fire way to force data to switch between the FPU and SSE. Some (but no where near all) compilers handle _mm_set_ps efficiently, others (eg VC++) do not! (which results in some nasty FPU<->SSE register transfers)

2) Alignment - It's close to non-existant in that code. Where are the overloaded new/delete operators? Where are the asserts the validate that you absolutely have not got the alignment wrong? Why are you doing this: ALIGN(16) matrix4_t w; ? Surely it would be easier to specify the alignment on a per type basis rather than a per instance basis? If you insist on using SSE, then I'd strongly suggest you should be making sure it's bullet-proof with the required amount of asserts..... something like this is a *good* idea:


void vec3Add(__m128& result, const __m128& a, const __m128& b)
{
ASSERT_IF_POINTER_IS_NOT_16_BYTE_ALIGNED(&a);
ASSERT_IF_POINTER_IS_NOT_16_BYTE_ALIGNED(&b);
ASSERT_IF_POINTER_IS_NOT_16_BYTE_ALIGNED(&result);
result = _mm_add_ps(a,b);
}


3) Why are you not using a debugger? All you have to do is verify that the (hex) memory addresses end in zero. If they do, it's aligned. If it's non-zero, it's not aligned. A quick cursory glance in a debugger would tell you whether it's aligned properly - and then it's up to you to identify why it doesn't have the correct alignment.

FWIW, the changes you are making to the library are a touch pointless imho. You may get small improvements in some areas, but to be honest you'd be better off starting from scratch and writing the code using Hybrid-SOA structures instead.

Share this post


Link to post
Share on other sites
1) I dont know much about SSE, thats why i wanted to try this one. Since i couldnt test performance improvements i couldnt check the gains.

2) Those guys NEVER get called by new/delete, thats why i didnt overload the operators. Also i didnt know if i needed to align only class members or the class themselves.

3) I already checked (not present on this code) for the addresses alignment and they are fine.

4) The changes i made was to include planes, spheres, and stuff like that, im not changing the core library except on fewer cases.

Share this post


Link to post
Share on other sites
Quote:
Original post by nightz
1) I dont know much about SSE, thats why i wanted to try this one. Since i couldnt test performance improvements i couldnt check the gains.

Profile first, optimize later, and inspecting the generated asm is a good idea (to ensure you don't see and fld, fadd, fsub instructions in there).

Quote:
Original post by nightz
2) Those guys NEVER get called by new/delete, thats why i didnt overload the operators.


Hmmmmm..... Possibly the least believable statement ever written. For example:


class ALIGN(16) Bone
{
public:
Matrix<float> tm;
};

// tm will not be aligned.
Bone* bone = new Bone();






Even though the Bone has been declared as aligned, bone->tm will not be aligned since the memory returned from new is not aligned. That is why you need to assert on every single argument & this pointer. It's far too easy to cock it up to leave anything to chance.

Quote:

Also i didnt know if i needed to align only class members or the class themselves.


To align tm relative to the start of Bone:


class Bone
{
public:
ALIGN(16) Matrix<float> tm;
};





To align the whole Bone:


class ALIGN(16) Bone
{
public:
Matrix<float> tm;
};





As I said before, I see zero evidence of any alignment the Matrix class....

Quote:
Original post by nightz
3) I already checked (not present on this code) for the addresses alignment and they are fine.


Are you developing on a pentium 2? (or god forbid, something worse?)

Quote:

4) The changes i made was to include planes, spheres, and stuff like that, im not changing the core library except on fewer cases.


Write Hybrid SOA plane / sphere classes in that case. Don't bother re-working the classes there.


class PlaneSOA
{
__m128 a;
__m128 b;
__m128 c;
__m128 d;
};



Share this post


Link to post
Share on other sites
Thanks for your time to helping me out and this cleared all my doubts:

Quote:
Original post by RobTheBloke
To align tm relative to the start of Bone:

class Bone
{
public:
ALIGN(16) Matrix<float> tm;
};

To align the whole Bone:

class ALIGN(16) Bone
{
public:
Matrix<float> tm;
};



The version i posted unfortunately got the typedef removed on last time (because i was cleaning a bit to recompile and test again), one of my versions had typedef ALIGN(16) Matrix4<float> Matrix4f which should do the alignment issues, correct? I still got weird errors/crashes. I will make sure the pointers are aligned once again. I will leave feedback here later about what hapenned. Thanks again.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!