__declspec( align( 16 ) ) union
{
    struct
    {
        __m128 _L1, _L2, _L3, _L4;
    };
    struct
    {
        float _11, _12, _13, _14;
        float _21, _22, _23, _24;
        float _31, _32, _33, _34;
        float _41, _42, _43, _44;
    };
};
SIMD still need help
Hello all,
I've started to optimise my matrix class with SIMD, but I've run into a problem.
I get an access violation when accessing _L1, which is declared in the union shown above.
I've noticed that the _L1 variable starts at 0x011cead8, which is not aligned to 16 bytes. I thought the __declspec( align( 16 ) ) would take care of that, but it doesn't.
I'm using Visual Studio .NET 2003.
Could anyone help me?
Thanks in advance.
[Edited by - codehunter13 on May 12, 2005 5:19:49 AM]
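A quick way to confirm the misalignment reported in the post above is to test the address against 16. The helpers below are a minimal sketch; `isAlignedAddress` and `isAlignedPointer` are hypothetical names, not part of the original class. The quoted address 0x011cead8 ends in the hex digit 8, so it is a multiple of 8 but not of 16.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if `address` is a multiple of `alignment`.
// `alignment` must be a non-zero power of two.
inline bool isAlignedAddress(std::uintptr_t address, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (address & (alignment - 1)) == 0;
}

// Hypothetical helper: the same check for an object pointer.
inline bool isAlignedPointer(const void* p, std::size_t alignment)
{
    return isAlignedAddress(reinterpret_cast<std::uintptr_t>(p), alignment);
}
```

Calling `isAlignedPointer(&_L1, 16)` inside an `assert` at the top of the failing function would catch the bad address before the SSE access faults.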
Are you using some type of CPU that complains if you try to access a 32-bit variable that is not aligned properly?
On x86 processors alignment doesn't really matter (except for speed purposes).
Post some code.
This is the function that fails:
void cMatrix::rotationMatrix(const float radsX, const float radsY, const float radsZ)
{
    cMatrix x, y, z, res;
    x.rotateXMatrix(radsX);
    y.rotateYMatrix(radsY);
    z.rotateZMatrix(radsZ);
    res = z*x*y;
    _L1 = res._L1;
    _L2 = res._L2;
    _L3 = res._L3;
    _L4 = res._L4;
}
I've got an AMD Athlon XP 2000+ processor.
namespace ML{
class cMatrix;
class cVector;
class cVector3;

class cMatrix
{
public:
    __declspec( align( 16 ) ) union
    {
        struct
        {
            __m128 _L1, _L2, _L3, _L4;
        };
        struct
        {
            float _11, _12, _13, _14;
            float _21, _22, _23, _24;
            float _31, _32, _33, _34;
            float _41, _42, _43, _44;
        };
    };

    // Constructors
    cMatrix() {}
    cMatrix(const cMatrix &m) : _L1(m._L1), _L2(m._L2), _L3(m._L3), _L4(m._L4) {}
    cMatrix(float _11, float _12, float _13, float _14,
            float _21, float _22, float _23, float _24,
            float _31, float _32, float _33, float _34,
            float _41, float _42, float _43, float _44);

    float& operator() (int i, int j)
    {
        assert((0<=i) && (i<=3) && (0<=j) && (j<=3));
        return *(((float *)&_11) + (i<<2)+j);
    }
    F32vec4& operator() (int i)
    {
        assert((0<=i) && (i<=3));
        return *(((F32vec4 *)&_11) + i);
    }
    F32vec4& operator[] (int i)
    {
        assert((0<=i) && (i<=3));
        return *(((F32vec4 *)&_11) + i);
    }
    F32vec4& operator[] (int i) const
    {
        assert((0<=i) && (i<=3));
        return *(((F32vec4 *)&_11) + i);
    }
    cMatrix& operator= (const cMatrix &a)
    {
        _L1 = a._L1;
        _L2 = a._L2;
        _L3 = a._L3;
        _L4 = a._L4;
        return *this;
    }

    friend cMatrix operator * (const cMatrix&, const cMatrix&);
    friend cMatrix operator + (const cMatrix&, const cMatrix&);
    friend cMatrix operator - (const cMatrix&, const cMatrix&);
    friend cMatrix operator + (const cMatrix&);
    friend cMatrix operator - (const cMatrix&);
    friend cMatrix operator * (const cMatrix&, const float);
    friend cMatrix operator * (const float, const cMatrix&);

    cMatrix & operator *= (const cMatrix &);
    cMatrix & operator *= (const float);
    cMatrix & operator += (const cMatrix &);
    cMatrix & operator -= (const cMatrix &);

    // Other Constructors:
    void zeroMatrix();
    void identityMatrix();
    void translateMatrix(const float dx, const float dy, const float dz);
    void scaleMatrix(const float a, const float b, const float c);
    void scaleMatrix(const float a);
    void rotationMatrix(const float radsX, const float radsY, const float radsZ);
    void rotateXMatrix(const float rads);
    void rotateYMatrix(const float rads);
    void rotateZMatrix(const float rads);
};
Hmm...
I don't really understand this code:
res = z*x*y;
_L1 = res._L1;
I mean, I don't really know C++, but in C that doesn't make much sense. Unless this is a feature of C++ I am not familiar with, what exactly are you trying to accomplish in those two lines of code?
res = z*x*y; => calculate the rotation matrix and store it in res.
_L1 = res._L1; => copy the first four floats of the res matrix into the first four floats of the current object.
I still don't understand it.
Comment out the line res = z*x*y; (the one that calculates the rotation matrix and stores it in res).
See if it still crashes.
Quote: Original post by Raduprv
> Hmm...
> I don't really understand this code:
> res = z*x*y;
> _L1 = res._L1;
> I mean, I don't really know C++, but in C that doesn't make much sense. Unless this is a feature of C++ I am not familiar with, what exactly are you trying to accomplish in those two lines of code?

That multiplies the matrices x, y and z, and then it copies line 1 of the result into this (and the following three lines copy the rest of the result into this).
He could also have done:
cMatrix x, y, z, res;
x.rotateXMatrix(radsX);
y.rotateYMatrix(radsY);
z.rotateZMatrix(radsZ);
*this = z*x*y;
Also, alignment does matter for SIMD instructions: the aligned SSE loads and stores the compiler generates for __m128 (such as movaps) fault on data that is not 16-byte aligned.
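To illustrate the point about alignment: SSE offers both aligned and unaligned memory access. `_mm_load_ps` requires a 16-byte-aligned address and can raise an access violation otherwise, while `_mm_loadu_ps` accepts any address, typically at some cost in speed. The sketch below assumes an x86 compiler that provides `<xmmintrin.h>`; `addFourFloats` is a hypothetical name and not part of the class in this thread.

```cpp
#include <xmmintrin.h>

// Hypothetical helper: adds four floats from `a` to four floats from `b`
// and writes the four sums to `out`.
// It uses unaligned loads and stores, so the pointers need no particular
// alignment. Swapping in _mm_load_ps/_mm_store_ps would require every
// pointer to be 16-byte aligned.
inline void addFourFloats(const float* a, const float* b, float* out)
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
}
```

This does not fix the union above, where the compiler chooses the aligned instructions for `__m128` members itself, but it can help when narrowing down whether a crash is caused by alignment.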
codehunter13: Have you tried that __declspec with other variable types? It seems to work for me.
EDIT: Apparently the __m128 data type is supposed to be aligned to 16 bytes anyway...
Edit 2: Clicky
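Following up on the note that __m128 is supposed to be 16-byte aligned anyway: on compilers with C++11 support (newer than the Visual Studio .NET 2003 used in this thread), the requirement can be stated on the type and checked at compile time. This is only a sketch; `AlignedMatrix` is a hypothetical stand-in for cMatrix, and it is not a diagnosis of the crash discussed here.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the matrix class: 16 floats, with the whole
// type required to start on a 16-byte boundary.
struct alignas(16) AlignedMatrix
{
    float m[4][4];
};

// Compile-time checks: these fail the build instead of crashing at run time.
static_assert(alignof(AlignedMatrix) == 16, "AlignedMatrix must be 16-byte aligned");
static_assert(sizeof(AlignedMatrix) == 64, "AlignedMatrix must hold exactly 16 floats");
```

Objects with automatic or static storage honor the requested alignment. Objects created with `new` are a separate matter: before C++17 the allocator is not required to respect alignments stricter than its default, so heap-allocated matrices may need an aligned allocator.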