SIMD still need help

General and Gameplay Programming

Started by codehunter13 May 11, 2005 07:42 PM

30 comments, last by codehunter13 18 years, 11 months ago

codehunter13

122

Author

May 11, 2005 07:42 PM

Hello all, I've started to optimise my matrix class with SIMD, but I've run into a problem: I get an access violation when accessing _L1. It is declared like this:


__declspec( align( 16 ) ) union 
{
    struct 
    {
	__m128 _L1, _L2, _L3, _L4;
    };
    struct 
    {
	float	_11, _12, _13, _14;
	float	_21, _22, _23, _24;
	float	_31, _32, _33, _34;
	float	_41, _42, _43, _44;
    };
};

I've noticed that the _L1 variable starts at 0x011cead8, which is not aligned to 16 bytes. I thought the __declspec( align( 16 ) ) would take care of that, but no... I'm using Visual Studio .NET 2003. Could anyone help me? Thanks in advance. [Edited by - codehunter13 on May 12, 2005 5:19:49 AM]
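
To make the alignment check above repeatable, a small helper can test an address directly. This is only a sketch: the names address_is_aligned and pointer_is_aligned are made up for illustration and are not part of the original class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when `address` is a multiple of `alignment`.
// `alignment` must be a non-zero power of two (hypothetical helper).
inline bool address_is_aligned(std::uintptr_t address, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (address & (alignment - 1)) == 0;
}

// Same check for a live pointer, e.g. pointer_is_aligned(&m._L1, 16).
inline bool pointer_is_aligned(const void* p, std::size_t alignment) {
    return address_is_aligned(reinterpret_cast<std::uintptr_t>(p), alignment);
}
```

For the address reported above, address_is_aligned(0x011cead8, 16) is false while address_is_aligned(0x011cead8, 8) is true, so the object sits on an 8-byte boundary rather than a 16-byte one.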

Raduprv

997

May 11, 2005 07:47 PM

Are you using some type of CPU that complains if you try to access a 32-bit variable that is not aligned properly?
On x86 processors alignment doesn't really matter (except for speed purposes).
Post some code.

Eternal Lands (free, Open Source MMORPG)

codehunter13

122

Author

May 11, 2005 07:50 PM

This is the function that fails:

void cMatrix::rotationMatrix(const float radsX, const float radsY, const float radsZ)
{
    cMatrix x, y, z, res;
    x.rotateXMatrix(radsX);
    y.rotateYMatrix(radsY);
    z.rotateZMatrix(radsZ);
    res = z*x*y;
    _L1 = res._L1;
    _L2 = res._L2;
    _L3 = res._L3;
    _L4 = res._L4;
}

I've got an AMD 2000 XP processor.

Raduprv

997

May 11, 2005 07:52 PM

Post the full data declaration too.

Eternal Lands (free, Open Source MMORPG)

codehunter13

122

Author

May 11, 2005 07:55 PM

namespace ML
{
    class cMatrix;
    class cVector;
    class cVector3;

    class cMatrix {
    public:
        __declspec( align( 16 ) ) union {
            struct {
                __m128 _L1, _L2, _L3, _L4;
            };
            struct {
                float	_11, _12, _13, _14;
                float	_21, _22, _23, _24;
                float	_31, _32, _33, _34;
                float	_41, _42, _43, _44;
            };
        };

        // Constructors
        cMatrix() {}
        cMatrix(const cMatrix &m) : _L1(m._L1), _L2(m._L2), _L3(m._L3), _L4(m._L4) {}
        cMatrix(float _11, float _12, float _13, float _14,
                float _21, float _22, float _23, float _24,
                float _31, float _32, float _33, float _34,
                float _41, float _42, float _43, float _44);

        float& operator() (int i, int j) {
            assert((0<=i) && (i<=3) && (0<=j) && (j<=3));
            return *(((float *)&_11) + (i<<2)+j);
        }
        F32vec4& operator() (int i) {
            assert((0<=i) && (i<=3));
            return *(((F32vec4 *)&_11) + i);
        }
        F32vec4& operator[] (int i) {
            assert((0<=i) && (i<=3));
            return *(((F32vec4 *)&_11) + i);
        }
        F32vec4& operator[] (int i) const {
            assert((0<=i) && (i<=3));
            return *(((F32vec4 *)&_11) + i);
        }

        cMatrix& operator= (const cMatrix &a) {
            _L1 = a._L1; _L2 = a._L2; _L3 = a._L3; _L4 = a._L4;
            return *this;
        }

        friend cMatrix operator * (const cMatrix&, const cMatrix&);
        friend cMatrix operator + (const cMatrix&, const cMatrix&);
        friend cMatrix operator - (const cMatrix&, const cMatrix&);
        friend cMatrix operator + (const cMatrix&);
        friend cMatrix operator - (const cMatrix&);
        friend cMatrix operator * (const cMatrix&, const float);
        friend cMatrix operator * (const float, const cMatrix&);

        cMatrix & operator *= (const cMatrix &);
        cMatrix & operator *= (const float);
        cMatrix & operator += (const cMatrix &);
        cMatrix & operator -= (const cMatrix &);

        // Other Constructors:
        void zeroMatrix();
        void identityMatrix();
        void translateMatrix(const float dx, const float dy, const float dz);
        void scaleMatrix(const float a, const float b, const float c);
        void scaleMatrix(const float a);
        void rotationMatrix(const float radsX, const float radsY, const float radsZ);
        void rotateXMatrix(const float rads);
        void rotateYMatrix(const float rads);
        void rotateZMatrix(const float rads);
    };
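
One possible cause worth ruling out (an assumption, nothing in this thread confirms it): the alignment request only covers objects the compiler places itself, such as locals and statics. If the matrix, or an object containing it, is created with plain new, a 32-bit runtime typically hands back memory aligned to only 8 bytes, which would match an address ending in 8. A minimal sketch of a class-level operator new that always returns 16-byte-aligned storage follows. Matrix4, aligned_alloc16 and aligned_free16 are hypothetical stand-ins, and the sketch uses C++11 alignas plus malloc so it stays portable; on MSVC, _aligned_malloc and _aligned_free would be the more usual route where available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical helper: over-allocate, round up to a 16-byte boundary, and
// store the pointer malloc returned in the slot just before the aligned block.
inline void* aligned_alloc16(std::size_t size) {
    const std::size_t alignment = 16;
    void* raw = std::malloc(size + alignment + sizeof(void*));
    if (!raw) throw std::bad_alloc();
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t aligned =
        (start + alignment - 1) & ~static_cast<std::uintptr_t>(alignment - 1);
    reinterpret_cast<void**>(aligned)[-1] = raw;  // remember what to free
    return reinterpret_cast<void*>(aligned);
}

// Releases a block obtained from aligned_alloc16 (null is allowed).
inline void aligned_free16(void* p) {
    if (p) std::free(reinterpret_cast<void**>(p)[-1]);
}

// Hypothetical stand-in for cMatrix: sixteen floats, 16-byte aligned,
// with heap allocation routed through the helpers above.
struct alignas(16) Matrix4 {
    float m[16];

    static void* operator new(std::size_t size) { return aligned_alloc16(size); }
    static void operator delete(void* p) { aligned_free16(p); }
};
```

With this in place, new Matrix4 yields a 16-byte-aligned object regardless of what the default allocator guarantees. Arrays would need matching operator new[] and operator delete[] overloads, and a class that holds such a matrix as a member needs the same treatment when it is heap-allocated itself.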

Raduprv

997

May 11, 2005 08:05 PM

Hmm...
I don't really understand this code:

		res = z*x*y;
		_L1 = res._L1;

I mean, I don't really know C++, but in C that doesn't make much sense. Unless this is a feature of C++ I am not familiar with, what exactly are you trying to accomplish in those 2 lines of code?

Eternal Lands (free, Open Source MMORPG)

codehunter13

122

Author

May 11, 2005 08:14 PM

res = z*x*y; => calculate the rotation matrix and store it in res.

_L1 = res._L1; => copy the first four floats of the res matrix into the first four floats of the current object.

Raduprv

997

May 11, 2005 08:17 PM

I still don't understand it.
Comment out the line res = z*x*y; (the one that calculates the rotation matrix and stores it in res).
See if it still crashes.

Eternal Lands (free, Open Source MMORPG)

Evil Steve

2,021

May 11, 2005 08:20 PM

Quote:Original post by Raduprv
Hmm...
I don't really understand this code:
		res = z*x*y;
		_L1 = res._L1;
I mean, I don't really know C++, but in C that doesn't make much sense. Unless this is a feature of C++ I am not familiar with, what exactly are you trying to accomplish in those 2 lines of code?

That multiplies the matrices x, y and z, and then it copies line 1 of the matrix into this (and the following 3 lines copy the rest of the result into this).
He could also have done:

cMatrix x, y, z, res;
x.rotateXMatrix(radsX);
y.rotateYMatrix(radsY);
z.rotateZMatrix(radsZ);
*this = z*x*y;

Also, alignment does matter for SIMD instructions, since the CPU can't perform aligned SIMD loads and stores on unaligned data; it raises an exception instead.

codehunter13: Have you tried that __declspec with other variable types? It seems to work for me.
EDIT: Apparently the __m128 data type is supposed to be aligned to 16 bytes anyway...
Edit 2: Clicky
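
To illustrate the alignment point: SSE has both aligned and unaligned load/store forms. The aligned ones (_mm_load_ps, _mm_store_ps) require a 16-byte-aligned address, while _mm_loadu_ps and _mm_storeu_ps accept any address, usually at some speed cost on older CPUs. A rough sketch follows, useful mainly as a diagnostic while tracking down where the misaligned object comes from. The function name add4_unaligned and the EXAMPLE_HAVE_SSE macro are made up for this example, and the scalar branch only exists so the snippet also builds on targets without SSE.

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define EXAMPLE_HAVE_SSE 1
#else
#define EXAMPLE_HAVE_SSE 0
#endif

// Adds four floats from a and b into out.
// Makes no alignment assumption about any of the three pointers.
inline void add4_unaligned(const float* a, const float* b, float* out) {
#if EXAMPLE_HAVE_SSE
    __m128 va = _mm_loadu_ps(a);             // unaligned load: any address works
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));  // unaligned store
#else
    for (std::size_t i = 0; i < 4; ++i)      // scalar fallback without SSE
        out[i] = a[i] + b[i];
#endif
}
```

If code like this runs cleanly where the aligned version crashes, the data really is misaligned and the remaining work is finding out why the object ended up at that address.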

codehunter13

122

Author

May 11, 2005 08:21 PM

Already done that, and it always crashes when accessing _L1.

This topic is closed to new replies.
