SSE alignment

11 comments, last by mattnewport 16 years ago
I'm trying to write a vector class, both to learn the maths involved and to learn SSE (for the heck of it). So I created a class. However, when I use it in my raytracer (which creates arrays of vertices and stores them in the __m128 datatype), it sometimes segfaults. Not always, but I assume it is because of alignment. I have seemingly tried everything. Please help me understand! (I'm on Linux using gcc, which is probably not the best choice, but I can move to Windows to test things if you need me to.) The class:

struct Vector3 {
	union {
#ifdef USE_SSE
		__m128 vec;
#endif
		struct {
			float x, y, z, w;
		};
		struct {
			float r, g, b, a;
		};
		float xyzw [4];
	};
	inline Vector3() {}
	inline Vector3 (float newx, float newy, float newz, float neww=0.0f) :
		x(newx), y(newy), z(newz), w(neww) {}

	//inline Vector3 (const Vector3& vector3) : vec (vector3.vec) {}

	inline float magnitude ();
	inline float magnitudeSquared ();
	inline void normalize ();	
	inline void clamp(const Scalar& min, const Scalar& max);
#ifdef USE_SSE	

	 void* operator new[](size_t allocSize)
	{
		void* p = _mm_malloc(allocSize, 16);
		return p;
	}

	void operator delete[](void *p)
	{
		_mm_free(p);
	}

	void* operator new (size_t allocSize)
	{
		void* p = _mm_malloc(allocSize, 16);
		return p;
	}

	void operator delete(void* p)
	{
		_mm_free(p);
	}

#endif

	void print ();
};

If I remove the allocation operators, there is no noticeable difference in segfault frequency. I am using the xmmintrin.h header. Is the way I'm using anonymous structs/unions wrong? How do other people do it? Thanks.
You should make sure that the data types you use with SSE are 16-byte aligned. Using __declspec(align(16)) should fix that issue.
---------------------------Visit my Blog at http://robwalkerdme.blogspot.com
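For readers on compilers other than Visual C++: GCC spells the same request as __attribute__((aligned(16))), and C++11 or later has the portable alignas(16). The following is only a minimal sketch, assuming a C++11 compiler; AlignedVec4 and checkLocalInstance are hypothetical names, not part of the class posted above. Note that this covers static and local objects; heap allocations may still need an aligned allocator, such as the _mm_malloc overloads in the original post.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 4-float vector; alignas(16) requests 16-byte alignment
// for every static and automatic (stack) instance of the type.
struct alignas(16) AlignedVec4 {
    float x, y, z, w;
};

// Compile-time check: the build fails if the request was not honoured.
static_assert(alignof(AlignedVec4) == 16, "AlignedVec4 must be 16-byte aligned");

// Runtime check on a local instance: its address must be a multiple of 16.
inline void checkLocalInstance() {
    AlignedVec4 v = {0.0f, 0.0f, 0.0f, 0.0f};
    assert(reinterpret_cast<std::uintptr_t>(&v) % 16 == 0);
}
```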
I'm using gcc, that's a VC++ specific option.


More specifically, the __m128 type is already aligned to 16-byte boundaries using __attribute__, as specified in the xmmintrin.h file:

typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));

Also, this issue didn't occur as often before, but probably because my code was simpler.

I'm almost certain that it isn't aligned, yet I have no idea how to check.

Is there a way to check in gdb, or something?

I can post the whole class if you want.
Compile with -g or -ggdb and in GDB break in the scope of the variable you want to check and do "print &vec", where vec is your actual variable's name. You can also use GDB to catch the segfault and see where it happened. You might want to start there. If you don't know how to do this someone can walk you through the basics.

I have stepped through with GDB, and used commands like "where" and "break", "next" to debug my program.

However, I can't seem to actually access the variables. "print" doesn't do anything: even when I set a breakpoint before the one area in the header file and try to print the values I passed to the function, it gives me nothing. Also, it doesn't seem to step through the file correctly.

I have used GDB, but I was just wondering if there is a specific command I can use to check alignment. __alignof__ apparently just returns what the alignment should be, so I can't use that.

Thanks!
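On checking alignment: an address is 16-byte aligned exactly when it is a multiple of 16, so an address printed in hex (by GDB or printf) that ends in the digit 0 is aligned. The same test can be made in code. This is a minimal sketch; isAligned is a hypothetical helper name, not something from xmmintrin.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true when the address p is a multiple of `alignment`.
// `alignment` must be a non-zero power of two (16 for __m128).
inline bool isAligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    // For a power of two, the low bits of an aligned address are all zero.
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

Calling something like isAligned(&someVector, 16) just before the first intrinsic that touches the data should show whether a given instance is the misaligned one.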
What does your compile command look like? Either you're not generating appropriate debug info or you have stack corruption.
Uh, this is the command to compile a .cpp file into an object file:

g++ -Wall -g -O -msse -ffast-math -funroll-loops -mfpmath=sse -c src/Vector3.cpp -o bin/Vector3.o


And, this is the command to link:
g++ -Wall -g -O -msse -ffast-math -funroll-loops -mfpmath=sse -lrt -lSDL -lboost_thread -o raytracer bin/main.o bin/Vector3.o src/Mesh.o

Also, profiling with the -pg option works.

I'm using a Makefile btw.
Ok, so you have no problems when USE_SSE is not defined? Are you actually using any intrinsics yet or just including that __m128 in your union at this point?
Yeah, I am actually using intrinsics, and when USE_SSE is not defined, the code works fine.

Is there some extra step I can do to make sure it stays aligned?
Try something like printf("%p",&vec); or similar in the constructors of your Vector3.
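The printf suggestion can be sketched roughly as follows. This is an illustration under assumptions, not the poster's class: TracedVec4, constructAt and g_misalignedCount are hypothetical names, and the struct deliberately carries no alignment attribute so that a misplaced instance can be observed instead of being prevented.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <new>

// Hypothetical debug counter: how many instances were built off a 16-byte boundary.
static int g_misalignedCount = 0;

// Stand-in for the vector class, with a constructor that reports its own address.
struct TracedVec4 {
    float x, y, z, w;

    TracedVec4() : x(0.0f), y(0.0f), z(0.0f), w(0.0f) {
        const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(this);
        if (addr % 16 != 0) {
            ++g_misalignedCount;
            std::printf("misaligned TracedVec4 at %p\n", static_cast<void*>(this));
        }
    }
};

// Constructs a TracedVec4 at a caller-chosen address (placement new),
// which makes it easy to try both aligned and unaligned storage.
inline TracedVec4* constructAt(void* where) {
    assert(where != nullptr);
    return new (where) TracedVec4();
}
```

If the counter ever becomes non-zero in the real program, the printed addresses show which allocation path (stack, array, or container) handed out the unaligned storage.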
