
Re: vector3 pod types (in C this time)


Started by toukkapoukka September 12, 2011 11:05 AM

22 comments, last by WitchLord 12 years, 5 months ago

toukkapoukka

102

Author

September 12, 2011 11:05 AM

Hi!

A similar problem came up in an earlier topic, http://www.gamedev.net/topic/607167-ogre-vector3-and-amd64/ .
The difference in our case is that our vector type is pure C, so writing explicit constructors or copy constructors is not an option.
The problem is returning a simple struct type by value on Linux x64 (GCC 4.6.1): the returned vector values are scrambled.
For example, returning a vector with values (1,2,3) gives (3,0,1).
The vector struct looks like this:



typedef float vec_t;
typedef vec_t vec3_t[3];

typedef struct asvec3_s
{
	vec3_t v;
} asvec3_t;

We register the type with flags (asOBJ_VALUE | asOBJ_POD | asOBJ_APP_CLASS).
What options do we have to fix this?

WitchLord

4,835

September 13, 2011 12:35 AM

Linux 64bit will probably return this type in the registers RAX:RDX. However, it looks like the values might be switched, so you're getting the higher elements in the lower elements. The 0 is probably just the default value as the type only has 3 floats, and not 4.

This might be a bug in as_callfunc.cpp (lines 513-514) or it may be a specific situation for this type that is not handled yet. I'll try to investigate this and see if I can figure it out.

In the meantime, you should be able to use the autowrappers to solve this problem.
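The idea behind the autowrapper workaround can be sketched in plain C. This is not the AngelScript add-on's API; `vec3_123_wrapper` and its `ret_location` parameter are hypothetical names used only for illustration. The point is that the wrapper is built by the same compiler as the native function, so the by-value return is handled entirely by the compiler, and the caller only ever sees a pointer to where the result should be stored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { float v[3]; } asvec3_t;

/* Native function with the problematic by-value struct return. */
static asvec3_t vec3_123(void)
{
    asvec3_t v = { {1.0f, 2.0f, 3.0f} };
    return v;
}

/* Hypothetical wrapper: the result is written through a pointer, so the
   caller never has to know which registers carried the return value. */
static void vec3_123_wrapper(void *ret_location)
{
    asvec3_t result = vec3_123();   /* the compiler handles the ABI here */
    assert(ret_location != NULL);
    memcpy(ret_location, &result, sizeof result);
}
```

Calling `vec3_123_wrapper(&out)` with an `asvec3_t out` fills `out` with the three components regardless of how the platform returns small structs.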

AngelCode.com - game development and more - Reference DB - game developer references
AngelScript - free scripting library - BMFont - free bitmap font generator - Tower - free puzzle game

toukkapoukka

102

Author

September 13, 2011 08:42 AM


Small update, I found something that temporarily fixes our problem. Any of the following methods produce the vector (1,2,3) correctly:



struct vector {
  union {
    float xyz[3];
    double _as_64bit_hack[3];
  };
};

struct vector {
  float xyz[3];
  float pad[2];   // <- has to be at least 2 elements
};

struct vector {
  double xyz[3];   // works as double, see first method
};

I tried different alignments with the GCC aligned attribute, but none of them worked the way these do, so I'm not sure whether this is an alignment issue, a frame size issue, or something else.
Naturally this isn't a 100% satisfactory workaround, and I hope you can find a "correct" solution for this.

WitchLord

4,835

September 15, 2011 01:14 AM

All of these make the structure larger than 128 bits (16 bytes), which makes gcc return the type in memory rather than in registers.
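A quick way to check this is to compare the sizes of the layouts. The sketch below assumes a typical Linux x86-64 target (4-byte `float`, 8-byte `double`); the type names other than `asvec3_t` and the helper `returned_in_memory` are made up for the example, and the 16-byte threshold is a simplification of the ABI rule for plain aggregates.

```c
#include <assert.h>
#include <stddef.h>

/* The original 12-byte type from the first post. */
typedef struct { float v[3]; } asvec3_t;

/* The three workaround layouts (type names are hypothetical). */
typedef struct { union { float xyz[3]; double _as_64bit_hack[3]; }; } vec_union_t;
typedef struct { float xyz[3]; float pad[2]; } vec_padded_t;
typedef struct { double xyz[3]; } vec_double_t;

/* Simplified rule: a plain aggregate larger than 16 bytes (two eightbytes)
   is returned through caller-provided memory instead of registers. */
static int returned_in_memory(size_t size)
{
    assert(size > 0);
    return size > 16;
}
```

Under those assumptions `asvec3_t` is 12 bytes and stays in registers, while the three workarounds come out at 24, 20 and 24 bytes and are all returned in memory.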

What stumps me is that I already have a test for validating that a class similar to this works properly on linux 64bit, and it is working as it should.

Unfortunately I haven't had the time to investigate this problem in detail yet, but hopefully before the end of the week I will at least be able to understand the cause.


WitchLord

4,835

September 15, 2011 03:51 AM

I made some tests with asvec3_t, and I get the same result as you do, i.e. {3,0,1} where {1,2,3} is expected.

For some reason that I have yet to determine, gcc is treating the following class:



class Class3
{
  asDWORD a;
  asDWORD b;
  asDWORD c;
};

differently than the asvec3_t type, even though both are of the same size and both contain only primitives. Both seem to be returned in the RAX and RDX registers, however the order of the registers is swapped for asvec3_t versus Class3.

Apparently I need to have another flag than asOBJ_APP_CLASS to identify that the asvec3_t type should be returned in a swapped order, however I do not know how the application developer should know when to use one or the other.

If you have any idea on what the rule might be as to when gcc does it one way or the other I would really like to hear it.


toukkapoukka

102

Author

September 15, 2011 08:20 AM

[quote]
class Class3 { asDWORD a; asDWORD b; asDWORD c; };
[/quote]
A union of float and int members at least lets us keep the vector at 12 bytes: union { float v[3]; int i[3]; };

[quote]
Apparently I need to have another flag than asOBJ_APP_CLASS to identify that the asvec3_t type should be returned in a swapped order, however I do not know how the application developer should know when to use one or the other.
[/quote]
Yes, this is a bad idea. Even Vicious doesn't approve.

[quote]
If you have any idea on what the rule might be as to when gcc does it one way or the other I would really like to hear it.
[/quote]
I'm afraid I'm not well equipped to investigate the bug at the gcc/asm level.

I wonder whether the issues in this thread are related: http://www.gamedev.n...urning-doubles/
Especially the "bigger changes to the code that implements the native calling conventions in version 2.20.2"?
Our version is 2.21.0, by the way.
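The 12-byte union variant mentioned above might look like the following sketch. The type name `asvec3_int_t` and the helper `vec3_make` are hypothetical. A plausible reason it behaves like Class3 is that, under the System V AMD64 ABI, an eightbyte that contains any integer field is classified as INTEGER, so the overlapping `int` members would push the whole struct into the general-purpose registers.

```c
#include <assert.h>

/* 12-byte vector whose storage overlaps float and int views. */
typedef struct {
    union {
        float v[3];
        int   i[3];   /* never read; present only to change the ABI class */
    };
} asvec3_int_t;

/* Build a vector and return it by value. */
static asvec3_int_t vec3_make(float x, float y, float z)
{
    asvec3_int_t r;
    assert(sizeof r == 3 * sizeof(float));   /* no padding was added */
    r.v[0] = x;
    r.v[1] = y;
    r.v[2] = z;
    return r;
}
```

The size stays at 12 bytes on targets where `int` and `float` are both 4 bytes, which is the property the post is after.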

WitchLord

4,835

September 15, 2011 11:11 PM

No, this problem is different from the one described in the other thread. Floats and doubles are returned in the XMM register, and for some reason the value gets lost when compiling with optimizations. Probably the gcc compiler doesn't see that the XMM register is used and ends up removing the instructions that set it.

I'm trying to find some documentation that explains why gcc behaves this way for this type so I can add proper support for it.


WitchLord

4,835

September 15, 2011 11:38 PM

After reading the documentation at http://www.x86-64.or...ntation/abi.pdf I think the case might be that this structure is actually returned in XMM0 and not RAX:RDX, because it only contains float values. The fact that we get {3,0,1} might be a coincidence.

Would it be possible for you to compile the following function:



asvec3_t vec3_123()
{
    asvec3_t v = {1,2,3};
    return v;
}

into assembler, so I can see how the return value is loaded into the registers?

You should be able to do this by compiling with 'gcc -S test.c'. It will generate the assembly file test.s instead of an object file.
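The rule from the ABI document referred to above can be approximated with a small classifier. This is a simplified sketch, not the full algorithm: it assumes the struct contains only naturally aligned scalar fields that do not straddle an 8-byte boundary, and all names (`abi_class_t`, `field_t`, `classify`) are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { CLASS_NONE, CLASS_SSE, CLASS_INTEGER, CLASS_MEMORY } abi_class_t;

typedef struct {
    size_t offset;    /* byte offset of the field inside the struct */
    size_t size;      /* size of the field in bytes */
    int    is_float;  /* nonzero for float/double, zero for integers */
} field_t;

/* Classify each of the (at most two) eightbytes of a small aggregate.
   An eightbyte is INTEGER if any field in it is an integer, and SSE if
   every field in it is floating point. */
static void classify(const field_t *fields, size_t n, size_t total_size,
                     abi_class_t out[2])
{
    out[0] = out[1] = CLASS_NONE;
    if (total_size > 16) {              /* too large for two registers */
        out[0] = out[1] = CLASS_MEMORY;
        return;
    }
    for (size_t k = 0; k < n; ++k) {
        size_t eb = fields[k].offset / 8;   /* which eightbyte: 0 or 1 */
        abi_class_t c = fields[k].is_float ? CLASS_SSE : CLASS_INTEGER;
        assert(fields[k].offset + fields[k].size <= total_size);
        if (c == CLASS_INTEGER || out[eb] == CLASS_NONE)
            out[eb] = c;                    /* INTEGER wins over SSE */
    }
}
```

Fed three 4-byte float fields (the asvec3_t layout), both eightbytes come out as SSE, which corresponds to XMM0 and XMM1. Fed three 4-byte integer fields (the Class3 layout), both come out as INTEGER, which corresponds to RAX and RDX.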


toukkapoukka

102

Author

September 16, 2011 08:35 AM

asvec3.c



#define _TEST_ONLY_

#ifndef _TEST_ONLY_
	#include <stdio.h>
#endif

typedef float vec3_t[3];

typedef struct {
	vec3_t v;
} asvec3_t;

asvec3_t asvec3_123()
{
	asvec3_t v = { {1,2,3} };
	return v;
}

#ifndef _TEST_ONLY_
int main()
{
	asvec3_t v = asvec3_123();
	printf( "%f, %f, %f\n", v.v[0], v.v[1], v.v[2] );
	return 0;
}
#endif

asvec3.s



	.file	"asvec3.c"
	.text
	.globl	asvec3_123
	.type	asvec3_123, @function
asvec3_123:
.LFB0:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	movl	$0x3f800000, %eax
	movl	%eax, -32(%rbp)
	movl	$0x40000000, %eax
	movl	%eax, -28(%rbp)
	movl	$0x40400000, %eax
	movl	%eax, -24(%rbp)
	movq	-32(%rbp), %rax
	movq	%rax, -16(%rbp)
	movl	-24(%rbp), %eax
	movl	%eax, -8(%rbp)
	movq	-16(%rbp), %rdx
	mov	-8(%rbp), %eax
	movq	%rdx, -56(%rbp)
	movq	-56(%rbp), %xmm0
	movq	%rax, -56(%rbp)
	movq	-56(%rbp), %xmm1
	popq	%rbp
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	asvec3_123, .-asvec3_123
	.ident	"GCC: (GNU) 4.6.1 20110819 (prerelease)"
	.section	.note.GNU-stack,"",@progbits

Yes, it seems to use the SSE registers (XMM0 and XMM1) for the return value. I found a GCC function attribute called sseregparm, but unfortunately I can't find an attribute that does the opposite.
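Reading the listing above, the function leaves the first two floats in the low 64 bits of XMM0 and the third float in XMM1. As a side effect of this unoptimized code, RDX still holds the first eightbyte and RAX the zero-extended third float when `ret` executes. The sketch below rebuilds the struct from two 64-bit register images on a little-endian machine. The helper name and the constants are illustrative, and the assumption that a caller copies RAX first and RDX second is mine, based on the RAX:RDX description earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { float v[3]; } asvec3_t;

/* Rebuild the 12-byte struct from two register images: `first` supplies
   bytes 0..7 and the low half of `second` supplies bytes 8..11.
   Assumes a little-endian target such as x86-64. */
static asvec3_t vec3_from_eightbytes(uint64_t first, uint64_t second)
{
    asvec3_t out;
    unsigned char *dst = (unsigned char *)&out;
    assert(sizeof out == 12);
    memcpy(dst, &first, 8);
    memcpy(dst + 8, &second, 4);
    return out;
}

/* Register contents at `ret`, read off the assembly listing. */
static const uint64_t XMM0_LOW   = 0x400000003f800000ULL; /* 1.0f, 2.0f */
static const uint64_t XMM1_LOW   = 0x0000000040400000ULL; /* 3.0f       */
static const uint64_t RAX_AT_RET = 0x0000000040400000ULL; /* 3.0f, 0    */
static const uint64_t RDX_AT_RET = 0x400000003f800000ULL; /* 1.0f, 2.0f */
```

Combining `XMM0_LOW` and `XMM1_LOW` gives (1,2,3), while combining `RAX_AT_RET` and `RDX_AT_RET` gives (3,0,1), which matches the scrambled result reported in the first post.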

bubu LV

1,436

September 16, 2011 08:52 AM

Here is a nice document that lists the ABI for various compilers (including MSVC and GCC) on various platforms (including 16-bit, 32-bit and 64-bit PC): http://www.agner.org/optimize/calling_conventions.pdf (page 16)


This topic is closed to new replies.
