Very strange bug (when running basic C array code)
Can you make the source available? I think some subtle trick is leading to this behaviour.
The code is as I described.
This is stripped down to a bare WinMain version, compiled with the same settings:
https://www.dropbox.com/s/ib4igh5qs85a156/test.zip
This version does not seem to crash.
When I call it from within my program:
#include "fist.h"
#include <stdlib.h> /* rand, exit */
#include <x86intrin.h>
float modelRight_x = 1.1;
float modelRight_y = 1.2;
float modelRight_z = 1.3;
float modelUp_x = 1.1;
float modelUp_y = 1.2;
float modelUp_z = 1.3;
float modelDir_x = 1.1;
float modelDir_y = 1.2;
float modelDir_z = 1.3;
__attribute__ ((aligned (16))) float normal_x[100*1000];
__attribute__ ((aligned (16))) float normal_y[100*1000];
__attribute__ ((aligned (16))) float normal_z[100*1000];
__attribute__ ((aligned (16))) float n_x[100*1000];
__attribute__ ((aligned (16))) float n_y[100*1000];
__attribute__ ((aligned (16))) float n_z[100*1000];
void initialize_data_for_matrix_mul()
{
    static int initialized = 0;
    if(initialized) return;
    initialized = 1;
    for(int i=0; i<100*1000; i++)
    {
        n_x[i] = (100.+rand()%10000)/1000.;
        n_y[i] = (100.+rand()%10000)/1000.;
        n_z[i] = (100.+rand()%10000)/1000.;
    }
}
void matrix_mul_float()
{
    for(int i=0; i<100*1000; i++)
    {
        normal_x[i] = n_x[i]*modelRight_x + n_y[i]*modelRight_y + n_z[i]*modelRight_z;
        normal_y[i] = n_x[i]*modelUp_x + n_y[i]*modelUp_y + n_z[i]*modelUp_z;
        normal_z[i] = n_x[i]*modelDir_x + n_y[i]*modelDir_y + n_z[i]*modelDir_z;
    }
    return;
}
//struct float4 { float x,y,z,w; };
__attribute__ ((aligned (16))) float4 modelRight_4x = {1.1, 1.1, 1.1, 1.1 };
__attribute__ ((aligned (16))) float4 modelRight_4y = {1.2, 1.2, 1.2, 1.2 };
__attribute__ ((aligned (16))) float4 modelRight_4z = {1.3, 1.3, 1.3, 1.3 };
__attribute__ ((aligned (16))) float4 modelUp_4x = {1.1, 1.1, 1.1, 1.1 };
__attribute__ ((aligned (16))) float4 modelUp_4y = {1.2, 1.2, 1.2, 1.2 };
__attribute__ ((aligned (16))) float4 modelUp_4z = {1.3, 1.3, 1.3, 1.3 };
__attribute__ ((aligned (16))) float4 modelDir_4x = {1.1, 1.1, 1.1, 1.1 };
__attribute__ ((aligned (16))) float4 modelDir_4y = {1.2, 1.2, 1.2, 1.2 };
__attribute__ ((aligned (16))) float4 modelDir_4z = {1.3, 1.3, 1.3, 1.3 };
void matrix_mul_sse()
{
    __m128 mRx = _mm_load_ps((const float*) &modelRight_4x);
    __m128 mRy = _mm_load_ps((const float*) &modelRight_4y);
    __m128 mRz = _mm_load_ps((const float*) &modelRight_4z);
    __m128 mUx = _mm_load_ps((const float*) &modelUp_4x);
    __m128 mUy = _mm_load_ps((const float*) &modelUp_4y);
    __m128 mUz = _mm_load_ps((const float*) &modelUp_4z);
    __m128 mDx = _mm_load_ps((const float*) &modelDir_4x);
    __m128 mDy = _mm_load_ps((const float*) &modelDir_4y);
    __m128 mDz = _mm_load_ps((const float*) &modelDir_4z);
    for(int i=0; i<100*1000; i+=4)
    {
        __m128 nx = _mm_load_ps( &n_x[i]);
        __m128 ny = _mm_load_ps( &n_y[i]);
        __m128 nz = _mm_load_ps( &n_z[i]);
        __m128 normalx = _mm_add_ps(_mm_add_ps(_mm_mul_ps(nx,mRx), _mm_mul_ps(ny,mRy)), _mm_mul_ps(nz,mRz));
        __m128 normaly = _mm_add_ps(_mm_add_ps(_mm_mul_ps(nx,mUx), _mm_mul_ps(ny,mUy)), _mm_mul_ps(nz,mUz));
        __m128 normalz = _mm_add_ps(_mm_add_ps(_mm_mul_ps(nx,mDx), _mm_mul_ps(ny,mDy)), _mm_mul_ps(nz,mDz));
        _mm_store_ps( &normal_x[i], normalx);
        _mm_store_ps( &normal_y[i], normaly);
        // _mm_store_ps( &normal_z[i], normalz);
    }
}
void tests()
{
    alert("\nstart");
    initialize_data_for_matrix_mul();
    alert("\nmul float");
    matrix_mul_float();
    alert("\nmul sse");
    matrix_mul_sse();
    alert("\ndone");
    exit(0);
}
The only changes are renaming WinMain to tests and including my framework's header, which should contain nothing scary (I could comment it out). This is linked as a separate .o and called from my application, and it crashes as I said.
I commented out the header, so this is now the same code as the standalone WinMain version that does not crash. The only change is renaming WinMain to tests and calling it from my application (the command-line build scripts are the same, except that I link more objects into my application). It still crashes when called from my application. (I call it in the main loop; I could check what happens if I call it from the app setup instead.)
What values do the variables hold?
What is the content of the destination memory?
Is it what you expect?
What does the debugger tell you?
When it crashes, what is the call stack?
The call stack is obvious, I think.
PS: I'm now talking about this second crash (not the first, possibly more mysterious one with the simple float code, which I sidestepped by turning off -O3 -Ofast).
Here it seems to crash on the first SSE line.
100*1000 is 100,000 elements, 6 such arrays make 600,000 elements at 4 bytes each, that's 2,400,000 bytes or over 2 MB. That's bigger than the default stack size, which is why you get a crash. Use the new[] operator for those arrays and it'll work just fine.
You have your answer right there, stop blaming the compiler for your mistake. Either allocate the arrays with new (recommended) or increase the stack size. The default stack size is 1 MB, IIRC.
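To make the arithmetic above concrete, here is a minimal sketch in C. The names ELEMS, ARRAYS, total_bytes and alloc_aligned_floats are made up for illustration and are not part of the posted code. Since the posted code is C rather than C++, the sketch swaps new[] for C11 aligned_alloc; on toolchains that lack aligned_alloc, something like _mm_malloc would play the same role. The stack limit only comes into play for arrays declared inside a function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ELEMS  (100 * 1000)  /* elements per array, as in the posted code */
#define ARRAYS 6             /* normal_x/y/z and n_x/y/z */

/* Total footprint of all six float arrays, in bytes. */
size_t total_bytes(void)
{
    return (size_t)ARRAYS * ELEMS * sizeof(float);
}

/* Hypothetical helper: allocate one 16-byte-aligned float array on the heap.
   aligned_alloc (C11) wants the size to be a multiple of the alignment,
   so the byte count is rounded up to the next multiple of 16. */
float *alloc_aligned_floats(size_t count)
{
    assert(count > 0);
    size_t bytes = count * sizeof(float);
    bytes = (bytes + 15) & ~(size_t)15;
    return aligned_alloc(16, bytes);   /* release with free() */
}
```

With 4-byte floats, total_bytes() gives 2,400,000, matching the figure above. A caller would do something like float *nx = alloc_aligned_floats(ELEMS); and free(nx) when done.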
?
But it is not on the stack, or am I blind?
(I checked: my stack addresses are at 0x0020 0000 and static data is at 0x0040 0000. This is static data, and it is aligned.)
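A check like the one described can be sketched as follows. This is only an illustration: demo_array, is_aligned16 and report_addresses are hypothetical names, not part of the posted code, and the attribute syntax assumes GCC or Clang. It prints the address of a stack local next to the address of a file-scope array, and tests the 16-byte alignment that _mm_load_ps and _mm_store_ps require.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define ELEMS (100 * 1000)

/* File-scope array: static storage, not the stack. */
__attribute__((aligned(16))) float demo_array[ELEMS];

/* Returns 1 if p sits on a 16-byte boundary, 0 otherwise. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Prints both addresses so they can be compared with the stack and
   static-data ranges of the process. */
void report_addresses(void)
{
    int local = 0;   /* lives on the stack */
    assert(is_aligned16(demo_array));
    printf("stack local: %p\n", (void *)&local);
    printf("demo_array : %p (aligned16=%d)\n",
           (void *)demo_array, is_aligned16(demo_array));
}
```

Because the loop in the posted code steps i by 4 floats (16 bytes), every &n_x[i] it touches stays on a 16-byte boundary as long as the array base is aligned.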
So, did you run a debugger yet?
Which values do you want to know? I will tell you.
I still have no idea what the reason is.
Maybe there is some kind of SSE mode I need to turn on or something?
I should add that I tried _mm_loadu_ps and it crashes the same way.
I also decreased the array sizes, but it crashes the same way.
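For reference, the difference between the two loads can be sketched like this (demo_buf, sum4_aligned and sum4_unaligned are hypothetical names for illustration, and the attribute syntax assumes GCC or Clang). _mm_load_ps requires a 16-byte-aligned address, while _mm_loadu_ps accepts any address. If both variants crash in the same way, that would suggest the alignment of the data arrays is probably not the cause, though this sketch alone cannot confirm that.

```c
#include <assert.h>
#include <stdint.h>
#include <xmmintrin.h>

/* 16-byte-aligned buffer of 8 floats. */
__attribute__((aligned(16))) float demo_buf[8] = {0, 1, 2, 3, 4, 5, 6, 7};

/* Sums four floats starting at p; p must be 16-byte aligned. */
float sum4_aligned(const float *p)
{
    float out[4];
    assert(((uintptr_t)p & 15u) == 0);   /* precondition of _mm_load_ps */
    __m128 v = _mm_load_ps(p);
    _mm_storeu_ps(out, v);               /* out has no alignment guarantee */
    return out[0] + out[1] + out[2] + out[3];
}

/* Sums four floats starting at p; p may have any alignment. */
float sum4_unaligned(const float *p)
{
    float out[4];
    __m128 v = _mm_loadu_ps(p);
    _mm_storeu_ps(out, v);
    return out[0] + out[1] + out[2] + out[3];
}
```

Here sum4_aligned(&demo_buf[0]) and sum4_aligned(&demo_buf[4]) are valid calls, while &demo_buf[1] is only safe to pass to sum4_unaligned.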