apologies for a query posed in nescience........
i've been poring over a bit of code that uses a lot of cpu (around 56%) at some points, and very little (around 8%) at others.
there are a few comparisons, and a few divisions, though the divisions are by values 1 and greater... i do not have a profiler in my compiler (borland freecommandlinetools, i really love it.. on winXP..). the code also sets bits on a bitmap, so it could be due to the size of the array.
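without a profiler, one rough way to narrow this down is to time the suspect sections by hand with the standard clock() call. a minimal sketch, with hypothetical names (section_timer, section_begin, section_end, section_seconds) that are not from any library:

```c
#include <time.h>

/* hypothetical accumulator for one section of code */
typedef struct {
    clock_t total;  /* ticks accumulated over all calls */
    clock_t start;  /* tick count taken at section_begin */
    long    calls;  /* number of completed begin/end pairs */
} section_timer;

/* mark the start of the section being measured */
static void section_begin(section_timer *t)
{
    t->start = clock();
}

/* mark the end of the section and add the elapsed ticks */
static void section_end(section_timer *t)
{
    t->total += clock() - t->start;
    t->calls++;
}

/* total time spent in the section, in seconds */
static double section_seconds(const section_timer *t)
{
    return (double)t->total / CLOCKS_PER_SEC;
}
```

clock() is coarse, so the totals only mean something when accumulated over many frames. wrapping the rotation code and the bitmap writes in separate timers would show which of the two grows when the cpu load jumps.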
there are a lot of 2d rotations, and it's possible that these coefficients may incidentally be very small... ..i'm used to observing denormals in IIR filters, i'd be surprised to see them in "one-off" multiplications.. i have also noted that the slowdown occurs predictably when world objects are close to world origin (0,0). (this would not affect the process of accessing/writing the bitmap).
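to check the denormal guess directly rather than infer it from cpu load, one option is to test the rotation coefficients and results as they are computed and count how many fall in the denormal range. a minimal sketch, assuming the values are plain float or double; the helper names are made up for illustration:

```c
#include <float.h>
#include <math.h>

/* hypothetical helper: 1 if x is nonzero but smaller in magnitude
   than the smallest normal float (FLT_MIN), else 0 */
static int is_denormal_f(float x)
{
    float a = (float)fabs(x);
    return a != 0.0f && a < FLT_MIN;
}

/* same check for double, using DBL_MIN */
static int is_denormal_d(double x)
{
    double a = fabs(x);
    return a != 0.0 && a < DBL_MIN;
}
```

calling one of these on each rotated coordinate and bumping a counter would show whether the slow stretches near (0,0) actually line up with denormal values.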
..understanding the low-tech venue and psyche i am operating in, my guess is that the increase is due to the cpu denormaling.
i apologise to everyone that i have not been convinced to switch to a different compiler, or integrate a profiler with fclt, and have asked this very vague question anyway.. i know some people will say i don't have a right to ask anyone anything when i manifest such limits..
..but on the off chance that you've worked with XP and that this is an easy query to address or add to for you,
i spent some time reading about SSE, as i vaguely remember that being an issue from a decade or so ago. i tried to adapt information from discussions to address the denormaling on my computer, but was unsuccessful :) -
my compiler does not have
#include <xmmintrin.h>
which is needed to use either of these..
_MM_SET_FLUSH_ZERO_MODE(x)
_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
iirc i was able to get this to compile, but it did not affect performance -
#define _MM_DENORMALS_ZERO_MASK 0x0040
#define _MM_DENORMALS_ZERO_ON 0x0040
#define _MM_DENORMALS_ZERO_OFF 0x0000
#define _MM_SET_DENORMALS_ZERO_MODE(mode) \
    _mm_setcsr((_mm_getcsr() & ~_MM_DENORMALS_ZERO_MASK) | (mode))
#define _MM_GET_DENORMALS_ZERO_MODE() \
    (_mm_getcsr() & _MM_DENORMALS_ZERO_MASK)
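one possible reason the macros above made no difference: the flush-to-zero and denormals-are-zero bits live in the MXCSR register, which only governs SSE/SSE2 instructions. if the compiler emits x87 instructions for floating point math (an assumption about this setup, not something confirmed here), those bits never come into play. a compiler-independent fallback is to clamp tiny values to zero by hand before they feed further arithmetic. a minimal sketch, where flush_tiny and TINY_THRESHOLD are hypothetical names and the threshold is an arbitrary pick:

```c
#include <math.h>

/* assumed cutoff, not from any standard: magnitudes below this are
   treated as zero. pick a value that is negligible for your data. */
#define TINY_THRESHOLD 1e-30

/* hypothetical helper: returns 0.0 for values too small to matter,
   otherwise returns x unchanged */
static double flush_tiny(double x)
{
    return (fabs(x) < TINY_THRESHOLD) ? 0.0 : x;
}
```

applied to the rotation coefficients or to coordinates near the origin, this keeps later multiplications away from the denormal range, at the cost of one compare per value.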
..can haz system denormals for borland? :p i'd totally be on top of this if it were an IIR... this kinda stuff is not why i code :) "that kinda guy.."