float vs double
MS recently said at one of their Xbox 360 conferences that double is native to the platform and therefore faster. I second the recommendation of using typedefs for your floating-point types, so you can easily switch the underlying type. This could cause compatibility problems in file I/O and other such things, so a bit of care must be taken.
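A minimal sketch of the typedef approach and the file I/O caveat. The names `real`, `file_real`, `writeReal` and `readReal` are made up for illustration, and the sketch assumes the file format stores 32-bit floats and ignores byte order:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical project-wide alias: change this one line to switch precision.
typedef float real;

// The on-disk type is pinned separately, so existing files stay readable
// even if `real` is later changed to double.
typedef float file_real;
static_assert(sizeof(file_real) == 4, "file format assumes 32-bit floats");

// Append one value to a byte buffer in the fixed on-disk width.
void writeReal(std::vector<unsigned char>& out, real value) {
    const file_real stored = static_cast<file_real>(value);
    unsigned char bytes[sizeof stored];
    std::memcpy(bytes, &stored, sizeof stored);
    out.insert(out.end(), bytes, bytes + sizeof stored);
}

// Read one value back from the buffer at the given byte offset.
real readReal(const std::vector<unsigned char>& in, std::size_t offset) {
    file_real stored;
    std::memcpy(&stored, in.data() + offset, sizeof stored);
    return static_cast<real>(stored);
}
```

If `real` were written to disk directly with `sizeof(real)`, switching the typedef would silently change the record size, which is the kind of compatibility problem mentioned above.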
Quote: Original post by Nitage
Quote: Original post by blizzard999
In my opinion, on modern processors (like the P4 with its 128-bit SIMD floating-point arithmetic), using floats gives no speed benefit.
There are 128-bit instructions where those 128 bits hold 4 floats.
There are 128-bit instructions where those 128 bits hold 2 doubles.
Using floats with 128-bit instructions can give a 100% speedup over doubles.
Example?
Quote: Original post by blizzard999
Example?
Because you can do twice as many calculations with floats as with doubles. 4 is 100% more than 2.
What example do you want? xmm0 can hold either 4 floats or 2 doubles. That's 2x the data per instruction, plus you don't need SSE2, which Athlon XPs (still very popular) don't support.
Quote: Original post by blizzard999
Example?
SSE3 instructions:
ADDSUBPD ( Add-Subtract-Packed-Double )
Input: { A0, A1 }, { B0, B1 }
Output: { A0 - B0, A1 + B1 }
ADDSUBPS ( Add-Subtract-Packed-Single )
Input: { A0, A1, A2, A3 }, { B0, B1, B2, B3 }
Output: { A0 - B0, A1 + B1, A2 - B2, A3 + B3 }
Twice as many values get processed per instruction using floats. Therefore floats can be up to 100% faster.
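To make the lane counts concrete, here is a portable scalar model of the two instructions. It is only a sketch of their semantics, not the hardware path: the real operations are exposed as the `_mm_addsub_pd` and `_mm_addsub_ps` intrinsics in `<pmmintrin.h>`. The names `addsub`, `Pd` and `Ps` are made up for this example.

```cpp
#include <array>
#include <cstddef>

// Scalar model of the SSE3 add-subtract pattern: even lanes subtract,
// odd lanes add. N is the number of lanes in one 128-bit register.
template <typename T, std::size_t N>
std::array<T, N> addsub(const std::array<T, N>& a, const std::array<T, N>& b) {
    static_assert(sizeof(T) * N == 16, "operands must fill exactly 128 bits");
    std::array<T, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = (i % 2 == 0) ? a[i] - b[i] : a[i] + b[i];
    }
    return out;
}

// Models ADDSUBPD: 2 double lanes per call.
typedef std::array<double, 2> Pd;
// Models ADDSUBPS: 4 float lanes per call.
typedef std::array<float, 4> Ps;
```

One call on `Ps` produces four results where one call on `Pd` produces two, which is the whole argument. Whether that becomes a 2x speedup in practice depends on the instructions having similar throughput and on memory not being the bottleneck.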
Quote: Original post by Basiror
I just looked something up: SSE2 is supposed to support 128-bit registers that perform 2 double-precision operations in one step.
Yeah, but you can also do 4 single-precision operations in one step.
You get 8 of those registers, I think,
which makes matrix*matrix very fast. You can load the first matrix into the first 4 registers (4 * 4 matrix), then the next matrix into the next 4 registers. If you are using doubles it will take at least twice as long.
I haven't put this into practice, but I remember the theory from my book. Maybe there are other things to consider which I have missed.
There is also the shuffle operation, which works on 4 floats. If you are using doubles you'll be messing around with 2 registers and moving stuff manually - slow.
Yeah, graphics processors and FPUs are usually (afaik) optimized to use floats.
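A rough sketch of the 4x4 case with SSE intrinsics. This is one common arrangement (rows of the second matrix kept in registers, elements of the first broadcast across lanes), not necessarily what the book mentioned above describes. `mat4Mul` and `SKETCH_HAVE_SSE` are names made up here, and row-major storage is an assumption. A scalar fallback is included for compilers without SSE.

```cpp
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_mul_ps, _mm_add_ps, ...
#define SKETCH_HAVE_SSE 1
#endif

// out = a * b for row-major 4x4 float matrices (16 floats each).
// out must not overlap a or b.
void mat4Mul(const float* a, const float* b, float* out) {
#ifdef SKETCH_HAVE_SSE
    // Each row of b fills exactly one 128-bit register.
    const __m128 b0 = _mm_loadu_ps(b + 0);
    const __m128 b1 = _mm_loadu_ps(b + 4);
    const __m128 b2 = _mm_loadu_ps(b + 8);
    const __m128 b3 = _mm_loadu_ps(b + 12);
    for (int i = 0; i < 4; ++i) {
        // Broadcast one element of a to all 4 lanes, scale a row of b by it,
        // and accumulate: 4 multiplies and 3 adds yield a full output row.
        __m128 row = _mm_mul_ps(_mm_set1_ps(a[4 * i + 0]), b0);
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a[4 * i + 1]), b1));
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a[4 * i + 2]), b2));
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a[4 * i + 3]), b3));
        _mm_storeu_ps(out + 4 * i, row);
    }
#else
    // Portable fallback computing the same result one element at a time.
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) sum += a[4 * i + k] * b[4 * k + j];
            out[4 * i + j] = sum;
        }
    }
#endif
}
```

With doubles, a row of four values no longer fits in one register, so each row needs two `__m128d` registers and roughly twice as many multiply and add instructions.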
Quote: NEVER EVER use == on a float
Very true. A most useful function for any program dealing with floating-point values:
#include <cmath>

// Returns true when the two values differ by less than Tolerance.
bool compareFloat(float Value1, float Value2, float Tolerance)
{
    return std::fabs(Value1 - Value2) < Tolerance;
}
The tolerance can be hard coded if desired.
(Pardon if my formatting is off, I've been programming in so many C variant scripting languages lately I can't keep em straight...)
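A small sketch of the hard-coded tolerance idea, using a default argument so callers can still override it. The names `nearlyEqual`, `kDefaultTolerance` and `sumOfTenths` and the value `1e-5f` are assumptions for illustration; a fixed absolute tolerance only makes sense when you know the rough magnitude of your values.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical default; pick a tolerance that suits the magnitude of your data.
const float kDefaultTolerance = 1e-5f;

// True when a and b differ by less than tolerance (absolute difference).
bool nearlyEqual(float a, float b, float tolerance = kDefaultTolerance) {
    assert(tolerance >= 0.0f);
    return std::fabs(a - b) < tolerance;
}

// Adds 0.1f ten times. With IEEE 754 single precision the rounding errors
// typically leave the result slightly above 1.0f, so comparing it to 1.0f
// with == fails even though the values are "equal" on paper.
float sumOfTenths() {
    float sum = 0.0f;
    for (int i = 0; i < 10; ++i) sum += 0.1f;
    return sum;
}
```

Usage would look like `if (nearlyEqual(sumOfTenths(), 1.0f)) ...`, with an explicit third argument wherever the default is too tight or too loose.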
Quote: Original post by Riviera Kid
Yeah, graphics processors and FPUs are usually (afaik) optimized to use floats.
Yes, I know, but in some cases this leads to small precision problems.
The few matrix operations I have to perform aren't critical anyway, since most of the work will be moved to the GPU, which is optimized for this kind of operation.
Quote: NEVER EVER use == on a float
I do something like this occasionally:
float val = MAXFLOAT;
... loop which may change val ...
if (val == MAXFLOAT) ...
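A sketch of why the sentinel pattern above is one of the cases where == is reasonable. `FLT_MAX` from `<cfloat>` stands in for `MAXFLOAT`, which is a non-standard macro; `smallest` and `foundAny` are hypothetical names.

```cpp
#include <cfloat>
#include <cstddef>

// Returns the smallest element, or FLT_MAX if the array is empty.
float smallest(const float* values, std::size_t count) {
    float val = FLT_MAX;  // sentinel: assigned, never computed
    for (std::size_t i = 0; i < count; ++i) {
        if (values[i] < val) val = values[i];
    }
    return val;
}

// An exact comparison is safe here because val is either still the
// bit-identical sentinel or a copy of an input element; no arithmetic
// has been applied to it that could introduce rounding error.
bool foundAny(const float* values, std::size_t count) {
    return smallest(values, count) != FLT_MAX;
}
```

The pattern breaks down if the data can legitimately contain `FLT_MAX`, in which case a separate boolean flag is the safer choice.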