Hi,
Since all computers have SSE2 now, are the standard math functions (cmath) still needed?
Is it correct to do everything using SSE2? AVX is young but now SSE2 is here since a long time.
Thanks
Edited by Alundra, 17 February 2013 - 05:26 PM.
Posted 17 February 2013 - 06:03 PM
Not all CPUs have SSE2. However, you might not be targeting "all CPUs" with your program. Yes, nearly all personal computers have SSE2 support these days, but there's a good number of devices (like cell phones, RPis, etc.) that don't support SSE2. So it depends on what kind of computers you are targeting, and I'm not sure what that is. It's up to you.
But of course, the cmath functions, plus your compiler's optimizer, very well might be using SSE2 anyway...
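As a rough sketch of how a program can target both kinds of machine, the code below checks for SSE2 at compile time and falls back to cmath otherwise. The names `HAS_SSE2` and `sqrt_any` are made up for this example, and the macro test only covers the usual GCC/Clang/MSVC conventions, so treat it as an assumption rather than a complete detection scheme.

```cpp
#include <cassert>
#include <cmath>

// GCC/Clang define __SSE2__ when SSE2 code generation is enabled.
// MSVC defines _M_X64 for 64-bit x86 builds, and _M_IX86_FP >= 2 for
// 32-bit builds compiled with /arch:SSE2 or higher.
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#  include <emmintrin.h>
#  define HAS_SSE2 1   // hypothetical macro name, used only in this sketch
#else
#  define HAS_SSE2 0
#endif

// Hypothetical helper: square root of a non-negative double.
double sqrt_any(double x) {
    assert(x >= 0.0);
#if HAS_SSE2
    __m128d v = _mm_set_sd(x);   // put x in the low lane, zero the high lane
    v = _mm_sqrt_sd(v, v);       // scalar double-precision square root
    return _mm_cvtsd_f64(v);     // extract the low lane
#else
    return std::sqrt(x);         // portable path for ARM, older x86, etc.
#endif
}
```

In practice a plain `std::sqrt` call is often compiled to the same instruction when SSE2 code generation is on, which is the point made above: writing the intrinsic by hand does not necessarily gain anything.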
Posted 17 February 2013 - 06:03 PM
"AVX is young but now SSE2 is here since a long time."
Let me guess... are you German?
Edited by Álvaro, 17 February 2013 - 06:08 PM.
Posted 17 February 2013 - 07:10 PM
If you're making a game for Windows PCs, you can just assume that the user will have SSE2, and write on your box/manual/website that an SSE2 CPU is required (basically: Pentium 4 or newer).
On other platforms, there might be SIMD instruction sets other than SSE/AVX -- e.g. ARM CPUs have NEON and PowerPC has AltiVec.
However, even on x86/PC, it's not always best to use SSE, because mixing float and SSE code is slow. Within a particular bit of code (e.g. a particle simulation), you either need all of your code to use float, or all of it to use SSE.
If some of the code is float, and some is SSE, then data needs to be transferred from the SSE registers to the float registers (or vice versa) -- in order to do this, the CPU needs to write the data from one type of register to RAM, then read it from RAM to the other type of register. This can be extremely slow.
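To illustrate the "all of it in SSE" option for something like a particle update, here is a minimal sketch in which the values stay in SSE registers from load to store inside the hot loop. The function name `integrate`, the `USE_SSE` macro, and the flat array layout are assumptions made for this example, not something from the posts above.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#  include <emmintrin.h>
#  define USE_SSE 1   // hypothetical macro name, used only in this sketch
#else
#  define USE_SSE 0
#endif

// Hypothetical helper: pos[i] += vel[i] * dt for every element.
void integrate(float* pos, const float* vel, std::size_t count, float dt) {
    assert(pos != nullptr && vel != nullptr);
    std::size_t i = 0;
#if USE_SSE
    const __m128 vdt = _mm_set1_ps(dt);           // broadcast dt to all 4 lanes
    for (; i + 4 <= count; i += 4) {
        __m128 p = _mm_loadu_ps(pos + i);         // unaligned load of 4 floats
        __m128 v = _mm_loadu_ps(vel + i);
        p = _mm_add_ps(p, _mm_mul_ps(v, vdt));    // no scalar code in between
        _mm_storeu_ps(pos + i, p);
    }
#endif
    // Scalar loop: handles the leftover elements, or everything when
    // SSE is not available at compile time.
    for (; i < count; ++i) {
        pos[i] += vel[i] * dt;
    }
}
```

The scalar code is kept outside the vector loop on purpose, so the two kinds of code never exchange values mid-computation.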
Posted 17 February 2013 - 09:21 PM
I was writing some 2D collision code about a month ago, and when testing collisions with objects at around position x = 10,000, the collision was messing up - I was getting stuck on walls and other objects that would not cause an issue around x = 0.
After a lot of debugging, I had a realization, turned off the SSE optimization and my collision issues disappeared. The collision was done with doubles too. There's definitely a tradeoff when using SSE, and while it may not matter most of the time, sometimes it does.
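The post doesn't say what actually went wrong, so the following is only a sketch of one general effect that makes results depend on position: the gap between adjacent floating-point values grows with magnitude. The helper name `spacing_above` is made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: distance from x to the next larger representable
// value of the same type (float, double, or long double).
template <typename T>
T spacing_above(T x) {
    assert(std::isfinite(x));
    return std::nextafter(x, std::numeric_limits<T>::infinity()) - x;
}
```

For a float, the spacing at 1.0 is 2^-23 (about 1.2e-7), while at 10,000 it is 2^-10 (about 0.001), so 8192 times coarser. A double at 10,000 still resolves about 1.8e-12. Whether a compiler keeps intermediates at higher precision (as x87 code can) or rounds them at every step (as SSE2 code does) can therefore change which side of a collision threshold a value lands on.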
Posted 18 February 2013 - 12:50 AM