How about using the code below (based on
this), does that fulfill the criterias?
static unsigned int seed = 1;float sfrand(){ unsigned int a; a = seed << 14; // seed * 16384 a += (seed << 8); // seed * (16384 + 256) a += (seed << 7); // seed * (16384 + 256 + 128) a += (seed << 5); // seed * (16384 + 256 + 128 + 32) a += (seed << 2); // seed * (16384 + 256 + 128 + 32 + 4) a += (seed << 1); // seed * (16384 + 256 + 128 + 32 + 4 + 2) seed += a; // seed * (16384 + 256 + 128 + 32 + 4 + 2 + 1) = seed * 16807 a = (seed & 0x007fffff) | 0x40000000; return( *((float*)&a) - 3.0f );}
Below is the output generated using the (asm version, see edit at botom of post) of the code above:
Random numbers generated: 1000000000Took: 14114.2 ms (70850586.5 numbers / second)Average in each iterval: 20000000Worst interval: 616 (0.003080% deviation)[-1.00, -0.96) = +465[-0.96, -0.92) = +425[-0.92, -0.88) = -544[-0.88, -0.84) = -506[-0.84, -0.80) = +410[-0.80, -0.76) = +443[-0.76, -0.72) = -478[-0.72, -0.68) = +502[-0.68, -0.64) = -488[-0.64, -0.60) = +603[-0.60, -0.56) = -478[-0.56, -0.52) = +488[-0.52, -0.48) = -514[-0.48, -0.44) = +480[-0.44, -0.40) = -536[-0.40, -0.36) = +616[-0.36, -0.32) = -425[-0.32, -0.28) = +522[-0.28, -0.24) = -594[-0.24, -0.20) = +444[-0.20, -0.16) = -453[-0.16, -0.12) = +498[-0.12, -0.08) = +503[-0.08, -0.04) = -407[-0.04, +0.00) = -574[+0.00, +0.04) = +463[+0.04, +0.08) = +435[+0.08, +0.12) = -526[+0.12, +0.16) = -386[+0.16, +0.20) = +433[+0.20, +0.24) = +369[+0.24, +0.28) = -493[+0.28, +0.32) = -484[+0.32, +0.36) = +470[+0.36, +0.40) = +400[+0.40, +0.44) = -413[+0.44, +0.48) = +460[+0.48, +0.52) = -436[+0.52, +0.56) = +431[+0.56, +0.60) = -408[+0.60, +0.64) = +342[+0.64, +0.68) = -532[+0.68, +0.72) = +388[+0.72, +0.76) = -468[+0.76, +0.80) = +336[+0.80, +0.84) = -532[+0.84, +0.88) = +479[+0.88, +0.92) = +326[+0.92, +0.96) = -604[+0.96, +1.00) = -452
The output of the original algorithm using the single multiply:
Random numbers generated: 1000000000Took: 13766.4 ms (72640427.8 numbers / second)Average in each iterval: 20000000Worst interval: 616 (0.003080% deviation)[-1.00, -0.96) = +465[-0.96, -0.92) = +425[-0.92, -0.88) = -544[-0.88, -0.84) = -506[-0.84, -0.80) = +410[-0.80, -0.76) = +443[-0.76, -0.72) = -478[-0.72, -0.68) = +502[-0.68, -0.64) = -488[-0.64, -0.60) = +603[-0.60, -0.56) = -478[-0.56, -0.52) = +488[-0.52, -0.48) = -514[-0.48, -0.44) = +480[-0.44, -0.40) = -536[-0.40, -0.36) = +616[-0.36, -0.32) = -425[-0.32, -0.28) = +522[-0.28, -0.24) = -594[-0.24, -0.20) = +444[-0.20, -0.16) = -453[-0.16, -0.12) = +498[-0.12, -0.08) = +503[-0.08, -0.04) = -407[-0.04, +0.00) = -574[+0.00, +0.04) = +463[+0.04, +0.08) = +435[+0.08, +0.12) = -526[+0.12, +0.16) = -386[+0.16, +0.20) = +433[+0.20, +0.24) = +369[+0.24, +0.28) = -493[+0.28, +0.32) = -484[+0.32, +0.36) = +470[+0.36, +0.40) = +400[+0.40, +0.44) = -413[+0.44, +0.48) = +460[+0.48, +0.52) = -436[+0.52, +0.56) = +431[+0.56, +0.60) = -408[+0.60, +0.64) = +342[+0.64, +0.68) = -532[+0.68, +0.72) = +388[+0.72, +0.76) = -468[+0.76, +0.80) = +336[+0.80, +0.84) = -532[+0.84, +0.88) = +479[+0.88, +0.92) = +326[+0.92, +0.96) = -604[+0.96, +1.00) = -452
The output of a version using rand():
Random numbers generated: 1000000000Took: 74000.3 ms (13513455.6 numbers / second)Average in each iterval: 20000000Worst interval: 2982 (0.014910% deviation)[-1.00, -0.96) = +1454[-0.96, -0.92) = -119[-0.92, -0.88) = -127[-0.88, -0.84) = -349[-0.84, -0.80) = +958[-0.80, -0.76) = +209[-0.76, -0.72) = +835[-0.72, -0.68) = +588[-0.68, -0.64) = +1911[-0.64, -0.60) = +1355[-0.60, -0.56) = -1675[-0.56, -0.52) = +516[-0.52, -0.48) = -324[-0.48, -0.44) = +1640[-0.44, -0.40) = -197[-0.40, -0.36) = +1498[-0.36, -0.32) = -1362[-0.32, -0.28) = -2007[-0.28, -0.24) = +1828[-0.24, -0.20) = -613[-0.20, -0.16) = -1420[-0.16, -0.12) = -2982[-0.12, -0.08) = -378[-0.08, -0.04) = +625[-0.04, +0.00) = -540[+0.00, +0.04) = -440[+0.04, +0.08) = -808[+0.08, +0.12) = +1491[+0.12, +0.16) = +334[+0.16, +0.20) = -835[+0.20, +0.24) = -781[+0.24, +0.28) = +681[+0.28, +0.32) = -1126[+0.32, +0.36) = +98[+0.36, +0.40) = -1826[+0.40, +0.44) = -672[+0.44, +0.48) = +1668[+0.48, +0.52) = +1095[+0.52, +0.56) = +1495[+0.56, +0.60) = +1859[+0.60, +0.64) = -1702[+0.64, +0.68) = +474[+0.68, +0.72) = +140[+0.72, +0.76) = -527[+0.76, +0.80) = -491[+0.80, +0.84) = -1765[+0.84, +0.88) = -959[+0.88, +0.92) = +171[+0.92, +0.96) = -23[+0.96, +1.00) = +1007
The code used for the last results:
float sfrand(){ unsigned int a; a = rand() << (24 - 15); // assumes atleast 15 bits returned by rand a ^= rand(); a &= 0xffffff; return (float(a) / float(0x7fffff)) - 1.0f;}
Setup used: Windows Vista Ultimate, MSVC 2008 Express, Intel Core 2 Quad @ 2.4 Ghz (Q6600) and lots of fast ram.
Edit: I was suspicious about the performance of the propsed version (when compared to the single mul) and when I checked in the debugger it had optimized the shift adds to a single mul!! Impressive!
Edit2: Replaced the results with the asm version, interestingly it isn't that much slower (I guess that the stats gathering takes some time as well).
The source used for the asm version:
float __forceinlinesfrand(){ unsigned int a; __asm { mov eax, seed mov ebx, seed shl eax, 14 mov ecx, seed shl ebx, 8 mov edx, seed shl ecx, 7 add eax, ebx shl edx, 5 mov ebx, seed add eax, ecx add eax, edx shl ebx, 2 mov ecx, seed add eax, ebx add ecx, ecx add eax, ecx add seed, eax }; a = (seed & 0x007fffff) | 0x40000000; return( *((float*)&a) - 3.0f );}
Used __forceinline since all other functions tested got inlined but the asm version didn't (by default).
[Edited by - eq on May 8, 2008 7:45:17 PM]