Rand # generator w/o int multiplies?

Started by
31 comments, last by discman1028 15 years, 11 months ago
Quote:Original post by discman1028
Quote:Original post by Adam_42
float random(void){  seed = 3.8f * seed * (1.0f - seed);  return seed;}


The results appear reasonably random to me, but you'd need to do some tests to be sure it's random enough for your needs.


Cool will give it a shot.


Looks good at first glance, but recording the distribution of 100000 rand's... not so good.

[0.000000f, 0.100000f): 0
[0.100000f, 0.200000f): 4636
[0.200000f, 0.300000f): 11143
[0.300000f, 0.400000f): 6681
[0.400000f, 0.500000f): 5905
[0.500000f, 0.600000f): 10209
[0.600000f, 0.700000f): 12165
[0.700000f, 0.800000f): 14549
[0.800000f, 0.900000f): 16087
[0.900000f, 1.000000f): 18625
[1.000000f, ...): 0
--== discman1028 ==--
Advertisement
Quote:Original post by discman1028
I need an algorithm, not an implementation... my instruction set is limited beyond standard C.

Sound interesting. Can you elaborate? Are there other constraints, or why can't you implement multiplication with additions and shifts?
Quote:Original post by SnotBob
Quote:Original post by discman1028
I need an algorithm, not an implementation... my instruction set is limited beyond standard C.

Sound interesting. Can you elaborate? Are there other constraints, or why can't you implement multiplication with additions and shifts?


-- I have int additions and shifts (and xor/or/and... almost all but multiply), but I am not sure how to handle overflow. Addition overflows correctly, and I do have shift as well as rotate instructions... but not sure how to implement an overflowing multiply using those.

-- I am currently trying Multiple-Bit Leap Forward LFSR. Will see where that goes...

-- If anyone can even come up with a random integer given my constraints, I can likely massage it into a float representation...
--== discman1028 ==--
Quote:Original post by discman1028
-- I have int additions and shifts (and xor/or/and... almost all but multiply), but I am not sure how to handle overflow. Addition overflows correctly, and I do have shift as well as rotate instructions... but not sure how to implement an overflowing multiply using those.

Let's say you wanted to implement Mersenne Twister, you'd need multiplication and modulo. The multiplication is actually modulo 2^32, so you wouldn't need to handle overflows in any particular way, you'd just need the least 32-bits. Implementing the modulo is no harder than the multiplication.
Quote:Original post by discman1028
-- I have int additions and shifts (and xor/or/and... almost all but multiply), but I am not sure how to handle overflow. Addition overflows correctly, and I do have shift as well as rotate instructions... but not sure how to implement an overflowing multiply using those.
Assuming that the bitwise operators work on the actual floating point bit patterns rather then you could simply use the normal linear congruent method, then bias the value up to a fixed exponent by adding a power of two guaranteed to be larger than the intermediate result, masking off the high bits of the mantissa with an AND and finally subtracting the bias again afterwards.

Here's an 8-bit example (assuming 32-bit IEEE floats):
n = n * 121.0 + 73.0;n += 65536.0;n &= 0xffff00ff; // bitwise ANDn -= 65536.0;
Of course this won't give you much in the way of precision since you'll need twice the integer precision of the final value to fit the intermediate multiplication, so I guess it's mostly useful for 64-bit floats. Then again you could easily piece together two of these.

But I'm still guessing about which operations you've actually got. Doesn't that thing have a manual you can link to somewhere?
No int multiplies? Almost sounds like programming a PIC, or some other 8-bit ucontroller. But it has floats? What is this thing?
Quote:Original post by DanielH
No int multiplies? Almost sounds like programming a PIC, or some other 8-bit ucontroller. But it has floats? What is this thing?


Yeah I knew it would peak (pique?) curiosity. It has int and float units, but its SIMD unit doesn't have int multiplies. They were removed to save valuable silicon real estate.
--== discman1028 ==--
Quote:Original post by discman1028
Hmm, might be able to do this, and use float multiplies for the int multiplies, as long as I make sure it wraps after overflow...


That page is pretty neat; would be great for generating random vectors and the like. Looks like it could even be vectorized.

throw table_exception("(? ???)? ? ???");

How about using the code below (based on this), does that fulfill the criterias?
static unsigned int seed = 1;float sfrand(){	unsigned int a;	a = seed << 14;	//  seed * 16384	a += (seed << 8);	//  seed * (16384 + 256)	a += (seed << 7);	//  seed * (16384 + 256 + 128)	a += (seed << 5);	//  seed * (16384 + 256 + 128 + 32)	a += (seed << 2);	//  seed * (16384 + 256 + 128 + 32 + 4)	a += (seed << 1);	//  seed * (16384 + 256 + 128 + 32 + 4 + 2)	seed += a;	//  seed * (16384 + 256 + 128 + 32 + 4 + 2 + 1) = seed * 16807	a = (seed & 0x007fffff) | 0x40000000;	return( *((float*)&a) - 3.0f );}


Below is the output generated using the (asm version, see edit at botom of post) of the code above:
Random numbers generated: 1000000000Took: 14114.2 ms (70850586.5 numbers / second)Average in each iterval: 20000000Worst interval: 616 (0.003080% deviation)[-1.00, -0.96) = +465[-0.96, -0.92) = +425[-0.92, -0.88) = -544[-0.88, -0.84) = -506[-0.84, -0.80) = +410[-0.80, -0.76) = +443[-0.76, -0.72) = -478[-0.72, -0.68) = +502[-0.68, -0.64) = -488[-0.64, -0.60) = +603[-0.60, -0.56) = -478[-0.56, -0.52) = +488[-0.52, -0.48) = -514[-0.48, -0.44) = +480[-0.44, -0.40) = -536[-0.40, -0.36) = +616[-0.36, -0.32) = -425[-0.32, -0.28) = +522[-0.28, -0.24) = -594[-0.24, -0.20) = +444[-0.20, -0.16) = -453[-0.16, -0.12) = +498[-0.12, -0.08) = +503[-0.08, -0.04) = -407[-0.04, +0.00) = -574[+0.00, +0.04) = +463[+0.04, +0.08) = +435[+0.08, +0.12) = -526[+0.12, +0.16) = -386[+0.16, +0.20) = +433[+0.20, +0.24) = +369[+0.24, +0.28) = -493[+0.28, +0.32) = -484[+0.32, +0.36) = +470[+0.36, +0.40) = +400[+0.40, +0.44) = -413[+0.44, +0.48) = +460[+0.48, +0.52) = -436[+0.52, +0.56) = +431[+0.56, +0.60) = -408[+0.60, +0.64) = +342[+0.64, +0.68) = -532[+0.68, +0.72) = +388[+0.72, +0.76) = -468[+0.76, +0.80) = +336[+0.80, +0.84) = -532[+0.84, +0.88) = +479[+0.88, +0.92) = +326[+0.92, +0.96) = -604[+0.96, +1.00) = -452


The output of the original algorithm using the single multiply:
Random numbers generated: 1000000000Took: 13766.4 ms (72640427.8 numbers / second)Average in each iterval: 20000000Worst interval: 616 (0.003080% deviation)[-1.00, -0.96) = +465[-0.96, -0.92) = +425[-0.92, -0.88) = -544[-0.88, -0.84) = -506[-0.84, -0.80) = +410[-0.80, -0.76) = +443[-0.76, -0.72) = -478[-0.72, -0.68) = +502[-0.68, -0.64) = -488[-0.64, -0.60) = +603[-0.60, -0.56) = -478[-0.56, -0.52) = +488[-0.52, -0.48) = -514[-0.48, -0.44) = +480[-0.44, -0.40) = -536[-0.40, -0.36) = +616[-0.36, -0.32) = -425[-0.32, -0.28) = +522[-0.28, -0.24) = -594[-0.24, -0.20) = +444[-0.20, -0.16) = -453[-0.16, -0.12) = +498[-0.12, -0.08) = +503[-0.08, -0.04) = -407[-0.04, +0.00) = -574[+0.00, +0.04) = +463[+0.04, +0.08) = +435[+0.08, +0.12) = -526[+0.12, +0.16) = -386[+0.16, +0.20) = +433[+0.20, +0.24) = +369[+0.24, +0.28) = -493[+0.28, +0.32) = -484[+0.32, +0.36) = +470[+0.36, +0.40) = +400[+0.40, +0.44) = -413[+0.44, +0.48) = +460[+0.48, +0.52) = -436[+0.52, +0.56) = +431[+0.56, +0.60) = -408[+0.60, +0.64) = +342[+0.64, +0.68) = -532[+0.68, +0.72) = +388[+0.72, +0.76) = -468[+0.76, +0.80) = +336[+0.80, +0.84) = -532[+0.84, +0.88) = +479[+0.88, +0.92) = +326[+0.92, +0.96) = -604[+0.96, +1.00) = -452


The output of a version using rand():
Random numbers generated: 1000000000Took: 74000.3 ms (13513455.6 numbers / second)Average in each iterval: 20000000Worst interval: 2982 (0.014910% deviation)[-1.00, -0.96) = +1454[-0.96, -0.92) = -119[-0.92, -0.88) = -127[-0.88, -0.84) = -349[-0.84, -0.80) = +958[-0.80, -0.76) = +209[-0.76, -0.72) = +835[-0.72, -0.68) = +588[-0.68, -0.64) = +1911[-0.64, -0.60) = +1355[-0.60, -0.56) = -1675[-0.56, -0.52) = +516[-0.52, -0.48) = -324[-0.48, -0.44) = +1640[-0.44, -0.40) = -197[-0.40, -0.36) = +1498[-0.36, -0.32) = -1362[-0.32, -0.28) = -2007[-0.28, -0.24) = +1828[-0.24, -0.20) = -613[-0.20, -0.16) = -1420[-0.16, -0.12) = -2982[-0.12, -0.08) = -378[-0.08, -0.04) = +625[-0.04, +0.00) = -540[+0.00, +0.04) = -440[+0.04, +0.08) = -808[+0.08, +0.12) = +1491[+0.12, +0.16) = +334[+0.16, +0.20) = -835[+0.20, +0.24) = -781[+0.24, +0.28) = +681[+0.28, +0.32) = -1126[+0.32, +0.36) =  +98[+0.36, +0.40) = -1826[+0.40, +0.44) = -672[+0.44, +0.48) = +1668[+0.48, +0.52) = +1095[+0.52, +0.56) = +1495[+0.56, +0.60) = +1859[+0.60, +0.64) = -1702[+0.64, +0.68) = +474[+0.68, +0.72) = +140[+0.72, +0.76) = -527[+0.76, +0.80) = -491[+0.80, +0.84) = -1765[+0.84, +0.88) = -959[+0.88, +0.92) = +171[+0.92, +0.96) =  -23[+0.96, +1.00) = +1007


The code used for the last results:
float sfrand(){	unsigned int a;	a = rand() << (24 - 15); // assumes atleast 15 bits returned by rand	a ^= rand();	a &= 0xffffff;	return (float(a) / float(0x7fffff)) - 1.0f;}


Setup used: Windows Vista Ultimate, MSVC 2008 Express, Intel Core 2 Quad @ 2.4 Ghz (Q6600) and lots of fast ram.

Edit: I was suspicious about the performance of the propsed version (when compared to the single mul) and when I checked in the debugger it had optimized the shift adds to a single mul!! Impressive!

Edit2: Replaced the results with the asm version, interestingly it isn't that much slower (I guess that the stats gathering takes some time as well).
The source used for the asm version:
float __forceinlinesfrand(){	unsigned int a;	__asm	{		mov eax, seed		mov ebx, seed		shl eax, 14		mov ecx, seed		shl ebx, 8		mov edx, seed		shl ecx, 7		add eax, ebx		shl edx, 5		mov ebx, seed		add eax, ecx		add eax, edx		shl ebx, 2		mov ecx, seed		add eax, ebx		add ecx, ecx		add eax, ecx		add seed, eax	};	a = (seed & 0x007fffff) | 0x40000000;	return( *((float*)&a) - 3.0f );}

Used __forceinline since all other functions tested got inlined but the asm version didn't (by default).

[Edited by - eq on May 8, 2008 7:45:17 PM]
Quote:Original post by Ravyne
That page is pretty neat; would be great for generating random vectors and the like. Looks like it could even be vectorized.


Yep, definitely could if I had integer multiplies in my SIMD hardware, like any typical SIMD unit would normally have... SSE, Altivec...

Quote:Original post by eq
How about using the code below (based on this), does that fulfill the criterias?

unsigned int a;
a = seed << 14; // seed * 16384
a += (seed << 8); // seed * (16384 + 256)
a += (seed << 7); // seed * (16384 + 256 + 128)
a += (seed << 5); // seed * (16384 + 256 + 128 + 32)
a += (seed << 2); // seed * (16384 + 256 + 128 + 32 + 4)
a += (seed << 1); // seed * (16384 + 256 + 128 + 32 + 4 + 2)
seed += a; // seed * (16384 + 256 + 128 + 32 + 4 + 2 + 1) = seed * 16807



It almost does... but any of those shifts (seed << N)... i.e.. the multiply's by 2^N... could overflow, i.e. get higher than UINT_MAX. Addition overflows nicely, but the shifts just discard bits.
--== discman1028 ==--

This topic is closed to new replies.

Advertisement