Rand # generator w/o int multiplies?

This topic is 3561 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

Recommended Posts

Anybody know of a random float # generation scheme that doesn't use any integer multiplies of any kind? (and preferably doesn't use 64-bit int's either, but that's not as important.) Single-precision IEEE-754 float math (including multiplies), as well as integer adds, subtracts, and shifts are fair game. EDIT: I need an algorithm, not an implementation... my instruction set is limited beyond standard C. [Edited by - discman1028 on May 7, 2008 10:27:08 PM]

Share on other sites
I don't think I quite understand, but this is what I came up with:

#include <cstdlib> template<class float_type>inline float_type random() { 	return static_cast<float_type>( std::rand() ) / static_cast<float_type>( RAND_MAX );}double x = random<double>();float y = random<float>();

Share on other sites
Hmm, might be able to do this, and use float multiplies for the int multiplies, as long as I make sure it wraps after overflow...

Share on other sites
Quote:
 Original post by _fastcallI don't think I quite understand, but this is what I came up with:*** Source Snippet Removed ***

A little more low-level... my restrictions pertain to what's inside of rand(). ;) I need an algorithm, not an implementation... my instruction set is limited beyond standard C.

Share on other sites
Quote:
 Original post by discman1028Hmm, might be able to do this, and use float multiplies for the int multiplies, as long as I make sure it wraps after overflow...

...except the float multiplies don't match the int multiplies due to precision loss... only 23 bits for the "int" mantissa. Maybe the distribution would still be random enough for me... who knows?

But I still have to figure out how to deal with "int" overflow for the float multiply. I can check the float for an INF hex value, and branch based on that, but I couldn't know what to reset the float to, to continue getting random numbers.

Share on other sites
Quote:
 Original post by discman1028Hmm, might be able to do this,

That code runs 30x faster than my version :)

EDIT:
10,000,000 trials:
4x faster debug mode
30x faster release mode

Share on other sites
Quote:
Original post by _fastcall
Quote:
 Original post by discman1028Hmm, might be able to do this,

That code runs 4x faster than my version :)

From searches, it seems that the integer multiply (or possibly more specifically, the integer overflow behavior) is absolutely essentially to random number generation...? Has no one on earth ever done it without integer multiplies (in the absence of huge tables)?

Share on other sites
Standard linear congruent random number generators multiply by a fixed constant, so you can replace the multiplication by shifts and adds (this still relies on normal integer overflow though). Another simple generator without multiplication would be an LFSR.

What are your limitations exactly? If you're trying to generate random numbers solely with the use floating point arithmetic (say from within a shader or whatever) and without any bit hacks then I don't know of any useful methods without table lookups.

Share on other sites
Depending on how random you need the results you could try this:

static float seed = 0.5f;void srandom(float f){  if (seed <= 0.0f || seed >= 1.0f) return; // Invalid seed  seed = f;}float random(void){  seed = 3.8f * seed * (1.0f - seed);  return seed;}

The results appear reasonably random to me, but you'd need to do some tests to be sure it's random enough for your needs.

There's also the following trick to convert int to float quickly, which you may find useful. It's from www.agner.org/optimize

int n = random_number();float x;*(int*)&x = (n & 0x7FFFFF) | 0x3F800000; // Now 1.0 <= x < 2.0x -= 1.0f;

Share on other sites
Quote:
 Original post by Adam_42float random(void){ seed = 3.8f * seed * (1.0f - seed); return seed;}The results appear reasonably random to me, but you'd need to do some tests to be sure it's random enough for your needs.

Cool will give it a shot. Will give LFSR a shot too.

Quote:
 Original post by implicitWhat are your limitations exactly? If you're trying to generate random numbers solely with the use floating point arithmetic (say from within a shader or whatever) and without any bit hacks then I don't know of any useful methods without table lookups.

Do you have a table-based method? Depending on the size of the table, that could do. No, it's not for a shader, but it is for a SIMD unit with very little SIMD integer support.

[Edited by - discman1028 on May 8, 2008 1:04:08 PM]

Share on other sites
Quote:
Original post by discman1028
Quote:
 Original post by Adam_42float random(void){ seed = 3.8f * seed * (1.0f - seed); return seed;}The results appear reasonably random to me, but you'd need to do some tests to be sure it's random enough for your needs.

Cool will give it a shot.

Looks good at first glance, but recording the distribution of 100000 rand's... not so good.

[0.000000f, 0.100000f): 0
[0.100000f, 0.200000f): 4636
[0.200000f, 0.300000f): 11143
[0.300000f, 0.400000f): 6681
[0.400000f, 0.500000f): 5905
[0.500000f, 0.600000f): 10209
[0.600000f, 0.700000f): 12165
[0.700000f, 0.800000f): 14549
[0.800000f, 0.900000f): 16087
[0.900000f, 1.000000f): 18625
[1.000000f, ...): 0

Share on other sites
Quote:
 Original post by discman1028I need an algorithm, not an implementation... my instruction set is limited beyond standard C.

Sound interesting. Can you elaborate? Are there other constraints, or why can't you implement multiplication with additions and shifts?

Share on other sites
Quote:
Original post by SnotBob
Quote:
 Original post by discman1028I need an algorithm, not an implementation... my instruction set is limited beyond standard C.

Sound interesting. Can you elaborate? Are there other constraints, or why can't you implement multiplication with additions and shifts?

-- I have int additions and shifts (and xor/or/and... almost all but multiply), but I am not sure how to handle overflow. Addition overflows correctly, and I do have shift as well as rotate instructions... but not sure how to implement an overflowing multiply using those.

-- I am currently trying Multiple-Bit Leap Forward LFSR. Will see where that goes...

-- If anyone can even come up with a random integer given my constraints, I can likely massage it into a float representation...

Share on other sites
Quote:
 Original post by discman1028-- I have int additions and shifts (and xor/or/and... almost all but multiply), but I am not sure how to handle overflow. Addition overflows correctly, and I do have shift as well as rotate instructions... but not sure how to implement an overflowing multiply using those.

Let's say you wanted to implement Mersenne Twister, you'd need multiplication and modulo. The multiplication is actually modulo 2^32, so you wouldn't need to handle overflows in any particular way, you'd just need the least 32-bits. Implementing the modulo is no harder than the multiplication.

Share on other sites
Quote:
 Original post by discman1028-- I have int additions and shifts (and xor/or/and... almost all but multiply), but I am not sure how to handle overflow. Addition overflows correctly, and I do have shift as well as rotate instructions... but not sure how to implement an overflowing multiply using those.
Assuming that the bitwise operators work on the actual floating point bit patterns rather then you could simply use the normal linear congruent method, then bias the value up to a fixed exponent by adding a power of two guaranteed to be larger than the intermediate result, masking off the high bits of the mantissa with an AND and finally subtracting the bias again afterwards.

Here's an 8-bit example (assuming 32-bit IEEE floats):
n = n * 121.0 + 73.0;n += 65536.0;n &= 0xffff00ff; // bitwise ANDn -= 65536.0;
Of course this won't give you much in the way of precision since you'll need twice the integer precision of the final value to fit the intermediate multiplication, so I guess it's mostly useful for 64-bit floats. Then again you could easily piece together two of these.

But I'm still guessing about which operations you've actually got. Doesn't that thing have a manual you can link to somewhere?

Share on other sites
No int multiplies? Almost sounds like programming a PIC, or some other 8-bit ucontroller. But it has floats? What is this thing?

Share on other sites
Quote:
 Original post by DanielHNo int multiplies? Almost sounds like programming a PIC, or some other 8-bit ucontroller. But it has floats? What is this thing?

Yeah I knew it would peak (pique?) curiosity. It has int and float units, but its SIMD unit doesn't have int multiplies. They were removed to save valuable silicon real estate.

Share on other sites
Quote:
 Original post by discman1028Hmm, might be able to do this, and use float multiplies for the int multiplies, as long as I make sure it wraps after overflow...

That page is pretty neat; would be great for generating random vectors and the like. Looks like it could even be vectorized.

Share on other sites
How about using the code below (based on this), does that fulfill the criterias?
static unsigned int seed = 1;float sfrand(){	unsigned int a;	a = seed << 14;	//  seed * 16384	a += (seed << 8);	//  seed * (16384 + 256)	a += (seed << 7);	//  seed * (16384 + 256 + 128)	a += (seed << 5);	//  seed * (16384 + 256 + 128 + 32)	a += (seed << 2);	//  seed * (16384 + 256 + 128 + 32 + 4)	a += (seed << 1);	//  seed * (16384 + 256 + 128 + 32 + 4 + 2)	seed += a;	//  seed * (16384 + 256 + 128 + 32 + 4 + 2 + 1) = seed * 16807	a = (seed & 0x007fffff) | 0x40000000;	return( *((float*)&a) - 3.0f );}

Below is the output generated using the (asm version, see edit at botom of post) of the code above:
Random numbers generated: 1000000000Took: 14114.2 ms (70850586.5 numbers / second)Average in each iterval: 20000000Worst interval: 616 (0.003080% deviation)[-1.00, -0.96) = +465[-0.96, -0.92) = +425[-0.92, -0.88) = -544[-0.88, -0.84) = -506[-0.84, -0.80) = +410[-0.80, -0.76) = +443[-0.76, -0.72) = -478[-0.72, -0.68) = +502[-0.68, -0.64) = -488[-0.64, -0.60) = +603[-0.60, -0.56) = -478[-0.56, -0.52) = +488[-0.52, -0.48) = -514[-0.48, -0.44) = +480[-0.44, -0.40) = -536[-0.40, -0.36) = +616[-0.36, -0.32) = -425[-0.32, -0.28) = +522[-0.28, -0.24) = -594[-0.24, -0.20) = +444[-0.20, -0.16) = -453[-0.16, -0.12) = +498[-0.12, -0.08) = +503[-0.08, -0.04) = -407[-0.04, +0.00) = -574[+0.00, +0.04) = +463[+0.04, +0.08) = +435[+0.08, +0.12) = -526[+0.12, +0.16) = -386[+0.16, +0.20) = +433[+0.20, +0.24) = +369[+0.24, +0.28) = -493[+0.28, +0.32) = -484[+0.32, +0.36) = +470[+0.36, +0.40) = +400[+0.40, +0.44) = -413[+0.44, +0.48) = +460[+0.48, +0.52) = -436[+0.52, +0.56) = +431[+0.56, +0.60) = -408[+0.60, +0.64) = +342[+0.64, +0.68) = -532[+0.68, +0.72) = +388[+0.72, +0.76) = -468[+0.76, +0.80) = +336[+0.80, +0.84) = -532[+0.84, +0.88) = +479[+0.88, +0.92) = +326[+0.92, +0.96) = -604[+0.96, +1.00) = -452

The output of the original algorithm using the single multiply:
Random numbers generated: 1000000000Took: 13766.4 ms (72640427.8 numbers / second)Average in each iterval: 20000000Worst interval: 616 (0.003080% deviation)[-1.00, -0.96) = +465[-0.96, -0.92) = +425[-0.92, -0.88) = -544[-0.88, -0.84) = -506[-0.84, -0.80) = +410[-0.80, -0.76) = +443[-0.76, -0.72) = -478[-0.72, -0.68) = +502[-0.68, -0.64) = -488[-0.64, -0.60) = +603[-0.60, -0.56) = -478[-0.56, -0.52) = +488[-0.52, -0.48) = -514[-0.48, -0.44) = +480[-0.44, -0.40) = -536[-0.40, -0.36) = +616[-0.36, -0.32) = -425[-0.32, -0.28) = +522[-0.28, -0.24) = -594[-0.24, -0.20) = +444[-0.20, -0.16) = -453[-0.16, -0.12) = +498[-0.12, -0.08) = +503[-0.08, -0.04) = -407[-0.04, +0.00) = -574[+0.00, +0.04) = +463[+0.04, +0.08) = +435[+0.08, +0.12) = -526[+0.12, +0.16) = -386[+0.16, +0.20) = +433[+0.20, +0.24) = +369[+0.24, +0.28) = -493[+0.28, +0.32) = -484[+0.32, +0.36) = +470[+0.36, +0.40) = +400[+0.40, +0.44) = -413[+0.44, +0.48) = +460[+0.48, +0.52) = -436[+0.52, +0.56) = +431[+0.56, +0.60) = -408[+0.60, +0.64) = +342[+0.64, +0.68) = -532[+0.68, +0.72) = +388[+0.72, +0.76) = -468[+0.76, +0.80) = +336[+0.80, +0.84) = -532[+0.84, +0.88) = +479[+0.88, +0.92) = +326[+0.92, +0.96) = -604[+0.96, +1.00) = -452

The output of a version using rand():
Random numbers generated: 1000000000Took: 74000.3 ms (13513455.6 numbers / second)Average in each iterval: 20000000Worst interval: 2982 (0.014910% deviation)[-1.00, -0.96) = +1454[-0.96, -0.92) = -119[-0.92, -0.88) = -127[-0.88, -0.84) = -349[-0.84, -0.80) = +958[-0.80, -0.76) = +209[-0.76, -0.72) = +835[-0.72, -0.68) = +588[-0.68, -0.64) = +1911[-0.64, -0.60) = +1355[-0.60, -0.56) = -1675[-0.56, -0.52) = +516[-0.52, -0.48) = -324[-0.48, -0.44) = +1640[-0.44, -0.40) = -197[-0.40, -0.36) = +1498[-0.36, -0.32) = -1362[-0.32, -0.28) = -2007[-0.28, -0.24) = +1828[-0.24, -0.20) = -613[-0.20, -0.16) = -1420[-0.16, -0.12) = -2982[-0.12, -0.08) = -378[-0.08, -0.04) = +625[-0.04, +0.00) = -540[+0.00, +0.04) = -440[+0.04, +0.08) = -808[+0.08, +0.12) = +1491[+0.12, +0.16) = +334[+0.16, +0.20) = -835[+0.20, +0.24) = -781[+0.24, +0.28) = +681[+0.28, +0.32) = -1126[+0.32, +0.36) =  +98[+0.36, +0.40) = -1826[+0.40, +0.44) = -672[+0.44, +0.48) = +1668[+0.48, +0.52) = +1095[+0.52, +0.56) = +1495[+0.56, +0.60) = +1859[+0.60, +0.64) = -1702[+0.64, +0.68) = +474[+0.68, +0.72) = +140[+0.72, +0.76) = -527[+0.76, +0.80) = -491[+0.80, +0.84) = -1765[+0.84, +0.88) = -959[+0.88, +0.92) = +171[+0.92, +0.96) =  -23[+0.96, +1.00) = +1007

The code used for the last results:
float sfrand(){	unsigned int a;	a = rand() << (24 - 15); // assumes atleast 15 bits returned by rand	a ^= rand();	a &= 0xffffff;	return (float(a) / float(0x7fffff)) - 1.0f;}

Setup used: Windows Vista Ultimate, MSVC 2008 Express, Intel Core 2 Quad @ 2.4 Ghz (Q6600) and lots of fast ram.

Edit: I was suspicious about the performance of the propsed version (when compared to the single mul) and when I checked in the debugger it had optimized the shift adds to a single mul!! Impressive!

Edit2: Replaced the results with the asm version, interestingly it isn't that much slower (I guess that the stats gathering takes some time as well).
The source used for the asm version:
float __forceinlinesfrand(){	unsigned int a;	__asm	{		mov eax, seed		mov ebx, seed		shl eax, 14		mov ecx, seed		shl ebx, 8		mov edx, seed		shl ecx, 7		add eax, ebx		shl edx, 5		mov ebx, seed		add eax, ecx		add eax, edx		shl ebx, 2		mov ecx, seed		add eax, ebx		add ecx, ecx		add eax, ecx		add seed, eax	};	a = (seed & 0x007fffff) | 0x40000000;	return( *((float*)&a) - 3.0f );}

Used __forceinline since all other functions tested got inlined but the asm version didn't (by default).

[Edited by - eq on May 8, 2008 7:45:17 PM]

Share on other sites
Quote:
 Original post by RavyneThat page is pretty neat; would be great for generating random vectors and the like. Looks like it could even be vectorized.

Yep, definitely could if I had integer multiplies in my SIMD hardware, like any typical SIMD unit would normally have... SSE, Altivec...

Quote:
 Original post by eqHow about using the code below (based on this), does that fulfill the criterias? unsigned int a; a = seed << 14; // seed * 16384 a += (seed << 8); // seed * (16384 + 256) a += (seed << 7); // seed * (16384 + 256 + 128) a += (seed << 5); // seed * (16384 + 256 + 128 + 32) a += (seed << 2); // seed * (16384 + 256 + 128 + 32 + 4) a += (seed << 1); // seed * (16384 + 256 + 128 + 32 + 4 + 2) seed += a; // seed * (16384 + 256 + 128 + 32 + 4 + 2 + 1) = seed * 16807

It almost does... but any of those shifts (seed << N)... i.e.. the multiply's by 2^N... could overflow, i.e. get higher than UINT_MAX. Addition overflows nicely, but the shifts just discard bits.

Share on other sites
Quote:
 It almost does... but any of those shifts (seed << N)... i.e.. the multiply's by 2^N... could overflow, i.e. get higher than UINT_MAX. Addition overflows nicely, but the shifts just discard bits.

Maybe I'm not following you correctly but in the version I posted the shifts are supposed to discard bits:
0x12345678 << 12, should become 0x45678000.

Share on other sites
Quote:
Original post by eq
Quote:
 It almost does... but any of those shifts (seed << N)... i.e.. the multiply's by 2^N... could overflow, i.e. get higher than UINT_MAX. Addition overflows nicely, but the shifts just discard bits.

Maybe I'm not following you correctly but in the version I posted the shifts are supposed to discard bits:
0x12345678 << 12, should become 0x45678000.

I can see now that I overlooked your logic! This looks good, I'll compare speed + distribution with another method I am going with. Thanks!

Share on other sites
Quote:
 Original post by discman1028I can see now that I overlooked your logic! This looks good, I'll compare speed + distribution with another method I am going with. Thanks!

Actually... I think it still will not work. Say your seed (at some point) is 0x80000000. I believe "seed" will then become zero forever, as all the shift results will be zero.

I guess if you made 'seed' 64-bit, you might be able to handle it better... but I can't do 64-bit SIMD integer anythings on this architecture.

Share on other sites
Quote:
 Original post by discman1028Yep, definitely could if I had integer multiplies in my SIMD hardware, like any typical SIMD unit would normally have... SSE, Altivec...

Is this SIMD hardware a Cell by any chance? I know the SPEs in Cell have only 16-bit integer multiplies. Freaking lame.

Share on other sites
Quote:
 Original post by deathkrushIs this SIMD hardware a Cell by any chance? I know the SPEs in Cell have only 16-bit integer multiplies. Freaking lame.

Nah, it's some other "modified-Altivec-unit" hardware. So it's Altivec, basically minus the int multiples. :-/