Sign in to follow this  
discman1028

Rand # generator w/o int multiplies?

Recommended Posts

Anybody know of a random float # generation scheme that doesn't use any integer multiplies of any kind? (and preferably doesn't use 64-bit int's either, but that's not as important.) Single-precision IEEE-754 float math (including multiplies), as well as integer adds, subtracts, and shifts are fair game. EDIT: I need an algorithm, not an implementation... my instruction set is limited beyond standard C. [Edited by - discman1028 on May 7, 2008 10:27:08 PM]

Share this post


Link to post
Share on other sites
I don't think I quite understand, but this is what I came up with:


#include <cstdlib>

template<class float_type>
inline float_type random()
{
return static_cast<float_type>( std::rand() ) / static_cast<float_type>( RAND_MAX );
}

double x = random<double>();
float y = random<float>();





Share this post


Link to post
Share on other sites
Quote:
Original post by _fastcall
I don't think I quite understand, but this is what I came up with:

*** Source Snippet Removed ***


A little more low-level... my restrictions pertain to what's inside of rand(). ;) I need an algorithm, not an implementation... my instruction set is limited beyond standard C.

Share this post


Link to post
Share on other sites
Quote:
Original post by discman1028
Hmm, might be able to do this, and use float multiplies for the int multiplies, as long as I make sure it wraps after overflow...


...except the float multiplies don't match the int multiplies due to precision loss... only 23 bits for the "int" mantissa. Maybe the distribution would still be random enough for me... who knows?

But I still have to figure out how to deal with "int" overflow for the float multiply. I can check the float for an INF hex value, and branch based on that, but I couldn't know what to reset the float to, to continue getting random numbers.

Share this post


Link to post
Share on other sites
Quote:
Original post by _fastcall
Quote:
Original post by discman1028
Hmm, might be able to do this,


That code runs 4x faster than my version :)


Glad I could help. ;)

From searches, it seems that the integer multiply (or possibly more specifically, the integer overflow behavior) is absolutely essentially to random number generation...? Has no one on earth ever done it without integer multiplies (in the absence of huge tables)?

Share this post


Link to post
Share on other sites
Standard linear congruent random number generators multiply by a fixed constant, so you can replace the multiplication by shifts and adds (this still relies on normal integer overflow though). Another simple generator without multiplication would be an LFSR.

What are your limitations exactly? If you're trying to generate random numbers solely with the use floating point arithmetic (say from within a shader or whatever) and without any bit hacks then I don't know of any useful methods without table lookups.

Share this post


Link to post
Share on other sites
Depending on how random you need the results you could try this:


static float seed = 0.5f;
void srandom(float f)
{
if (seed <= 0.0f || seed >= 1.0f) return; // Invalid seed
seed = f;
}
float random(void)
{
seed = 3.8f * seed * (1.0f - seed);
return seed;
}


The results appear reasonably random to me, but you'd need to do some tests to be sure it's random enough for your needs.



There's also the following trick to convert int to float quickly, which you may find useful. It's from www.agner.org/optimize


int n = random_number();
float x;
*(int*)&x = (n & 0x7FFFFF) | 0x3F800000; // Now 1.0 <= x < 2.0
x -= 1.0f;

Share this post


Link to post
Share on other sites
Quote:
Original post by Adam_42

float random(void)
{
seed = 3.8f * seed * (1.0f - seed);
return seed;
}


The results appear reasonably random to me, but you'd need to do some tests to be sure it's random enough for your needs.


Cool will give it a shot. Will give LFSR a shot too.

Quote:
Original post by implicit
What are your limitations exactly? If you're trying to generate random numbers solely with the use floating point arithmetic (say from within a shader or whatever) and without any bit hacks then I don't know of any useful methods without table lookups.


Do you have a table-based method? Depending on the size of the table, that could do. No, it's not for a shader, but it is for a SIMD unit with very little SIMD integer support.

[Edited by - discman1028 on May 8, 2008 1:04:08 PM]

Share this post


Link to post
Share on other sites
Quote:
Original post by discman1028
Quote:
Original post by Adam_42

float random(void)
{
seed = 3.8f * seed * (1.0f - seed);
return seed;
}


The results appear reasonably random to me, but you'd need to do some tests to be sure it's random enough for your needs.


Cool will give it a shot.


Looks good at first glance, but recording the distribution of 100000 rand's... not so good.

[0.000000f, 0.100000f): 0
[0.100000f, 0.200000f): 4636
[0.200000f, 0.300000f): 11143
[0.300000f, 0.400000f): 6681
[0.400000f, 0.500000f): 5905
[0.500000f, 0.600000f): 10209
[0.600000f, 0.700000f): 12165
[0.700000f, 0.800000f): 14549
[0.800000f, 0.900000f): 16087
[0.900000f, 1.000000f): 18625
[1.000000f, ...): 0

Share this post


Link to post
Share on other sites
Quote:
Original post by discman1028
I need an algorithm, not an implementation... my instruction set is limited beyond standard C.

Sound interesting. Can you elaborate? Are there other constraints, or why can't you implement multiplication with additions and shifts?

Share this post


Link to post
Share on other sites
Quote:
Original post by SnotBob
Quote:
Original post by discman1028
I need an algorithm, not an implementation... my instruction set is limited beyond standard C.

Sound interesting. Can you elaborate? Are there other constraints, or why can't you implement multiplication with additions and shifts?


-- I have int additions and shifts (and xor/or/and... almost all but multiply), but I am not sure how to handle overflow. Addition overflows correctly, and I do have shift as well as rotate instructions... but not sure how to implement an overflowing multiply using those.

-- I am currently trying Multiple-Bit Leap Forward LFSR. Will see where that goes...

-- If anyone can even come up with a random integer given my constraints, I can likely massage it into a float representation...

Share this post


Link to post
Share on other sites
Quote:
Original post by discman1028
-- I have int additions and shifts (and xor/or/and... almost all but multiply), but I am not sure how to handle overflow. Addition overflows correctly, and I do have shift as well as rotate instructions... but not sure how to implement an overflowing multiply using those.

Let's say you wanted to implement Mersenne Twister, you'd need multiplication and modulo. The multiplication is actually modulo 2^32, so you wouldn't need to handle overflows in any particular way, you'd just need the least 32-bits. Implementing the modulo is no harder than the multiplication.

Share this post


Link to post
Share on other sites
Quote:
Original post by discman1028
-- I have int additions and shifts (and xor/or/and... almost all but multiply), but I am not sure how to handle overflow. Addition overflows correctly, and I do have shift as well as rotate instructions... but not sure how to implement an overflowing multiply using those.
Assuming that the bitwise operators work on the actual floating point bit patterns rather then you could simply use the normal linear congruent method, then bias the value up to a fixed exponent by adding a power of two guaranteed to be larger than the intermediate result, masking off the high bits of the mantissa with an AND and finally subtracting the bias again afterwards.

Here's an 8-bit example (assuming 32-bit IEEE floats):
n = n * 121.0 + 73.0;
n += 65536.0;
n &= 0xffff00ff; // bitwise AND
n -= 65536.0;
Of course this won't give you much in the way of precision since you'll need twice the integer precision of the final value to fit the intermediate multiplication, so I guess it's mostly useful for 64-bit floats. Then again you could easily piece together two of these.

But I'm still guessing about which operations you've actually got. Doesn't that thing have a manual you can link to somewhere?

Share this post


Link to post
Share on other sites
Quote:
Original post by DanielH
No int multiplies? Almost sounds like programming a PIC, or some other 8-bit ucontroller. But it has floats? What is this thing?


Yeah I knew it would peak (pique?) curiosity. It has int and float units, but its SIMD unit doesn't have int multiplies. They were removed to save valuable silicon real estate.

Share this post


Link to post
Share on other sites
Quote:
Original post by discman1028
Hmm, might be able to do this, and use float multiplies for the int multiplies, as long as I make sure it wraps after overflow...


That page is pretty neat; would be great for generating random vectors and the like. Looks like it could even be vectorized.

Share this post


Link to post
Share on other sites
How about using the code below (based on this), does that fulfill the criterias?

static unsigned int seed = 1;
float
sfrand()
{
unsigned int a;
a = seed << 14; // seed * 16384
a += (seed << 8); // seed * (16384 + 256)
a += (seed << 7); // seed * (16384 + 256 + 128)
a += (seed << 5); // seed * (16384 + 256 + 128 + 32)
a += (seed << 2); // seed * (16384 + 256 + 128 + 32 + 4)
a += (seed << 1); // seed * (16384 + 256 + 128 + 32 + 4 + 2)
seed += a; // seed * (16384 + 256 + 128 + 32 + 4 + 2 + 1) = seed * 16807
a = (seed & 0x007fffff) | 0x40000000;
return( *((float*)&a) - 3.0f );
}



Below is the output generated using the (asm version, see edit at botom of post) of the code above:

Random numbers generated: 1000000000
Took: 14114.2 ms (70850586.5 numbers / second)
Average in each iterval: 20000000
Worst interval: 616 (0.003080% deviation)

[-1.00, -0.96) = +465
[-0.96, -0.92) = +425
[-0.92, -0.88) = -544
[-0.88, -0.84) = -506
[-0.84, -0.80) = +410
[-0.80, -0.76) = +443
[-0.76, -0.72) = -478
[-0.72, -0.68) = +502
[-0.68, -0.64) = -488
[-0.64, -0.60) = +603
[-0.60, -0.56) = -478
[-0.56, -0.52) = +488
[-0.52, -0.48) = -514
[-0.48, -0.44) = +480
[-0.44, -0.40) = -536
[-0.40, -0.36) = +616
[-0.36, -0.32) = -425
[-0.32, -0.28) = +522
[-0.28, -0.24) = -594
[-0.24, -0.20) = +444
[-0.20, -0.16) = -453
[-0.16, -0.12) = +498
[-0.12, -0.08) = +503
[-0.08, -0.04) = -407
[-0.04, +0.00) = -574
[+0.00, +0.04) = +463
[+0.04, +0.08) = +435
[+0.08, +0.12) = -526
[+0.12, +0.16) = -386
[+0.16, +0.20) = +433
[+0.20, +0.24) = +369
[+0.24, +0.28) = -493
[+0.28, +0.32) = -484
[+0.32, +0.36) = +470
[+0.36, +0.40) = +400
[+0.40, +0.44) = -413
[+0.44, +0.48) = +460
[+0.48, +0.52) = -436
[+0.52, +0.56) = +431
[+0.56, +0.60) = -408
[+0.60, +0.64) = +342
[+0.64, +0.68) = -532
[+0.68, +0.72) = +388
[+0.72, +0.76) = -468
[+0.76, +0.80) = +336
[+0.80, +0.84) = -532
[+0.84, +0.88) = +479
[+0.88, +0.92) = +326
[+0.92, +0.96) = -604
[+0.96, +1.00) = -452



The output of the original algorithm using the single multiply:

Random numbers generated: 1000000000
Took: 13766.4 ms (72640427.8 numbers / second)
Average in each iterval: 20000000
Worst interval: 616 (0.003080% deviation)

[-1.00, -0.96) = +465
[-0.96, -0.92) = +425
[-0.92, -0.88) = -544
[-0.88, -0.84) = -506
[-0.84, -0.80) = +410
[-0.80, -0.76) = +443
[-0.76, -0.72) = -478
[-0.72, -0.68) = +502
[-0.68, -0.64) = -488
[-0.64, -0.60) = +603
[-0.60, -0.56) = -478
[-0.56, -0.52) = +488
[-0.52, -0.48) = -514
[-0.48, -0.44) = +480
[-0.44, -0.40) = -536
[-0.40, -0.36) = +616
[-0.36, -0.32) = -425
[-0.32, -0.28) = +522
[-0.28, -0.24) = -594
[-0.24, -0.20) = +444
[-0.20, -0.16) = -453
[-0.16, -0.12) = +498
[-0.12, -0.08) = +503
[-0.08, -0.04) = -407
[-0.04, +0.00) = -574
[+0.00, +0.04) = +463
[+0.04, +0.08) = +435
[+0.08, +0.12) = -526
[+0.12, +0.16) = -386
[+0.16, +0.20) = +433
[+0.20, +0.24) = +369
[+0.24, +0.28) = -493
[+0.28, +0.32) = -484
[+0.32, +0.36) = +470
[+0.36, +0.40) = +400
[+0.40, +0.44) = -413
[+0.44, +0.48) = +460
[+0.48, +0.52) = -436
[+0.52, +0.56) = +431
[+0.56, +0.60) = -408
[+0.60, +0.64) = +342
[+0.64, +0.68) = -532
[+0.68, +0.72) = +388
[+0.72, +0.76) = -468
[+0.76, +0.80) = +336
[+0.80, +0.84) = -532
[+0.84, +0.88) = +479
[+0.88, +0.92) = +326
[+0.92, +0.96) = -604
[+0.96, +1.00) = -452



The output of a version using rand():

Random numbers generated: 1000000000
Took: 74000.3 ms (13513455.6 numbers / second)
Average in each iterval: 20000000
Worst interval: 2982 (0.014910% deviation)

[-1.00, -0.96) = +1454
[-0.96, -0.92) = -119
[-0.92, -0.88) = -127
[-0.88, -0.84) = -349
[-0.84, -0.80) = +958
[-0.80, -0.76) = +209
[-0.76, -0.72) = +835
[-0.72, -0.68) = +588
[-0.68, -0.64) = +1911
[-0.64, -0.60) = +1355
[-0.60, -0.56) = -1675
[-0.56, -0.52) = +516
[-0.52, -0.48) = -324
[-0.48, -0.44) = +1640
[-0.44, -0.40) = -197
[-0.40, -0.36) = +1498
[-0.36, -0.32) = -1362
[-0.32, -0.28) = -2007
[-0.28, -0.24) = +1828
[-0.24, -0.20) = -613
[-0.20, -0.16) = -1420
[-0.16, -0.12) = -2982
[-0.12, -0.08) = -378
[-0.08, -0.04) = +625
[-0.04, +0.00) = -540
[+0.00, +0.04) = -440
[+0.04, +0.08) = -808
[+0.08, +0.12) = +1491
[+0.12, +0.16) = +334
[+0.16, +0.20) = -835
[+0.20, +0.24) = -781
[+0.24, +0.28) = +681
[+0.28, +0.32) = -1126
[+0.32, +0.36) = +98
[+0.36, +0.40) = -1826
[+0.40, +0.44) = -672
[+0.44, +0.48) = +1668
[+0.48, +0.52) = +1095
[+0.52, +0.56) = +1495
[+0.56, +0.60) = +1859
[+0.60, +0.64) = -1702
[+0.64, +0.68) = +474
[+0.68, +0.72) = +140
[+0.72, +0.76) = -527
[+0.76, +0.80) = -491
[+0.80, +0.84) = -1765
[+0.84, +0.88) = -959
[+0.88, +0.92) = +171
[+0.92, +0.96) = -23
[+0.96, +1.00) = +1007



The code used for the last results:

float
sfrand()
{
unsigned int a;
a = rand() << (24 - 15); // assumes atleast 15 bits returned by rand
a ^= rand();
a &= 0xffffff;
return (float(a) / float(0x7fffff)) - 1.0f;
}



Setup used: Windows Vista Ultimate, MSVC 2008 Express, Intel Core 2 Quad @ 2.4 Ghz (Q6600) and lots of fast ram.

Edit: I was suspicious about the performance of the propsed version (when compared to the single mul) and when I checked in the debugger it had optimized the shift adds to a single mul!! Impressive!

Edit2: Replaced the results with the asm version, interestingly it isn't that much slower (I guess that the stats gathering takes some time as well).
The source used for the asm version:

float __forceinline
sfrand()
{
unsigned int a;
__asm
{
mov eax, seed
mov ebx, seed
shl eax, 14
mov ecx, seed
shl ebx, 8
mov edx, seed
shl ecx, 7
add eax, ebx
shl edx, 5
mov ebx, seed
add eax, ecx
add eax, edx
shl ebx, 2
mov ecx, seed
add eax, ebx
add ecx, ecx
add eax, ecx
add seed, eax
};
a = (seed & 0x007fffff) | 0x40000000;
return( *((float*)&a) - 3.0f );
}



Used __forceinline since all other functions tested got inlined but the asm version didn't (by default).

[Edited by - eq on May 8, 2008 7:45:17 PM]

Share this post


Link to post
Share on other sites
Quote:
Original post by Ravyne
That page is pretty neat; would be great for generating random vectors and the like. Looks like it could even be vectorized.


Yep, definitely could if I had integer multiplies in my SIMD hardware, like any typical SIMD unit would normally have... SSE, Altivec...

Quote:
Original post by eq
How about using the code below (based on this), does that fulfill the criterias?

unsigned int a;
a = seed << 14; // seed * 16384
a += (seed << 8); // seed * (16384 + 256)
a += (seed << 7); // seed * (16384 + 256 + 128)
a += (seed << 5); // seed * (16384 + 256 + 128 + 32)
a += (seed << 2); // seed * (16384 + 256 + 128 + 32 + 4)
a += (seed << 1); // seed * (16384 + 256 + 128 + 32 + 4 + 2)
seed += a; // seed * (16384 + 256 + 128 + 32 + 4 + 2 + 1) = seed * 16807



It almost does... but any of those shifts (seed << N)... i.e.. the multiply's by 2^N... could overflow, i.e. get higher than UINT_MAX. Addition overflows nicely, but the shifts just discard bits.

Share this post


Link to post
Share on other sites
Quote:

It almost does... but any of those shifts (seed << N)... i.e.. the multiply's by 2^N... could overflow, i.e. get higher than UINT_MAX. Addition overflows nicely, but the shifts just discard bits.


Maybe I'm not following you correctly but in the version I posted the shifts are supposed to discard bits:
0x12345678 << 12, should become 0x45678000.

Share this post


Link to post
Share on other sites
Quote:
Original post by eq
Quote:

It almost does... but any of those shifts (seed << N)... i.e.. the multiply's by 2^N... could overflow, i.e. get higher than UINT_MAX. Addition overflows nicely, but the shifts just discard bits.


Maybe I'm not following you correctly but in the version I posted the shifts are supposed to discard bits:
0x12345678 << 12, should become 0x45678000.


I can see now that I overlooked your logic! This looks good, I'll compare speed + distribution with another method I am going with. Thanks!

Share this post


Link to post
Share on other sites
Quote:
Original post by discman1028
I can see now that I overlooked your logic! This looks good, I'll compare speed + distribution with another method I am going with. Thanks!


Actually... I think it still will not work. Say your seed (at some point) is 0x80000000. I believe "seed" will then become zero forever, as all the shift results will be zero.

I guess if you made 'seed' 64-bit, you might be able to handle it better... but I can't do 64-bit SIMD integer anythings on this architecture.

Share this post


Link to post
Share on other sites
Quote:
Original post by discman1028

Yep, definitely could if I had integer multiplies in my SIMD hardware, like any typical SIMD unit would normally have... SSE, Altivec...



Is this SIMD hardware a Cell by any chance? I know the SPEs in Cell have only 16-bit integer multiplies. Freaking lame.

Share this post


Link to post
Share on other sites
Quote:
Original post by deathkrush
Is this SIMD hardware a Cell by any chance? I know the SPEs in Cell have only 16-bit integer multiplies. Freaking lame.


Nah, it's some other "modified-Altivec-unit" hardware. So it's Altivec, basically minus the int multiples. :-/

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this