• Advertisement
Sign in to follow this  

Random Numbers Distribution

This topic is 3632 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I was wondering if this scheme of random numbers has been studied in the academics. So here goes: Let p(x) = rand(0 to 1), for x > 0. Sequence is {ti | integral(p(x), x=0..t) mod 1 = 0} In english, the random distribution is accumulated, and for every integer step reached, a number in the sequence is generated. Perhaps am reaching for my left ear with my right hand, in that case I'd like to know the shortcut.

Share this post


Link to post
Share on other sites
Advertisement
Can you give an example instead of making up your own notation?

Share this post


Link to post
Share on other sites
A discrete counter wise example for this formalization would look somethin like this:

pi = rand_integer(0 to 1000);
wi = sum of p from 0 to i;
for( int i = 0; i < arbitrary_value; i++ ) // arbitrary value
//usually doesn't exist since this algo would be implemented to run
//against a state accumulator in real time, so there's no halting
{
if( wi % 100 == 0 )
sequence.push_back( i );
}


[edit]
It's something like a car with a uniform random speed at each instant. I am recording the times it reaches equally spaced checkpoints.

Share this post


Link to post
Share on other sites
I haven't seen any studies on it, but i think the distribution will only be as random as its source. It should increase the prng period though, probably to the least common multiple of the interval and the actual prng period.

I quickly wrote this and clipped the numbers to 8 bits since i didn't know what range of numbers to expect. The distribution looked even (mean of 127, deviation of 64), but 7zip could compress it by ~1% which is a sign of a bias.


int main()
{
unsigned long long seed=1111111;
unsigned int max=2*15485863;
unsigned int current=0;
unsigned int sum=0;
unsigned int* track=new unsigned int[max];
for (int i=0;i<10000000 && current<max;i++)
{
if (!sum)
{
track[current++]=i;
}
sum=(sum+seed)%100;//100=the interval
seed=(seed*652416ULL+72719ULL)%15485863ULL;//lcg for random numbers
}
FILE* out=fopen("data.txt","wb");
for (int i=0;i<current;i++)
{
putc(track&0xff,out);
}
fclose(out);
system("pause");
return 0;
}


Share this post


Link to post
Share on other sites
so this was my attempt to test what was going on
seems to be pretty linear

int main()
{
int sum = 0;
ofstream fl( "output.txt" );
vector<int> sequence;
for( int i = 0; i < 1000000; i++ )
{
sum += rand() % 2;
if( sum > 1000 )
{
sum -= 1000;
sequence.push_back( i );
fl << i << endl;
}
}
return 0;
}



Just point plot them in an excel sheet and you'll know what i mean.

Share this post


Link to post
Share on other sites
I'll summarize your last code snippet.

Let X(t) ∈ {0,1} be a random number sequence which we'll assume uniform, independent and identically distributed. You define a random number sequence T(i) such that the sum X(T(i)+1) + ... + X(T(i+1)) = 1000. You wonder what the properties of T(i) are.

If T(i-1) = t, the probability that T(i) = t + N is the probability that X(t+1) + ... + X(t+N) = 1000, which is related to a Binomial distribution: P(N = n) = P(B(n,0.5) = 1000)). Try plotting that distribution.

Share this post


Link to post
Share on other sites
Assuming rand() is evenly distributed %2, that code is just a complicated way of drawing a line. You said you could see this in excel though, so i don't know what you're looking for.

Share this post


Link to post
Share on other sites
I think the distance between numbers produced by your code snippet follows a displaced Pascal distribution, which gives the probability of having k failures before r successes with a probability of success p. In the code snippet, r is 1000 and p is .5, and the pascal distribution gives the odds of k failures happening before the number of successes. You'd add 1000 to k to get the total number of trials. With such a high value of r, the distribution very closely approximates a normal distribution.

For the continuous case, the number of successes in a given interval follows a Poisson distribution, and the intervals of time between k successes follows a Erlang distribution. Once again, because k is so high, the distribution approximates a normal distribution.

For the case in your original post, where the random variable is uniformly distributed, I have no idea. I'd guess that the variability would be less than in the poisson case.

[Edited by - Vorpy on March 16, 2008 5:01:36 PM]

Share this post


Link to post
Share on other sites
Great enough, I had a feeling it had something to do with Poisson that I took in my probability-ish course.

@ToohrVyk
Would you please recheck what you wrote before I plunge in and try to understand it [smile], that X(T(i)+1) looks just a bit weird.

Share this post


Link to post
Share on other sites
Quote:
Original post by arithma
@ToohrVyk
Would you please recheck what you wrote before I plunge in and try to understand it [smile], that X(T(i)+1) looks just a bit weird.


Your code is equivalent to:

int i = 0;
while (i < N)
{
int r = 1000;
while (r -= rand() % 2) ++i;
output.push_back(i-1);
}


As such, the value of the T(i)-th element of the rand() sequence counts for computing T(i), not T(i+1).

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement