High-quality integer noise function

Started by
10 comments, last by FordPerfect 6 years, 9 months ago

This is basically a repost of my thread from gamedev.ru ( http://www.gamedev.ru/code/forum/?id=215866 ).

If we want a PRNG, then thanks to http://www.pcg-random.org we now have a pretty decent one, at least as far as gamedev goes.

Sometimes, however, we want a pure function. It can be thought of as a stateless random number generator.

And here the situation is somewhat worse. A Google search leads most people to the one from libnoise: http://libnoise.sourceforge.net/noisegen/index.html#coherentnoise , which is, quite frankly, not all that good.

The typical advice is to use some hash function. But without specifying which one, that is not all that helpful. Some are not very good, and finding and testing a good one takes effort.

So this thread is intended to find a quality one, in hopes that it may be helpful to a lot of people.

More precisely, the problem statement: devise a function with the signature


uint32_t noise_uint32(uint32_t n);

that behaves as discrete white noise, with output uniformly distributed over the range of uint32_t.

Current best shot:


uint32_t noise_uint32(uint32_t n)
{
    uint64_t state=n;
    state=state*state; // squaring: spreads the input bits into the high half of the 64-bit state
    state=state*6364136223846793005ULL+1442695040888963407ULL; // PCG's 64-bit LCG step
    // PCG's XSH-RR output function: xorshift high bits down, then rotate by the top 5 bits.
    uint32_t xorshifted=uint32_t(((state>>18u)^state)>>27u);
    uint32_t rot=uint32_t(state>>59u);
    return (xorshifted>>rot)|(xorshifted<<((-rot)&31));
}

This actually passes SmallCrush, but not Crush. NOTE: TestU01 is for stateful RNGs, so it was fed sequential outputs ( noise_uint32(0), noise_uint32(1), noise_uint32(2), ... ).

The line


state=state*state; 

makes the difference between passing and failing SmallCrush.

One question concerns legal issues: this function is clearly derived from PCG (which is under the Apache license). Can I place it into the public domain (which is what I want)?

And the main point of the thread: searching for volunteers to invent functions and test them with TestU01. This may require a lot of machine time.

And post findings, naturally, in this thread.

I posted a function that does that for 64-bit numbers in this thread. Feel free to use it or adapt it any way you want.

I use one of these (I made this myself, so you can use it):


	inline uint32_t randI (uint32_t time)
	{
		time *= 1664525;
		time ^= (time << 16);
		time *= 16807;
		return time;
	}
	
	inline uint32_t randI2 (uint32_t time)
	{
		time *= 1664525;
		time ^= (time << 16);
		time ^= (time >> 16);

		time *= 16807;
		time ^= (time << 8);
		time ^= (time >> 24);

		time *= 100069;
		time ^= (time << 24);
		time ^= (time >> 8);

		time *= 1000099;
		
		return time;
	}

Maybe not the quality you aim for, but let me know your results if you test it.

I wonder about your line state=state*state; - doesn't this affect a uniform distribution badly?

I posted a function that does that for 64-bit numbers in this thread. Feel free to use it or adapt it any way you want.


There seems to be neither a link nor code in your post? Edit: found the link :)

I use one of these (I made this myself, so you can use it):


	inline uint32_t randI (uint32_t time)
	{
		time *= 1664525;
		time ^= (time << 16);
		time *= 16807;
		return time;
	}
	
	inline uint32_t randI2 (uint32_t time)
	{
		time *= 1664525;
		time ^= (time << 16);
		time ^= (time >> 16);

		time *= 16807;
		time ^= (time << 8);
		time ^= (time >> 24);

		time *= 100069;
		time ^= (time << 24);
		time ^= (time >> 8);

		time *= 1000099;
		
		return time;
	}
Maybe not the quality you aim for, but let me know your results if you test it.


One problem with your code is that it is a permutation of the 2^32 values that a uint32_t can take. So if you feed it 0, 1, 2, ... it will not produce the same output twice. That is not how random numbers would work. But it may not matter, depending on the application.
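This bijection-vs-birthday distinction is easy to check empirically. A sketch (using the randI function posted above; the collision-counting helpers are my addition) that counts repeated outputs over sequential inputs, both for the full 32-bit output and for its low 16 bits:

```cpp
#include <cstdint>
#include <unordered_set>

// The randI hash quoted above: every step (odd-constant multiply, xorshift)
// is invertible mod 2^32, so the whole function is a permutation of the inputs.
inline uint32_t randI(uint32_t x)
{
    x *= 1664525;
    x ^= (x << 16);
    x *= 16807;
    return x;
}

// Count how many of the first n sequential inputs map to an already-seen
// output. For a permutation this is always 0, unlike true random draws.
inline int count_collisions(uint32_t n)
{
    std::unordered_set<uint32_t> seen;
    int collisions = 0;
    for (uint32_t i = 0; i < n; ++i)
        if (!seen.insert(randI(i)).second) ++collisions;
    return collisions;
}

// Same count, keeping only the low 16 bits of each output. Truncation
// destroys the bijection, so collisions reappear (by pigeonhole, any
// n > 65536 guarantees at least one), closer to true-random behaviour.
inline int count_collisions_low16(uint32_t n)
{
    std::unordered_set<uint32_t> seen;
    int collisions = 0;
    for (uint32_t i = 0; i < n; ++i)
        if (!seen.insert(randI(i) & 0xFFFFu).second) ++collisions;
    return collisions;
}
```

So whether the no-repeats property "matters" depends on whether you consume the whole output or a truncated slice of it.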


I wonder about your line state=state*state; - doesn't this affect a uniform distribution badly?


Since he is working with 64-bit arithmetic, he's not losing any information in that operation. I think it should be fine.


I posted a function that does that for 64-bit numbers in this thread. Feel free to use it or adapt it any way you want.


There seems to be neither a link nor code in your post?


Yes, there is a link, in the words "this thread".

So if you feed it 0, 1, 2, ... it will not produce the same output twice. That is not how random numbers would work. But it may not matter, depending on the application.


How do you mean this? Probably I get you wrong.
Do you mean it is bad that the function never returns the same number from two different inputs?
(I'm not sure this is the case, but I guess so, and I like it.)


Some thoughts I have in the hope of improving this:


IIRC, the purpose of the prime multiplication is to get a nice division remainder, e.g.:
float prn = float(x) * prime / 1; prn = prn - floor(prn);

and because we have only a fixed number of bits we get a remainder for free and get away with an integer multiplication:
uint32_t prn = x * prime; // remainder of division by 2^32, as good as the float version, and I assume 32 bits are enough even if we want a 32-bit result

If we use a true and reasonably large prime number, we don't get repetitive results on ascending inputs.
(If we only use a very large odd number like you do in your code, is this less guaranteed?)



The addition both of you guys do in your code seems a wasted instruction to me. It only adds an offset so that zero input does not generate zero output.
Because this does not improve randomness, it would be better to do this add on the function parameter only if we need that property.
Can we conclude additions are pointless in general and should be removed?

I think the xor stuff has a similar problem and does not really improve randomness.
It fills bits with stuff on any input, which is why I'm doing it too, but maybe we should do more prime mults instead?

One idea i have is, say we are willing to do 4 mults, would it do well to use prime numbers nearest to 0x000000FF, 0x0000FF00, 0x00FF0000, 0xFF000000?
Would this alone give better results with faster code?



Of course it always depends on what we want.
My goal is not a hash function or something hard to reverse engineer; I want only a uniform distribution of pseudo-random numbers.

1. The distribution should be random and uniform no matter if I only use the n lowest or highest bits of the result (or any other range of n bits in between).
This implies that every number from [0,2^32) should be generated exactly once.

2. If I display the result as an image with a width of n pixels, it should not produce repetitive patterns, even if n is a prime used in the code.
Again, I want the same noisy image if we only use a subset of the resulting bits to display the image.
The frequency of the noise should not differ if we change the width, or use the high / low range of the input.

3. Also, should results behave similarly if we increase the input by any number other than just one?

4. What else?


Probably generating an automated test for all this is not easy. (I don't know about the Crush test and what it does.)
I did not spend a lot of time here because the results were good enough for my use case,
but I'm pretty sure there is a faster and better solution than anything we've posted yet.

So if you feed it 0, 1, 2, ... it will not produce the same output twice. That is not how random numbers would work. But it may not matter, depending on the application.


How do you mean this? Probably I get you wrong.
Do you mean it is bad that the function never returns the same number from two different inputs?
(I'm not sure this is the case, but I guess so, and I like it.)


You may like it, but that contradicts your desire to get uniformly-distributed pseudo-random numbers out of this. If you pick 77,000 truly random 32-bit numbers, the probability that two of them are equal is about a half. If with your method the probability is 0, this is a departure from randomness.
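That 77,000 figure comes from the standard birthday-problem approximation p ≈ 1 − exp(−n(n−1)/2d) with d = 2^32. A quick sketch to check it (the helper name is mine):

```cpp
#include <cmath>

// Birthday-bound approximation: probability that at least two of n
// independent uniform draws from d possible values coincide.
inline double birthday_collision_probability(double n, double d)
{
    return 1.0 - std::exp(-n * (n - 1.0) / (2.0 * d));
}
// For n = 77000 and d = 2^32, the exponent is about 0.69, so the
// collision probability comes out very close to one half.
```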


If we use a true and reasonably large prime number, we don't get repetitive results on ascending inputs.
(If we only use a very large odd number like you do in your code, is this less guaranteed?)



I don't know why you think that. Prime numbers are not magical. But they are odd, so they will do. :)


The addition both of you guys do in your code seems a wasted instruction to me. It only adds an offset so that zero input does not generate zero output.



That's not a waste. That's how you provide your user with a random value for the input 0.


Because this does not improve randomness, it would be better to do this add on the function parameter only if we need that property.
Can we conclude additions are pointless in general and should be removed?



No, you can't conclude that.


I think the xor stuff has a similar problem and does not really improve randomness.

It fills bits with stuff on any input, which is why I'm doing it too, but maybe we should do more prime mults instead?

One idea i have is, say we are willing to do 4 mults, would it do well to use prime numbers nearest to 0x000000FF, 0x0000FF00, 0x00FF0000, 0xFF000000?
Would this alone give better results with faster code?



Multiplying by several numbers in a row is equivalent to multiplying by a single number (the product of the numbers). So consecutive products are truly wasted instructions. If you introduce additions in between, things aren't much better, since a sequence of additions and multiplications in any order is equivalent to a single multiply followed by a single addition. The reason for this is that the composition of affine transformations is an affine transformation.

However, by combining multiplies and XORs, you are using two different mathematical structures in the set of 32-bit integers, and there is no magical simplification like the ones I just discussed.
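This collapse is easy to demonstrate. A sketch (the constants are arbitrary LCG-style multipliers, chosen only for illustration) showing that two chained multiply-add steps mod 2^32 are exactly one multiply-add with precomposed constants:

```cpp
#include <cstdint>

// Two affine steps in a row (unsigned arithmetic wraps mod 2^32)...
inline uint32_t two_affine_steps(uint32_t x)
{
    x = x * 1664525u + 1013904223u;  // first multiply-add
    x = x * 22695477u + 1u;          // second multiply-add
    return x;
}

// ...are exactly one affine step with precomposed constants:
// (x*a1 + b1)*a2 + b2  ==  x*(a1*a2) + (b1*a2 + b2)  (mod 2^32).
inline uint32_t one_affine_step(uint32_t x)
{
    const uint32_t a = 1664525u * 22695477u;
    const uint32_t b = 1013904223u * 22695477u + 1u;
    return x * a + b;
}
```

So a second multiply-add buys nothing that the right single multiply-add would not.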


Of course it always depends on what we want.



Absolutely.


My goal is not a hash function or something hard to reverse engineer; I want only a uniform distribution of pseudo-random numbers.



Then you don't want the feature that different inputs result in different outputs, as described above.


1. The distribution should be random and uniform no matter if I only use the n lowest or highest bits of the result (or any other range of n bits in between).
This implies that every number from [0,2^32) should be generated exactly once.



No, it implies the opposite.


2. If I display the result as an image with a width of n pixels, it should not produce repetitive patterns, even if n is a prime used in the code.
Again, I want the same noisy image if we only use a subset of the resulting bits to display the image.
The frequency of the noise should not differ if we change the width, or use the high / low range of the input.



All of the bits produced by the function I proposed should be random enough. Multiplications make sure the high-order bits depend on lots of things in the input, and the regular XORing with a shift to the right makes sure the low-order bits are also of high quality. After a few iterations, there should be no recognizable patterns left.
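One cheap way to put a number on "no recognizable patterns" is an avalanche measurement: flip one input bit and count how many output bits flip, which should average about 16 of 32 for a well-mixed function. A sketch (using the noise_uint32 from the opening post; the measurement harness is my addition):

```cpp
#include <cstdint>

// The function from the opening post, reproduced for the measurement below.
inline uint32_t noise_uint32(uint32_t n)
{
    uint64_t state=n;
    state=state*state;
    state=state*6364136223846793005ULL+1442695040888963407ULL;
    uint32_t xorshifted=uint32_t(((state>>18u)^state)>>27u);
    uint32_t rot=uint32_t(state>>59u);
    return (xorshifted>>rot)|(xorshifted<<((-rot)&31));
}

// Portable popcount (no compiler intrinsics).
inline int popcount32(uint32_t v)
{
    int c = 0;
    for (; v; v &= v - 1) ++c;
    return c;
}

// Average number of output bits that flip when a single input bit flips,
// measured over `samples` sequential inputs and all 32 bit positions.
// An ideal mixer lands near 16.0.
inline double avalanche_average(uint32_t samples)
{
    uint64_t flipped = 0;
    for (uint32_t i = 0; i < samples; ++i)
        for (int bit = 0; bit < 32; ++bit)
            flipped += popcount32(noise_uint32(i) ^ noise_uint32(i ^ (1u << bit)));
    return double(flipped) / (double(samples) * 32.0);
}
```

This is far weaker than Crush, but it catches gross mixing failures quickly.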


3. Also, should results behave similarly if we increase the input by any number other than just one?



Shouldn't be an issue.


4. What else?



Not sure, but this is the kind of thing that tests like Crush or Diehard are designed to do.


Probably generating an automated test for all this is not easy. (I don't know about the Crush test and what it does.)


A great opportunity to learn. :)


I did not spend a lot of time here because the results were good enough for my use case,
but I'm pretty sure there is a faster and better solution than anything we've posted yet.



I am not sure. I haven't looked at this in detail in several years, but the last time I checked, multiplications had a latency of 3 cycles and XOR had a latency of 1 cycle, or something like that. There are no branches or memory accesses, so my guess is my code runs in less than 20 cycles, which is pretty darn fast for most applications. I wouldn't select it as a target for optimization unless I had a profiler telling me that's where my bottleneck is.

Multiplying by several numbers in a row is equivalent to multiplying by a single number (the product of the numbers). So consecutive products are truly wasted instructions. If you introduce additions in between, things aren't much better, since a sequence of additions and multiplications in any order is equivalent to a single multiply followed by a single addition. The reason for this is that the composition of affine transformations is an affine transformation.

However, by combining multiplies and XORs, you are using two different mathematical structures in the set of 32-bit integers, and there is no magical simplification like the ones I just discussed.


All of the bits produced by the function I proposed should be random enough. Multiplications make sure the high-order bits depend on lots of things in the input, and the regular XORing with a shift to the right makes sure the low-order bits are also of high quality. After a few iterations, there should be no recognizable patterns left.


Agreed. I intended to xor the results of various multiplications together, not adding them, but avoiding shifting by a fixed number of bits.
But you are right about the shift direction. I think I could remove the left shifts from my code :)


About the primes - I can't remember where I've read about this - maybe they just helped invent early RNGs like this in the old days but turned out to be unnecessary.

About equal results: Personally I'm more after a randomized shuffle of sequential numbers than a sequence of true randomness, so it depends on what we want.

About the initial add: If we use something like ((nodeIndex<<10) + timeTick) for the input, the initial add is just an offset to the same resulting sequence and thus wasted.

It hasn't been mentioned yet, but the Xoroshiro128+ PRNG is a fairly new approach that is very fast and has very good statistical quality. It performs better in the BigCrush test suite than any of the previously mentioned ones. I use it in my path tracer for ray sample generation.

It hasn't been mentioned yet, but the Xoroshiro128+ PRNG is a fairly new approach that is very fast and has very good statistical quality. It performs better in the BigCrush test suite than any of the previously mentioned ones. I use it in my path tracer for ray sample generation.


That is very cool, but it's not actually what we are talking about in this thread. We are talking about a function that takes an integer input and produces a random-looking integer as output.

The PCG paper says that

>This information indicates that an ideal generator can reasonably fail SmallCrush when b < 32, fail Crush when b < 35, and fail BigCrush when b < 36.

That's the number of bits of state, i.e. in our case, bits of input.

So it seems a 32->32 function should not pass Crush (or BigCrush), and the one from the opening post is about as good as it gets, statistically speaking.

Its speed, perhaps, can be improved.

64->32 (and 64->64) functions are still of interest.
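For the 64->64 case, one widely used candidate is a stateless application of the SplitMix64 output mixer (the constants below are the published SplitMix64 ones; note this exact stateless use has not been run through TestU01 in this thread, so treat it as a starting point, not a verified result):

```cpp
#include <cstdint>

// Stateless 64->64 mixer based on the SplitMix64 finalizer.
// Every step (odd-constant add, xorshift, odd-constant multiply) is
// invertible mod 2^64, so this is a permutation of the 64-bit inputs.
inline uint64_t noise_uint64(uint64_t n)
{
    uint64_t z = n + 0x9E3779B97F4A7C15ULL;       // Weyl-style offset
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL; // mix high bits into low
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL; // second mixing round
    return z ^ (z >> 31);                        // final xorshift
}

// A 64->32 variant can simply keep the high half of the mix,
// which is generally the better-mixed half.
inline uint32_t noise_uint64_to32(uint64_t n)
{
    return uint32_t(noise_uint64(n) >> 32);
}
```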

This topic is closed to new replies.
