• Create Account

# Bacterius

Member Since 28 Feb 2011
Online Last Active Today, 04:58 AM

### In Topic: good random permutation?

Yesterday, 09:56 PM

Here's my (heavily optimized) implementation of the algorithm I mentioned above, as a Java class with some test/benchmark code:

package permutation;

import java.util.*;

/**
* Provides a deterministic pseudorandom permutation of [0, 2^31).
*
* WARNING: heavily tuned for the specific 2^31 - 1 case - do not edit
*          the prime and expect it to work (you will at least have to
*          update the reduce() method and the cofactor array and also
*          make sure that the modPow() 2-ary algorithm still works)!
*/
public class Permutation {
private static final int p = 2147483647;         /* Prime = 2^31 - 1  */
private static final int s = 31;                 /* Mersenne exponent */
private static final int[] cofactors = {         /* List of cofactors */
(p - 1) / 2,
(p - 1) / 3,
(p - 1) / 7,
(p - 1) / 11,
(p - 1) / 31,
(p - 1) / 151,
(p - 1) / 331
};

/**
* Computes the value of x mod p, where p = 2^s - 1 is prime.
*
* This is a division-free optimization of the usual x % p
* method, with performance improvements of around 10-20%.
*
* Only works with p = 2147483647 (it can be implemented for
* smaller Mersenne primes but it becomes less efficient, as
* one needs to check more carefully for modular overflow).
*/
private static long reduce(long x) {
long r = (x & p) + (x >> s);
return r >= p ? r - p : r;
}

/**
* Computes the value of g^e mod p, using precomputed 2-ary
* modular exponentiation. The generator array must contain
* the 2-ary powers of the generator, beginning at g^1.
*/
private static int modPow(int[] generator, int e) {
long r = 1;
int k = 0;

while (e != 0) {
if ((e & 1) != 0) {
r = reduce(r * generator[k]);
}

e >>= 1;
++k;
}

return (int)r;
}

/**
* Returns whether g (as a 2-ary power table) is a generator
* modulo p. Does this efficiently via the factors of p - 1.
*/
private static boolean isGenerator(int[] g) {
if (g[0] <= 1) return false;

for (int cofactor : cofactors) {
if (modPow(g, cofactor) == 1)
return false;
}

return true;
}

/**
* Deterministically converts a seed to a generator, and
* returns it as a 2-ary power table (starting at g^1).
*
* Note: a sizeable fraction of values in [2, p - 2] are
*       generators of p so this will terminate quickly.
*/
private static int[] seedToGenerator(int seed) {
Random random = new Random(seed);
int[] powers = new int[31];

while (true) {
long g = powers[0] = 2 + random.nextInt(p - 3);

for (int k = 1; k < 31; ++k) {
powers[k] = (int)(g = reduce(g * g));
}

if (isGenerator(powers))
return powers;
}
}

/**
* The lower bound of the permutation (inclusive).
*/
public static final int Lower = 0;

/**
* The upper bound of the permutation (inclusive).
*/
public static final int Upper = p;

/**
* Precomputed values of g^(2^k) mod p, used for modular
* exponentiation. The actual generator can be read from
* generator[0] = g^1.
*/
private int[] generator;

/**
* Instantiates a new permutation of [0, p] using
* the given seed (same seed = same permutation).
*/
public Permutation(int seed) {
generator = seedToGenerator(seed);
}

/**
* Permutes the input in [0, p] and returns the
* corresponding output - will throw an illegal
* argument exception if x is out of bounds.
*/
public int permute(int x) {
if ((x < Lower) || (x > Upper))
throw new IllegalArgumentException();
else {
/* This is for randomizing the fixed points at 0 and p. Remember
* that the composition of two permutations is a permutation. */
x ^= generator[0];

if ((x == 0) || (x == p))
return x;
else
return modPow(generator, x);
}
}

/**
* Little test driver for the class.
*/
public static void main(String[] args) {
final int benchTrials = 5, benchSamples = 10000000;
final int dupTrials   = 5, dupSamples   = 10000000;

System.out.printf("Instantiating permutation on [%d, %d] (seed %d)\n",
Permutation.Lower, Permutation.Upper, arbitrarySeed);

Permutation perm = new Permutation(arbitrarySeed);

System.out.println("\nDisplaying first few values:");

for (int x = 0; x < 20; ++x)
{
System.out.printf("%d => %d\n", x, perm.permute(x));
}

System.out.println("\nBenchmarking performance:");

for (int t = 0; t < benchTrials; ++t) {
long begin = System.nanoTime();

for (int x = t * benchSamples; x < (t + 1) * benchSamples; ++x)
perm.permute(x);

double elapsed = (System.nanoTime() - begin) / 1000000000.0;
System.out.printf(" [+] %.1f perms/s\n", benchSamples / elapsed);
}

System.out.println("\nChecking for duplicates:");

for (int t = 0; t < dupTrials; ++t) {
Set<Integer> dups = new HashSet<Integer>();
int dupCount = 0;

for (int x = t * dupSamples; x < (t + 1) * dupSamples; ++x) {
int output = perm.permute(x);

if (!dups.contains(output))
else
++dupCount;
}

if (dupCount > 0)
System.out.printf(" [!] Found %d duplicates!\n", dupCount);
else
System.out.printf(" [+] No duplicates\n");
}
}
}


It meets all of your requirements:

- with the added xor operation the fixed points are gone, and the permutation can now be called pseudorandom (it is at least as good as an LCG, and probably much better - someone else can weigh in here)

- it can give you the output permuted integer for any input in the [0, 2^31 - 1] interval in constant time/memory

- it is rather fast (on the PC I am on I am getting between 15.5 and 17.5 million lookups per second across the entire input range, in Java)

- it is deterministic ("stable") and lets you specify a seed (note: not all seeds produce a unique permutation, but most do, i.e. you have around 1 billion seeds to play with)

- it is, in fact, a permutation (so each output corresponds to one, and only one, input)

You don't have to use it, but that's okay, I really enjoyed writing it anyway

### In Topic: good random permutation?

Yesterday, 05:40 PM

Yes, it would be much easier if you had asked for a 32-bit permutation. I think Java does let you access all 32 bits (signed integers just use a bit for the sign bit, after all) but bitwise operations are apparently rather tricky. I always say signed arithmetic is evil, but Java does not let you choose here, so it's harder than it needs to be.

### In Topic: good random permutation?

Yesterday, 04:00 PM

I don't think a stateless PRNG makes any sense. Is there not any way to store the state of a PRNG with the terrain generator?

Usually the go-to choice for alternative RNGs (if not already provided by the language's libraries) is usually mersenne twister.

He needs a random-access permutation, which is a different problem. But, yes, unless the seed is hardcoded, there is going to be state (by definition).

One possible candidate I can think of is the function:

$f(x) = \begin{cases} x ~ ~ &\text{if} ~ ~ x = 0 ~ ~ \text{or} ~ ~ x = 2^{31} - 1 \\ g^x \bmod{2^{31} - 1} ~ ~ &\text{otherwise} \end{cases}$

which has pretty good pseudorandom properties and is a permutation (there are a couple obvious fixed points by construction, which can be eliminated by e.g. multiplying the input with a suitable constant beforehand). Here g is supposed to be a generator of 2^31 - 1, the smallest one is 7 and about 25% of all possible bases are generators, so you won't run out of seeds (it is easy to test if you have a generator, which means you can efficiently find one by instantiating any PRNG with your seed and looking for a generator that way). It is also reasonably fast, because modular exponentiation can be implemented efficiently using modular square and multiply, and, furthermore, reducing an integer modulo 2^31 - 1 thankfully doesn't require a division. I might write up a full implementation this afternoon if anyone is interested (or just for fun)

### In Topic: Multiverse theory

Yesterday, 06:33 AM

I have another issue with this reasoning, besides the obvious logical fallacy pointed out by cowsarenotevil. I claim the statement is ill-formed because it implies the multiverse theory is somehow a property of any one universe (this is given away by the "in which" phrasing). This is absurd, of course, since the multiverse theory, by definition, makes a statement about the existence of more than one universe. Hence this theory applies not to any one universe, but to the multiverse (the "set of all universes", as it were). All universes refer to the same multiverse theory, they don't have their own (they might have their own perception of said theory, e.g. "there is only one universe I can see", but that is an entirely different theory).

To better understand this, think of a basket of apples. Each apple has a worm on it. A worm on some apple looks around and says, "I see more than one apple". Another worm on another apple also sees more than one apple - of course, they are seeing the same set of apples in the basket. A worm on the opposite side of the basket might not be able to see any apple beyond the one it is on, e.g. because its apple is blocking the view, so it might conclude that there is only one apple. But that doesn't make the theory that "there is only one apple" true, since there are, in fact, multiple apples. This theory is independent of any given worm or apple, as it applies to all apples in the basket. The correct conclusion is that the theory "there are no apples I can see" is true, and that is a property of this particular worm and apple (and that would be a well-formed statement).

This argument can be made rigorous using type theory, or, I suppose, any remotely consistent logic system, but it's not hard to see that the statement is meaningless if you think about it this way. The paradox essentially plays on the unconditional nature of the multiverse theory versus each universe's interpretation of it. It you separate the two properly, the paradox disappears (and the statement becomes uninteresting).

### In Topic: Hard Copy or Soft Copy?

11 July 2014 - 09:57 PM

I hate using "remote hosting and data storage" ( what techies call "the cloud" now-a-days ) due to the fact that I have no clue how secure their servers are - or how trust worthy their employees are.

You do know you can encrypt your files and fingerprint them, right? This is a solved problem - it doesn't matter how trustworthy the host is, just that it works properly and retrieves all your bits when you ask it to (which is of course a concern, making it useful only as secondary or even tertiary backup storage, but completely unrelated to security). It's really not hard.

In any case, data backup or archival is not a two-step process. Data, like everything else, needs to be cared for in order to survive. Fail to keep your backups to date and face the consequences (which are to try and recover your files once an accident occurs and finding that every single one of your backup sites has failed or is otherwise unrecoverable, either because of old age or because it never really worked in the first place but you never bothered to try recovering your files periodically).

PARTNERS