Here's my (heavily optimized) implementation of the algorithm I mentioned above, as a Java class with some test/benchmark code:

package permutation;
import java.util.*;
/**
* Provides a deterministic pseudorandom permutation of [0, 2^31).
*
* WARNING: heavily tuned for the specific 2^31 - 1 case - do not edit
* the prime and expect it to work (you will at least have to
* update the reduce() method and the cofactor array and also
* make sure that the modPow() 2-ary algorithm still works)!
*/
public class Permutation {
private static final int p = 2147483647; /* Prime = 2^31 - 1 */
private static final int s = 31; /* Mersenne exponent */
private static final int[] cofactors = { /* List of cofactors */
(p - 1) / 2,
(p - 1) / 3,
(p - 1) / 7,
(p - 1) / 11,
(p - 1) / 31,
(p - 1) / 151,
(p - 1) / 331
};
/**
* Computes the value of x mod p, where p = 2^s - 1 is prime.
*
* This is a division-free optimization of the usual x % p
* method, with performance improvements of around 10-20%.
*
* Only works with p = 2147483647 (it can be implemented for
* smaller Mersenne primes but it becomes less efficient, as
* one needs to check more carefully for modular overflow).
*/
private static long reduce(long x) {
long r = (x & p) + (x >> s);
return r >= p ? r - p : r;
}
/**
* Computes the value of g^e mod p, using precomputed 2-ary
* modular exponentiation. The generator array must contain
* the 2-ary powers of the generator, beginning at g^1.
*/
private static int modPow(int[] generator, int e) {
long r = 1;
int k = 0;
while (e != 0) {
if ((e & 1) != 0) {
r = reduce(r * generator[k]);
}
e >>= 1;
++k;
}
return (int)r;
}
/**
* Returns whether g (as a 2-ary power table) is a generator
* modulo p. Does this efficiently via the factors of p - 1.
*/
private static boolean isGenerator(int[] g) {
if (g[0] <= 1) return false;
for (int cofactor : cofactors) {
if (modPow(g, cofactor) == 1)
return false;
}
return true;
}
/**
* Deterministically converts a seed to a generator, and
* returns it as a 2-ary power table (starting at g^1).
*
* Note: a sizeable fraction of values in [2, p - 2] are
* generators of p so this will terminate quickly.
*/
private static int[] seedToGenerator(int seed) {
Random random = new Random(seed);
int[] powers = new int[31];
while (true) {
long g = powers[0] = 2 + random.nextInt(p - 3);
for (int k = 1; k < 31; ++k) {
powers[k] = (int)(g = reduce(g * g));
}
if (isGenerator(powers))
return powers;
}
}
/**
* The lower bound of the permutation (inclusive).
*/
public static final int Lower = 0;
/**
* The upper bound of the permutation (inclusive).
*/
public static final int Upper = p;
/**
* Precomputed values of g^(2^k) mod p, used for modular
* exponentiation. The actual generator can be read from
* generator[0] = g^1.
*/
private int[] generator;
/**
* Instantiates a new permutation of [0, p] using
* the given seed (same seed = same permutation).
*/
public Permutation(int seed) {
generator = seedToGenerator(seed);
}
/**
* Permutes the input in [0, p] and returns the
* corresponding output - will throw an illegal
* argument exception if x is out of bounds.
*/
public int permute(int x) {
if ((x < Lower) || (x > Upper))
throw new IllegalArgumentException();
else {
/* This is for randomizing the fixed points at 0 and p. Remember
* that the composition of two permutations is a permutation. */
x ^= generator[0];
if ((x == 0) || (x == p))
return x;
else
return modPow(generator, x);
}
}
/**
* Little test driver for the class.
*/
public static void main(String[] args) {
final int benchTrials = 5, benchSamples = 10000000;
final int dupTrials = 5, dupSamples = 10000000;
final int arbitrarySeed = 0xDEADBEEF;
System.out.printf("Instantiating permutation on [%d, %d] (seed %d)\n",
Permutation.Lower, Permutation.Upper, arbitrarySeed);
Permutation perm = new Permutation(arbitrarySeed);
System.out.println("\nDisplaying first few values:");
for (int x = 0; x < 20; ++x)
{
System.out.printf("%d => %d\n", x, perm.permute(x));
}
System.out.println("\nBenchmarking performance:");
for (int t = 0; t < benchTrials; ++t) {
long begin = System.nanoTime();
for (int x = t * benchSamples; x < (t + 1) * benchSamples; ++x)
perm.permute(x);
double elapsed = (System.nanoTime() - begin) / 1000000000.0;
System.out.printf(" [+] %.1f perms/s\n", benchSamples / elapsed);
}
System.out.println("\nChecking for duplicates:");
for (int t = 0; t < dupTrials; ++t) {
Set<Integer> dups = new HashSet<Integer>();
int dupCount = 0;
for (int x = t * dupSamples; x < (t + 1) * dupSamples; ++x) {
int output = perm.permute(x);
if (!dups.contains(output))
dups.add(output);
else
++dupCount;
}
if (dupCount > 0)
System.out.printf(" [!] Found %d duplicates!\n", dupCount);
else
System.out.printf(" [+] No duplicates\n");
}
}
}

It meets all of your requirements:

- with the added xor operation the fixed points are gone, and the permutation can now be called pseudorandom (it is at least as good as an LCG, and probably much better - someone else can weigh in here)

- it can give you the output permuted integer for any input in the [0, 2^31 - 1] interval in constant time/memory

- it is rather fast (on the PC I am on I am getting between 15.5 and 17.5 million lookups per second across the entire input range, in Java)

- it is deterministic ("stable") and lets you specify a seed (note: not all seeds produce a unique permutation, but most do, i.e. you have around 1 billion seeds to play with)

- it is, in fact, a permutation (so each output corresponds to one, and only one, input)

You don't have to use it, but that's okay, I really enjoyed writing it anyway