Sign in to follow this  
Firestryke31

Algorithm Help

Recommended Posts

Alright, I've got a bit of C++ code that I've written that takes the numbers in an array of 10 elements and shuffles them:
[code]
char s = 0;
int j = 0;
for(int i = 0; i < 10; ++i)
{
j = (data[i] + i) % 10;
s = data[i];
data[i] = data[j];
data[j] = s;
}
[/code]

My problem is, I need to be able to undo this shuffling at a later point, not necessarily in the same run of the program. Unfortunately, I cannot for the life of me figure out how to do so algorithmically.

Some sample output of the above:
[code]
STRT: 3E 46 6F 78 78 67 30 01 01 40
SHF0: 6F 46 3E 78 78 67 30 01 01 40
SHF1: 6F 46 3E 78 78 67 30 01 01 40
SHF2: 6F 46 78 78 3E 67 30 01 01 40
SHF3: 6F 46 78 78 3E 67 30 01 01 40
SHF4: 6F 46 78 78 30 67 3E 01 01 40
SHF5: 6F 46 78 78 30 01 3E 01 67 40
SHF6: 6F 46 78 78 30 01 67 01 3E 40
SHF7: 6F 46 78 78 30 01 67 3E 01 40
SHF8: 6F 46 78 78 30 01 67 3E 40 01
SHF9: 01 46 78 78 30 01 67 3E 40 6F
[/code]

It's not really shuffling it, so much as moving things around in a data-dependent way. This is mostly to help keep 2 sequential things from generating sequential output, especially after a few more steps I take to help muddle things up some more. So far everything's reversible except for this "shuffling."

Bonus cookies if you can make the unshuffling work with an arbitrarily sized array, but it's not required. I might change the whole thing to forgo shuffling the first and last elements since they're a checksum and parity, respectively, and I'd like to make it easier to reject incorrect input earlier rather than later. If it would be easier to just dump this and go with a different reversible data-driven shuffling algorithm I'm all ears.

Also, if this algorithm has a name I would be interested in knowing it.

Share this post


Link to post
Share on other sites
Your example output does not make sense. Why are SHF0 and SHF1 equal? (And what does SHF mean...)

What you want is impossible, your operation is not reversible. There's no way for the output to be { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 } (e: for that matter, any sequence where x[i] % 10 != (i + 1) % 10 for all i), and there are finitely many possible permutations of these numbers -- that's how we know your output is not reversible (it's by the pigeonhole principle).

Share this post


Link to post
Share on other sites
[quote name='sam_hughes' timestamp='1310611700' post='4835103']
Your example output does not make sense.
[/quote]

How so? It's the starting array, plus the array for each iteration of i.

[quote name='sam_hughes' timestamp='1310611700' post='4835103']
What you want is impossible, your operation is not reversible. There's no way for the output to be { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }, and there are finitely many possible permutations of these numbers -- that's how we know your output is not reversible (it's by the pigeonhole principle).
[/quote]

I just figured that since the algorithm uses the given data (and only the given data) to perform the shuffle, it would be possible to unshuffle it given the output.

Share this post


Link to post
Share on other sites
Unless I'm royally screwing up my math here, this is irreversible.

The modulus division discards information which is necessary for reconstructing the flow of elements. Consider a simple case:

Input: {1, 2, 3}

Pass 1: (data[0] + 0) % 3 = (1 + 0) % 3 = 1
Swap 0, 1
Result: {2, 1, 3}

Pass 2: (data[1] + 1) % 3 = (1 + 1) % 3 = 2
Swap 1, 2
Result: {2, 3, 1}

Pass 3: (data[2] + 2) % 3 = (1 + 2) % 3 = 0
Swap 2, 0
Result: {1, 3, 2}

Now, we try to reverse, starting with index 2. We know that the equation will look like this: (data[x] + 2) % 3 = y. With the benefit of hindsight, we know that x is 2 and therefore y is 0; but [i]without[/i] that knowledge, we don't know x [i]or[/i] y. Even if we knew the value of data[x], we couldn't simply substitute that and solve for y, because the modulus division has infinitely many solutions. If we know the value of y alone, we still can't solve for data[x] because of the same problem. And even if we know what data[x] is, that doesn't help us find x because there may be many duplicate values in the list. Therefore, we cannot reverse this process without storing the indexes which were reversed by each step.

At that point, you have to store the entire chain of swaps to reverse the order; this is a seriously not good situation. It gains us nothing over simply randomly shuffling the original list and recording the list of swaps that way.

A better solution would be to compute a position-independent hash of the input values, use that as a seed into a pseudorandom number generator, and then generate a pseudorandom sequence of swaps. If you use the first value out of the PRNG as the number of swaps to make, you can do this:

[list][*]Compute hash, seed PRNG[*]Randomly choose how many swaps [i]n[/i] to make[*]Generate [i]n*2 [/i]random numbers; use them to swap values[/list]
Reversing is simple:

[list][*]Compute hash, seed PRNG[*]Generate swap count [i]n[/i][*]Generate [i]n*2[/i] random numbers and store them[*]Reverse the list of random numbers[*]Perform the swaps given by the reversed list[/list]

Share this post


Link to post
Share on other sites
[quote name='ApochPiQ' timestamp='1310616152' post='4835121']
A better solution would be to compute a position-independent hash of the input values, use that as a seed into a pseudorandom number generator, and then generate a pseudorandom sequence of swaps. If you use the first value out of the PRNG as the number of swaps to make, you can do this:[/quote]

You might as well use a fixed seed, and use it to do a fair random shuffle.

Share this post


Link to post
Share on other sites
Conside this array

15 17 6 2 9 18 12 2 5 8

At i=0
(a[0] + 0) % 10 = 5
swapping a[0] and a[5]

After first iteration the array would become
[b]18[/b] 17 6 8 9 [b][u]15[/u][/b] 12 [u]2[/u] 5 8

Even after the first iteration, you can't revert it back.

To get the value of 0th position, you have to do the following check.
for all i except 0,
(a[i]+i) % 10 should be 0

In this case its the 5th position, (a[5] + 5) % 10 = 0 (swapped from 0th position)
and the 7th position, (a[7]+7) % 10 = 0 (swapped from 0th position)

So how would you know from which position it (18) was swapped (either 5th or 7th)?

[b]It is irreversible even after the first iteration.[/b]
How can you be sure after 10 iterations where more than one swaps can be there for a single position?

Share this post


Link to post
Share on other sites
How about something like this:
[code]
#include <iostream>
#include <vector>
#include <cstdlib>

void main() {
// intialise swaps
std::srand(0);
const int n(10);
std::vector<int> swaps(n);
for (int i = 0; i < n; ++i) {
swaps[i] = std::rand() % n;
}

// initialise data
std::vector<int> data(n);
for (int i = 0; i < n; ++i) {
data[i] = i;
}
std::ostream_iterator<int> out(std::cout, ", ");
std::copy(data.begin(), data.end(), out);
std::cout << '\n';

// shuffle
for (int i = (n - 1); i >= 0; --i) {
std::swap(data[i], data[swaps[i]]);
}
std::copy(data.begin(), data.end(), out);
std::cout << '\n';

// deshuffle
for (int i = 0; i < n; ++i) {
std::swap(data[swaps[i]], data[i]);
}
std::copy(data.begin(), data.end(), out);
std::cout << '\n';
}
[/code]

You just need to make sure that the random number generator is seeded to the same number.

Share this post


Link to post
Share on other sites
[quote name='rip-off' timestamp='1310632949' post='4835170']
What are your requirements, what is the bigger problem you're trying to solve?
[/quote]

Mostly to help keep sequential input from becoming sequential output while still maintaining the original data.

I'm probably going to change the system to not shuffle the first and last positions, and to shuffle the innards based solely off of some combination of the two. That should be reversable, right?

Share this post


Link to post
Share on other sites
Why does the random shuffle have to have anything to do with the data contents? Can't you just generate a seed and store that seed along with the data stream, and use that seed to reconstruct the shuffling pattern, as has been suggested several times now in this thread?

Share this post


Link to post
Share on other sites
I was thinking of using the elements I was talking about as said seed. Since I decided not to involve them in the shuffle I can reconstruct the seed client-side to replay the shuffle in reverse.

I wanted to minimize the amount of data transmitted, so why not make it so that the only data transmitted is the data I need to transmit, rather than tacking on some data that's only used for one step in the process? The data I'm using to seed the shuffle are checksum and parity, so they're there to validate the data anyway, why not use them for something else as well?

Also, I didn't want to rely on an outside PRNG since it's implementation might change between compilers, making data generated by a version of the program compiled by one compiler unreadable to a version compiled by another. The only way around this is to write my own PRNG or use one online (and deal with the licensing stuff), and I want to write games, not random number generators.

Finally, it doesn't even have to be random, it just has use different steps to shuffle dataset 1 and dataset 2.

Share this post


Link to post
Share on other sites
If you want a PRNG with a very free license (do what you want with it), I posted some code [url="http://www.gamedev.net/topic/601622-random-number-generator-code/page__view__findpost__p__4808079"]here[/url].

Share this post


Link to post
Share on other sites
I just went with something simplistic:

Shuffle:
[code]
char s = 0;
int j = 0;
unsigned short seed = data[0] | (data[9] << 8);
for(int i = 0; i < 8; ++i)
{
j = seed % 8;
seed /= 5;
seed *= j + 1;
s = data[i + 1];
data[i + 1] = data[j + 1];
data[j + 1] = s;
}
[/code]

Unshuffle:
[code]
char unsh[] = {0,0,0,0,0,0,0,0,0,0};
char s = 0;
int j = 0;
unsigned short seed = data[0] | (data[9] << 8);
for(int i = 0; i < 8; ++i)
{
j = seed % 8;
seed /= 5;
seed *= j + 1;
unsh[i + 1] = j;
}

for(int i = 8; i > 0; --i)
{
s = data[i];
data[i] = data[unsh[i] + 1];
data[unsh[i] + 1] = s;
}
[/code]

I don't know how expandable it is, but it gets the job done here and I can implement something similar if I need a bigger version.

Share this post


Link to post
Share on other sites
Is this just a crude encryption of some sort? If so why not use something like XTEA or something similar. If not, my bad, but it does look like you're just trying to obfuscate some data.

Share this post


Link to post
Share on other sites
Honestly, it is. It's just a simple way to keep sequential input from being sequential output. I just needed something simple and easy to write that doesn't really have to be all that secure, just enough to keep everybody and their mom from being able to easily figure out the next value without going through my code first.

I know it'll be broken within 10 minutes of some bored guy pointing his interest in my general direction, so why bother spending more than 10 minutes writing it? Had I known about XTEA before finishing my code, I would have used it instead of rolling my own. I'll probably end up using it on my network traffic.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this