Sign in to follow this  

Periodic Sequence Factoring

This topic is 3723 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I have a sequence of characters. The characters are from an infinite alphabet with the only available operations being equality comparison and hashing. I know that the sequence of characters is periodic. That is, it is of the form:
Seq = ABkC

A = a1 ... an
B = b1 ... bm
C = c1 ... cl
Given the sequence, how can I compute k, l, m and n quickly? (The characters are assembly instructions, and I need to factor them out using a for loop).

Share this post


Link to post
Share on other sites
This certainly stretches my programming capability, but in an attempt to help I've found some good ideas here, if you haven't already seen it.

What you are saying, if I am not mistaken, is that these are T-periodic, of the form:


f(x+T) = f(x)

or in your case:

f(a+n) = f(a)
f(b+m) = f(b)
f(c+l) = f(c)

such that

Seq = f(a)*f(b)k*f(c)


and you are solving for n, m and l. In the absence of any generating function, the sequence for each would have to be subjected to 'pattern matching' or 'property matching', no? Are the character values represented by integers? It may be possible to optimally generate values for n, m and l (intelligently guess them) until the above equations are satisfied. Worst case, assuming they are indeed T-periodic, you could just start with n=2, for example, and iterate your sequence until f(a+n)=f(a). Another possibility is to take f(a)0 and search for the first re-occurence of it, for which there are excellent search algorithms available, and the distance becomes n (I have used this with pseudorandom number generators to verify/test claims about the period).

--random

[Edited by - random_thinker on October 2, 2007 4:25:16 AM]

Share this post


Link to post
Share on other sites
Quote:
Original post by ToohrVyk
I have a sequence of characters. The characters are from an infinite alphabet with the only available operations being equality comparison and hashing.

I know that the sequence of characters is periodic. That is, it is of the form:

Seq = ABkC

A = a1 ... an
B = b1 ... bm
C = c1 ... cl


Given the sequence, how can I compute k, l, m and n quickly?

(The characters are assembly instructions, and I need to factor them out using a for loop).


I think all you need here is the ability to detect cycles. Googling around a little for "cycle detection algorithm" yielded this which looks promising. Knuth's Art of Computer Programming has a cycle detection algorithm in it, which IIRC is similar if not identical to the one described at that URL.

Share this post


Link to post
Share on other sites
Whether or not these algorithms are useful depends upon the state of the data. For example, if you have a large quantity of stored sequence data then a specialized algorithm is certainly advantageous. But if you have yet generated the data, a just as quick method must be to monitor the values as they are generated and capture when they repeat.

For integers this is a straight-forward process but for float or double values you would have to apply some sort of signal processing (smoothing) algorithm.

--random

Share this post


Link to post
Share on other sites
I am interested in this thread because my own pseudo-random period detection code is providing some interesting results. I suppose that the interesting thing about a prng algorithm is that it is a periodic function by design but there should be no detectable mathematical pattern within cycles, if that makes sense.

My own detection code follows values as they are generated and locates the first to re-occur and then identifies the first three consecutive reoccurences as the beginning of the next cycle. Whether this is strictly correct I have not been able to verify. Interestingly, for std::rand() the results of ten separately seeded runs results in the following:


// One set of 10 independently seeded runs, each through one cycle...
Test period of pseudorandom generator...
Warning: First duplicate value occurs at n = 1038947364, continuing...
Warning: First duplicate value occurs at n = 160672077, continuing...

ID : C++ Standard Library rand()
Min Period : 96387719
Max Period : 1732857588
Ave Period : 6.91864e+08
No of cycles : 10
Elapsed mins : 2.04767

// Another set of 10 independently seeded runs, each through one cycle...
Test period of pseudorandom generator...
Warning: First duplicate value occurs at n = 597879981, continuing...
Warning: First duplicate value occurs at n = 1493682969, continuing...
Warning: First duplicate value occurs at n = 133528437, continuing...
Warning: First duplicate value occurs at n = 1195410326, continuing...
Warning: First duplicate value occurs at n = 1192369109, continuing...

ID : C++ Standard Library rand()
Min Period : 115934013
Max Period : 1493682971
Ave Period : 6.46274e+08
No of cycles : 10
Elapsed mins : 1.91167


Either I am doing something wrong or this is another reason to beware using std::rand(). The rarely available literature on it indicates that it should have a period of 232-1. The actual period is all over the place, and there is a high probability that numbers will be repeated within a cycle.

--random

Share this post


Link to post
Share on other sites

This topic is 3723 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this