Integer factorization, the fast way?

Started by
19 comments, last by jperalta 19 years, 5 months ago
If you can afford a heuristic (probabilistic) algorithm, there is a lot of great stuff (I mean really reliable):

- Miller-Rabin test - to check whether the number (n) is prime:
the input is any small base (d), and if 'n' is composite, the chance that a randomly chosen 'd' exposes it is at least 3/4 (Cormen proves the weaker bound of 1/2). Not much for a single round, but you can repeat the test with different 'd' as often as needed, and the error shrinks exponentially. My teachers used to joke that the processor is more likely to make a mistake than this algorithm is to fail (and it's true).
But how to use it for factorization is something to think over: it tells you that 'n' is composite, not what its factors are...

- Rho heuristic - to find a divisor of n. While plain trial division takes O(n^(1/2)), rho is expected to take about O(d^(1/2)), where d is the smallest prime divisor of n (so at worst O(n^(1/4))).
But with small probability it can run forever, so when it runs too long, just stop it and try something else...

Search for them, or look them up in the well-known "Introduction to Algorithms" by Thomas Cormen et al.
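
For readers who want to see the primality test from the first bullet in concrete form, here is a minimal Python sketch. It is not taken from Cormen; the function name `is_probable_prime`, the default of 20 rounds and the short list of small primes are assumptions made for illustration.

```python
import random

def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin test. False means n is certainly composite;
    True means n survived `rounds` random bases (probably prime)."""
    if n < 2:
        return False
    # Cheap exact answer for multiples of a few small primes.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 = 2^s * t with t odd.
    s, t = 0, n - 1
    while t % 2 == 0:
        s += 1
        t //= 2
    for _ in range(rounds):
        d = random.randrange(2, n - 1)   # random base in [2, n - 2]
        x = pow(d, t, n)
        if x in (1, n - 1):
            continue                     # this base says "maybe prime"
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break                    # reached -1: "maybe prime"
        else:
            return False                 # d is a witness: n is composite
    return True
```

A prime always passes, so only the "probably prime" answer can be wrong, and each extra round cuts that risk by at least a factor of four.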

/def
The three top factorization algorithms are (in order of goodness):

The General Number Field Sieve
The Elliptic Curve Method
The Quadratic Sieve

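The GNFS is asymptotically the best general-purpose algorithm, which makes it the one that matters in practical applications (such as factoring 'n' in RSA). It only pays off for large inputs, though: below roughly 80 to 100 decimal digits the QS and the ECM are faster, and the ECM's running time depends on the size of the smallest factor rather than on n itself. The ECM is also much easier to implement if you're not terribly great with number theory.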

OK, I've heard of the Miller-Rabin test (but I don't know how to do it, so I guess I'll have to google it).

The rho-heuristic is new to me, would you mind explaining more?

jperalta:
How exactly does elliptic curve factorization work? (I'm limited to 100 trillion, so I guess it'll be one of the best for that.)

From,
Nice coder
Click here to patch the mozilla IDN exploit, or click Here then type in Network.enableidn and set its value to false. Restart the browser for the patches to work.
Quote:Original post by Nice Coder
The rho-heuristic is new to me, would you mind explaining more?


It's Pollard's rho heuristic, in fact.
I don't remember the details well (so google it, as always), but the general idea is that 'n' and its divisor 'd' share some properties when you take numbers modulo 'n'.
If you take (almost) any starting number and keep iterating a simple polynomial on it modulo n (as simple as x^2 + c for a small constant c), the sequence will sooner or later fall into a loop. From the numbers you meet on the way, you can then work out the divisor you were looking for.

\\As far as I remember (correct me!):
The proof of the method shows that if we knew 'd' and took the same numbers modulo d instead, we would fall into a loop much sooner, after about sqrt(d) steps. Two numbers x_i and x_j that collide modulo d but not modulo n give the divisor as gcd(x_i - x_j, n).

As you can see, I am unable to give you a precise answer without my sources, which I do not have anywhere near me. But I am pretty sure it is all in Cormen.

/def
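
The idea described in this post can be sketched in a few lines of Python. This is the textbook form with Floyd's "tortoise and hare" cycle detection, not deffer's or Cormen's exact code; the name `pollard_rho`, the start value 2, the default c = 1 and the step budget are assumptions chosen for illustration.

```python
from math import gcd

def pollard_rho(n: int, c: int = 1, max_steps: int = 1_000_000) -> int:
    """Iterate x -> x^2 + c (mod n) and return the first gcd > 1 that shows up.

    The result may be n itself (bad luck, try another c), and 0 means the
    step budget ran out. Intended for odd composite n."""
    def f(x: int) -> int:
        return (x * x + c) % n

    tortoise = hare = 2
    for _ in range(max_steps):
        tortoise = f(tortoise)        # one step
        hare = f(f(hare))             # two steps
        # If the two agree modulo some divisor d of n, d divides the difference.
        g = gcd(abs(tortoise - hare), n)
        if g > 1:
            return g
    return 0
```

For example, `pollard_rho(8051)` walks 5, 26, 677, ... and stops at the third step, where the tortoise is at 677 and the hare at 871: gcd(871 - 677, 8051) = 97, and 8051 = 83 * 97.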

OK.
I found This and This

from Google.

I still don't understand it much. Is there any chance of finding an algorithm? (In pseudocode, BASIC, or just a walk-through.)

From,
Nice coder

[Edited by - Nice Coder on November 2, 2004 4:26:26 AM]
Quote:Original post by Nice Coder
How exactly does elliptic curve factorization work? (I'm limited to 100 trillion, so I guess it'll be one of the best for that.)


This is how I did it (in C#):

    // Elliptic curve factorization methods.
    // GCD, Power and Math.Power are BigInteger helper routines that are not shown here.

    /// <summary>
    /// This class describes a point on an elliptic curve that is described
    /// in the form y^2 congruent to x^3 + ax + b (mod n)
    /// </summary>
    public class EllipticPoint
    {
        public BigInteger x;
        public BigInteger y;
        public BigInteger a;
        public BigInteger b;
        public BigInteger modulus;

        public EllipticPoint(EllipticPoint p)
        {
            x = p.x;
            y = p.y;
            a = p.a;
            b = p.b;
            modulus = p.modulus;
        }

        public EllipticPoint()
        {
            x = new BigInteger();
            y = new BigInteger();
            a = new BigInteger();
            b = new BigInteger();
            modulus = new BigInteger();
        }

        public static EllipticPoint operator +(EllipticPoint p1, EllipticPoint p2)
        {
            if ((p1.modulus != p2.modulus) || (p1.a != p2.a) || (p1.b != p2.b))
                throw new Exception("P1 and P2 must be defined on the same elliptic curve.");

            EllipticPoint p3 = new EllipticPoint();
            p3.a = p1.a;
            p3.b = p1.b;
            p3.modulus = p1.modulus;

            BigInteger dy = p2.y - p1.y;
            BigInteger dx = p2.x - p1.x;
            if (dx < 0)
                dx += p1.modulus;
            if (dy < 0)
                dy += p1.modulus;

            // If dx has no inverse mod n, the gcd is a factor of n.
            // Signal it by storing the factor in 'a' and setting modulus to -1.
            if (GCD(dx, p1.modulus) != 1)
            {
                p3.a = GCD(dx, p1.modulus);
                p3.modulus = -1;
                return p3;
            }

            BigInteger m = (dy * dx.modInverse(p1.modulus)) % p1.modulus;
            if (m < 0)
                m += p1.modulus;

            p3.x = (Power(m, 2) - p1.x - p2.x) % p1.modulus;
            // Fixed: the whole expression must be reduced, not just p1.y.
            p3.y = (m * (p1.x - p3.x) - p1.y) % p1.modulus;
            if (p3.x < 0)
                p3.x += p1.modulus;
            if (p3.y < 0)
                p3.y += p1.modulus;
            return p3;
        }

        public static EllipticPoint operator -(EllipticPoint p)
        {
            EllipticPoint p2 = new EllipticPoint(p);
            p2.x = p.x;
            p2.y = -p.y;
            return p2;
        }

        public static EllipticPoint operator -(EllipticPoint p1, EllipticPoint p2)
        {
            EllipticPoint p3 = new EllipticPoint(p1);
            p3 = p1 + (-p2);
            return p3;
        }

        public static EllipticPoint Double(EllipticPoint p)
        {
            EllipticPoint p2 = new EllipticPoint();
            p2.a = p.a;
            p2.b = p.b;
            p2.modulus = p.modulus;

            // Slope of the tangent: (3x^2 + a) / (2y)
            BigInteger dy = 3 * Power(p.x, 2) + p.a;
            BigInteger dx = 2 * p.y;
            if (dx < 0)
                dx += p.modulus;
            if (dy < 0)
                dy += p.modulus;

            if (GCD(dx, p.modulus) != 1)
            {
                p2.a = GCD(dx, p.modulus);
                p2.modulus = -1;
                return p2;
            }

            BigInteger m = (dy * dx.modInverse(p.modulus)) % p.modulus;
            p2.x = (Power(m, 2) - p.x - p.x) % p.modulus;
            p2.y = (m * (p.x - p2.x) - p.y) % p.modulus;
            if (p2.x < 0)
                p2.x += p.modulus;
            if (p2.y < 0)
                p2.y += p.modulus;
            return p2;
        }
    }

    // Computes 2P, 3P, ..., (n + 1)P and returns the first factor that turns up.
    // Returns -1 if none does; the result can also be the modulus itself,
    // in which case pick another curve or point.
    public static BigInteger Factor(EllipticPoint p, int n)
    {
        // Choose b so that p really lies on the curve.
        p.b = (Math.Power(p.y, 2) - (Math.Power(p.x, 3) + p.a * p.x)) % p.modulus;
        if (p.b < 0)
            p.b += p.modulus;

        EllipticPoint p2 = EllipticPoint.Double(p);
        if (p2.modulus == -1)
            return p2.a;

        for (int i = 1; i < n; i++)
        {
            p2 = p2 + p;
            if (p2.modulus == -1)
                return p2.a;
        }
        return -1;
    }


Basically, go ahead and choose an elliptic curve and a point on that curve. You then feed this algorithm your number and voila, out pops a factor of it (usually the smallest prime factor; if nothing useful comes out, pick another curve and try again). It is an O(e^((1 + o(1))sqrt(2ln(p)ln(ln(p))))) algorithm, where p is the smallest factor of n. This makes it sub-exponential, but super-polynomial.
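
For comparison with the C# above, here is a compact Python sketch of the same recipe (random curve, random point, wait for a failed modular inversion). It is an illustration under assumptions, not jperalta's method: it multiplies the point by 2, 3, ..., `bound` in turn (the usual "stage 1") instead of adding P repeatedly, and the names `FactorFound`, `ec_add`, `ec_mul`, `ecm` and the default `bound`, `curves` and `seed` values are made up for this example. It needs Python 3.8+ for `pow(x, -1, n)` and expects an odd composite n.

```python
from math import gcd
import random

class FactorFound(Exception):
    """Raised when a slope denominator is not invertible mod n."""
    def __init__(self, factor):
        self.factor = factor

def ec_add(P, Q, a, n):
    """Add two points on y^2 = x^3 + a*x + b (mod n). None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % n == 0:
        return None                       # P + (-P)
    if P == Q:
        num, den = (3 * x1 * x1 + a) % n, (2 * y1) % n   # tangent slope
    else:
        num, den = (y2 - y1) % n, (x2 - x1) % n          # chord slope
    g = gcd(den, n)
    if g != 1:
        raise FactorFound(g)              # g divides n (it may be n itself)
    m = num * pow(den, -1, n) % n
    x3 = (m * m - x1 - x2) % n
    y3 = (m * (x1 - x3) - y1) % n
    return (x3, y3)

def ec_mul(k, P, a, n):
    """Compute k*P by double-and-add."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P, a, n)
        P = ec_add(P, P, a, n)
        k >>= 1
    return R

def ecm(n, bound=1000, curves=50, seed=0):
    """Try up to `curves` random curves; return a proper factor of n, or 0."""
    rng = random.Random(seed)
    for _ in range(curves):
        # b is implied by the point (b = y^2 - x^3 - a*x mod n) and is
        # never needed by the addition formulas.
        x, y, a = (rng.randrange(n) for _ in range(3))
        P = (x, y)
        try:
            for k in range(2, bound + 1):
                P = ec_mul(k, P, a, n)
                if P is None:
                    break                 # hit infinity mod n: no luck here
        except FactorFound as found:
            if found.factor != n:
                return found.factor
    return 0
```

Raising `bound` makes each curve more likely to succeed but slower, so in practice the two parameters are tuned to the size of the factor one hopes to find.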

Edit:

For comparison-
QS is O(e^((1 + o(1))sqrt(ln(n)ln(ln(n)))))
GNFS is O(e^((1.92 + o(1))ln(n)^(1/3)ln(ln(n))^(2/3)))

More info... my C# implementation of this takes about 1.3 seconds to factor a 12-digit product of two 6-digit primes.
The following pseudocode is called the Brent variation of the Pollard Rho algorithm. For an explanation, I highly recommend chapter 5 of this book:

Bressoud, David M., Factorization and Primality Testing, Springer-Verlag, 1989.

This algorithm is very nice for factoring moderate-size numbers for which trial division would be prohibitively slow. It's a probabilistic algorithm, however, and could have a very long running time for some input numbers.

A typical generalized factoring program would first perform trial division by all of the primes less than, say, a million. (You'll probably want to have a table of these primes available as a data file that you've precomputed using the sieve of Eratosthenes.) Once trial division has removed all of the small factors, the program would move on to more powerful, but still relatively simple, algorithms such as Pollard Rho. If those fail to completely factor the input number, then you have to pull out the big guns like the Multiple Polynomial Quadratic Sieve.
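
The first stage described above might look like the following Python sketch: build the prime table once with the sieve of Eratosthenes, then strip every small factor by trial division. The function names `primes_below` and `strip_small_factors` are hypothetical, and the table is built in memory rather than loaded from a data file.

```python
def primes_below(limit: int) -> list[int]:
    """Sieve of Eratosthenes: all primes p < limit (limit >= 2)."""
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"              # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Cross out i*i, i*i + i, ... in one slice assignment.
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return [i for i, flag in enumerate(sieve) if flag]

def strip_small_factors(n: int, primes: list[int]) -> tuple[list[int], int]:
    """Divide out every prime from the table. Returns (factors found, cofactor).

    If the loop stops early because p*p > n, the cofactor is 1 or a prime.
    Otherwise every prime factor of the cofactor is larger than the table,
    and it should be handed on to Pollard rho or a stronger method."""
    factors = []
    for p in primes:
        if p * p > n:
            break
        while n % p == 0:
            factors.append(p)
            n //= p
    return factors, n
```

With `primes_below(1_000_000)` as the table, any cofactor that survives and is below 10^12 is already known to be prime, which covers a good share of inputs before the heavier algorithms are needed.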

Anyway, here's the Pollard Rho algorithm:

    n = number to be factored
    c = constant value (see comment below)
    max = maximum number of outer rounds (depends on how long you're willing to wait;
          note that the range doubles on every round)
    check = how often to check the gcd (something around every 10 iterations or so)

    x1 = 2
    x2 = 4 + c
    range = 1
    product = 1
    terms = 0

    for (int i = 0; i < max; i++)
    {
       for (int j = 0; j < range; j++)
       {
          x2 = (x2 * x2 + c) mod n
          product = (product * (x1 - x2)) mod n
          if (++terms == check)
          {
             g = gcd(product, n)
             if (g > 1) return (g)
             product = 1
             terms = 0
          }
       }

       x1 = x2
       range *= 2
       for (int j = 0; j < range; j++)
       {
          x2 = (x2 * x2 + c) mod n
       }
    }

    return (0)


The input value of c needs to be an integer that makes the polynomial x^2 + c irreducible (and c = 0 and c = -2 should be avoided in any case). Ordinarily, you just start with c=1 and count up whenever you hit case 1 or 2 below.

This algorithm will return one of three possible values:

1) 0 means that the maximum number of iterations ran without a factor being found. Try choosing another value of c.
2) The number n itself means that the gcd did not find a proper factor. Try choosing another value of c.
3) Any other number means you found a nontrivial factor of n.
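
Put together, the pseudocode and the three return cases can be sketched in Python as follows. This is a direct transcription for illustration; the names `brent_rho` and `find_factor` and the defaults (`max_rounds=20`, `check=10`, `max_c=20`) are assumptions, and n should already be known to be composite, since a prime n simply burns through the whole budget.

```python
from math import gcd

def brent_rho(n: int, c: int = 1, max_rounds: int = 20, check: int = 10) -> int:
    """Brent's variant of Pollard rho. Returns 0, n, or a proper factor of n."""
    x1, x2 = 2, (4 + c) % n
    span, product, terms = 1, 1, 0
    for _ in range(max_rounds):
        for _ in range(span):
            x2 = (x2 * x2 + c) % n
            # Accumulate differences so that one gcd covers `check` steps.
            product = product * (x1 - x2) % n
            terms += 1
            if terms == check:
                g = gcd(product, n)
                if g > 1:
                    return g
                product, terms = 1, 0
        x1 = x2
        span *= 2
        for _ in range(span):             # advance without comparing
            x2 = (x2 * x2 + c) % n
    return 0

def find_factor(n: int, max_c: int = 20) -> int:
    """Retry with c = 1, 2, ... on cases 1 and 2; return 0 if every c fails."""
    for c in range(1, max_c + 1):
        g = brent_rho(n, c)
        if g not in (0, n):
            return g                      # case 3: nontrivial factor
    return 0
```

Batching the gcd is what makes this variant fast, at the price of case 2: if two different prime factors land in the same batch, the gcd comes out as n and the run has to be repeated with another c.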

-- Eric Lengyel

!!!!!!

It works!!!

I'm going to turn it into a function soon, and integrate it into my factorizer!

:) :) :) :) :) :)

jperalta: How exactly do you make an elliptic curve?

From,
Nice coder
Decent explanation of elliptic curves
Quote:
Maurer’s algorithm (Algorithm 4.62) generates random provable primes that are almost
uniformly distributed over the set of all primes of a specified size. The expected time for
generating a prime is only slightly greater than that for generating a probable prime of equal
size using Algorithm 4.44 with security parameter t = 1. (In practice, one may wish to
choose t > 1 in Algorithm 4.44; cf. Note 4.49.)


If you're interested in random generation of provable primes, you might want to take a look at this (page 152).
