# Integer factorization, the fast way?

This topic is 4828 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

Ok, i've been building a small integer factorizer. It is currenly very slow. It takes 2.8 seconds to factorize 102,812,541,257 into 53 * 6,199 * 312,931. It is currently using a few things, First, it checks the common divisors (2,3,5,7 and 11). if it is divisible by those, then it has the factors almost instantly. Then it sqrt's the number. if it is < 200 then it does a trial division search (all numbers < 200, takes < 1 second) And finally, if it is > 200, then it uses a modified sieve of Eratosthenes. Basically, it checks n. if n isn't a factor, then 2n, 3n, 4n, 5n, on. will not be a factor, so you can rule them off your list. It is quite slow for big numbers, but spends most of its time removing all the common factors (2, 3, 5,7 and 11) and i don't know how to get rid of them beforehand. My question is: What is a method, which is faster then my sieve, and which is not too hard (without explanation, some things i just don't understand yet...), to understand or code. From, Nice coder

##### Share on other sites
That is actually the sieve of Eratosthenese[edit can't spell]..;)

To make it as fast as possible, implement an Extended Euclidean Algorithm. That is about as fast as you'll find. It's also what is used for a vast majority of encryption/decryption techniques, so you know it is fast.

Here's a link for javascript implementation:
http://en.wikipedia.org/wiki/Extended_Euclidean_algorithm

Here's a link with a more algorithmic-orientation:
http://www-math.cudenver.edu/~wcherowi/courses/m5410/exeucalg.html

Hope this helps ya.

##### Share on other sites
Quote:
 Original post by Alex SwinneyThat is actually the sieve of Eratosthenese[edit can't spell]..;)To make it as fast as possible, implement an Extended Euclidean Algorithm. That is about as fast as you'll find. It's also what is used for a vast majority of encryption/decryption techniques, so you know it is fast.You will have to adapt it to your situation, though.Here's a link for javascript implementation:http://en.wikipedia.org/wiki/Extended_Euclidean_algorithmHere's a link with a more algorithmic-orientation:http://www-math.cudenver.edu/~wcherowi/courses/m5410/exeucalg.htmlHope this helps ya.

Thanks, it does help me a bit :)

I'm limited to 100 trillion (15 digits before vb stuffs up the numbering system.) so, any algorith which isn't too slow should work.

What i was wondering about, tho is how to use that to get the factors...

I also wouldn't mind dumping the seive, for something a bit more elegent (my sieve is a little hacky... well, maybe more then a little).

From,
Nice coder

##### Share on other sites
? i was logged in before...
Above post is mine

From,
Nice coder

##### Share on other sites
You can get rid of them by taking their product and checking +/-1 and +/- all the primes greater than the highest prime in the product and less than half the product. To eliminate all multiples of the first five primes takes a list of 184 primes. So for example say your product was 2 then you check 2n+1, for 2*3 you check 6n+/-1, for 2*3*5 you check 30n+/-{1,7,11,13}, for 2*3*5*7 you check 210n+/-{1,11,13,...,97,101,103).

##### Share on other sites
Quote:
 Original post by LilBudyWizerYou can get rid of them by taking their product and checking +/-1 and +/- all the primes greater than the highest prime in the product and less than half the product. To eliminate all multiples of the first five primes takes a list of 184 primes. So for example say your product was 2 then you check 2n+1, for 2*3 you check 6n+/-1, for 2*3*5 you check 30n+/-{1,7,11,13}, for 2*3*5*7 you check 210n+/-{1,11,13,...,97,101,103).

Thank you!

Now if only i could find a way to implement one of the faster approaches... (so that i wouldn' need to do this at all)

From,
Nice coder

##### Share on other sites
This should do the trick.
#include <stdlib.h>#include <iostream.h>int main(){__int64 number, j, k;short int again; system("cls"); cout << "\n\n\t\t\t Prime Finder\n\n"; cout << "\n\n\t\t\t By: Jared Stewart.\n\n\n\n\n\t\t "; system("PAUSE"); system("cls"); do { system("cls"); cout << "Please enter a number to get the prime factors: "; cin >> number; j = 2; system("cls"); cout << "The prime factors are: "; do { if (number == 2) break; if (number % j == 0 ) //check if its a factor { cout << j << " "; number = number/j; //make the number to check smaller } else j++; } while (j < 3); for (j =3; j <= sqrt(number); ) { if (number % j == 0 ) //check if its a factor { cout << j << " "; number = number/j; //make the number to check smaller } else j+=2; } cout << number; cout << "\n\n"; cout << "To test a new number, press 1. To quit, press 2.\n\n\n"; cin >> again;} while (again ==1);}

##### Share on other sites
Quote:
 Original post by Jareds411This should do the trick.

Trial division,
very slow!

ok, i got rid of my sieve (its slower then trial division!).

And i replaced it with very simple trial division

Dim sn As DoubleDim i As DoubleDim j As DoubleDim fin As DoubleDim t As DoubleDim k As DoubleDim out(1) As Doublesn = Int(Sqr(n)) + 1fin = Int(sn / 2) + 1i = 12j = sn + 1Doj = j - 1i = i + 1k = number / it = number / jIf Int(k) = k Then    out(0) = k    out(1) = i    factorize = out    Exit FunctionEnd IfIf Int(t) = t Then    out(0) = t    out(1) = j    factorize = out    Exit FunctionEnd IfDoEventsLoop Until j < finout(0) = 1out(1) = numberfactorize = out

Very simple, not very effective (but it still finds factors that are close to 13 and close to sqr(n), quite fast.

Is there any faster way?
From,
Nice coder

##### Share on other sites
There are faster ways, but they involve fun and games with number theory. An example is Quadratic Sieve.

##### Share on other sites
!!!!!

Thats a little... (ok way) over my head.

First, what is p?

Second, How Do you solve the congruencies on the factor base?

What is Dixons factorization?
What is Gausian elemination?

The quadratic prime sieve is nice...
But the factorization sieve is very complicated.

From,
Nice coder

##### Share on other sites
If you can afford having heuristic algorithm, there is a lot of great stuff (I mean really reliable):

- Rabbin-Miller's test - to check if the number(n) is prime:
the entry can be any small number(d), and the chance to discover, that 'n' is prime is 1/2. Not much, but you can repeat the test several times with different 'd' as needed. My teachers were joking, that the probability of the processor to mistake is larger than this algorithm to fail (and it's true).
But the usage of this for factorization is a thing to rethink...

- Rho-heuristic - to find a divisor of n. While common sieve takes O(n^(1/2)), the Rho takes up to O(d^(1/2)), where d is a smallest divisor of n (so at worse O(n^(1/4)) ).
But it may work forever with a small probability, so when it works too long, just stop it and try something else...

Search for them, or look up in 'well known' Thomas Cormen's "Introduction to algorithms"

/def

##### Share on other sites
The three top factorization algorithms are (in order of goodness):

The General Number Field Sieve
The Elliptic Curve Method

The GNFS is the best overall algorithm, whereas ECM is better than GNFS only for log-base10(n) <= ~80 which makes GNFS more useful in practical applications (such as factoring 'n' in RSA). The ECM is also much easier to implement if you're not terribly great with number theory.

##### Share on other sites
Quote:
 Original post by defferIf you can afford having heuristic algorithm, there is a lot of great stuff (I mean really reliable):- Rabbin-Miller's test - to check if the number(n) is prime:the entry can be any small number(d), and the chance to discover, that 'n' is prime is 1/2. Not much, but you can repeat the test several times with different 'd' as needed. My teachers were joking, that the probability of the processor to mistake is larger than this algorithm to fail (and it's true).But the usage of this for factorization is a thing to rethink...- Rho-heuristic - to find a divisor of n. While common sieve takes O(n^(1/2)), the Rho takes up to O(d^(1/2)), where d is a smallest divisor of n (so at worse O(n^(1/4)) ).But it may work forever with a small probability, so when it works too long, just stop it and try something else...Search for them, or look up in 'well known' Thomas Cormen's "Introduction to algorithms"/def

Ok, i've heard of the Rabbin Millers test (but don't know how to do it, i guess i'll have to [google])

The rho-heuristic is new to me, would you mind explaining more?

jperalta:
How exactly does the eleptic curve factorization work? (i'm limited to 100 trillion, so i guess it'll be one of the best for that).

From,
Nice coder

##### Share on other sites
Quote:
 Original post by Nice CoderThe rho-heuristic is new to me, would you mind explaining more?

It's Rho-Pollard's heuristic in fact.
I don't remember the details well (so gog. as always), but the general idea is, that 'n' and it's divisor 'd' show some common properties, while taking some numbers modulo 'n'.
If you take (almost) any starting number and start iterating it with the_function_I_do_not_remember_at_the_moment (which is as simple as x^2, btw), modulo n, then it will sooner or later fall into a loop. Then you will know, from the numbers from that loop, what the divisor you were looking for is.

\\As far as I remember(correct me!):
While prooving the method's functionality, we show, that if we knew 'd', and were taking the same numbers, but modulo d, we would fall into a loop in the same time.

As you can see, I am unable to give you precise answer without any sources, that I do not have anywhere near me. But I am pretty sure it is all in Cormen.

/def

##### Share on other sites
Quote:
Original post by deffer
Quote:
 Original post by Nice CoderThe rho-heuristic is new to me, would you mind explaining more?

It's Rho-Pollard's heuristic in fact.
I don't remember the details well (so gog. as always), but the general idea is, that 'n' and it's divisor 'd' show some common properties, while taking some numbers modulo 'n'.
If you take (almost) any starting number and start iterating it with the_function_I_do_not_remember_at_the_moment (which is as simple as x^2, btw), modulo n, then it will sooner or later fall into a loop. Then you will know, from the numbers from that loop, what the divisor you were looking for is.

\\As far as I remember(correct me!):
While prooving the method's functionality, we show, that if we knew 'd', and were taking the same numbers, but modulo d, we would fall into a loop in the same time.

As you can see, I am unable to give you precise answer without any sources, that I do not have anywhere near me. But I am pretty sure it is all in Cormen.

/def

ok.
I found This And This

I still don't understand it much, is there any chance of finding an algarithm? (in psudocode, basic, or just a walk-through)

From,
Nice coder

[Edited by - Nice Coder on November 2, 2004 4:26:26 AM]

##### Share on other sites
Quote:
 Original post by Nice CoderHow exactly does the eleptic curve factorization work? (i'm limited to 100 trillion, so i guess it'll be one of the best for that).

This is how I did it (in C#):

// Elliptic curve factorization methods        /// <summary>        /// This class describes a point on an elliptic curve that is described        /// in the form y^2 congruent to x^3 + ax + b (mod n)        /// </summary>        public class EllipticPoint        {            public BigInteger x;            public BigInteger y;            public BigInteger a;            public BigInteger b;            public BigInteger modulus;            public EllipticPoint(EllipticPoint p)            {                x = p.x;                y = p.y;                a = p.a;                b = p.b;                modulus = p.modulus;            }            public EllipticPoint()            {                x = new BigInteger();                y = new BigInteger();                a = new BigInteger();                b = new BigInteger();                modulus = new BigInteger();            }            public static EllipticPoint operator +(EllipticPoint p1, EllipticPoint p2)            {                if ((p1.modulus != p2.modulus) || (p1.a != p2.a) || (p1.b != p2.b))                    throw new Exception("P1 and P2 must be defined on the same elliptic curve.");                EllipticPoint p3 = new EllipticPoint();                p3.a = p1.a;                p3.b = p1.b;                p3.modulus = p1.modulus;                BigInteger dy = p2.y - p1.y;                BigInteger dx = p2.x - p1.x;                if (dx < 0)                    dx += p1.modulus;                if (dy < 0)                    dy += p1.modulus;                if (GCD(dx, p1.modulus) != 1)                {                    p3.a = GCD(dx, p1.modulus);                    p3.modulus = -1;                    return p3;                }                BigInteger m = (dy * dx.modInverse(p1.modulus)) % p1.modulus;                if (m < 0)                    m += p1.modulus;                p3.x = (Power(m, 2) - p1.x - p2.x) % p1.modulus;                p3.y = m * (p1.x - p3.x) - p1.y % p1.modulus;                if (p3.x < 0)                    p3.x += p1.modulus;                if (p3.y < 0)                    p3.y += p1.modulus;                return p3;            }            public static EllipticPoint operator -(EllipticPoint p)            {                EllipticPoint p2 = new EllipticPoint(p);                p2.x = p.x;                p2.y = -p.y;                return p2;            }            public static EllipticPoint operator -(EllipticPoint p1, EllipticPoint p2)            {                EllipticPoint p3 = new EllipticPoint(p1);                p3 = p1 + (-p2);                return p3;            }            public static EllipticPoint Double(EllipticPoint p)            {                EllipticPoint p2 = new EllipticPoint();                p2.a = p.a;                p2.b = p.b;                p2.modulus = p.modulus;                BigInteger dy = 3 * Power(p.x, 2) + p.a;                BigInteger dx = 2 * p.y;                if (dx < 0)                    dx += p.modulus;                if (dy < 0)                    dy += p.modulus;                if (GCD(dx, p.modulus) != 1)                {                    p2.a = GCD(dx, p.modulus);                    p2.modulus = -1;                    return p2;                }                BigInteger m = (dy * dx.modInverse(p.modulus)) % p.modulus;                p2.x = (Power(m, 2) - p.x - p.x) % p.modulus;                p2.y = (m * (p.x - p2.x) - p.y) % p.modulus;                if (p2.x < 0)                    p2.x += p.modulus;                if (p2.y < 0)                    p2.y += p.modulus;                return p2;            }        }        public static BigInteger Factor(EllipticPoint p, int n)        {            p.b = (Math.Power(p.y, 2) - (Math.Power(p.x, 3) + p.a * p.x)) % p.modulus;            if (p.b < 0)                p.b += p.modulus;            EllipticPoint p2 = EllipticPoint.Double(p);            if (p2.modulus == -1)                return p2.a;            for (int i = 1; i < n; i++)            {                p2 = p2 + p;                if (p2.modulus == -1)                    return p2.a;            }            return -1;        }

Basically, go ahead and choose an elliptic curve and a point on that curve. You then feed this algorithm your number and voila, out pops the smallest factor of it. It is an... O(e^((1 + o(1))sqrt(2ln(p)ln(ln(p)))) algorithm where p is the smallest factor of n. This makes it sub-exponentional, but super-polynomial.

Edit:

For comparison-
QS is O(e^((1 + o(1))sqrt(ln(n)ln(ln(n)))))
GNFS is O(e^((1.92 + o(1))ln(n)^(1/3)ln(ln(n))^(2/3)))

More info... my C# implementation of this takes about 1.3 seconds to factor a 12-digit product of two 6-digit primes.

##### Share on other sites
The following pseudocode is called the Brent variation of the Pollard Rho algorithm. For an explanation, I highly recommend chapter 5 of this book:

Bressoud, David M., Factorization and Primality Testing, Springer-Verlag, 1989.

This algorithm is very nice for factoring moderate-size numbers for which trial division would be prohibitively slow. It's a probabilistic algorithm, however, and could have a very long running time for some input numbers.

A typical generalized factoring program would first perform trial division by all of the primes less than, say, a million. (You'll probably want to have a table of these primes available as a data file that you've precomputed using the sieve of Eratosthenes.) Once trial division has removed all of the small factors, the program would move on to more powerful, but still relatively simple, algorithms such as Pollard Rho. If those fail to completely factor the input number, then you have to pull out the big guns like the Multiple Polynomial Quadratic Sieve.

Anyway, here's the Pollard Rho algorithm:

n = number to be factoredc = constant value (see comment below)max = maximum iterations (depends on how long you're willing to wait)check = how often to check gdc (something around every 10 iterations or so)x1 = 2x2 = 4 + crange = 1product = 1terms = 0for (int i = 0; i < max; i++){   for (int j = 0; j < range; j++)   {      x2 = (x2 * x2 + c) mod n      product *= (x1 - x2) mod n      if (++terms == check)      {         g = gdc(product, n)         if (g > 1) return (g)         product = 1         terms = 0      }   }   x1 = x2   range *= 2   for (int j = 0; j < range; j++)   {      x2 = (x2 * x2 + c) mod n   }}return (0)

The input value of c needs to be an integer that makes the polynomial x^2 + c irreducible. Ordinarily, you just start with c=1 and count up for cases 1 and 2 below.

This algorithm will return one of three possible values:

1) 0 means that the maximum number of iterations ran without a factor being found. Try choosing another value of c.
2) The number n itself means that the gcd did not find a proper factor. Try choosing another value of c.
3) Any other number means you found a nontrivial factor of n.

-- Eric Lengyel

##### Share on other sites
Quote:
 Original post by Eric LengyelThe following pseudocode is called the Brent variation of the Pollard Rho algorithm. For an explanation, I highly recommend chapter 5 of this book:Bressoud, David M., Factorization and Primality Testing, Springer-Verlag, 1989.This algorithm is very nice for factoring moderate-size numbers for which trial division would be prohibitively slow. It's a probabilistic algorithm, however, and could have a very long running time for some input numbers.A typical generalized factoring program would first perform trial division by all of the primes less than, say, a million. (You'll probably want to have a table of these primes available as a data file that you've precomputed using the sieve of Eratosthenes.) Once trial division has removed all of the small factors, the program would move on to more powerful, but still relatively simple, algorithms such as Pollard Rho. If those fail to completely factor the input number, then you have to pull out the big guns like the Multiple Polynomial Quadratic Sieve.Anyway, here's the Pollard Rho algorithm:*** Source Snippet Removed ***The input value of c needs to be an integer that makes the polynomial x^2 + c irreducible. Ordinarily, you just start with c=1 and count up for cases 1 and 2 below.This algorithm will return one of three possible values:1) 0 means that the maximum number of iterations ran without a factor being found. Try choosing another value of c.2) The number n itself means that the gcd did not find a proper factor. Try choosing another value of c.3) Any other number means you found a nontrivial factor of n.-- Eric Lengyel

!!!!!!

It workes!!!

I'm going to turn it into a function soon, and integrate it into my factorizor!

:) :) :) :) :) :)
[grin][grin][wink][grin]

jperalta: How exactly do you make an eliptic curve?

From,
Nice coder

##### Share on other sites
Quote:
 Maurer’s algorithm (Algorithm 4.62) generates random provable primes that are almostuniformly distributed over the set of all primes of a specified size. The expected time forgenerating a prime is only slightly greater than that for generating a probable prime of equalsize using Algorithm 4.44 with security parameter t = 1. (In practice, one may wish tochoose t > 1 in Algorithm 4.44; cf. Note 4.49.)

If you're interested in random generation of provable primes, you might want to take a look at this(pag 152).

##### Share on other sites
Basicaly, as far as I can tell from all sources that use this method you pick your curve in a rather arbitrary fashion. That is you pick your point P = (x,y) which fixes and x and y and pick an a and solve the equation y^2 = x^3 + ax + b for b.