Integer factorization, the fast way?


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

20 replies to this topic

#1 Nice Coder   Members   -  Reputation: 366

Posted 28 October 2004 - 08:35 PM

OK, I've been building a small integer factorizer. It is currently very slow: it takes 2.8 seconds to factorize 102,812,541,257 into 53 * 6,199 * 312,931.

It currently does a few things. First, it checks the common divisors (2, 3, 5, 7 and 11); if the number is divisible by those, it has the factors almost instantly. Then it takes the square root of the number. If that is < 200, it does a trial-division search (all numbers < 200, takes < 1 second). Finally, if it is > 200, it uses a modified sieve of Eratosthenes: it checks n, and if n isn't a factor, then 2n, 3n, 4n, 5n and so on will not be factors either, so you can cross them off your list.

It is quite slow for big numbers, but it spends most of its time removing all the multiples of the common factors (2, 3, 5, 7 and 11), and I don't know how to get rid of them beforehand.

My question is: what is a method that is faster than my sieve and not too hard to understand or code? (Without explanation, some things I just don't understand yet...)

From, Nice coder

#2 Alex Swinney   Members   -  Reputation: 236

Posted 29 October 2004 - 05:12 AM

That is actually the sieve of Eratosthenes [edit: can't spell] ;)

To make it as fast as possible, implement an Extended Euclidean Algorithm. That is about as fast as you'll find. It's also what is used for a vast majority of encryption/decryption techniques, so you know it is fast.

You will have to adapt it to your situation, though.

Here's a link for javascript implementation:
http://en.wikipedia.org/wiki/Extended_Euclidean_algorithm

Here's a link with a more algorithmic-orientation:
http://www-math.cudenver.edu/~wcherowi/courses/m5410/exeucalg.html

Hope this helps ya.
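
For reference, here is a minimal Python sketch of the extended Euclidean algorithm mentioned above (the function name `extended_gcd` is just an illustrative choice). Note that it computes gcd(a, b) and the coefficients x, y with a*x + b*y = gcd(a, b); on its own it does not split a number into prime factors. The gcd it produces is, however, a building block of factoring methods such as Pollard's rho, and the coefficients give modular inverses.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Standard Euclid step, carried out on the remainders and
        # on both coefficient sequences at the same time.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def mod_inverse(a, m):
    """Inverse of a modulo m, or None when gcd(a, m) != 1."""
    g, x, _ = extended_gcd(a % m, m)
    return x % m if g == 1 else None
```

When `mod_inverse` returns None for a modulus you are trying to factor, the gcd that caused the failure is itself a divisor of the modulus, which is the trick the elliptic curve method relies on.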

#3 Anonymous Poster_Anonymous Poster_*   Guests   -  Reputation:

Posted 29 October 2004 - 07:23 PM

Quote:
Original post by Alex Swinney
That is actually the sieve of Eratosthenese[edit can't spell]..;)

To make it as fast as possible, implement an Extended Euclidean Algorithm. That is about as fast as you'll find. It's also what is used for a vast majority of encryption/decryption techniques, so you know it is fast.

You will have to adapt it to your situation, though.

Here's a link for javascript implementation:
http://en.wikipedia.org/wiki/Extended_Euclidean_algorithm

Here's a link with a more algorithmic-orientation:
http://www-math.cudenver.edu/~wcherowi/courses/m5410/exeucalg.html

Hope this helps ya.


Thanks, it does help me a bit :)

I'm limited to 100 trillion (15 digits before VB stuffs up the numbering system), so any algorithm which isn't too slow should work.

What I was wondering about, though, is how to use that to get the factors...

I also wouldn't mind dumping the sieve for something a bit more elegant (my sieve is a little hacky... well, maybe more than a little).

From,
Nice coder

#4 Nice Coder   Members   -  Reputation: 366

Posted 29 October 2004 - 08:52 PM

? i was logged in before...
Above post is mine

From,
Nice coder

#5 LilBudyWizer   Members   -  Reputation: 495

Posted 30 October 2004 - 06:29 AM

You can get rid of them by taking their product and checking only candidates of the form product*n +/- r, where r is 1 or one of the primes greater than the highest prime in the product and less than half the product. To eliminate all multiples of the first five primes takes a list of 184 primes. For example, if your product is 2 you check 2n+1; for 2*3 you check 6n+/-1; for 2*3*5 you check 30n+/-{1,7,11,13}; for 2*3*5*7 you check 210n+/-{1,11,13,...,97,101,103}.
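
The idea above is usually called wheel factorization. The Python below is a minimal sketch of it, assuming a wheel built from 2, 3 and 5 by default; the names `wheel_residues` and `factorize` are illustrative. Rather than hard-coding the +/- list, it computes every residue coprime to the product, which lets you check how many candidates a bigger wheel really needs.

```python
import math


def wheel_residues(primes=(2, 3, 5)):
    """Return (m, residues): m is the product of the wheel primes and
    residues are the values in 1..m that share no factor with m."""
    m = math.prod(primes)
    return m, [r for r in range(1, m + 1) if math.gcd(r, m) == 1]


def factorize(n, primes=(2, 3, 5)):
    """Trial division that only tries divisors of the form m*k + r."""
    factors = []
    for p in primes:                 # strip the wheel primes themselves
        while n % p == 0:
            factors.append(p)
            n //= p
    m, residues = wheel_residues(primes)
    base = 0
    while True:
        for r in residues:
            d = base + r
            if d == 1:
                continue             # 1 is not a useful divisor
            if d * d > n:            # nothing left below sqrt(n)
                if n > 1:
                    factors.append(n)
                return factors
            while n % d == 0:
                factors.append(d)
                n //= d
        base += m
```

With the 2*3*5 wheel only 8 of every 30 numbers are tried. Some candidates are composite (49, for example), but they can never divide n by the time they are reached, because their prime factors have already been removed.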

#6 Nice Coder   Members   -  Reputation: 366

Posted 30 October 2004 - 04:00 PM

Quote:
Original post by LilBudyWizer
You can get rid of them by taking their product and checking +/-1 and +/- all the primes greater than the highest prime in the product and less than half the product. To eliminate all multiples of the first five primes takes a list of 184 primes. So for example say your product was 2 then you check 2n+1, for 2*3 you check 6n+/-1, for 2*3*5 you check 30n+/-{1,7,11,13}, for 2*3*5*7 you check 210n+/-{1,11,13,...,97,101,103).


Thank you!

Now if only I could find a way to implement one of the faster approaches... (so that I wouldn't need to do this at all)

From,
Nice coder





#7 Jareds411   Members   -  Reputation: 108

Posted 30 October 2004 - 04:24 PM

This should do the trick.
#include <cstdlib>
#include <iostream>

using namespace std;

int main()
{
    long long number, j;   // 64-bit; the original used the MSVC-only __int64
    short int again;

    system("cls");         // "cls" and "PAUSE" are Windows console commands
    cout << "\n\n\t\t\t    Prime Finder\n\n";
    cout << "\n\n\t\t\t By: Jared Stewart.\n\n\n\n\n\t\t   ";
    system("PAUSE");

    do
    {
        system("cls");
        cout << "Please enter a number to get the prime factors: ";
        cin >> number;

        system("cls");
        cout << "The prime factors are: ";

        // Strip factors of 2 first; a final 2 is left to be printed below.
        while (number > 2 && number % 2 == 0)
        {
            cout << 2 << " ";
            number = number / 2;
        }

        // Only odd candidates remain; stop once j exceeds sqrt(number).
        for (j = 3; j * j <= number; )
        {
            if (number % j == 0)          // check if it's a factor
            {
                cout << j << " ";
                number = number / j;      // make the number to check smaller
            }
            else
                j += 2;
        }

        cout << number;                   // whatever is left is the last factor
        cout << "\n\n";
        cout << "To test a new number, press 1.  To quit, press 2.\n\n\n";
        cin >> again;
    } while (again == 1);
}


#8 Nice Coder   Members   -  Reputation: 366

Posted 31 October 2004 - 05:24 PM

Quote:
Original post by Jareds411
This should do the trick.


Trial division,
very slow!

OK, I got rid of my sieve (it's slower than trial division!).

And I replaced it with very simple trial division:


Dim sn As Double
Dim i As Double
Dim j As Double
Dim fin As Double
Dim t As Double
Dim k As Double
Dim out(1) As Double
' Search upwards from 13 and downwards from sqrt(number) at the same time.
sn = Int(Sqr(number)) + 1
fin = Int(sn / 2) + 1
i = 12
j = sn + 1
Do
    j = j - 1
    i = i + 1
    k = number / i
    t = number / j
    If Int(k) = k Then      ' i divides number
        out(0) = k
        out(1) = i
        factorize = out
        Exit Function
    End If
    If Int(t) = t Then      ' j divides number
        out(0) = t
        out(1) = j
        factorize = out
        Exit Function
    End If
    DoEvents
Loop Until j < fin
' No divisor found: report the number itself
out(0) = 1
out(1) = number
factorize = out


Very simple, not very effective (but it still finds factors that are close to 13 or close to sqr(n) quite fast).

Is there any faster way?
From,
Nice coder

#9 LilBudyWizer   Members   -  Reputation: 495

Posted 31 October 2004 - 08:59 PM

There are faster ways, but they involve fun and games with number theory. An example is Quadratic Sieve.

#10 Nice Coder   Members   -  Reputation: 366

Posted 31 October 2004 - 09:29 PM

!!!!!

That's a little... (OK, way) over my head.

OK, now a few questions about that link...

First, what is p?

Second, how do you solve the congruences on the factor base?

What is Dixon's factorization?
What is Gaussian elimination?

The quadratic prime sieve is nice...
But the factorization sieve is very complicated.

From,
Nice coder



#11 deffer   Members   -  Reputation: 754

Posted 01 November 2004 - 03:08 AM

If you can afford a heuristic (probabilistic) algorithm, there is a lot of great stuff (and I mean really reliable):

- Rabin-Miller test - checks whether the number (n) is prime:
the input can be any small number (d), and if 'n' is composite, a single round discovers that with probability at least 1/2. Not much, but you can repeat the test several times with different 'd' as needed. My teachers used to joke that the probability of the processor making a mistake is larger than that of this algorithm failing (and it's true).
But using it for factorization is something to think over...

- Rho heuristic - finds a divisor of n. While the common sieve takes O(n^(1/2)), rho is expected to take about O(d^(1/2)), where d is the smallest divisor of n (so at worst O(n^(1/4))).
But with small probability it may run forever, so when it runs too long, just stop it and try something else...

Search for them, or look them up in the well-known "Introduction to Algorithms" by Thomas Cormen et al.

/def
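
A minimal Python sketch of the Rabin-Miller test described above, assuming random witnesses and a default of 20 rounds (the function name and the `rounds` parameter are illustrative choices, not from any particular library). A prime is never rejected, and a composite survives each round with probability at most 1/4.

```python
import random


def is_probable_prime(n, rounds=20, rng=random):
    """Rabin-Miller: False means certainly composite, True means probably prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):      # cheap small-prime screen
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)     # the "small number" used as a witness
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue                    # this witness says nothing against n
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False                # a proves that n is composite
    return True
```

In a factorizer this is mostly useful as a stopping rule: once the remaining cofactor passes the test, there is no point in feeding it to a factor-finding routine.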

#12 jperalta   Members   -  Reputation: 356

Posted 01 November 2004 - 08:50 AM

The three top factorization algorithms are (in order of goodness):

The General Number Field Sieve
The Elliptic Curve Method
The Quadratic Sieve

The GNFS is the best overall algorithm, whereas ECM is better than GNFS only for log-base10(n) <= ~80 which makes GNFS more useful in practical applications (such as factoring 'n' in RSA). The ECM is also much easier to implement if you're not terribly great with number theory.

#13 Nice Coder   Members   -  Reputation: 366

Posted 01 November 2004 - 07:54 PM

Quote:
Original post by deffer
If you can afford having heuristic algorithm, there is a lot of great stuff (I mean really reliable):

- Rabbin-Miller's test - to check if the number(n) is prime:
the entry can be any small number(d), and the chance to discover, that 'n' is prime is 1/2. Not much, but you can repeat the test several times with different 'd' as needed. My teachers were joking, that the probability of the processor to mistake is larger than this algorithm to fail (and it's true).
But the usage of this for factorization is a thing to rethink...

- Rho-heuristic - to find a divisor of n. While common sieve takes O(n^(1/2)), the Rho takes up to O(d^(1/2)), where d is a smallest divisor of n (so at worse O(n^(1/4)) ).
But it may work forever with a small probability, so when it works too long, just stop it and try something else...

Search for them, or look up in 'well known' Thomas Cormen's "Introduction to algorithms"

/def


OK, I've heard of the Rabin-Miller test (but don't know how to do it, I guess I'll have to [google]).

The rho heuristic is new to me, would you mind explaining more?

jperalta:
How exactly does the elliptic curve factorization work? (I'm limited to 100 trillion, so I guess it'll be one of the best for that.)

From,
Nice coder

#14 deffer   Members   -  Reputation: 754

Posted 01 November 2004 - 08:20 PM

Quote:
Original post by Nice Coder
The rho-heuristic is new to me, would you mind explaining more?


It's actually Pollard's rho heuristic.
I don't remember the details well (so google it, as always), but the general idea is that 'n' and its divisor 'd' share some properties when you take numbers modulo 'n'.
If you take (almost) any starting number and keep iterating it with the_function_I_do_not_remember_at_the_moment (which is as simple as x^2, btw) modulo n, it will sooner or later fall into a loop. Then, from the numbers in that loop, you will know the divisor you were looking for.

\\As far as I remember (correct me!):
When proving that the method works, we show that if we knew 'd' and took the same numbers modulo d, we would fall into a loop as well, typically much sooner.

As you can see, I am unable to give you a precise answer without sources, which I don't have anywhere near me. But I am pretty sure it is all in Cormen.

/def
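
A minimal Python sketch of the loop idea described above, assuming the usual iteration x -> x^2 + c (mod n) and Floyd's tortoise-and-hare cycle detection (the name `pollard_rho_floyd` and the defaults c=1, x0=2 are illustrative). Two values that collide modulo the unknown divisor d differ by a multiple of d, so the gcd of their difference with n exposes d.

```python
import math


def pollard_rho_floyd(n, c=1, x0=2):
    """Return a nontrivial divisor of an odd composite n, or None if this
    choice of c and x0 fails (in that case retry with another c)."""
    def step(x):
        return (x * x + c) % n

    tortoise = hare = x0
    while True:
        tortoise = step(tortoise)        # moves one step
        hare = step(step(hare))          # moves two steps
        g = math.gcd(abs(tortoise - hare), n)
        if g == n:
            return None                  # the loop closed modulo n itself
        if g > 1:
            return g                     # the loop closed modulo a divisor first
```

For n = 8051 = 83 * 97 with the defaults, the third step compares 677 and 871; their difference is 194 and gcd(194, 8051) = 97.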

#15 Nice Coder   Members   -  Reputation: 366

Posted 01 November 2004 - 09:26 PM

Quote:
Original post by deffer
Quote:
Original post by Nice Coder
The rho-heuristic is new to me, would you mind explaining more?


It's Rho-Pollard's heuristic in fact.
I don't remember the details well (so gog. as always), but the general idea is, that 'n' and it's divisor 'd' show some common properties, while taking some numbers modulo 'n'.
If you take (almost) any starting number and start iterating it with the_function_I_do_not_remember_at_the_moment (which is as simple as x^2, btw), modulo n, then it will sooner or later fall into a loop. Then you will know, from the numbers from that loop, what the divisor you were looking for is.

\\As far as I remember(correct me!):
While prooving the method's functionality, we show, that if we knew 'd', and were taking the same numbers, but modulo d, we would fall into a loop in the same time.

As you can see, I am unable to give you precise answer without any sources, that I do not have anywhere near me. But I am pretty sure it is all in Cormen.

/def


ok.
I found This And This

From google.

I still don't understand it much. Is there any chance of finding an algorithm? (In pseudocode, BASIC, or just a walk-through.)

From,
Nice coder

[Edited by - Nice Coder on November 2, 2004 4:26:26 AM]

#16 jperalta   Members   -  Reputation: 356

Posted 02 November 2004 - 05:23 AM

Quote:
Original post by Nice Coder
How exactly does the eleptic curve factorization work? (i'm limited to 100 trillion, so i guess it'll be one of the best for that).


This is how I did it (in C#):


// Elliptic curve factorization methods
// BigInteger, GCD, Power and Math.Power are assumed to be helpers defined
// elsewhere in the poster's code (not shown here).

/// <summary>
/// This class describes a point on an elliptic curve that is described
/// in the form y^2 congruent to x^3 + ax + b (mod n)
/// </summary>
public class EllipticPoint
{
    public BigInteger x;
    public BigInteger y;
    public BigInteger a;
    public BigInteger b;
    public BigInteger modulus;

    public EllipticPoint(EllipticPoint p)
    {
        x = p.x;
        y = p.y;
        a = p.a;
        b = p.b;
        modulus = p.modulus;
    }

    public EllipticPoint()
    {
        x = new BigInteger();
        y = new BigInteger();
        a = new BigInteger();
        b = new BigInteger();
        modulus = new BigInteger();
    }

    public static EllipticPoint operator +(EllipticPoint p1, EllipticPoint p2)
    {
        if ((p1.modulus != p2.modulus) || (p1.a != p2.a) || (p1.b != p2.b))
            throw new Exception("P1 and P2 must be defined on the same elliptic curve.");
        EllipticPoint p3 = new EllipticPoint();
        p3.a = p1.a;
        p3.b = p1.b;
        p3.modulus = p1.modulus;

        BigInteger dy = p2.y - p1.y;
        BigInteger dx = p2.x - p1.x;

        if (dx < 0)
            dx += p1.modulus;
        if (dy < 0)
            dy += p1.modulus;

        // dx has no inverse mod n: the gcd is a factor of n.
        // Signal this with modulus = -1 and return the factor in 'a'.
        if (GCD(dx, p1.modulus) != 1)
        {
            p3.a = GCD(dx, p1.modulus);
            p3.modulus = -1;
            return p3;
        }

        // Slope of the line through p1 and p2
        BigInteger m = (dy * dx.modInverse(p1.modulus)) % p1.modulus;
        if (m < 0)
            m += p1.modulus;
        p3.x = (Power(m, 2) - p1.x - p2.x) % p1.modulus;
        // The whole expression must be reduced, not just p1.y
        p3.y = (m * (p1.x - p3.x) - p1.y) % p1.modulus;
        if (p3.x < 0)
            p3.x += p1.modulus;
        if (p3.y < 0)
            p3.y += p1.modulus;
        return p3;
    }

    public static EllipticPoint operator -(EllipticPoint p)
    {
        EllipticPoint p2 = new EllipticPoint(p);
        p2.x = p.x;
        p2.y = -p.y;
        return p2;
    }

    public static EllipticPoint operator -(EllipticPoint p1, EllipticPoint p2)
    {
        EllipticPoint p3 = new EllipticPoint(p1);
        p3 = p1 + (-p2);
        return p3;
    }

    public static EllipticPoint Double(EllipticPoint p)
    {
        EllipticPoint p2 = new EllipticPoint();
        p2.a = p.a;
        p2.b = p.b;
        p2.modulus = p.modulus;

        // Slope of the tangent at p: (3x^2 + a) / (2y)
        BigInteger dy = 3 * Power(p.x, 2) + p.a;
        BigInteger dx = 2 * p.y;

        if (dx < 0)
            dx += p.modulus;
        if (dy < 0)
            dy += p.modulus;

        if (GCD(dx, p.modulus) != 1)
        {
            p2.a = GCD(dx, p.modulus);
            p2.modulus = -1;
            return p2;
        }

        BigInteger m = (dy * dx.modInverse(p.modulus)) % p.modulus;
        p2.x = (Power(m, 2) - p.x - p.x) % p.modulus;
        p2.y = (m * (p.x - p2.x) - p.y) % p.modulus;
        if (p2.x < 0)
            p2.x += p.modulus;
        if (p2.y < 0)
            p2.y += p.modulus;

        return p2;
    }
}

// Computes 2P, 3P, ... up to (n + 1)P and returns the factor found when a
// point addition fails, or -1 if every addition succeeded.
public static BigInteger Factor(EllipticPoint p, int n)
{
    // Choose b so that the starting point lies on the curve
    p.b = (Math.Power(p.y, 2) - (Math.Power(p.x, 3) + p.a * p.x)) % p.modulus;
    if (p.b < 0)
        p.b += p.modulus;

    EllipticPoint p2 = EllipticPoint.Double(p);

    if (p2.modulus == -1)
        return p2.a;

    for (int i = 1; i < n; i++)
    {
        p2 = p2 + p;
        if (p2.modulus == -1)
            return p2.a;
    }

    return -1;
}






Basically, go ahead and choose an elliptic curve and a point on that curve. You then feed this algorithm your number and voila, out pops a factor of it (typically the smallest one); if it returns -1 instead, try a different curve or point. It is an O(e^((1 + o(1))sqrt(2ln(p)ln(ln(p))))) algorithm, where p is the smallest factor of n. This makes it sub-exponential but super-polynomial.

Edit:

For comparison-
QS is O(e^((1 + o(1))sqrt(ln(n)ln(ln(n)))))
GNFS is O(e^((1.92 + o(1))ln(n)^(1/3)ln(ln(n))^(2/3)))

More info... my C# implementation of this takes about 1.3 seconds to factor a 12-digit product of two 6-digit primes.

#17 Eric Lengyel   Crossbones+   -  Reputation: 2174

Posted 02 November 2004 - 05:37 PM

The following pseudocode is called the Brent variation of the Pollard Rho algorithm. For an explanation, I highly recommend chapter 5 of this book:

Bressoud, David M., Factorization and Primality Testing, Springer-Verlag, 1989.

This algorithm is very nice for factoring moderate-size numbers for which trial division would be prohibitively slow. It's a probabilistic algorithm, however, and could have a very long running time for some input numbers.

A typical generalized factoring program would first perform trial division by all of the primes less than, say, a million. (You'll probably want to have a table of these primes available as a data file that you've precomputed using the sieve of Eratosthenes.) Once trial division has removed all of the small factors, the program would move on to more powerful, but still relatively simple, algorithms such as Pollard Rho. If those fail to completely factor the input number, then you have to pull out the big guns like the Multiple Polynomial Quadratic Sieve.

Anyway, here's the Pollard Rho algorithm:


n = number to be factored
c = constant value (see comment below)
max = maximum number of outer passes (range doubles on every pass, so this depends on how long you're willing to wait)
check = how often to check gcd (something around every 10 iterations or so)

x1 = 2
x2 = 4 + c
range = 1
product = 1
terms = 0

for (int i = 0; i < max; i++)
{
    for (int j = 0; j < range; j++)
    {
        x2 = (x2 * x2 + c) mod n
        product = (product * (x1 - x2)) mod n
        if (++terms == check)
        {
            g = gcd(product, n)
            if (g > 1) return (g)
            product = 1
            terms = 0
        }
    }

    x1 = x2
    range *= 2
    for (int j = 0; j < range; j++)
    {
        x2 = (x2 * x2 + c) mod n
    }
}

return (0)




The input value of c needs to be an integer that makes the polynomial x^2 + c irreducible. Ordinarily, you just start with c=1 and count up for cases 1 and 2 below.

This algorithm will return one of three possible values:

1) 0 means that the maximum number of iterations ran without a factor being found. Try choosing another value of c.
2) The number n itself means that the gcd did not find a proper factor. Try choosing another value of c.
3) Any other number means you found a nontrivial factor of n.

-- Eric Lengyel
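
A runnable Python sketch of the pseudocode above, keeping its three return cases. The step budget `max_steps`, the wrapper `find_factor`, and its fallback to plain trial division are assumptions added for illustration, not part of the original algorithm.

```python
import math


def pollard_rho_brent(n, c=1, max_steps=10_000, check=10):
    """Return 0 (gave up), n (unlucky gcd) or a nontrivial factor of n."""
    x1 = 2
    x2 = (4 + c) % n
    span = 1                      # called "range" in the pseudocode
    product = 1
    terms = 0
    steps = 0
    while steps < max_steps:
        for _ in range(span):
            x2 = (x2 * x2 + c) % n
            product = product * (x1 - x2) % n
            steps += 1
            terms += 1
            if terms == check:    # batch several differences into one gcd
                g = math.gcd(product, n)
                if g > 1:
                    return g
                product = 1
                terms = 0
        x1 = x2
        span *= 2
        for _ in range(span):     # skip ahead without testing
            x2 = (x2 * x2 + c) % n
    return 0


def find_factor(n, tries=20):
    """Try c = 1, 2, ... and fall back to trial division if rho keeps failing.
    Returns n itself when n is prime."""
    for c in range(1, tries + 1):
        g = pollard_rho_brent(n, c)
        if 1 < g < n:
            return g
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n
```

Calling `find_factor` repeatedly on the pieces it returns, and stopping when a piece is prime, gives a full factorization.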

#18 Nice Coder   Members   -  Reputation: 366

Posted 02 November 2004 - 06:21 PM

Quote:
Original post by Eric Lengyel
The following pseudocode is called the Brent variation of the Pollard Rho algorithm. For an explanation, I highly recommend chapter 5 of this book:

Bressoud, David M., Factorization and Primality Testing, Springer-Verlag, 1989.

This algorithm is very nice for factoring moderate-size numbers for which trial division would be prohibitively slow. It's a probabilistic algorithm, however, and could have a very long running time for some input numbers.

A typical generalized factoring program would first perform trial division by all of the primes less than, say, a million. (You'll probably want to have a table of these primes available as a data file that you've precomputed using the sieve of Eratosthenes.) Once trial division has removed all of the small factors, the program would move on to more powerful, but still relatively simple, algorithms such as Pollard Rho. If those fail to completely factor the input number, then you have to pull out the big guns like the Multiple Polynomial Quadratic Sieve.

Anyway, here's the Pollard Rho algorithm:

*** Source Snippet Removed ***

The input value of c needs to be an integer that makes the polynomial x^2 + c irreducible. Ordinarily, you just start with c=1 and count up for cases 1 and 2 below.

This algorithm will return one of three possible values:

1) 0 means that the maximum number of iterations ran without a factor being found. Try choosing another value of c.
2) The number n itself means that the gcd did not find a proper factor. Try choosing another value of c.
3) Any other number means you found a nontrivial factor of n.

-- Eric Lengyel


!!!!!!

It works!!!

I'm going to turn it into a function soon, and integrate it into my factorizer!

:) :) :) :) :) :)
[grin][grin][wink][grin]

jperalta: How exactly do you make an elliptic curve?

From,
Nice coder
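
On making an elliptic curve: one common approach, and the one the Factor function posted earlier implies, is to pick x, y and a at random modulo n and then solve y^2 = x^3 + ax + b for b, so the point is on the curve by construction. A minimal Python sketch, with illustrative function names:

```python
import math
import random


def random_curve_and_point(n, rng=random):
    """Pick x, y, a at random mod n and derive b so that (x, y) lies on
    y^2 = x^3 + a*x + b (mod n)."""
    x = rng.randrange(n)
    y = rng.randrange(n)
    a = rng.randrange(n)
    b = (y * y - (x * x * x + a * x)) % n
    return a, b, (x, y)


def is_on_curve(a, b, point, n):
    x, y = point
    return (y * y - (x * x * x + a * x + b)) % n == 0


def discriminant_gcd(a, b, n):
    """gcd(4a^3 + 27b^2, n). A value of 1 means the curve is usable;
    a value strictly between 1 and n is already a factor of n."""
    return math.gcd((4 * a ** 3 + 27 * b * b) % n, n)
```

If a curve fails to produce a factor, the usual move is simply to draw another random curve and point and try again.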


#19 igni ferroque   Members   -  Reputation: 415

Posted 03 November 2004 - 05:53 PM

Decent explanation of elliptic curves

#20 xor   Members   -  Reputation: 516

Posted 03 November 2004 - 06:03 PM

Quote:

Maurer’s algorithm (Algorithm 4.62) generates random provable primes that are almost
uniformly distributed over the set of all primes of a specified size. The expected time for
generating a prime is only slightly greater than that for generating a probable prime of equal
size using Algorithm 4.44 with security parameter t = 1. (In practice, one may wish to
choose t > 1 in Algorithm 4.44; cf. Note 4.49.)


If you're interested in random generation of provable primes, you might want to take a look at this (page 152).



