Started by Oct 25 2012 10:12 AM

10 replies to this topic

Posted 25 October 2012 - 10:12 AM

Hello, I am working on some Project Euler problems, and one of them asks you to write a program that computes the biggest prime factor of a number. I have written a program in C that finds all primes up to the entered number, divides that number by the primes, and shows the biggest prime factor. It works just fine, but when I enter bigger numbers like a million and so on (and the number in the Project Euler problem is really big) it crashes. I think it might have issues with the performance of the program and the efficiency of my algorithm. This is my code:

[source lang="cpp"]
//Searches for number prime factors and return biggest prime factor 2012-10-23(24)
#include <stdio.h>
#include <stdlib.h>

int isPrime(int num); // checks if number is prime: 1 - prime 0 - non-prime
void setPrimeArray(int * arr, int num); // composes array of primes up to num to test factors
int * primeFactors(int * arr, int num);//finds prime factors of number
int biggestPrimeFactor(int * factors,int size);//finds biggest prime factor
int setArray (int * arr); // deletes 0 from array and returns setted array(using dynamic reallocation)

int main()
{
    int num = 0;
    int * numFactors = 0;
    int numOfFac = 0;
    int i = 0;
    printf("Enter the number whose prime factors you want to find: \n");
    scanf("%i", &num);
    int * primeArray = calloc(num,sizeof(int));
    setPrimeArray(primeArray,num);//optimize this function
    int s = setArray(primeArray);
    printf("Prime numbers up to %i:\n",num);
    for(;i < s; i++)
        printf("%i ",primeArray[i]);
    i = 0;
    // primeArray -=3;
    numFactors = primeFactors(primeArray, num);
    numOfFac = setArray(numFactors);
    printf("\nPrime Factors of %i is:\n",num);
    for(;i < numOfFac; i++)
        printf("%i ",numFactors[i]);
    printf("\nThe biggest prime factor of %i is: %i\n",num, biggestPrimeFactor(numFactors,numOfFac));
    system("PAUSE");
    free(primeArray);
    free(numFactors);
    return 0;
}

int isPrime(int num)
{
    int factor = 0;
    if (num == 1)
        return 1;
    int i = 1;
    for (; i <= num; i ++)
    {
        if( num % i == 0)
            factor++;
    }
    if (factor <= 2)
        return 1;
    else
        return 0;
}

void setPrimeArray(int * arr, int num)
{
    int i = 1;
    int j = 0;
    for(; i <= num; i++)
    {
        if(isPrime(i))
        {
            arr[j] = i;
            j++;
        }
    }
}

int * primeFactors(int * arr, int num)
{
    int i = 1;//not 0 - do not include prime factor 1, starts from 2
    int j = 0;//for indexing primeFactors array;
    int * factors = calloc(num, sizeof(int));//HERE WAS ERROR - MUST USE DYNAMIC MEMORY WITH FACTORS
    for(;i < num;)
    {
        if (num % arr[i] == 0)   //
        {
            factors[j] = arr[i]; //BIG, BIG
            //*factors = arr[i]; //
            //factors++;         //
            j++;                 // MISTAKE...
            num /= arr[i];       //
            i = 1;               //
        }
        else
            i++;
        if (num == 1)
            return factors;
    }
    return factors;
}

int biggestPrimeFactor(int * factors,int size)
{
    int i = 0;
    int max = factors[0];
    for(;i < size; i ++)
    {
        if (factors[i] > max)
            max = factors[i];
    }
    return max;
}

int setArray (int * arr)
{
    int i = 0;
    while(*arr != 0)
    {
        arr++;
        i++;
    }
    arr = realloc(arr, sizeof(int)*i);
    return i;
}
[/source]

And I think I need to optimize this function:

[source lang="cpp"]
void setPrimeArray(int * arr, int num)
{
    int i = 1;
    int j = 0;
    for(; i <= num; i++)
    {
        if(isPrime(i))
        {
            arr[j] = i;
            j++;
        }
    }
}
[/source]

This function takes a lot of time with bigger numbers. It builds an array of primes up to the entered number, so I can later use it to find that number's prime factors. So if anyone has ideas on how I can optimize my code, please share them with me.

Deltron Zero and Automator.

Posted 25 October 2012 - 10:59 AM

Yea that's going to be extremely slow for large numbers. You're reserving memory for num integers when really you only need memory proportional to the number of prime factors of num at the end of the day. So if you entered 1,000,000 you're looking at almost 4 megs of RAM reserved, roughly 977 pages at 4 KB each. As you iterate through all that, you're swapping pages constantly from CPU cache to memory and vice versa. That's generally a slow(ish) operation.

While your solution certainly works, it's not terribly efficient overall in terms of memory (as you've seen). My suggestion is to go back to the drawing board and see if you can figure out how to calculate the primes without requiring so much memory. You're almost there, you're just being a little overzealous with calloc...
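To make the memory point concrete, here is a rough sketch rather than a drop-in fix. A value that fits in 64 bits has at most 63 prime factors counted with multiplicity (2 multiplied 64 times no longer fits), so a small fixed-size array is enough and nothing proportional to num has to be allocated. `collect_prime_factors` and `MAX_FACTORS` are names made up for this example, not part of the original program. The loop also stops once p*p exceeds what is left of n, which is an extra shortcut on top of the memory change.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity: an unsigned 64-bit value has at most 63 prime
   factors counted with multiplicity, so 64 slots are always enough. */
#define MAX_FACTORS 64

/* Writes the prime factors of n (ascending, with repeats) into out and
   returns how many were written. For n < 2 it writes nothing. */
size_t collect_prime_factors(uint64_t n, uint64_t out[MAX_FACTORS])
{
    size_t count = 0;

    /* p <= n / p is the same test as p * p <= n, without overflow. */
    for (uint64_t p = 2; p <= n / p; ++p) {
        while (n % p == 0) {
            out[count++] = p;
            n /= p;
        }
    }

    /* Whatever is left above 1 has no divisor up to its square root,
       so it is itself prime. */
    if (n > 1)
        out[count++] = n;

    return count;
}
```

With this shape the caller declares `uint64_t factors[MAX_FACTORS];` on the stack and the biggest prime factor is simply the last element written.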

Posted 25 October 2012 - 11:03 AM

Memory allocation isn't the problem, since that extra memory will never be used; it will just sit there.

The problem is that the big primes are so far apart, and you test way too many numbers.

You could cut the work in half if you only test odd numbers (i+=2), but there are probably smarter ways...

But even better, since prime numbers never change, why not precalculate them once, save to file, and then load the prime table from file instead?
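A minimal sketch of the odd-numbers idea, under a few assumptions: `is_prime_odd_steps` and `fill_primes` are hypothetical names, the caller provides an output array that is large enough, and `limit` is well below `INT_MAX`. Unlike the original `isPrime`, this version does not treat 1 as prime, and it also stops testing divisors once d*d exceeds n, which goes a little beyond the i+=2 suggestion.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if n is prime. Even numbers are handled up front, so the
   loop only tries odd divisors 3, 5, 7, ... */
bool is_prime_odd_steps(int n)
{
    if (n < 2)
        return false;
    if (n % 2 == 0)
        return n == 2;

    /* d <= n / d is the same test as d * d <= n, without overflow. */
    for (int d = 3; d <= n / d; d += 2) {
        if (n % d == 0)
            return false;
    }
    return true;
}

/* Stores every prime <= limit in out (ascending) and returns the count.
   Assumes out has room for all of them and limit is well below INT_MAX. */
size_t fill_primes(int *out, int limit)
{
    size_t count = 0;

    if (limit >= 2)
        out[count++] = 2;

    /* Only odd candidates are tested, which skips half of the range. */
    for (int i = 3; i <= limit; i += 2) {
        if (is_prime_odd_steps(i))
            out[count++] = i;
    }
    return count;
}
```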

**Edited by Olof Hedman, 25 October 2012 - 11:08 AM.**

Posted 25 October 2012 - 11:11 AM

Yea disregard what I said, Olof is correct. You're not accessing the majority of pages so your memory allocation is indeed not causing the problem. I still disagree with the magnitude of the allocation though...

Posted 25 October 2012 - 11:42 AM

Thanks for the reply. I will change it to test just odd numbers; if it is still slow, I will have to change the algorithm or load the prime table from a file. Thanks.

Deltron Zero and Automator.

Posted 25 October 2012 - 11:55 AM

Not really sure how much of a hint you want here but there's an abstract data type very suited to this type of search. I won't name it, but the next paragraph contains problem spoilers (if that's such a thing...):

Each number has a pair of factors that are, themselves, numbers that are either primes or smaller numbers with pairs of factors. If you keep splitting your number this way, eventually you'll have a particular type of data structure containing a bunch of numbers at the very "bottom" (my word, not the actual term typically used) that are all prime. All you have to do at that point is pick the right one...

You don't necessarily need to store factors in a file, though there's no reason not to if you really want to do it.
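For anyone who does want the spoiler in code form, the splitting described above could be sketched roughly like this. It assumes unsigned 64-bit input, and `largest_leaf` is a made-up name. The function splits n into a pair of factors, recurses into both halves, and treats any number that cannot be split as a prime at the "bottom".

```c
#include <stdint.h>

/* Returns the largest prime reached by repeatedly splitting n into a
   pair of factors. Assumes n >= 2. */
uint64_t largest_leaf(uint64_t n)
{
    /* Look for the smallest divisor d with d * d <= n
       (written as d <= n / d to avoid overflow). */
    for (uint64_t d = 2; d <= n / d; ++d) {
        if (n % d == 0) {
            /* n splits into the pair (d, n / d); keep splitting both. */
            uint64_t left = largest_leaf(d);
            uint64_t right = largest_leaf(n / d);
            return left > right ? left : right;
        }
    }

    /* No divisor found: n cannot be split further, so it is prime. */
    return n;
}
```

The recursion stays shallow because every split at least halves one side, so a 64-bit input never nests more than about 64 levels deep.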

Posted 25 October 2012 - 12:07 PM

The main algorithmic improvement here is dividing only by numbers up to the square root of the number.

[source lang="cpp"]
long largest_prime_factor(long n) {
  // Take out all the factors of 2
  while (n%2==0)
    n/=2;
  if (n==1)
    return 2;

  // Take out all the factors of 3
  while (n%3==0)
    n/=3;
  if (n==1)
    return 3;

  for (long p=5; p*p<=n; p+=6) {
    // Take out all the factors of p
    while (n%p==0)
      n/=p;
    if (n==1)
      return p;

    long q = p+2;
    // Take out all the factors of q
    while (n%q==0)
      n/=q;
    if (n==1)
      return q;
  }

  return n; // If there's something left that doesn't have a divisor <= sqrt(n), then n is prime and we can return it.
}
[/source]

Posted 25 October 2012 - 12:28 PM

Hi. You don't need to calculate all the factors, or check whether they are prime or not. You just need to divide the number starting from 2 until you reach a prime number.

I have written something quickly and it seems to be working.

[source lang="cpp"]
#include <stdio.h>

int main() {
    int number;
    printf("Enter number: ");
    scanf("%d", &number);
    for (int i = 2; i < number; ++i) {
        while (number % i == 0)
            number /= i;
    }
    printf("Biggest prime: %d\n", number);
    return 0;
}
[/source]

**Edited by aavci, 25 October 2012 - 12:32 PM.**

Posted 25 October 2012 - 07:15 PM

The main algorithmic improvement here is dividing only by numbers up to the square root of the number.

...code snippet...

Nice. Here's my version:

[source lang="cpp"]
int largestPrimeFactor = 2;
for (int i=2; i<=n; ++i) {
    while (n % i == 0) {
        n/=i;
        largestPrimeFactor = i;
    }
}
[/source]

Posted 27 October 2012 - 08:28 PM

aavci's code doesn't seem to handle situations where the largest prime divides the number twice (try number=4).

alnite's code is correct, but it will take a very long time to run when n is a large prime, which might make it unsuitably slow for the Project Euler challenge.
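One possible way to address both remarks at once, offered as a sketch rather than the definitive answer: remember the last divisor that was removed (so an input like 4 gives 2), and stop as soon as p*p exceeds what is left of n (so a large prime input only costs about sqrt(n) iterations). The name `largest_prime_factor_u64` and the use of `uint64_t` are choices made for this example.

```c
#include <stdint.h>

/* Returns the largest prime factor of n, or 0 when n < 2. */
uint64_t largest_prime_factor_u64(uint64_t n)
{
    uint64_t largest = 0;

    /* p <= n / p is the same test as p * p <= n, without overflow. */
    for (uint64_t p = 2; p <= n / p; ++p) {
        while (n % p == 0) {
            largest = p;   /* remember the divisor before removing it */
            n /= p;
        }
    }

    /* Anything left above 1 has no divisor up to its square root, so it
       is prime and larger than every divisor removed so far. */
    if (n > 1)
        largest = n;

    return largest;
}
```

Called with 4 this returns 2, and with a prime such as 1000000007 it returns the input after roughly 31,000 loop iterations instead of a billion.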

Posted 27 October 2012 - 08:49 PM

Also, trial division isn't the only way to factorize numbers. Basically, factoring is a difficult problem, and it is in fact infeasible to do it when the prime factors are larger than a few hundred digits. Fortunately, you're not dealing with such large numbers.

I would expect Project Euler to specifically choose a large number to prevent you from using trial division, forcing you to implement a better algorithm. Look into Pollard's rho algorithm, or Pollard's p - 1. There's also a continued fraction algorithm (SQUFOF) which is somewhat arcane, but works. Quadratic sieve is the next step up but is considerably harder to implement. Number Field Sieve (NFS) is state-of-the-art but good luck implementing that, seriously. You can also look into elliptic curve factorization (elliptic curve method or ECM for short) which is pretty good for average-size factors but is a bit tricky to get right and difficult to understand - you need to understand elliptic curves.

For reference, quadratic sieve can factorize a 100-digit semiprime into two 50-digit primes in four hours or so. Trial division would have done the same in, say, a few trillion years.

All of this is probably overkill for 64-bit integers though.
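For the curious, here is a bare-bones sketch of Pollard's rho with Floyd cycle detection, using the usual polynomial x*x + c. It is only an illustration under stated assumptions: n is composite and greater than 3, the helper names (`addmod`, `mulmod`, `gcd_u64`, `pollard_rho`) are made up for this example, and the modular multiplication uses slow but portable double-and-add so that nothing overflows 64 bits. A single call can fail, in which case it returns 0 and the caller would retry with a different c.

```c
#include <stdint.h>

/* (a + b) mod m for a, b < m, without overflowing 64 bits. */
static uint64_t addmod(uint64_t a, uint64_t b, uint64_t m)
{
    return (a >= m - b) ? a - (m - b) : a + b;
}

/* (a * b) mod m using double-and-add, so no 128-bit type is needed. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m)
{
    uint64_t result = 0;
    a %= m;
    while (b > 0) {
        if (b & 1)
            result = addmod(result, a, m);
        a = addmod(a, a, m);
        b >>= 1;
    }
    return result;
}

static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Tries to find a non-trivial factor of a composite n > 3.
   Returns 0 if this attempt failed (retry with another c). */
uint64_t pollard_rho(uint64_t n, uint64_t c)
{
    if (n % 2 == 0)
        return 2;

    uint64_t x = 2, y = 2, d = 1;
    c %= n;

    while (d == 1) {
        /* x advances one step, y advances two steps. */
        x = addmod(mulmod(x, x, n), c, n);
        y = addmod(mulmod(y, y, n), c, n);
        y = addmod(mulmod(y, y, n), c, n);
        d = gcd_u64(x > y ? x - y : y - x, n);
    }

    /* d == n means the two sequences met without exposing a factor. */
    return d == n ? 0 : d;
}
```

The result is just some factor, not necessarily a prime one, so a full factorization would apply it recursively together with a primality test.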

**Edited by Bacterius, 27 October 2012 - 08:53 PM.**

*“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”*