How can I make this use less memory? [Sieve of Eratosthenes]

Started by
35 comments, last by Grain 18 years, 9 months ago
If you just want to make heaps and heaps of primes theres a pretty easy way to do it.

Make a list of small primes up to the suqare root of the number you want the primes to. (so if you want primes to 2^32, grab them to 2^16.)

Load up an array, call it 'counters'.
Set primes up the same way.
In counters shove your primes.
Set powarr as the two powers from 1 to 32

For i = 0 to n / 32Testbyte = ~0For p = 0 to number_of_Primes_in_listif counters /32 = i then 'You can do something to speed this up, like precomputing it in another table, but thats another story.    Testbyte &= ~powarr[counters % i] //Get rid of that byte    counters = counters + primes 'Move the counters to the next oneend ifIf testbyte = 0 then goto next_loop 'If there are no more possible primes,' then why bother testing any more? I know goto's are evil, but this is the only' way i can do itNext pFor tmp = 1 to 32 'Check the bits on the tmpbyte to see if we have anything.if testbyte & powarr[tmp]  then   print i * 32 + tmp " Is a PRIME!!!"end ifnext tmpnext_loop:Next i


This code should work (well, its psudocode), but i'm not making any guarentees.

Simply, its letting you sieve a bit slower, but with o(sqr(n)) memory. (which will help heaps if you ever want to find every prime to 2^64).
You might be able to speed it up quite a bit some some 'upgrades'. (things like storing counters

/32 when you make it.)

Hth.
From,
Nice coder

Click here to patch the mozilla IDN exploit, or click Here then type in Network.enableidn and set its value to false. Restart the browser for the patches to work.
Advertisement
Original post by Zahlman
    for (int i = 0; i < count; ++i) {      if (!(m % s)) { // current value is divisible by a former prime;                         // discard it        prime = false; break;        // In case you're wondering, solving this with a goto is just as tricky;        // you'd need to put it at the *end* of the outer loop, with no        // action following it, which is rather bizarre.      }    } 



I don’t see how that will work. You only compare the current number to be considered with the existing primes. Won’t you need to compare it with ALL the numbers from 2 to half it self to see if it’s prime, and not just the known primes smaller than it self?

No, you don't need to. All of the non-primes smaller than N are made up of the primes in that set. For example, 24 is not a prime, it's quite obviously divisible by 6. As such, it must also be divisible by 3 and 2, so why make more work for yourself by checking it against 6 once you know that it divides by 2 and 3? By the same token, 23 is a prime number. 10 is just under half of 23. However, 10 cleanly divides by 5 and 2. If 23 doesn't divide by 5 or 2, then it can't be divisible by 10, so why check it?

There's a mathematical law (can't remember which) that states that every number N is the product of two primes, p and p'. Thus, if a number is not divisible by any primes, then it won't be divisible by non-primes either, since they themselves are the products of smaller primes.
Quote:Original post by Motz
No, you don't need to. All of the non-primes smaller than N are made up of the primes in that set. For example, 24 is not a prime, it's quite obviously divisible by 6. As such, it must also be divisible by 3 and 2, so why make more work for yourself by checking it against 6 once you know that it divides by 2 and 3? By the same token, 23 is a prime number. 10 is just under half of 23. However, 10 cleanly divides by 5 and 2. If 23 doesn't divide by 5 or 2, then it can't be divisible by 10, so why check it?

There's a mathematical law (can't remember which) that states that every number N is the product of two primes, p and p'. Thus, if a number is not divisible by any primes, then it won't be divisible by non-primes either, since they themselves are the products of smaller primes.


0k. but for some reason that method painfully slow. I just copied and pasted Zahlmans code and compared it side by side with the bit array method, and the difference is ridiculous. That method takes about a minute and a half for just 1 million ints. As where the actual Sieve of Eratosthenes using bit aray takes less than a second.

Just for reference here is the entire program as it is now
#include <iostream>#include <iomanip>#include <vector>#include "bits.h"void Sieve(bits& s ,unsigned int n){	unsigned int m,i;	    	std::cout << "Filling set: " << 1 << " to " << n << std::endl << std::endl;	s.resize(n);	s.set(); // set(), clear() with no argument sets/clears all elements in the bit object;	s.clear(0);	s.clear(1);		std::cout << "Set Filled, " /*<< s.size() << " Items."<< std::endl /**/;	std::cout << "Removing non-primes:" << std::endl << std::endl;		m = 0;	for(i=0; i < n; i++)		if(s)		{			m++;			for(int k = (i+i); k < n; k+=i)				s.clear(k); 		}		std::cout << "Non-primes removed: " << m << " items remaining" << std::endl;	std::cout << "Primes ratio: " << (float)m/(float)n  << std::endl; }void Sieve(std::vector<unsigned int>& s ,unsigned int n) {  unsigned int m,i;  std::cout << "Calculating primes: " <<  std::endl <<  std::endl;  s.erase(s.begin(),s.end());  //s.push_back(1);  s.push_back(2);  for(m=3; m<=n; m+=2) {    int count = s.size();    bool prime = true;    for (int i = 0; i < count; ++i) {      if (!(m % s)) { // current value is divisible by a former prime;                         // discard it        prime = false; break;        // In case you're wondering, solving this with a goto is just as tricky;        // you'd need to put it at the *end* of the outer loop, with no        // action following it, which is rather bizarre.      }    }    if (prime) s.push_back(m);  }  std::cout << s.size() << " primes calculated." << std::endl;}int main(){		unsigned int count = 0;	unsigned int N = 1000000;	bits Bit(N);		std::vector<unsigned int> primes;	std::vector<unsigned int>::iterator itr;	Sieve(Bit,N);			Sieve(primes,N);}


//FILE: bits.h#include <vector>class bits{	std::vector<unsigned char> B;public:	bits(unsigned int size)	{		B.resize(1 + size/8);	}	void resize(unsigned int size)	{		B.resize(1 + size/8);	}	unsigned int size()	{		return B.size()*8;	}	void set(unsigned int bit)	{		unsigned int p = bit/8;		unsigned int b = bit%8;		unsigned char m = 1 << b;		B = B | m;	}	void set()	{		for(unsigned int i = 0; i < this->size(); i++)			set(i);	}	void clear(unsigned int bit)	{		unsigned int p = bit/8;		unsigned int b = bit%8;		unsigned char m = 1 << b;		m = ~m;		B = B & m;	}	void clear()	{		for(unsigned int i = 0; i < this->size(); i++)			clear(i);	}		bool operator [](unsigned int bit)	{		unsigned int p = bit/8;		unsigned int b = bit%8;		unsigned char m = 1 << b;		m = m & B;		m = m >> b;		return m;	}};

EDIT: Using the Bit array is insanely fast. Way faster than my original set method. I just rechecked my original size of 35 million ints which took about 1 minuet using std::set. But with the bit array it took only a couple seconds!!

[Edited by - Grain on July 4, 2005 8:17:39 AM]
Quote:Original post by Motz
There's a mathematical law (can't remember which) that states that every number N is the product of two primes, p and p'.


Thats wrong, just take 24 which is 2*3*4, which are more than two primes.
The theorem you remebered may be this one:
http://mathworld.wolfram.com/FundamentalTheoremofArithmetic.html


To Grain:
Are you looking to find ALL the primes, or are you just interested if a number is prime ?
For instance: (from here)
With the exception of 2 and 3, all primes are of the form p=6n+1 or p=6n-1.

And there are other tests for primes.

For the case of the Sieve, you may think about implementing your own virtual memory manager. Since on a 32 bit machine, your address space is as small as 2 or 3 GB (you need some space for the OS), you may want to manage the primes found in a big file on the harddisk. Then you should group your tests in blocks, i.e. 512byte/sector times 8primes/byte are 4096primes/sector. So you may test many new numbers against all 4096 primes of a sector, before you advance to the next sector. This way you can virtually increase your address space to many gigabytes. Also the OS will cache accesses, you only need to make sure that you work on the same data as long as possible.
Also remember that all primes above 2 end in 1,3,7 or 9.

So, instead of populating your array with loads of numbers, just populate them with those that finish in these digits.
Quote:Original post by Grain
0k. but for some reason that method painfully slow. I just copied and pasted Zahlmans code and compared it side by side with the bit array method, and the difference is ridiculous. That method takes about a minute and a half for just 1 million ints. As where the actual Sieve of Eratosthenes using bit aray takes less than a second.

Just for reference here is the entire program as it is now
*** Source Snippet Removed ***

*** Source Snippet Removed ***
EDIT: Using the Bit array is insanely fast. Way faster than my original set method. I just rechecked my original size of 35 million ints which took about 1 minuet using std::set. But with the bit array it took only a couple seconds!!


Try compiling with optimisations and bound your tests in the second version and you'll find they're much closer (although the bit array is still faster):
void Sieve(std::vector<unsigned int>& s ,unsigned int n) {  unsigned int m;  std::cout << "Calculating primes: " <<  std::endl <<  std::endl;  s.erase(s.begin(),s.end());  //s.push_back(1);  s.push_back(2);  int root = 2;  for(m=3; m<=n; m+=2) {    int count = s.size();    bool prime = true;    while (root * root < m)    {      ++root;    }    for (int i = 0; i < count && s <= root; ++i) {      if (!(m % s)) { // current value is divisible by a former prime;                         // discard it        prime = false; break;        // In case you're wondering, solving this with a goto is just as tricky;        // you'd need to put it at the *end* of the outer loop, with no        // action following it, which is rather bizarre.      }    }    if (prime) s.push_back(m);  }  std::cout << s.size() << " primes calculated." << std::endl;}


Enigma

[Edited by - Enigma on July 4, 2005 9:05:20 AM]
Quote:Original post by nmi
Quote:Original post by Motz
There's a mathematical law (can't remember which) that states that every number N is the product of two primes, p and p'.


Thats wrong, just take 24 which is 2*3*4, which are more than two primes.
The theorem you remebered may be this one:
http://mathworld.wolfram.com/FundamentalTheoremofArithmetic.html


4 isn't a prime, it's the product of 2 and 2. But I was wrong anyway, just checked back on a simulation I did a while ago ;) Sorry about that. Anyhow, the point it supported still stands, since every non-prime is divisable by 1 or more primes and thus you needn't try dividing by non-primes.

Quote:Original post by Prozak
Also remember that all primes above 2 end in 1,3,7 or 9.

So, instead of populating your array with loads of numbers, just populate them with those that finish in these digits.


Think of it in binary.
All primes but 2 end with the last bit set to 1.
Quote:Original post by Motz
Quote:Original post by nmi
Quote:Original post by Motz
There's a mathematical law (can't remember which) that states that every number N is the product of two primes, p and p'.


Thats wrong, just take 24 which is 2*3*4, which are more than two primes.
The theorem you remebered may be this one:
http://mathworld.wolfram.com/FundamentalTheoremofArithmetic.html


4 isn't a prime, it's the product of 2 and 2. But I was wrong anyway, just checked back on a simulation I did a while ago ;) Sorry about that. Anyhow, the point it supported still stands, since every non-prime is divisable by 1 or more primes and thus you needn't try dividing by non-primes.


Sorry for that, i wrote too fast ;-)
Should have written 2*3*5=30. Anyways, a number may have more than only two prime factors.

For the thing that you need not have to divide by non primes, thats correct. But isnt the sieve working like that anyways ? If a prime is found, all multiples of that prime are marked as non-prime.
Or the other way around, if you check a number N=n*n for being prime, you check the prime numbers from 2 to n, if they can divide that number N. If they don't divide it without remainder, then N is a prime.

This topic is closed to new replies.

Advertisement