This topic is 3077 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

Some time ago a bunch of guys here recommended me Haskell as my first functional language. I've read some tutorials and followed the advice of one guy to try to solve the problems in Project Euler to get some pratice. I have solved a couple of problems and found Haskell to be really cool to this sort of math programming, but I have found one that seems an inherently imperative problem.
Quote:
 By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6^(th) prime is 13. What is the 10001^(st) prime number?
without optimizing too much, this seems fairly easy in c (assuming number doesn't overflows, about which I can't be sure):
int i = 0;
long long number = 2;
while (i < 10001)
{
if (isPrime(number++))
i++;
}

Now I'm very puzzled about how to solve this in Haskell. I think we can make a recursive list requiring that each element is strictly greater than the previous one and that each element is prime. Is this the functional way to do this? Any better approach? I don't want the solution, just a theoretical discussion about the functional way to do this sort of things.

##### Share on other sites
Rewrite that C code as a recursive solution, that should then more easily translate into Haskell [smile].

##### Share on other sites
When looking for the n-th number that satisfies some condition, it is idiomatic in Haskell to just describe all numbers that satisfy that condition, and then simply pick the n-th one from the list.
import Data.List (nubBy)primes = nubBy (\prime candidate -> gcd prime candidate > 1) [2..]

Usage:
GHCi, version 6.10.2: http://www.haskell.org/ghc/  :? for helpLoading package ghc-prim ... linking ... done.Loading package integer ... linking ... done.Loading package base ... linking ... done.[1 of 1] Compiling Main             ( prime.hs, interpreted )Ok, modules loaded: Main.*Main> primes !! 1031*Main> primes !! 100547*Main> primes !! 10007927*Main>

The 1000th number already takes quite a while to compute. You definitely need a more sophisticated method of generating prime numbers in order to compute the 10001th prime number before the universe ends ;)

##### Share on other sites
Haskell is a very cool language for many sorts of problems, and there are some twisted problems which are simpler in Haskell than any imperative language. But it's nothing you turn to if you need execution speed. Prime number generation requires speed, if you want big primes.

I've used it occasionally for playing with an idea, and once I get my idea working I move the implementation to something faster.

##### Share on other sites
@ScottMayo - If Haskell truly is too slow to find the 10001st prime in under a minute, then I would agree with you that it shouldn't be used for anything beyond experimentation. However, I find that claim *highly* unlikely. I've got no experience in Haskell, but my dumb brute-force solution written in Scheme solves this problem in around 1 second. That's without using even a sieve of Eratosthenes.

##### Share on other sites
Quote:
 Original post by EzbezIf Haskell truly is too slow to find the 10001st prime in under a minute

It's the algorithm's fault, not Haskell's. For faster alternatives, look here.

##### Share on other sites
Quote:
Original post by DevFred
Quote:
 Original post by EzbezIf Haskell truly is too slow to find the 10001st prime in under a minute

It's the algorithm's fault, not Haskell's. For faster alternatives, look here.

I'm expressing doubt that Haskell couldn't calculate the 10001st prime in the absolute most direct brute-force method (ie: a direct translation of the C code in the OP) in under a minute. I'm well aware of better algorithms (I mentioned the sieve of Eratosthenes in my previous post), yet I note that even in "slow" interpreted languages like Scheme the dumb, brute-force method still takes only a second.

I haven't used Haskell - is Haskell truly so slow that such an approach would be infeasible? Is Haskell really orders of magnitude slower than Scheme for such tasks?

##### Share on other sites
Haskell is not an imperative language. It actually has no support for mutability. Much state based things in it are really just elaborate hacks - UnsafeIO monad. Trying to translate imperative algorithims straight is not going to learn you anything. None idiomatic haskell is ugly and verbose and pointless. Here is a simple sieve based prime generator that is the brute force way of haskell:
sieve (n : xs) = n : sieve (filter (\ m -> (rem m n) /= 0) xs)primes = sieve [2..]take 1000 primes -- will instantly list first 1000 primes.

It is short and clear and the naive way. But far more elegant than the naive imperative manner. To improve stuff like that you look to other stuff like Fermat little theorem.

Although Haskell is fast enough for most problems if you need metal biting speed the ugly sacrifices required to get it is pointless better to use other language (BitC soon). But Haskell tends to be faster than scheme.

##### Share on other sites
Quote:
 I haven't used Haskell - is Haskell truly so slow that such an approach would be infeasible? Is Haskell really orders of magnitude slower than Scheme for such tasks?

No. It's not really that bad; the lazy execution scheme does a good job of skipping irrelevant work. But I was assuming that there's no real interest in identifying the 100001'th prime - that one's not hard to find, and occurs shortly after 104729 - but in getting big primes. And for that sort of work you want tight algorithms and bare-metal speeds, on distributed processors.

Haskell is kind of cool, but I never end up building anything with it that I keep. It's great for getting your head around a problem, though.

##### Share on other sites
If you really want to look at it ground-up in the functional way:

The N+1th prime is the smallest prime number larger than the Nth prime.

The smallest prime number larger than X is

X+1, if X+1 is prime;
otherwise, the smallest prime number larger than X+1.

So we recurse in two ways: on prime count, and on values above the previous prime.

##### Share on other sites
Some basic optimizations:

Start at checking i = 3 (and up) for being prime, increasing step by 2.
Then, iterate j from 3 adding up 2 until equal to the square root of i.
Ultimately, j would only iterate over previously found primes.

So, if i is the currently tested number for being prime, and j is the number with which we divide i (to check if the remainder is NOT zero, to be prime), then it'd look something like this:

    for (unsigned int i = 3; i < 1000000; i += 2)    {        bool is_prime = true;        unsigned int i_sqrt = sqrt(i); // sqrt is SLOW, optimize with a version without divisions, see below        for (unsigned int j = 3; j <= i_sqrt; j += 2)        {            if (!(i % j)) // slow...            {                is_prime = false;                break;            }        }        if (is_prime)        {            // i is a prime number        }    }

This is a version I wrote in ASM, which is remarkably faster (mainly because C doesn't allow non-predefined resizable arrays on the stack, while ASM does (just keep pushing at the top of the stack)):
global _mainextern _putcharextern __putcharextern _putnumberextern __putnumberextern _usqrtextern __usqrtsection .text_main:                ; void primes(unsigned int)    push ebp    mov ebp, esp    sub esp, 12          ; 3 local variables    ;mov dword [esp + 0], 2    ;call _putnumber    ;mov dword [esp + 0], 0Ah    ;call _putchar    mov eax, 5          ; to be tested prime number    ;   ebx             ; to be tested against previously found prime number    mov ecx, 0          ; number of previously found prime numbers - 1    ;   [ebp + 8]       ; maximum to variable 1    push 3    ;call __putnumber    ;push 0Ah    ;call __putchar    ;add esp, 4prime_loop:    push eax    call __usqrt        ; put sqrt of number-currently-tested-for-primeness into edx    pop edx             ; we only look for possible divisions of numbers lower then the sqrt    xor ebx, ebx        ; count from 0 to ecx (number of found primes)loop_found_primes:    mov [ebp - 4], ebx ; move to local variable 1 (+4 because ecx is 0 for 1 found prime)    mov ebx, ecx    sub ebx, [ebp - 4]    mov ebx, [esp + (ebx * 4)] ; move previously found prime to ebx    cmp ebx, edx                ; larger then sqrt of tested number    jg found_prime    mov [ebp - 8], edx ; move to local variable 2 (+4 because ecx is 0 for 1 found prime)    xor edx, edx    mov [ebp - 12], eax ; move to local variable 3 (+4 because ecx is 0 for 1 found prime)    idiv ebx    mov eax, [ebp - 12] ; we dont need outcome of dividing, only remainder needed    cmp edx, 0          ; no remainder? perfect division!? no prime number!    je prime_loop_end    mov edx, [ebp - 8] ; get back variable 2 (+4 because ecx is 0 for 1 found prime), the sqrt of the number we're testing against    mov ebx, [ebp - 4] ; get back variable 1 (+4 because ecx is 0 for 1 found prime), the index of the previously found prime we tested against    ; success, its not dividable by this number, lets check the next!    inc ebx    cmp ebx, ecx    jl loop_found_primes ; jump backfound_prime:    push eax            ; FOUND!    inc ecx             ; we found one, increase counter    ;call __putnumber    ;push 0Ah    ;call __putchar    ;add esp, 4prime_loop_end:    add eax, 2    cmp eax, 1000000;[ebp + 8] ; [ebp + 8], the last number we want to evaluate    jle prime_loop     ; jump backexit:    mov eax, [esp + 0]    mov [esp + 0], dword 0Ah    call __putchar    leave    ret

And here's usqrt (NOT written by me):
#include <string.h>#define BITSPERLONG 32#define TOP2BITS(x) ((x & (3L << (BITSPERLONG-2))) >> (BITSPERLONG-2))/* usqrt:    ENTRY x: unsigned long    EXIT  returns floor(sqrt(x) * pow(2, BITSPERLONG/2))    Since the square root never uses more than half the bits    of the input, we use the other half of the bits to contain    extra bits of precision after the binary point.    EXAMPLE        suppose BITSPERLONG = 32        then    usqrt(144) = 786432 = 12 * 65536                usqrt(32) = 370727 = 5.66 * 65536    NOTES        (1) change BITSPERLONG to BITSPERLONG/2 if you do not want            the answer scaled.  Indeed, if you want n bits of            precision after the binary point, use BITSPERLONG/2+n.            The code assumes that BITSPERLONG is even.        (2) This is really better off being written in assembly.            The line marked below is really a "arithmetic shift left"            on the double-long value with r in the upper half            and x in the lower half.  This operation is typically            expressible in only one or two assembly instructions.        (3) Unrolling this loop is probably not a bad idea.    ALGORITHM        The calculations are the base-two analogue of the square        root algorithm we all learned in grammar school.  Since we're        in base 2, there is only one nontrivial trial multiplier.        Notice that absolutely no multiplications or divisions are performed.        This means it'll be fast on a wide range of processors.*/unsigned int usqrt(unsigned long x){	unsigned long a = 0L;			         /* accumulator */	unsigned long r = 0L;                    /* remainder */	unsigned long e = 0L;                    /* trial product */	int i;	for (i = 0; i < BITSPERLONG; i++)	     /* NOTE 1 */	{		r = (r << 2) + TOP2BITS(x); x <<= 2; /* NOTE 2 */		a <<= 1;		e = (a << 1) + 1;		if (r >= e)		{			r -= e;			a++;		}	}	return (unsigned int) (a >> 16);}

Anyways, never even seen a piece of Haskell code, just some basic optimizations suggestions. Good luck!

PS: I just started last week with Project Euler, am at problem 16, and slightly stuck! :P

##### Share on other sites
Thank you all.

As most of you noted, I didn't want a discussion on the best algorithm to find prime numbers, or a way to convert C to Haskell.

I specially liked Zahlman help, which without a line of code, show me an approach which semed to work very well (It gave me the result in less than 10 seconds in a ~6 year old not so high-end laptop (which BTW has a 64 bit CPU but has a 32 bit OS).

The other solutions are not as efficient, specially the DevFred one, but as stated, is not a quick algorithm what I was looking for.

What does the filter exactly do in the following code?
sieve (n : xs) = n : sieve (filter (\ m -> (rem m n) /= 0) xs)

Here, I've seen what the nubBy does. It passes pairs of numbers from the list to the predicate and remove the second one if it returns True.
But, what's that "prime candidate" stuff?
primes = nubBy (\prime candidate -> gcd prime candidate > 1) [2..]

##### Share on other sites
Quote:
 Original post by ravengangrelWhat does the filter exactly do in the following code?sieve (n : xs) = n : sieve (filter (\ m -> (rem m n) /= 0) xs)

Filter removes all multiples of the first element from the rest of the list. We start first with the list [2,3,4,5,6,7,8,9,...]. We extract the first (2, which is a prime) and remove its multiples from the rest of the list, so we've got 2 : sieve [3,5,7,9,11,13,...]. Sieve then again takes the first element (3) and removes its multiples, resulting in 2 : 3 : [5, 7, 11, 13, 17, ...]. This goes on forever, and you end up with a list of primes.

Quote:
 Here, I've seen what the nubBy does. It passes pairs of numbers from the list to the predicate and remove the second one if it returns True.But, what's that "prime candidate" stuff?primes = nubBy (\prime candidate -> gcd prime candidate > 1) [2..]

NubBy removes duplicates from a list, using a user-defined equality test.

I don't really like this solution, as it depends on the actual implementation of nubBy. If the documentation would guarantee which element would be removed (it could as well remove the second of a pair), and the order in which it does it, then it might be ok, but AFAIK, the documentation just states "removes duplicates using given equality relation".

E.g. you could define two numbers to be "equal" if their squares are equal, (which means the same as x == x and x == -x), so that
nubBy (\x y -> x*x == y*y) [-2,-1,0,1,2]
will give you [-2,-1,0], but [0,1,2] or [-2,0,1] should be valid solutions as well.

So, what it does, is declare that x and y are equal if their gcd is larger than one (i.e. not equal to 1), so that eventually, after removal of all "duplicates", you end up with a list of numbers where the gcd of two arbitrary elements is 1. But, in theory, [3,4,7,11,13,17,19,23,25,29,31,...] is as good a solution as [2,3,5,7,11,13,17,19,23,29,31...], which is why I don't like this solution.

[Edited by - SamLowry on July 12, 2009 5:39:56 AM]

##### Share on other sites
Quote:
 Original post by ravengangrelBut, what's that "prime candidate" stuff?primes = nubBy (\prime candidate -> gcd prime candidate > 1) [2..]

prime and candidate are just the argument names of a lambda function. I could also have written
primes = nubBy (\x y -> gcd x y > 1) [2..]

but I like descriptive names :)

##### Share on other sites
Quote:
Original post by ScottMayo
Quote:
 I haven't used Haskell - is Haskell truly so slow that such an approach would be infeasible? Is Haskell really orders of magnitude slower than Scheme for such tasks?

No. It's not really that bad; the lazy execution scheme does a good job of skipping irrelevant work. But I was assuming that there's no real interest in identifying the 100001'th prime - that one's not hard to find, and occurs shortly after 104729 - but in getting big primes. And for that sort of work you want tight algorithms and bare-metal speeds, on distributed processors.

Okay, good. Though I would argue that if you wanted large primes for an actual purpose, the best way would be to utilize the internet unless you want *really* huge primes.

##### Share on other sites
Quote:
 Original post by DevFredprime and candidate are just the argument names of a lambda function.

Now I feel stupid :-)
So effectively it's just an erathostenes sieve.

Thank you, all these samples and explanations are really useful to grasp the functional thinking!

EDIT: However, I'm a bit puzzled that my naive approach following Zahlman advice is quite faster than the others which do an erathostenes sieve.
Here's my code just in case someone wants to take a look:

Quote:
 firstPrimeGreaterThan:: Integer -> IntegerfirstPrimeGreaterThan n = if isPrime (n + 1) then n + 1 else firstPrimeGreaterThan (n + 1)prime:: Integer -> Integerprime 1 = 2prime n = firstPrimeGreaterThan (prime (n - 1))isPrime:: Integer -> BoolisPrime 2 = TrueisPrime n = isPrime' n 2isPrime':: Integer -> Integer -> BoolisPrime' n k = if ((fromIntegral k) > (sqrt (fromIntegral n))) then True else if ((mod n k) == 0) then False else isPrime' n (k + 1)main = print (prime 10001)

##### Share on other sites
Quote:
Original post by ravengangrel
Quote:
 Original post by DevFredprime and candidate are just the argument names of a lambda function.

Now I feel stupid :-)
So effectively it's just an erathostenes sieve.

Thank you, all these samples and explanations are really useful to grasp the functional thinking!

EDIT: However, I'm a bit puzzled that my naive approach following Zahlman advice is quite faster than the others which do an erathostenes sieve.
Here's my code just in case someone wants to take a look:

Precisely, it is exactly the Sieve of Eratosthenes. The code you give is by far less efficient than the sieve in generating primes. For one, its subject to stack overflows. On my wimpy winhugs it overflows at 10001. In addition this methods gets slower as the gap between primes becomes larger and larger.

The sieve depends on how well lazy structures are implemented under the hood of the compiler and whether or not memoization is utilized. Did you try indexing (!!) into the generated list? as take n [n] will take a long time printing at the interactive session.

And without stepping on any toe's I must say that Zahlman's suggestion is not a particularly advisable start. A far more efficient intuition would be to start from the fact that a numbers least factor is always prime and hence only positive integers which are their own least divisors are prime. To save time you will note that given integer z if for all n ^ 2 < z, n does not divide z then z is prime. You would then proceed to filter the list of all integers which fail this primality check to get a prime list. Methods as this and the sieve are useful when factoring.

Even this method, not as efficient as the sieve is quicker than that which you provide. But if you truly cared about speed you would change algorithims. Using a battery of probabilistic primality checks, leveraging multiple processors, would give you speed gains that are many orders of magnitude greater. 10,000,000th prime in a couple seconds.

##### Share on other sites

This topic is 3077 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Create an account

Register a new account

• ### Forum Statistics

• Total Topics
628688
• Total Posts
2984243

• 16
• 13
• 13
• 10
• 10