I’m beginning to learn F# and the best way for me to go about that is to start applying it. There are some interesting computational problems over at Project Euler . This post is the continuation of a series of posts each involving a separate Project Euler problem. This post involves solutions to Project Euler problem #7.

By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6

^{th}prime is 13.What is the 10001

^{st}prime number?

My first naive attempt is to generate a sequence of primes simply by testing each and every number for whether it is prime:

let dividesEvenly dividend divisor = int64(dividend % divisor) = 0L let isPrime x = seq { 2 .. x-1 } |> Seq.exists (dividesEvenly x) |> not let primes1 = Seq.initInfinite (fun x -> x + 2) |> Seq.filter isPrime let version1 = Seq.nth (10001-1 (* nth is zero-based *)) primes1 printfn "Project Euler Problem #7: What is the 10001st prime number?" printf "Answer: %d" version1 |

This code took 8 minutes to generate the answer on my machine. What kind of optimizations can we make?

- The isPrime function currently tests the number from 2 up to, but not including, the number being tested against. This can be improved in two ways: 1) it need only test up to the square root of the number being tested, and 2) it only needs to test against the prime numbers.

We first start by memoizing and lazily evaluating the sequence of primes by turning it into a LazyList:

let rec primes2 = let nextPrime prevPrime = Seq.initInfinite (add (prevPrime + 1)) |> Seq.filter isPrime1 |> Seq.hd let rec PrimesStartingFrom prime = LazyList.consf prime (fun () -> PrimesStartingFrom (nextPrime prime)) LazyList.consf 2 (fun () -> PrimesStartingFrom 3) |

The code still takes 8 minutes to return an answer, but now we can reuse the sequence without reincurring the processing cost. Now we transform the isPrime function to use the new primes sequence.

open System let rec isPrime2 x = let limit = int(Math.Sqrt(float(x))) primes2 |> LazyList.to_seq |> Seq.takeWhile (fun i -> i <= limit) |> Seq.exists (dividesEvenly x) |> not and primes2 = let nextPrime prevPrime = Seq.initInfinite (add (prevPrime + 1)) |> Seq.filter isPrime2 |> Seq.hd let rec PrimesStartingFrom prime = LazyList.consf prime (fun () -> PrimesStartingFrom (nextPrime prime)) LazyList.consf 2 (fun () -> PrimesStartingFrom 3) let answer2 = lazy (primes2 |> LazyList.drop 10000 |> LazyList.hd) |

This version only takes about 200 milliseconds to generate the answer on my machine! That’s a 2400x improvement! This is pretty good, but I’d like to try a different algorithm to see if it’s any better: the well known Sieve of Eratosthenes.

let sieveOfEratosthenes n = let prime = true let composite = false let (candidates:bool array) = Array.create (int(n+1)) prime candidates.[0] <- composite candidates.[1] <- composite for i in { 2..n } do if candidates.[int(i)] = prime then for j in { i..i..n } do // flag all multiples of i as composite since candidate i is prime candidates.[int(j)] <- composite candidates |

This will produce an array of boolean values where the value indicates whether the corresponding index is prime. This can be sped up by recognizing that all of the multiples of i, from 2*i to i^{2} have already been marked as composite, so we can start marking composites from i^{2} instead of i.

let sieveOfEratosthenes n = let prime = true let composite = false let (candidates:bool array) = Array.create (int(n+1)) prime candidates.[0] <- composite candidates.[1] <- composite for i in { 2..n } do if candidates.[int(i)] = prime && (new bigint(i))*(new bigint(i)) <= (new bigint(n)) (* avoid 32-bit integer overflow *) then for j in { i*i..i..n } do // flag all multiples of i as composite since candidate i is prime candidates.[int(j)] <- composite candidates |

That adjustment reduces the running time by about 25%. For low values of primes like the one in Project Euler problem #7, my memoized version is faster than the Sieve of Eratosthenes, but comparing the algorithms when trying to find the millionth prime and the Sieve wins.

The only problem with the Sieve of Eratosthenes is that is produces all of the primes up to a certain number – which is difficult when you don’t know how high you may go. Luckily the primes follow a well-known distribution, and the upper bound on how big the n^{th} prime will be is given by the equation n * ln n + n * ln ln n.

let upperBound n:int = let f = float(n) let ln x = Math.Log x f*(ln f) + f*(ln (ln f)) |> int |

Putting it all together we get:

open Microsoft.FSharp.Math let sieveOfEratosthenes n = let prime = true let composite = false let (candidates:bool array) = Array.create (int(n+1)) prime candidates.[0] <- composite candidates.[1] <- composite for i in { 2..n } do if candidates.[int(i)] = prime && (new bigint(i))*(new bigint(i)) <= (new bigint(n)) then for j in { i*i..i..n } do // flag all multiples of i as composite since candidate i is prime candidates.[int(j)] <- composite candidates let upperBound n:int = let f = float(n) let ln x = Math.Log x f*(ln f) + f*(ln (ln f)) |> int let sieve2Seq (sieve:bool array) = seq { for i = 2 to sieve.Length do if sieve.[i] then yield i } let nthPrime3 n = sieveOfEratosthenes (upperBound n) |> sieve2Seq |> Seq.nth (int(n - 1 (* zero-based *))) let answer3 = lazy (nthPrime3 10001) |