views:

231

answers:

4

I am trying to learn F# so I paid a visit to Project Euler and I am currently working on problem 3.

The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143?

Some things to consider:

  1. My first priority is to learn good functional habits.
  2. My second priority is I would like it to be fast and efficient.

Within the following code I have marked the section this question is regarding.

let isPrime(n:int64) = 
    let rec check(i:int64) = 
        i > n / 2L or (n % i <> 0L && check(i + 1L))
    check(2L) 

let greatestPrimeFactor(n:int64) =
    let nextPrime(prime:int64):int64 = 
        seq { for i = prime + 1L to System.Int64.MaxValue do if isPrime(i) then yield i }  
        |> Seq.skipWhile(fun v -> n % v <> 0L) 
        |> Seq.hd
    let rec findNextPrimeFactor(number:int64, prime:int64):int64 =
        if number = 1L then prime else 
            //************* No variable
            (fun p -> findNextPrimeFactor(number / p, p))(nextPrime(prime))    
            //*************               
            //************* Variable
            let p = nextPrime(prime)
            findNextPrimeFactor(number / p, p) 
            //*************
    findNextPrimeFactor(n, 2L) 

Update

Based off some of the feedback I have refactored the code to be 10 times faster.

module Problem3

module private Internal =
    let execute(number:int64):int64 = 
        let rec isPrime(value:int64, current:int64) = 
            current > value / 2L or (value % current <> 0L && isPrime(value, current + 1L))   
        let rec nextPrime(prime:int64):int64 = 
            if number % prime = 0L && isPrime(prime, 2L) then prime else nextPrime(prime + 1L)       
        let rec greatestPrimeFactor(current:int64, prime:int64):int64 =
            if current = 1L then prime else nextPrime(prime + 1L) |> fun p -> greatestPrimeFactor(current / p, p)
        greatestPrimeFactor(number, 2L)


let execute() = Internal.execute(600851475143L)

Update

I would like to thank everyone for there advice. This latest version is a compilation of all the advice I received.

module Problem3

module private Internal =
    let largestPrimeFactor number = 
        let rec isPrime value current = 
            current > value / 2L || (value % current <> 0L && isPrime value (current + 1L))   
        let rec nextPrime value =
            if number % value = 0L && isPrime value 2L then value else nextPrime (value + 1L) 
        let rec find current prime =
            match current / prime with
            | 1L -> prime
            | current -> nextPrime (prime + 1L) |> find current
        find number (nextPrime 2L)            

let execute() = Internal.largestPrimeFactor 600851475143L
+4  A: 

the variable p is actually a name binding, not a variable. Using name binding is not a bad style. And it is more readable. The lazy style of nextPrime is good, and it actually prime-test each number only once during the whole program.

My Solution

let problem3 = 
    let num = 600851475143L
    let rec findMax (n:int64) (i:int64) =
        if n=i || n<i then
            n
        elif n%i=0L then
            findMax (n/i) i
        else
            findMax n (i+1L)
    findMax num 2L

I basically divides num from 2, 3, 4.. and don't consider any prime numbers. Because if we divides all 2 from num, then we won't be able to divide it by 4,8, etc.

on this number, my solution is quicker:

> greatestPrimeFactor 600851475143L;;
Real: 00:00:01.110, CPU: 00:00:00.702, GC gen0: 1, gen1: 1, gen2: 0
val it : int64 = 6857L
> 
Real: 00:00:00.001, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0

val problem3 : int64 = 6857L
Yin Zhu
Your solution is good but I think the question is more about feedback on the *style* of the asker's solution.
Brian McKenna
Impressive but not very helpful.
ChaosPandion
Looking at your algorithm I am wondering if it would work for any `n`.
ChaosPandion
If `n` is a big prime number, the algorithm does not work. In that case, we need to generate all prime number under `sqrt(n)`.
Yin Zhu
I think you mean "binding" rather than "blinding."
Chuck
+1  A: 

I think that the code with the temporary binding is significantly easier to read. It's pretty unusual to create an anonymous function and then immediately apply it to a value as you do in the other case. If you really want to avoid using a temporary value, I think that the most idiomatic way to do that in F# would be to use the (|>) operator to pipe the value into the anonymous function, but I still think that this isn't quite as readable.

kvb
+5  A: 

Concerning "good functional habit" or rather good practice I see three minor things. Using the yield in your sequence is a little harder to read than just filter. Unnecessary type annotations in a type inferred language leads to difficult refactoring and makes the code harder to read. Don't go overboard and try to remove every type annotation though if you're finding it difficult. Lastly making a lambda function which only takes a value to use as a temp variable reduces readability.

As far as personal style goes I prefer more spaces and only using tupled arguments when the data makes sense being grouped together.

I'd write your original code like this.

let isPrime n = 
    let rec check i = 
        i > n / 2L || (n % i <> 0L && check (i + 1L))
    check 2L

let greatestPrimeFactor n =
    let nextPrime prime = 
        seq {prime + 1L .. System.Int64.MaxValue}
        |> Seq.filter isPrime
        |> Seq.skipWhile (fun v -> n % v <> 0L) 
        |> Seq.head

    let rec findNextPrimeFactor number prime =
        if number = 1L then 
            prime 
        else 
            let p = nextPrime(prime)
            findNextPrimeFactor (number / p) p

    findNextPrimeFactor n 2L

Your updated code is optimal for your approach. You would have to use a different algorithm like Yin Zhu answer to go faster. I wrote a test to check to see if F# makes the "check" function tail recursive and it does.

gradbot
Removing the type annotation makes the code so much friendlier. Thanks.
ChaosPandion
+7  A: 

Functional programming becomes easier and more automatic with practice, so don't sweat it if you don't get it absolutely right on the first try.

With that in mind, let's take your sample code:

let rec findNextPrimeFactor(number:int64, prime:int64):int64 =
        if number = 1L then prime else 
            //************* No variable
            (fun p -> findNextPrimeFactor(number / p, p))(nextPrime(prime))    
            //*************               
            //************* Variable
            let p = nextPrime(prime)
            findNextPrimeFactor(number / p, p) 
            //*************

Your no variable version is just weird, don't use it. I like your version with the explicit let binding.

Another way to write it would be:

nextPrime(prime) |> fun p -> findNextPrimeFactor(number / p, p)

Its ok and occasionally useful to write it like this, but still comes across as a little weird. Most of the time, we use |> to curry values without needing to name our variables (in "pointfree" style). Try to anticipate how your function will be used, and if possible, re-write it so you can use it with the pipe operator without explicit declared variables. For example:

let rec findNextPrimeFactor number prime =
    match number / prime with
    | 1L -> prime
    | number' -> nextPrime(prime) |> findNextPrimeFactor number'

No more named args :)


Ok, now that we have that out of the way, let's look at your isPrime function:

let isPrime(n:int64) = 
    let rec check(i:int64) = 
        i > n / 2L or (n % i <> 0L && check(i + 1L))
    check(2L) 

You've probably heard to use recursion instead of loops, and that much is right. But, wherever possible, you should abstract away recursion with folds, maps, or higher order functions. Two reasons for this:

  1. its a little more readable, and

  2. improperly written recursion will result in a stack overflow. For example, your function is not tail recursive, so it'll blow up on large values of n.

I'd rewrite isPrime like this:

let isPrime n = seq { 2L .. n / 2L } |> Seq.exists (fun i -> n % i = 0L) |> not

Most of the time, if you can abstract away your explicit looping, then you're just applying transformations to your input sequence until you get your results:

let maxFactor n =
    seq { 2L .. n - 1L }                        // test inputs
    |> Seq.filter isPrime                       // primes
    |> Seq.filter (fun x -> n % x = 0L)         // factors
    |> Seq.max                                  // result

We don't even have intermediate variables in this version. Coolness!


My second priority is I would like it to be fast and efficient.

Most of the time, F# is going to be pretty comparable with C# in terms of speed, or it's going to be "fast enough". If you find your code takes a long time to execute, it probably means you're using the wrong data structure or a bad algorithm. For a concrete example, read the comments on this question.

So, the code I've written is "elegant" in the sense that its concise, gives the correct results, and doesn't rely on any trickery. Unfortunately, its not very fast. For start:

  • it uses trial division to create a sequence of primes, when the Sieve of Eratosthenes would be much faster. [Edit: I wrote a somewhat naive version of this sieve which didn't work for numbers larger than Int32.MaxValue, so I've removed the code.]

  • read Wikipedia's article on the prime counting function, it'll give you pointers on calculating the first n primes as well as estimating the upper and lower bounds for the nth prime.

[Edit: I included some code with a somewhat naive implementation of a sieve of eratosthenes. It only works for inputs less than int32.MaxValue, so it probably isn't suitable for project euler.]

Juliet
really cool! let isPrime n = seq { 2L .. n / 2L } |> Seq.exists (fun i -> n % i = 0L) |> not. will this generate code that has an early stop, just like the break statement in a for loop. thanks.
Yin Zhu
Yes, Seq.exists breaks as soon as it finds a true element.
gradbot
For what its worth, it *looks like* the `removeMultiples` function can be parallelized for even more speed. At least in my own trial runs I found the overhead of creating a bunch of short-lived threads resulted in really awful performance.
Juliet
notice that there is a bug in your last code. try maxFactor (2L*17L). we need to divide n by all factors less than sqrt(n) to see if there's a big prime number left.
Yin Zhu
Yin Zhu: oops! Additionally, I used a bitarray for my prime sieve, which can only represent primes from 2 to int32.MaxValue, making it unsuitable for project euler (oops again!).
Juliet
I'd rewrite `Seq.exists (fun i -> n%i=0L) |> not` as `Seq.forall (fun i -> n%i<>0L)`. It also breaks as soon as it finds an i that divides n and avoids an, IMO, somewhat confusing negation at the end.
cfern
Thanks so much, I know you must have put in considerable effort to write out this answer and its sagely advice.
ChaosPandion