tags:

views:

218

answers:

1

Referencing this code: http://stackoverflow.com/questions/2840714/f-static-member-type-constraints/2842037#2842037

Why is, for example,

let gL = G_of 1L
[1L..100000L] |> List.map (fun n -> factorize gL n)

significantly slower than

[1L..100000L] |> List.map (fun n -> factorize (G_of 1L) n)

By looking at Reflector, I can see that the compiler is treating each of these in very different ways, but there is too much going on for me to decipher the essential difference. Naively I assumed the former would perform better than the later because gL is precomputed whereas G_of 1L has to be computed 100,000 times (at least it appears that way).

[Edit]

It looks like this may be a bug with F# 2.0 / .NET 2.0 / Release-mode, see @gradbot's answer and discussion.

+1  A: 

Reflector shows test2() turned into 4 classes while test1() is turned into two classes. This only happens in debug mode. Reflector shows identical code (one class for each) in release mode. Unfortunately Reflector is crashing when I try to view the source in C# and the IL is really long.

let test1() =
    let gL = G_of 1L
    [1L..1000000L] |> List.map (fun n -> factorize gL n)

let test2() =
    [1L..1000000L] |> List.map (fun n -> factorize (G_of 1L) n)

A quick benchmark.

let sw = Stopwatch.StartNew()
test1() |> ignore
sw.Stop()
Console.WriteLine("test1 {0}ms", sw.ElapsedMilliseconds)

let sw2 = Stopwatch.StartNew()
test2() |> ignore
sw2.Stop()
Console.WriteLine("test2 {0}ms", sw2.ElapsedMilliseconds)

Benchmarks ran on I7 950 @3368Mhz, windows 7 64bit, VS2010 F#2.0

x86 Debug
test1 8216ms
test2 8237ms

x86 Release
test1 6654ms
test2 6680ms

x64 Debug
test1 10304ms
test2 10348ms

x64 Release
test1 8858ms
test2 8977ms

Here is the complete code.

open System
open System.Diagnostics

let inline zero_of (target:'a) : 'a = LanguagePrimitives.GenericZero<'a>
let inline one_of (target:'a) : 'a = LanguagePrimitives.GenericOne<'a>
let inline two_of (target:'a) : 'a = one_of(target) + one_of(target)
let inline three_of (target:'a) : 'a = two_of(target) + one_of(target)
let inline negone_of (target:'a) : 'a = zero_of(target) - one_of(target)

let inline any_of (target:'a) (x:int) : 'a =
    let one:'a = one_of target
    let zero:'a = zero_of target
    let xu = if x > 0 then 1 else -1
    let gu:'a = if x > 0 then one else zero-one

    let rec get i g = 
        if i = x then g
        else get (i+xu) (g+gu)
    get 0 zero 

type G<'a> = {
    negone:'a
    zero:'a
    one:'a
    two:'a
    three:'a
    any: int -> 'a
}    

let inline G_of (target:'a) : (G<'a>) = {
    zero = zero_of target
    one = one_of target
    two = two_of target
    three = three_of target
    negone = negone_of target
    any = any_of target
}

let inline factorizeG n = 
    let g = G_of n
    let rec factorize n j flist =  
        if n = g.one then flist 
        elif n % j = g.zero then factorize (n/j) j (j::flist) 
        else factorize n (j + g.one) (flist) 
    factorize n g.two []

let inline factorize (g:G<'a>) n =   //'
    let rec factorize n j flist =  
        if n = g.one then flist 
        elif n % j = g.zero then factorize (n/j) j (j::flist) 
        else factorize n (j + g.one) (flist) 
    factorize n g.two []

let test1() =
    let gL = G_of 1L
    [1L..100000L] |> List.map (fun n -> factorize gL n)

let test2() =
    [1L..100000L] |> List.map (fun n -> factorize (G_of 1L) n)

let sw2 = Stopwatch.StartNew()
test1() |> ignore
sw2.Stop()
Console.WriteLine("test1 {0}ms", sw2.ElapsedMilliseconds)

let sw = Stopwatch.StartNew()
test2() |> ignore
sw.Stop()
Console.WriteLine("test2 {0}ms", sw.ElapsedMilliseconds)

Console.ReadLine() |> ignore
gradbot
in the question, test1() is slower, not test2().
Yin Zhu
@Yin: that's correct. @gradbot: very interesting observation of different compilation based on debug vs. release mode... did you observe any performance differences depending on the mode?
Stephen Swensen
@gradbot: I experienced the same crash in Reflector.
Stephen Swensen
Thanks! I'm surprised though that even in debug mode your test1 and test2 results are very close, whereas both myself and Tomas saw test2 running about 3 seconds slower than test1 (I'm pretty sure I was in debug mode, I'm not sure about Tomas). But it's getting late for me, I'm going to have to try this out in release mode tomorrow.
Stephen Swensen
Very curious. I was able to duplicate comparable results as yours (test1 is slightly faster, as I had originally expected) with F# 2.0 / .NET 2.0 / x86 / Debug-mode, but in Release-mode get test1 running significantly slower. Looks like a bug specific to .NET 2.0 Release-mode. (note, my original benchmarks were done through FSI, which apparently always compiles in Release-mode, at least by default, and also ran faster than when complied and run in an .exe).
Stephen Swensen