the f# implementation uses a structural comparison of the resulting key.
let sortBy keyf seq =
let comparer = ComparisonIdentity.Structural
mkDelayedSeq (fun () ->
(seq
|> to_list
|> List.sortWith (fun x y -> comparer.Compare(keyf x,keyf y))
|> to_array) :> seq<_>)
(also sort)
let sort seq =
mkDelayedSeq (fun () ->
(seq
|> to_list
|> List.sortWith Operators.compare
|> to_array) :> seq<_>)
both Operators.compare and the ComparisonIdentity.Structural.Compare become (eventually)
let inline GenericComparisonFast<'T> (x:'T) (y:'T) : int =
GenericComparisonIntrinsic x y
// lots of other types elided
when 'T : float = if (# "clt" x y : bool #)
then (-1)
else (# "cgt" x y : int #)
but the route to this for the Operator is entirely inline, thus the JIT compiler will end up inserting a direct double comparison instruction with no additional method invocation overhead except for the (required in both cases anyway) delegate invocation.
The sortBy uses a comparer so will go through an additional virtual method call but is basically about the same.
In comparison the OrderBy function also must go through virtual method calls for the equality (Using EqualityComparer<T>.Default
) but the significant difference is that it sorts in place and uses the buffer created for this as the result. In comparison if you take a look at the sortBy you will see that it sorts the list (not in place, it uses the StableSortImplementation which appears to be merge sort) and then creates a copy of it as a new array. This additional copy (given the size of your input data) is likely the principle cause of the slow down though the differing sort implementations may also have an effect.
That said this is all guessing. If this area is a concern for you in performance terms then you should simply profile to find out what is taking the time.
If you wish to see what effect the sorting/copying change would have try this alternate:
// these are taken from the f# source so as to be consistent
// beware doing this, the compiler may know about such methods
open System.Collections.Generic
let mkSeq f =
{ new IEnumerable<'b> with
member x.GetEnumerator() = f()
interface System.Collections.IEnumerable with
member x.GetEnumerator() = (f() :> System.Collections.IEnumerator) }
let mkDelayedSeq (f: unit -> IEnumerable<'T>) =
mkSeq (fun () -> f().GetEnumerator())
// the function
let sortByFaster keyf seq =
let comparer = ComparisonIdentity.Structural
mkDelayedSeq (fun () ->
let buffer = Seq.to_array seq
Array.sortInPlaceBy (fun x y -> comparer.Compare(keyf x,keyf y)) buffer
buffer :> seq<_>)
I get some reasonable percentage speedups within the repl with very large (> million) input sequences but nothing like an order of magnitude. Your mileage, as always, may vary.