If you create a comparison function that traces its arguments, like this in GHCi's command line:
> :module + Data.List Debug.Trace
> let myCompare x y = trace ("\tCmp " ++ show x ++ " " ++ show y) $ compare x y
then you can see the behaviour yourself:
> sortBy myCompare "foobar"
" Cmp 'f' 'o'
Cmp 'o' 'b'
Cmp 'f' 'b'
Cmp 'a' 'r'
Cmp 'b' 'a'
a Cmp 'b' 'r'
b Cmp 'f' 'o'
Cmp 'f' 'r'
f Cmp 'o' 'o'
Cmp 'o' 'r'
o Cmp 'o' 'r'
or"
Haskell is evaluating the string lazily, one character at a time. The left hand column is being printed as each character is found, with the right hand column recording the comparisons required, as printed by "trace".
Note that if you compile this, especially with optimisations on, you might get a different result. The optimiser runs a strictness analyser that will probably notice that the entire string gets printed, so it would be more efficient to evaluate it eagerly.
Then try
> head $ sortBy myCompare "foobar"
Cmp 'f' 'o'
Cmp 'o' 'b'
Cmp 'f' 'b'
Cmp 'a' 'r'
Cmp 'b' 'a'
'a'
If you want to understand how this works, look up the source code for the sort function and evaluate 'sort "foobar"' manually on paper.
qsort [] = []
qsort (x:xs) = qsort less ++ [x] ++ qsort greater
where (less, greater) = partition (< x) xs
So
qsort ('f':"oobar")
= qsort ('b':"a") ++ "f" ++ qsort ('o':"or")
= ("a" ++ "b") ++ "f" ++ qsort ('o':"or")
And now we have done enough to find that 'a' is the first item in the result without having to evaluate the other call to "qsort". I've omitted the actual comparison because its hidden inside the call to "partition". Actually "partition" is also lazy, so in fact the argument to the other "qsort" hasn't been evaluated as far as I've shown it.