tags:

views:

162

answers:

3
group :: Ord a => [(a, [b])] -> [(a, [b])]

I want to look up all pairs that have the same fst, and merge them, by appending all the list of bs together where they have the same a and discarding the unnessecary pair and so on...

I got as far as:

group ((s, ls):(s', ls'):ps) = 
    if s == s' 
    then group ((s, ls++ls'):ps) 
    else (s, ls) : group ((s', ls'):ps)
group p = p

but obviously this ain't going to cut it, because it doesn't group everything.

Edit: example

[("a", as),("c", cs), ("c", cs3), ("b", bs),("c", cs2), ("b", bs2)]

would output

[("a", as),("c", cs++cs2++cs3),("b", bs++bs2)]
+5  A: 
import Data.List hiding (group)

group :: (Eq a) => [(a, [b])] -> [(a, [b])]
group ((s,l):rest) = (s, l ++ concatMap snd matches) : group nonmatches
    where
        (matches, nonmatches) = partition (\x-> fst x == s) rest
group x = x

this function produces the result:

group [("Dup", ["2", "3"]), ("Dup", ["1"]), ("Non", ["4"]), ("Dup", ["5"])]
    = [("Dup", ["2", "3", "1", "5"]), ("Non", ["4"])]

it works by filtering the remaining bits into two camps, the bits that match and the bits that dont. it then combines the ones that match and recurses on the ones that don't. This effectly means you will have one tuple in the output list per 'key' in the input list.

barkmadley
Your 'doubleFilter' function exists in Data.List as 'partition'.
Nefrubyr
Doesn't the choice to partition (`doubleFilter`) the list result in *O(n^2)* complexity? If so, then sorting the list first (see my solution) may improve the algorithm to *O(n log n)*. Is this assertion correct or not?
Stephan202
you are correct. you could even increase efficiency on your solution by writing a custom mergesort that merges the groups in the merge phase, instead of merging after sorting.
barkmadley
I think it will be O(n*m) where m is the number of distinct values the first element in the tuples will take, so depending on the input, the sort and group version could be either faster or slower.
Tirpen
@barkmadley: That seems more trouble than it's worth :). By the way, you can now remove the import `Data.Monoid` statement, since you no longer use `mappend`.
Stephan202
@Tirpen: agreed.
Stephan202
@Stephan202 thanks...my monoid...i miss it already :(
barkmadley
+8  A: 
Stephan202
Nice, I didn't know about 'fromListWith'. That's really the best answer provided the first element of the pairs is an instance of Ord.
Nefrubyr
`combine = assocs . fromListWith (flip (++))` would also solve the reversal problem
barkmadley
A: 

Another solution, using a fold to accumulate the groups in a Map. Because of the Map this does require that a is an instance of Ord (BTW your original definition requires that a is an instance of Eq, which barkmadley has incorporated in his solution).

import qualified Data.Map as M

group :: Ord a => [(a, [b])] -> [(a, [b])]
group = M.toList . foldr insert M.empty
  where
    insert (s, l) m = M.insertWith (++) s l m

If you're a big fan of obscurity, replace the last line with:

    insert = uncurry $ M.insertWith (++)

This omits the unnecessary m and uncurry breaks the (s, l) pair out into two arguments s and l.

Nefrubyr