views:

152

answers:

3

I have a list like this: (Pseudo notation)

(X,...) -> (X,...) -> (X,...) -> ...
   |          |          |
   V          V          V
(Y,...)    (Y,...)    (Y,...)
   |          |          |
   V          V          V
(Z,...)    (Z,...)    (Z,...)

Type is (Enum a, Bounded a) => [[(a,x)]]. But I need something like this:

(X, ... -> ... -> ... -> ...
   |
   V
(Y, ... -> ... -> ... -> ...
   |
   V
(Z, ... -> ... -> ... -> ...

Type is like (Enum a, Bounded a) => [(a,[x])]

x has an arbitrary number of elements. It can be assumed, that each Member of x is a key in each sublist of the first list.

How is this transformation possible as a lazy haskell algorithm (List doesn't needs to be evaluated completely to return (partitially) result)?

Test data

See above, something like this:

--Input
[[(Foo,1),(Bar,1),(Baz,1)],[(Foo,2),(Bar,2),(Baz,2)],...]

--Output
[(Foo,[1,2,3,...]),(Bar,[1,2,3,...),(Baz,[1,2,3,...])]

What I want to do with the data

I want to use it in a function like this:

myFunc :: [(MyEnum,[Int])]
myFunc x@((_,(_:[])):_) = x
myFunc x            = foldTheListRecursively

The function has to work on large amounts of data (~10'000 entries per enum), the list should be garbage collectable by the runtime system (The list is adhoc build by another part of the program)

My (uggly) implementation

This is the way, I implemented it, but obviously it doesn't fits the requirements, as the list is traversed multiple times:

restructList :: [[(a,x)]] -> [(a,[x])]
resturctList list = (\x -> (x,listFor x)) <$> keys where
  keys = fst <$> head list
  listFor x = snd <$> any ((==x).fst) <$> list

I'm not at home so can't test it, so there may be a mistake.

+1  A: 

I'm not a 100% sure, but from the sourcecode it looks like Data.List.transpose is lazy. http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/src/Data-List.html#transpose is my source for it. I think that transpose can help you to restructure the pointers:

transpose [[1,2,3],[4,5,6],[7,8,9]]
-- results in [[1,4,7],[2,5,8],[3,6,9]]

So I'd think of something like

foo :: [[(a, b)]] -> [(a, [b])]
foo = map (\x -> (fst (head x), map snd x)) . transpose
Jakob Runge
Seems to be really good. I check, whether this is suitable.
FUZxxl
I did it, works. But I found another approach for my solution (using a fold over the entire structure with something like `foldl (zipWith parseOneEnum) init list`
FUZxxl
+1: makes my point-free side wish that something like `twoply :: (a->b, a->c) -> a -> (b,c)` was a standard library function.
rampion
@rampion: Ever had a look at arrows?
FUZxxl
+5  A: 

Some sample data would have made your question much easier to understand. I assume that given a list like:

input = [[("foo", 1), ("foo", 2)], [("bar", 3), ("bar", 4)]]

You want to get

output = [("foo",[1,2]), ("bar",[3,4])]

If so, the first thing that springs to mind is Data.Map.insertWith. This is like creating a map from keys to values, except if the value already exists, a function you specify is applied to the current value and the new value, and the result is inserted.

For example, if we write:

import qualified Data.Map as M
step0 = M.insertWith (++) "key" ["value"] M.empty

Then step0 is just a map that maps key to value. But if we call it again:

step1 = M.insertWith (++) "key" ["OH HAI"] step0

Now we have a map from key to ["value","OH HAI"]. This is almost exactly what you want, but instead of lists of strings, you want a list of some Enum/Boundeds.

So, the first step is to take one "row" of your data, and add that to a map:

import qualified Data.List as L
toMap1 :: M.Map a b -> [(a,b)] -> M.Map a b
toMap1 = L.foldr (λ(k,v) m → M.insertWith (++) k [v] m)

Given the first element of input from the very top, you get:

toMap M.empty (head input)
    ==> [("foo",[1,2])]

Now we just need to accumulate into this map for every row, instead of just the first one. That's just another fold:

toMap2 :: [[(a,b)]] -> Map a b
toMap2 = L.foldr (flip toMap1) M.empty

Now you can write:

toMap2 input

and get:

fromList [("bar",[3,4]),("foo",[1,2])]

A simple M.toList turns this back into a regular list, which yields output.

jrockway
Nice, I didn't realize you can use `λ` in haskell code like that!
Daenyth
@Daenyth: [No you can't](http://hackage.haskell.org/trac/ghc/ticket/1102).
KennyTM
@KennyTM: Ah, I see. I wonder what @jrockway was doing...
Daenyth
Emacs does this for me; I think it makes everything easier to read.
jrockway
@KennyTM: There's a pragma: `{-# LANGUAGE UnicodeSyntax #-}`
FUZxxl
The question is: Does this solution fits the requirement.
FUZxxl
If you care about efficiency, it may be better to use insertWith with cons instead of append. This is left as an exercise for the reader.
jrockway
@FUZ: [That pragma](http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/syntax-extns.html#unicode-syntax) won't make `λ` as a substitution of `\ `.
KennyTM
A: 

So I'm assuming that you started out with a list of q, then mapped them (q -> [(k,v)]) to extract attribute value pairs to give [[(k,v)]] and you want to turn it into a list of pairs that contain the attribute and all the values that were present. Additionally the attribute keys are Bounded Enum so you can enumerate all the keys.

Then what you want to do is iterate over all the keys and select the values

f :: (Enum k, Bounded k) => [[(k,v)]] -> [(k,[v])]
f kvss = map (\k -> (k, map snd $ filter ((eqenum k).fst) $ kvs)) $ enumFromTo minBound maxBound 
  where kvs = concat kvss
        eqenum e1 e2 = (fromEnum e1) == (fromEnum e2)

This is lazy; you can test that with

data Foo = Foo1 | Foo2
  deriving (Enum, Bounded, Show, Eq)

infd = map (\x -> [(Foo1, 2*x), (Foo2, x*x)]) [1..]

take 5 $ snd $ (f infd) !! 0
take 5 $ snd $ (f infd) !! 1
Geoff Reedy