views:

254

answers:

4

I am pretty new to Haskell (still working on totally understanding monads). I have a problem where I have a tree like structure

type Tree = [DataA]

data DataA =  DataA1 [DataB] 
            | DataA2 String 
            | DataA3 String [DataA]
               deriving Show

data DataB =  DataB1 [DataA] 
            | DataB2 String 
            | DataB3 String [DataB]
               deriving Show

What I would like to do is be able to traverse this and generate a new tree with a filter. For example I may want to change all DataB2 in the tree to "foo".

I have seen examples of tree when they are in the same data section, and the constructors are similar.

In the python world, I would just traverse the list, match to whatever I needed, and replace the value.

In Haskell I am guessing that I need to be able to copy my tree, but how do you deal with lists buried in constructors and different data types?

+2  A: 

I don't know a general answer to your question. The data type is quite contrived, and I would probably choose to implement a fold rather than a filter. Here, however, are some filter functions that can update strings in all four positions. I have put the code through the compiler, so it typechecks, but I haven't run it.

type SFilter = String -> String

-- to filter a tree, say how A2, A3, B2, and B3 should be changed

type Filter tree = SFilter -> SFilter -> SFilter -> SFilter -> (tree -> tree)

afilter :: Filter DataA
bfilter :: Filter DataB
tfilter :: Filter Tree

tfilter a2 a3 b2 b3 = map (afilter a2 a3 b2 b3)
afilter a2 a3 b2 b3 = fil
  where fil (DataA1 bs)   = DataA1 $ map (bfilter a2 a3 b2 b3) bs
        fil (DataA2 s)    = DataA2 (a2 s)
        fil (DataA3 s as) = DataA3 (a3 s) (map fil as)

bfilter a2 a3 b2 b3 = fil
  where fil (DataB1 as)   = DataB1 $ map (afilter a2 a3 b2 b3) as
        fil (DataB2 s)    = DataB2 (b2 s)
        fil (DataB3 s bs) = DataB3 (b3 s) (map fil bs)
Norman Ramsey
Interesting.The example is contrived because the actual tree that I am working with is a abstract tree for a language using Parsec. So you tend to get expressions in expressions and every kind of other odd twist that you can think of.So you are saying go create a SFilter for each object I wish to manipulate.The only issue that I see with that is that the acutual tree is quite large with many types. This is a really good example which I think I can work out.
Chris
@Chris: If your tree types are mutually recursive, because of the static type system there is no way around a function for each of the types. I have managed some of this kind of complexity using type classes. Or if you are really brave, you can try Scrap Your Boilerplate or Ralf Hinze's Generic Programming for the Masses or Generics Now.
Norman Ramsey
Yes, it looks like the keyword you're looking for is "generic programming".
Wei Hu
*"I have put the code through the compiler, so it typechecks, but I haven't run it."* --aww, how could you pass up a perfect opportunity to quote Knuth? Given the Curry-Howard correspondence, that's almost exactly a paraphrase of "Beware of bugs in the above code; I have only proved it correct, not tried it."
camccann
+2  A: 

You want to traverse the whole data structure and change some items here and there. This is usually done by a function that takes the data structure as a parameter and returns the new, changed version of the structure.

For every case of input this function defines how the new value that is returned should look like.

The basic function that modifies a Tree (which is just a list of DataA values) probably should just return a new list of modified values. If we defer those modifications of the values to a modifyA function, the main modification function looks like this:

-- # function to change a |Tree|
mutate :: Tree -> Tree
mutate as = map mutateA as
     -- # (The |map| function applies the |mutateA| function to every
     -- #  element of |as|, creating a list of all the return values)

Now the mutateA function needs to be defined to change all possible DataA values, and best it is accompanied by a mutateB function that handles DataB values.

These functions look at the different possible cases of values and return the appropriate new values:

-- # function to change |DataA| items
mutateA :: DataA -> DataA
     -- # A |DataA1| is a |DataA1| with modified values
mutateA (DataA1 bs)   = DataA1 (map mutateB bs)
     -- # A |DataA3| is a |DataA3| with modified values
mutateA (DataA3 s as) = DataA3 s (map mutateA as)
     -- # In the remaining case(s) the value stays the same
mutateA d             = d

-- # function to change |DataB| items
mutateB :: DataB -> DataB
mutateB (DataB1 as) = DataB1 (map mutateA as)
mutateB (DataB3 s bs) = DataB3 s (map mutateB bs)
     -- # Here comes a real change
mutateB (DataB2 _)  = DataB2 "foo"

This way for every element in the tree a new element is computed, where all the DataB2 values anywhere in the tree are replaced by "foo".

It's relatively verbose because you have five different cases that contain a list of values that needs to be walked through, but that is not specific to Haskell. In an imperative language you would usually have five for loops in place of the five calls to map.

Maybe you could simplify your data structure to reduce this "overhead". This of course depends on your actual use case, but maybe, for example, you don't need the Data2 cases: Is there a difference between DataA2 "abc" and DataA3 "abc" []?

sth
The example that I gave is actually a small part of a abstract tree that comes from parsing a language. So I may be able to simplify, but not to the extent that it would be much simpler than what you see.This is a good way to do it, I would have to modify it so that it I could use it for many different filters.Basically I am trying to take this parsed language and modify it using several filters so that it can be pretty printed out for another language.
Chris
+5  A: 

You can use generic programming for this.

One such generic programming library is called Scrap Your Boilerplate. At the very top of your module, enable Scrap Your Boilerplate by writing:

{-# LANGUAGE DeriveDataTypeable #-}

Import module Data.Generics. Then besides Show, also derive Typeable and Data instances for your datatypes.

Now you can write the function you requested like this:

toFoo :: Data a => a -> a
toFoo = everywhere (mkT step)
  where
    step (DataA2 _)  = DataA2 "foo"
    step x           = x

That's all you need to do to make this work. For example, when you call toFoo [DataA1 [], DataA2 "hi", DataA3 "yo" []], the answer is [DataA1 [],DataA2 "foo",DataA3 "yo" []].

Hope this helps!

Martijn
Thanks. This is what I was looking for.To get this to work, I had to import Data.Generics
Chris
You're right, my bad! Fixed my answer. Thanks.
Martijn
Brilliant. Glad somebody here knows how to make SYB work. +1
Norman Ramsey
A: 

You may want to take a look at the multirec library for working with mutually recursive data types. I've not used it, but from what you've described it sounds like it's aimed at precisely the kind of problem that you're working with. It uses generic programming like the other answers here have suggested, but might save you the time of implementing everything yourself.

camccann