views:

114

answers:

2

Let's say that I have an adt representing some kind of tree structure:

data Tree = ANode (Maybe Tree) (Maybe Tree) AValType
          | BNode (Maybe Tree) (Maybe Tree) BValType
          | CNode (Maybe Tree) (Maybe Tree) CValType

As far as I know there's no way of pattern matching against type constructors (or the matching functions itself wouldn't have a type?) but I'd still like to use the compile-time type system to eliminate the possibility of returning or parsing the wrong 'type' of Tree node. For example, it might be that CNode's can only be parents to ANodes. I might have

parseANode :: Parser (Maybe Tree)

as a Parsec parsing function that get's used as part of my CNode parser:

parseCNode :: Parser (Maybe Tree)
parseCNode = try (
                  string "<CNode>" >>
                  parseANode >>= \maybeanodel ->
                  parseANode >>= \maybeanoder ->
                  parseCValType >>= \cval ->
                  string "</CNode>"
                  return (Just (CNode maybeanodel maybeanoder cval))
             ) <|> return Nothing

According to the type system, parseANode could end up returning a Maybe CNode, a Maybe BNode, or a Maybe ANode, but I really want to make sure that it only returns a Maybe ANode. Note that this isn't a schema-value of data or runtime-check that I want to do - I'm actually just trying to check the validity of the parser that I've written for a particular tree schema. IOW, I'm not trying to check parsed data for schema-correctness, what I'm really trying to do is check my parser for schema correctness - I'd just like to make sure that I don't botch-up parseANode someday to return something other than an ANode value.

I was hoping that maybe if I matched against the value constructor in the bind variable, that the type-inferencing would figure out what I meant:

parseCNode :: Parser (Maybe Tree)
parseCNode = try (
                  string "<CNode>" >>
                  parseANode >>= \(Maybe (ANode left right avall)) ->
                  parseANode >>= \(Maybe (ANode left right avalr)) ->
                  parseCValType >>= \cval ->
                  string "</CNode>"
                  return (Just (CNode (Maybe (ANode left right avall)) (Maybe (ANode left right avalr)) cval))
             ) <|> return Nothing

But this has a lot of problems, not the least of which that parseANode is no longer free to return Nothing. And it doesn't work anyways - it looks like that bind variable is treated as a pattern match and the runtime complains about non-exhaustive pattern matching when parseANode either returns Nothing or Maybe BNode or something.

I could do something along these lines:

data ANode = ANode (Maybe BNode) (Maybe BNode) AValType
data BNode = BNode (Maybe CNode) (Maybe CNode) BValType
data CNode = CNode (Maybe ANode) (Maybe ANode) CValType

but that kind of sucks because it assumes that the constraint is applied to all nodes - I might not be interested in doing that - indeed it might just be CNodes that can only be parenting ANodes. So I guess I could do this:

data AnyNode = AnyANode ANode | AnyBNode BNode | AnyCNode CNode

data ANode = ANode (Maybe AnyNode) (Maybe AnyNode) AValType
data BNode = BNode (Maybe AnyNode) (Maybe AnyNode) BValType
data CNode = CNode (Maybe ANode) (Maybe ANode) CValType

but then this makes it much harder to pattern-match against *Node's - in fact it's impossible because they're just completely distinct types. I could make a typeclass wherever I wanted to pattern-match I guess

class Node t where
    matchingFunc :: t -> Bool

instance Node ANode where
    matchingFunc (ANode left right val) = testA val

instance Node BNode where
    matchingFunc (BNode left right val) = val == refBVal

instance Node CNode where
    matchingFunc (CNode left right val) = doSomethingWithACValAndReturnABool val

At any rate, this just seems kind of messy. Can anyone think of a more succinct way of doing this?

+2  A: 

I'd still like to use the compile-time type system to eliminate the possibility of returning or parsing the wrong 'type' of Tree node

This sounds like a use case for GADTs.

{-# LANGUAGE GADTs, EmptyDataDecls #-}
data ATag
data BTag
data CTag

data Tree t where
  ANode :: Maybe (Tree t) -> Maybe (Tree t) -> AValType -> Tree ATag
  BNode :: Maybe (Tree t) -> Maybe (Tree t) -> BValType -> Tree BTag
  CNode :: Maybe (Tree t) -> Maybe (Tree t) -> CValType -> Tree CTag

Now you can use Tree t when you don't care about the node type, or Tree ATag when you do.

keegan
I believe GADTs are a bit of a large hammer for this problem. An unrestricted tree is now an existential over Tree: `exists t. Tree t`, which is not easily expressed in Haskell.
luqui
I think you need to include either `t` or a tag for the `Maybe Tree` parameters in this example.
John
@John: Actually, I think that makes this solution problematic—the tree has to encode the tag for all of its subtrees now, which would get… messy.
Antal S-Z
@Antal - indeed it does; I think this is the same issue as @luqui's objection. But the code example doesn't compile as-is because of the missing type argument, which is all I meant to point out.
John
I actually looked quickly at GADTs for this problem, but I didn't think that they were relevant - I thought GADTs were just a means of explicitly building adt constructors with functions. The code here confuses me a bit - it doesn't compile but I'm intrigued enough that I'm wondering if there's a solution here pending maybe a small fix?
Ted Middleton
Sorry, I typed in haste and had a kind error before. Fixed
keegan
@luqui: The existential is not so bad in GHC Haskell. Not saying it's perfect.
keegan
Uh, fixed again? I think the SO downtime ate my previous fix.
keegan
@Ted: "I thought GADTs were just a means of explicitly building adt constructors with functions" Doesn't make sense to me. GADTs are a way of attaching type-level information to individual data constructors, so that well-formedness properties can be proven in the type system.
keegan
@Ted, @keegan, yeah they are a bit of both. I usually use them as a pretty way to write normal ADTs. But they are strictly more powerful; eg. `data Eq a b where Refl :: Eq a a` encodes type equality as a data type. Try doing that with regualr ADTs!
luqui
+4  A: 

I don't understand your objection to your final solution. You can still pattern match against AnyNodes, like this:

f (AnyANode (ANode x y z)) = ...

It's a little more verbose, but I think it has the engineering properties you want.

luqui
Yes - it works...when I started typing out this question I hadn't thought that far ahead. At this point I'm more interested in seeing if there's a solution that's more ... compact? Expressive?
Ted Middleton
It doesn't sound like expressive is what you're looking for. I can understand your desire for succinctness. But I agree with luqui, your final solution seems to solve your problem, in a very straightforward way. As an added benefit, when one reads the code, it is clear the properties you're trying to express in your data structure. -- I think this is an important property to consider.
mayahustle
One more thing, I think a more interesting problem would be enforcement of properties at a further depth from the parent. For example, right now, you enforce direct parent to child relationship. But what if you never want a C node to contain a BNode in its subtree? Right now, the heirarchy allows that CNode->ANode->BNode..
mayahustle