tags:

views:

174

answers:

4

I'm going through the 99 Haskell problems to build my proficiency with the language. On problem 7 ("Flatten a nested list structure"), I found myself wanting to define a conditional behavior based on the type of argument passed to a function. That is, since

*Main> :t 1
1 :: (Num t) => t
*Main> :t [1,2]
[1,2] :: (Num t) => [t]
*Main> :t [[1],[2]]
[[1],[2]] :: (Num t) => [[t]]

(i.e. lists nested at different levels have different data types) it seems like I should be able to write a function that can read the type of the argument, and then behave accordingly. My first attempt was along these lines:

listflatten l = do
    if (:t l) /= ((Num t) => [t]) then
        listflatten (foldl (++) [] l)
        else id l

But when I try to do that, Haskell returns a parse error. Is Haskell flexible enough to allow this sort of type manipulation, do I need to find another way?

+3  A: 

Look at the example for that problem:

flatten (List [Elem 1, List [Elem 2, List [Elem 3, Elem 4], Elem 5]])

As you see, the problem wants you to create your own data structure for arbitrarily nested lists.

Normal haskell lists can not be arbitrarily nested. Every element of the list has to have the same type, statically known, which is why it makes no sense to check the type of the elements dynamically.

In general haskell does not allow you to create a list of different types and then check the type at runtime. You could use typeclasses to define different behaviors for flatten with different types of arguments, but that still wouldn't give you arbitrarily nested lists.

sepp2k
+5  A: 

You are confusing the interactive command :t in the interpreter with a built-in function. You cannot query the type at runtime.

ShiDoiSi
[Not really](http://www.haskell.org/ghc/docs/6.12.1/html/libraries/base/Data-Typeable.html).
KennyTM
+6  A: 

(Sorry about the length—I go a little bit far afield/in excessive depth. The CliffsNotes version is "No, you can't really do what you want because types aren't values and we can't give your function a sensible type; use your own data type.". The first and the fifth paragraph, not counting this one or the code block, explain the core of what I mean by that first part, and the rest of the answer should provide some clarification/detail.)

Roughly speaking, no, this is not possible, for two reasons. The first is the type-dispatch issue. The :t command is a feature (an enormously useful one) of GHCi, and isn't a Haskell function. Think about why: what type would it have? :t :: a -> ?? Types themselves aren't values, and thus don't have a type. It's two different worlds. So the way you're trying to do this isn't possible. Also note that you have a random do. This is bad—do notation is a syntactic sugar for monadic computation, and you aren't doing any of that. Get rid of it!

Why is this? Haskell has two kinds polymorphism, and the one we're concerned with at the moment is parametric polymorphism. This is what you see when you have a type like concat :: [[a]] -> a. That a says that one single definition of concat must be usable for every possible a from now until the end of time. How on earth would you type flatten using this scheme? It's just not possible.

You're trying to call a different function, defined ad-hoc, for different kinds of data. This is called, shockingly, ad-hoc polymorphism. For instance, in C++, you could define the following function:

template <typename T>
void flatten(vector<T>& v) { ... }

template <typename T>
void flatten(vector< vector<T> >& v) { ... }

This would allow you do different things for different types. You could even have template <> void flatten(int) { ... }! You can accomplish this in Haskell by using type classes such as Num or Show; the whole point of a type signature like Show a => a -> String is that a different function can be called for different as. And in fact, you can take advantage of this to get a partial solution to your problem…but before we do, let's look at the second problem.

This issue is with the list you are trying to feed in. Haskell's list type is defined as (roughly) data [a] = [] | a : [a]. In other words, every element of a list must have the same type; a list of ints, [Int], contains only ints, Int; and a list of lists of ints, [[Int]], contains only lists of ints, [Int]. The structure [1,2,[3,4],5] is illegal! Reading your code, I think you understand this; however, there's another ramification. For similar reasons, you can't write a fully-generic flatten function of type flatten :: [...[a]...] -> [a]. Your function also has to be able to deal with arbitrary nesting depth, which still isn't possible with a list. You need [a], [[a]], and so on to all be the same type!

Thus, to get all of the necessary properties, you want a different type. The type you want has a different property: it contains either nothing, a single element followed by the rest of the value, or a nested list of elements followed by the rest of the value. In other words, something like

data NList a = Nil
             | a         :>  NList a
             | (NList a) :>> NList a
             deriving (Eq, Show)
infixr 5 :>, :>>

Then, instead of the list [1,2,3] == 1 : 2 : 3 : [], you would write 1 :> 2 :> 3 :> Nil; instead of Lisp's (1 (2 3) 4 ()), you would write 1 :> (2 :> 3 :> Nil) :>> 4 :> Nil :>> Nil. You can even begin to define functions to manipulate it:

nhead :: NList a -> Either a [a]
nhead Nil       = error "nhead: Empty NList."
nhead (h :>  _) = Left  a
nhead (h :>> _) = Right a

ntail :: NList a -> NList a
ntail Nil       = error "nhead: Empty NList."
ntail (_ :>  t) = t
ntail (_ :>> t) = t

Admittedly, you might find this a bit clunky (or perhaps not), so you might try to think about your type differently. Another option, which the Haskell translation of the 99 problems uses, is to realize that everything in a nested list is either a single item or a list of nested lists. This translation gives you

data NestedList a = Elem a
                  | List [NestedList a]
                  deriving (Eq, Show)

The two above lists then become List [Elem 1, Elem 2, Elem 3] and List [Elem 1, List [Elem 2, Elem 3], Elem 4, List []]. As for how to flatten them—since you're trying to learn from the 99 problems, that I won't say :) And after all, you seem to have a handle on that part of the problem.


Now, let's return to type classes. I lied a bit when I said that you couldn't write something which took an arbitrarily-nested list—you can, in fact, using type classes and some GHC extensions. Now, before I continue, I should say: don't use this! Seriously. The other technique is almost definitely a better choice. However, this technique is cool, and so I will present it here. Consider the following code:

{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, UndecidableInstances #-}

class Flattenable f e where
  flatten :: f -> [e]

instance Flattenable a a where
  flatten = return

instance Flattenable f e => Flattenable [f] e where
  flatten = concatMap flatten

We are creating a type class whose instances are the things we can flatten. If we have Flattenable f e, then f should be a collection, in this case a list, whose elements are ultimately of type e. Any single object is such a collection, and its element type is itself; thus, the first instance declaration allows us to flatten anything into a singleton list. The second instance declaration says that if we can flatten an f into a list of es, then we can also flatten a list of fs into a list of es by flattening each f and sticking the resulting lists together. This recursive class definition defines the function recursively for the nested list types, giving you the ability to flatten a list of any nesting with the single function flatten: [1,2,3], [[4,5],[6]], [[[7,8],[9]],[[10]],[[11],[12]]], and so on.

However, because of the multiple instances and such, it does require a single type annotation: you will need to write, for instance, flatten [[True,False],[True]] :: [Bool]. If you have something that's type class-polymorphic within your lists, then things are a little stricter; you need to write flatten [[1],[2,3 :: Int]] :: [Int], and as far as I can tell, the resulting list cannot be polymorphic itself. (However, I could well be wrong about this last part, as I haven't tried everything by any means.) For a similar reason, this is too open—you could declare instance Flattenable [f] () where flatten = [()] if you wanted too. I tried to get things to work with type families/functional dependencies in order to remove some of these problems, but thanks to the recursive structure, couldn't get it to work (I had no e and a declaration along the lines of type Elem a = a and type Elem [f] = Elem f, but these conflicted since [f] matches a). If anyone knows how, I'd very much like to see it!

Again, sorry about the length—I tend to start blathering when I get tired. Still, I hope this is helpful!

Antal S-Z
Classes in Haskell is something I wanted for a really long time... I'd use `class Flattenable f e | f -> e where` (FunctionalDependencies) to make sure that return type of `flatten` can be deduced from its argument. And I guess that "flatten" is just an simple example, so this answer can look a bit shorter.
ony
I'm not sure what you mean by "wanted for a really long time"—they've always been around. And fundeps (and type families, as I said in my answer) don't work here, because there are *two* instances for *e.g.* `Flattenable [a]`: `Flattenable [a] [a]` and `Flattenable [a] a`. I couldn't figure out how to tell OverlappingInstances or something to just pick the more specific one, and so reverted to this solution. Like I said, if you know how to do *that*, please let me know.
Antal S-Z
This is just the sort of technique I was reaching for. Basically I was looking for something like the J expression ,(> L:1 ^:_) which iteratively opens the levels of a boxed array to the limit of application and then ravels the resulting rank-2 array into a list. Defining new data structures/operators to create a desired behavior looks like a really interesting way to attack programming problems. Thanks!
estanford
+6  A: 

1. Use pattern matching instead

You can solve that problem without checking for data types dynamically. In fact, it is very rarely needed in Haskell. Usually you can use pattern matching instead.

For example, if you have a type

data List a = Elem a | Nested [List a]

you can pattern match like

flatten (Elem x) = ...
flatten (Nested xs) = ...
Example:
data List a = Elem a | Nested [List a]
  deriving (Show)

nested = Nested [Elem 1, Nested [Elem 2, Elem 3, Nested [Elem 4]], Elem 5]
main = print $ flatten nested

flatten :: List a -> [a]
flatten (Elem x) = [x]
flatten (Nested lists) = concat . map flatten $ lists  

map flatten flattens every inner list, thus it behaves like [List a] -> [[a]], and we produce a list of lists here. concat merges all lists together (concat [[1],[2,3],[4]] gives [1,2,3,4]). concat . map flatten is the same as concatMap flatten.

2. To check types dynamically, use Data.Typeable

And if on some rare occasion (not in this problem) you really need to check types dynamically, you can use Data.Typeable type class and its typeOf function. :t works only in GHCI, it is not part of the language.

ghci> :m + Data.Typeable
ghci> typeOf 3 == typeOf "3"
False
ghci> typeOf "a" == typeOf "b"
True

Likely, you will need to use DeriveDataTypeable extension too.

jetxee
I was ready to upvote until I saw you recommend Data.Typeable to a raw beginner...
Norman Ramsey
It's the second suggestion of two, though. As much as I like tinkering with obscure programming constructs, the pattern matching approach looks like something I can keep in my toolbox for application to a larger class of problems.
estanford