I'm reading Learn You a Haskell and I'm wondering why so many things are acting like a list, and nothing in the Prelude is using the native facility of type classes to set this up:

"The bytestring version of : is called cons It takes a byte and a bytestring and puts the byte at the beginning. It's lazy though, so it will make a new chunk even if the first chunk in the bytestring isn't full. That's why it's better to use the strict version of cons, cons' if you're going to be inserting a lot of bytes at the beginning of a bytestring."

Why isn't there a TypeClass listable or something that offers the : function to unify Data.ByteString, Data.List, Data.ByteString.Lazy, etc? Is there a reason for this, or is this just an element of legacy Haskell? Using : as an example is kind of an understatement, also from LYAH:

"Otherwise, the bytestring modules have a load of functions that are analogous to those in Data.List, including, but not limited to, head, tail, init, null, length, map, reverse, foldl, foldr, concat, takeWhile, filter, etc."
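
For what it's worth, the chunking behaviour described in the first quote can be observed directly. This is only a small sketch, assuming the bytestring package that ships with GHC; chunkCount, viaCons and viaCons' are names made up for the illustration:

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)

-- Hypothetical helper: how many strict chunks make up a lazy bytestring.
chunkCount :: BL.ByteString -> Int
chunkCount = length . BL.toChunks

-- Build a bytestring by prepending every byte with the lazy cons.
viaCons :: [Word8] -> BL.ByteString
viaCons = foldr BL.cons BL.empty

-- Same construction with cons', which may extend a small leading chunk
-- instead of allocating a new one.
viaCons' :: [Word8] -> BL.ByteString
viaCons' = foldr BL.cons' BL.empty
```

In GHCi, chunkCount (viaCons [1..10]) should come out larger than chunkCount (viaCons' [1..10]), even though the two values compare equal as bytestrings.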

+6  A: 

that offers the : function to unify Data.ByteString, Data.List, Data.ByteString.Lazy, etc?

There have been attempts to come up with a good (a) sequence interface and (b) containers interface. However, unifying data types of different kinds, with different type constraints, has generally made the results non-standard enough that it is hard to imagine putting them in the base library. The same goes for arrays, though the Vector package now has a fairly general interface (based on associated data types).

There are a couple of projects to unify these various semi-related data types with a single interface, so I'm hopeful we'll see a result soon. Similarly for container types. The result won't be trivial though.

Don Stewart
Is Data.Foldable an appropriate solution?
Phil
@phil: Data.Foldable and Data.Traversable are great, but neither offers anything close to a complete interface.
John
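To illustrate the gap: Foldable is enough to take a container apart, but not to build one. A rough sketch, where firstElem is a made-up helper and not part of any library:

```haskell
-- Hypothetical helper: a safe "head" written only in terms of Foldable.
firstElem :: Foldable t => t a -> Maybe a
firstElem = foldr (\x _ -> Just x) Nothing

-- The other direction cannot be written from Foldable alone:
--   cons :: Foldable t => a -> t a -> t a   -- no method builds a container
-- ByteString cannot be an instance at all, because Foldable expects a type
-- constructor that takes the element type as a parameter.
```

firstElem works on lists, Maybe, and any other Foldable, but nothing in the class lets you put an element back in.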
I'm less hopeful about progress on this. I see two big flaws in most current efforts. The first is that people want to re-use existing type classes in ways they aren't particularly suited to, IMHO (`Monoid` is a common example). The second is that most attempts I've seen so far involve big, monolithic classes (such as `ListLike`), which rather hamstrings instance writers when their instance can't quite implement all the required methods. I don't think a solution is impossible, but it's definitely non-trivial.
John
+7  A: 

The ListLike package seems to provide what you're looking for. I've never understood why it isn't more popular.

ListLike aside, one reason this isn't implemented in the Prelude is that it's not possible to do so well without invoking some language extensions (multi-parameter type classes with functional dependencies, or associated types). There are three sorts of containers to consider:

  1. Containers that don't care about their elements at all (e.g. [])
  2. Containers which are only implemented for specific elements (e.g. bytestrings)
  3. Containers which are polymorphic over elements but require a context (e.g. Data.Vector.Storable, which will hold any type with a Storable instance).

Here's a very basic ListLike-style class without using any extensions:

import Prelude hiding (head)
import Data.ByteString (ByteString)
import qualified Data.Vector.Storable as SV

class Listable container where
  head :: container a -> a

instance Listable [] where
  head (x:_) = x

instance Listable ByteString where  -- compiler error: wrong kind (* instead of * -> *)

instance Listable SV.Vector where
  head = SV.head    -- compiler error: can't deduce context (Storable a)

Here container has kind *->*. This won't work for bytestrings because they don't allow an arbitrary type; they have kind *. It also won't work for a Data.Vector.Storable vector, because the class doesn't include the context (the Storable constraint).

You can fix this problem by either changing your class definition to

class ListableMPTC container elem | container -> elem where

or

class ListableAT container where
  type Elem container :: *

Now container has kind *; it's a fully-applied type constructor. That is, your instances look like

instance ListableMPTC [a] a where

but you're no longer Haskell98.
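
To make the associated-type variant concrete, here is a minimal sketch assuming GHC with the TypeFamilies extension and the bytestring package. ListableAT and Elem are the names from above; the cons method and the firstTwice function are hypothetical additions for illustration, and Type from Data.Kind is just the newer spelling of *:

```haskell
{-# LANGUAGE TypeFamilies #-}

import Prelude hiding (head)
import qualified Data.ByteString as B
import Data.Kind (Type)
import Data.Word (Word8)

-- The container is a fully applied type; Elem names its element type.
class ListableAT container where
  type Elem container :: Type
  head :: container -> Elem container
  cons :: Elem container -> container -> container

-- Sort 1: a container that accepts any element type.
instance ListableAT [a] where
  type Elem [a] = a
  head (x:_) = x
  head []    = error "head: empty list"
  cons       = (:)

-- Sort 2: a container with one fixed element type.
instance ListableAT B.ByteString where
  type Elem B.ByteString = Word8
  head = B.head
  cons = B.cons

-- Sort 3 would carry its context on the instance, along the lines of
--   instance Storable a => ListableAT (SV.Vector a) where ...
-- (left as a comment so the sketch does not depend on the vector package).

-- A function written once against the class; works for every instance.
firstTwice :: ListableAT c => c -> c
firstTwice xs = cons (head xs) xs
```

With this, firstTwice can be applied to a String and to a strict ByteString alike, which is the kind of unification the question asks about.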

That's why even a simple Listable-type interface is non-trivial; it gets a bit harder when you have different collection semantics to account for (e.g. queues). The other really big challenge is mutable-vs.-immutable data. So far every attempt I've seen (except one) punts on that issue by creating a mutable interface and an immutable one. The one interface I know which did unify the two was mind-bending, invoked a bunch of extensions, and had quite poor performance.

Addendum: bytestrings

Total conjecture on my part, but I think we're stuck with bytestrings as a product of evolution. That is, they were the first solution to the poor performance of I/O operations, and it made sense to use Ptr Word8s for interfacing with IO system calls. Operations on pointers require Storable, and most likely the necessary extensions (as described above) to make polymorphism work weren't available then. Now it's difficult to overcome their momentum. A similar container with polymorphism is certainly possible (the storablevector package implements one), but it's not anywhere near as popular.

Could bytestrings be polymorphic without any restrictions on the elements? I think the closest Haskell has to this is the Array type. This isn't nearly as good as a bytestring for low-level IO because data needs to be unpacked from the pointer into the array's internal format. Also the data is boxed, which adds significant space overhead. If you want unboxed storage (less space) and efficient interfacing with C, pointers are the way to go. Once you have a Ptr, you need Storable, and then you need to include the element type in the type class, so then you're left with requiring extensions.
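
To show where the Storable constraint comes from, here is a small sketch using only the FFI modules in base. headPtr and demoHead are names I made up for the illustration:

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (withArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable, peekElemOff)

-- Reading through a pointer needs Storable, which knows the size and
-- layout of the element; there is no way to drop the constraint.
headPtr :: Storable a => Ptr a -> IO a
headPtr p = peekElemOff p 0

-- Marshal a short list into temporary memory and read its first element.
demoHead :: IO Word8
demoHead = withArray [10, 20, 30] headPtr
```

Any container built on top of a Ptr inherits that constraint on its element type, which is why it ends up in the class or instance declaration.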

That being said, I think that with the appropriate extensions available this is essentially a solved problem for any single container implementation (modulo mutable/immutable APIs). The harder part now is coming up with a sensible set of classes that is usable for many different types of structures (lists, arrays, queues, etc.) and flexible enough to be useful. I personally would expect this to be relatively straightforward, but I could be wrong.

John
I'm new to Haskell, so go light on me: why does a ByteString have a kind of `*`? That seems rather random too -- why not make it polymorphic? I think I can understand the reasoning currently, but isn't assuming an 8 bit byte a totally needless assumption? Why not permit a `ByteString[Word7]` or something with a type synonym alias that makes a `ByteString` more like a `String`... With that said, I like this answer the most because it makes an attempt to explain why this isn't trivial. Would an update to the Haskell language that standardizes the GHC pragmas make this trivial?
Evan Carroll
@Evan: edited my reply to address the questions on bytestrings.
John
A: 

ByteString is not a generic type.

In other languages, there is something like Sequence for all list-like data structures. I think this works, with correct extensions:

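{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
import Prelude hiding (head)
import qualified Data.ByteString.Char8 as BC

class Seq a b | a -> b where
  head :: a -> b
  isTail :: a -> Bool

-- [a] is a sequence of a's
instance Seq [a] a where
  head (x:_) = x
  isTail = null

-- ByteString is a sequence of chars (via the Char8 interface; the raw elements are Word8)
instance Seq BC.ByteString Char where
  head = BC.head
  isTail = BC.null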

Or try this? (I suspect GHC will reject it, since a type synonym has to be fully applied in an instance head.)

type BS a = ByteString
instance List BS
SHiNKiROU