I'm using Java's DataInputStream with Scala to parse a simple binary file (which is a very bad experience due to the lack of unsigned types, even in Scala, but that's a different story).

However, I find myself forced to use mutable data structures, since Java's streams are inherently stateful entities.

What's a good design for wrapping Java's streams in a nice functional data structure?

+2  A: 

Have a look at the IO Monad from Haskell for a pure functional approach.

A pragmatic Scala implementation would implement an Iterator/Iterable based on the stream. For example, scala.io.Source supports this.
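
As a rough illustration of the Iterator idea for binary data, here is a minimal sketch. It assumes a hypothetical fixed record layout (an Int followed by a Short); the names `Record`, `records` and `sampleBytes` are made up for this example. The stream is still mutated underneath, but callers only see an Iterator.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream, EOFException}

object RecordIteratorSketch {
  // Hypothetical record layout: an Int followed by a Short.
  final case class Record(id: Int, flags: Short)

  // Wraps the stream in an Iterator; the mutable state stays hidden in here.
  // A truncated trailing record is silently dropped in this sketch.
  def records(in: DataInputStream): Iterator[Record] =
    Iterator
      .continually {
        try Some(Record(in.readInt(), in.readShort()))
        catch { case _: EOFException => None }
      }
      .takeWhile(_.isDefined)
      .map(_.get)

  // Builds a small in-memory sample so the sketch runs without a file.
  def sampleBytes: Array[Byte] = {
    val buf = new ByteArrayOutputStream()
    val out = new DataOutputStream(buf)
    out.writeInt(1); out.writeShort(10)
    out.writeInt(2); out.writeShort(20)
    out.flush()
    buf.toByteArray
  }

  // Reads every record from the sample bytes.
  def demo(): List[Record] =
    records(new DataInputStream(new ByteArrayInputStream(sampleBytes))).toList
}
```

This only works when every element has the same shape, and the iterator itself is still single-pass, so it hides the state rather than removing it.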

Thomas Jung
The pragmatic Scala implementation is not good for, say, a binary stream. What if I wish to read an int and then a short? The iterator won't be able to handle that. And even if it could, I'm still keeping state while using it!
Elazar Leibovich
Every implementation will have to define a type M[T] as a container for your stream. So T has to support T.toShort and T.toInt to read the current value. Otherwise M[_] would be an inhomogeneous container.
Thomas Jung
+5  A: 

There is a project currently in progress which aims to create an IO API for Scala: scala IO. It is inspired by the Java 7 NIO API. It is still a WIP, but you might get some interesting ideas out of it. There are also some samples on how to use it, which can be found here

Arjan Blokzijl
Isn't NIO's main point being asynchronous? I'm not sure I want asynchronous data retrieval. It can make simple things more complicated.
Elazar Leibovich
@Elazar `java.nio` is not only about asynchronous I/O, it also contains a lot of functionality for "normal" synchronous operations. The fact that scala-io is built on `java.nio` does not necessarily mean that it's just for asynchronous I/O.
Jesper
+2  A: 

The whole point of reading a file is to gain state that you didn't have before. I don't, therefore, exactly understand what you're after.

One can pretend that one has the entire universe as an input (and output) parameter and create a "functional" analog, but I've never seen a clear demonstration that this has any superior characteristics.

Most functional data structures allow you to abstract over copy number. For example, a list lets you extend operations on an individual element to all elements in convenient ways (map, reduce, etc.). But when you want to read a file, you need to abstract over data type, and furthermore, you don't actually want it completely abstract--you want to match some sort of template that you expect. How you specify this template--and what to do on error conditions--is, I suspect, the core of your binary file reading challenge.

(Note also that unless you're running on one of those highly multicore Sun boxes, a T2000 for example, you don't need immutability for safety, since one thread is plenty fast enough to handle all of the low-level input processing.)

One possibility is to treat binary file reading as a parsing problem. Scala doesn't have a robust library for that at the moment, but see this thread for a nice bit of code written by Paul Phillips that helps in this regard.
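
To make the parsing view a bit more concrete, here is a minimal sketch of a tiny combinator type. This is not the code from the thread mentioned above; `Read`, `int32`, `int16`, `uint32` and the `Header` layout are all hypothetical names invented for illustration. A `Read[A]` only describes how to read an `A`; the stream is touched when `run` is called.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

object ReadSketch {
  // A description of how to read an A; nothing is read until run is called.
  final case class Read[A](run: DataInputStream => A) {
    def map[B](f: A => B): Read[B] = Read(in => f(run(in)))
    // Runs this reader first, then the reader chosen from its result.
    def flatMap[B](f: A => Read[B]): Read[B] = Read(in => f(run(in)).run(in))
  }

  val int32: Read[Int] = Read(_.readInt())
  val int16: Read[Short] = Read(_.readShort())
  // The JVM has no unsigned int, so widen to Long and mask off the sign extension.
  val uint32: Read[Long] = int32.map(_ & 0xFFFFFFFFL)

  // Hypothetical header layout, used only for illustration.
  final case class Header(magic: Long, version: Short, count: Int)

  // The for-comprehension fixes the order in which fields are read.
  val header: Read[Header] =
    for {
      m <- uint32
      v <- int16
      c <- int32
    } yield Header(m, v, c)

  // Writes a sample header to memory and reads it back.
  def demo(): Header = {
    val buf = new ByteArrayOutputStream()
    val out = new DataOutputStream(buf)
    out.writeInt(0xCAFEBABE); out.writeShort(2); out.writeInt(7)
    out.flush()
    header.run(new DataInputStream(new ByteArrayInputStream(buf.toByteArray)))
  }
}
```

The mutable stream is confined to the single `run` call at the edge, while the layout itself is an ordinary immutable value that can be composed and reused. Error handling (for example on a short file) is left out of this sketch.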

Another possibility is to create some sort of template yourself, like

List(classOf[Float],classOf[Int],classOf[String])

and then write something that parses that stream sequentially with match statements:

val FloatClass = classOf[Float]
listEntry match {
  case FloatClass => // Read float
  ...
}

These sorts of things make the job of reading binary files a lot easier, and it is at least sort of functional, since you can map your input stream of bytes into a List[Any] and then use pattern matching to grab out the data that you want.
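
A fuller version of the template idea might look like the following sketch. The names `readTemplate` and `demo` are hypothetical, and it assumes strings were written with `DataOutputStream.writeUTF`, which may not match your file format.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

object TemplateSketch {
  // Capitalised vals are stable identifiers, so they can be used as patterns.
  val FloatClass = classOf[Float]
  val IntClass = classOf[Int]
  val StringClass = classOf[String]

  // Reads one value per template entry, in template order.
  def readTemplate(in: DataInputStream, template: List[Class[_]]): List[Any] =
    template.map {
      case FloatClass  => in.readFloat()
      case IntClass    => in.readInt()
      case StringClass => in.readUTF() // assumes the string was written with writeUTF
      case other       => throw new IllegalArgumentException("Unsupported type: " + other)
    }

  // Writes a sample record to memory, reads it back via the template,
  // then uses pattern matching to recover the typed values.
  def demo(): (Float, Int, String) = {
    val buf = new ByteArrayOutputStream()
    val out = new DataOutputStream(buf)
    out.writeFloat(1.5f); out.writeInt(42); out.writeUTF("abc")
    out.flush()
    val in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray))
    readTemplate(in, List(FloatClass, IntClass, StringClass)) match {
      case List(f: Float, i: Int, s: String) => (f, i, s)
      case other => throw new IllegalStateException("Unexpected shape: " + other)
    }
  }
}
```

The cost of this approach is that types are only checked at run time, in the final match on the List[Any].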

Rex Kerr