Hi,

I'm parsing XML, and keep finding myself writing code like:

val xml = <outertag>
<dog>val1</dog>
<cat>val2</cat>
</outertag>

var cat = ""
var dog = ""

for (inner <- xml \ "_") {
  inner match {
    case <dog>{ dg @ _* }</dog> => dog = dg(0).toString()
    case <cat>{ ct @ _* }</cat> => cat = ct(0).toString()
  }
}

/* do something with dog and cat */

It annoys me because I only need to set cat and dog once, so I should be able to declare them as val (immutable), yet this approach forces me to make them mutable. Besides that, it seems like there must be a better way to do this in Scala. Any ideas?

+1  A: 

Consider wrapping up the XML inspection and pattern matching in a function that returns the multiple values you need as a tuple (Tuple2[String, String]). But stop and consider: it looks like it's possible to not match any dog and cat elements, which would leave you returning null for one or both of the tuple components. Perhaps you could return a tuple of Option[String], or throw if either of the element patterns fails to bind.

In any case, you can generally solve these initialization problems by wrapping up the constituent statements into a function to yield an expression. Once you have an expression in hand, you can initialize a constant with the result of its evaluation.
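As a rough sketch of that idea, assuming the scala-xml library is on the classpath: the names childText and dogAndCat below are made up for illustration, and the value is called doc rather than xml only to keep it distinct from the scala.xml package. The function returns a tuple of Option[String], so a missing element shows up as None instead of null.

```scala
import scala.xml.{Elem, NodeSeq}

// Hypothetical helper: text of the first direct child named `tag`, if any.
def childText(parent: NodeSeq, tag: String): Option[String] =
  (parent \ tag).headOption.map(_.text)

// Hypothetical function that wraps the whole inspection and returns one value.
def dogAndCat(outer: Elem): (Option[String], Option[String]) =
  (childText(outer, "dog"), childText(outer, "cat"))

val doc = <outertag>
  <dog>val1</dog>
  <cat>val2</cat>
</outertag>

// Both names are immutable and are bound exactly once.
val (dog, cat) = dogAndCat(doc)
```

Here dog is Some("val1") and cat is Some("val2"). If you would rather fail fast, you could call .get or pattern match on the Options at the call site.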

seh
Thanks! Can you show an example using a yield expression? Or at least a good url that talks about them. Like I said, I'm a newb =)
Stephen
That wasn't quite what I meant. By defining a function that returns a value -- here, a Tuple2[String, String] -- you now have an expression available, because calling a function that returns a value is an expression. Declaring and initializing a "val" is itself a statement, and the right hand side (the initial value) must be an expression, not a statement. That means that the right hand side can be a literal value, or a function invocation, but it can't be one or more statements. It's this distinction between statements and expressions that often encourages one to write these small functions.
seh
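To make the statement/expression distinction concrete, here is a small sketch with no XML involved. The names parseLevel, level and label are invented for the example. It also shows that a block in braces counts as an expression in Scala, so a few statements can sit on the right hand side as long as they are wrapped in a block whose last line is the value.

```scala
// Hypothetical stand-in for any multi-step computation.
def parseLevel(raw: String): Int = {
  val trimmed = raw.trim       // a statement inside the function body
  if (trimmed.isEmpty) 0       // the final expression is the result
  else trimmed.toInt
}

// The right hand side is a function call, which is an expression.
val level = parseLevel(" 42 ")

// A block is also an expression; its value is its last expression.
val label = {
  val prefix = "level-"
  prefix + level
}
```

Both level and label are plain vals here, even though several steps were needed to compute them.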
+2  A: 

Here are two (now make it three) possible solutions. The first one is pretty quick and dirty. You can run the whole bit in the Scala interpreter.

val xmlData = <outertag>
<dog>val1</dog>
<cat>val2</cat>
</outertag>

// A very simple way to do this mapping.
def simpleGetNodeValue(x:scala.xml.NodeSeq, tag:String) = (x \\ tag).text

val cat = simpleGetNodeValue(xmlData, "cat")
val dog = simpleGetNodeValue(xmlData, "dog")

cat will be "val2", and dog will be "val1".

Note that if either node is not found, an empty string will be returned. You can work around this, or you could write it in a slightly more idiomatic way:

// A more idiomatic Scala way, even though Scala wouldn't give us nulls.
// This returns an Option[String].
def getNodeValue(x:scala.xml.NodeSeq, tag:String) = {
  (x \\ tag).text match {
    case "" => None
    // Named `text` so the pattern variable does not shadow the parameter x.
    case text => Some(text)
  }
}

val cat1 = getNodeValue(xmlData, "cat") getOrElse "No cat found."
val dog1 = getNodeValue(xmlData, "dog") getOrElse "No dog found."
val goat = getNodeValue(xmlData, "goat") getOrElse "No goat found."

cat1 will be "val2", dog1 will be "val1", and goat will be "No goat found."

UPDATE: Here's one more convenience method to take a list of tag names and return their matches as a Map[String, String].

// Searches for all tags in the List and returns a Map[String, String].
def getNodeValues(x:scala.xml.NodeSeq, tags:List[String]) = {
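  // Adding a pair to an immutable Map returns a new Map, which becomes the next accumulator.
  tags.foldLeft(Map[String, String]()) { (acc, tag) => acc + (tag -> simpleGetNodeValue(x, tag)) }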
}

val tagsToMatch = List("dog", "cat")
val matchedValues = getNodeValues(xmlData, tagsToMatch)

If you run that, matchedValues will be Map(dog -> val1, cat -> val2).

Hope that helps!

UPDATE 2: Per Daniel's suggestion, I'm using the double-backslash operator, which will descend into child elements, which may be better as your XML dataset evolves.
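Here is a small sketch of the difference between the two operators, assuming scala-xml is on the classpath. The nested document and the value names are made up for illustration. The single backslash only looks at direct children, while the double backslash searches all descendants.

```scala
import scala.xml.Elem

// Hypothetical document where <dog> sits one level deeper than <cat>.
val nested: Elem =
  <outertag>
    <pets>
      <dog>val1</dog>
    </pets>
    <cat>val2</cat>
  </outertag>

val directDog = (nested \ "dog").text   // dog is not a direct child, so this is empty
val deepDog = (nested \\ "dog").text    // found among the descendants
val directCat = (nested \ "cat").text   // cat is a direct child

// Caveat: .text on a NodeSeq concatenates the text of every match.
val repeated = (<a><b>1</b><b>2</b></a> \\ "b").text
```

directDog is the empty string, deepDog is "val1", directCat is "val2", and repeated is "12". That last case is worth keeping in mind: if a tag can occur more than once, the double backslash may hand you several values glued together.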

Steven Merrill
I sense someone has been working quite a bit with XML lately... :-) However, I propose you use double backslash instead of single backslash in the first example. (damned stackoverflow keeps messing with the backslashes)
Daniel
I haven't really been working much with XML - I'm just trying to get a feel for Scala. In any case, I'm trying to clean up the easier questions so you can answer the doozies, like some weird Python conversion.
Steven Merrill
Good call on the double-backslash, by the way. :)
Steven Merrill
+2  A: 
scala> val xml = <outertag><dog>val1</dog><cat>val2</cat></outertag>
xml: scala.xml.Elem = <outertag><dog>val1</dog><cat>val2</cat></outertag>

scala> val cat = (xml \\ "cat").text
cat: String = val2

scala> val dog = (xml \\ "dog").text
dog: String = val1
Daniel