I must be making some stupid mistake. I have a server that returns the XML <a><b>123</b></a>, and I would like to match against that XML. So I write something like:

xml match {
  case <a><b>{_}</b></a> => true
}

This works as long as I do not have to deal with multi-line XML literals. The important thing is that the server sends me the whole XML as a one-liner. The real XML is too large for its pattern to fit on a single line of code, and I cannot figure out how to get a multi-line pattern to work.

The server sends <a><b>123</b><c>123</c><d>123</d><e>123</e><f>123</f></a> and I would like to do this:

xml match {
  case <a>
    <b>{_}</b>
    <c>{valueOfC}</c>
    <d>{_}</d>
    <e>{_}</e>
    <f>{_}</f>
  </a> => valueOfC
}

But I always get a MatchError. If I write everything in a single line it works. So the question is: how can I match XML while writing human-readable code?

I have of course tried to find an answer via Google. Funnily enough, all the examples are one-liners or work recursively.

+2  A: 

When matching, XML with and without newlines or other whitespace is not considered the same. If you use scala.xml.Utility.trim, you can remove whitespace. (You probably want to trim both your input and what the server gives you, unless you're positive the server will send you no whitespace.)

Rex Kerr
error: method trim is not a case class constructor, nor does it have an unapply/unapplySeq method
Joa Ebert
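
As the error above suggests, trim is an ordinary function, so it can only be applied to a value and not inside a pattern. A minimal sketch of the value side, assuming scala-xml is on the classpath (the names TrimInput, pretty and valueOfB are made up for illustration): a pretty-printed response is trimmed first and then matched with a one-line pattern. It does not make a multi-line pattern work.

```scala
import scala.xml.{Node, Utility}

object TrimInput {
  // Hypothetical pretty-printed response: the newlines and indentation
  // between the elements become whitespace-only Text nodes.
  val pretty: Node =
    <a>
      <b>123</b>
    </a>

  // The one-line pattern contains no whitespace Text nodes, so the value
  // is trimmed before it is matched.
  def valueOfB(doc: Node): Option[String] = Utility.trim(doc) match {
    case <a><b>{v}</b></a> => Some(v.text)
    case _                 => None
  }
}
```

Utility.trim expects an element, so this sketch assumes the response root is one.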
+1  A: 

Perhaps you could try something like:

x match {
  case <a><b>{n @ _*}</b></a> => println(n)
}

I'm not saying it will work... but it might

Don Mackenzie
Well, but how would you match the a-b-c-d-e-f case like that? This could become pretty complex and you lose the information about the match order.
Joa Ebert
I just tried this in the REPL:

val x = <a><b><c>123</c><d>123</d> <e>123</e><f>123</f></b></a>
x match { case <a><b>{m @ _*}</b></a> => println(m.getClass.getName) }

and I get "scala.xml.NodeBuffer" with contents ArrayBuffer(<c>123</c>, <d>123</d>, , <e>123</e>, <f>123</f>)
Don Mackenzie
Yes, but how would you match that it satisfies the condition and extract the value? You would have to chain match after match if I am not mistaken, right?
Joa Ebert
I see your point. Sorry Joa, I saw the first part of your question about multipart matching and used something I saw in Jesse Eichar's blog. Regarding the second part of your question, I'll keep trying.
Don Mackenzie
It seems to me that the "best" solution is to hide all the ugly matching code behind extractors, since it is the only reusable approach. Otherwise I have to edit the match-madness in n places. But then again, something like this is not really readable: "xml match { case Entity(_, _, _, _, _, Some(value), _, _, _) => {} }".
Joa Ebert
My best guess is that a case clause split over multiple lines using the literal notation gets caught in whitespace issues and fails to match, whereas the XML case classes could be nested over multiple lines. It won't win any prizes for elegance, though. I'd use the "xpath" operators at least for the top of the hierarchy.
Don Mackenzie
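
One way to combine the {n @ _*} idea from this answer with a check on the order of the children could look like the sketch below. It is not from the thread. It assumes scala-xml on the classpath, and MultiLineMatch and valueOfC are hypothetical names. The children are bound as a sequence and matched against a List pattern, so every element pattern stays a one-liner and the line breaks sit between the patterns, where they are plain Scala whitespace rather than XML text.

```scala
import scala.xml.{Elem, Node}

object MultiLineMatch {
  // Sample input modelled on the question, with a distinct value in <c>.
  val xml: Elem = <a><b>123</b><c>456</c><d>123</d><e>123</e><f>123</f></a>

  def valueOfC(doc: Node): Option[String] = doc match {
    case <a>{children @ _*}</a> =>
      // The list pattern checks both the number and the order of the children.
      children.toList match {
        case List(
              <b>{_}</b>,
              <c>{c}</c>,
              <d>{_}</d>,
              <e>{_}</e>,
              <f>{_}</f>
            ) => Some(c.text)
        case _ => None
      }
    case _ => None
  }
}
```

This still needs one nested match per level of the document, so it probably only pays off for fairly flat XML.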
+2  A: 

This is considerably uglier than I had initially imagined. I do have a partial solution, but I'm not sure it's worth the effort. The default pattern match treats whitespace as tokens, and I've not found any clean way to get around it. So I've done the opposite: decorate the input string with whitespace. This example has just a single level of indentation; you could imagine recursing the whitespace-addition to match your favorite indentation style.

Here's the example (you need to compile and run it; the 2.7 REPL at least doesn't seem to like multi-line XML in case statements).

object Test {

import scala.xml._

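// Returns a copy of `xml` with the whitespace string `w` inserted before,
// between and after its direct children (one level only).
def whiten(xml: Node, w: String): Node = {
  val white = new Text(w)
  val ab = new scala.collection.mutable.ArrayBuffer[Node]()
  ab += white
  xml.child.foreach { b => ab += b; ab += white }
  // Rebuild the element around the padded child list.
  new Elem(
    xml.prefix,
    xml.label,
    xml.attributes,
    xml.scope,
    ab: _*
  )
}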

val xml = <a><b>123</b><c>Works</c></a>

def main(args: Array[String]): Unit = {
  whiten(xml,"""
         """  // You must match the multiline whitespace to your case indentation!
  ) match { 
    case <a>
         <b>123</b>
         <c>{x}</c>
         </a> => println(x)
    case _ => println("Fails")
  }
}

}

Rather inelegant, but it does (marginally) achieve what you want.

Rex Kerr
Thank you for the effort, Rex. However, I wonder why this rather obvious case causes so much stress.
Joa Ebert
A: 

Well, I don't have a solution to the match/case problem. You do need an extractor which whitens the input XML due to how Scala pattern matching works: you cannot apply trim to an XML literal which is a pattern, as that exists just at compile time, patterns being translated into a series of function calls at runtime.
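
To make the extractor idea concrete, here is a rough sketch under my own assumptions (Trimmed and TrimmedDemo are made-up names, scala-xml on the classpath). Note that it goes the opposite way from whitening: it strips whitespace from the input inside unapply, so the nested pattern still has to be written without whitespace between the tags.

```scala
import scala.xml.{Elem, Node, Utility}

object Trimmed {
  // Hypothetical extractor: passes the whitespace-trimmed node on to the
  // nested pattern. Utility.trim expects an element, so any other kind of
  // node is passed through unchanged.
  def unapply(n: Node): Option[Node] = n match {
    case e: Elem => Some(Utility.trim(e))
    case other   => Some(other)
  }
}

object TrimmedDemo {
  def valueOfC(doc: Node): Option[String] = doc match {
    case Trimmed(<a><b>{_}</b><c>{c}</c></a>) => Some(c.text)
    case _                                    => None
  }
}
```

The trimming happens at runtime on the value being matched, which is why it can live in an extractor even though trim itself cannot be written in a pattern.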

However, to get the value of the c tag, you could always use the XPath-like syntax for taking XML apart. For example, to get the value of c in your XML you could use:

// a collection of all the values of all the c subelements (deep search)
val c1 = (xml \\ "c").map(_.text.toInt) 

// same as above, but shallow
val c2 = (xml \ "c").map(_.text.toInt)
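
A self-contained version of the two lines above, as a small sketch with made-up sample data (the nested c inside d is my addition, to show where the deep and the shallow search differ):

```scala
import scala.xml.Elem

object Projection {
  // Hypothetical document: one <c> directly under <a>, one nested inside <d>.
  val xml: Elem = <a><b>123</b><c>456</c><d><c>789</c></d></a>

  // Deep search: every <c> anywhere below <a>, in document order.
  val deep: List[Int] = (xml \\ "c").map(_.text.toInt).toList

  // Shallow search: only the <c> elements that are direct children of <a>.
  val shallow: List[Int] = (xml \ "c").map(_.text.toInt).toList
}
```

Here deep holds 456 and 789, while shallow holds only 456.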

Also see the XML chapter from Programming in Scala (part of which is on Google Books).

Hope it helps,

-- Flaviu Cipcigan

Flaviu Cipcigan