Hello,

I'm parsing an XML file that has nodes with text like this:

<img src="someUrl1"> American Dollar 1USD | 2,8567 | sometext
<img src="someUrl2"> Euro 1EUR | 3,9446 | sometext
<img src="someUrl3"> Japanese Jen 100JPY | 3,4885 | sometext

What I want to get is values like this:

American Dollar, USD, 2,8567
Euro, EUR, 3,9446
Japanese Jen, JPY, 3,4885

How could I write a regular expression for this? Scala's regular expressions look odd to me and I can't figure them out.

+6  A: 

If I understand you correctly, you just want to use a regex to extract this information. In that case, you can use Scala's extractor functionality and do something like this:

scala> val RegexParser = """(.*) \d+([A-Z]+) \| (.*) \|.*""".r
RegexParser: scala.util.matching.Regex = (.*) \d+([A-Z]+) \| (.*) \|.*

scala> val RegexParser(name,shortname,value) = "American Dollar 1USD | 2,8567 | sometext"
name: String = American Dollar
shortname: String = USD
value: String = 2,8567

scala> val RegexParser(name,shortname,value) = "Euro 1EUR | 3,9446 | sometext"
name: String = Euro
shortname: String = EUR
value: String = 3,9446

scala> val RegexParser(name,shortname,value) = "Japanese Jen 100JPY | 3,4885 | sometext"
name: String = Japanese Jen
shortname: String = JPY
value: String = 3,4885

First, you create an extractor from a regex string by calling r on a String (strictly speaking, the method comes from class StringOps). You can then use this extractor in a pattern to bind all matched groups (name, shortname, value). In this blog post you will find a good explanation.
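Outside the REPL, the same extractor can be used in a match expression. Here is a rough sketch, not a definitive solution. The names CurrencyParser, CurrencyLine, Rate and parseLine are made up for illustration, and I am assuming the node text may still start with the <img ...> tag from your sample, so the pattern skips it optionally. A val pattern like the ones above throws a MatchError on a line that does not match, which is why this version returns an Option instead.

```scala
// Sketch only: CurrencyParser, CurrencyLine, Rate and parseLine are hypothetical names.
object CurrencyParser {
  // Optional leading <img ...> tag (an assumption based on the sample input),
  // then the currency name, the amount followed by the currency code, and the value.
  val CurrencyLine = """(?:<img[^>]*>\s*)?(.*) \d+([A-Z]+) \| (.*) \|.*""".r

  // The value is kept as a String because it uses a decimal comma.
  final case class Rate(name: String, code: String, value: String)

  // Returns None for lines that do not match, instead of throwing a MatchError.
  def parseLine(line: String): Option[Rate] = line match {
    case CurrencyLine(name, code, value) => Some(Rate(name, code, value))
    case _                               => None
  }

  def main(args: Array[String]): Unit = {
    val lines = List(
      """<img src="someUrl1"> American Dollar 1USD | 2,8567 | sometext""",
      """<img src="someUrl3"> Japanese Jen 100JPY | 3,4885 | sometext""",
      "not a currency line"
    )
    // Prints one Option[Rate] per input line.
    lines.map(parseLine).foreach(println)
  }
}
```

Note that a Regex used as an extractor has to match the whole string, so the trailing .* in the pattern is what lets it swallow the "sometext" part.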

Steve