I'm trying to parse an RSS feed that looks like this and read the "date" attribute:

<rss version="2.0">
<channel>
    <item>
        <y:c date="AA"></y:c>
    </item>
</channel>
</rss>

I tried several variations of this (rssFeed contains the RSS data):

println(((rssFeed \\ "channel" \\ "item" \ "y:c" \"date").toString))

But nothing seems to work. What am I missing?

Any help would really be appreciated!

+5  A: 

The "y" in <y:c> is a namespace prefix, not part of the element name that \ and \\ match against. Also, attributes are selected with a leading '@'. Try this:

println(((rssFeed \\ "channel" \\ "item" \ "c" \ "@date").toString))
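
To see the prefix/label split for yourself, here is a minimal sketch. It assumes the scala-xml module is on the classpath and that you paste it into the REPL or run it as a script. The xmlns:y declaration and its URI are placeholders I added so the sample parses regardless of parser settings; they are not from the original feed.

```scala
import scala.xml.XML

// Same shape as the feed in the question. The xmlns:y URI is a
// made-up placeholder, added only so the "y" prefix is declared.
val rssFeed = XML.loadString(
  """<rss version="2.0" xmlns:y="http://example.com/y">
    |<channel>
    |    <item>
    |        <y:c date="AA"></y:c>
    |    </item>
    |</channel>
    |</rss>""".stripMargin)

// Grab the element by its label alone.
val c = (rssFeed \\ "c").head

println(c.label)   // c
println(c.prefix)  // y

// Selecting by "y:c" compares against the label, so it finds nothing.
val withPrefix = rssFeed \\ "y:c"
println(withPrefix.isEmpty)  // true

// The attribute needs the "@" marker.
val date = (rssFeed \\ "channel" \\ "item" \ "c" \ "@date").text
println(date)  // AA
```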
sblundy
+6  A: 

Attributes are retrieved using the "@attrName" selector. Thus, your selector should actually be something like the following:

println((rssFeed \\ "channel" \\ "item" \ "c" \ "@date").text)
Daniel Spiewak
Note the use of .text, which returns the date as a String rather than a Node.
sblundy
Indeed. The `text` method is generally preferable to `toString` since it will gracefully handle the case where your selector grabbed a chunk of XML rather than a `Text` node.
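
A small sketch of that difference, assuming scala-xml is on the classpath. The `title` element is a hypothetical addition for illustration and is not part of the feed in the question.

```scala
import scala.xml.XML

// Hypothetical item with a nested element, used only to show the difference.
val item = XML.loadString("""<item><title>Hello</title></item>""")

val title = item \ "title"

// text concatenates the text content of the selected nodes.
val asText = title.text
println(asText)    // Hello

// toString serializes the selected nodes, markup included.
val asString = title.toString
println(asString)  // <title>Hello</title>
```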
Daniel Spiewak
+1  A: 

Also, think about the difference between \ and \\. \\ matches any descendant, not just a direct child, like this (note that it jumps from channel to c, skipping item):

scala> (rssFeed \\ "channel" \\ "c" \ "@date").text
res20: String = AA

Or this sort of thing if you just want all the <c> elements, and don't care about their parents:

scala> (rssFeed \\ "c" \ "@date").text
res24: String = AA

And this specifies an exact path:

scala> (rssFeed \ "channel" \ "item" \ "c" \ "@date").text
res25: String = AA
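
The same contrast on a feed with more than one item, as a rough sketch. It assumes scala-xml is on the classpath; the second item, the "BB" value, and the namespace URI are made up for illustration.

```scala
import scala.xml.XML

// Hypothetical two-item feed; the xmlns:y URI is a placeholder.
val feed = XML.loadString(
  """<rss version="2.0" xmlns:y="http://example.com/y">
    |<channel>
    |    <item><y:c date="AA"></y:c></item>
    |    <item><y:c date="BB"></y:c></item>
    |</channel>
    |</rss>""".stripMargin)

// \ only looks at direct children: <c> is not a child of <rss>.
val direct = feed \ "c"
println(direct.length)  // 0

// \\ searches all descendants, so it finds both <c> elements.
val deep = feed \\ "c"
println(deep.length)    // 2

// Read the attribute from each matched element in turn.
val dates = deep.map(c => (c \ "@date").text)
println(dates.mkString(","))  // AA,BB
```

Mapping over the matches reads each element's attribute separately, so it does not depend on how \ "@date" behaves when the left-hand side holds more than one element.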
James Moore