I'm writing code where I retrieve XML from a web api, then parse that XML using Groovy. Unfortunately, it seems that both XmlParser and XmlSlurper for Groovy strip newline characters from the attributes of nodes when .text() is called.
How can I get at the text of the attribute including the newlines?
Sample code:
def xmltest = '''
<snippet>
<preSnippet att1="testatt1" code="This is line 1
This is line 2
This is line 3" >
<lines count="10" />
</preSnippet>
</snippet>'''
def parsed = new XmlParser().parseText( xmltest )
println "Parsed"
parsed.preSnippet.each { pre ->
println pre.attribute('code');
}
def slurped = new XmlSlurper().parseText( xmltest )
println "Slurped"
slurped.children().each { preSnip ->
println [email protected]()
}
the output of which is:
Parsed
This is line 1 This is line 2 This is line 3
Slurped
This is line 1 This is line 2 This is line 3
Ok, I was able to convert the text before I parsed it, then re-convert after, a la:
def newxml = xmltest.replaceAll( /code="[^"]*/ ) {
return it.replaceAll( /\n/, "~#~" )
}
def parsed = new XmlParser().parseText( xmltest )
def code = pre.attribute('code').replaceAll( "~#~", "\n" )
Not my favorite hack, but it'll do until they fix their XML output.