Are there any helpers that will transform or escape a string so that it becomes a valid XML name?

For example, I have the string max(OfAll) and need to generate some XML such as:

<max(OfAll)>SomeText</max(OfAll)>

That's obviously not a valid name. Are there any helper methods that can transform the string into a valid XML name?

(For comparison, .NET has methods that would turn the above XML fragment into:

 <max_x0028_OfAll_x0029_>SomeText</max_x0028_OfAll_x0029_>)
A: 

One class that may be useful in other situations is StringEscapeUtils in the Apache Commons Lang project. It can escape text for use in XML documents, but I'm not aware of anything that escapes XML element names.

Could you not generate something more readable such as

<aggregation type="max(OfAll)">SomeText</aggregation>
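A minimal sketch of that attribute-based layout, assuming the StAX XMLStreamWriter that ships with the JDK is acceptable. The class name AggregationWriter and the method write are hypothetical, chosen only for illustration. The writer escapes the attribute value itself, so the raw string never has to be a valid element name:

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper: keeps the element name fixed and stores the
// arbitrary string in an attribute value instead.
final class AggregationWriter {
    private AggregationWriter() {}

    static String write(String type, String text) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("aggregation");
            w.writeAttribute("type", type);   // value is escaped by the writer
            w.writeCharacters(text);          // text is escaped by the writer
            w.writeEndElement();
            w.flush();
            w.close();
        } catch (XMLStreamException e) {
            // Wrapped so that callers of this sketch need no checked exception.
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(write("max(OfAll)", "SomeText"));
    }
}
```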

There are many libraries for marshalling objects to XML and unmarshalling them back, including JAXB (bundled with the JDK from Java 6 through Java 10), JiBX, Castor and XStream.

Jon Freedman
A: 

I don't know of any helper methods for that, but the rules at http://www.w3.org/TR/REC-xml/#NT-Name are pretty straightforward, so it should be easy to implement one.
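As a rough illustration, the Name production could be checked along these lines. This is a sketch that assumes the character ranges of XML 1.0 (Fifth Edition); the class name XmlNames and its methods are hypothetical, not part of any library:

```java
// Hypothetical helper that mirrors the NameStartChar / NameChar productions.
final class XmlNames {
    private XmlNames() {}

    // NameStartChar: characters allowed in the first position of a Name.
    static boolean isNameStartChar(int c) {
        return c == ':' || c == '_'
            || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= 0xC0 && c <= 0xD6) || (c >= 0xD8 && c <= 0xF6)
            || (c >= 0xF8 && c <= 0x2FF) || (c >= 0x370 && c <= 0x37D)
            || (c >= 0x37F && c <= 0x1FFF) || (c >= 0x200C && c <= 0x200D)
            || (c >= 0x2070 && c <= 0x218F) || (c >= 0x2C00 && c <= 0x2FEF)
            || (c >= 0x3001 && c <= 0xD7FF) || (c >= 0xF900 && c <= 0xFDCF)
            || (c >= 0xFDF0 && c <= 0xFFFD) || (c >= 0x10000 && c <= 0xEFFFF);
    }

    // NameChar: everything in NameStartChar plus a few extra characters.
    static boolean isNameChar(int c) {
        return isNameStartChar(c) || c == '-' || c == '.'
            || (c >= '0' && c <= '9') || c == 0xB7
            || (c >= 0x300 && c <= 0x36F) || (c >= 0x203F && c <= 0x2040);
    }

    // Name ::= NameStartChar (NameChar)*, checked code point by code point.
    static boolean isValidName(String s) {
        if (s == null || s.isEmpty()) return false;
        int first = s.codePointAt(0);
        if (!isNameStartChar(first)) return false;
        for (int i = Character.charCount(first); i < s.length(); ) {
            int c = s.codePointAt(i);
            if (!isNameChar(c)) return false;
            i += Character.charCount(c);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidName("max(OfAll)")); // false
        System.out.println(isValidName("maxOfAll"));   // true
    }
}
```

Note that the colon is accepted here because the production allows it, although namespace-aware parsers give it a special meaning.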

Shooshpanchick
+1  A: 

The encoding in your .NET example looks like the one defined in ISO 9075. I don't think there is a built-in implementation in the JDK, but this encoding is also used by content repositories such as Alfresco and Jackrabbit for their XML import/export and query APIs. A quick search turned up these two implementations, both available under open source licenses:

Jörn Horstmann
A: 

As should be clear, normal XML escaping (replacing inappropriate characters with character entities) does not result in a valid XML identifier.

For the record, what you are doing is frequently called "name mangling".
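A minimal sketch of such mangling, using the `_xHHHH_` escape style of the .NET example in the question. It is deliberately simplified: it only passes ASCII letters, digits, '_', '-' and '.' through and escapes everything else one UTF-16 unit at a time, so it is stricter than the XML Name rules and should not be taken as a full ISO 9075 or .NET-compatible implementation. The class name NameMangler is hypothetical:

```java
// Hypothetical, simplified name mangler using "_xHHHH_" escapes.
final class NameMangler {
    private NameMangler() {}

    private static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f');
    }

    // Simplification: only a small ASCII subset is passed through unescaped.
    private static boolean isPlainStart(char c) {
        return isAsciiLetter(c) || c == '_';
    }

    private static boolean isPlainPart(char c) {
        return isPlainStart(c) || (c >= '0' && c <= '9') || c == '-' || c == '.';
    }

    // True if the input already contains "_xHHHH_" at index i; its leading
    // underscore is then escaped so that the result stays unambiguous.
    private static boolean looksLikeEscape(String s, int i) {
        if (i + 6 >= s.length()) return false;
        if (s.charAt(i) != '_' || s.charAt(i + 1) != 'x' || s.charAt(i + 6) != '_') return false;
        for (int k = i + 2; k < i + 6; k++) {
            if (!isHex(s.charAt(k))) return false;
        }
        return true;
    }

    static String encode(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean plain = (i == 0) ? isPlainStart(c) : isPlainPart(c);
            if (!plain || looksLikeEscape(s, i)) {
                out.append(String.format("_x%04X_", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("max(OfAll)")); // prints max_x0028_OfAll_x0029_
    }
}
```

An empty input still yields an empty string, which is not a valid XML name, so callers of a helper like this would need to handle that case separately.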

Stephen C