I have an XML document that is not well-formed: some empty tags are missing the "/" at the end, for example: <loader local="test.bat" dir="/usr/home">. How can I elegantly (using regex :)) add a "/" at the end of each "loader" tag in Java? The result should be:

 <loader local="test.jpg" dir="/usr/home"/>
A: 

Usual disclaimer: regular expressions are really not the best choice for processing XML. Almost any regular expression you see here will be flawed in some way, so the regex in this answer is not intended to be put into arbitrary code, only for highly controlled use.

Here is a possible solution (it will not work if you have closing angle brackets in attribute values, for example):

xml.replaceAll("<loader\\b(.*?)>", "<loader$1/>");
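A slightly more defensive variant of the same idea, as a minimal sketch rather than a definitive fix. The class name `LoaderTagFixer` and the method `closeLoaderTags` are made up for illustration. It assumes every "loader" tag is meant to be empty (no separate `</loader>` end tag) and that no attribute value contains ">". Unlike the one-liner above, it also leaves tags that already end in "/>" with a single slash.

```java
import java.util.regex.Pattern;

public class LoaderTagFixer {
    // Hypothetical helper, not from the original answer.
    // [^>]*? captures the attributes; the optional "/" before ">" lets
    // already self-closed tags match too, so they do not get a second slash.
    private static final Pattern LOADER_TAG =
            Pattern.compile("<loader\\b([^>]*?)\\s*/?>");

    public static String closeLoaderTags(String xml) {
        // Strings are immutable in Java: the fixed text is the return value,
        // the argument itself is never modified.
        return LOADER_TAG.matcher(xml).replaceAll("<loader$1/>");
    }

    public static void main(String[] args) {
        String xml = "<abc><loader local=\"test.bat\" dir=\"/usr/home\"></abc>";
        xml = closeLoaderTags(xml); // assign the result back
        System.out.println(xml);
        // prints: <abc><loader local="test.bat" dir="/usr/home"/></abc>
    }
}
```

The "/" inside dir="/usr/home" is not mistaken for the closing slash, because the pattern only accepts a "/" that is immediately followed by ">".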
soulmerge
I have no angle brackets in attributes, but it doesn't work...
Le_Coeur
Are you sure? Have you assigned the return value back to the variable (i.e. `xml = xml.replaceAll(...)`)?
soulmerge
+1  A: 

This might not be much help, but I think it saves time overall.

If I had this sort of issue, the first thing I would do is go to the provider of the data and ask them for a correct file. If they said they would provide XML, then they should provide a valid file; XML is a well-defined standard, so it is easy to show that the file is invalid.

One of the main benefits of XML is that it is a standard, and you can use many well-tested and well-supported tools with it. If the file is not XML, then it is just another undocumented format and everyone has to spend time dealing with the mess.

Only if the supplier won't fix it should you do the coding. In that case, however, the supplier has failed to meet their contract, which affects how you deal with them in the future.
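To back up the claim that it is easy to show the file is invalid, here is a minimal sketch using the XML parser that ships with the JDK. The class name `WellFormedCheck` and the method `isWellFormed` are invented for this example, and it only checks well-formedness, not validity against a schema or DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {
    // Hypothetical helper: returns true if the text parses as well-formed XML.
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input, e.g. because of an unclosed tag.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String broken = "<abc><loader local=\"test.bat\" dir=\"/usr/home\"></abc>";
        String fixed = "<abc><loader local=\"test.bat\" dir=\"/usr/home\"/></abc>";
        System.out.println(isWellFormed(broken)); // prints: false
        System.out.println(isWellFormed(fixed));  // prints: true
    }
}
```

A failing check like this, together with the parser's error message, is usually enough evidence to send back to the supplier.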

Mark
+1  A: 

I am not sure whether there is a regular expression that can do this in a generic XML document, but if you just want to transform the tags into valid XML you can use Tidy.

For example, it is integrated into Notepad++:

TextFX - TextFx Html Tidy - Tidy Reindent Xml

<abc>
    <loader local="test.jpg" dir="/usr/home"/>
</abc>

results in

<abc>
  <loader local="test.jpg" dir="/usr/home" />
</abc>

which is probably what you expect. Tidy can also be integrated into applications, as Notepad++ does.

Totonga