What would be the best way to parse, for example, the string below and make a valid XML document out of it with Java? For example, '\b' would be converted to <b> </b>, spaces to </space>, etc. I'm rather new to XML, so sorry if this is a really stupid question. :)

Example string:

Lorem\B ipsum\I dolor\B sit \COLOR=RGB(255,0,0)amet\COLOR=RGB(0,255,0) consectetur\COLOR adipisicing\COLOR elit.

Thanks in advance!

A: 

You'll have to parse your string and do it yourself. There's nothing that I know of that will read your mind and create XML from what you posted.

You could use JDOM to create the XML once you've parsed the string.
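As a rough illustration of building the document through an API rather than by gluing strings together, here is a minimal sketch. It uses the DOM and Transformer classes that ship with the JDK as a stand-in for JDOM (that swap is my assumption, not part of the suggestion above), and the class name, element names and text are made up for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class: builds a tiny document with the JDK's DOM API
// (used here instead of JDOM) and serializes it to a string.
class DomBuildSketch {
    static String build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();

            // <text> is an assumed root element; XML needs exactly one root.
            Element root = doc.createElement("text");
            doc.appendChild(root);
            root.appendChild(doc.createTextNode("Lorem"));

            // Text nodes are escaped by the serializer, so '&' is safe here.
            Element bold = doc.createElement("b");
            bold.appendChild(doc.createTextNode(" ipsum & dolor"));
            root.appendChild(bold);

            // Serialize the tree without the leading <?xml ...?> declaration.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build the document", e);
        }
    }
}
```

Building the tree this way means nesting and escaping are handled by the library, so the output cannot end up with overlapping tags.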

duffymo
Well yes, I want to do it myself. I'm just asking for the best way to do it. A quick search showed that there are a lot of options for XML in Java. So
S W Erdnase
A: 

Your format looks a bit like RTF.

Here is a sample that converts RTF to XML. This could solve your XML part of the problem.

To read your format you could think about writing your own EditorKit. (The sample code uses RTFEditorKit.)

stacker
+1  A: 

The mechanics of converting it into XML are easy enough. Either you write a general parser that emits the XML as a string and then load that string with a document reader (which is easy, but means you have to validate the result), or you generate the XML tree as you go along (more complicated, but it cuts down on validation). The problem with your example above is defining what you will allow in your language:

Lorem\B ipsum\I dolor\B sit \COLOR=RGB(255,0,0)amet\COLOR

Is this supposed to come out as

Lorem<b> ipsum<i> dolor</b> sit<color>=rgb(255,0,0)amet</color>

or

Lorem<b> ipsum</b><i> dolor</i><b> sit</b><color>RGB(255,0,0)amet</color><color>

Neither seems particularly like what you would want: the first isn't well-formed XML (<i> is never closed and overlaps </b>), and the second means you could only ever make one word bold at a time (and never bold and italic together).

It seems to be heading back towards SGML, where you need an additional file (such as a DTD) to know what is permitted.
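To make the ambiguity concrete, here is a minimal sketch of one possible interpretation, and it is only an assumption about what the format means: \B and \I are treated as on/off toggles, and when a tag is switched off while another is still open inside it, the inner tags are closed and reopened so the output stays properly nested. The class name, the <text> root element and the decision to leave \COLOR untouched are all choices made for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical converter: assumes \B and \I are toggles. \COLOR is not
// handled and passes through as literal text.
class ToggleMarkup {
    static String toXml(String input) {
        StringBuilder out = new StringBuilder("<text>"); // assumed root element
        Deque<String> open = new ArrayDeque<>();         // innermost open tag first
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            boolean isToggle = c == '\\' && i + 1 < input.length()
                    && (input.charAt(i + 1) == 'B' || input.charAt(i + 1) == 'I');
            if (isToggle) {
                toggle(input.charAt(i + 1) == 'B' ? "b" : "i", open, out);
                i += 2;
            } else {
                appendEscaped(c, out);
                i++;
            }
        }
        // Close whatever is still open at the end of the input.
        while (!open.isEmpty()) {
            out.append("</").append(open.pop()).append('>');
        }
        return out.append("</text>").toString();
    }

    private static void toggle(String tag, Deque<String> open, StringBuilder out) {
        if (!open.contains(tag)) {
            open.push(tag);
            out.append('<').append(tag).append('>');
            return;
        }
        // Close tags down to the one being switched off, remembering the rest.
        Deque<String> reopen = new ArrayDeque<>();
        while (true) {
            String top = open.pop();
            out.append("</").append(top).append('>');
            if (top.equals(tag)) {
                break;
            }
            reopen.push(top);
        }
        // Reopen the remembered tags in their original nesting order.
        while (!reopen.isEmpty()) {
            String t = reopen.pop();
            open.push(t);
            out.append('<').append(t).append('>');
        }
    }

    private static void appendEscaped(char c, StringBuilder out) {
        switch (c) {
            case '&': out.append("&amp;"); break;
            case '<': out.append("&lt;"); break;
            case '>': out.append("&gt;"); break;
            default: out.append(c);
        }
    }
}
```

With this reading, calling ToggleMarkup.toXml on the first part of the example (up to "sit") gives bold from "ipsum" to "dolor" and italic from "dolor" onwards, with the italic element split in two where the bold ends. Whether that is what the format is supposed to mean is exactly the question you need to answer first.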

The easiest way for you to test it would be to write a parser that appends its output to a StringBuilder; when you are done, you just need to do something like:

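import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

StringBuilder stringBuilder = new StringBuilder();
...
// parse the input string into the StringBuilder
...
String xml = stringBuilder.toString();

// Parsing throws a SAXException if the generated XML is not well-formed.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(xml)));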

This will give you the result as a DOM, if that is what you want (or throw an exception if you feed it a badly nested string like the first one above).
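If you want to see the difference between the two cases, the snippet above can be wrapped into a small check. This is a minimal sketch; the class and method names are invented for the example, and the <text> root element in the usage note is an assumption, since the original string has no root of its own.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: reports whether a generated string parses as XML.
class WellFormedCheck {
    static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    static boolean isWellFormed(String xml) {
        try {
            parse(xml);
            return true;
        } catch (SAXException e) {
            // The parser rejects overlapping or unclosed tags with a SAXException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for an in-memory string with the default factory.
            throw new IllegalStateException(e);
        }
    }
}
```

For instance, isWellFormed should accept "<text>Lorem<b> ipsum</b></text>" and reject "<text>Lorem<b> ipsum<i> dolor</b> sit</i></text>", because the <b> and <i> elements overlap in the second string. The default parser may also print the error to standard error when it rejects a document.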

Woody