Hi,

I have a String that I'm passing to log4j to be logged to a file. The content of that string is XML, formatted across multiple lines with indentation and so forth to make it easy to read.

However, I would like that XML to be all on one line. How can I do this? I've had a look at StringUtils, and I guess I could strip the tabs and carriage returns, but there must be a cleaner way?

Thanks

+1  A: 

Perhaps with JDom http://www.jdom.org/

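// JDOM 1.x package names; JDOM 2 uses org.jdom2 instead.
import java.io.ByteArrayInputStream;
import java.io.IOException;
import org.jdom.Document;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

// Parse the XML string into a JDOM Document.
public static Document createFromString(final String xml) {
    try {
        return new SAXBuilder().build(new ByteArrayInputStream(xml.getBytes("UTF-8")));
    } catch (JDOMException e) {
        throw new RuntimeException(e);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

// Raw format outputs whitespace exactly as it was parsed.
public static String renderRaw(final Document description) {
    return renderDocument(description, Format.getRawFormat());
}

// Serialize the document to a String using the given format.
public static String renderDocument(final Document description, final Format format) {
    return new XMLOutputter(format).outputString(description);
}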
Sylvain M
`getCompactFormat()`, more likely
skaffman
Isn't that fairly expensive for just producing a log message? And it would also require the XML to be valid for it to be logged. Not that the solution isn't elegant, but it does have some drawbacks which may or may not be important.
extraneon
I'm interested in knowing what is "not elegant"?
Sylvain M
He didn't say/mean it that way :) Note the *"Not"* in front of the sentence.
BalusC
extraneon brings up a good point about logging invalid XML. You could make the case that invalid XML needs logging even more than valid XML does!
glowcoder
@BalusC: Oh, OK, I really must get better at English comprehension...
Sylvain M
Who needs nail clippers? I got me this here chainsaw... ;)
Carl Smotricz
A: 
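String oneline(String multiline) {
    // split() takes a regex; the platform line separator ("\n" or "\r\n") is valid as one.
    String[] lines = multiline.split(System.getProperty("line.separator"));
    StringBuilder builder = new StringBuilder();
    builder.ensureCapacity(multiline.length()); // prevent resizing
    for (String line : lines) builder.append(line);
    return builder.toString(); // return the joined text, not the array's toString()
}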
glowcoder
+4  A: 

I'd throw a regexp replace on it. That's not highly efficient but sure to be faster than XML parsing!

This is untested:

 String cleaned = original.replaceAll("\\s*[\\r\\n]+\\s*", "").trim();

If I haven't goofed, that will eliminate all line terminators as well as any whitespace immediately following those line terminators. The whitespace at the beginning of the pattern should kill any trailing whitespace on individual lines. trim() is thrown in for good measure to eliminate whitespace at the start of the first line and the end of the last.

Carl Smotricz
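For anyone who wants to try the pattern above before putting it in a logging path, here is a minimal sketch that runs it against a small indented XML string. The class name `OneLineDemo`, the helper `toOneLine` and the sample XML are made up for illustration; they are not part of the original answer.

```java
public class OneLineDemo {
    // Same pattern as in the answer: remove line terminators together with
    // any whitespace directly before and after them, then trim the ends.
    static String toOneLine(String original) {
        return original.replaceAll("\\s*[\\r\\n]+\\s*", "").trim();
    }

    public static void main(String[] args) {
        // Hypothetical sample input mixing \r\n and \n line endings.
        String xml = "<root>  \r\n    <child>text</child>\n\n</root>\n";
        System.out.println(toOneLine(xml)); // prints <root><child>text</child></root>
    }
}
```

Spaces inside a line (for example between attributes on the same line) are left alone, because the pattern only matches whitespace runs that contain at least one `\r` or `\n`.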
You can just use `[\r\n]`, no need to doubleslash there.
polygenelubricants
@polyg wouldn't you need the double because you want the resulting regex to be `\r`, and you need to escape the `\\` in the Java string?
glowcoder
@glowcoder: the resulting regex doesn't _HAVE_ to be `\r`; it can just be the actual unescaped character.
polygenelubricants
As I understand it, you could double the \ to have the regexp scanner evaluate it, or leave it single so the `\r` is treated as a CR by the Java language and inserted verbatim in the brackets.
Carl Smotricz
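A small sketch of the point discussed in these comments, assuming nothing beyond the standard library: with doubled backslashes the regex engine sees the escape `\r`, with single backslashes the Java compiler puts a literal CR character into the pattern, and both character classes match the same input. The class and method names are invented for the demo.

```java
public class EscapeDemo {
    // Regex engine receives the two-character escapes \r and \n.
    static String stripDoubled(String s) {
        return s.replaceAll("[\\r\\n]", "");
    }

    // Java compiler embeds actual CR and LF characters in the pattern.
    static String stripSingle(String s) {
        return s.replaceAll("[\r\n]", "");
    }

    public static void main(String[] args) {
        String input = "a\r\nb\nc";
        System.out.println(stripDoubled(input));                             // prints abc
        System.out.println(stripDoubled(input).equals(stripSingle(input))); // prints true
    }
}
```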
On second thought, I'm putting the double backslashes back in! To hell with tiny differences in inefficiency, the inconsistency in the notation looks irritating to me, a maintenance accident waiting to happen.
Carl Smotricz
@poly, carl, `glowcoder.regexKnowledge++;` thanks!
glowcoder
More concise, no need for a 3rd-party lib... much better than mine. And thanks for the regex training :)
Sylvain M
I *did* goof! The 2nd `\s` was followed by `+`, meaning that a line terminator *without* leading whitespace on the next line would not be eliminated. I've changed that to a `*`.
Carl Smotricz
Thanks for the answers. Carl's is the simplest solution, so I've gone for that :)
James.Elsey
Just make sure your formatted XML doesn't have any attributes on new lines, which many formatters output, or it won't be well-formed XML after this: `<a\r\n foo="bar"` becomes `<afoo="bar"`
Pete Kirkham
@Pete: Argh! You're correct of course. Something to watch out for. I suppose it wouldn't hurt to consistently re-insert one blank for every newline removed?
Carl Smotricz
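A rough sketch of the variant suggested in the last comment: replace each run of line terminators (and the whitespace around it) with a single space instead of nothing. The class name `OneLineWithSpaces`, both helper names and the sample input are assumptions for illustration only.

```java
public class OneLineWithSpaces {
    // Original approach: line breaks and surrounding whitespace are dropped.
    static String stripBreaks(String original) {
        return original.replaceAll("\\s*[\\r\\n]+\\s*", "").trim();
    }

    // Variant: every removed line break leaves one space behind, so an
    // attribute that started on its own line stays separated from the tag name.
    static String breaksToSpaces(String original) {
        return original.replaceAll("\\s*[\\r\\n]+\\s*", " ").trim();
    }

    public static void main(String[] args) {
        // Hypothetical formatter output with an attribute on its own line.
        String xml = "<a\r\n   foo=\"bar\">\r\n  <b/>\r\n</a>\r\n";
        System.out.println(stripBreaks(xml));    // prints <afoo="bar"><b/></a>
        System.out.println(breaksToSpaces(xml)); // prints <a foo="bar"> <b/> </a>
    }
}
```

The price is a stray space between elements, which is probably acceptable for a log line. Text content that itself spans several lines still gets its line breaks collapsed either way.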