Hi all,

I'm using System.Xml to parse an XML file I generated. The inner text of some nodes contains carriage returns, like this: " \r\n[...]\r\n ". That is because I used Visual Studio to format the file before parsing it.

Is there any way to remove the carriage returns added by formatting tools?

Thanks!

Edit @Jon: Thanks a lot! I used a similar approach to work around the problem.

            char[] escape = { ' ', '\r', '\n' };
            string text = node.InnerText.Trim(escape);

This avoids filtering out carriage returns inside the text itself, since only the leading and trailing characters are trimmed.

As you mentioned, my real question is how to filter out formatting while parsing XML. My assumption is that each node contains just plain text, with no child elements embedded.

Is there any other kind of formatting I should expect besides carriage returns?

+1  A: 

Removing all newlines is easy:

node.InnerText = node.InnerText.Replace("\r", "")
                               .Replace("\n", "");

But your question asks specifically about removing newlines added by formatting tools, and you can't really tell which ones were added manually and which automatically if they're all just in the file.

Also, do you have nodes which have text and elements interspersed, such as:

<node>
   text
   <element />
   text
</node>

If you do, you really need to process each text node individually instead of just changing InnerText.
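
As a rough sketch of what processing each text node could look like, assuming the file is loaded into an XmlDocument (the TextNodeTrimmer class and TrimAll method are names made up for this example, not part of System.Xml):

```csharp
using System;
using System.Linq;
using System.Xml;

static class TextNodeTrimmer
{
    // Hypothetical helper: trims leading/trailing whitespace from every plain text node.
    public static void TrimAll(XmlDocument doc)
    {
        // Take a snapshot of the text nodes first, so the document is not
        // edited while the XPath result is still being enumerated.
        var textNodes = doc.SelectNodes("//text()").OfType<XmlText>().ToList();
        foreach (XmlText textNode in textNodes)
        {
            // Trim() with no arguments removes all leading and trailing whitespace.
            textNode.Value = textNode.Value.Trim();
        }
    }

    static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<node>\r\n   text\r\n   <element />\r\n   more text\r\n</node>");
        TrimAll(doc);
        Console.WriteLine(doc.OuterXml);
    }
}
```

Calling Trim() with no arguments strips every leading and trailing whitespace character, so it also covers the tabs or spaces a formatter may add for indentation, not just "\r" and "\n". Whitespace in the middle of the text is left alone.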

If this isn't helpful to you, could you give more information about the problem?

Jon Skeet