For a one-shot operation, I need to parse the contents of an XML string and change the numbers in the "ID" field. However, I cannot risk changing anything else in the string: whitespace, line feeds, etc. MUST remain exactly as they are!

In my experience, XmlReader tends to mess up whitespace and may even reformat the XML, so I don't want to use it (but feel free to convince me otherwise). This also screams for regex, but I'm not good at regex, particularly not the .NET implementation.

Here's a short part of the string; the number in the ID field needs to be updated in some cases. There can be many such VAR entries in the string, so I need to convert each ID to Int32, compare and modify it, then put it back into the string.

<VAR NAME="sf_name" ID="1001210">

I am looking for the simplest (in terms of coding time) and safest way to do this.

+4  A: 

The regex pattern you are looking for is:

ID="(\d+)"

Match group 1 would contain the number. Use a MatchEvaluator Delegate to replace matches with dynamically calculated replacements.

Regex r = new Regex("ID=\"(\\d+)\"");
string outputXml = r.Replace(inputXml, new MatchEvaluator(ReplaceFunction));

where ReplaceFunction is something like this:

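public string ReplaceFunction(Match m)
{
  // m.Groups[1] holds the digits captured by (\d+); int.Parse throws if they overflow Int32
  int id = int.Parse(m.Groups[1].Value);
  int result = id; // compute the new ID here; keep id to leave the value unchanged
  // the return value replaces the whole match, so the ID="..." wrapper must be rebuilt
  return "ID=\"" + result + "\"";
}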

If you need I can expand the Regex to match more specifically. Currently all ID values (that contain numbers only) are replaced. You can also build that bit of "extra intelligence" into the match evaluator function and make it return the match unchanged if you don't want to change it.
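To make the "return the match unchanged" idea concrete, here is a minimal, self-contained sketch. The class name IdRewriter and the remap callback are hypothetical names chosen for illustration; the \b at the start of the pattern is an added assumption so that attributes merely ending in "ID" (such as GUID) are skipped.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class IdRewriter
{
    // Matches ID="digits". The leading \b is an assumption: it stops the
    // pattern from matching inside longer attribute names such as GUID="123".
    private static readonly Regex IdPattern = new Regex(@"\bID=""(\d+)""");

    // remap is a hypothetical callback that holds the compare-and-modify rule.
    public static string RewriteIds(string inputXml, Func<int, int> remap)
    {
        return IdPattern.Replace(inputXml, m =>
        {
            int oldId;
            // Leave the text untouched if the digits do not fit into an Int32.
            if (!int.TryParse(m.Groups[1].Value, NumberStyles.None,
                              CultureInfo.InvariantCulture, out oldId))
                return m.Value;

            int newId = remap(oldId);
            // Returning m.Value keeps the original characters byte for byte.
            return newId == oldId
                ? m.Value
                : "ID=\"" + newId.ToString(CultureInfo.InvariantCulture) + "\"";
        });
    }
}
```

Called as IdRewriter.RewriteIds(inputXml, id => id == 1001210 ? 2001210 : id), this turns the sample line into <VAR NAME="sf_name" ID="2001210"> and leaves every other character of the string alone.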

Tomalak
Thanks! Works like a charm. Only minor error is a missing \ in the regex string: "ID=\"(\d+)\"" ... should be "ID=\"(\\d+)\""
steffenj
Of course. I'll fix that.
Tomalak
A: 

Take a look at the PreserveWhitespace property of the XmlDocument class.
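A rough sketch of how that might look, assuming the string is a complete, well-formed document without namespaces and the elements are named VAR as in the question. DomIdRewriter and remap are hypothetical names. PreserveWhitespace keeps the whitespace between nodes, but details inside tags (spacing between attributes, quote style, empty-element syntax) may still be rewritten on output, so compare the result against the input before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Xml;

public static class DomIdRewriter
{
    // remap is a hypothetical callback that holds the compare-and-modify rule.
    public static string RewriteIds(string inputXml, Func<int, int> remap)
    {
        var doc = new XmlDocument();
        doc.PreserveWhitespace = true; // has to be set before the XML is loaded
        doc.LoadXml(inputXml);

        // Take a snapshot of the ID attributes before changing any of them.
        // The XPath assumes VAR elements without a namespace.
        var ids = new List<XmlAttribute>();
        foreach (XmlNode node in doc.SelectNodes("//VAR/@ID"))
            ids.Add((XmlAttribute)node);

        foreach (XmlAttribute id in ids)
        {
            int oldId;
            // Non-numeric or overflowing values are skipped.
            if (int.TryParse(id.Value, NumberStyles.None,
                             CultureInfo.InvariantCulture, out oldId))
                id.Value = remap(oldId).ToString(CultureInfo.InvariantCulture);
        }
        return doc.OuterXml;
    }
}
```

If even small differences inside tags are unacceptable, replacing on the raw string avoids the round trip through the DOM entirely.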

bruno conde