Given the markup and constraints listed below (a specimen; the real markup may be considerably more complicated), could anyone propose a solution in C# that is more effective/efficient than walking the whole tree to retrieve { "@@value1@@", "@@value2@@", "@@value3@@" }, i.e. the list of tokens that will be replaced when the markup is actually used?

Note: I have no control over the markup, structure of the markup or format/naming of the tokens that are being replaced.

<markup>
    <element1 attributea="blah">@@value1@@</element1>
    <element2>@@value2@@</element2>
    <element3>
     <element3point1>@@value1@@</element3point1>
     <element3point2>@@value3@@</element3point2>
     <element3point3>apple</element3point3>
    </element3>
    <element4>pear</element4>
</markup>
A: 

I wrote a quick program with your sample; this should do the trick.

using System.Collections.Generic;
using System.Xml.Linq;

class Program
    {
        // The sample markup was copied to Test.xml
        static void Main(string[] args)
        {
            XDocument doc = XDocument.Load("Test.xml");
            var verbs = new Dictionary<string, string>();
            // Add the values to replace here
            verbs.Add("@@value3@@", "mango");
            verbs.Add("@@value1@@", "potato");
            ReplaceStuff(verbs, doc.Root.Elements());
            doc.Save("Test2.xml");
        }

        // A simple recursive replace method
        static void ReplaceStuff(Dictionary<string, string> verbs, IEnumerable<XElement> elements)
        {
            foreach (var e in elements)
            {
                if (e.HasElements)
                    ReplaceStuff(verbs, e.Elements());
                else
                {
                    // Look up the trimmed text, so surrounding whitespace
                    // cannot cause a missing-key exception
                    string replacement;
                    if (verbs.TryGetValue(e.Value.Trim(), out replacement))
                        e.Value = replacement;
                }
            }
        }
    }
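
If the goal is only to list the tokens rather than replace them, the same kind of traversal can collect them instead. This is a minimal sketch, assuming each token is the entire (trimmed) text of a leaf element, as in the sample; `TokenCollector` and `CollectTokens` are hypothetical names, and note that it still visits every element in the tree.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class TokenCollector
{
    // Hypothetical helper: returns the distinct @@...@@ values found in
    // leaf elements, in order of first appearance.
    public static List<string> CollectTokens(XDocument doc)
    {
        return doc.Descendants()
                  .Where(e => !e.HasElements)          // leaf elements only
                  .Select(e => e.Value.Trim())
                  .Where(v => v.Length > 4
                           && v.StartsWith("@@", StringComparison.Ordinal)
                           && v.EndsWith("@@", StringComparison.Ordinal))
                  .Distinct()
                  .ToList();
    }
}
```

Calling `TokenCollector.CollectTokens(XDocument.Load("Test.xml"))` requires the file to be well-formed XML.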
amazedsaint
+2  A: 

How about:

    var keys = new HashSet<string>();
    Regex.Replace(input, "@@[^@]+@@", match => {
        keys.Add(match.Value);
        return ""; // doesn't matter
    });
    foreach (string key in keys) {
        Console.WriteLine(key);
    }

This:

  • doesn't bother parsing the xml (just string manipulation)
  • only includes the unique values (no need to return a MatchCollection with the duplicates we don't want)

However, Regex.Replace still allocates a new result string that we then discard, so it may be simpler to just use Matches:

var matches = Regex.Matches(input, "@@[^@]+@@");
var result = matches.Cast<Match>().Select(m => m.Value).Distinct();
foreach (string s in result) {
    Console.WriteLine(s);
}
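
For reuse, the Matches approach can be wrapped in a small helper. This is only a sketch: `TokenScanner` and `FindTokens` are hypothetical names, and it assumes the whole markup is already in memory as a string. Because it never parses the XML, it will also pick up tokens that appear in attributes or comments, and it will not match a token whose name itself contains `@`.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class TokenScanner
{
    // Same pattern as above: "@@", one or more non-'@' characters, "@@".
    private static readonly Regex TokenPattern =
        new Regex("@@[^@]+@@", RegexOptions.Compiled);

    // Hypothetical helper: returns each distinct token once,
    // in order of first appearance in the input.
    public static List<string> FindTokens(string input)
    {
        return TokenPattern.Matches(input)
                           .Cast<Match>()
                           .Select(m => m.Value)
                           .Distinct()
                           .ToList();
    }
}
```

Usage might look like `var tokens = TokenScanner.FindTokens(File.ReadAllText("Test.xml"));` (with `using System.IO;`).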
Marc Gravell
Worked a treat (second snippet), thank you! =)
Rob