I'm developing a small application which needs to replace .NET-style "variables" in strings with their values. An example is given below:

Replace:

"{D}/{MM}/{YYYY}"

With: "31/12/2009"

I have decided to use Regex to find the occurrences of "{var}" but I'm not too sure how to properly go about it.

A: 

Here's some code I use in a logging utility that does almost exactly what you want, although with a slightly different format:

 Regex rgx = new Regex("{([Dd]:([^{}]+))}"); // [^{}] keeps a match from running past its own closing brace

 private string GenerateTrueFilename()
 {
  // Figure out the new file name
  string lfn = LogFileName;
  while (rgx.IsMatch(lfn))
  {
   Match m = rgx.Matches(lfn)[0];
   string tm = m.Groups[1].Value;
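   // Case-insensitive, because the pattern also accepts "d:"; otherwise "{d:...}" would loop forever
   if (tm.StartsWith("D:", StringComparison.OrdinalIgnoreCase))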
   {
    string sm = m.Groups[2].Value;
    string dts = DateTime.Now.ToString(sm);
    lfn = lfn.Replace(m.Groups[0].Value, dts);
   }
  }

  return lfn;
 }

The regex defines the token syntax, which currently supports only one form:

{D:datestring}

LogFileName is a variable defined elsewhere in code, but has a default:

LogFileName = "LogFile.{D:yyyy-MM-dd}.log"

so hopefully you can see from this how to proceed with other variables if needed.
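
As a rough sketch of how other variables might be added, the loop can be swapped for a single Regex.Replace call with a switch on the token kind. This is not the logging utility's actual code: the LogFileNameExpander class, its Expand method, and the "E" (environment variable) token are all assumptions made up for illustration, and the date is passed in so the result is repeatable.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class LogFileNameExpander
{
    // Matches {X:argument}, where X is a single-letter token kind.
    private static readonly Regex TokenPattern = new Regex(@"\{([A-Za-z]):([^{}]+)\}");

    // Expands every recognised token in one pass over the template.
    public static string Expand(string template, DateTime now)
    {
        return TokenPattern.Replace(template, m =>
        {
            string kind = m.Groups[1].Value.ToUpperInvariant();
            string argument = m.Groups[2].Value;
            switch (kind)
            {
                case "D": // date token, as in the answer above
                    return now.ToString(argument, CultureInfo.InvariantCulture);
                case "E": // hypothetical token: environment variable lookup
                    return Environment.GetEnvironmentVariable(argument) ?? m.Value;
                default:  // unknown kinds are left untouched
                    return m.Value;
            }
        });
    }
}
```

With the default from above, LogFileNameExpander.Expand("LogFile.{D:yyyy-MM-dd}.log", new DateTime(2009, 12, 31)) should give "LogFile.2009-12-31.log".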

Michael Bray
+1  A: 

Regular expressions are really powerful, but they also carry a fair amount of overhead. It would probably be simpler to use something like inputString.Replace("{var}", value) instead:

void Foo()
{
    string s = "{D}/{MM}/{YYYY}";
    s = ReplaceTokenInString(s, "{D}", "31");
    s = ReplaceTokenInString(s, "{MM}", "12");
    s = ReplaceTokenInString(s, "{YYYY}", "2009");
}

string ReplaceTokenInString(string input, string token, string replacement)
{
    // Strings are immutable: Replace returns a new string, so it must be returned
    return input.Replace(token, replacement);
}
Jeremy Seghi
Except for creating those 2 intermediate strings... Not a big deal with only 3 replacements on a small string, but it's not very smart. Using a StringBuilder and its sb.Replace method would probably be better when you consider that the whole point of the 'program' is to replace things. Also, why confuse things with a one-line method, ReplaceTokenInString, which does exactly what s.Replace does?
Robert Paulson
@Robert Paulson: A replace-in-place with a StringBuilder isn't necessarily more efficient than creating a new string--not if the token and its replacement are different lengths. Your MatchEvaluator solution beats both approaches handily by doing everything in one pass.
Alan Moore
@Alan M, true, which is why I said 'probably' because I've never profiled it, but would at least hope it's better.
Robert Paulson
@Robert Paulson: My thought process was probably being derailed by the thought of the overhead impact the Regex engine would have if the tokens were a small finite set. I honestly just wasn't thinking of having the tokens defined on the fly. Either way your solution runs rings around mine. I would've commented sooner, but I didn't realize I could comment on my own answers regardless of rep until a couple weeks ago. D'oh!
Jeremy Seghi
I guess any time I see strings being created/used a bunch of times I wonder if it's the best way, and my first instinct is to use a StringBuilder (hence my original comment). Each intermediate exists somewhere in memory for a period of time, and the memory use penalty increases the more often the code runs, and also increases the work the GC has to do. Also the OP did ask about using a Regex, so I went with that approach after my initial comment to you. I didn't profile anything. When it comes down to it, if it really matters, it should be profiled. Cheers.
Robert Paulson
A: 

If you're specifically doing dates, you should look into using a format string. Obviously if you're doing more than that this won't help you.
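
For the date in the question, a minimal sketch with a custom format string could look like this. The DateFormatExample class and its Format method are names invented for the example. Note that the .NET specifiers are case-sensitive, so the question's {D} and {YYYY} correspond to lowercase "d" and "yyyy" here.

```csharp
using System;
using System.Globalization;

public static class DateFormatExample
{
    public static string Format(DateTime date)
    {
        // "d"    = day of month without a leading zero
        // "MM"   = two-digit month
        // "yyyy" = four-digit year
        // InvariantCulture keeps "/" as a plain slash instead of a culture-specific separator.
        return date.ToString("d/MM/yyyy", CultureInfo.InvariantCulture);
    }
}
```

DateFormatExample.Format(new DateTime(2009, 12, 31)) returns "31/12/2009".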

Joel Coehoorn
+3  A: 

If you're going to use Regex, and you have a known pattern, it doesn't get much simpler than using a match evaluator method instead of calling replace a bunch of times:

void Main() // run this in LinqPad
{
 string text = "I was here on {D}/{MMM}/{YYYY}.";
 string result = Regex.Replace(text, "{[a-zA-Z]+}", ReplaceMatched);

 Console.WriteLine(result);
}

private string ReplaceMatched(Match match)
{
 if( match.Success )
 {
  switch( match.Value )
  {
   case "{D}":
    return DateTime.Now.Day.ToString();
   case "{YYYY}":
    return DateTime.Now.Year.ToString();
   default:
    break;
  }
 }
 // note, early return for matched items.

 Console.WriteLine("Warning: Unrecognized token: '" + match.Value + "'");
 return match.Value;
}

gives the result

Warning: Unrecognized token: '{MMM}'
I was here on 2/{MMM}/2009.

It's not entirely implemented obviously.
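
One way to fill in the remaining tokens, sketched here under my own assumptions rather than taken from the code above, is to move the cases into a lookup table so new tokens are a one-line addition. The TokenReplacer class, its Tokens table, and the choice of output for each token are illustrative only; the date is passed in instead of using DateTime.Now so the output is predictable.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class TokenReplacer
{
    // Hypothetical token table: token name -> function producing its value.
    private static readonly Dictionary<string, Func<DateTime, string>> Tokens =
        new Dictionary<string, Func<DateTime, string>>
        {
            { "D",    d => d.Day.ToString(CultureInfo.InvariantCulture) },
            { "MM",   d => d.Month.ToString("00", CultureInfo.InvariantCulture) },
            { "MMM",  d => d.ToString("MMM", CultureInfo.InvariantCulture) },
            { "YYYY", d => d.Year.ToString("0000", CultureInfo.InvariantCulture) },
        };

    // Still a single pass: the lambda is the match evaluator.
    public static string Replace(string text, DateTime date)
    {
        return Regex.Replace(text, @"\{([A-Za-z]+)\}", match =>
        {
            Func<DateTime, string> format;
            return Tokens.TryGetValue(match.Groups[1].Value, out format)
                ? format(date)
                : match.Value; // unrecognized tokens are left as-is
        });
    }
}
```

TokenReplacer.Replace("{D}/{MM}/{YYYY}", new DateTime(2009, 12, 31)) gives "31/12/2009", and an unknown token such as "{foo}" passes through unchanged.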

Check out LinqPad and a Regex Tool like Expresso

Robert Paulson
+1 - two things I really love about .NET regexps are this and the ability to parse arbitrarily nested constructs.
Pavel Minaev
A: 

I've done it with Regex and using a StringBuilder. I'm finding the matches in the source string and appending blocks of the string and evaluated matches.

StringBuilder text = new StringBuilder(snippet.Length * 2);
Regex pattern = new Regex("\\{[A-Za-z]+\\}");
MatchCollection matches = pattern.Matches(snippet);
if (matches.Count == 0) return snippet;

int from = 0;
foreach (Match match in matches)
{
    text.Append(snippet.Substring(from, match.Index - from));
    string variable = snippet.Substring(match.Index + 1, match.Length - 2);
    text.Append(EvaluateVariable(variable));
    from = match.Index + match.Length;
}
text.Append(snippet.Substring(from));
return text.ToString();
Nick Bedford
Great! You've just reinvented the Regex.Replace method. You really should take another look at Robert Paulson's answer.
Alan Moore
Yeah I didn't even notice that answer. I came from native C++, go figure hah
Nick Bedford