tags:

views:

1383

answers:

8

I have a large text template which needs tokenized sections replaced by other text. The tokens look something like this: ##USERNAME##. My first instinct is just to use String.Replace(), but is there a better, more efficient way or is Replace() already optimized for this?

+9  A: 

System.Text.RegularExpressions.Regex.Replace() is what you seek - IF your tokens are odd enough that you need a regex to find them.

Some kind soul did some performance testing, and between Regex.Replace(), String.Replace(), and StringBuilder.Replace(), String.Replace() actually came out on top.

Greg Hurlman
+1  A: 

string.Replace is fine. I'd prefer using a Regex, but I'm * for regular expressions.

The thing to keep in mind is how big these templates are. If its real big, and memory is an issue, you might want to create a custom tokenizer that acts on a stream. That way you only hold a small part of the file in memory while you manipulate it.

But, for the naiive implementation, string.Replace should be fine.

Will
+2  A: 

If you are doing multiple replaces on large strings then it might be better to use StringBuilder.Replace(), as the usual performance issues with strings will appear.

samjudson
+1  A: 

Had to do something similar recently. What I did was:

  • make a method that takes a dictionary (key = token name, value = the text you need to insert)
  • Get all matches to your token format (##.+?## in your case I guess, not that good at regular expressions :P) using Regex.Matches(input, regular expression)
  • foreach over the results, using the dictionary to find the insert value for your token.
  • return result.

Done ;-)

If you want to test your regexes I can suggest the regulator.

Erik van Brakel
+1  A: 

This is an ideal use of Regular Expressions. Check out this helpful website, the .Net Regular Expressions class, and this very helpful book Mastering Regular Expressions.

Factor Mystic
+3  A: 

The only situation in which I've had to do this is sending a templated e-mail. In .NET this is provided out of the box by the MailDefinition class. So this is how you create a templated message:

MailDefinition md = new MailDefinition();
md.BodyFileName = pathToTemplate;
md.From = "[email protected]";

ListDictionary replacements = new ListDictionary();
replacements.Add("<%To%>", someValue);
// continue adding replacements

MailMessage msg = md.CreateMailMessage("[email protected]", replacements, this);

After this, msg.Body would be created by substituting the values in the template. I guess you can take a look at MailDefinition.CreateMailMessage() with Reflector :). Sorry for being a little off-topic, but if this is your scenario I think it's the easiest way.

Slavo
+2  A: 

Regular expressions would be the quickest solution to code up but if you have many different tokens then it will get slower. If performance is not an issue then use this option.

A better approach would be to define token, like your "##" that you can scan for in the text. Then select what to replace from a hash table with the text that follows the token as the key.

If this is part of a build script then nAnt has a great feature for doing this called Filter Chains. The code for that is open source so you could look at how its done for a fast implementation.

Gareth Farrington
+3  A: 

Well, depending on how many variables you have in your template, how many templates you have, etc. this might be a work for a full template processor. The only one I've ever used for .NET is NVelocity, but I'm sure there must be scores of others out there, most of them linked to some web framework or another.

dguaraglia