




How do I solve the problem below?

I'm creating a simple content management system, where there is an HTML template with specific markup that denotes where content should go:

<html><head></head><body><!-- #Editable "Body1" --><p>etc etc</p><!-- #Editable "Extra" --></body></html>

Separate from this, there is content in a database field that looks a little like this:

<!-- #BeginEditable "Body1" -->This is Test Text<!-- #EndEditable --><!-- #BeginEditable "Extra" -->This is more test text<!-- #EndEditable -->

As you can guess, I need to merge the two, that is, replace

<!-- #Editable "Body1" -->

with

This is Test Text

I've begun the code here, but I'm having problems with the Regex Replace call that should go at the very bottom of the foreach loop:

    //Html Template
    string html = "<html><head></head><body><!-- #Editable \"Body1\" --><p>etc etc</p><!-- #Editable \"Extra\" --></body></html>";        

    //Regions that need to be put in the Html Template
    string regions = "<!-- #BeginEditable \"Body1\" -->This is Test Text<!-- #EndEditable --><!-- #BeginEditable \"Extra\" -->This is more test text<!-- #EndEditable -->";

    //Create a Regex to only extract what's between the 'Body' tag
    Regex oRegex = new Regex("<body.*?>(.*?)</body>", RegexOptions.Multiline);

    //Get only the 'Body' of the html template
    string body = oRegex.Match(html).Groups[1].Value.ToString();

    // Regex to find sections inside the 'Body' that need replacing with what's in the string 'regions'
    Regex oRegex1 = new Regex("<!-- #Editable \"(.*?)\"[^>]*>",RegexOptions.Multiline);
    MatchCollection matches = oRegex1.Matches(body);

    // Locate section titles i.e. Body1, Extra
    foreach (Match match in matches)
    {
        string title = oRegex1.Match(match.ToString()).Groups[1].ToString();
        Regex oRegex2 = new Regex("<!-- #BeginEditable \"" + title + "\"[^>]*>(.*?)<!-- #EndEditable [^>]*>", RegexOptions.Multiline);
        // Replace the 'Body' sections with what's in the 'regions' string, cross-referencing the titles, i.e. Body1, Extra
    }

It might be better to use the Html Agility Pack to deal with this for you than to resort to regexes. It can parse HTML into an XML-like tree in a DOM structure, and it would be easier to handle this problem using this pack.


    // '$' stands in for a double quote and is swapped in below
    string sReg = @"<body.*?>((?<Region>\<\!\-\-\s+\#Editable\s?\\$(?<editable>.+?)\\$\s?\-\-\>[^\>]).*?)";
    string sNewReg = sReg.Replace('$', '\"');
    System.Diagnostics.Debug.WriteLine(string.Format("Regex: {0}", sNewReg));
    Regex MyRegex = new Regex(sNewReg,
        RegexOptions.CultureInvariant
        | RegexOptions.IgnorePatternWhitespace
        | RegexOptions.Compiled);
    // This sample input contains a literal backslash before each quote, which the \\ in the pattern matches
    string sMg = "<html><head></head><body><!-- #Editable \\\"Body1\\\" --><p>etc etc</p><!-- #Editable \\\"Extra\\\" --></body></html>";
    Match m = MyRegex.Match(sMg);
    if (m.Success)
    {
        // The lazy .+? keeps the capture to the first title only
        System.Diagnostics.Debug.WriteLine(string.Format("{0}", m.Groups["editable"].Value));
    }

Note how I used a dollar sign as a placeholder for the double quote, to avoid having to escape it in the pattern, and replaced it with a real double quote at runtime.
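
For what it's worth, a double quote can also be written as `""` inside a C# verbatim string, so the placeholder swap is optional. Below is a minimal sketch of that alternative, assuming the template contains plain quotes (no backslashes) as in the question. The class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VerbatimQuoteDemo
{
    // Hypothetical helper: returns the name of the first #Editable marker, or null if there is none.
    public static string FirstEditableName(string html)
    {
        // Inside @"...", a doubled "" is one literal double quote.
        var regex = new Regex(@"<!--\s+#Editable\s+""(?<editable>[^""]+)""\s*-->");
        Match m = regex.Match(html);
        return m.Success ? m.Groups["editable"].Value : null;
    }

    public static void Main()
    {
        string html = "<html><head></head><body><!-- #Editable \"Body1\" --><p>etc etc</p><!-- #Editable \"Extra\" --></body></html>";
        Console.WriteLine(FirstEditableName(html)); // prints Body1
    }
}
```

The character class `[^""]+` stops at the closing quote, so the capture cannot run on into the next marker.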

Hope this helps, Best regards, Tom.

I would suggest using a templating engine like NVelocity for this kind of stuff.

Sorin Comanescu
Not optimized for performance (or anything else), but it's simple and it works:

    var html = "<html><head></head><body><!-- #Editable \"Body1\" --><p>etc etc</p><!-- #Editable \"Extra\" --></body></html>";
    var regions = "<!-- #BeginEditable \"Body1\" -->This is Test Text<!-- #EndEditable --><!-- #BeginEditable \"Extra\" -->This is more test text<!-- #EndEditable -->";
    var regionRegex = new Regex(@"<!-- #BeginEditable ""(?<Name>\w+)"" -->(?<Content>.*?)<!-- #EndEditable -->", RegexOptions.Multiline);
    var regionMatches = regionRegex.Matches(regions);

    // For each region, swap the matching placeholder in the template for the region's content
    foreach (Match regionMatch in regionMatches)
    {
        var regionName = regionMatch.Groups["Name"].Value;
        var regionContent = regionMatch.Groups["Content"].Value;
        html = html.Replace(string.Format(@"<!-- #Editable ""{0}"" -->", regionName), regionContent);
    }

BTW, you should promote `regionRegex` to a private static field and add the `RegexOptions.Compiled` flag. Also, use a `StringBuilder` to do the replacements inside the loop.

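
Those two suggestions could look roughly like the sketch below. The class and method names (`RegionMerger`, `Merge`) are hypothetical, and `RegexOptions.Singleline` is an assumption added here so that `.` also matches line breaks inside a region; `RegexOptions.Multiline` only changes how `^` and `$` behave.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class RegionMerger
{
    // Built once and reused by every call. Singleline is an assumption (see above).
    private static readonly Regex RegionRegex = new Regex(
        @"<!-- #BeginEditable ""(?<Name>\w+)"" -->(?<Content>.*?)<!-- #EndEditable -->",
        RegexOptions.Compiled | RegexOptions.Singleline);

    public static string Merge(string html, string regions)
    {
        // StringBuilder.Replace edits the buffer in place instead of allocating a new string per region.
        var result = new StringBuilder(html);
        foreach (Match regionMatch in RegionRegex.Matches(regions))
        {
            string placeholder = string.Format(@"<!-- #Editable ""{0}"" -->", regionMatch.Groups["Name"].Value);
            result.Replace(placeholder, regionMatch.Groups["Content"].Value);
        }
        return result.ToString();
    }

    public static void Main()
    {
        string html = "<html><head></head><body><!-- #Editable \"Body1\" --><p>etc etc</p><!-- #Editable \"Extra\" --></body></html>";
        string regions = "<!-- #BeginEditable \"Body1\" -->This is Test Text<!-- #EndEditable --><!-- #BeginEditable \"Extra\" -->This is more test text<!-- #EndEditable -->";
        Console.WriteLine(Merge(html, regions));
        // prints <html><head></head><body>This is Test Text<p>etc etc</p>This is more test text</body></html>
    }
}
```

As in the loop above, a placeholder is matched by exact text, so it has to be written with the same spacing as the string built by `string.Format`.
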
Use a `MatchEvaluator` as an anonymous delegate; your code would then look like this:

    string html = "<html><head></head><body><!-- #Editable \"Body1\" --><p>etc etc</p><!-- #Editable \"Extra\" --></body></html>";
    string regions = "<!-- #BeginEditable \"Body1\" -->This is Test Text<!-- #EndEditable --><!-- #BeginEditable \"Extra\" -->This is more test text<!-- #EndEditable -->";

    Regex oRegex1 = new Regex("<!-- #Editable \"(.*?)\"[^>]*>", RegexOptions.Multiline);

    // The delegate runs once per placeholder and returns the text that replaces it
    html = oRegex1.Replace(html, delegate(Match m) {
        string title = m.Groups[1].Value;
        // Regex.Escape guards against titles that contain regex metacharacters
        Regex oRegex2 = new Regex("<!-- #BeginEditable \"" + Regex.Escape(title) + "\"[^>]*>(.*?)<!-- #EndEditable [^>]*>", RegexOptions.Multiline);
        return oRegex2.Match(regions).Groups[1].Value;
    });
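
A possible variation on the same idea, sketched under a few assumptions: instead of building a new regex for every placeholder, the regions are parsed once into a dictionary and the evaluator only does a lookup. The names `TemplateFiller` and `Fill` are hypothetical, and leaving an unmatched placeholder untouched is a choice made here (the delegate above would replace it with an empty string).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TemplateFiller
{
    // Singleline is an assumption so that region content may span several lines.
    private static readonly Regex RegionRegex = new Regex(
        "<!-- #BeginEditable \"(.*?)\"[^>]*>(.*?)<!-- #EndEditable [^>]*>", RegexOptions.Singleline);

    private static readonly Regex PlaceholderRegex = new Regex("<!-- #Editable \"(.*?)\"[^>]*>");

    public static string Fill(string html, string regions)
    {
        // Pass 1: index every region's content by its title.
        var contentByTitle = new Dictionary<string, string>();
        foreach (Match region in RegionRegex.Matches(regions))
        {
            contentByTitle[region.Groups[1].Value] = region.Groups[2].Value;
        }

        // Pass 2: replace each placeholder via a MatchEvaluator lambda.
        return PlaceholderRegex.Replace(html, placeholder =>
        {
            string content;
            // Assumption: a placeholder with no matching region is kept as-is.
            return contentByTitle.TryGetValue(placeholder.Groups[1].Value, out content)
                ? content
                : placeholder.Value;
        });
    }

    public static void Main()
    {
        string html = "<html><head></head><body><!-- #Editable \"Body1\" --><p>etc etc</p><!-- #Editable \"Extra\" --></body></html>";
        string regions = "<!-- #BeginEditable \"Body1\" -->This is Test Text<!-- #EndEditable --><!-- #BeginEditable \"Extra\" -->This is more test text<!-- #EndEditable -->";
        Console.WriteLine(Fill(html, regions));
        // prints <html><head></head><body>This is Test Text<p>etc etc</p>This is more test text</body></html>
    }
}
```

Because the evaluator's return value is inserted literally, region content containing `$1` or similar is not treated as a substitution pattern.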

