I've been banging my head against the wall on this one all day and am close to my wits' end. Looking for a fresh perspective.
Sample Input Text:
(line breaks added for clarity; they are not in the actual data)
</div>#My Novel<br />
##Chapter1<br />
It was a dark and stormy night<br />
##Chapter 2<br />
The End
Desired Output:
</div><h1>My Novel</h1><br />
<h1>Chapter1</h1><br />
It was a dark and stormy night<br />
<h1>Chapter 2</h1><br />
The End
Actual Output:
</div><h1>My Novel</h1><br />
##Chapter1<br />
It was a dark and stormy night<br />
<h1>Chapter 2</h1><br />
The End
Here is the match expression:
(formatted for easy reading; the comments and line breaks are not in the actual expression)
(?<preamble>
  (
    ([<]\/\w+\d*[>])|([<]\w+\d*\s*\/[>])  #</tag> or <tag />
  )
  \s*  #optional whitespace
)
(?<hashmarks>
  \#{1,6}  #1-6 hash marks
)
(?<content>
  .+?  #header content
)
(?<closing>
  ([<](br|\/\s*br|br\s*\/)[>])  #<br>, </br>, or <br />
)
Here is the replace expression:
${preamble}<h1>${content}</h1>${closing}
If it matters, I am using the following C# Regex.Replace overload:
Regex.Replace(Source,SrchExp,ReplExpr,RegexOptions.IgnoreCase)
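For anyone who wants to reproduce this, here is a minimal sketch of a console program. It assumes the match expression is collapsed onto one line exactly as shown above; the class name HeaderRepro and the method name ReplaceHeaders are made up for this example and are not from my real code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HeaderRepro
{
    // The match expression from the question, with comments and line breaks removed.
    public const string SrchExp =
        @"(?<preamble>(([<]\/\w+\d*[>])|([<]\w+\d*\s*\/[>]))\s*)" +
        @"(?<hashmarks>\#{1,6})" +
        @"(?<content>.+?)" +
        @"(?<closing>([<](br|\/\s*br|br\s*\/)[>]))";

    // The replace expression from the question.
    public const string ReplExpr = "${preamble}<h1>${content}</h1>${closing}";

    // Hypothetical helper wrapping the Regex.Replace overload used in the question.
    public static string ReplaceHeaders(string source)
    {
        return Regex.Replace(source, SrchExp, ReplExpr, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        // Sample input as one string (the real data has no line breaks).
        string source =
            "</div>#My Novel<br />##Chapter1<br />" +
            "It was a dark and stormy night<br />##Chapter 2<br />The End";

        // Prints the "Actual Output" above: ##Chapter1 is left unconverted.
        Console.WriteLine(ReplaceHeaders(source));
    }
}
```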
The question (finally)
Can anyone see why it is replacing #My Novel and ##Chapter 2, but not ##Chapter1?
Sorry for the long post, and hopefully I didn't munge anything while formatting it to make it readable for SO.
Update:
One more thing that might help: adding an extra break tag right after "Novel" makes the provided code produce the desired output. I have no idea why yet.
Sample Input Text (modified):
</div>#My Novel<br /><br />
##Chapter1<br />
It was a dark and stormy night<br />
##Chapter 2<br />
The End
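In case it helps with diagnosing, here is a small sketch that lists what each match actually covers for both inputs. It assumes the same one-line match expression as in the question; the class name MatchDump and the method name Dump are invented for this example. Printing the index and text of every match may show whether the <br /> after "Novel" is already part of the first match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchDump
{
    // Same match expression as in the question, on one line.
    const string SrchExp =
        @"(?<preamble>(([<]\/\w+\d*[>])|([<]\w+\d*\s*\/[>]))\s*)" +
        @"(?<hashmarks>\#{1,6})" +
        @"(?<content>.+?)" +
        @"(?<closing>([<](br|\/\s*br|br\s*\/)[>]))";

    // Prints every match with its start index and length, and returns the match count.
    public static int Dump(string source)
    {
        MatchCollection matches = Regex.Matches(source, SrchExp, RegexOptions.IgnoreCase);
        foreach (Match m in matches)
        {
            Console.WriteLine("index " + m.Index + ", length " + m.Length + ": " + m.Value);
        }
        return matches.Count;
    }

    public static void Main()
    {
        string tail = "##Chapter1<br />It was a dark and stormy night<br />##Chapter 2<br />The End";

        // Original input: a single <br /> after "Novel".
        Console.WriteLine("matches: " + Dump("</div>#My Novel<br />" + tail));

        // Modified input: an extra <br /> after "Novel".
        Console.WriteLine("matches: " + Dump("</div>#My Novel<br /><br />" + tail));
    }
}
```

Regex.Matches returns non-overlapping matches, so the text covered by one match is not available to the next one.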