Hi, I'm using regular expressions to search against the plain text returned by the following property:
namespace Microsoft.Office.Interop.Word
{
    public class Range
    {
        ...
        public string Text { get; set; }
        ...
    }
}
Based upon the matches I want to make changes to the formatted text that corresponds to the plain text. The problem I have is that the indices of characters in the .Text property do not match up with the .Start and .End properties of the Range object. Does anyone know any way to match these indices up?
(I can't use Word's wildcard find as a replacement for .NET regular expressions because it isn't powerful enough for the patterns I'm searching for, e.g. non-greedy operators.)
Thank you.
Update: I can move the correct number of characters by starting with Document.Range().Collapse(WdCollapseDirection.wdCollapseStart) and then calling range.MoveStart(WdUnits.wdCharacter, match.Index), since moving by characters lines the formatted text position up with the matches in the plain text. My problem now is that I'm always 4 characters too far along in the formatted text, so maybe it has something to do with the other story ranges? I'm not sure.
Update2: I have a solution that is working. Apparently the reason my matches were still off had to do with hidden "Bell" characters (char bell = '\a';). By replacing these with the empty string in the text read from Application.ActiveDocument.Range().Text, my matches on that text now line up correctly with the range obtained by:
Word.Range range = activeDocument.Range();
// Collapse to the start of the document, then advance by the match offset.
range.Collapse(Word.WdCollapseDirection.wdCollapseStart);
range.MoveStart(Word.WdUnits.wdCharacter, regexMatch.Index);
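In case it helps anyone else, here is a minimal sketch of the text cleanup step in plain C#, with no Word reference needed. The class name BellOffsets, the sample string and the pattern are made up for illustration, and it assumes the bell characters are the only thing throwing the offsets off, as in my document.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BellOffsets
{
    // Remove the hidden bell characters ('\a', U+0007) from the plain text.
    public static string StripBells(string text)
    {
        return text.Replace("\a", string.Empty);
    }

    // Index of the first match in the cleaned text, or -1 if there is none.
    public static int FirstMatchIndex(string rawText, string pattern)
    {
        Match m = Regex.Match(StripBells(rawText), pattern);
        return m.Success ? m.Index : -1;
    }

    public static void Main()
    {
        // Hypothetical stand-in for the string read from Range.Text.
        string raw = "Name\r\aValue\r\a<<first>> and <<second>>";

        // Non-greedy pattern, the kind Word's wildcard find can't express.
        foreach (Match m in Regex.Matches(StripBells(raw), "<<.+?>>"))
        {
            Console.WriteLine("{0} at {1}, length {2}", m.Value, m.Index, m.Length);
        }
    }
}
```

Each m.Index is what I pass to MoveStart above. If the range should cover the whole match rather than just mark its start, following up with range.MoveEnd(Word.WdUnits.wdCharacter, m.Length) should work, though I have only tested the MoveStart part.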
Thanks.