I am trying to create a .Net Regex that will separate an Excel header data string into its component parts. The following example shows the format of the raw data string I need to parse:

&LLeft-side text&CCenter text&RRight-side text

The tags &L, &C and &R mark the left, center and right sections of the header data, respectively. So, I need a Regex that will separate the above string into a number of sub-strings, each of which starts with &L, &C or &R.

Notes: the left, center and right sections may each occur zero or more times in a single header (multiple occurrences of a given section will be concatenated in client code). The ampersand escape character is also used for formatting within each section, so the Regex must ignore usages other than the above delimiters.

Thanks, in advance, for your suggestions.

+1  A: 

This might help you on your way:

((?<=&L)(?<LeftText>[\w\W]+?)(?=(&C|&R|$))))

This will match the text between &L and &C, or between &L and &R (if &C doesn't exist), or between &L and the end of the string (if neither &C nor &R exists), and capture it in a named group called LeftText. You could repeat this for the centre text by changing the &L to &C and removing the &C from the lookahead at the end, and similarly for the right section. This might or might not be what you want; it is hard to tell from the simple example given. The middle part, [\w\W], might need tweaking as it could grab too much, but it seems to work for the given example.
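As a rough sketch of the same lookahead idea applied to all three sections at once, something like the following might work. The class name HeaderParser, the method Parse and the group names tag and text are made up for illustration. It assumes that any &L, &C or &R in the string is a section tag, and it joins repeated sections in the order they appear.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class HeaderParser
{
    // One section: a tag (&L, &C or &R) followed by the shortest run of text
    // that ends just before the next tag or at the very end of the string.
    // Singleline lets '.' match newlines inside a section.
    private static readonly Regex SectionPattern = new Regex(
        @"&(?<tag>[LCR])(?<text>.*?)(?=&[LCR]|\z)",
        RegexOptions.Singleline);

    // Returns the text of each section keyed by its tag letter.
    // Repeated sections are concatenated in order of appearance.
    public static Dictionary<char, string> Parse(string header)
    {
        var sections = new Dictionary<char, string>();
        foreach (Match m in SectionPattern.Matches(header))
        {
            char tag = m.Groups["tag"].Value[0];
            string text = m.Groups["text"].Value;
            sections[tag] = sections.TryGetValue(tag, out var existing)
                ? existing + text
                : text;
        }
        return sections;
    }
}
```

Calling HeaderParser.Parse on the example string from the question should give "Left-side text" for 'L', "Center text" for 'C' and "Right-side text" for 'R'. Other ampersand codes inside a section are left in the captured text untouched.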

Sam Holder
Many thanks. I think there might be one too many right brackets at the end, but it's exactly the principle that I was looking for.
Tim Coulter