Using Regex in .NET
I will have a set of data that comes in looking something like this:
< Bunch o' Data Here >
where < is just the indicator of a new record and > is the end of the record.
These records may come in like this:
< Dataset 1><Dataset 2 broken, no closing tag <dataset 3>
They could also come in as:
< Dataset 1>Dataset 2 broken, no opening tag ><dataset 3>
I'm not certain that this latter case is possible, though, and I'll cross that bridge when I have to.
I'm trying to use Regex to split these into records based on the start and end characters, ultimately getting something like this:
Match 1 = < Dataset 1>
Match 2 = <Dataset 2 broken, no closing tag
Match 3 = <dataset 3>
I'm trying to figure out how non-capturing groups work, and maybe my understanding is wrong. This pattern:
<.*?(?:<|>)
gets me pretty close, I think, except that the match for the second record also includes the opening character of the third record.
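For reference, here is a minimal sketch that reproduces this against the first sample input. The class and method names (RegexRepro, FindRecords) are made up for illustration, and the input is hard-coded rather than read from the real data source.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexRepro
{
    // Hypothetical helper: returns the text of every match of the pattern above.
    public static List<string> FindRecords(string input)
    {
        // Lazy match from '<' up to the next '<' or '>', whichever comes first.
        const string pattern = @"<.*?(?:<|>)";

        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        // Sample input from above: the second record has no closing '>'.
        string input = "< Dataset 1><Dataset 2 broken, no closing tag <dataset 3>";

        foreach (string record in FindRecords(input))
        {
            Console.WriteLine("[" + record + "]");
        }
        // Prints:
        // [< Dataset 1>]
        // [<Dataset 2 broken, no closing tag <]
    }
}
```

Because the second match consumes the < that opens the third record, the search resumes at "dataset 3>", where there is no < left to start a third match.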
I also suspect that ?: is not doing what I need it to; if I take it out, the pattern returns the same set of matches (2).
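A small sketch of that comparison, assuming the first sample input above. The names (GroupComparison, MatchValues) are hypothetical. The last pattern, which uses a lookahead (?=<) so that the next record's < is checked but not consumed, is only a guess at a possible direction and has not been tried on real data.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class GroupComparison
{
    // Hypothetical helper: the full text of every match of pattern in input.
    public static string[] MatchValues(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string input = "< Dataset 1><Dataset 2 broken, no closing tag <dataset 3>";

        // (?:...) changes which groups get recorded, not which characters are
        // consumed, so the overall match text is the same with or without it.
        string[] nonCapturing = MatchValues(input, @"<.*?(?:<|>)");
        string[] capturing = MatchValues(input, @"<.*?(<|>)");
        Console.WriteLine(nonCapturing.SequenceEqual(capturing)); // True

        // Possible variation (an assumption, not a confirmed fix): end the match
        // on '>', or just before the next '<', or at the end of the input.
        string[] lookahead = MatchValues(input, @"<.*?(?:>|(?=<)|$)");
        foreach (string s in lookahead)
        {
            Console.WriteLine("[" + s + "]");
        }
        // Prints:
        // [< Dataset 1>]
        // [<Dataset 2 broken, no closing tag ]
        // [<dataset 3>]
    }
}
```

Note that the second result keeps the trailing space before the next <, so it would still need a Trim() to look exactly like the Match 2 line above.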