How do you back-reference inner parentheses in a Regex?
The sample data is a product price list showing different price breaks based on quantity purchased. The format is quantityLow - quantityHigh : pricePer ; multiples.
I used LINQPad to construct this C# regular expression to separate the parts. LINQPad's Dump() shows a handy visualization of how the Regex separates the data. In this example, there are "inner" parentheses (capture groups nested inside an outer group), creating a hierarchical data structure.
string mys = "1-4:2;5-9:1.89";
Regex.Matches(mys, @"((\d+)[-|\+](\d*):(\d+\.?\d*);?)").Dump(); // Dump() is LINQPad's extension method; it shows the result graphically
This breaks down as follows. The match collection is everything. Within each match, there is the matched text itself and a group collection. Within the group collection are a few individual captures.
- MatchCollection (2 items)
- Group Collection (4 items)
- CaptureCollection (1 item) () Group "1-4:2;"
- CaptureCollection (1 item) () Group "1"
- CaptureCollection (1 item) () Group "4"
- CaptureCollection (1 item) () Group "2"
- CaptureCollection (1 item) () Match "1-4:2;"
- Group Collection (4 items)
- CaptureCollection (1 item) () Group "5-9:1.89"
- CaptureCollection (1 item) () Group "5"
- CaptureCollection (1 item) () Group "9"
- CaptureCollection (1 item) () Group "1.89"
- CaptureCollection (1 item) () Match "5-9:1.89"
- Group Collection (4 items)
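A minimal sketch of reading the same pieces from code rather than from the LINQPad dump. It assumes the standard .NET numbering, where Groups[0] is the whole match and the remaining groups are numbered by the position of their opening parenthesis. The class name PriceBreaks and the Parse helper are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class PriceBreaks
{
    // Pattern copied verbatim from the question. Note that inside [...] the '|'
    // is a literal character, so this class matches '-', '|' or '+'.
    const string Pattern = @"((\d+)[-|\+](\d*):(\d+\.?\d*);?)";

    // Hypothetical helper: returns (low, high, price) as raw strings, one per match.
    public static List<(string Low, string High, string Price)> Parse(string input)
    {
        var result = new List<(string Low, string High, string Price)>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            // Groups[0] = whole match, Groups[1] = outer parentheses,
            // Groups[2..4] = the inner parentheses, left to right.
            result.Add((m.Groups[2].Value, m.Groups[3].Value, m.Groups[4].Value));
        }
        return result;
    }

    static void Main()
    {
        foreach (var b in Parse("1-4:2;5-9:1.89"))
            Console.WriteLine($"low={b.Low} high={b.High} price={b.Price}");
        // prints:
        // low=1 high=4 price=2
        // low=5 high=9 price=1.89
    }
}
```

The values come back as strings. Converting them to int or decimal is left out here, since the high quantity can be empty when the pattern matches something like "10+".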
Just for reference:
- ( ) parentheses capture the text they match into a numbered group, which can be referenced by \1..\9 (I think).
- \d matches a single digit. A + after it matches one or more digits, a * matches zero or more digits, and a ? makes the preceding item optional.
- . matches any single character (except a newline by default). \. matches a literal period, which is the decimal point in this case.
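To make the \1..\9 note concrete, here is a small sketch. It assumes .NET's rule that groups are numbered by the position of their opening parenthesis, so an inner group gets the next number after the outer one. The class BackrefDemo, its methods, and both patterns are made up for illustration and are not part of the price-list regex.

```csharp
using System;
using System.Text.RegularExpressions;

static class BackrefDemo
{
    // Hypothetical pattern: group 1 is the outer parentheses, group 2 the inner (\d+).
    // \2 must repeat exactly the text that group 2 captured.
    public static bool SameEnds(string s) => Regex.IsMatch(s, @"^((\d+)-\2)$");

    // In a replacement string, .NET refers to groups as $n instead of \n.
    public static string SwapEnds(string s) => Regex.Replace(s, @"((\d+)-(\d+))", "$3-$2");

    static void Main()
    {
        Console.WriteLine(SameEnds("5-5"));   // True
        Console.WriteLine(SameEnds("5-9"));   // False
        Console.WriteLine(SwapEnds("1-4"));   // 4-1
    }
}
```

Here \1 would refer to the whole outer group ("5-5"), while \2 reaches the inner one. A named group such as (?<low>\d+) with \k<low> is an alternative if the numbers get hard to track.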