I am trying to write a simple regex to convert two-digit years to four-digit years in a pipe-delimited file. I am using:

Regex dateFormat = new Regex(@"\|(\d\d)/(\d\d)/([\d\d)\|");
string convertedString = dateFormat.Replace(contents, @"|$1$220$3|'");

What I want is |10/31/09| to be replaced with |10312009|.

What I am getting is |10$22009|

I think the problem is .NET is evaluating $1 and $3 but is thinking there is a group in the middle with no value ($220 maybe?). How can I let .NET know that the 20 is a constant value instead of part of the group value?

Thanks in advance

A: 

You can modify your Regex to use named groups instead. The syntax for a named group is (?<name>...). Then, in your Replace call, you can use the group names instead of the group numbers.

Regex dateFormat = new Regex(@"\|(?<month>\d\d)/(?<day>\d\d)/(?<year>\d\d)\|");
string convertedString = dateFormat.Replace(contents, @"|${month}${day}20${year}|");
TLiebe
+4  A: 

Your intuition about the problem is correct: the second backreference is being interpreted as $220, not $2. To fix this, use curly braces:

dateFormat.Replace(contents, @"|$1${2}20$3|");
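
To see the difference side by side, here is a minimal sketch using the sample input from the question. The variable names `ambiguous` and `braced` are only illustrative, the pattern is the question's with the stray `[` removed, and the top-level statement form assumes C# 9 or later:

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input taken from the question.
string contents = "|10/31/09|";
Regex dateFormat = new Regex(@"\|(\d\d)/(\d\d)/(\d\d)\|");

// "$220" is read as a reference to group 220, which does not exist,
// so it is emitted as literal text.
string ambiguous = dateFormat.Replace(contents, "|$1$220$3|");

// "${2}" ends the group number explicitly, so "20" stays a constant.
string braced = dateFormat.Replace(contents, "|$1${2}20$3|");

Console.WriteLine(ambiguous); // |10$22009|
Console.WriteLine(braced);    // |10312009|
```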

More info about .NET regular expressions is available here.

Welbog
Also, http://msdn.microsoft.com/en-us/library/hs600312.aspx
Martinho Fernandes
A: 

I don't know how to do that directly, but here is my workaround: use named groups.

Regex dateFormat = new Regex(@"\|(?<month>\d\d)/(?<date>\d\d)/(?<year>\d\d)\|");
string convertedString = dateFormat.Replace(contents, @"|${month}${date}20${year}|");

See more info at the bottom of this page.

Hope this helps.

NawaMan
Actually, he used `<pre>` tags to format the code, and the group names got interpreted as tags. @NawaMan, to format code, just indent it four spaces (or select it and press `CTRL-k`).
Alan Moore
Oh, that old HTML sanitizer strikes again....
Martinho Fernandes
You can check the preview below the entry area to see how it will look before posting...
John Fisher
+1  A: 

Your regex text doesn't parse. Was the "[" supposed to be there? Wrap the number in {} to fix the replace issue:

Regex dateFormat = new Regex(@"\|(\d\d)/(\d\d)/(\d\d)\|");
string convertedString = dateFormat.Replace(contents, @"|${1}${2}20${3}|");
John Fisher
A: 

I see problems with your regular expression, namely the unmatched [ character. The following works fine:

\|(?<month>\d{2})/(?<day>\d{2})/(?<year>\d{2})\|

That will group the month, day, and year results. You can then replace with the following string:

|$1/$2/20$3|
David in Dakota
The replace string was his problem, and doesn't meet his needs. The problem occurs when you don't have the "/" characters in the replace string.
John Fisher
A: 

Try this:

string contents = "|10/31/09|";
Regex dateFormat = new Regex(@"\|(?<mm>\d\d)/(?<dd>\d\d)/(?<yy>\d\d)\|");
Console.WriteLine(dateFormat.Replace(contents, "|${mm}${dd}20${yy}|"));

More information:

Call RegexObj.Replace("subject", "replacement") to perform a search-and-replace using the regex on the subject string, replacing all matches with the replacement string. In the replacement string, you can use $& to insert the entire regex match into the replacement text. You can use $1, $2, $3, etc... to insert the text matched between capturing parentheses into the replacement text. Use $$ to insert a single dollar sign into the replacement text. To replace with the first backreference immediately followed by the digit 9, use ${1}9. If you type $19, and there are less than 19 backreferences, the $19 will be interpreted as literal text, and appear in the result string as such. To insert the text from a named capturing group, use ${name}. Improper use of the $ sign may produce an undesirable result string, but will never cause an exception to be raised.

From http://www.regular-expressions.info/dotnet.html
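
The substitution rules quoted above can be checked with a small sketch like the following. The pattern, the subject string, and the variable names are made up for illustration. It assumes default RegexOptions and C# 9 or later for top-level statements:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical pattern with exactly one capturing group.
Regex oneDigit = new Regex(@"(\d)");
string subject = "a1";

// Group 1 followed by a literal 9.
string bracedGroup = oneDigit.Replace(subject, "${1}9");
// There is no group 19, so "$19" is kept as literal text.
string noSuchGroup = oneDigit.Replace(subject, "$19");
// "$&" inserts the entire match.
string wholeMatch = oneDigit.Replace(subject, "[$&]");
// "$$" inserts a single dollar sign.
string dollarSign = oneDigit.Replace(subject, "$$");

Console.WriteLine(bracedGroup); // a19
Console.WriteLine(noSuchGroup); // a$19
Console.WriteLine(wholeMatch);  // a[1]
Console.WriteLine(dollarSign);  // a$
```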

Rubens Farias