I have the following text in a file:

23456789

When I tried to replace the above text using the command

1,$s/\(\d\)\(\d\d\d\)\(\d\d\)*\>/\3\g

I get 89. Shouldn't it be 6789? Can anyone tell me why it is 89?

A: 

Group 3 is defined as being 2 digits long. If you want to match the last 4 digits, use \(\d\d\d\d\) with no * at the end. If you just want to match all the digits after the first 4, put the * inside the group rather than outside it.

MrWiggles
Shouldn't the * on group 3 match the last 2, 4, 6, ... digits?
chappar
No, see rampion's answer.
MrWiggles
+2  A: 

As written, your regex captures one digit, then three digits, then any number of groups of two digits each. The third match will, therefore, always be two digits if it exists. In your particular test case, the '89' is in \4, not \3.

Changing the regex to

 1,$s/\(\d\)\(\d\d\d\)\(\d\d\+\)\>/\3/g

will give you '6789' as the result, since it will capture two or more digits (up to as many as are there) in the third group.
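
For anyone who wants to check this outside Vim, here is a rough Python sketch of the same idea. It is only an analogue, not Vim itself: Python's `re` writes groups without backslashes, and `\b` is assumed to stand in for Vim's `\>` on this input. The helper name `keep_tail` is made up for illustration.

```python
import re

# Assumption: Python's \b plays the role of Vim's \> (end of word) here.
# The quantifier sits inside group 3, so the group holds the whole run.
TWO_OR_MORE = re.compile(r"(\d)(\d\d\d)(\d\d+)\b")


def keep_tail(line):
    """Hypothetical helper: replace each match with the text of group 3."""
    return TWO_OR_MORE.sub(r"\3", line)


print(keep_tail("23456789"))  # 6789
print(keep_tail("2345678"))   # 678
```

Note that with the quantifier inside the group, the tail may have an odd number of digits, as the second call shows.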

Dave Sherohman
(\d\d)* matches 2 digits or a multiple of 2 digits. In our case it should match the last 4 digits, so shouldn't \3 contain all 4 digits? \4 will have nothing, as I have only 3 sets of ().
chappar
(\d\d)* does match runs of digits whose length is a multiple of 2 (so 12, 3456, but not 789), but it only *captures* the last atom, since the same parentheses (the same capturing group) are reused for each pair of digits. To make sure you only match even lengths, use Hasturkun's regex.
rampion
+1  A: 

You want to use a non-capturing group (\%(...\) in Vim) inside the capturing group here, like so:

1,$s/\(\d\)\(\d\d\d\)\(\%(\d\d\)*\)\>/\3/g

which gives 6789 as the result here. If the input were changed to

2345678

it would change the line to 278.
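
The same two results can be reproduced with a small Python sketch. This is an approximation under stated assumptions: Python spells the non-capturing group `(?:...)` instead of Vim's `\%(...\)`, `\b` is assumed to behave like `\>` for these inputs, and `keep_even_tail` is a hypothetical helper name.

```python
import re

# Group 3 wraps a repeated NON-capturing pair, so it holds every pair at once.
# Assumption: \b stands in for Vim's \> on these all-digit lines.
EVEN_TAIL = re.compile(r"(\d)(\d\d\d)((?:\d\d)*)\b")


def keep_even_tail(line):
    """Hypothetical helper: replace each match with the text of group 3."""
    return EVEN_TAIL.sub(r"\3", line)


print(keep_even_tail("23456789"))  # 6789
# No match can start at the 2 (7 digits cannot be split as 1 + 3 + pairs),
# so the match is 345678 and only its tail 78 survives after the leading 2.
print(keep_even_tail("2345678"))   # 278
```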

Hasturkun
I was actually looking for the non-capturing group for vim regex.
lambacck
A: 

You probably want this (you need an extra wrapping group):

%s/\(\d\)\(\d\d\d\)\(\(\d\d\)*\)\>/\3/g

Although I'm not sure why you're capturing the first 2 groups.
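
To see what each group actually holds, here is a rough Python sketch that prints the captures for the original pattern and for the wrapped one. It assumes Python's `re` treats repeated groups the way Vim does in this case (keeping only the last repetition) and that `\b` approximates `\>`; `captured_groups` is a made-up helper name.

```python
import re

# Assumption: \b approximates Vim's \> for this all-digit input.
ORIGINAL = re.compile(r"(\d)(\d\d\d)(\d\d)*\b")    # repeated capturing group
WRAPPED = re.compile(r"(\d)(\d\d\d)((\d\d)*)\b")   # extra wrapping group


def captured_groups(pattern, line):
    """Hypothetical helper: capture groups of a match at the start of line."""
    match = pattern.match(line)
    return match.groups() if match else ()


# The repeated group keeps only its last repetition.
print(captured_groups(ORIGINAL, "23456789"))  # ('2', '345', '89')
# The wrapper (group 3) spans all repetitions; the inner pair is now group 4.
print(captured_groups(WRAPPED, "23456789"))   # ('2', '345', '6789', '89')
```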

orip
Why do I need an extra set of parentheses around group 3? What was the problem with my original example?
chappar
\(\d\d\)* would indeed match any number of digit pairs, but it only _captures_ the last pair for you to use later. To capture all of them you need to wrap the repeated group in its own group - that's the extra set of parentheses.
orip
A: 

Hello! I tried this one in nvi and it does not work. In Vim it works, but you have to change the final backslash before the g to a forward slash, like this:

1,$s/\(\d\)\(\d\d\d\)\(\d\d\)*\>/\3/g

and the text gets replaced with 89. The reason is that with the * you are saying that the last \(\d\d\) can be repeated zero or more times, and with \> you are saying end-of-word boundary. With \3 you are asking for the third group, but because of the * it only holds the last two digits matched by \(\d\d\), which are 89. Taking out the *\> you get 6789, like this:

1,$s/\(\d\)\(\d\d\d\)\(\d\d\)/\3/g

(Here the match is 234567, which is replaced with 67, and the trailing 89 is left untouched.)

Watch out for the \>, which plays a tricky part, because with :1,$s/\(\d\)\(\d\d\d\)\(\d\d\)\>/\3 you get 2389, LOL! From the end-of-word perspective, the six-digit pattern matches 456789, which gets replaced with its last two digits, 89. So you get 23 + 89. Mind-blowing! LOL. Cheers! gaston.
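
The three cases above can be compared side by side with a quick Python sketch. It is only an analogue of the Vim commands: the groups are written without backslashes, `\b` is assumed to act like `\>` on this input, and `sub_group3` is a hypothetical helper name.

```python
import re


def sub_group3(pattern, line):
    """Hypothetical helper: replace every match of pattern with group 3."""
    return re.sub(pattern, r"\3", line)


LINE = "23456789"

# Starred group plus word end: the whole line matches, group 3 is the last pair.
print(sub_group3(r"(\d)(\d\d\d)(\d\d)*\b", LINE))  # 89
# No star, no word end: 234567 matches and becomes 67; the 89 is left as-is.
print(sub_group3(r"(\d)(\d\d\d)(\d\d)", LINE))     # 6789
# No star but with word end: only 456789 fits, so 23 stays in front of 89.
print(sub_group3(r"(\d)(\d\d\d)(\d\d)\b", LINE))   # 2389
```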

Gaston