ansaurus

Question

How can I repeatedly match from A until B in VIM?

Answer 1

+3 A:

Does it have to be done within vim? Could you cheat, and open a second window where you pipe something into more/less that tells you what line number to go to within vim?

-- edit --

I have never done a multi-line match/search in vi[m]. However, to cheat in another window:

perl -n -e 'if ( /<tag/ .. /<\/tag/)' -e '{ print "$.:$_"; }' file.xml | less

will show the elements/blocks for "tag" (or other longer matching names), with line numbers, in less, and you can then search for the other text within each block.

Close enough?

-- edit --

Within "less", type

/MATCH

to search for occurrences of MATCH. On the left margin will be the line number where that instance (within the targeted element/tags) is.

Within vi[m], type

:n

where "n" is the desired line number.

Of course, if what you really wanted to do was some kind of search/yank/replace, it's more complicated. At that point, awk / perl / ruby (or something similar which meets your tastes ... or xsl?) is really the tool you should be using for the transformation.
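For anyone who would rather not use Perl, here is a rough Python sketch of the same line-numbered listing. It imitates the `..` range operator from the one-liner above with a simple flag. The function name `tagged_lines` and the sample lines are made up for illustration; like the Perl version, it is line-based and is not an XML parser.

```python
import re

def tagged_lines(lines, tag="tag"):
    """Yield (line_number, line) for every line inside a <tag ...> ... </tag> block."""
    start = re.compile(r"<" + re.escape(tag))
    end = re.compile(r"</" + re.escape(tag))
    inside = False
    for number, line in enumerate(lines, start=1):
        if not inside and start.search(line):
            inside = True
        if inside:
            yield number, line
            # Like Perl's `..`, the end test may fire on the same line as the start.
            if end.search(line):
                inside = False

# Hypothetical sample input, one string per line of the file.
sample = [
    "<root>",
    '<tag id="1">',
    "MATCH here",
    "</tag>",
    "<other/>",
    "<tag>MATCH</tag>",
]

numbered = list(tagged_lines(sample))
for number, line in numbered:
    print(f"{number}:{line}")
```

The printed numbers are the ones you would then jump to in vi[m] with :n.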

Roboprog 2009-04-10 01:13:33

I think something like this will be the only possible answer, as to do this right you need to use an XML parser.

Eddie 2009-04-10 01:14:53

Where is the MATCH word supposed to be? In the place of ..?

Masi 2009-04-10 14:40:16

Answer 2

+4 A:

First, a disclaimer: Any attempt to slice and dice XML with regular expressions is fragile; a real XML parser would do better.

The pattern:

\(<Annotation\(\s*\w\+="[^"]\{-}"\s\{-}\)*>\)\@<=\(\(<\/Annotation\)\@!\_.\)\{-}"MATCH\_.\{-}\(<\/Annotation>\)\@=

Let's break it down...

Group 1 is <Annotation\(\s*\w\+="[^"]\{-}"\s\{-}\)*>. It matches the start-tag of the Annotation element. Group 2, which is embedded in Group 1, matches an attribute and may be repeated 0 or more times.

Group 2 is \s*\w\+="[^"]\{-}"\s\{-}. Most of these pieces are commonly used; the most unusual is \{-}, which means non-greedy repetition (*? in Perl-compatible regular expressions). The non-greedy whitespace match at the end is important for performance: with a greedy \s* there instead, Vim would try every possible way to divide the whitespace separating two attributes between the \s* at the end of Group 2 and the \s* at the beginning of the next occurrence of Group 2.

Group 1 is followed by \@<=. This is a zero-width positive look-behind. It prevents the start-tag from being included in the matched text (e.g., for s///).

Group 3 is \(<\/Annotation\)\@!\_.. It includes Group 4, which matches the beginning of the Annotation end-tag. The \@! is a zero-width negative look-ahead and \_. matches any character (including newlines). Together, this group matches any character except one where the Annotation end-tag starts. Group 3 is followed by a non-greedy repetition marker \{-} so that it matches the smallest block of text before MATCH. If you were to use \_. instead of Group 3, the matched text could include the end-tag of an Annotation element that did not include MATCH and continue through into the next Annotation element with MATCH. (Try it.)
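If you want to see that difference outside Vim, here is a small sketch using Python's `re` module, where `(?!...)` plays the role of `\@!` and `re.DOTALL` makes `.` behave like `\_.`. The sample text and variable names are invented for the demonstration, and the start-tag is simplified to have no attributes.

```python
import re

# Two elements: only the second one contains "MATCH.
text = (
    "<Annotation>no hit here</Annotation>\n"
    '<Annotation>has "MATCH inside</Annotation>\n'
)

# Naive version: a lazy "any character" run is free to walk past an end-tag.
naive = re.compile(r'<Annotation>(.*?"MATCH.*?)</Annotation>', re.DOTALL)

# Guarded version: a character is consumed only if no end-tag starts there,
# mirroring \(\(<\/Annotation\)\@!\_.\)\{-} in the Vim pattern.
guarded = re.compile(
    r'<Annotation>((?:(?!</Annotation).)*?"MATCH.*?)</Annotation>', re.DOTALL
)

naive_body = naive.search(text).group(1)
guarded_body = guarded.search(text).group(1)

print(repr(naive_body))    # starts in the first element and runs into the second
print(repr(guarded_body))  # stays inside the second element
```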

The next bit is straightforward: Find MATCH and a minimal number of other characters before the end-tag.

Group 5 is easy: It's the end-tag. \@= is a zero-width positive look-ahead, which is included here for the same reason as the \@<= for the start-tag. We have to repeat <\/Annotation rather than use \4 because groups with zero-width modifiers aren't captured.
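For comparison, here is a near-literal translation of the whole pattern into Python's `re` syntax, offered as a sketch rather than a drop-in replacement. One deliberate difference: Python's `re` only accepts fixed-width look-behind, so the start-tag is captured as an ordinary group instead of being asserted with a look-behind. The name `ANNOTATION_WITH_MATCH` and the sample document are assumptions for illustration.

```python
import re

ANNOTATION_WITH_MATCH = re.compile(
    r'''
    (<Annotation(?:\s*\w+="[^"]*?"\s*?)*>)   # start-tag with optional attributes
    ((?:(?!</Annotation).)*?                 # body text, never crossing an end-tag
     "MATCH.*?)                              # the search term, then the rest of the body
    (?=</Annotation>)                        # end-tag, asserted but not consumed
    ''',
    re.DOTALL | re.VERBOSE,
)

# Hypothetical input: only the second Annotation element contains "MATCH.
document = (
    '<Annotation id="a1">\n'
    '  <Note text="nothing"/>\n'
    '</Annotation>\n'
    '<Annotation id="a2" kind="x">\n'
    '  <Note text="MATCH me"/>\n'
    '</Annotation>\n'
)

matches = list(ANNOTATION_WITH_MATCH.finditer(document))
for m in matches:
    # Report the 1-based line on which the matching element starts.
    line = document.count("\n", 0, m.start(1)) + 1
    print(line, m.group(1))
```

Group 2 holds the element body, which is the part the Vim pattern matches on its own thanks to \@<= and \@=.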

Nathan Kitchen 2009-04-10 02:42:13

+1 for the explanations. It takes me some time to thoroughly understand them :)

Masi 2009-04-10 14:43:49
