ansaurus

Question

Regexp: how do I copy only the start and end parts of a document?

Answer 1

+1 A:

Start with an expression that matches the pieces you care about:

/lorem ipsum(.*?)tellus(.*?)aliquam(.*?)retrum/

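Now the first and third sub-patterns, concatenated together, are your final content.

In some flavors of regular expression you can make the middle sub-pattern non-capturing, so that it does not count as a numbered group. In Perl's flavor (and PHP's preg) that is written (?:.*?), and the two pieces you want then become groups 1 and 2.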
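Since the asker mentions Java further down, here is a minimal sketch of the same idea with java.util.regex. The class name StartEndGroups, the method startAndEnd and the abbreviated sample text are made up for illustration. The last marker is spelled rutrum, as in the sample text, and the CASE_INSENSITIVE and DOTALL flags are assumed to be needed because the text starts with a capital "Lorem" and spans several lines.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StartEndGroups {
    // Marker words come from the expression above ("rutrum" as spelled in the text).
    // The middle group is non-capturing, so only the start and end pieces
    // are numbered, as groups 1 and 2.
    private static final Pattern PARTS = Pattern.compile(
            "lorem ipsum(.*?)tellus(?:.*?)aliquam(.*?)rutrum",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns {start piece, end piece}, or null when the markers are not found.
    static String[] startAndEnd(String text) {
        Matcher m = PARTS.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // Abbreviated, hypothetical stand-in for the asker's document.
        String text = "Lorem ipsum dolor sit amet, tincidunt ac tellus.\n"
                + "Ut id enim sapien, ut varius dolor.\n"
                + "aliquam nec felis. Integer cursus lectus ac eros aliquet rutrum.";
        String[] parts = startAndEnd(text);
        if (parts != null) {
            System.out.println("start = " + parts[0]);
            System.out.println("end   = " + parts[1]);
        }
    }
}
```

The groups hold only the text between the marker words. If the markers themselves should be part of the result, move them inside the parentheses.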

VoteyDisciple 2010-10-13 12:55:07

I tested that in Rubular, but it doesn't work: http://www.rubular.com/r/U0SBv3zV6W. Can you help me one more time? I tried modifying this regexp to make 2 groups, but that doesn't work either.

Stefhan 2010-10-13 13:32:14

Rubular already includes the slashes that delimit the regular expression; you can't paste them into the expression itself. You also need the `m` and `i` flags to account for case insensitivity and multiline text. And finally I wrote `retrum` but the text has `rutrum`. http://www.rubular.com/r/2hZ8xeKS9e

VoteyDisciple 2010-10-13 13:45:04

Answer 2

A:

If you're looking for the first and last line (it's not clear, at least to me, what you mean by first and last part), the following regex captures the first line in $1 and the last line in $2, provided there are at least two lines:

 \A([^\n]+)[\s\S]*\n([^\n]+)\Z

Amarghosh 2010-10-13 12:58:51

See the bold text: I want to match the first part of the text and the end part of the text, not lines.

Stefhan 2010-10-13 13:34:07

Answer 3

+1 A:

If the groups you want are always separated into blocks, like the paragraphs in your example, you can find all occurrences of such a block, probably using the newline as the terminator, and then take the first and last matches.
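The block-by-block idea could be sketched in Java roughly as follows. This assumes a "block" is simply a run of characters up to the next newline. The class name FirstLastBlock and the method blocks are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstLastBlock {
    // Assumption: a block is one or more characters that are not a newline.
    private static final Pattern BLOCK = Pattern.compile("[^\\n]+");

    // Collects every block in document order; empty lines produce no block.
    static List<String> blocks(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = BLOCK.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> all = blocks("first block\nmiddle block\nlast block");
        if (!all.isEmpty()) {
            System.out.println("start = " + all.get(0));
            System.out.println("end   = " + all.get(all.size() - 1));
        }
    }
}
```

Collecting all matches first keeps the pattern trivial. The cost is that the whole list is built even though only two entries are used.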

Or do you need the actual RegEx to match those blocks? If so, first of all I recommend http://rubular.com/ for testing out RegEx stuff. It works in real time, which makes it easier to see how each change affects the match.

Knowing which language you are doing this in, or whether it is a command-line search (e.g. egrep), would help in answering.

LokNessMobster 2010-10-13 13:05:57

Hi, I want 2 groups, one for the start and the other for the end. I am using Rubular, but I am a newbie at regexp; I have tried several times and need some help =/ ... I am using Java.

Stefhan 2010-10-13 13:23:55

Answer 4

+2 A:

In Perl, you can do:

#!/usr/bin/perl 
use 5.10.1;
use warnings;
use strict;

my $str = q!Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum vitae dapibus tortor. Duis odio massa, viverra quis vestibulum nec, tincidunt ac tellus.
Ut id enim sapien, ut varius dolor. Curabitur ipsum dolor, consectetur quis fermentum ut,
aliquam nec felis. Praesent sed malesuada sem. Integer cursus lectus ac eros aliquet rutrum.!;

$str =~ /\A(.+)[\s\S]+?(.+)\Z/;
say '$1 = ',$1;
say '$2 = ',$2;

Output:

$1 = Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum vitae dapibus tortor. Duis odio massa, viverra quis vestibulum nec, tincidunt ac tellus.
$2 = aliquam nec felis. Praesent sed malesuada sem. Integer cursus lectus ac eros aliquet rutrum.

Explanation:

/         : beginning of regex
 \A       : beginning of string
 (        : beginning of group 1
  .+      : any char except newline, one or more times
 )        : end of group 1
 [\s\S]   : any char including newlines
   +?     :   one or more times, non-greedy
 (        : beginning of group 2
  .+      : any char except newline, one or more times
 )        : end of group 2
 \Z       : end of string (or just before a final newline)
/         : end of regex

This can surely be adapted to other languages.
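As one possible adaptation (not part of the original answer), the same pattern can be used in Java, which the asker says they are using. The class name FirstLastLine and the method firstAndLast are invented for this sketch. It relies on java.util.regex supporting \A and \Z, and on "." not matching line terminators by default.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstLastLine {
    // Same pattern as the Perl version; backslashes are doubled
    // because this is a Java string literal.
    private static final Pattern FIRST_LAST =
            Pattern.compile("\\A(.+)[\\s\\S]+?(.+)\\Z");

    // Returns {first line, last line}, or null when nothing matches.
    // Meant for input with at least two lines.
    static String[] firstAndLast(String text) {
        Matcher m = FIRST_LAST.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String str = "Lorem ipsum dolor sit amet.\n"
                + "Ut id enim sapien, ut varius dolor.\n"
                + "aliquam nec felis. Integer cursus lectus ac eros aliquet rutrum.";
        String[] lines = firstAndLast(str);
        if (lines != null) {
            System.out.println("$1 = " + lines[0]);
            System.out.println("$2 = " + lines[1]);
        }
    }
}
```

As in the Perl version, this captures whole lines. If the start and end parts are delimited by marker words instead of line breaks, the marker-based expression from Answer 1 is the closer fit.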

M42 2010-10-13 13:41:50
