ansaurus

Question

Answer 1

A:

Well, this will remove the even lines from the text file:

grep '[13579]$' textfile > textfilewithoddlines

And output this:

line1
line3
line5

emil 2010-02-12 07:57:44

that is not scalable.

ghostdog74 2010-02-12 07:58:27

Oh yes it is. No matter how large the number, whether it's odd is decided by just the last digit (`$`) which must be one of these 5 digits.

bart 2010-02-12 08:02:17

What I mean by not scalable is that the data may not be a literal "line1", "line2". It may be anything, so grepping for patterns that end with a number is not scalable.

ghostdog74 2010-02-12 08:04:22

It is plenty scalable. What ghostdog means is that it's not _general_. Unfortunately the spec (question) doesn't state what kind of generality is required, so we are left to guess.

Jay Bazuzi 2010-02-12 08:06:16

@ghostdog74: As the OP said he wanted to use regex, I assumed he wanted lines ending in odd/even numbers. Otherwise, as many say, one wouldn't use regex. `sed -n '2,$n;p' textfile` might be better suited.

emil 2010-02-12 10:34:53
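
To make the "scalable vs. general" point from the comments above concrete, here is a rough Python sketch. The function names and sample data are made up for illustration; `by_content` imitates the grep pattern, `by_position` picks lines by their place in the file. The two only agree when every line happens to end in its own line number.

```python
import re

def by_content(lines):
    """Like grep '[13579]$': keep lines whose last character is an odd digit."""
    return [line for line in lines if re.search(r"[13579]$", line)]

def by_position(lines):
    """Keep the 1st, 3rd, 5th, ... lines regardless of what they contain."""
    return lines[::2]

numbered = ["line1", "line2", "line3", "line4"]   # hypothetical sample data
arbitrary = ["alpha", "beta 7", "gamma", "delta"]  # hypothetical sample data

print(by_content(numbered), by_position(numbered))    # same result for both
print(by_content(arbitrary), by_position(arbitrary))  # results differ
```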

Answer 2

+5 A:

Actually, you don't use regex for that. With your favourite language, iterate over the file, keep a counter and take it modulo 2, e.g. with awk (*nix), for odd lines:

$ awk 'NR%2==1' file
line1
line3
line5

even lines:

$ awk 'NR%2==0' file
line2
line4
line6
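
The same counter-and-modulus idea, sketched in Python for anyone without awk at hand. The function name and the sample list are assumptions for illustration only; `enumerate(..., start=1)` plays the role of awk's `NR`.

```python
def odd_lines(lines):
    """Yield the 1st, 3rd, 5th, ... items, mirroring awk's NR%2==1."""
    for number, line in enumerate(lines, start=1):
        if number % 2 == 1:
            yield line

# Hypothetical sample standing in for the contents of a file.
sample = ["line1\n", "line2\n", "line3\n", "line4\n", "line5\n", "line6\n"]
print("".join(odd_lines(sample)), end="")
```

Since a file object iterates line by line, passing an open file instead of the list should work the same way without loading the whole file into memory.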

ghostdog74 2010-02-12 07:57:59

Tried this and it works! Thanks :)

sthg 2010-02-12 08:19:46

Answer 3

A:

Perhaps you are on the command line. In PowerShell, this keeps the odd lines:

$x = 0; gc .\foo.txt | ? { $x++; $x % 2 -eq 1 }

Jay Bazuzi 2010-02-12 08:05:15

Answer 4

+1 A:

Well, if you do a search-and-replace-all-matches on

^(.*)\r?\n.*

in "^ matches at start of line" mode and ". doesn't match line breaks" mode, replacing with

\1

then you lose every even line.

E.g. in C#:

resultString = Regex.Replace(subjectString, @"^(.*)\r?\n.*", "$1", RegexOptions.Multiline);

or in Python:

result = re.sub(r"(?m)^(.*)\r?\n.*", r"\1", subject)
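
A small self-contained check of the Python line, as a sketch: the function name and the two sample strings are made up for illustration. It also covers an input with an odd number of lines, where the last line has no partner and is simply left alone.

```python
import re

# Same pattern as above: an odd line, its line break, then the even line.
PATTERN = re.compile(r"(?m)^(.*)\r?\n.*")

def keep_odd_lines(subject):
    """Replace each pair of lines with the first line of the pair."""
    return PATTERN.sub(r"\1", subject)

even_count = "line1\nline2\nline3\nline4"  # hypothetical sample
odd_count = "line1\nline2\nline3"          # hypothetical sample

print(repr(keep_odd_lines(even_count)))  # 'line1\nline3'
print(repr(keep_odd_lines(odd_count)))   # 'line1\nline3'
```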

Tim Pietzcker 2010-02-12 08:22:23

You should also cover the case that there is an odd number of lines.

Gumbo 2010-02-12 08:26:15

Thought so at first, too - but we want to keep the odd lines, don't we? :) By the way, congratulations on being elected moderator - I just noticed the diamond (and I did vote for you ;)

Tim Pietzcker 2010-02-12 08:40:11

Answer 5

+1 A:

First, I fully agree with the consensus that this is not something regex should be doing.

Here's a Java demo:

public class Test {

    public static String voodoo(String lines) {
        return lines.replaceAll("\\G(.*\r?\n).*(?:\r?\n|$)", "$1");
    }

    public static void main(String[] args) {
        System.out.println("a)\n"+voodoo("1\n2\n3\n4\n5\n6"));
        System.out.println("b)\n"+voodoo("1\r\n2\n3\r\n4\n5\n6\n7"));
        System.out.println("c)\n"+voodoo("1"));
    }
}

output:

a)
1
3
5

b)
1
3
5
7

c)
1

A short explanation of the regex:

\G       # match the end of the previous match
(        # start capture group 1
  .*     #   match any character except line breaks and repeat it zero or more times
  \r?    #   match the character '\r' and match it once or none at all
  \n     #   match the character '\n'
)        # end capture group 1
.*       # match any character except line breaks and repeat it zero or more times
(?:      # start non-capture group 1 
  \r?    #   match the character '\r' and match it once or none at all
  \n     #   match the character '\n'
  |      #   OR
  $      #   match the end of the input
)        # end non-capture group 1

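\G anchors the first match at the start of the string and every later match at the end of the previous one. Every pair of lines gets replaced by the first line in the pair. The line break after the second line is optional, so a final even line without a trailing line break is still removed; a trailing odd line is not matched at all and is left as it is.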
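
For comparison, a hedged Python sketch of the same pair-by-pair idea. Python's built-in `re` module has no `\G`, so this imitates it with `pattern.match(text, pos)`, which only matches starting exactly at `pos`. The function name is made up, `[^\r\n]` stands in for Java's `.`, and `\Z` stands in for `$`; treat it as an approximation of the Java regex, not a port.

```python
import re

# Group 1 is the odd line with its line break; the even line and its
# optional line break are matched but not captured.
PAIR = re.compile(r"([^\r\n]*\r?\n)[^\r\n]*(?:\r?\n|\Z)")

def drop_even_lines(text):
    kept = []
    pos = 0
    while pos < len(text):
        m = PAIR.match(text, pos)  # anchored at pos, like \G
        if m is None:
            break
        kept.append(m.group(1))
        pos = m.end()
    # Whatever was not matched (a trailing odd line) is kept unchanged.
    return "".join(kept) + text[pos:]

print(repr(drop_even_lines("1\n2\n3\n4\n5\n6")))            # '1\n3\n5\n'
print(repr(drop_even_lines("1\r\n2\n3\r\n4\n5\n6\n7")))      # '1\r\n3\r\n5\n7'
print(repr(drop_even_lines("1")))                            # '1'
```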

But again: using a normal programming language (if one can call awk "normal" :)) is the way to go.

EDIT

And as Tim suggested, this also works:

replaceAll("(?m)^(.*)\r?\n.*", "$1")

Bart Kiers 2010-02-12 08:54:33

Shouldn't `String result = subject.replaceAll("(?m)^(.*)\r?\n.*", "$1");` work just the same? After a match, the regex engine will automatically have arrived at the start of the next odd line.

Tim Pietzcker 2010-02-12 13:51:28

Yes, of course it does! As is often the case with me: I try to solve things in a much too difficult way!

Bart Kiers 2010-02-12 14:04:42


Regex to remove EVEN lines
