I need help building a regex that removes the EVEN-numbered lines from a plain text file.
Given this input:
line1
line2
line3
line4
line5
line6
It would output this:
line1
line3
line5
Thanks!
Well, this will remove the EVEN lines from the text file, though only because each line in your sample ends with its own line number:
grep '[13579]$' textfile > textfilewithoddlines
And output this:
line1
line3
line5
Actually, you don't need a regex for that. In your favourite language, iterate over the file, keep a line counter, and test it modulo 2. E.g. with awk (*nix), odd lines:
$ awk 'NR%2==1' file
line1
line3
line5
even lines:
$ awk 'NR%2==0' file
line2
line4
line6
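The same counter-and-modulus idea can be sketched in Python. This is only a minimal illustration: the function name `odd_lines` and the in-memory sample are assumptions made for the example, not part of the original answer.

```python
from typing import Iterable, Iterator


def odd_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield lines 1, 3, 5, ... (1-based), mirroring awk 'NR%2==1'."""
    for number, line in enumerate(lines, start=1):
        if number % 2 == 1:  # use == 0 here to keep the even lines instead
            yield line


# Hypothetical sample data; an open file object can be passed the same way.
sample = ["line1\n", "line2\n", "line3\n", "line4\n", "line5\n", "line6\n"]
print("".join(odd_lines(sample)), end="")
```

Because the function only iterates, it never needs to hold the whole file in memory.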
Perhaps you are on the command line. In PowerShell, this keeps the odd lines:
$x = 0; gc .\foo.txt | ? { $x++; $x % 2 -eq 1 }
Well, if you do a search-and-replace-all-matches on
^(.*)\r?\n.*
in "^ matches start-of-line" mode and ". doesn't match linebreaks" mode, replacing with
\1
then you lose every even line.
E.g. in C#:
resultString = Regex.Replace(subjectString, @"^(.*)\r?\n.*", "$1", RegexOptions.Multiline);
or in Python:
result = re.sub(r"(?m)^(.*)\r?\n.*", r"\1", subject)
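As a quick check, the Python one-liner can be wrapped in a small runnable sketch and applied to the sample from the question. The names `PAIR` and `drop_even_lines` are hypothetical, chosen only for this illustration, and the sketch is only exercised on inputs without blank lines.

```python
import re

# Same pattern as in the answer: a line, its line break, then the next line.
PAIR = re.compile(r"(?m)^(.*)\r?\n.*")


def drop_even_lines(subject: str) -> str:
    """Replace each pair of lines with the first line of the pair."""
    return PAIR.sub(r"\1", subject)


sample = "line1\nline2\nline3\nline4\nline5\nline6"
print(drop_even_lines(sample))
```

With an odd number of lines, the last line has no partner to pair with and is left untouched.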
First, I fully agree with the consensus that this is not something regex should be doing.
Here's a Java demo:
public class Test {
    public static String voodoo(String lines) {
        return lines.replaceAll("\\G(.*\r?\n).*(?:\r?\n|$)", "$1");
    }

    public static void main(String[] args) {
        System.out.println("a)\n"+voodoo("1\n2\n3\n4\n5\n6"));
        System.out.println("b)\n"+voodoo("1\r\n2\n3\r\n4\n5\n6\n7"));
        System.out.println("c)\n"+voodoo("1"));
    }
}
output:
a)
1
3
5
b)
1
3
5
7
c)
1
A short explanation of the regex:
\G # match the end of the previous match
( # start capture group 1
.* # match any character except line breaks and repeat it zero or more times
\r? # match the character '\r' and match it once or none at all
\n # match the character '\n'
) # end capture group 1
.* # match any character except line breaks and repeat it zero or more times
(?: # start non-capture group 1
\r? # match the character '\r' and match it once or none at all
\n # match the character '\n'
| # OR
$ # match the end of the input
) # end non-capture group 1
\G begins at the start of the string. Every pair of lines (where the second line is optional, in case of an odd-numbered last line) gets replaced by the first line in the pair.
But again: using a normal programming language (if one can call awk "normal" :)) is the way to go.
EDIT
And as Tim suggested, this also works:
replaceAll("(?m)^(.*)\r?\n.*", "$1")
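For comparison, the pairwise pattern from the Java demo can be approximated in Python. Python's standard `re` module does not support `\G`, so this sketch simply drops it; for the three demo inputs the matches still follow one another without gaps, but that is an observation about these inputs, not a general guarantee. The name `drop_even_lines_pairwise` is hypothetical.

```python
import re

# Java pattern minus the leading \G, which the stdlib re module rejects.
PAIRWISE = re.compile(r"(.*\r?\n).*(?:\r?\n|$)")


def drop_even_lines_pairwise(lines: str) -> str:
    """Keep the first line (with its line break) of every pair of lines."""
    return PAIRWISE.sub(r"\1", lines)


# Same inputs as the Java demo; repr() makes the line breaks visible.
for text in ("1\n2\n3\n4\n5\n6", "1\r\n2\n3\r\n4\n5\n6\n7", "1"):
    print(repr(drop_even_lines_pairwise(text)))
```

Unlike the `(?m)^` variant, this one keeps the line break of each retained line, so an even-length input ends with a trailing newline.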