Hi,

I have a text file that has a lot of character entries, one line after another. I want to find all lines that start with :: and delete them.

What is the regular expression to do this?

-AD

A: 

Simple as:

^::
José Leal
A: 
^::.*[\r\n]*

If you're reading the file line-by-line you won't need the [\r\n]* part.

Alan Moore
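To illustrate the difference between the two cases mentioned above, here is a minimal Python sketch. It assumes the text is either held in memory as one string or processed one line at a time. The function names and the sample text are made up for illustration and are not from the thread.

```python
import re

# Pattern from the answer above; re.MULTILINE lets ^ match at the start of every line.
WHOLE_TEXT = re.compile(r"^::.*[\r\n]*", re.MULTILINE)
# When each line is handled on its own, the newline part is not needed.
LINE_START = re.compile(r"^::")


def strip_whole_text(text):
    """Delete '::' lines from text held in memory as a single string."""
    return WHOLE_TEXT.sub("", text)


def strip_line_by_line(lines):
    """Keep only the lines that do not start with '::'."""
    return [line for line in lines if not LINE_START.match(line)]


# Hypothetical sample input
sample = "keep 1\n:: drop me\nkeep 2\n::also drop\n"
print(strip_whole_text(sample))
print(strip_line_by_line(sample.splitlines(keepends=True)))
```

Note that [\r\n]* also swallows any blank lines that directly follow a deleted line, which may or may not be what you want.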
+6  A: 

Regular expressions don't "do" anything. They only match text.

What you want is a tool that uses regular expressions to identify a line and then applies some command to the matching lines.

One such tool is sed (there's also awk and many others). You'd use it like this:

sed -e "/^::/d" < input.txt > output.txt

The part "/^::/" tells sed to apply the following command to all lines that start with "::" and "d" simply means "delete that line".

Or the simplest solution (which my brain didn't produce for some strange reason):

grep -v "^::" input.txt > output.txt
Joachim Sauer
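If neither sed nor grep is at hand, the same "match a line, then apply a command to it" idea can be sketched in Python. This is only an illustration under the assumption that the file is UTF-8 text; the function name delete_matching_lines and its parameters are hypothetical.

```python
import re


def delete_matching_lines(src_path, dst_path, pattern=r"^::"):
    """Copy src_path to dst_path, skipping lines that match pattern.

    Returns the number of lines that were skipped.
    """
    regex = re.compile(pattern)
    removed = 0
    # newline="" keeps the original line endings untouched on read and write.
    with open(src_path, encoding="utf-8", newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            if regex.match(line):  # the regex only identifies the line ...
                removed += 1       # ... the surrounding code decides to drop it
            else:
                dst.write(line)
    return removed
```

Calling delete_matching_lines("input.txt", "output.txt") would roughly correspond to the sed and grep commands above.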
I think you have forgotten the Regex.Replace function... That actually "does" something, doesn't it?
Dscoduc
@Dscoduc: as you said: The function does something (it's one of the tools I mentioned). The regular expression itself still only matches some text. It's the semantics of the function that define what is to be done with the matched text.
Joachim Sauer
Thanks for the clarification... I stand corrected...
Dscoduc
+2  A: 
sed -i -e '/^::/d' yourfile.txt
mouviciel
I think this is perhaps the best answer, but it might be worth mentioning that not all versions of sed have a -i option.
oylenshpeegul
A: 

If you don't have sed or grep, search for this and replace it with an empty string:

^::.*[\r\n]
jcoon
A: 

Thanks for the pointers:

The following worked for me. After "::", any character could be present in the text file, so I used:

^::[a-zA-Z0-9 I put all punctuation symbols here]*$

-AD

goldenmean
You don't need to match anything after the initial ^::. In your example you are forced to "account for" all the characters because you put a $ at the end.
Manu
If he's using a line-oriented tool like grep you're right. But he still hasn't said.
Alan Moore
@goldenmean, what's preventing you from using .* instead of that monster character class?
Alan Moore
I agree, it would probably be better to use a singleline option and add the .* to the expression.
Dscoduc
Single-line? Why would you want the dot to match newline characters? If you read one line at a time, there won't be any newlines to match, and if you read the whole file into memory before processing, the dot-star will consume the rest of the file the first time it's applied.
Alan Moore
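The point about single-line mode can be checked with a small Python sketch, where re.DOTALL plays the role of the single-line option. The sample text and the variable names are invented for illustration.

```python
import re

# Hypothetical whole-file contents read into memory
text = "first\n:: comment\nlast\n"

# With DOTALL the dot also matches newlines, so .* runs on to the end of the text.
greedy = re.sub(r"^::.*", "", text, flags=re.MULTILINE | re.DOTALL)

# Without DOTALL the dot stops at the newline; \n? then removes the line break as well.
per_line = re.sub(r"^::.*\n?", "", text, flags=re.MULTILINE)

print(repr(greedy))
print(repr(per_line))
```

In the first call everything after the "::" line is lost along with it, while the second call removes only that line.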
A: 

Here's my contribution in C#:

Text stream:

string stream = ":: This is a comment line";

Syntax:

Regex commentsExp = new Regex("^::.*", RegexOptions.Multiline);

Usage:

Console.WriteLine(commentsExp.Replace(stream, string.Empty));

Alternatively, if I simply wanted to take a text file that included comments and produce an exact duplicate without the comment lines, I could use a simple but effective combination of the type and findstr command-line tools:

type commented.txt | findstr /v /R "^::" > uncommented.txt
Dscoduc