Hi, I have the following lines in a file:

<a class="rss tip" rel="direct" title="Linq2Sql" href="http://feeds2.feedburner.com/pippo_ORM"></a>
<a class="rss tip" title="ORM" href="http://feeds2.feedburner.com/pippo_ORM" rel="nofollow"></a>
<a class="rss tip" rel="boh" title="Nhibernate" href="http://feeds2.feedburner.com/pippo_ORM"></a>
<a class="rss tip" rel="direct" title="Linq2Sql" href="http://pippo.it/pippo_ORM"></a>
<a class="rss tip" title="Linq2Sql" href="http://pippo.it/pippo_ORM"></a>
<a class="rss tip" title="direct" href="pippo"></a>

I need to get all the anchors whose href does not contain the URL "pippo.it". From that result I would also like to remove the lines that contain rel="direct".

How can I do that?

I use RegexBuddy, and I need to put the code in a .NET console program. The search has to cover every line of the file.

Thanks

A: 

Something like this should do it:

grep -v "pippo.it" myfile.txt | grep -v "rel=\"direct\""

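The -v option inverts the match, so only the lines that do not contain the pattern are output. The first grep drops every line that mentions pippo.it, and the second drops the lines that contain rel="direct".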
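
To check this against the sample from the question, here is a minimal sketch. The function name filter_anchors and the temporary file are only for illustration, and it assumes each line of the question's sample is a complete <a ...></a> tag. It adds -F so that the dot in pippo.it is taken literally rather than as "any character". Note that it still matches pippo.it anywhere on the line, not only inside href.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of file "$1" that mention neither
# pippo.it nor rel="direct". -v inverts the match, -F treats the pattern
# as a fixed string instead of a regular expression.
filter_anchors() {
    grep -vF 'pippo.it' "$1" | grep -vF 'rel="direct"'
}

# Sample data copied from the question into a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<a class="rss tip" rel="direct" title="Linq2Sql" href="http://feeds2.feedburner.com/pippo_ORM"></a>
<a class="rss tip" title="ORM" href="http://feeds2.feedburner.com/pippo_ORM" rel="nofollow"></a>
<a class="rss tip" rel="boh" title="Nhibernate" href="http://feeds2.feedburner.com/pippo_ORM"></a>
<a class="rss tip" rel="direct" title="Linq2Sql" href="http://pippo.it/pippo_ORM"></a>
<a class="rss tip" title="Linq2Sql" href="http://pippo.it/pippo_ORM"></a>
<a class="rss tip" title="direct" href="pippo"></a>
EOF

# Print the surviving lines.
filter_anchors "$sample"
```

On this sample the anchors titled "ORM", "Nhibernate" and "direct" remain. The last one stays because it has title="direct", not rel="direct".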

Paul Dixon
A: 
grep -v 'href="[^"]*pippo.it\|rel="direct"' file.txt
Draemon
A: 
awk '!/rel=\"direct\"/ && !/href.*pippo\.it/' file
ghostdog74