ansaurus

Question

Answer 1

A:

Try this:

grep -Eo "(mailto:|(ftp|https?)://)[^'\"]+" /path/to/file

This outputs one link per line. It assumes every link is enclosed in single or double quotes. To exclude links to a particular domain, pipe the result through a second grep with -v:

grep -Eo "(mailto:|(ftp|https?)://)[^'\"]+" /path/to/file | grep -vF "yahoo.com"
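
For anyone who wants to try this without a real page, here is a minimal sketch that runs both steps against a throwaway sample. The sample HTML, its URLs, and the variable names are made up for illustration. The pattern used here requires a colon after mailto and :// after ftp, http, or https, and the filter uses -F so the dot in the domain is matched literally. The -o option is not in POSIX, but both GNU and BSD grep support it.

```shell
#!/bin/sh
# Hypothetical sample page; the URLs are made up for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<a href="http://example.com/page">Example</a>
<a href='https://www.yahoo.com/news'>Yahoo</a>
<a href="mailto:someone@example.org">Mail</a>
<a href="/local/path">Local</a>
EOF

# Extract every quoted external link, one per line.
# The relative link on the last line has no scheme, so it is skipped.
links=$(grep -Eo "(mailto:|(ftp|https?)://)[^'\"]+" "$sample")
printf '%s\n' "$links"

# Drop links that contain yahoo.com; -F treats the pattern as a
# fixed string, so the dot only matches a literal dot.
filtered=$(printf '%s\n' "$links" | grep -vF "yahoo.com")
printf '%s\n' "$filtered"

rm -f "$sample"
```

The first printf should list the three absolute links, and the second should list the same links minus the yahoo.com one.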

hudolejev 2010-06-09 12:34:34

Thank you for replying. It works for me, thanks again.

Amar 2010-06-10 04:45:02

You're welcome. 'Thanks' is way too much; accepting the answer would be sufficient (:

hudolejev 2010-06-10 17:19:22

Answer 2

A:

By default grep prints the entire line a match was found on. The -o switch selects only the matched parts of a line. See the man page.
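
A small sketch of the difference, assuming a grep that supports -o (GNU and BSD grep both do). The input line and variable names are invented for illustration.

```shell
#!/bin/sh
# Hypothetical one-line input, made up for illustration.
line='<p>See <a href="http://example.com/a">A</a> and <a href="http://example.com/b">B</a></p>'

# Without -o: the whole matching line is printed once.
whole=$(printf '%s\n' "$line" | grep -E 'http://[^"]+')
printf '%s\n' "$whole"

# With -o: only the matched parts are printed, each on its own line.
only=$(printf '%s\n' "$line" | grep -Eo 'http://[^"]+')
printf '%s\n' "$only"
```

The first command echoes the full paragraph back, while the second prints just the two URLs.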

wds 2010-06-09 12:38:03

Regex to find external links in an HTML file using grep