ansaurus

Question

Answer 1

+1 A:

sed -n 's/.*<title>\(.*\)<\/title>.*/\1/ip;T;q'

From Linux Commands.

It was the first Google result for "unix extract page title".
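A quick way to try the command without a real page is to pipe a small HTML snippet into it. This is only a sketch: it assumes GNU sed (the `i` flag on `s` and the `T` command are GNU extensions), and the `extract_title` wrapper name and the sample HTML are made up for illustration.

```shell
# Hypothetical wrapper around the sed command above; assumes GNU sed.
# -n suppresses automatic printing, the p flag prints the line where the
# substitution succeeded, T skips to the next line when it did not,
# and q stops after the first title found.
extract_title() {
  sed -n 's/.*<title>\(.*\)<\/title>.*/\1/ip;T;q'
}

# Made-up sample input; the match is case-insensitive, so <TITLE> works too.
printf '<html>\n<head><TITLE>Example Page</TITLE></head>\n</html>\n' | extract_title
# prints: Example Page
```

Note that this only works when the opening and closing title tags are on the same line.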

mcandre 2010-07-07 14:48:59

Thank you so much!

Samantha 2010-07-07 15:02:00

Answer 2

A:

This awk one-liner also works for a title that spans more than one line.

$ cat file
<html>
    <title>How to extract a page
title - Stack Overflow</title>
    <link rel="stylesheet" href="http://sstatic.net/so/all.css?v=4864b39b46cf">
    <link rel="shortcut icon" href="http://sstatic.net/so/favicon.ico">
    <link rel="apple-touch-icon" href="http://sstatic.net/so/apple-touch-icon.png">
</html>

$ awk 'BEGIN{RS="</title>"}/title/{gsub(".*<title>","");print}' file
How to extract a page
title - Stack Overflow
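If the title should come out on a single line, the same idea can be extended to collapse the embedded newline and indentation. This is a sketch, not a definitive version: it assumes an awk that treats a multi-character RS as a record separator (gawk and mawk do; POSIX does not require it), and the `title_one_line` function name is hypothetical.

```shell
# Hypothetical helper; assumes gawk or mawk (multi-character RS).
# Each record ends at </title>, so the first record containing <title>
# holds everything up to the end of the title, newlines included.
title_one_line() {
  awk 'BEGIN { RS = "</title>" }
       /<title>/ {
         gsub(/.*<title>/, "")   # drop everything up to the opening tag
         gsub(/[ \t\n]+/, " ")   # collapse runs of whitespace into one space
         print
         exit                    # stop after the first title
       }' "$1"
}

title_one_line file
```

With the sample file above this should print the title as one line, "How to extract a page title - Stack Overflow".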

ghostdog74 2010-07-07 15:43:32

How to extract a page title