I am writing an application that gets the title of an HTML page, some text from inside the body tag, and an image. It is similar to the share feature on Facebook. Can I get a regular expression that does that? Thanks for your assistance.
+1
A:
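A regexp like <title>(.*?)</title>
will get you the content of the title.
The .*? part matches any characters in a non-greedy way, so the match stops at the first closing title tag (in case there is another title end tag later in the page). Note that in most regex flavors . does not match newlines unless you enable the corresponding option.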
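As a rough illustration of the pattern above, here is a minimal Python sketch. The sample page, the `extract_title` helper and the extra `[^>]*` (to tolerate attributes on the tag) are assumptions added for the example; they are not part of the original answer, and the same pattern should carry over to other languages with little change.

```python
import re

# Hypothetical sample page; a real application would fetch this over HTTP.
sample_page = (
    "<html><head><title>Example Page</title></head>"
    "<body><p>Hello</p></body></html>"
)

# IGNORECASE accepts <TITLE>; DOTALL lets "." span newlines inside the title.
# (.*?) is non-greedy, so the match ends at the first closing title tag.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)

def extract_title(page):
    """Return the text of the first <title> element, or None if absent."""
    match = TITLE_RE.search(page)
    return match.group(1).strip() if match else None

print(extract_title(sample_page))
```

The function returns None when no title tag is found, so the caller can fall back to something else, such as the page URL.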
Scharron
2010-07-21 10:26:54
Please, how do I go about this? I am new to regular expressions.
2010-07-21 11:12:32
Thanks, I got it.
2010-07-21 11:41:57
+2
A:
You should probably use an HTML parser instead of a regular expression. See Simple HTML DOM, for example.
A regular expression for your task will be very hard to maintain and will break easily whenever the markup of the pages in question changes, not to mention that it cannot reliably account for HTML comments.
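To show what the parser approach might look like, here is a minimal sketch using Python's standard `html.parser` module rather than Simple HTML DOM (which is a PHP library). The class name, the helper function and the sample page are hypothetical, and taking "the first paragraph and the first image" as the preview is an assumption about what the share feature needs.

```python
from html.parser import HTMLParser

class SharePreviewParser(HTMLParser):
    """Collects the title, the first <p> text, and the first <img> src."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.first_paragraph = ""
        self.first_image = None
        self._in_title = False
        self._in_paragraph = False
        self._paragraph_done = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "p" and not self._paragraph_done:
            self._in_paragraph = True
        elif tag == "img" and self.first_image is None:
            # attrs is a list of (name, value) pairs
            self.first_image = dict(attrs).get("src")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_paragraph:
            self._in_paragraph = False
            self._paragraph_done = True

    def handle_data(self, data):
        # Text inside nested tags such as <b> is still reported here.
        if self._in_title:
            self.title += data
        elif self._in_paragraph:
            self.first_paragraph += data

def extract_preview(page):
    """Return (title, first paragraph text, first image src) for a page."""
    parser = SharePreviewParser()
    parser.feed(page)
    parser.close()
    return parser.title.strip(), parser.first_paragraph.strip(), parser.first_image

# Hypothetical sample page; note the commented-out title, which is skipped.
sample_page = """<html><head><!-- <title>Old</title> --><title>Example Page</title></head>
<body><p>Hello <b>world</b>.</p><img src="/logo.png" alt="logo"><p>More</p></body></html>"""

print(extract_preview(sample_page))
```

Because the parser reports comments separately from tags, the commented-out title in the sample is ignored, which is the case a plain regular expression would get wrong.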
Jens
2010-07-21 10:27:54