How can I get specific content out of a URL fetched with cURL? I'm not sure how to begin, and I haven't been able to search for it properly because I don't know exactly what I'm looking for.

+1  A: 

I have already written an answer about that here for a question entitled "How to display content from one site on another using PHP?"

Oren
A: 

You probably want to use a regex to search for the portion of the page that you want to scrape for content. Look up preg_match:

http://php.net/preg_match
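
As a minimal sketch of the regex approach, assuming the response is plain text rather than HTML: the function names `fetch_url` and `extract_first`, the sample body, and the pattern are all made up for illustration and are not from the question. The sample string stands in for the cURL response so the snippet runs without a network connection.

```php
<?php
// Fetch a page body with cURL. The URL is whatever the caller passes in.
function fetch_url(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    $body = curl_exec($ch);
    return $body === false ? '' : $body;
}

// Hypothetical helper: return the first capture group of $pattern, or null if nothing matches.
function extract_first(string $pattern, string $text): ?string
{
    return preg_match($pattern, $text, $m) === 1 ? $m[1] : null;
}

// Stand-in for fetch_url('http://example.com/') so the sketch runs offline.
$body = "Order: 1042\nStatus: shipped\nTotal: 19.99 USD\n";

// The /m modifier lets ^ and $ match at line boundaries inside the body.
$status = extract_first('/^Status:\s*(\w+)$/m', $body);
echo $status, PHP_EOL; // prints "shipped"
```

In real use, `$body` would come from `fetch_url()`; checking for `null` covers the case where the page layout changes and the pattern no longer matches.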

Zak
I suggest posting another question if you have trouble creating an exact regex to get the portion of the page you want...
Zak
-1: No, no, no, no, no, no, no, no, no! **Never scrape HTML using regular expressions.** Use a parser like phpQuery or DOMDocument, but not regular expressions... NO... http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
Andrew Moore
I didn't say he should use regular expressions to parse HTML. I said he should use a regex to find the *content* he was interested in. I challenge you to use any DOM in the world to narrow down a block of *content* to the few pieces of information you want...
Zak
You narrow it down to the container the information is in and then parse the text nodes. You do NOT use regex on an HTML document.
Andrew Moore
I'll take that as an "I'm sorry, go ahead and use a regex to find the specific content you are interested in." Note that the user only stated after my answer that it was an HTML document that curl was returning.
Zak
@Zak: gee, I don't know... Aren't most of the documents on the net in HTML format?
Andrew Moore
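
The parser approach described in the comments (narrow down to the container, then read the text nodes) could look roughly like the following with DOMDocument and DOMXPath. This is only a sketch: the helper name `extract_text`, the sample markup, and the `prices` / `price` identifiers are assumptions for illustration, and the HTML string stands in for whatever cURL returned.

```php
<?php
// Hypothetical helper: return the trimmed text of every node matching an XPath query.
function extract_text(string $html, string $xpath): array
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid; collect parser warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $results = [];
    foreach ((new DOMXPath($doc))->query($xpath) as $node) {
        $results[] = trim($node->textContent);
    }
    return $results;
}

// Stand-in for the HTML body returned by cURL.
$html = '<html><body><div id="nav">Home</div>'
      . '<div id="prices"><span class="price">19.99</span><span class="price">5.00</span></div>'
      . '</body></html>';

// Narrow down to the container first, then take the text of the elements inside it.
$prices = extract_text($html, '//div[@id="prices"]//span[@class="price"]');
print_r($prices);
```

Once the text is isolated this way, a regex on the extracted strings (rather than on the whole HTML document) is a reasonable way to pull out finer details such as numbers or dates.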