How can I extract specific content from a page fetched with cURL? I'm unsure how to begin, and I haven't been able to search for it properly since I don't know exactly what I'm looking for.
+1
A:
I have already answered this under a question entitled "How to display content from one site on another using PHP?"
Oren
2010-06-04 18:05:43
A:
You probably want to use a regex to search for the specific portion of the page that you want to scrape for content.
Zak
2010-06-04 18:09:34
I suggest posting another question if you have trouble creating an exact regex to get the portion of the page you want...
Zak
2010-06-04 18:10:43
-1: No, no, no, no, no, no, no, no, no! **Never scrape HTML using regular expressions.** Use a parser like phpQuery or DOMDocument, but not regular expressions... NO... http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
Andrew Moore
2010-06-04 18:23:21
I didn't say he should use regular expressions to parse HTML. I said he should use a regex to find the *content* he was interested in. I challenge you to use any DOM in the world to narrow down a block of *content* to the few pieces of information you want...
Zak
2010-06-04 18:28:26
You narrow it down to the container the information is in and then parse the text nodes. You do NOT use regex on an HTML document.
Andrew Moore
2010-06-04 18:31:41
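For anyone landing here later, the "narrow it down to the container, then read the text nodes" approach can be sketched roughly as follows. This is only a minimal illustration using Python's standard-library `html.parser` as a stand-in for a parser such as DOMDocument or phpQuery. The sample markup, the `price-box` id, and the names `ContainerText` and `extract_text` are all made up for the example, and the inline string stands in for whatever body cURL returns. A regex is applied only at the very end, to the already-extracted plain text rather than to the HTML.

```python
import re
from html.parser import HTMLParser

# Elements that never have a closing tag; skipped so nesting depth stays correct.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ContainerText(HTMLParser):
    """Collects the text nodes found inside the element with a given id."""

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.depth = 0      # 0 means we are outside the container
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1                     # nested element inside the container
        elif dict(attrs).get("id") == self.container_id:
            self.depth = 1                      # entering the container itself

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1                     # reaches 0 when the container closes

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html, container_id):
    """Return the text inside the container, or '' if the id is not found."""
    parser = ContainerText(container_id)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page body; in practice this would be the response cURL fetched.
SAMPLE_HTML = """
<html><body>
<div id="nav">Home | About</div>
<div id="price-box"><span>Widget</span><br><b>Price: $19.99</b></div>
<div id="footer">Price: $0.00</div>
</body></html>
"""

text = extract_text(SAMPLE_HTML, "price-box")

# Only now, on plain text, pick out the one value of interest.
match = re.search(r"\$(\d+\.\d{2})", text)
price = match.group(1) if match else None
```

Note that the footer also contains something that looks like a price; because the parser narrows to the container first, the regex never sees it.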
I'll take that as an "I'm sorry, go ahead and use a regex to find the specific content you are interested in." Note that the user only stated that curl was returning an HTML document in an edit made after my answer.
Zak
2010-06-04 18:34:54
@Zak: gee, I don't know... Aren't most of the documents on the net in HTML format?
Andrew Moore
2010-06-04 20:12:23