Hello again,

Would anyone perhaps know how to get the value of a specific element in an HTML document with PHP? What I'm doing right now is using file_get_contents to pull up the HTML code from another website, and on that website there is a textarea:

<textarea id="body" name="body" rows="12" cols="75" tabindex="1">Hello World!</textarea>

What I want to do is have my script do the file_get_contents and just pull out the "Hello World!" from the textarea. Is that possible? Sorry for bugging you guys, again, you give such helpful advice :].

+8  A: 

Don't be sorry for bugging us; this is a good question that I'm happy to answer. You can use PHP Simple HTML DOM Parser to get what you need:

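// Requires the PHP Simple HTML DOM Parser library file (adjust the path as needed).
include 'simple_html_dom.php';

$html     = file_get_html('http://www.domain.com/');
// find() returns an array unless an index is given; 0 selects the first match.
$textarea = $html->find('textarea[id=body]', 0);
$contents = $textarea->innertext;

echo $contents; // Outputs 'Hello World!'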

If you want to use file_get_contents(), you can do it like this:

$raw_html = file_get_contents('http://www.domain.com/');
$html     = str_get_html($raw_html);
...

Although I don't see any need for file_get_contents(), since you can use the outertext property to get the original, full HTML back out if you need it somewhere:

$html     = file_get_html('http://www.domain.com/');
$raw_html = $html->outertext;

Just for kicks, you can also do this with a one-line regular expression:

// The s modifier lets . match newlines, so multi-line textarea content is captured too.
preg_match('~<textarea id="body".*?>(.*?)</textarea>~s', file_get_contents('http://www.domain.com/'), $matches);
echo $matches[1]; // Outputs 'Hello World!'

I'd strongly advise against this, though: the pattern depends on the exact markup, so even a small change to the page (for example, the attributes being reordered) can break it.
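
To make that fragility concrete, here is a minimal sketch using a pattern of the same shape as the one above. The two markup strings are made-up samples, not taken from the real page; the second only swaps the attribute order.

```php
<?php
// Pattern of the same shape as the one-liner: it expects id="body" to be
// the first attribute of the textarea tag.
$pattern = '~<textarea id="body".*?>(.*?)</textarea>~';

// Hypothetical sample markup (assumed for illustration).
$original  = '<textarea id="body" name="body">Hello World!</textarea>';
$reordered = '<textarea name="body" id="body">Hello World!</textarea>';

// preg_match() returns 1 on a match and 0 when nothing matches.
$matchedOriginal  = preg_match($pattern, $original, $m1);
$matchedReordered = preg_match($pattern, $reordered, $m2);

var_dump($matchedOriginal);  // int(1)
var_dump($matchedReordered); // int(0)
```

The page still contains the same textarea in both cases, but the regular expression only finds it in the first. A DOM-based lookup by id would not care about attribute order.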

Tatu Ulmanen
I'm not getting any output, could it be the contents of the textbox? (they aren't blank)
Baehr
+1  A: 

I'd suggest using PHP's DOMDocument and DOMXPath classes.

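// loadHTMLFile() should be called on an instance, not statically.
$dom = new DOMDocument();
$dom->loadHTMLFile( $url );
$xpath = new DOMXPath( $dom );
// XPath attribute tests need @ and a quoted value.
$nodes = $xpath->query( '//textarea[@id="body"]' );

$result = array();
foreach( $nodes as $node ) {
    $result[] = $node->textContent;
}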

Here, $result will contain the text of every textarea whose id is body.

Juan
So many good answers! Thank you all very much, you've been more than helpful.
Baehr
When I use this code, I get an error: Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Unexpected end tag : input. Is there a fix for this?
Baehr
That sounds like the HTML you're trying to parse is broken, a common nightmare. You should just go with Tatu's regexp solution.
Juan
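
On the warning mentioned in the comments: one option that may be worth trying before giving up on the DOM approach is to have libxml collect parse errors internally rather than emit them as PHP warnings. The sketch below is only an illustration under assumptions: the $rawHtml string is made-up sample markup standing in for the fetched page, and its stray </input> end tag is meant to mimic the kind of breakage reported.

```php
<?php
// Hypothetical sample markup (assumed); in practice this would come from
// file_get_contents() on the real page.
$rawHtml = '<html><body><form><input type="text" name="t"></input>'
         . '<textarea id="body" name="body" rows="12" cols="75" tabindex="1">Hello World!</textarea>'
         . '</form></body></html>';

// Buffer libxml parse errors instead of raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($rawHtml);

// Discard the buffered errors and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//textarea[@id="body"]');

// Take the first matching textarea, if there is one.
$contents = $nodes->length > 0 ? $nodes->item(0)->textContent : null;

echo $contents, "\n";
```

The parser usually recovers from minor markup errors like this, so the textarea can still be read; whether it copes with a particular broken page has to be checked case by case.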