I would like to insert an iframe immediately after the opening body tag. This is tricky because the body tag can carry various attributes and odd whitespace. My guess is that this will require regular expressions to do correctly.

EDIT: This solution has to work with PHP 4, and performance is a concern of mine. It's for this: http://drupal.org/node/586210#comment-2567398

+4  A: 

You can use DOMDocument and friends. Assuming you have a variable $html containing the existing HTML document as a string, the basic code is:

$doc = new DOMDocument();
$doc->loadHTML($html);
$body = $doc->getElementsByTagName('body')->item(0);
$iframe = $doc->createElement('iframe');
$body->insertBefore($iframe, $body->firstChild);

To retrieve the modified HTML text, use

$html = $doc->saveHTML();
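
For completeness, here is a minimal sketch that wraps the steps above in one function and gives the iframe a src attribute. The function name insert_iframe_dom and the example URL are assumptions made for illustration, and the code needs PHP 5 with the DOM extension. Note that saveHTML() serializes the whole parsed document, so the output can differ from the input in doctype, tag case and whitespace.

```php
<?php
// Hypothetical helper: insert an iframe as the first child of <body>.
// Assumes PHP 5+ with the DOM extension; not usable on PHP 4.
function insert_iframe_dom($html, $src) {
    $doc = new DOMDocument();
    // Real-world markup often triggers libxml warnings, so silence them.
    if (!@$doc->loadHTML($html)) {
        return $html; // parsing failed, hand back the input untouched
    }
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html; // no body element was found
    }
    $iframe = $doc->createElement('iframe');
    $iframe->setAttribute('src', $src);
    // With a null reference node (empty body), insertBefore() appends.
    $body->insertBefore($iframe, $body->firstChild);
    return $doc->saveHTML();
}
```

Calling insert_iframe_dom($html, 'http://example.com/frame') returns the re-serialized document with the iframe placed directly after the opening body tag.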

EDIT: For PHP4, you can try DOM XML.

Matthew Flaschen
mikeytown2
DOM XML is deprecated, I wouldn't advise using that. Drupal has to be backwards-compatible but not to the extent of using deprecated features deliberately. This extension is not even part of standard PHP distro as of 5.0.
Rowlf
Rowlf, PHP 4 is deprecated too. In fact, there have been no releases since August 2008 (not even security fixes). See also http://stackoverflow.com/questions/1734072/official-end-of-support-for-php4 . If you're using a deprecated programming language, it shouldn't be surprising to use deprecated libraries.
Matthew Flaschen
I don't understand your point. Drupal is widely used on any and all versions of PHP from 4 to 5.3. It's only been recently upgraded to be 5.3-compatible, which was a huge step in the right direction. Using a blast-from-the-past DOM extension in one of the core modules is the _wrong_ direction to go in.
Rowlf
My point is simple. PHP 4 is a blast-from-the-past language, so as long as they are forced to support it, why not conditionally use an API that version provides?
Matthew Flaschen
Now I see. It wasn't quite clear from your previous posts that you suggested using both libraries conditionally.
Rowlf
DOM XML is not part of core PHP, so I can't use that either: http://www.php.net/manual/en/domxml.installation.php The whole point of the Boost module is that it works with very bad hosts and makes Drupal fast; thus I'm limited in what I can do.
mikeytown2
+2  A: 

Both PHP 4 and PHP 5 should be happy with preg_split():

/* split the string contained in $html in three parts: 
 * everything before the <body> tag
 * the body tag with any attributes in it
 * everything following the body tag
 */
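/* [^>]* also matches newlines inside the tag; a limit of 2 without
 * PREG_SPLIT_NO_EMPTY keeps the three parts at fixed indexes */
$matches = preg_split('/(<body\b[^>]*>)/i', $html, 2, PREG_SPLIT_DELIM_CAPTURE);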

/* assemble the HTML output back with the iframe code in it */
$injectedHTML = $matches[0] . $matches[1] . $iframeCode . $matches[2];
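
The same idea can also be sketched as a single preg_replace() call limited to one replacement, which avoids reassembling the pieces by hand. The function name insert_after_body_tag is an assumption made for illustration. Like any regex approach, it can be fooled by a ">" inside an attribute value or by "<body" appearing in a comment or script before the real tag.

```php
<?php
// Hypothetical helper: insert $iframeCode right after the first opening <body> tag.
function insert_after_body_tag($html, $iframeCode) {
    // Escape backslashes and "$" so the markup is not parsed as backreferences.
    $safe = str_replace(array('\\', '$'), array('\\\\', '\\$'), $iframeCode);
    // $0 is the matched body tag; [^>]* also spans newlines inside the tag.
    $result = preg_replace('/<body\b[^>]*>/i', '$0' . $safe, $html, 1);
    // Fall back to the untouched input if the regex engine reports an error.
    return $result === null ? $html : $result;
}
```

If no body tag is found, the input comes back unchanged.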
Rowlf
This fails when the body tag is capitalized, which is perfectly valid. Before you add /i, read http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
Matthew Flaschen
@Matthew: I'd rather read and agree with this one, thanks: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1733489#1733489
Rowlf
A: 

Using regular expressions brings up performance concerns... This is what I'm going for

<?php
$html = file_get_contents('http://www.yahoo.com/');
$start = stripos($html, '<body');
$end = stripos($html, '>', $start);
$body = substr_replace($html, '<IFRAME INSERT>', $end+1, 0);
echo htmlentities($body);
?>

Thoughts?
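
One caveat with the snippet above: stripos() only exists from PHP 5 onward, so it does not meet the PHP 4 requirement as written. A rough sketch limited to functions PHP 4 also has could look like the following. The function name insert_after_body_strpos is an assumption for illustration, and lowercasing a copy of the document adds its own cost, so this is not claimed to be faster.

```php
<?php
// Hypothetical helper that sticks to functions available in PHP 4.
function insert_after_body_strpos($html, $iframeCode) {
    // strtolower() works byte by byte, so offsets match the original string.
    $start = strpos(strtolower($html), '<body');
    if ($start === false) {
        return $html; // no body tag, leave the document alone
    }
    $end = strpos($html, '>', $start);
    if ($end === false) {
        return $html; // unterminated tag, leave the document alone
    }
    // A length of 0 makes substr_replace() insert without removing anything.
    return substr_replace($html, $iframeCode, $end + 1, 0);
}
```

Unlike the original snippet, this returns the input unchanged when no body tag is found, rather than inserting after the first ">" in the document.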

mikeytown2
Scratch this, it's slow.
mikeytown2