I originally asked a question along these lines using regex, but was advised to use the PHP DOM library instead. That is the better approach, but I am still stuck.
Basically, I want to wrap the contents of an <a> in a <span> if they are not already wrapped in a <span>.
<?php
$input = <<<EOT
<html><head></head>
<body bgcolor="#393a36">
<a href="#"><span style="color:#ffffff;">Link 1</span></a>
<a href="#">Link 2</a>
<a href="#"><img src="mypic.gif" />Image Link</a>
<a href="#"><u>Underlined Link</u></a>
</body>
</html>
EOT;
$doc = new DOMDocument();
$doc->loadHTML($input);
$tags = $doc->getElementsByTagName('a');
foreach ($tags as $tag) {
    $spancount = $tag->getElementsByTagName("span")->length;
    if ($spancount == 0) {
        $content = nodeContent($tag);
        $element = $doc->createElement('span');
        $element->setAttribute('style', 'color:#ffffff;');
        $frag = $doc->createDocumentFragment();
        $frag->appendXML($content);
        $element->appendChild($frag);
        $tag->nodeValue = ""; // clear node
        $tag->appendChild($element);
    }
}
echo $doc->saveHTML();
function nodeContent($n, $outer = false) {
    $d = new DOMDocument('1.0');
    $d->formatOutput = true;
    $b = $d->importNode($n->cloneNode(true), true);
    $d->appendChild($b);
    $h = $d->saveHTML();
    // remove outer tags
    if (!$outer) $h = substr($h, strpos($h, '>') + 1, -(strlen($n->nodeName) + 4));
    return $h;
}
It produces this output:
PHP Warning: DOMDocumentFragment::appendXML(): Entity: line 1: parser error : Premature end of data in tag img line 1 in /private/var/folders/78/78vHGigZHcuFeXB1KKJSb++++TI/-Tmp-/untitled_3xd..php on line 24
PHP Warning: DOMDocumentFragment::appendXML(): <img src="mypic.gif">Image Link in /private/var/folders/78/78vHGigZHcuFeXB1KKJSb++++TI/-Tmp-/untitled_3xd..php on line 24
PHP Warning: DOMDocumentFragment::appendXML(): ^ in /private/var/folders/78/78vHGigZHcuFeXB1KKJSb++++TI/-Tmp-/untitled_3xd..php on line 24
PHP Warning: DOMNode::appendChild(): Document Fragment is empty in /private/var/folders/78/78vHGigZHcuFeXB1KKJSb++++TI/-Tmp-/untitled_3xd..php on line 25
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head></head>
<body bgcolor="#393a36">
<a href="#"><span style="color:#ffffff;">Link 1</span></a>
<a href="#"><span style="color:#ffffff;">Link 2</span></a>
<a href="#"><span style="color:#ffffff;"></span></a>
<a href="#"><span style="color:#ffffff;"><u>Underlined Link</u></span></a>
</body>
</html>
This mostly works, except that it is really picky: as you can see, it fails if there is an img (or similar) tag inside the <a>.
What is the best way to make this work? I've been banging my head against this for an embarrassingly long time now.
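For what it's worth, here is a minimal sketch of what seems to trigger the warnings above. It assumes the cause is that saveHTML() writes img as an unclosed void element, which appendXML() then rejects because it expects well-formed XML. The variable names are only illustrative, and the saveXML() comparison is not part of the original code.

```php
<?php
// Collect libxml parser errors instead of emitting them as PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML('<html><body><a href="#"><img src="mypic.gif" /></a></body></html>');
$img = $doc->getElementsByTagName('img')->item(0);

// Serialize the same node twice: once as HTML, once as XML.
$asHtml = $doc->saveHTML($img); // HTML serialization leaves the void element unclosed
$asXml  = $doc->saveXML($img);  // XML serialization self-closes the empty element

// appendXML() parses its argument as XML, so only well-formed markup is accepted.
$fragFromHtml = $doc->createDocumentFragment();
$htmlOk = $fragFromHtml->appendXML($asHtml);

$fragFromXml = $doc->createDocumentFragment();
$xmlOk = $fragFromXml->appendXML($asXml);

var_dump($htmlOk, $xmlOk);
```

If that assumption holds, any void element (img, br, hr, and so on) inside the link would hit the same parser error.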
EDIT
Based on feedback below, here is the revised code and its output. Note that the text next to the img tag ("Image Link") is left outside the span for some reason. Any ideas?
$doc = new DOMDocument();
$doc->loadHTML($input);
$tags = $doc->getElementsByTagName('a');
foreach ($tags as $tag) {
    $spancount = $tag->getElementsByTagName("span")->length;
    if ($spancount == 0) {
        $element = $doc->createElement('span');
        $element->setAttribute('style', 'color:#ffffff;');
        foreach ($tag->childNodes as $child) {
            $tag->removeChild($child);
            $element->appendChild($child);
        }
        $tag->appendChild($element);
    }
}
echo $doc->saveHTML();
Output:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head></head>
<body bgcolor="#393a36">
<a href="#"><span style="color:#ffffff;">Link 1</span></a>
<a href="#"><span style="color:#ffffff;">Link 2</span></a>
<a href="#">Image Link<span style="color:#ffffff;"><img src="mypic.gif"></span></a>
<a href="#"><span style="color:#ffffff;"><u>Underlined Link</u></span></a>
</body>
</html>
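One possible explanation, offered as a guess rather than a confirmed diagnosis: childNodes is a live list, so removing a child while iterating over it can cause the next sibling to be skipped. The sketch below avoids iterating the list and instead keeps moving the first child until the link is empty. The helper name wrapChildrenInSpan() is hypothetical and not from the code above.

```php
<?php
// Hypothetical helper: moves every child of $tag into a new <span>, then
// puts that <span> back into $tag as its only child.
function wrapChildrenInSpan(DOMDocument $doc, DOMElement $tag, string $style): void
{
    $span = $doc->createElement('span');
    $span->setAttribute('style', $style);

    // appendChild() moves the node out of $tag, so firstChild advances by
    // itself and the loop stops once $tag has no children left.
    while ($tag->firstChild !== null) {
        $span->appendChild($tag->firstChild);
    }

    $tag->appendChild($span);
}

$doc = new DOMDocument();
$doc->loadHTML(
    '<html><body>'
    . '<a href="#"><img src="mypic.gif" />Image Link</a>'
    . '<a href="#">Link 2</a>'
    . '</body></html>'
);

foreach ($doc->getElementsByTagName('a') as $tag) {
    // Same check as in the question: skip links that already contain a span.
    if ($tag->getElementsByTagName('span')->length === 0) {
        wrapChildrenInSpan($doc, $tag, 'color:#ffffff;');
    }
}

$html = $doc->saveHTML();
echo $html;
```

Moving the nodes directly also sidesteps the serialize-and-reparse step, so img and other void elements never pass through appendXML().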