+1  Q: 

Split HTML files

How would I split an HTML-formatted file into several HTML files (each complete with HTML, HEAD and BODY tags) with PHP? I would have a placeholder tag (something like <div class='placeholder'></div>) at every place I want to cut.

Thanks.

+4  A: 
$sourceHTML = file_get_contents('sourcefile');

$splitContents = explode("<div class='placeholder'></div>", $sourceHTML);

foreach ($splitContents as $html) {
    // save html to file
}

Edit: whoops. As user201140 correctly points out, I missed the fact that each html file has to be a valid document. Since it's not specified exactly what the head tag should contain, I'll assume that the head tag of the combined document should be replicated to each copy. In that case:

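$sourceHTML = file_get_contents('sourcefile');
// Capture (1) everything through the opening <body> tag, (2) the body contents, (3) </body> onward.
// $matches is passed without "&": call-time pass-by-reference is a fatal error since PHP 5.4.
if (!preg_match('/(^.*<body.*?>)(.*)(<\/body.*$)/is', $sourceHTML, $matches)) {
    die('No <body> element found in the source file');
}
$top = $matches[1];
$contents = $matches[2];
$bottom = $matches[3];
$splitContents = explode("<div class='placeholder'></div>", $contents);
foreach ($splitContents as $chunk) {
    // Wrap each chunk in the shared head and closing tags so it is a complete document
    $html = $top.$chunk.$bottom;
    // save html to file
}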
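For the "save html to file" step, here is a minimal sketch of how the whole thing might be wrapped up. The function names splitHtmlDocument and savePages, the output directory and the page-N.html naming scheme are all assumptions made for illustration, not anything from the question. It also returns an empty array when no body tag is found, which makes a regex mismatch easier to spot than an empty output file.

```php
<?php
// Hypothetical helper (name and signature are assumptions for this sketch).
// Returns one complete HTML document per chunk, or [] if no <body> element matched.
function splitHtmlDocument(string $sourceHTML, string $placeholder): array
{
    // (1) everything through the opening <body> tag, (2) body contents, (3) </body> onward
    if (!preg_match('/^(.*<body\b[^>]*>)(.*)(<\/body\b.*)$/is', $sourceHTML, $matches)) {
        return [];
    }
    [, $top, $contents, $bottom] = $matches;

    $pages = [];
    foreach (explode($placeholder, $contents) as $chunk) {
        // Replicate the shared head and closing tags around every chunk
        $pages[] = $top . $chunk . $bottom;
    }
    return $pages;
}

// Hypothetical helper: writes page-1.html, page-2.html, ... into $outputDir
// (the directory must already exist). Returns the number of files written.
function savePages(array $pages, string $outputDir): int
{
    $written = 0;
    foreach ($pages as $i => $html) {
        $path = sprintf('%s/page-%d.html', rtrim($outputDir, '/'), $i + 1);
        if (file_put_contents($path, $html) !== false) {
            $written++;
        }
    }
    return $written;
}
```

Usage would be along the lines of savePages(splitHtmlDocument(file_get_contents('sourcefile'), "<div class='placeholder'></div>"), 'out'). The placeholder has to match the markup in the source file character for character, including the quote style, since explode does a literal string comparison.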
sidereal
What would happen to the HTML, HEAD and BODY tags for each page? Each part needs to be a standalone HTML file.
usertest
Doesn't seem to work, it creates a single empty file.
usertest
I assume you actually implemented the file-saving code? If so, it's tough to know what the problem is. Most likely the regex doesn't match, although I've tried it on a bunch of HTML files and it matches them all. It certainly won't work if the source HTML doesn't have a body tag.
sidereal