Perform the following steps on server-side:
- Retrieve a snapshot of the webpage via HTTP GET
- Save consecutive snapshots of a page with different names for later comparison
- Compare the files with an HTML-aware diff tool (see HtmlDiff tool listing page on ESW wiki).
As a proof-of-concept example with Linux shell, you can perform this comparison as follows:
wget --output-document=snapshot1.html http://example.com/
wget --output-document=snapshot2.html http://example.com/
diff snapshot1.html snapshot2.html
You can of course wrap up these commands into a server-side program or a script.
For PHP, I would suggest you to take a look at daisydiff-php. It readily provides a PHP class that enables you to easily create an HTML-aware diff tool. Example:
<?
require_once('HTMLDiff.php');
$file1 = file_get_contents('snapshot1.html');
$file2 = file_get_contents('snapshot1.html');
HTMLDiffer->htmlDiffer( $file1, $file2 );
?>
Note that with file_get_contents
, you can also retrieve data from a given URL as well.
Note that DaisyDiff itself is very fine tool for visualisation of structural changes as well.