I am trying to automate the sitemap.xml file on my site, since the content changes constantly. I currently open the file for appending with fopen($file_name, 'a'); so that I can add the new set of tags. However, I just noticed that the whole sitemap has to end with a closing </urlset> tag, which means that each time I open the file I need to insert the new text not at the end of the file, but one line before the end.

So basically, how can I move the file pointer up after opening the file for appending so that I can achieve this? Thanks.

Update: here is what the sitemap looks like:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
 <url>
  <loc>...</loc>
  <lastmod>2009-08-23</lastmod>
  <changefreq>weekly</changefreq>
  <priority>0.8</priority>
 </url>
</urlset>

So whenever I append, I need to add the <url>..</url> part, which must go right before the closing </urlset> tag. I already have code that can append the XML to the end of the file. I just need to figure out how to insert the new portion right before the closing tag.

+3  A: 

Without seeing the XML you are talking about, and without knowing what you are trying to add (please provide these for a full coded up answer) may I suggest this approach...

  1. Load the entire file using the PHP XML parser (http://uk3.php.net/manual/en/ref.xml.php)
  2. Add a new element into XML
  3. Save using the fopen() and fwrite() functions (I'm guessing you're doing this bit anyway)

As I say, without seeing the XML or some more code, it's very hard to provide an answer.
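
For the sitemap format shown in the question, the three steps above might look roughly like this. It is only a sketch: it uses PHP's DOM extension rather than the lower-level XML parser linked above, it saves with DOMDocument::save() instead of fopen()/fwrite(), and the function name addSitemapUrl is made up for illustration.

```php
<?php
// Hypothetical helper (not from the original answer): load, add, save.
function addSitemapUrl(string $file, string $loc, string $lastmod): void
{
    // 1. Load the entire file.
    $doc = new DOMDocument();
    $doc->preserveWhiteSpace = false; // lets formatOutput re-indent the file
    $doc->formatOutput = true;
    if (!$doc->load($file)) {
        throw new RuntimeException("Could not parse $file");
    }

    // 2. Add a new <url> element in the same namespace as the root.
    $root = $doc->documentElement;
    $url  = $doc->createElementNS($root->namespaceURI, 'url');
    foreach (['loc' => $loc, 'lastmod' => $lastmod] as $name => $value) {
        $child = $doc->createElementNS($root->namespaceURI, $name);
        $child->appendChild($doc->createTextNode($value)); // text nodes are escaped on save
        $url->appendChild($child);
    }
    $root->appendChild($url);

    // 3. Save the document back to disk.
    if ($doc->save($file) === false) {
        throw new RuntimeException("Could not write $file");
    }
}
```

Because the new element is appended to the root, it always lands before the closing </urlset> tag, whatever whitespace the file happens to end with.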

neilc
A: 

Keep two versions of the file: the sitemap and a tmp one without the closing tag. When you want to extend, first extend the tmp one; then copy it over to sitemap, and add the closing tag there.
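
A rough sketch of that idea, assuming the tmp file already holds the XML declaration and the opening <urlset ...> tag. The function name and file names are placeholders, and writing to a staging file before renaming is a small variation so that readers never see a half-written sitemap.

```php
<?php
// Hypothetical helper. $bodyFile holds everything except the closing tag.
function extendSitemap(string $bodyFile, string $sitemapFile, string $urlXml): void
{
    // 1. Extend the working copy that has no closing tag.
    file_put_contents($bodyFile, $urlXml, FILE_APPEND | LOCK_EX);

    // 2. Copy it next to the live file and close the root element there.
    $staging = $sitemapFile . '.new';
    copy($bodyFile, $staging);
    file_put_contents($staging, "</urlset>\n", FILE_APPEND);

    // 3. Swap the finished copy into place.
    rename($staging, $sitemapFile);
}
```

Usage would be something like extendSitemap('sitemap.body', 'sitemap.xml', $xml), where $xml is the <url>...</url> block you already generate.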

Zed
A: 

fseek($fp, -n, SEEK_END); will do it, but you must open the file as 'r+' and not 'a': in append mode every write goes to the end of the file, regardless of where you seek.

It's not generally a good idea to be processing XML like this; relying on exact byte positions is very fragile. Better would be to open the file in an XML parser, add the elements you want, serialise it to a new file and swap them over (so that nothing reads the XML in the middle of you writing it).

On a database-backed site, you could also consider generating your sitemap XML dynamically using PHP itself.
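
Dynamic generation could look something like the sketch below, using XMLWriter. The $pages array is a hypothetical stand-in for rows fetched from your database, and buildSitemap is an invented name.

```php
<?php
// $pages is a hypothetical stand-in for rows fetched from the site's database.
function buildSitemap(array $pages): string
{
    $w = new XMLWriter();
    $w->openMemory();
    $w->setIndent(true);
    $w->startDocument('1.0', 'UTF-8');
    $w->startElement('urlset');
    $w->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');
    foreach ($pages as $page) {
        $w->startElement('url');
        $w->writeElement('loc', $page['loc']);         // text is escaped for us
        $w->writeElement('lastmod', $page['lastmod']);
        $w->endElement(); // </url>
    }
    $w->endElement(); // </urlset>
    $w->endDocument();
    return $w->outputMemory();
}

// In a script served at the sitemap URL, one might then do:
// header('Content-Type: application/xml; charset=UTF-8');
// echo buildSitemap($pages);
```

Since the whole document is rebuilt on each request, there is no closing tag to work around at all.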

bobince
Given the update that I posted, can you foresee any issues with using your suggested approach? If I am adding XML of the same format every time, shouldn't it be okay to just use fseek as you suggested?
Well you can certainly do it, but you'd be relying on the length of the closing ‘</urlset>’ tag. If, say, an extra newline or spaces were added after the tag, your assumptions about where you had to insert the new content would be wrong. This makes your program very fragile.
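
One way to reduce that fragility while still avoiding a full parse is to search the tail of the file for the closing tag instead of assuming a fixed byte offset. A minimal sketch, assuming the tag sits within the last 1024 bytes of the file; insertBeforeClosingTag and its parameters are made-up names.

```php
<?php
// Hypothetical helper: inserts $xml immediately before the last $closing tag.
function insertBeforeClosingTag(string $file, string $xml, string $closing = '</urlset>', int $tailSize = 1024): bool
{
    $fp = fopen($file, 'r+'); // 'r+' so that writes honour the seek position
    if ($fp === false) {
        return false;
    }

    // Read only the tail of the file and look for the closing tag in it.
    $size  = fstat($fp)['size'];
    $start = max(0, $size - $tailSize);
    fseek($fp, $start);
    $tail = stream_get_contents($fp);
    $pos  = strrpos($tail, $closing);
    if ($pos === false) {
        fclose($fp);
        return false; // tag not found; leave the file untouched
    }

    // Overwrite from the tag onwards: new entry, then the tag again.
    fseek($fp, $start + $pos);
    fwrite($fp, $xml . $closing . "\n");
    ftruncate($fp, ftell($fp)); // drop any leftover trailing bytes
    fclose($fp);
    return true;
}
```

Extra newlines or spaces after the tag no longer matter, although this is still text manipulation and will not notice malformed XML.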
bobince
A: 

Use PHP's fseek() to seek to the end of the file (get the size from fstat() on the handle, or filesize() on the path), then step backwards one line. Read the last line and store it temporarily, overwrite the last line with what you want to insert, then append the line you stored.

To step backwards one line, combine fseek() with fgetc():

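$stat = fstat($fhandle);                  // filesize() expects a path, not a handle
$offset = max($stat['size'] - 2, -1);     // skip the file's final character (usually a trailing newline)
while ($offset >= 0) {
    fseek($fhandle, $offset);
    if (fgetc($fhandle) === "\n") break;  // "\n" must be double-quoted to mean a newline
    $offset--;
}
$offset++;                                // first byte of the last line
fseek($fhandle, $offset);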

Your internal file pointer should now be at the start of the last line. Of course you'll have to deal with corner cases, such as a file with only one line, but I'll let you figure out the details ;)

Now store the last line in a tmp variable:

$lastline = fgets($fhandle);
fseek($fhandle, $offset); //go back to where the last line began

Insert your line, and add the last line back to the file:

fwrite($fhandle, $myLine);
fwrite($fhandle, $lastline);
Charles Ma
could you possibly provide code for this? specifically I am unsure of how to iterate backwards one line...thanks
yep, added some [untested] code. btw, this is an arcane C style of doing things. The only reason you would do this over using an xml parser is if you have very large files and don't want to read the entire file into an array before writing it out again to disk. :P
Charles Ma
A: 

Adding to Charles Ma's answer. You can save this in a file called sitemapper.php, and call this file with a GET query, although I would advise you to add more security, and flock() if you might have concurrent writes.

Another reason to use this would be if you are using PHP4, which doesn't have SimpleXML.

<?php
/*--------------------------------------------------------
==================
sitemapper.php 1.0
==================
Pass me a path to a page, and I will add it to your XML sitemap.

Paths are passed from root, as 
www.example.com/sitemapper.php?path=/test/index.html
for a page at http://www.example.com/test/index.html

This script is faster than parsing XML, for large sitemaps.
--------------------------------------------------------*/

if (isset($_GET['path'])) {

    // Get the path to the new page to add to our sitemap.
    $fname = urldecode($_GET['path']);

    // Real path to files is different on some hosts.
    $current_path = realpath(dirname(__FILE__));

    if (!is_file("./sitemap.xml")){

        $xml = '';
        $xml =  "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"."\n";
        $xml .= "<urlset xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:schemaLocation=\"http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd\" xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">"."\n";
        $xml .= "</urlset>"."\n";

        $sitemap = fopen("./sitemap.xml", "x");             
        fwrite($sitemap, $xml); 
        fclose($sitemap);
    }

    // Write sitemap.xml entry for current file
    // This is a very old-school 'C' style of doing it.
    // The modern way would be to open the file in an XML parser, 
    // add the elements you want, serialise it to a new file 
    // and swap them over (so that nothing reads the XML in 
    // the middle of you writing it)
    //
    // However I expect this XML file to become *huge* shortly
    // So I am avoiding use of the PHP XML Parser.

    $date = date('Y-m-d\TH:i:sP', time()); //Date in w3c format for XML sitemaps ('T' must be escaped, or date() prints the timezone abbreviation)

    $xml = '';
    $xml .= "<url>"."\n";
    $xml .= "   <loc>http://". $_SERVER["HTTP_HOST"] . htmlspecialchars($fname) . "</loc>"."\n"; // escape &, < and > so the XML stays well-formed
    $xml .= "   <lastmod>" . $date . "</lastmod>"."\n";
    $xml .= "   <priority>0.50</priority>"."\n";
    $xml .= "   <changefreq>never</changefreq>"."\n";
    $xml .= "</url>"."\n";            

    if ($sitemap = @fopen("./sitemap.xml", "r+")) 
    {

        // seek to the end of the file, then iterate backwards one line. 
        // read the last line and store it temporarily. overwrite the 
        // last line with what you want to insert, then append the 
        // temporary line you stored previously.

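        $offset = filesize("./sitemap.xml") - 1; // index of the final character (the trailing newline)
        $limit = max($offset - 2000, 0);         // never scan back more than 2000 bytes

        fseek($sitemap, ($offset - 1)); //seek to before the last character in the file
        while( (($char = fgetc($sitemap)) !== "\n") && ($offset > $limit)) {
            // Go backwards, trying to find a line-break.
            // The limit is just a sanity check if something goes wrong.
            // (Comparing $offset itself against 2000 would stop the loop
            // immediately once the sitemap grows past 2000 bytes.)
            $offset = $offset - 1;
            fseek($sitemap, $offset);
        }  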

        $offset = $offset + 1; // Come to the beginning of the next line
        fseek($sitemap, $offset);
        $lastline = fgets($sitemap); // Copy the next line into a variable
        fseek($sitemap, $offset); //go back to where the last line began

        fwrite($sitemap, $xml); // add the current entry
        fwrite($sitemap, $lastline); // add the </urlset> line back.

        fclose($sitemap);
    }
}
?>
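
For the flock() suggestion mentioned at the top of this answer, a wrapper along these lines could serialise concurrent writers. This is a sketch only: withExclusiveLock is an invented name, and flock() is advisory, so it only helps if every writer goes through the same lock.

```php
<?php
// Hypothetical wrapper: runs $edit($fp) while holding an exclusive lock on $file.
function withExclusiveLock(string $file, callable $edit)
{
    $fp = fopen($file, 'r+');
    if ($fp === false) {
        throw new RuntimeException("Could not open $file");
    }
    try {
        if (!flock($fp, LOCK_EX)) { // blocks until other lock holders release
            throw new RuntimeException("Could not lock $file");
        }
        return $edit($fp);
    } finally {
        fflush($fp);          // push buffered writes out before unlocking
        flock($fp, LOCK_UN);
        fclose($fp);
    }
}
```

The seek-and-rewrite code above would then go inside the callback, operating on the $fp it is handed instead of opening the file itself.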
Pranab