Users can upload files to the server, which are stored effectively forever.

Does anyone have an idea for tracking orphaned files? A few of my ideas involve logging every upload, but the files are usually referenced from HTML, which isn't easy to track.

Files can sit unused but still be referenced. I could do a full-text search for these, but that's pretty brute force.
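For what it's worth, the brute-force search might look roughly like the sketch below. It assumes the pages live on disk as `.html`/`.htm` files and that uploads sit in one flat folder; the function name `find_unreferenced` and both directory arguments are made up for illustration. It matches on bare file names, so a name that happens to be a substring of another referenced name will be treated as referenced (it errs on the side of keeping files).

```python
from pathlib import Path


def find_unreferenced(upload_dir, html_dir, patterns=("*.html", "*.htm")):
    """Return sorted names of files in upload_dir that no page under html_dir mentions.

    Hypothetical helper: plain substring matching on file names, nothing smarter.
    """
    # Start by treating every uploaded file as a possible orphan.
    candidates = {p.name for p in Path(upload_dir).iterdir() if p.is_file()}

    for pattern in patterns:
        for page in Path(html_dir).rglob(pattern):
            text = page.read_text(encoding="utf-8", errors="ignore")
            # Drop every candidate whose name appears anywhere in this page.
            candidates = {name for name in candidates if name not in text}
            if not candidates:
                return []  # everything is referenced, no need to keep scanning

    return sorted(candidates)
```

If the HTML is stored in a database rather than on disk, the same idea would apply, but the loop over pages would have to read rows instead of files.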

Do I just give up and let them grow old?

+2  A: 

I don't know your situation, but what I have done in the past is move all the old files (images) to a separate backup folder outside the images folder and then use Xenu to check the links in all of my HTML pages. At the end of the link verification, Xenu returned a list of 404s. I then wrote a script that used the list of 404s to move those files from the backup location back into the images folder.

This worked great. I still monitored the log files for a couple of weeks, though, just in case I missed something.
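The restore step could be sketched along these lines. This is not the original script, just a minimal Python illustration: it assumes the 404 list is available as plain URLs, that the backup and images folders are flat, and that the last path segment of each URL is the file name. The name `restore_missing` and its arguments are hypothetical.

```python
import shutil
from pathlib import Path
from urllib.parse import unquote, urlparse


def restore_missing(broken_urls, backup_dir, images_dir):
    """Move files named in the 404 list from backup_dir back into images_dir.

    Returns the names that were actually restored. Hypothetical helper.
    """
    backup, images = Path(backup_dir), Path(images_dir)
    restored = []
    for url in broken_urls:
        # Take the last path segment of the URL as the file name.
        name = Path(unquote(urlparse(url.strip()).path)).name
        if not name:
            continue  # URL had no file component (e.g. a bare directory)
        source = backup / name
        if source.is_file():
            shutil.move(str(source), str(images / name))
            restored.append(name)
    return restored
```

Whatever is still in the backup folder after this runs was not requested by any crawled page, so those files are the orphan candidates.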

Xenu, BTW, is a free app that finds broken links: you give it a starting page, and it follows the links from that page to crawl your whole site. It needs additional starting pages if the pages that link to these files aren't otherwise reachable during the crawl.

Eric
Cheers for the Xenu info. Just spent 30 mins reading about Scientology.
burnt_hand
Ah, I meant to add a link to Xenu. Seems you found it. :)
Eric