I am working on a number of websites with files dating back to 2000. These sites have grown organically over time, resulting in large numbers of orphaned web pages, include files, images, CSS files, JavaScript files, etc. These orphaned files cause a number of problems, including poor maintainability, possible security holes, and a poor customer experience, not to mention driving OCD/GTD freaks like me crazy.
These files number in the thousands, so a completely manual solution is not feasible. Ultimately, the cleanup process will require a fairly large QA effort to ensure we have not inadvertently deleted needed files, but I am hoping to develop a technological solution to help speed up the manual effort. Additionally, I hope to put processes/utilities in place to help prevent this state of disorganization from happening again.
Environment Considerations:
- Classic ASP and .NET
- Windows servers running IIS 6 and IIS 7
- Multiple environments (Dev, Integration, QA, Stage, Production)
- TFS for source control
Before I start, I would like to get some feedback from others who have successfully navigated a similar process.
Specifically I am looking for:
- Process for identifying and cleaning up orphaned files
- Process for keeping environments clean from orphaned files
- Utilities that help identify orphaned files
- Utilities that help identify broken links (once files have been removed)
I am not looking for:
- Solutions to my organizational OCD...I like how I am.
- Snide comments about us still using classic ASP. I already feel the pain. There is no need to rub it in.