Question: I have an ASP.NET application which creates temporary PDF files (for the user to download). Now, many users over many days can create many PDFs, which take much disk space.

What's the best way to schedule deletion of files older than 1 day or 8 hours? Preferably in the ASP.NET application itself.

+1  A: 

Try using Path.GetTempPath(). It will give you the path to the Windows temp folder. Then it will be up to Windows to clean up :)

You can read more about the method here http://msdn.microsoft.com/en-us/library/system.io.path.gettemppath.aspx

Oskar Kjellin
Would you need to give asp.net worker process/app pool permission to write to temp?
James Westgate
@James Not sure, perhaps
Oskar Kjellin
@James: Yes, definitely (Vista+). Moreover, the ASP.NET user would need read access to temp to download the file, and file enumeration permission to get a list of the files in the directory.
Quandary
PS: Windows will most likely only clean up the temp folder on restart. If the server doesn't crash or update often, which I hope it doesn't, that would not be sufficient.
Quandary
A: 

How do you store the files? If possible, you could go with a simple solution where all files are stored in a folder named after the current date and time.
Then create a simple page or HTTP handler that deletes old folders. You could call this page at intervals using a Windows scheduled task or another cron-like job.
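
A minimal sketch of the dated-folder idea, assuming one folder per hour. The class name `DatedFolders`, its methods and the `yyyyMMdd-HH` naming scheme are hypothetical, not something the answer specifies. The page or HTTP handler would call `DeleteOlderThan` on each request it receives from the scheduler.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class DatedFolders
{
    // Hypothetical naming scheme: one folder per hour, e.g. "20240131-14".
    private const string Format = "yyyyMMdd-HH";

    // Folder that newly generated files should be written to.
    public static string FolderFor(string root, DateTime utcNow)
    {
        return Path.Combine(root, utcNow.ToString(Format, CultureInfo.InvariantCulture));
    }

    // Deletes every dated folder whose whole hour lies more than maxAge in the past.
    // Returns the number of folders deleted.
    public static int DeleteOlderThan(string root, TimeSpan maxAge, DateTime utcNow)
    {
        int deleted = 0;
        DateTime cutoff = utcNow - maxAge;
        foreach (string dir in Directory.GetDirectories(root))
        {
            DateTime stamp;
            string name = Path.GetFileName(dir);
            // Folders that do not match the naming scheme are left alone.
            if (!DateTime.TryParseExact(name, Format, CultureInfo.InvariantCulture,
                                        DateTimeStyles.None, out stamp))
            {
                continue;
            }
            // The folder covers [stamp, stamp + 1h), so compare the end of that hour.
            if (stamp.AddHours(1) <= cutoff)
            {
                try
                {
                    Directory.Delete(dir, true); // true = delete contents as well
                    deleted++;
                }
                catch (IOException) { /* a file is in use; retry on the next run */ }
                catch (UnauthorizedAccessException) { /* no permission; skip */ }
            }
        }
        return deleted;
    }
}
```

With this layout the cleanup never has to look at individual files, at the cost of keeping some files up to an hour longer than strictly necessary.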

Jakob Gade
It's not necessary to name the files after current date and time. The filename is a guid.tostring + ".pdf". You can read the file creation/modification date from file attributes. Yes, they are all in the same folder.
Quandary
A: 

Not really an answer to your question, but two suggestions to avoid cleanups in fixed intervals:

In one solution I store temporary files in temporary folders named according to the session ID. I iterate over the folders, checking whether the corresponding session is still alive and deleting the folder otherwise.

In another solution I avoid creating temporary files altogether, returning generated PDFs on the fly (as attached files).

Dirk
+2  A: 

For each temporary file that you need to create, make a note of the filename in the session:

// create temporary file:
string fileName = System.IO.Path.GetTempFileName();
Session[string.Concat("temporaryFile", Guid.NewGuid().ToString("d"))] = fileName;
// TODO: write to file

Next, add the following cleanup code to global.asax:

<%@ Application Language="C#" %>
<script RunAt="server">
    void Session_End(object sender, EventArgs e) {
        // Code that runs when a session ends. 
        // Note: The Session_End event is raised only when the sessionstate mode
        // is set to InProc in the Web.config file. If session mode is set to StateServer 
        // or SQLServer, the event is not raised.

        // remove files that have been uploaded, but not actively 'saved' or 'canceled' by the user
        foreach (string key in Session.Keys) {
            if (key.StartsWith("temporaryFile", StringComparison.OrdinalIgnoreCase)) {
                try {
                    // Only read the entry here: assigning to Session[key] while
                    // enumerating Session.Keys would invalidate the enumerator.
                    string fileName = (string)Session[key];
                    if ((fileName.Length > 0) && (System.IO.File.Exists(fileName))) {
                        System.IO.File.Delete(fileName);
                    }
                } catch (Exception) { }
            }
        }

    }       
</script>
Fredrik Johansson
I think it's easier to create a directory based on the user's session ID in a temp directory, or in another directory the ASP.NET user has write rights to. When the session ends, simply remove the user's whole directory based on the session ID.
Joop
I'll go for the sessionid folder.
Quandary
A: 

Create a timer in Application_Start and schedule it to call a method every hour that flushes the files older than 8 hours, 1 day, or whatever duration you need.
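
A minimal sketch of that timer approach, assuming the PDFs all sit in one folder and that a `System.Threading.Timer` is acceptable. `TempFileCleaner`, `DeleteOlderThan` and `Start` are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using System.Threading;

public static class TempFileCleaner
{
    // Kept in a static field so the timer is not garbage-collected.
    private static Timer _timer;

    // Deletes *.pdf files in 'directory' that were last written more than 'maxAge' ago.
    // Returns the number of files deleted.
    public static int DeleteOlderThan(string directory, TimeSpan maxAge)
    {
        if (!Directory.Exists(directory)) return 0;

        int deleted = 0;
        DateTime cutoff = DateTime.UtcNow - maxAge;
        foreach (string path in Directory.GetFiles(directory, "*.pdf"))
        {
            try
            {
                if (File.GetLastWriteTimeUtc(path) < cutoff)
                {
                    File.Delete(path);
                    deleted++;
                }
            }
            catch (IOException) { /* file in use; retry on the next run */ }
            catch (UnauthorizedAccessException) { /* no permission; skip */ }
        }
        return deleted;
    }

    // Call once, e.g. from Application_Start in global.asax.
    // The first run happens immediately, later runs every 'interval'.
    public static void Start(string directory, TimeSpan maxAge, TimeSpan interval)
    {
        _timer = new Timer(delegate { DeleteOlderThan(directory, maxAge); },
                           null, TimeSpan.Zero, interval);
    }
}
```

In Application_Start this could be wired up as `TempFileCleaner.Start(pdfFolder, TimeSpan.FromHours(8), TimeSpan.FromHours(1));`, where `pdfFolder` is whatever folder the application writes to. The timer only runs while the application is loaded, so after a recycle or an idle shutdown the cleanup resumes on the next start.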

this. __curious_geek
A: 

I sort of agree with what's said in the answer by Dirk.

The idea is that the temp folder you drop the files into is a fixed, known location. However, I differ slightly:

  1. Each time a file is created, add the filename to a list in the session object (assuming there aren't thousands of them; if there are, run the next step whenever the list hits a given cap).

  2. When the session ends, the Session_End event should be raised in global.asax. Iterate over all the files in the list and remove them.
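
The two steps above can be sketched as a small list object kept in the session. `TempFileList` and the `"tempFiles"` session key are hypothetical names, and the sketch assumes InProc session state so that Session_End is actually raised.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

[Serializable]
public class TempFileList
{
    private readonly List<string> _paths = new List<string>();

    public int Count { get { return _paths.Count; } }

    // Step 1: remember every temporary file created for this session.
    public void Add(string path)
    {
        _paths.Add(path);
    }

    // Step 2: delete everything that was recorded and empty the list.
    // Returns the number of files deleted.
    public int DeleteAll()
    {
        int deleted = 0;
        foreach (string path in _paths)
        {
            try
            {
                if (File.Exists(path))
                {
                    File.Delete(path);
                    deleted++;
                }
            }
            catch (IOException) { /* file in use; nothing more we can do here */ }
            catch (UnauthorizedAccessException) { /* no permission; skip */ }
        }
        _paths.Clear();
        return deleted;
    }
}

// Assumed wiring in global.asax (not part of the class itself):
//   Session_Start: Session["tempFiles"] = new TempFileList();
//   Session_End:   ((TempFileList)Session["tempFiles"]).DeleteAll();
```

Calling `DeleteAll()` early when `Count` reaches a chosen cap covers the case where a single session produces a very large number of files.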

Wardy
A: 
    private const string TEMPDIRPATH = @"C:\mytempdir\";
    private const int DELETEAFTERHOURS = 8;

    private void cleanTempDir()
    {
        foreach (string filePath in Directory.GetFiles(TEMPDIRPATH))
        {
            FileInfo fi = new FileInfo(filePath);
            if (fi.LastWriteTime > DateTime.Now.AddHours(-DELETEAFTERHOURS)) // modified less than x hours ago? then continue to the next file
            {
                continue;
            }

            try
            {
                File.Delete(filePath);
            }
            catch (Exception)
            {
                //something happened and the file probably isn't deleted. the next time give it another shot
            }
        }
    }

The code above removes the files in the temp directory that were last modified more than 8 hours ago.

However, I would suggest another approach. As Fredrik Johansson suggested, you can delete the files created by the user when the session ends. Even better is to work with an extra directory, based on the user's session ID, inside your temp directory. When the session ends, you simply delete the directory created for the user.

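    private const string TEMPDIRPATH = @"C:\mytempdir\";
    // Per-user directory named after the session ID, as described above.
    // Inside Session_End, HttpContext.Current is null; use Session.SessionID there instead.
    string tempDirUserPath = Path.Combine(TEMPDIRPATH, HttpContext.Current.Session.SessionID);
    private void removeTempDirUser(string path)
    {
        try
        {
            // true = recursive; without it the call fails when the directory still contains files
            Directory.Delete(path, true);
        }
        catch (Exception)
        {
            //an exception occurred while deleting the directory.
        }
    }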
Joop
A: 

The best way is to create a batch file that is called by the Windows Task Scheduler at the interval you want.

OR

you can create a Windows service with the class below:

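public class CleanUpBot {

    // Placeholder interval between two cleanup runs; pick your own value.
    private const int TIME_IN_MILLISECOND = 60 * 60 * 1000;

    // Set to true before calling Run() to keep cleaning up in a loop.
    public volatile bool KeepAlive;

    private System.Threading.Thread _cleanUpThread;

    public void Run() {
        _cleanUpThread = new System.Threading.Thread(StartCleanUp);
        _cleanUpThread.IsBackground = true; // do not block process shutdown
        _cleanUpThread.Start();
    }

    private void StartCleanUp() {
        do
        {
            // HERE THE LOGIC FOR DELETE FILES

            // Wait before the next run.
            System.Threading.Thread.Sleep(TIME_IN_MILLISECOND);
        } while (KeepAlive);
    }

}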

Note that you can also call this class from Page_Load, and it won't affect the request's processing time because the work is done in another thread. Just remove the do-while loop and the wait call.

Jean-Christophe Fortin
These are exactly the things I try to avoid. First, the Windows scheduler is never a good idea (in a very permission-locked-down environment, such as at our customers, it never works). Second, a Windows service requires an installer. I've done one once, and honestly, the experience I gathered is exactly why I don't want to do it again.
Quandary
If you're not familiar with Windows service installers, couldn't you just call the CleanUpBot class from Page_Load? On the first line put something like CleanUp(); and in this method: private void CleanUp(){DateTime lastCleanUp;try{lastCleanUp = DateTime.Parse(Application["LastCleanUp"].ToString());}catch (Exception){lastCleanUp = DateTime.Now;Application["LastCleanUp"] = lastCleanUp;}new CleanUpBot().Run(lastCleanUp);}
Jean-Christophe Fortin