I've got a web application that controls which web applications get served traffic from our load balancer. The web application runs on each individual server.
It keeps track of the "in or out" state for each application in an object in the ASP.NET application state, and the object is serialized to a file on the disk whenever the state is changed. The state is deserialized from the file when the web application starts.
While the site itself only gets a couple requests a second tops, and the file it rarely accessed, I've found that it was extremely easy for some reason to get collisions while attempting to read from or write to the file. This mechanism needs to be extremely reliable, because we have an automated system that regularly does rolling deployments to the server.
Before anyone makes any comments questioning the prudence of any of the above, allow me to simply say that explaining the reasoning behind it would make this post much longer than it already is, so I'd like to avoid moving mountains.
That said, the code that I use to control access to the file looks like this:
internal static Mutex _lock = null;
/// <summary>Executes the specified <see cref="Func{FileStream, Object}" /> delegate on the filesystem copy of the <see cref="ServerState" />.
/// The work done on the file is wrapped in a lock statement to ensure there are no locking collisions caused by attempting to save and load
/// the file simultaneously from separate requests.
/// </summary>
/// <param name="action">The logic to be executed on the <see cref="ServerState" /> file.</param>
/// <returns>An object containing any result data returned by <param name="func" />.</returns>
private static Boolean InvokeOnFile(Func<FileStream, Object> func, out Object result)
{
var l = new Logger();
if (ServerState._lock.WaitOne(1500, false))
{
l.LogInformation("Got lock to read/write file-based server state.", (Int32)VipEvent.GotStateLock);
var fileStream = File.Open(ServerState.PATH, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None);
result = func.Invoke(fileStream);
fileStream.Close();
fileStream.Dispose();
fileStream = null;
ServerState._lock.ReleaseMutex();
l.LogInformation("Released state file lock.", (Int32)VipEvent.ReleasedStateLock);
return true;
}
else
{
l.LogWarning("Could not get a lock to access the file-based server state.", (Int32)VipEvent.CouldNotGetStateLock);
result = null;
return false;
}
}
This usually works, but occasionally I cannot get access to the mutex (I see the "Could not get a lock" event in the log). I cannot reproduce this locally - it only happens on my production servers (Win Server 2k3/IIS 6). If I remove the timeout, the application hangs indefinitely (race condition??), including on subsequent requests.
When I do get the errors, looking at the event log tells me that the mutex lock was achieved and released by the previous request before the error was logged.
The mutex is instantiated in the Application_Start event. I get the same results when it is instantiated statically in the declaration.
Excuses, excuses: threading/locking is not my forté, as I generally don't have to worry about it.
Any suggestions as to why it randomly would fail to get a signal?
Update:
I've added proper error handling (how embarrassing!), but I am still getting the same errors - and for the record, unhandled exceptions were never the problem.
Only one process would ever be accessing the file - I don't use a web garden for this application's web pool, and no other applications use the file. The only exception I can think of would be when the app pool recycles, and the old WP is still open when the new one is created - but I can tell from watching the task manager that the issue occurs while there is only one worker process.
@mmr: How is using Monitor any different from using a Mutex? Based on the MSDN documentation, it looks like it is effectively doing the same thing - if and I can't get the lock with my Mutex, it does fail gracefully by just returning false.
Another thing to note: The issues I'm having seem to be completely random - if it fails on one request, it might work fine on the next. There doesn't seem to be a pattern, either (certainly no every other, at least).
Update 2:
This lock is not used for any other call. The only time _lock is referenced outside the InvokeOnFile method is when it is instantiated.
The Func that is invoked is either reading from the file and deserializing into an object, or serializing an object and writing it to the file. Neither operation is done on a separate thread.
ServerState.PATH is a static readonly field, which I don't expect would cause any concurrency problems.
I'd also like to re-iterate my earlier point that I cannot reproduce this locally (in Cassini).
Lessons learned:
- Use proper error handling (duh!)
- Use the right tool for the job (and have a basic understanding of what/how that tool does). As sambo points out, using a Mutex apparently has a lot of overhead, which was causing issues in my application, whereas Monitor is designed specifically for .NET.