My application traverses a directory tree and in each directory it tries to open a file with a particular name (using File.OpenRead()
). If this call throws FileNotFoundException
then it knows that the file does not exist. Would I rather have a File.Exists()
call before that to check if file exists? Would this be more efficient?
views:
618answers:
12Update
I ran these two methods in a loop and timed each:
void throwException()
{
try
{
throw new NotImplementedException();
}
catch
{
}
}
void fileOpen()
{
string filename = string.Format("does_not_exist_{0}.txt", random.Next());
try
{
File.Open(filename, FileMode.Open);
}
catch
{
}
}
void fileExists()
{
string filename = string.Format("does_not_exist_{0}.txt", random.Next());
File.Exists(filename);
}
Random random = new Random();
These are the results without the debugger attached and running a release build :
Method Iterations per second throwException 10100 fileOpen 2200 fileExists 11300
The cost of a throwing an exception is a lot higher than I was expecting, and calling FileOpen on a file that doesn't exist seems much slower than checking the existence of a file that doesn't exist.
In the case where the file will often not be present it appears to be faster to check if the file exists. I would imagine that in the opposite case - when the file is usually present you will find it is faster to catch the exception. If performance is critical to your application I suggest that you benchmark both apporaches on realistic data.
As mentioned in other answers, remember that even in you check for existence of the file before opening it you should be careful of the race condition if someone deletes the file after your existence check but just before you open it. You still need to handle the exception.
I would say that, generally speaking, exceptions "increase" the overall "performance" of your system!
In your sample, anyway, it is better to use File.Exists...
Is this behavior truly exceptional? If it is expected, you should be testing with an if statement, and not using exceptions at all. Performance isn't the only issue with this solution and from the sound of what you are trying to do, performance should not be an issue. Therefore, style and a good approach should be the items of concern with this solution.
So, to summarize, since you expect some tests to fail, do use the File.Exists to check instead of catching exceptions after the fact. You should still catch other exceptions that can occur, of course.
No, don't. If you use File.Exists, you introduce concurrency problem. If you wrote this code:
if file exists then
open file
then if another program deleted your file between when you checked File.Exists and before you actually open the file, then the program will still throw exception.
Second, even if a file exists, that does not mean you can actually open the file, you might not have the permission to open the file, or the file might be a read-only filesystem so you can't open in write mode, etc.
File I/O is much, much more expensive than exception, there is no need to worry about the performance of exceptions.
EDIT: Benchmarking Exception vs Exists in Python under Linux
import timeit
setup = 'import random, os'
s = '''
try:
open('does not exist_%s.txt' % random.randint(0, 10000)).read()
except Exception:
pass
'''
byException = timeit.Timer(stmt=s, setup=setup).timeit(1000000)
s = '''
fn = 'does not exists_%s.txt' % random.randint(0, 10000)
if os.path.exists(fn):
open(fn).read()
'''
byExists = timeit.Timer(stmt=s, setup=setup).timeit(1000000)
print 'byException: ', byException # byException: 23.2779269218
print 'byExists: ', byExists # byExists: 22.4937438965
I don't know about efficiency but I would prefer the File.Exists check. The problem is all the other things that could happen: bad file handle, etc. If your program logic knows that sometimes the file doesn't exist and you want to have a different behavior for existing vs. non-existing files, use File.Exists. If its lack of existence is the same as other file-related exceptions, just use exception handling.
- Vexing Exceptions -- more about using exceptions well
The problem with using File.Exists first is that it opens the file too. So you end up opening the file twice. I haven't measured it, but I guess this additional opening of the file is more expensive than the occasional exceptions.
If the File.Exists check improves the performance depends on the probability of the file existing. If it likely exists then don't use File.Exists, if it usually doesn't exist the the additional check will improve the performance.
Yes, you should use File.Exists. Exceptions should be used for exceptional situations not to control the normal flow of your program. In your case, a file not being there is not an exceptional occurrence. Therefore, you should not rely on exceptions.
UPDATE:
So everyone can try it for themselves, I'll post my test code. For non existing files, relying on File.Open to throw an exception for you is about 50 times worse than checking with File.Exists.
class Program
{
static void Main(string[] args)
{
TimeSpan ts1 = TimeIt(OpenExistingFileWithCheck);
TimeSpan ts2 = TimeIt(OpenExistingFileWithoutCheck);
TimeSpan ts3 = TimeIt(OpenNonExistingFileWithCheck);
TimeSpan ts4 = TimeIt(OpenNonExistingFileWithoutCheck);
}
private static TimeSpan TimeIt(Action action)
{
int loopSize = 10000;
DateTime startTime = DateTime.Now;
for (int i = 0; i < loopSize; i++)
{
action();
}
return DateTime.Now.Subtract(startTime);
}
private static void OpenExistingFileWithCheck()
{
string file = @"C:\temp\existingfile.txt";
if (File.Exists(file))
{
using (FileStream fs = File.Open(file, FileMode.Open, FileAccess.Read))
{
}
}
}
private static void OpenExistingFileWithoutCheck()
{
string file = @"C:\temp\existingfile.txt";
using (FileStream fs = File.Open(file, FileMode.Open, FileAccess.Read))
{
}
}
private static void OpenNonExistingFileWithCheck()
{
string file = @"C:\temp\nonexistantfile.txt";
if (File.Exists(file))
{
using (FileStream fs = File.Open(file, FileMode.Open, FileAccess.Read))
{
}
}
}
private static void OpenNonExistingFileWithoutCheck()
{
try
{
string file = @"C:\temp\nonexistantfile.txt";
using (FileStream fs = File.Open(file, FileMode.Open, FileAccess.Read))
{
}
}
catch (Exception ex)
{
}
}
}
On my computer:
- ts1 = .75 seconds (same with or without debugger attached)
- ts2 = .56 seconds (same with or without debugger attached)
- ts3 = .14 seconds (same with or without debugger attached)
- ts4 = 14.28 seconds (with debugger attached)
- ts4 = 1.07 (without debugger attached)
UPDATE:
I added details on whether a dubgger was attached or not. I tested debug and release build but the only thing that made a difference was the one function that ended up throwing exceptions while the debugger was attached (which makes sense). Still though, checking with File.Exists is the best choice.
It depends !
If there's a high chance for the file to be there (you know this for your scenario, but as an example something like desktop.ini
) I would rather prefer to directly try to open it.
Anyway, in case of using File.Exist
you need to put File.OpenRead
in try/catch for concurrency reasons and avoiding any run-time exception but it would considerably boost your application performance if the chance for file to be there is low. Ostrich algorithm
File.Exists is a good first line of defense. If the file doesn't exist, then you're guaranteed to get an exception if you try to open it. The existence check is cheaper than the cost of throwing and catching an exception. (Maybe not much cheaper, but a bit.)
There's another consideration, too: debugging. When you're running in the debugger, the cost of throwing and catching an exception is higher, because the IDE has hooks into the exception mechanism that increase your overhead. And if you've checked any of the "Break on thrown" checkboxes in Debug > Exceptions, then any avoidable exceptions become a huge pain point. For that reason alone, I would argue for preventing exceptions when possible.
However, you still need the try-catch, for the reasons pointed out by other answers here. The File.Exists call is merely an optimization; it doesn't save you from needing to catch exceptions due to timing, permissions, solar flares, etc.
Wouldn't it be most efficient to run a directory search, find it, and then try to open it?
Dim Files() as string = System.IO.Directory.GetFiles("C:\", "SpecificName.txt", IO.SearchOption.AllDirectories)
Then you would get an array of strings that you know exist.
Oh, and as an answer to the original question, I would say that yes, try/catch would introduce more processor cycles, I would also assume that IO peeks actually take longer than the overhead of the processor cycles.
Running the Exists first, then the open second, is 2 IO functions against 1 of just trying to open it. So really, I'd say the overall performance is going to be a judgment call on processor time vs. hard drive speed on the PC it will be running on. If you've got a slower processor, I'd go with the check, if you've got a fast processor, I might go with the try/catch on this one.
The overhead of an exception is noticeable, but it's not significant compared to file operations.