Hello. I need to check whether a directory on disk is empty, meaning that it contains no folders or files. I know there is a simple method: get an array of FileSystemInfo objects and check whether its length is zero. Something like this:

    public static bool CheckFolderEmpty(string path)
    {
        if (string.IsNullOrEmpty(path))
        {
            throw new ArgumentNullException("path");
        }

        var folder = new DirectoryInfo(path);
        if (folder.Exists)
        {
            return folder.GetFileSystemInfos().Length == 0;
        }

        throw new DirectoryNotFoundException();
    }

This approach seems OK. BUT!! It is very, very bad from a performance perspective. GetFileSystemInfos() is a very heavy method. It enumerates all filesystem objects in the folder, reads all their properties, creates objects, fills a typed array, etc. And all this just to check Length. That's stupid, isn't it?

I just profiled this code and found that ~250 calls of the method take ~500ms. This is very slow, and I believe it is possible to do it much quicker.

Any suggestions? Thanks.

+2  A: 

You could try Directory.Exists(path) and Directory.GetFiles(path) - probably less overhead (no objects - just strings etc).

Marc Gravell
As always, you are fastest off the trigger! Beat me by a few seconds there! :-)
Cerebrus
You were both quicker than me... damn my attention to detail ;-)
Eoin Campbell
Didn't do me any good, though; first answer, and the only one without a vote ;-(
Marc Gravell
Unfixed... somebody has an axe to grind, methinks
Marc Gravell
+5  A: 

I don't know about the performance statistics on this one, but have you tried using the Directory.GetFiles() static method?

It returns a string array containing filenames (not FileInfos) and you can check the length of the array in the same way as above.

Cerebrus
Same issue: it can be slow if there are many files... but it's probably faster than GetFileSystemInfos
Thomas Levesque
+3  A: 

You will have to go to the hard drive for this information in any case, and this alone will trump any object creation and array filling.

Don Reba
True, although creating some of the objects involves looking up extra metadata on disk that might not be necessary.
Adam Rosenfield
The ACL would be required for every object for sure. There is no way around it. And once you have to look up those, you might as well read any other information in MFT headers for the files in the folder.
Don Reba
+9  A: 
private static void test()
{
    System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
    sw.Start();

    string[] dirs = System.IO.Directory.GetDirectories("C:\\Test\\");
    string[] files = System.IO.Directory.GetFiles("C:\\Test\\");

    if (dirs.Length == 0 && files.Length == 0)
        Console.WriteLine("Empty");
    else
        Console.WriteLine("Not Empty");

    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds);
}

This quick test came back in 2 milliseconds both when the folder was empty and when it contained subfolders and files (5 folders with 5 files in each)

Eoin Campbell
You could improve this by returning straight away if 'dirs' is non-empty, without having to get the list of files.
samjudson
Yes, but what if there are thousands of files in it?
Thomas Levesque
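The early return suggested in the comment above could look roughly like this. It is only a sketch: the class and method names are hypothetical, and it still builds a full array for whichever listing it requests, so the concern about folders with thousands of files remains.

```csharp
using System;
using System.IO;

public static class DirectoryChecks
{
    // Hypothetical helper name, not taken from the thread.
    public static bool IsEmptyShortCircuit(string path)
    {
        // If any subdirectory exists, the folder is not empty,
        // so the file listing is never requested.
        if (Directory.GetDirectories(path).Length != 0)
        {
            return false;
        }

        // No subdirectories: the folder is empty only if it has no files either.
        return Directory.GetFiles(path).Length == 0;
    }
}
```

This saves the second call only when subdirectories are present; an empty folder or a folder containing only files still needs both calls.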
+3  A: 

I'm not aware of a method that will succinctly tell you if a given folder contains any other folders or files, however, using:

Directory.GetFiles(path);
&
Directory.GetDirectories(path);

should help performance since both of these methods will only return an array of strings with the names of the files/directories rather than entire FileSystemInfo objects.

CraigTP
+1  A: 

Thanks, everybody, for the replies. I tried the Directory.GetFiles() and Directory.GetDirectories() methods. Good news! Performance roughly doubled: 229 calls in 221ms. But I still hope it is possible to avoid enumerating all the items in the folder. You'll agree that unnecessary work is still being done, don't you think?

After all my investigations, I have come to the conclusion that further optimization is impossible in pure .NET. I am going to play with WinAPI's FindFirstFile function. I hope it will help.

zhe
Out of interest, what are the reasons you need such high performance for this operation?
meandmycode
Rather than answer your own question, mark one of the correct answers as the answer (probably the first one posted or the clearest one). This way future users of stackoverflow will see the best answer right under your question!
Ray Hayes
A: 

My code is amazing: it took just 00:00:00.0007143, less than a millisecond, with 34 files in the folder

    System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
    sw.Start();

    bool IsEmptyDirectory = (Directory.GetFiles("d:\\pdf").Length == 0);

    sw.Stop();
    Console.WriteLine(sw.Elapsed);
Prashant
Actually, if you multiply it by 229 and add GetDirectories(), you will get the same result as mine :)
zhe
+3  A: 

Here is the extra-fast solution that I finally implemented. It uses the WinAPI functions FindFirstFile and FindNextFile, which avoids enumerating every item in the folder: the loop stops as soon as it detects the first object. This approach is ~6(!!) times faster than the one described above: 250 calls in 36ms!

 private static readonly IntPtr INVALID_HANDLE_VALUE = new IntPtr(-1);

 [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Auto)]
 private struct WIN32_FIND_DATA
 {
  public uint dwFileAttributes;
  public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
  public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
  public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
  public uint nFileSizeHigh;
  public uint nFileSizeLow;
  public uint dwReserved0;
  public uint dwReserved1;
  [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
  public string cFileName;
  [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
  public string cAlternateFileName;
 }

 // SetLastError = true is needed for Marshal.GetHRForLastWin32Error() below to see the real error code
 [DllImport("kernel32.dll", CharSet=CharSet.Auto, SetLastError=true)]
 private static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);

 [DllImport("kernel32.dll", CharSet=CharSet.Auto, SetLastError=true)]
 private static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

 [DllImport("kernel32.dll")]
 private static extern bool FindClose(IntPtr hFindFile);

 public static bool CheckDirectoryEmpty_Fast(string path)
 {
  if (string.IsNullOrEmpty(path))
  {
   throw new ArgumentNullException("path"); // pass the parameter name, not its value
  }

  if (Directory.Exists(path))
  {
   if (path.EndsWith(Path.DirectorySeparatorChar.ToString()))
    path += "*";
   else
    path += Path.DirectorySeparatorChar + "*";

   WIN32_FIND_DATA findData;
   var findHandle = FindFirstFile(path, out findData);

   if (findHandle != INVALID_HANDLE_VALUE)
   {
    try
    {
     bool empty = true;
     do
     {
      if (findData.cFileName != "." && findData.cFileName != "..")
       empty = false;
     } while (empty && FindNextFile(findHandle, out findData));

     return empty;
    }
    finally
    {
     FindClose(findHandle);
    }
   }

   throw new Exception("Failed to get directory first file",
                       Marshal.GetExceptionForHR(Marshal.GetHRForLastWin32Error()));
  }

  throw new DirectoryNotFoundException();
 }

I hope it will be useful for somebody in the future. It seems I proposed the best solution myself :) Anyway, thanks for your help, guys!

PS. It seems I am going to accept my own answer ))

zhe
+4  A: 

If you don't mind leaving pure C# and going for WinAPI calls, then you might want to consider the PathIsDirectoryEmpty() function. According to MSDN, the function:

Returns TRUE if pszPath is an empty directory. Returns FALSE if pszPath is not a directory, or if it contains at least one file other than "." or "..".

That seems to be a function which does exactly what you want, so it is probably well optimised for that task (although I haven't tested that).

I don't have time now to describe how to call it from C#, but the pinvoke.net site should help you with that. (Unfortunately, it doesn't describe this particular function yet, but you should be able to find functions with similar arguments and return types there and use them as the basis for your call. MSDN also says that the DLL to import from is shlwapi.dll.)

(Sorry for not linking the pinvoke.net address, but I received a "new users can only post a maximum of one hyperlink" message here...)

akavel
Great idea. I didn't know about this function. I'll try to compare its performance with the approach I described above. If it is faster, I'll reuse it in my code. Thanks.
zhe
A note for those who want to go this route: this PathIsDirectoryEmpty() method from shlwapi.dll seems to work fine on Vista32/64 and XP32/64 machines, but bombs out on some Win7 machines. It must have something to do with the versions of shlwapi.dll shipped with different versions of Windows. Beware.
Alex_P
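For readers wondering what the call might look like from C#, here is a minimal, untested sketch. The native function takes a single string path and returns a BOOL; the wrapper class and method names are made up for illustration, and the code is Windows-only since it depends on shlwapi.dll.

```csharp
using System.Runtime.InteropServices;

public static class ShlwapiSketch
{
    // Declaration assumed from the native signature: BOOL PathIsDirectoryEmpty(LPCTSTR pszPath).
    // CharSet.Unicode makes the runtime bind to the wide-character export.
    [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool PathIsDirectoryEmpty(string pszPath);

    // Hypothetical wrapper name. Returns false both for a non-empty directory
    // and for a path that is not a directory, as the MSDN text quoted above says.
    public static bool IsEmpty(string path)
    {
        return PathIsDirectoryEmpty(path);
    }
}
```

Because a missing directory and a non-empty one both yield false, callers that need to tell them apart would have to check Directory.Exists(path) separately.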
+4  A: 

There is a new feature in Directory and DirectoryInfo in .NET 4 that lets you get an IEnumerable instead of an array, and that starts returning results before all of the directory contents have been read.

See here and there

public bool IsDirectoryEmpty(string path)
{
    IEnumerable<string> items = Directory.EnumerateFileSystemEntries(path);
    using (IEnumerator<string> en = items.GetEnumerator())
    {
        return !en.MoveNext();
    }
}

EDIT: seeing that answer again, I realize this code can be made much simpler...

// Any() is a LINQ extension method, so this version needs "using System.Linq;"
public bool IsDirectoryEmpty(string path)
{
    return !Directory.EnumerateFileSystemEntries(path).Any();
}
Thomas Levesque
A: 

You should also wrap your test in a try/catch block to make sure you properly handle a DirectoryNotFoundException. This is a classic race condition: the folder may get deleted right after you have checked that it exists.

Philipp Sumi
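As a rough illustration of that advice, the check and the existence test can be folded into one call that catches the exception. The names below are hypothetical, and returning null for a missing directory is just one possible design; the sketch assumes .NET 4 or later for EnumerateFileSystemEntries.

```csharp
using System.IO;
using System.Linq;

public static class SafeDirectoryChecks
{
    // Hypothetical helper: true = empty, false = not empty,
    // null = the directory did not exist (or vanished) when it was enumerated.
    public static bool? TryIsDirectoryEmpty(string path)
    {
        try
        {
            // No separate Directory.Exists() call, so there is no window
            // between the existence check and the enumeration.
            return !Directory.EnumerateFileSystemEntries(path).Any();
        }
        catch (DirectoryNotFoundException)
        {
            return null;
        }
    }
}
```

Other exceptions, such as UnauthorizedAccessException, are deliberately left to propagate here; whether to catch them depends on the caller.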