I need to calculate the size of a directory in VB.NET.

I know of the following two methods:

Method 1: from MSDN http://msdn.microsoft.com/en-us/library/system.io.directory.aspx

' The following example calculates the size of a directory
' and its subdirectories, if any, and displays the total size
' in bytes.
Imports System
Imports System.IO

Public Class ShowDirSize

Public Shared Function DirSize(ByVal d As DirectoryInfo) As Long
    Dim Size As Long = 0
    ' Add file sizes.
    Dim fis As FileInfo() = d.GetFiles()
    Dim fi As FileInfo
    For Each fi In fis
        Size += fi.Length
    Next fi
    ' Add subdirectory sizes.
    Dim dis As DirectoryInfo() = d.GetDirectories()
    Dim di As DirectoryInfo
    For Each di In dis
        Size += DirSize(di)
    Next di
    Return Size
End Function 'DirSize

Public Shared Sub Main(ByVal args() As String)
    If args.Length <> 1 Then
        Console.WriteLine("You must provide a directory argument at the command line.")
    Else
        Dim d As New DirectoryInfo(args(0))
        Dim dsize As Long = DirSize(d)
        Console.WriteLine("The size of {0} and its subdirectories is {1} bytes.", d, dsize)
    End If
End Sub 'Main

End Class 'ShowDirSize

Method 2: from http://stackoverflow.com/questions/468119/whats-the-best-way-to-calculate-the-size-of-a-directory-in-net

Dim size As Int64 = (From strFile In My.Computer.FileSystem.GetFiles(strFolder, _
                     FileIO.SearchOption.SearchAllSubDirectories) _
                     Select New System.IO.FileInfo(strFile).Length).Sum()

Both methods work, but they take a long time when the directory contains many subfolders. For example, I have a directory with 150,000 subfolders, and the methods above took around 1 hour 30 minutes to calculate its size. If I check the size from Windows, however, it takes less than a minute.

Please suggest better and faster ways to calculate the size of the directory.

A: 

Although this answer is about Python, the concept applies here as well.

Windows Explorer uses system API calls FindFirstFile and FindNextFile recursively to pull file information, and then can access the file sizes very quickly through the data that's passed back via a struct, WIN32_FIND_DATA: http://msdn.microsoft.com/en-us/library/aa365740(v=VS.85).aspx.

My suggestion would be to implement these API calls using P/Invoke, and I believe you will experience significant performance gains.
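As a rough illustration of that suggestion, here is a minimal VB.NET sketch that walks a directory tree with FindFirstFile/FindNextFile and sums the sizes reported in WIN32_FIND_DATA. The module name FastDirSize, the parameter name dirPath, and the choices to skip reparse points and to count unreadable directories as 0 bytes are my own assumptions, not part of the original suggestion. It also assumes paths fit within MAX_PATH (260 characters).

```vbnet
Imports System
Imports System.IO
Imports System.Runtime.InteropServices
Imports FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME

' Hypothetical module name, chosen for this sketch only.
Public Module FastDirSize

    ' Managed mirror of the Win32 WIN32_FIND_DATA structure (Unicode layout).
    <StructLayout(LayoutKind.Sequential, CharSet:=CharSet.Unicode)> _
    Private Structure WIN32_FIND_DATA
        Public dwFileAttributes As UInteger
        Public ftCreationTime As FILETIME
        Public ftLastAccessTime As FILETIME
        Public ftLastWriteTime As FILETIME
        Public nFileSizeHigh As UInteger
        Public nFileSizeLow As UInteger
        Public dwReserved0 As UInteger
        Public dwReserved1 As UInteger
        <MarshalAs(UnmanagedType.ByValTStr, SizeConst:=260)> _
        Public cFileName As String
        <MarshalAs(UnmanagedType.ByValTStr, SizeConst:=14)> _
        Public cAlternateFileName As String
    End Structure

    <DllImport("kernel32.dll", EntryPoint:="FindFirstFileW", CharSet:=CharSet.Unicode, SetLastError:=True)> _
    Private Function FindFirstFile(ByVal lpFileName As String, ByRef lpFindFileData As WIN32_FIND_DATA) As IntPtr
    End Function

    <DllImport("kernel32.dll", EntryPoint:="FindNextFileW", CharSet:=CharSet.Unicode, SetLastError:=True)> _
    Private Function FindNextFile(ByVal hFindFile As IntPtr, ByRef lpFindFileData As WIN32_FIND_DATA) As Boolean
    End Function

    <DllImport("kernel32.dll", SetLastError:=True)> _
    Private Function FindClose(ByVal hFindFile As IntPtr) As Boolean
    End Function

    Private ReadOnly INVALID_HANDLE_VALUE As New IntPtr(-1)
    Private Const FILE_ATTRIBUTE_DIRECTORY As UInteger = &H10UI
    Private Const FILE_ATTRIBUTE_REPARSE_POINT As UInteger = &H400UI

    ' Returns the total size in bytes of all files under dirPath.
    Public Function DirSize(ByVal dirPath As String) As Long
        Dim total As Long = 0
        Dim data As New WIN32_FIND_DATA()
        Dim handle As IntPtr = FindFirstFile(Path.Combine(dirPath, "*"), data)

        ' Assumption: a missing or unreadable directory counts as 0 bytes.
        If handle = INVALID_HANDLE_VALUE Then Return 0

        Try
            Do
                If (data.dwFileAttributes And FILE_ATTRIBUTE_DIRECTORY) <> 0 Then
                    ' Skip "." and "..", and do not follow reparse points.
                    If data.cFileName <> "." AndAlso data.cFileName <> ".." AndAlso _
                       (data.dwFileAttributes And FILE_ATTRIBUTE_REPARSE_POINT) = 0 Then
                        total += DirSize(Path.Combine(dirPath, data.cFileName))
                    End If
                Else
                    ' The file size is split across two 32-bit halves.
                    total += (CLng(data.nFileSizeHigh) << 32) Or CLng(data.nFileSizeLow)
                End If
            Loop While FindNextFile(handle, data)
        Finally
            FindClose(handle)
        End Try

        Return total
    End Function

End Module
```

Calling FastDirSize.DirSize with the folder path returns the size in bytes. The potential gain comes from reading each file's size out of the enumeration result itself, rather than creating a FileInfo and querying the file system again for every file.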

rakuo15
A: 

Doing the work in parallel should make it faster, at least on multi-core machines. Try this C# code. You will have to translate it to VB.NET.

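// Requires: using System.IO; using System.Threading; using System.Threading.Tasks;
private static long DirSize(string sourceDir, bool recurse)
{
    long size = 0;
    string[] fileEntries = Directory.GetFiles(sourceDir);

    // Sum the files directly inside this directory.
    foreach (string fileName in fileEntries)
    {
        Interlocked.Add(ref size, (new FileInfo(fileName)).Length);
    }

    if (recurse)
    {
        string[] subdirEntries = Directory.GetDirectories(sourceDir);

        // Each worker thread keeps its own subtotal, merged into size at the end.
        Parallel.For<long>(0, subdirEntries.Length, () => 0, (i, loop, subtotal) =>
        {
            // Skip reparse points (junctions, symbolic links).
            if ((File.GetAttributes(subdirEntries[i]) & FileAttributes.ReparsePoint) != FileAttributes.ReparsePoint)
            {
                subtotal += DirSize(subdirEntries[i], true);
            }
            // Always return the running subtotal; returning 0 here would
            // discard the sizes this thread has already accumulated.
            return subtotal;
        },
            (x) => Interlocked.Add(ref size, x)
        );
    }
    return size;
}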
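For reference, one possible VB.NET translation of the C# code above is sketched below. It assumes .NET 4 or later (for Parallel.For and multi-line lambdas). The module name ParallelDirSize and the lambda parameter name loopState are my own choices (Loop is a reserved word in VB). Note that the loop body always returns the running subtotal, including when a reparse point is skipped, so sizes already accumulated on a thread are never discarded.

```vbnet
Imports System.IO
Imports System.Threading
Imports System.Threading.Tasks

' Hypothetical module name, chosen for this sketch only.
Public Module ParallelDirSize

    Public Function DirSize(ByVal sourceDir As String, ByVal recurse As Boolean) As Long
        Dim size As Long = 0

        ' Sum the files directly inside this directory.
        For Each fileName As String In Directory.GetFiles(sourceDir)
            size += New FileInfo(fileName).Length
        Next

        If recurse Then
            Dim subdirEntries As String() = Directory.GetDirectories(sourceDir)

            ' Each worker thread keeps its own subtotal, which is merged
            ' into size once that thread has finished its share of the work.
            Parallel.For(Of Long)(0, subdirEntries.Length,
                Function() 0L,
                Function(i As Integer, loopState As ParallelLoopState, subtotal As Long) As Long
                    ' Skip reparse points (junctions, symbolic links).
                    If (File.GetAttributes(subdirEntries(i)) And FileAttributes.ReparsePoint) <> FileAttributes.ReparsePoint Then
                        subtotal += DirSize(subdirEntries(i), True)
                    End If
                    Return subtotal
                End Function,
                Sub(x As Long) Interlocked.Add(size, x))
        End If

        Return size
    End Function

End Module
```

How much this helps depends on the storage: on a single spinning disk the work is mostly I/O-bound, so the speedup from extra threads may be modest.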
Jamie