Here is the scenario. I have a directory with 2+ million files. The code I have below writes out a listing of all the files in about 90 minutes. Does anybody have a way to speed it up or make this code more efficient? I'd also like to write out only the file names in the listing.

    string lines = (listBox1.Items.ToString());
    string sourcefolder1 = textBox1.Text;
    string destinationfolder = (@"C:\anfiles");
    using (StreamWriter output = new StreamWriter(destinationfolder + "\\" + "MasterANN.txt"))
    {
        string[] files = Directory.GetFiles(textBox1.Text, "*.txt");
        foreach (string file in files)
        {
            FileInfo file_info = new FileInfo(file);
            output.WriteLine(file_info.Name);
        }
    }

The slowdown is that it writes out 1 line at a time. It takes about 13-15 minutes to get all the files it needs to write out. The following 75 minutes are spent creating the file.

+5  A: 

The first thing I would need to know is: where's the slowdown? Is it taking 89 minutes for Directory.GetFiles() to execute, or is the delay spread out over the calls to FileInfo file_info = new FileInfo(file);?

If the delay is from the latter, you can probably speed things up by getting the file name from the path instead of creating a FileInfo instance to get the file name.

System.IO.Path.GetFileName(file);
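
To find out which part is actually slow, one option is to time each phase separately with a Stopwatch. The following is only a rough sketch: the class name ListingTimer and the method Measure are made up for illustration, and on a 2+ million file directory the numbers will also depend on disk caching.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class ListingTimer
{
    // Hypothetical helper: returns the elapsed time for
    // [0] Directory.GetFiles, [1] names via FileInfo, [2] names via Path.GetFileName.
    public static TimeSpan[] Measure(string folder, string pattern)
    {
        Stopwatch sw = Stopwatch.StartNew();
        string[] files = Directory.GetFiles(folder, pattern);
        TimeSpan listing = sw.Elapsed;

        long chars = 0; // accumulate something so the loops do real work

        sw.Restart();
        foreach (string file in files)
        {
            chars += new FileInfo(file).Name.Length;
        }
        TimeSpan viaFileInfo = sw.Elapsed;

        sw.Restart();
        foreach (string file in files)
        {
            chars += Path.GetFileName(file).Length;
        }
        TimeSpan viaPath = sw.Elapsed;

        return new[] { listing, viaFileInfo, viaPath };
    }
}
```

Calling ListingTimer.Measure(textBox1.Text, "*.txt") and printing the three values should show whether the listing or the per-file work dominates.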
Ben
It's okay, the FileInfo file_info = new FileInfo(file); output.WriteLine(file_info.Name);
Mike
+4  A: 

You're reading 2+ million file paths into memory. Depending on how much memory you have, you may well be swapping. Try breaking it up into smaller chunks by filtering on the file name.
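
As an alternative to chunking by file name (a different technique, not what is described above), Directory.EnumerateFiles returns paths lazily instead of building one large array first. It is only available from .NET Framework 4.0 onwards, so treat this as a sketch under that assumption; the class and method names are made up for illustration.

```csharp
using System.IO;

static class StreamingLister
{
    // Hypothetical helper: writes one file name per line to outputPath
    // and returns how many names were written.
    // Note: if outputPath lives inside sourceFolder and matches the pattern,
    // the output file may end up listing itself.
    public static int WriteFileNames(string sourceFolder, string pattern, string outputPath)
    {
        int count = 0;
        using (StreamWriter output = new StreamWriter(outputPath))
        {
            // EnumerateFiles yields each path as it is found,
            // so the full list is never held in memory at once.
            foreach (string file in Directory.EnumerateFiles(sourceFolder, pattern))
            {
                output.WriteLine(Path.GetFileName(file));
                count++;
            }
        }
        return count;
    }
}
```

With the names from the question, the call would look something like StreamingLister.WriteFileNames(textBox1.Text, "*.txt", @"C:\anfiles\MasterANN.txt").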

Bill Barnes
+3  A: 

From my experience, it's Directory.GetFiles that's slowing you down (aside from console output). To overcome this, P/Invoke into FindFirstFile/FindNextFile to avoid all the memory consumption and general lagginess.
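
A minimal sketch of what such a P/Invoke wrapper might look like is below. It is Windows-only, the class and method names are made up for illustration, and error handling (checking Marshal.GetLastWin32Error after a failed call) is left out, so treat it as a starting point rather than a finished implementation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;
using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

static class NativeFileLister
{
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    struct WIN32_FIND_DATA
    {
        public uint dwFileAttributes;
        public FILETIME ftCreationTime;
        public FILETIME ftLastAccessTime;
        public FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool FindClose(IntPtr hFindFile);

    const uint FILE_ATTRIBUTE_DIRECTORY = 0x10;
    static readonly IntPtr INVALID_HANDLE_VALUE = new IntPtr(-1);

    // Hypothetical helper: yields file names (not full paths) one at a time.
    public static IEnumerable<string> EnumerateFileNames(string folder, string pattern)
    {
        WIN32_FIND_DATA data;
        IntPtr handle = FindFirstFile(Path.Combine(folder, pattern), out data);
        if (handle == INVALID_HANDLE_VALUE)
        {
            yield break; // no match, or the call failed
        }
        try
        {
            do
            {
                // Skip directories; only plain files are reported.
                if ((data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) == 0)
                {
                    yield return data.cFileName;
                }
            } while (FindNextFile(handle, out data));
        }
        finally
        {
            FindClose(handle); // always release the search handle
        }
    }
}
```

Each name can then be passed straight to output.WriteLine inside a foreach, with no Path.GetFileName call needed, since cFileName already holds just the name.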

Anton Gogolev
+5  A: 

It could help if you don't make a FileInfo instance for every file; use Path.GetFileName instead:

    string lines = (listBox1.Items.ToString());
    string sourcefolder1 = textBox1.Text;
    string destinationfolder = (@"C:\anfiles");
    using (StreamWriter output = new StreamWriter(Path.Combine(destinationfolder, "MasterANN.txt")))
    {
        string[] files = Directory.GetFiles(textBox1.Text, "*.txt");
        foreach (string file in files)
        {
            // Path.GetFileName only parses the string; no FileInfo object is created
            output.WriteLine(Path.GetFileName(file));
        }
    }
AlbertEin
Awesome! Thanks, this definitely did it.
Mike
You're welcome!
AlbertEin
For some reason this doesn't work now.
Mike
What do you mean?
AlbertEin