views: 152

answers: 6

Hi,

I have an algorithm that searches a directory, and every sub-directory beneath it, for all the text files it contains. Assume I do not know how many sub-directories and sub-sub-directories there are under the parent directory. How do I calculate the complexity?

This is the code I am using:

    public List<string> GetFilesInDirectory(string directoryPath)
    {
        // Store results in the file results list.
        List<string> files = new List<string>();

        // Store a stack of the directories still to be processed.
        Stack<string> stack = new Stack<string>();

        // Add the initial directory.
        stack.Push(Server.MapPath(directoryPath));

        // Continue while there are directories to process.
        while (stack.Count > 0)
        {
            // Get the top directory.
            string dir = stack.Pop();

            try
            {
                // Add all text files in this directory to the result list.
                files.AddRange(Directory.GetFiles(dir, "*.txt"));

                // Push every subdirectory of this directory onto the stack.
                foreach (string dn in Directory.GetDirectories(dir))
                {
                    stack.Push(dn);
                }
            }
            catch (Exception)
            {
                // Skip directories that cannot be read (e.g. access denied).
            }
        }

        return files;
    }

Thanks

A: 

Time complexity is calculated in terms of n, where n is the number of items being processed. You don't need exact numbers, and, more to the point, you can't use exact numbers for big-O complexity, since what you are describing is how the worst-case running time grows.
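
For the algorithm in the question, n would be the total number of directories and files visited. As a minimal sketch of what is being counted, here is the same traversal instrumented with counters (the starting path is hypothetical):

    using System;
    using System.Collections.Generic;
    using System.IO;

    class TraversalCost
    {
        static void Main()
        {
            // Same traversal as in the question, with counters added:
            // every directory is popped exactly once, and every matching file
            // is appended exactly once, so the work grows with n = dirs + files.
            Stack<string> stack = new Stack<string>();
            stack.Push(@"C:\SomeRoot"); // hypothetical starting directory

            long dirsVisited = 0, filesFound = 0;
            while (stack.Count > 0)
            {
                string dir = stack.Pop();
                dirsVisited++;
                try
                {
                    filesFound += Directory.GetFiles(dir, "*.txt").Length;
                    foreach (string sub in Directory.GetDirectories(dir))
                        stack.Push(sub);
                }
                catch (UnauthorizedAccessException)
                {
                    // Skip directories we are not allowed to read.
                }
            }

            Console.WriteLine(dirsVisited + " directories, " + filesFound + " files");
        }
    }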

Donnie
+1  A: 

I'd say it's O(N), where N is the total number of files across all directories.

Navigating the directories themselves is not a complex task; it's just bookkeeping.

Carl Smotricz
The problem is that I do not know how many files there are in the directory or its sub-directories.
No problem: That just means your question cannot be sensibly answered.
Carl Smotricz
"I'm standing in the lobby of a strange building with my eyes blindfolded. How hard will it be for me to find out how many people are in the house?"
Carl Smotricz
The complexity is linear, then. You can't possibly estimate the running time without knowing the number of items; you can only tell how much slower it will be if there are more items.
lhahne
A: 

The worst-case running time is a function of the maximum depth of the directory tree (on Windows it is limited by the maximum path length) and the number of files allowed in a subdirectory.
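
To put rough numbers on that, here is a sketch of an upper-bound model; the branching factor b and per-directory file limit f are assumptions for illustration, not anything the file system guarantees:

    static class WorstCaseModel
    {
        // Assumed model: a tree of depth d in which every directory has
        // b subdirectories and at most f files. The directory count is the
        // geometric series b^0 + b^1 + ... + b^d, and each directory
        // contributes itself plus up to f files to the traversal's work.
        public static long MaxItems(long b, int d, long f)
        {
            long directories = 0, levelSize = 1; // levelSize = b^level
            for (int level = 0; level <= d; level++)
            {
                directories += levelSize;
                levelSize *= b;
            }
            return directories * (1 + f);
        }
    }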

Alon
+1  A: 

Your algorithm pushes every directory onto the stack once and pops it once, doing work for each directory it encounters, so the cost is on the order of the number of directories times 2, or O(2n), where n is the number of directories. As far as complexity is concerned, this is equivalent to O(n).
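
In code form, that cost model is simply (a sketch, with the per-file work left out):

    // One push and one pop per directory: T(n) = 2n stack operations.
    // Doubling n doubles T(n) no matter what the constant factor is,
    // which is why O(2n) is written as O(n).
    static long StackOperations(long n)
    {
        return 2 * n; // linear growth; the constant 2 is dropped in big-O
    }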

rsp
+3  A: 

Big-O notation says something about how the problem's complexity grows as the argument size grows; in other terms, how the running time grows as the set of elements increases. Whether there are 1 or 8972348932 files/directories does not matter. Your code works in O(N) linear time, assuming directories and files are only visited once. O(123N) is still written as O(N). What does this mean? It means that big-O notation says nothing about the actual cost, only about how the complexity grows.

Compare two algorithms for the same problem, one running in O(N) time and the other in O(N log N) time. The O(N log N) algorithm might be faster than the O(N) one for small N, but given a large enough N, the O(N) algorithm will catch up.
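
A small sketch of that crossover, using made-up constant factors (10x for the linear algorithm, 1x for the N log N one):

    using System;

    class Crossover
    {
        static void Main()
        {
            // Hypothetical cost functions: a linear algorithm with a large
            // constant factor versus an N log N algorithm with a small one.
            Func<double, double> linear = n => 10 * n;
            Func<double, double> nLogN = n => n * Math.Log(n, 2);

            // The crossover sits near N = 2^10 = 1024: below it the
            // O(N log N) algorithm is cheaper, above it O(N) wins.
            foreach (double n in new[] { 10.0, 1000.0, 1e6, 1e12 })
            {
                string cheaper = linear(n) < nLogN(n) ? "O(N)" : "O(N log N)";
                Console.WriteLine("N = " + n + ": " + cheaper + " is cheaper");
            }
        }
    }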

Mads Elvheim
A: 

I would say it is O(N^2) because you have a doubly-nested loop, but the size of each loop is not the same, so we have to modify that a bit.

The number of directories is likely smaller than the number of files, so let's say the number of files is N and the number of directories is M; then you would get O(N*M). That's my best guess.

Brian T Hannan
Yes, I think you are right, because I do: for each directory, add it to the stack; for each file in the directory, add it to the file list.