I am trying to use test-driven development for an application that has to read a lot of data from disk. The problem is that the data is organized on the filesystem in a somewhat complex directory structure (not my fault). The methods I'm testing need to see that a large number of files exist in several different directories before they can complete.
The solution I'm trying to avoid is just having a known folder on the hard drive with all the data in it. That approach sucks for several reasons, one being that if we wanted to run the unit tests on another computer, we'd have to copy a large amount of data to it.
I could also generate dummy files in the setup method and clean them up in the teardown method. The problem is that it would be a pain to write the code to replicate the existing directory structure and dump lots of dummy files into those directories (see the sketch below).
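To make that concrete, here is roughly what I mean by the setup/teardown approach (a minimal Python sketch; the layout and file names are made up and the real structure is far larger and deeper):

```python
import os
import shutil
import tempfile
import unittest


class DirectoryAnalysisTest(unittest.TestCase):
    def setUp(self):
        # Create a throwaway root and rebuild a tiny, made-up slice of the real layout.
        self.root = tempfile.mkdtemp()
        layout = {
            os.path.join("batch_01", "raw"): ["a.dat", "b.dat"],
            os.path.join("batch_01", "processed"): ["a.out"],
            os.path.join("batch_02", "raw"): ["c.dat"],
        }
        for subdir, filenames in layout.items():
            dir_path = os.path.join(self.root, subdir)
            os.makedirs(dir_path)
            for name in filenames:
                # Empty placeholder files are enough, since only existence matters.
                open(os.path.join(dir_path, name), "w").close()

    def tearDown(self):
        # Remove everything created in setUp.
        shutil.rmtree(self.root)

    def test_structure_exists(self):
        # In the real test this would call the method under test on self.root;
        # here it just shows that the dummy structure is in place.
        self.assertTrue(
            os.path.isfile(os.path.join(self.root, "batch_01", "raw", "a.dat"))
        )
```

Multiply that layout dictionary by hundreds of directories and thousands of files and you can see why I'd rather not hand-maintain it.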
I understand how to unit test file I/O operations, but how do I unit test this kind of scenario?
Edit: I won't need to actually read the files. The application needs to analyze a directory structure and determine what files exist in it, and there are a large number of subdirectories, each containing a large number of files.
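For reference, the kind of method I'm testing looks roughly like this (a simplified, hypothetical version; the real code applies more rules to the paths it finds):

```python
import os


def find_existing_files(root):
    """Walk the directory tree under root and report which files exist in
    which subdirectory, without reading any file contents."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        relative_dir = os.path.relpath(dirpath, root)
        found[relative_dir] = sorted(filenames)
    return found
```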