I just started writing tests for a lot of code. There's a bunch of classes with dependencies on the file system, that is, they read CSV files, read/write configuration files and so on.

Currently the test files are stored in the test directory of the project (it's a Maven 2 project), but for several reasons this directory doesn't always exist, so the tests fail.

Do you know of any best practices for coping with file system dependencies in unit/integration tests?

Edit: I'm not looking for an answer to the specific problem I described above. That was just an example. I'd prefer general recommendations on how to handle dependencies on the file system, databases, etc.

A: 

Usually, file system tests aren't very critical: the file system is well understood, easy to set up and to keep stable. Accesses are also usually pretty fast, so there is no reason per se to shun it or to mock it away in the tests.

I suggest that you find out why the directory doesn't exist and make sure that it does. For example, check for the existence of a file or directory in setUp() and copy the files over if the check fails. This happens only once, so the performance impact is minimal.
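
A minimal sketch of that check with JUnit 4; the directory, fixture path and class name are made up for illustration:

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import org.junit.Before;

public class CsvReaderTest {

    @Before
    public void setUp() throws IOException {
        // Hypothetical locations; adjust to your project layout.
        File testDataDir = new File("target/test-data");
        File csv = new File(testDataDir, "sample.csv");
        if (!csv.exists()) {
            testDataDir.mkdirs();
            // Copy the fixture from the test classpath into the expected directory.
            try (InputStream in = getClass().getResourceAsStream("/fixtures/sample.csv")) {
                Files.copy(in, csv.toPath());
            }
        }
    }

    // ... tests that read target/test-data/sample.csv go here ...
}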

Aaron Digulla
A: 

There are two options for testing code that needs to read from files:

  1. Keep the files related to the unit tests in source control (e.g. in a test data folder), so anyone who gets the latest and runs the tests always has the relevant files in a known folder relative to the test binaries (see the sketch after this list). This is probably the "best practice".

  2. If the files in question are huge, you might not want to keep them in source control. In this case, a network share that is accessible from all developer and build machines is probably a reasonable compromise.

Obviously most well-written classes will not have hard dependencies on the file system in the first place.
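
In a Maven project, option 1 usually amounts to putting the fixtures under src/test/resources and loading them from the classpath instead of through a hard-coded path. A minimal sketch, assuming a hypothetical fixture named customers.csv whose header starts with an id column:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.junit.Test;
import static org.junit.Assert.assertTrue;

public class CsvFixtureTest {

    @Test
    public void readsFixtureFromClasspath() throws IOException {
        // Maven copies src/test/resources onto the test classpath,
        // so the fixture is found regardless of the working directory.
        try (InputStream in = getClass().getResourceAsStream("/customers.csv");
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8))) {
            assertTrue(reader.readLine().startsWith("id"));
        }
    }
}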

Mark Heath
+10  A: 

First, one should try to keep unit tests away from the filesystem; see this Set of Unit Testing Rules. If possible, have your code work with Streams, which can be in-memory buffers in the unit tests and file streams in the production code.
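
For example, a parser that accepts a Reader can be unit tested entirely in memory; a hypothetical sketch (CsvParser is made up for illustration):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

public class CsvParser {

    // Depends on Reader rather than File, so a test can pass a StringReader
    // while production code passes a FileReader.
    public List<String[]> parse(Reader source) throws IOException {
        List<String[]> rows = new ArrayList<String[]>();
        BufferedReader reader = new BufferedReader(source);
        String line;
        while ((line = reader.readLine()) != null) {
            rows.add(line.split(","));
        }
        return rows;
    }
}

// In a test, no file is needed:
// List<String[]> rows = new CsvParser().parse(new StringReader("a,b\nc,d"));
// assertEquals(2, rows.size());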

If this is not feasible, you can have your unit tests generate the files they need. This keeps a test easy to read, since everything is in one file, and it may also prevent permission problems.
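
A sketch of a test that generates its own input, assuming JUnit 4's TemporaryFolder rule (the class and file names are invented):

import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;
import static org.junit.Assert.assertEquals;

public class GeneratedConfigTest {

    // Creates a fresh directory per test and deletes it afterwards.
    @Rule
    public TemporaryFolder tmp = new TemporaryFolder();

    @Test
    public void readsConfigGeneratedByTheTest() throws Exception {
        File config = tmp.newFile("app.properties");
        Files.write(config.toPath(), "retries=3\n".getBytes(StandardCharsets.UTF_8));

        // The code under test would read the generated file here; the assertion
        // below just shows that the content round-trips.
        String firstLine = Files.readAllLines(config.toPath(), StandardCharsets.UTF_8).get(0);
        assertEquals("retries=3", firstLine);
    }
}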

You can mock the filesystem/database/network access in your unit tests.

You can treat unit tests that rely on a database or the file system as integration tests instead.

philippe
+1  A: 

Dependencies on the filesystem come in two flavours here:

  • files that your tests depend upon: if you need files to run the test, you can generate them in your tests and put them in a /tmp directory.
  • files that your code depends upon: config files, or input files.

In the second case, it's often possible to restructure your code to remove the dependency on a file (e.g. java.io.File can be replaced with java.io.InputStream and java.io.OutputStream). This may not be possible, of course.

You may also need to handle non-determinism in the filesystem (I once had a devil of a job debugging something on an NFS share). In that case you should probably wrap the file system in a thin interface.

At its simplest, this is just a helper method that takes a File and opens a stream on it:

// Seam for tests: production code calls this instead of constructing a FileInputStream directly.
InputStream getInputStream(File file) throws IOException {
    return new FileInputStream(file);
}

You can then replace this method with a mock which you can direct to throw an exception, or return a ByteArrayInputStream, or whatever.
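
For instance, a test can subclass the class under test and override the seam; ReportLoader and the subclass names below are hypothetical:

import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical class under test, containing the seam shown above.
class ReportLoader {
    InputStream getInputStream(File file) throws IOException {
        return new FileInputStream(file);
    }
}

// Test-only subclass: overrides the seam so no disk access happens.
class InMemoryReportLoader extends ReportLoader {
    @Override
    InputStream getInputStream(File file) {
        return new ByteArrayInputStream("col1,col2\n1,2\n".getBytes());
    }
}

// Another test-only subclass that simulates a failing filesystem.
class FailingReportLoader extends ReportLoader {
    @Override
    InputStream getInputStream(File file) throws IOException {
        throw new IOException("simulated disk failure");
    }
}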

The same can be said for URLs and URIs.

jamesh
A: 

Give the test files, both input and output, names that are structurally related to the name of the unit test.

In JUnit, for instance, I'd use:

File reportFile = new File("tests/output/" + getClass().getSimpleName() + "/" + getName() + ".report.html");
Wouter Lievens