views:

235

answers:

3

I am putting together a presentation on the benefits of Unit Testing and I would like a simple example of unintended consequences: Changing code in one class that breaks functionality in another class.

Can someone suggest a simple, easy to explain an example of this?

My plan is to write unit tests around this functionality to demonstrate that we know we broke something by immediately running the test.

+8  A: 

Here's an example:

class DataProvider {
    public static IEnumerable<Something> GetData() {
        return new Something[] { ... };
    }
}

class Consumer {
    void DoSomething() {
        Something[] data = (Something[])DataProvider.GetData();
    }
}

Change GetData() to return a List<Something>, and Consumer will break.

This might seen somewhat contrived, but I've seen similar problems in real code.

SLaks
+12  A: 

A slightly simpler, and thus perhaps clearer, example is:

public string GetServerAddress()
{
    return "172.0.0.1";
}

public void DoSomethingWithServer()
{
    Console.WriteLine("Server address is: " +  GetServerAddress());
}

If GetServerAddress is changes to return an array:

public string[] GetServerAddress()
{
    return new string[] { "127.0.0.1", "localhost" };
}

The output from DoSomethingWithServer will be somewhat different, but it will all still compile, making for an even subtler bug.

The first (non-array) version will print Server address is: 127.0.0.1 and the second will print Server address is: System.String[], this is something I've also seen in production code. Needless to say it's no longer there!

Rob
How on earth would you test for that though? Changing the return value can be caught at compile time (Eg can't do `String address = GetServerAddreess();`), but catching in Strings is almost impossible
TheLQ
@TheLQ, if you're code is instead: `string serverAddress = GetServerAddress(); Console.WriteLine("Server address is: " + serverAddress);` you'll get a compilation error in this example =) And, there's no point worrying about "verbose code being less efficient" as if the JIT fails to optimise that away, I'd be very surprised *AND* concerned! :-)
Rob
+3  A: 

Say you have a method that does:

abstract class ProviderBase<T>
{
  public IEnumerable<T> Results
  {
    get
    {
      List<T> list = new List<T>();
      using(IDataReader rdr = GetReader())
        while(rdr.Read())
          list.Add(Build(rdr));
      return list;
    }
  }
  protected abstract IDataReader GetReader();
  protected T Build(IDataReader rdr);
}

With various implementations being used. One of them is used in:

public bool CheckNames(NameProvider source)
{
  IEnumerable<string> names = source.Results;
  switch(names.Count())
  {
      case 0:
        return true;//obviously none invalid.
      case 1:
        //having one name to check is a common case and for some reason
        //allows us some optimal approach compared to checking many.
        return FastCheck(names.Single());
      default:
        return NormalCheck(names)
  }
}

Now, none of this is particularly weird. We aren't assuming a particular implementaiton of IEnumerable. Indeed, this will work for arrays and very many commonly used collections (can't think of one in System.Collections.Generic that doesn't match off the top of my head). We've only used the normal methods, and the normal extension methods. It's not even unusual to have an optimised case for single-item collections. We could for instance change the list to be an array, or maybe a HashSet (to automatically remove duplicates), or a LinkedList or a few other things and it'll keep working.

Still, while we aren't depending on a particular implementation, we are depending on a particular feature, specifically that of being rewindable (Count() will either call ICollection.Count or else enumerate through the enumerable, after which the name-checking will take place.

Someone though sees Results property and thinks "hmm, that's a bit wasteful". They replace it with:

public IEnumerable<T> Results
{
  get
  {
    using(IDataReader rdr = GetReader())
      while(rdr.Read())
        yield return Build(rdr);
  }
}

This again is perfectly reasonable, and will indeed lead to a considerable performance boost in many cases. If CheckNames isn't hit in the immediate "tests" done by the coder in question (maybe it isn't hit in a lot of code paths), then the fact that CheckNames will error (and possibly return a false result in the case of more than 1 name, which may be even worse, if it opens a security risk).

Any unit test that hits on CheckNames with the more than zero results is going to catch it though.


Incidentally a comparable (if more complicated) change is a reason for a backwards-compatibility feature in NPGSQL. Not quite as simple as just replacing a List.Add() with a return yield, but a change to the way ExecuteReader worked gave a comparable change from O(n) to O(1) to get the first result. However, before then NpgsqlConnection allowed users to obtain another reader from a connection while the first was still open, and after it didn't. The docs for IDbConnection says you shouldn't do this, but that didn't mean there was no running code that did. Luckily one such piece of running code was an NUnit test, and a backwards-compatibility feature added to allow such code to continue to function with just a change to configuration.

Jon Hanna