tags:

views:

137

answers:

8

Hi all,

This is most certainly a language agnostic question and one that has bothered me for quite some time now. An example will probably help me explain the dilemma I am facing:

Let us say we have a method which is responsible for reading a file, populating a collection with some objects (which store information from the file), and then returning the collection...something like the following:

public List<SomeObject> loadConfiguration(String filename);

Let us also say that at the time of implementing this method, it would seem infeasible for the application to continue if the collection returned was empty (a size of 0). Now, the question is, should this validation (checking for an empty collection and perhaps the subsequent throwing of an exception) be done within the method? Or, should this methods sole responsibility be to perform the load of the file and ignore the task of validation, allowing validation to be done at some later stage outside of the method?

I guess the general question is: is it better to decouple the validation from the actual task being performed by a method? Will this make things, in general, easier at a later stage to change or build upon - in the case of my example above, it may be the case at a later stage where a different strategy is added to recover from the event of an empty collection being return from the 'loadConfiguration' method..... this would be difficult if the validation (and resulting exception) was being done in the method.

Perhaps I am being overly pedantic in the quest for some dogmatic answer, where instead it simply just relies on the context in which a method is being used. Anyhow, I would be very interested in seeing what others have to say regarding this.

Thanks all!

A: 

Methods should be highly cohesive ... that is single minded. So my opinion would be to separate the responsibilities as you have described. I sometimes feel tempted to say...it is just a short method so it does not matter...then I regret it 1.5 weeks later.

Vincent Ramdhanie
A: 

I think this depends on the case: If you could think of a scenario where you would use this method and it returned an empty list, and this would be okay, then I would not put the validation inside the method. But for e.g. a method which inserts data into a database which have to be validated (is the email address correct, has a name been specified, ... ) then it should be ok to put validation code inside the function and throw an exception.

eWolf
+4  A: 

I guess the general question is: is it better to decouple the validation from the actual task being performed by a method?

Yes. (At least if you really insist on answering such a general question – it’s always quite easy to find a counter-example.) If you keep both the parts of the solution separate, you can exchange, drop or reuse any of them. That’s a clear plus. Of course you must be careful not to jeopardize your object’s invariants by exposing the non-validating API, but I think you are aware of that. You’ll have to do some little extra typing, but that won’t hurt you.

zoul
Our opinions diverge I think. delayed validation (of parameters, return, etc...) is an open gate to debugging nightmares.
Matthieu M.
You are welcome to disagree :) Skipping unit tests is an open gate to debugging nightmares. If you don’t brush off the design you will keep the dangerous wires encapsulated and the outer API tested, then you have nothing to fear. Of course I am not going to defend this over my dead body, there are clearly cases where you would do best to keep the loading and validation in one piece.
zoul
I agree. A counter-example: you don't want to be bothered by an exception because the program needs to start with default values for some options anyway and the remaining are few, so you initialize them all, don't do anything when there is not a configuration file (or is empty) and forget about it.
MaD70
Clarification: my counter-example is actually an example in favor of zoul's position.
MaD70
+2  A: 

To deflect the question to a more basic one, each method should do as little as possible. So in your example, there should be a method that reads in the file, a method that extracts the necessary data from the file, another method to write that data to the collection, and another method that calls these methods. The validation can go in a separate method, or in one of the others, depending on where it makes the most sense.

   private byte[] ReadFile(string fileSpec)
   {  
       // code to read in file, and return contents
   }
   private FileData GetFileData(string fileContents)
   {
       // code to create FileData struct from file contents
   }
   private void FileDataCollection: Collection<FileData> { }

   public void DoItAll (string fileSpec, FileDataCollection filDtaCol)
   {
        filDtaCol.Add(GetFileData(ReadFile(fileSpec)));
   }

Add validation, verification to each of the methods as appropriate

Charles Bretana
A 'basic' method should do as little as possible, but it's okay for one method to aggregate 3/4 others methods.
Matthieu M.
+7  A: 

My recommendation is to stick to the single responsibility principle which says, in a nutshell, that each object should have 1 purpose. In this instance, your method has 3 purposes and then 4 if you count the validation aspect.

Here's my recommendation on how to handle this and how to provide a large amount of flexibility for future updates.

  1. Keep your LoadConfig method

  2. Have it call the a new method for reading the file.

  3. Pass the previous method's return value to another method for loading the file into the collection.

  4. Pass the object collection into some validation method.

  5. Return the collection.

That's taking 1 method initially and breaking it into 4 with one calling 3 others. This should allow you to change pieces w/o having any impact on others.

Hope this helps

JamesEggers
I understand the good intentions of such suggestion, but it risks to instigate over-engineering.
MaD70
@MaD70 to an extent I agree; however, if unit testing is going to be used, keeping things small and simple will greatly help keep the code maintainable. If it's a tiny system then I agree with you; however, based on my experiences it's better to always abstract out to single responsibilities else the risk of indirect bugs grows.
JamesEggers
+2  A: 

I will answer your question by a question: do you want various validation methods for the product of your method ?

This is the same as the 'constructor' issue: is it better to raise an exception during the construction or initialize a void object and then call an 'init' method... you are sure to raise a debate here!

In general, I would recommend performing the validation as soon as possible: this is known as the Fail Fast which advocates that finding problems as soon as possible is better than delaying the detection since diagnosis is immediate while later you would have to revert the whole flow....

If you're not convinced, think of it this way: do you really want to write 3 lines every time you load a file ? (load, parse, validate) Well, that violates the DRY principle.

So, go agile there:

  • write your method with validation: it is responsible for loading a valid configuration (1)
  • if you ever need some parametrization, add it then (like a 'check' parameter, with a default value which preserves the old behavior of course)

(1) Of course, I don't advocate a single method to do all this at once... it's an organization matter: under the covers this method should call dedicated methods to organize the code :)

Matthieu M.
+1 from me, even though we gave opposing answers. I think we both mean the same: You say to keep the validation with the loading, but split the implementation into separate methods under the cover; I say to keep them separate, but be careful when exposing the parts in the outer API.
zoul
"In general, I would recommend..." but there aren't "general programs" out there; only specific programs, written by specific people, to solve specific problems (however general one want it, a program doesn't solve every problem in existence). See my reply to zoul for an example where one doesn't want to raise an exception in case of an empty (or non existent) configuration file.
MaD70
I don't see your points... of course if there are cases when you are content with a situation, then don't treat it as an error there. However not treating any error is a lazy approach, you have to carefully define the contract of your method, and make it evolve if necessary when you need to refactor.
Matthieu M.
+2  A: 

You are designing an API and should not make any unnecessary assumptions about your client. A method should take only the information that it needs, return only the information requested, and only fail when it is unable to return a meaningful value.

So, with that in mind, if the configuration is loadable but empty, then returning an empty list seems correct to me. If your client has an application specific requirement to fail when provided an empty list, then it may do so, but future clients may not have that requirement. The loadConfiguration method itself should fail when it really fails, such as when it is unable to read or parse the file.

But you can continue to decouple your interface. For example, why must the configuration be stored in a file? Why can't I provide a URL, a row in a database, or a raw string containing the configuration data? Very few methods should take a file path as an argument since it binds them tightly to the local file system and makes them responsible for opening, reading, and closing files in addition to their core logic. Consider accepting an input stream as an alternative. Or if you want to allow for elaborate alternatives -- like data from a database -- consider accepting a ConfigurationReader interface or similar.

Ryan Bright
A: 

Another alternative, not mentioned above, is to support Dependency Injection and have the method client inject a validator. This would allow the preservation of the "strong" Resource Acquisition Is Initialization principle, that is to say Any Object which Loads Successfully is Ready For Business (Matthieu's mention of Fail Fast is much the same notion).

It also allows a resource implementation class to create its own low-level validators which rely on the structure of the resource without exposing clients to implementation details unnecessarily, which can be useful when dealing with multiple disparate resource providers such as Ryan listed.

Mike Burton