views:

249

answers:

2

We have a web application that takes user inputs or database lookups to form some operations against some physical resources. The design can be simply presented as following diagram:

user input <=> model object <=> database storage

validations are needed with request coming from user input but NOT when coming from database lookup hits (since if a record exists, those attributes must have already been validated before). I am trying to refactoring the code so that the validations happen in the object constructor instead of the old way (a separate few validation routines)

How would you decide which way is better? (The fundamental difference of method 1 (the old way) and 2 is that validations in 1 are not mandatory and decoupled from object instantiation but 2 binds them and makes them mandatory for all requests)

Here are two example code snippets for design 1 and 2:

Method 1:

# For processing single request.
# Steps: 1. Validate all incoming data. 2. instantiate the object.
ValidateAttribures(request) # raise Exceptions if failed
resource = Resource(**request)

Method 2:

# Have to extract out this since it does not have anything to do with
# the object.
# raise Exceptions if some required params missing.
# steps: 1. Check whether its a batching request. 2. instantiate the object.
#           (validations are performed inside the constructor)
CheckIfBatchRequest(request) 
resource = Resource(**request) # raise Exceptions when validations failed

In a batch request: Method 1:

# steps: 1. validate each request and return error to the client if any found.
#        2. perform the object instantiate and creation process. Exceptions are
#           captured.
#        3. when all finished, email out any errors.
for request in batch_requests:
    try:    
        ValidateAttribute(request)
    except SomeException, e:
        return ErrorPage(e)
errors = []
for request in batch_requests:
    try:
        CreatResource(Resource(**request), request)
    except CreationError, e:
        errors.append('failed to create with error: %s', e)
email(errors)

Method 2:

# steps: 1. validate batch job related data from the request.
#        2. If success, create objects for each request and do the validations.
#        3. If exception, return error found, otherwise, 
#           return a list of pairs with (object, request)
#        4. Do the creation process and email out any errors if encountered.
CheckIfBatchRequest(request)
request_objects = []
for request in batch_requests:
    try:
        resource = Resource(**request)
    except SomeException, e:
        return ErrorPage(e)
    request_objects.append((resource, request))
email(CreateResource(request_objects)) # the CreateResource will also need to be refactored.

Pros and Cons as I can see here are:

  1. Method 1 follows more close to the business logic. No redundant validations check when objects come from db lookup. The validation routines are better maintainable and read.
  2. Method 2 makes easy and clean for the caller. Validations are mandatory even if from db lookup. Validations are less maintainable and read.
+1  A: 

Doing validation in the constructor really isn't the "Django way". Since the data you need to validate is coming from the client-side, using new forms (probably with a ModelForm) is the most idiomatic method to validate because it wraps all of your concerns into one API: it provides sensible validation defaults (with the ability to easily customize), plus model forms integrates the data-entry side (the html form) with the data commit (model.save()).

However, it sounds like you have what may be a mess of a legacy project; it may be outside the scope of your time to rewrite all your form handling to use new forms, at least initially. So here are my thoughts:

First of all, it's not "non-Djangonic" to put some validation in the model itself - after all, html form submissions may not be the only source of new data. You can override the save() method or use signals to either clean the data on save or throw an exception on invalid data. Long term, Django will have model validation, but it's not there yet; in the interim, you should consider this a "safety" to ensure you don't commit invalid data to your DB. In other words, you still need to validate field by field before committing so you know what error to display to your users on invalid input.

What I would suggest is this. Create new forms classes for each item you need to validate, even if you're not using them initially. Malcolm Tredinnick outlined a technique for doing model validation using the hooks provided in the forms system. Read up on that (it's really quite simple and elegant), and hook in into your models. Once you've got the newforms classes defined and working, you'll see that it's not very difficult - and will in fact greatly simplify your code - if you rip out your existing form templates and corresponding validation, and handle your form POSTs using the forms framework. There is a bit of a learning curve, but the forms API is extremely well thought out and you'll be grateful how much cleaner it will make your code.

Daniel
A: 

Thanks Daniel for your reply. Especially for the newforms API, I will definitely spend time digging into it and see if I can adopt it for the better long-term benefits. But just for the sake of getting my work done for this iteration (meet my deadline before EOY), I'd probably still have to stick with the current legacy structure, after all, either way will get me to what I want, just that I want to make it sane and clean as possible as I can without breaking too much.

So sounds like doing validations in model isn't a too bad idea, but in another sense, my old way of doing validations in views against the request seems also close to the concept of encapsulating them inside the newforms API (data validation is decoupled from model creation). Do you think it is ok to just keep my old design? It make more sense to me to touch this with the newforms API instead of juggling them right now...

(Well I got this refactoring suggestion from my code reviewer but I am really not so sure that my old way violates any mvc patterns or too complicated to maintain. I think my way makes more senses but my reviewer thought binding validation and model creation together makes more sense...)

As a very general point, try to keep as much logic as possible outside of the constructor.Having logic in the constructor makes unit testing much more difficult, and strong unit tests are really important for any refactoring or maintenance.
Roberto Liffredo