Consider the following situation:

Suppose my app allows users to create the states / provinces in their country. Just for clarity, we are considering only ASCII characters here.

In the US, a user could create the state called "Texas". If this app is being used internally, let's say the user doesn't care whether it is spelled "texas", "Texas" or "teXas".

But importantly, the system should prevent creation of "texas" if "Texas" is already in the database.

If the model is like the following:

class State(models.Model):
    name = models.CharField(max_length=50, unique=True)

The uniqueness would be case-sensitive in Postgres; that is, Postgres would allow the user to create both "texas" and "Texas" because it treats them as distinct values.

What can be done in this situation to prevent such behavior? How does one go about providing case-insensitive uniqueness with Django and Postgres?

Right now I'm doing the following to prevent creation of case-insensitive duplicates.

from django import forms
from django.core.exceptions import ObjectDoesNotExist

class CreateStateForm(forms.ModelForm):
    def clean_name(self):
        name = self.cleaned_data['name']
        try:
            State.objects.get(name__iexact=name)
        except ObjectDoesNotExist:
            return name
        raise forms.ValidationError('State already exists.')

    class Meta:
        model = State

There are a number of cases where I will have to do this check and I'm not keen on having to write similar iexact checks everywhere.

Just wondering if there is a built-in or better way? Perhaps db_type would help? Maybe some other solution exists?

A: 

You can do this by overriding the model's save method (see the docs). You'd basically do something like:

class State(models.Model):
    name = models.CharField(max_length=50, unique=True)

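    def save(self, *args, **kwargs):
        # Look for another row with the same name, ignoring case.
        # filter() is used because get() raises State.DoesNotExist when nothing matches.
        duplicates = State.objects.filter(name__iexact=self.name).exclude(pk=self.pk)
        if duplicates.exists():
            # As in the original idea, a duplicate is silently not saved.
            return
        super(State, self).save(*args, **kwargs)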

Also, I may be wrong about this, but the upcoming model-validation SoC branch will allow us to do this more easily.

Rishabh Manocha
This is essentially the same as what I am already doing. In fact, the way I am doing it right now is better than handling it within save.
chefsmart
On a second read, you're right. AFAIK, using the forms' validation would be the best way to go (as of now) - unless the data is not being inserted via a form :).
Rishabh Manocha
A: 

Besides the already mentioned option of overriding save, you can simply store all text in lower case in the database and capitalize it for display.

class State(models.Model):
    name = models.CharField(max_length=50, unique=True)

    def save(self, force_insert=False, force_update=False):
        self.name = self.name.lower()
        super(State, self).save(force_insert, force_update)
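
The same normalize-then-store idea can be sketched in plain Python, outside Django. This is only an illustration: `StateRegistry`, `normalize_name` and `display_name` are hypothetical names, the in-memory set stands in for the unique column, and `str.title()` is just one possible display rule. Note that the original casing (for example "teXas") is not recoverable once the value has been lowered.

```python
def normalize_name(raw):
    """Return the form that would be stored: lowercased."""
    return raw.lower()


def display_name(stored):
    """Capitalize each word of a stored name for display."""
    return stored.title()


class StateRegistry:
    """Hypothetical in-memory stand-in for a table with a unique name column."""

    def __init__(self):
        self._names = set()

    def add(self, raw):
        """Store the normalized name; return False if it already exists."""
        key = normalize_name(raw)
        if key in self._names:
            return False
        self._names.add(key)
        return True


registry = StateRegistry()
registry.add("Texas")   # stored as "texas"
registry.add("teXas")   # rejected, same normalized value
```

Because every value is lowered before it is stored, an ordinary case-sensitive unique constraint is enough to reject the second insert.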
Michal Čihař
+4  A: 

You could define a custom model field derived from models.CharField. This field could check for duplicate values, ignoring case.

Custom fields documentation is here http://docs.djangoproject.com/en/dev/howto/custom-model-fields/

Look at http://code.djangoproject.com/browser/django/trunk/django/db/models/fields/files.py for an example of how to create a custom field by subclassing an existing field.

You could use the citext module of PostgreSQL http://www.postgresql.org/docs/8.4/static/citext.html

If you use this module, the custom field could define "db_type" as CITEXT for PostgreSQL databases.

This would lead to case insensitive comparison for unique values in the custom field.

Mayuresh
This is an interesting solution, and it seems more Django-istic than the other solutions mentioned here.
chefsmart
A: 

On the Postgres side of things, a functional unique index will let you enforce unique values without case. citext is also noted, but this will work with older versions of PostgreSQL and is a useful technique in general.

Example:

# create table foo(bar text);
CREATE TABLE
# create unique index foo_bar on foo(lower(bar));
CREATE INDEX
# insert into foo values ('Texas');
INSERT 0 1
# insert into foo values ('texas');
ERROR:  duplicate key value violates unique constraint "foo_bar"
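
For readers who want to try the technique without a Postgres server, here is a rough sketch of the same functional unique index using Python's built-in sqlite3 module. SQLite is swapped in purely for convenience (it needs a version that supports indexes on expressions); the table and index names mirror the example above, while `make_table` and `try_insert` are hypothetical helpers. In Postgres the failure surfaces as the duplicate key error shown above rather than as `sqlite3.IntegrityError`.

```python
import sqlite3


def make_table(conn):
    """Create the table and a unique index on the lowercased column."""
    conn.execute("CREATE TABLE foo (bar TEXT)")
    conn.execute("CREATE UNIQUE INDEX foo_bar ON foo (lower(bar))")


def try_insert(conn, value):
    """Insert a value; return False if the unique index rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO foo (bar) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
make_table(conn)
first = try_insert(conn, "Texas")   # accepted
second = try_insert(conn, "texas")  # rejected by the index on lower(bar)
```

The point of the sketch is that the database itself refuses the second row, so the rule holds even for writes that never pass through a form.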
Alex Brasetvik
I have tried this and can confirm it works. But the answer by Mayuresh allows me to live within Django.
chefsmart
Well, you should always enforce your constraints in the database as well.
Alex Brasetvik
+1  A: 

Alternatively you can change the default Query Set Manager to do case insensitive look-ups on the field. In trying to solve a similar problem I came across:

http://djangosnippets.org/snippets/305/

Code pasted here for convenience:

from django.db import models
from django.db.models import Manager
from django.db.models.query import QuerySet

class CaseInsensitiveQuerySet(QuerySet):
    def _filter_or_exclude(self, mapper, *args, **kwargs):
        # 'name' is a field in your Model whose lookups you want case-insensitive by default
        if 'name' in kwargs:
            kwargs['name__iexact'] = kwargs['name']
            del kwargs['name']
        return super(CaseInsensitiveQuerySet, self)._filter_or_exclude(mapper, *args, **kwargs)

# custom manager that overrides the initial query set
class TagManager(Manager):
    def get_query_set(self):
        return CaseInsensitiveQuerySet(self.model)

# and the model itself
class Tag(models.Model):
    name = models.CharField(max_length=50, unique=True, db_index=True)

    objects = TagManager()

    def __str__(self):
        return self.name
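
The core of the snippet is the keyword rewrite, which can be looked at on its own. The sketch below isolates it in plain Python; `make_case_insensitive` and its `fields` parameter are hypothetical, not part of Django or of the snippet. Keeping the rewrite in a separate function may also help because `_filter_or_exclude` is a private method, so its signature is not guaranteed to stay the same across Django versions.

```python
def make_case_insensitive(kwargs, fields=("name",)):
    """Return a copy of kwargs where exact lookups on the given fields
    are replaced by their __iexact equivalents."""
    rewritten = dict(kwargs)  # leave the caller's dict untouched
    for field in fields:
        if field in rewritten:
            rewritten[field + "__iexact"] = rewritten.pop(field)
    return rewritten


lookup = make_case_insensitive({"name": "Texas", "id": 3})
# lookup now uses name__iexact instead of name; id is unchanged
```

Only plain `name=...` lookups are rewritten; an explicit lookup such as `name__startswith` is passed through as is.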
Foo