views:

2413

answers:

8

I need to store some data on a Django model. These data are not equal to all instances of the model.

At first I thought on subclassing the model, but I'm trying to keep the application flexible; if I use subclasses, I'll need to create a whole class each time i need a new kind of object, and that's no good. I'll also end up with a lot of subclasses only to store a pair of extra fields.

I really feel that a dictionary would be the best approach, but there's nothing on the Djando documentation (or I can't find it).

Any clues?

+1  A: 

If you don't need to query by any of this extra data, then you can store it as a serialized dictionary. Use repr to turn the dictionary into a string, and eval to turn the string back into a dictionary. Take care with eval that there's no user data in the dictionary, or use a safe_eval implementation.

Ned Batchelder
+1  A: 

Think it over, and find the commonalities of each data set... then define your model. It may require the use of subclasses or not. Foreign keys representing commonalities aren't to be avoided, but encouraged when they make sense.

Stuffing random data into a SQL table is not smart, unless it's truly non-relational data. If that's the case, define your problem and we may be able to help.

Daniel
+1: don't just stuff Python objects into a table haphazardly.
S.Lott
+2  A: 

I'm not sure exactly sure of the nature of the problem you're trying to solve, but it sounds curiously similar to Google App Engine's BigTable Expando.

Expandos allow you to specify and store additional fields on an database-backed object instance at runtime. To quote from the docs:

import datetime

class Song(db.Expando):
  title = db.StringProperty()

crazy = Song(title='Crazy like a diamond',
             author='Lucy Sky',
             publish_date='yesterday',
             rating=5.0)

crazy.last_minute_note=db.Text('Get a train to the station.')

Google App Engine currently supports both Python and the Django framework. Might be worth looking into if this is the best way to express your models.

Traditional relational database models don't have this kind of column-addition flexibility. If your datatypes are simple enough you could break from traditional RDBMS philosophy and hack values into a single column via serialization as @Ned Batchelder proposes; however, if you have to use an RDBMS, Django model inheritance is probably the way to go. Notably, it will create a one-to-one foreign key relation for each level of derivation.

cdleary
+1  A: 

As Ned answered, you won't be able to query "some data" if you use the dictionary approach.

If you still need to store dictionaries then the best approach, by far, is the PickleField class documented in Marty Alchin's new book Pro Django. This method uses Python class properties to pickle/unpickle a python object, only on demand, that is stored in a model field.

The basics of this approach is to use django's contibute_to_class method to dynamically add a new field to your model and uses getattr/setattr to do the serializing on demand.

One of the few online examples I could find that is similar is this definition of a JSONField.

Van Gale
+4  A: 

If it's really dictionary like arbitrary data you're looking for you can probably use a two-level setup with one model that's a container and another model that's key-value pairs. You'd create an instance of the container, create each of the key-value instances, and associate the set of key-value instances with the container instance. Something like:

class Dicty(models.Model):
    name      = models.CharField(max_length=50)

class KeyVal(models.Model):
    container = models.ForeignKey(Dicty, db_index=True)
    key       = models.CharField(max_length=240, db_index=True)
    value     = models.CharField(max_length=240, db_index=True)

It's not pretty, but it'll let you access/search the innards of the dictionary using the DB whereas a pickle/serialize solution will not.

Parand
The only downside is that an additional DB query will ensue
kRON
+1  A: 

Beeing "not equal to all instances of the model" sounds to me like a good match for a "Schema-free database". CouchDB is the poster child for that approach and you might consider that.

In a project I moved several tables which never played very nice with the Django ORM over to CouchDB and I'm quite happy with that. I use couchdb-python without any of the Django-specific CouchDB modules. A description of the data model can be found here. The movement from five "models" in Django to 3 "models" in Django and one CouchDB "database" actually slightly reduced the total lines of code in my application.

mdorseif
A: 

Django-Geo includes a "DictionaryField" you might find helpful:

http://code.google.com/p/django-geo/source/browse/trunk/fields.py?r=13#49

In general, if you don't need to query across the data use a denormalized approach to avoid extra queries. User settings are a pretty good example!

jb
A: 

I agree that you need to refrain stuffing otherwise structured data into a single column. But if you must do that, Django has an XMLField build-in.

There's also JSONField at Django snipplets.

muhuk