I have a fixed data model that has a lot of data fields.

class Widget(models.Model):
    widget_owner = models.ForeignKey(auth.User)
    val1 = models.CharField()
    val2 = models.CharField()
    ...
    val568 = ...

I want to cram even more data into this Widget by letting my users specify custom data fields. What's a sane way to do this? Is storing name/value pairs, where the user can specify additional "Widget fields", a good idea? My rough pseudocode is below:

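# choices must be (stored value, human-readable label) pairs
data_types = (
    ('free_text', 'Free text'),
    ('date', 'Date'),
    ('integer', 'Integer'),
    ('price', 'Price'),
)

class CustomWidgetField(models.Model):
    owner = models.ForeignKey(auth.User)
    # the max_length values here are placeholders
    field_title = models.CharField(max_length=100)
    field_value_type = models.CharField(max_length=20, choices=data_types)

class CustomWidgetValue(models.Model):
    field_type = models.ForeignKey(CustomWidgetField)
    widget = models.ForeignKey(Widget)
    value = models.TextField()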

So I want to let each user define new types of data fields that apply to all of their widgets, and then specify a value for each custom field on each widget. I will probably have to filter and search on these custom fields just as I would on a native field (which I assume will be much slower than operating on native fields). The scale is modest, though: a few dozen custom fields per Widget, and each User will have only a few thousand Widgets in their inventory. I can also probably batch most of the searching and filtering on the custom fields into a backend script (maybe).
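
One wrinkle with keeping every custom value in a TextField is that filtering and sorting then compare strings, not numbers or dates. A minimal sketch of converting stored text back to typed values, one converter per entry in data_types, might look like this. The CONVERTERS table, the coerce helper, and the ISO date format are assumptions for illustration, not part of the models above.

```python
from datetime import date, datetime
from decimal import Decimal

# Hypothetical converters, one per value type named in data_types.
CONVERTERS = {
    "free_text": str,
    # Assumes dates are stored as ISO text such as "2009-10-05".
    "date": lambda raw: datetime.strptime(raw, "%Y-%m-%d").date(),
    "integer": int,
    "price": Decimal,
}


def coerce(value_type, raw):
    """Turn the stored text of a custom value into a typed Python object."""
    return CONVERTERS[value_type](raw)


# Text compares character by character, so "9" sorts after "10".
raw_values = ["9", "10", "100"]
as_text = sorted(raw_values)
as_int = sorted(coerce("integer", v) for v in raw_values)
```

With only a few thousand widgets per user, running this kind of conversion in Python before comparing is probably affordable.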

+5  A: 

It looks like you've reinvented the triple store. I think it's a common thing, as we follow the idea of database flexibility to its natural conclusion. Triple stores tend to be fairly inefficient in relational database systems, but there are systems designed specifically for them.

http://en.wikipedia.org/wiki/Triplestore

At the scales you're talking about, your performance is likely to be acceptable, but triple stores don't generally scale well without a specialized DB.
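
To make the comparison concrete: in the question's design the widget plays the subject, the custom field the predicate, and the value the object. A toy in-memory sketch of that idea follows. The identifiers and the match helper are made up for illustration and are not any real triple-store API.

```python
# Each fact is a (subject, predicate, object) triple.
triples = {
    ("widget:1", "color", "red"),
    ("widget:1", "weight", "12"),
    ("widget:2", "color", "blue"),
    ("widget:3", "color", "red"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# All widgets whose "color" is "red".
red_widgets = [s for s, _, _ in match(triples, predicate="color", obj="red")]
```

Every query here is a scan over all triples, which is roughly why the approach gets expensive without indexes designed for it.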

Paul McMillan
Agreed. You should probably consider something like the Redland library or rdflib (http://code.google.com/p/rdflib/wiki/IntroStore) instead of Django's db layer. The real benefit of this will be in using specialised query languages, like SPARQL.
Lee B
+6  A: 

Consider representing all custom properties with a serialized dict. I used this in a recent project and it worked really well.

import simplejson

class Widget(models.Model):
    owner = models.ForeignKey(auth.User)
    props = models.TextField(blank=True)  # serialized custom data

    @property
    def props_dict(self):
        # An empty props string is not valid JSON, so fall back to an empty dict.
        return simplejson.loads(self.props) if self.props else {}

class UserProfile(models.Model):
    user = models.ForeignKey(auth.User)
    widget_fields = models.TextField(blank=True)  # serialized schema declaration
Alex Lebedev
As long as you don't need to sort or filter based on the value of custom attributes, this solution will probably involve fewer headaches (and better performance) than the CustomWidgetField and CustomWidgetValue models. It could be cleaned up some by making props a custom field type that automatically serializes and deserializes itself on load/save.
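
The automatic serialize/deserialize idea can be sketched without Django at all, using a plain Python descriptor as a stand-in for a custom field type. JSONProperty and this bare Widget class are hypothetical; a real custom model field would hook into Django's own load and save machinery.

```python
import json


class JSONProperty:
    """Descriptor that exposes a JSON text attribute as a dict (hypothetical)."""

    def __init__(self, raw_attr):
        self.raw_attr = raw_attr  # name of the attribute holding the raw text

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        raw = getattr(instance, self.raw_attr)
        # An empty string means "no custom data yet".
        return json.loads(raw) if raw else {}

    def __set__(self, instance, value):
        setattr(instance, self.raw_attr, json.dumps(value, sort_keys=True))


class Widget:
    """Plain stand-in for the model; props is the stored text column."""

    props_dict = JSONProperty("props")

    def __init__(self, props=""):
        self.props = props


w = Widget()
w.props_dict = {"color": "red", "weight": 12}
```

Note that each read returns a fresh dict, so changing it in place does not update props; assign the whole dict back to store a change.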
Carl Meyer
+1  A: 

http://github.com/tuttle/django-expando may be of interest to you.

Alex Gaynor
A: 

In my opinion, the best way to achieve this sort of completely extensible model is really with EAV (Entity, Attribute, Value). It's basically a way to bring a schemaless, non-relational database to SQL. You can read a bunch more about it on Wikipedia, http://en.wikipedia.org/wiki/Entity-attribute-value%5Fmodel but one of the better implementations of it in Django is from the EveryBlock codebase. Hope it helps!

http://github.com/brosner/everyblock%5Fcode/blob/master/ebpub/ebpub/db/models.py
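
For a feel of what querying an EAV layout involves, here is a small self-contained sketch using the standard library's sqlite3 module. The table and column names are invented to mirror the question's models and are not taken from the EveryBlock code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE widget (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE custom_field (id INTEGER PRIMARY KEY, title TEXT, value_type TEXT);
    CREATE TABLE custom_value (
        widget_id INTEGER REFERENCES widget(id),
        field_id INTEGER REFERENCES custom_field(id),
        value TEXT
    );
    INSERT INTO widget VALUES (1, 'sprocket'), (2, 'gear'), (3, 'cog');
    INSERT INTO custom_field VALUES (1, 'shelf', 'integer'), (2, 'color', 'free_text');
    INSERT INTO custom_value VALUES
        (1, 1, '9'), (2, 1, '10'), (3, 1, '100'),
        (1, 2, 'red'), (2, 2, 'blue');
""")

# Filtering on one custom field costs two joins plus a cast from text.
rows = conn.execute(
    """
    SELECT w.name
    FROM widget AS w
    JOIN custom_value AS v ON v.widget_id = w.id
    JOIN custom_field AS f ON f.id = v.field_id
    WHERE f.title = ? AND CAST(v.value AS INTEGER) >= ?
    ORDER BY CAST(v.value AS INTEGER)
    """,
    ("shelf", 10),
).fetchall()
names = [name for (name,) in rows]
```

Filtering on several custom fields at once generally means joining the value table once per field, which is where EAV queries start to get heavy.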

Justin Lilly
A: 

When I had an object that could be completely customized by users, I added a field to the model that stores JSON in its column. Then you can just serialize back and forth when you need to use it or save it.

However, it does make it harder to use the data in SQL queries.
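
One way around that, at the scale of a few thousand widgets per user, is to load the rows and filter in Python. A rough sketch, where the row layout and the filter_widgets helper are assumptions for illustration:

```python
import json

# Rows as they might come back from the database: (widget id, props JSON text).
rows = [
    (1, '{"color": "red", "shelf": 9}'),
    (2, '{"color": "blue", "shelf": 10}'),
    (3, ""),  # a widget with no custom data
]


def filter_widgets(rows, key, predicate):
    """Return ids of widgets whose custom property `key` satisfies predicate."""
    matches = []
    for widget_id, raw in rows:
        props = json.loads(raw) if raw else {}
        if key in props and predicate(props[key]):
            matches.append(widget_id)
    return matches


high_shelf = filter_widgets(rows, "shelf", lambda v: v >= 10)
red = filter_widgets(rows, "color", lambda v: v == "red")
```

This decodes every row on every query, so it only makes sense while the per-user data set stays small.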

seanmonstar