I want to create a new type of field for django models that is basically a ListOfStrings. So in your model code you would have the following:

models.py:

from django.db import models

class ListOfStringsField(???):
    ???

class myDjangoModelClass(models.Model):
    myName = models.CharField(max_length=64)
    myFriends = ListOfStringsField()

other.py:

myclass = myDjangoModelClass()
myclass.myName = "bob"
myclass.myFriends = ["me", "myself", "and I"]

myclass.save()

id = myclass.id

loadedmyclass = myDjangoModelClass.objects.get(id__exact=id)

myFriendsList = loadedmyclass.myFriends
# myFriendsList is a list and should equal ["me", "myself", "and I"]

How would you go about writing this field type, with the following stipulations?

  • We don't want to create a field that just crams all the strings together and separates them with a token in one field like this. It is a good solution in some cases, but we want to keep the string data normalized so tools other than Django can query the data.
  • The field should automatically create any secondary tables needed to store the string data.
  • The secondary table should ideally have only one copy of each unique string. This is optional, but would be nice to have.
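To make the target layout concrete, here is a rough sketch of the tables such a field would have to manage, written against the standard-library sqlite3 module rather than Django. The table names (person, friend_string, person_friends) and the helper functions are made up for illustration, and the position column is an assumption added so that list order survives a round trip.

```python
import sqlite3

def create_schema(conn):
    """Create the owner table, the unique-string table and the join table."""
    conn.executescript("""
        CREATE TABLE person (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE friend_string (
            id INTEGER PRIMARY KEY,
            value TEXT NOT NULL UNIQUE  -- one copy of each distinct string
        );
        CREATE TABLE person_friends (
            person_id INTEGER NOT NULL REFERENCES person(id),
            string_id INTEGER NOT NULL REFERENCES friend_string(id),
            position INTEGER NOT NULL,
            PRIMARY KEY (person_id, position)
        );
    """)

def save_friends(conn, person_id, friends):
    """Replace the stored list for one person with the given strings."""
    conn.execute("DELETE FROM person_friends WHERE person_id = ?", (person_id,))
    for position, value in enumerate(friends):
        # Reuse the existing row if this string is already stored.
        conn.execute(
            "INSERT OR IGNORE INTO friend_string (value) VALUES (?)", (value,))
        (string_id,) = conn.execute(
            "SELECT id FROM friend_string WHERE value = ?", (value,)).fetchone()
        conn.execute(
            "INSERT INTO person_friends (person_id, string_id, position) "
            "VALUES (?, ?, ?)",
            (person_id, string_id, position))

def load_friends(conn, person_id):
    """Return the stored strings for one person as an ordered list."""
    rows = conn.execute(
        "SELECT s.value FROM person_friends pf "
        "JOIN friend_string s ON s.id = pf.string_id "
        "WHERE pf.person_id = ? ORDER BY pf.position",
        (person_id,))
    return [value for (value,) in rows]
```

Because the strings live in ordinary tables, any tool that can run a join can read them, which is the point of the first stipulation.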

Looking in the Django code it looks like I would want to do something similar to what ForeignKey is doing, but the documentation is sparse.

This leads to the following questions:

  • Can this be done?
  • Has it been done (and if so where)?
  • Is there any documentation on Django about how to extend and override their model classes, specifically their relationship classes? I have not seen a lot of documentation on that aspect of their code, but there is this.

This comes from this question.

A: 

I think what you want is a custom model field.

JosefAssad
+2  A: 

There's some very good documentation on creating custom fields here.

However, I think you're overthinking this. It sounds like you actually just want a standard foreign key, but with the additional ability to retrieve all the elements as a single list. So the easiest thing would be to just use a ForeignKey, and define a get_my_friends_as_list method on the model:

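class Friends(models.Model):
    name = models.CharField(max_length=100)
    # String reference, because MyModel is defined further down.
    my_items = models.ForeignKey('MyModel')

class MyModel(models.Model):
    ...

    def get_my_friends_as_list(self):
        # friends_set is the default reverse accessor for the ForeignKey above.
        return list(self.friends_set.values_list('name', flat=True))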

Now calling get_my_friends_as_list() on an instance of MyModel will return you a list of strings, as required.

Daniel Roseman
I would have thought the ForeignKey belongs in the Friends class. Am I missing something? And yes, I probably am overthinking this, but if I can create the field I want, I think it would be generally useful.
grieve
Yes sorry, you're right, I've updated the code.
Daniel Roseman
I think FK is the way to go here as well. +1 for adding the method to retrieve friends as a list.
googletorp
+4  A: 

I also think you're going about this the wrong way. Trying to make a Django field create an ancillary database table is almost certainly the wrong approach. It would be very difficult to do, and would likely confuse third party developers if you are trying to make your solution generally useful.

If you're trying to store a denormalized blob of data in a single column, I'd take an approach similar to the one you linked to, serializing the Python data structure and storing it in a TextField. If you want tools other than Django to be able to operate on the data then you can serialize to JSON (or some other format that has wide language support):

from django.db import models
from django.utils import simplejson

class JSONDataField(models.TextField):
    __metaclass__ = models.SubfieldBase

    def to_python(self, value):
        if value is None: 
            return None
        if not isinstance(value, basestring): 
            return value
        return simplejson.loads(value)

    def get_db_prep_save(self, value):
        if value is None: 
            return None
        return simplejson.dumps(value)
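The two methods above are mirror images: one turns the Python value into text on its way to the database, the other turns the text back into a Python value. Stripped of Django, the round trip looks roughly like this. The sketch uses the standard-library json module in place of django.utils.simplejson, and the function names are hypothetical.

```python
import json

def to_db_text(value):
    """Serialize a Python value for storage in a text column (None stays None)."""
    if value is None:
        return None
    return json.dumps(value)

def from_db_text(value):
    """Turn stored text back into a Python value; pass through anything already decoded."""
    if value is None:
        return None
    if not isinstance(value, str):
        return value
    return json.loads(value)
```

A list of strings survives the round trip unchanged, and the stored text is plain JSON that other languages can parse.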

If you just want a django Manager-like descriptor that lets you operate on a list of strings associated with a model then you can manually create a join table and use a descriptor to manage the relationship. It's not exactly what you need, but this code should get you started.
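For readers who have not written a descriptor before, here is a minimal plain-Python sketch of the idea. A dict stands in for the join table, and the class names and the pk attribute are assumptions made for the example, not Django API.

```python
class StringListDescriptor(object):
    """Expose rows of a (simulated) join table as a plain list attribute."""

    def __init__(self):
        # Stand-in for the join table: owner key -> list of strings.
        self._rows = {}

    def __get__(self, instance, owner):
        if instance is None:
            return self  # accessed on the class, not on an instance
        # Return a copy so callers cannot mutate the stored rows by accident.
        return list(self._rows.get(instance.pk, []))

    def __set__(self, instance, values):
        # Replace the stored rows for this owner.
        self._rows[instance.pk] = list(values)


class Person(object):
    friends = StringListDescriptor()

    def __init__(self, pk, name):
        self.pk = pk
        self.name = name
```

In a real implementation the reads and writes in __get__ and __set__ would go through the related model's manager instead of a dict.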

mmalone
I just got a chance to look at the link you posted. I think it is closer to what I am trying to do implementation-wise than what I currently have. Thanks!
grieve
+1 for only using this approach if it's for denormalization! Otherwise, normalize!
Soviut
+1  A: 

Thanks to all those who answered. Even where I didn't use an answer directly, the examples and links got me going in the right direction.

I am not sure if this is production ready, but it appears to be working in all my tests so far.

from django.db import models

class ListValueDescriptor(object):

   def __init__(self, lvd_parent, lvd_model_name, lvd_value_type, lvd_unique, **kwargs):
      """
         This descriptor object acts like a django field, but it will accept
         a list of values, instead of a single value.
         For example:
            # define our model
            class Person(models.Model):
               name = models.CharField(max_length=120)
               friends = ListValueDescriptor("Person", "Friend", "CharField", True, max_length=120)

            # Later in the code we can do this
            p = Person(name="John")
            p.save() # we have to have an id
            p.friends = ["Jerry", "Jimmy", "Jamail"]
            ...
            p = Person.objects.get(name="John")
            friends = p.friends
            # and now friends is a list.
         lvd_parent - The name of our parent class
         lvd_model_name - The name of our new model
         lvd_value_type - The value type of the value in our new model
                        This has to be the name of one of the valid django
                        model field types such as 'CharField', 'FloatField',
                        or a valid custom field name.
         lvd_unique - Set this to true if you want the values in the list to
                     be unique in the table they are stored in. For
                     example if you are storing a list of strings and
                     the strings are always "foo", "bar", and "baz", your
                     data table would only have those three strings listed in
                     it in the database.
         kwargs - These are passed to the value field.
      """
      self.related_set_name = lvd_model_name.lower() + "_set"
      self.model_name = lvd_model_name
      self.parent = lvd_parent
      self.unique = lvd_unique

      # Default db_index to True unless the caller set it explicitly.
      # The index speeds up the lookups performed when unique is true.
      kwargs['db_index'] = kwargs.get('db_index', True)

      filter = ["lvd_parent", "lvd_model_name", "lvd_value_type", "lvd_unique"]

      evalStr = """class %s (models.Model):\n""" % (self.model_name)
      evalStr += """    value = models.%s(""" % (lvd_value_type)
      evalStr += self._params_from_kwargs(filter, **kwargs) 
      evalStr += ")\n"
      if self.unique:
         evalStr += """    parent = models.ManyToManyField('%s')\n""" % (self.parent)
      else:
         evalStr += """    parent = models.ForeignKey('%s')\n""" % (self.parent)
      evalStr += "\n"
      evalStr += """self.innerClass = %s\n""" % (self.model_name)

      print(evalStr) # debug output: show the generated class source

      exec(evalStr) # build the inner class

   def __get__(self, instance, owner):
      if instance is None:
         return self # accessed on the class, not on an instance
      value_set = getattr(instance, self.related_set_name)
      return [x.value for x in value_set.all()]

   def __set__(self, instance, values):
      value_set = getattr(instance, self.related_set_name)
      for x in values:
         value_set.add(self._get_or_create_value(x))

   def __delete__(self, instance):
      pass # I should probably try and do something here.


   def _get_or_create_value(self, x):
      if self.unique:
         # Try and find an existing value
         try:
            return self.innerClass.objects.get(value=x)
         except self.innerClass.DoesNotExist:
            pass

      v = self.innerClass(value=x)
      v.save() # we have to save to create the id.
      return v

   def _params_from_kwargs(self, filter, **kwargs):
      """Given a dictionary of arguments, build a string which 
      represents it as a parameter list, and filter out any
      keywords in filter."""
      params = ["%s=%r" % (key, value)
                for key, value in kwargs.items() if key not in filter]
      return ", ".join(params)

class Person(models.Model):
   name = models.CharField(max_length=120)
   friends = ListValueDescriptor("Person", "Friend", "CharField", True, max_length=120)

Ultimately I think this would still be better if it were pushed deeper into the django code and worked more like the ManyToManyField or the ForeignKey.
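The exec call in __init__ assembles Python source as a string, which ties the generated class to whatever names happen to be in scope. One alternative worth trying is the three-argument form of type(), which builds the class directly from a name, a tuple of bases, and a dict of attributes. The sketch below shows the idea in plain Python: FakeField is a stand-in for a Django field class and build_value_class is a hypothetical helper. With real Django models the base would be models.Model, and my understanding is that __module__ also has to be set so Django can work out the app label, so treat this as a starting point rather than a tested replacement.

```python
class FakeField(object):
    """Stand-in for a Django field class; it only records its keyword arguments."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


def build_value_class(name, field_class, **field_kwargs):
    """Create a class called `name` with a single `value` attribute."""
    attrs = {
        "__module__": __name__,
        "value": field_class(**field_kwargs),
    }
    # type(name, bases, attrs) creates the class without generating source text.
    return type(name, (object,), attrs)


Friend = build_value_class("Friend", FakeField, max_length=120, db_index=True)
```

Keyword arguments are passed through as real Python objects here, so there is no need to rebuild a parameter string with repr.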

grieve
I have noticed with this class that you have to save before adding, since the ID doesn't exist until you save. I think this could be fixed if it inherited from RelatedField and Field, but I am still trying to wrap my head around that code.
grieve
After more experimenting, this turns out to work but is extremely brittle, especially with regard to namespaces: it has to live in models.py. I will keep working on it and hopefully develop a cleaner version.
grieve
+4  A: 

What you have described sounds really similar to tags.
So why not use django-tagging?
It works like a charm, you can install it independently of your application, and its API is quite easy to use.

Roberto Liffredo
Also, if it doesn't work the way the OP wants, it's easy to look at how it's done there and change it to his needs.
Dave Vogt
Nice! I will have to look into this further.
grieve