My question does not really have much to do with SQLAlchemy but rather with pure Python.

I'd like to control the instantiation of SQLAlchemy model instances. This is a snippet from my code:

class Tag(db.Model):

    __tablename__ = 'tags'
    query_class = TagQuery
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(), unique=True, nullable=False)

    def __init__(self, name):
        self.name = name

What I want is this: whenever a tag is instantiated, e.g. Tag('django'), a new instance should be created only if there is no tag named django in the database yet. Otherwise, instead of initializing a new object, Tag('django') should return a reference to the row that already exists in the database.

As of now I am ensuring the uniqueness of tags inside the Post Model:

class Post(db.Model):

        # ...
        # code code code
        # ...

        def _set_tags(self, taglist):
            """Associate tags with this entry. The taglist is expected to be already
            normalized without duplicates."""
            # Remove all previous tags
            self._tags = []
            for tag_name in taglist:
                exists = Tag.query.filter(Tag.name==tag_name).first()
                # Only add tags to the database that don't exist yet
                # TODO: Put this in the init method of Tag (if possible)
                if not exists:
                    self._tags.append(Tag(tag_name))
                else:
                    self._tags.append(exists)

It does its job, but I'd still like to know how to ensure the uniqueness of tags inside the Tag class itself, so that I could write the _set_tags method like this:

def _set_tags(self, taglist):
    # Remove all previous tags
    self._tags = []
    for tag_name in taglist:
        self._tags.append(Tag(tag_name))


While writing this question and testing I learned that I need to use the __new__ method. This is what I've come up with (it even passes the unit tests and I didn't forget to change the _set_tags method):

class Tag(db.Model):

    __tablename__ = 'tags'
    query_class = TagQuery
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(), unique=True, nullable=False)

    def __new__(cls, *args, **kwargs):
        """Only add tags to the database that don't exist yet. If tag already
        exists return a reference to the tag otherwise a new instance"""
        exists = Tag.query.filter(Tag.name==args[0]).first() if args else None
        if exists:
            return exists
        else:
            return super(Tag, cls).__new__(cls, *args, **kwargs)

What bothers me are two things:

First: I get a warning:

DeprecationWarning: object.__new__() takes no parameters

Second: When I write it like this I get errors (I also tried renaming the parameter name to n, but that did not change anything):

def __new__(cls, name):
    """Only add tags to the database that don't exist yet. If tag already
    exists return a reference to the tag otherwise a new instance"""
    exists = Tag.query.filter(Tag.name==name).first()
    if exists:
        return exists
    else:
        return super(Tag, cls).__new__(cls, name)

Errors (or similar):

TypeError: __new__() takes exactly 2 arguments (1 given)

I hope you can help me!

+2  A: 

Don't embed this within the class itself.

Option 1. Create a factory that has the pre-existing pool of objects.

tag_pool = {}
def makeTag( name ):
    if name not in tag_pool:
        tag_pool[name]= Tag(name)
    return tag_pool[name]

Life's much simpler.

tag= makeTag( 'django' )

This will create the item if necessary.

Option 2. Define a "get_or_create" version of the makeTag function. This will query the database. If the item is found, return the object. If no item is found, create it, insert it and return it.
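
A minimal sketch of what Option 2 could look like, written for Python 3. It uses the standard library's sqlite3 with an in-memory database as a stand-in for the real SQLAlchemy session, so the table layout and the function name get_or_create_tag are assumptions made for illustration, not part of the original code.

```python
import sqlite3

# In-memory SQLite database standing in for the real tags table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)"
)


def get_or_create_tag(conn, name):
    """Return the (id, name) row for the tag, inserting it only if missing."""
    row = conn.execute(
        "SELECT id, name FROM tags WHERE name = ?", (name,)
    ).fetchone()
    if row is not None:
        return row  # the tag already exists: reuse it
    cur = conn.execute("INSERT INTO tags (name) VALUES (?)", (name,))
    conn.commit()
    return (cur.lastrowid, name)


first = get_or_create_tag(conn, "django")
second = get_or_create_tag(conn, "django")  # no second row is inserted
tag_count = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
```

The unique constraint on name stays in place as a safety net, because two concurrent callers could both miss the row in the SELECT and then both try to insert.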

S.Lott
+2  A: 

I use a class method for that.

class Tag(Declarative):
    ...
    @classmethod
    def get(cls, tag_name):
        tag = cls.query.filter(cls.name == tag_name).first()
        if not tag:
            tag = cls(tag_name)
        return tag

And then

def _set_tags(self, taglist):
    self._tags = []
    for tag_name in taglist:
        self._tags.append(Tag.get(tag_name))

As for __new__, you should not confuse it with __init__. object.__new__ expects to be called without extra arguments, so even if your own constructor takes some, you should not pass them on to super/object unless you know that your superclass needs them. A typical implementation would be:

def __new__(cls, name=None):
    tag = cls.query.filter(cls.name == name).first()
    if not tag:
        tag = object.__new__(cls)
    return tag

However, this will not work as expected in your case, because Python calls __init__ automatically whenever __new__ returns an instance of cls, so an already existing tag would be initialized again. You would need to use a metaclass or add some checks in __init__.
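
The re-initialization problem and the metaclass workaround can be sketched in plain Python 3 without any database. ReusingTag, PooledMeta and PooledTag are hypothetical names, and a dict stands in for the query. A real declarative model already has its own metaclass, so a metaclass like this would have to derive from it; that part is not shown here.

```python
class ReusingTag(object):
    """Hypothetical stand-in for Tag: __new__ hands back pooled instances."""
    _pool = {}

    def __new__(cls, name):
        if name not in cls._pool:
            # object.__new__ receives only cls, never the constructor args.
            cls._pool[name] = object.__new__(cls)
        return cls._pool[name]

    def __init__(self, name):
        # Runs on every ReusingTag(name) call, even for a pooled instance.
        self.name = name
        self.init_calls = getattr(self, "init_calls", 0) + 1


class PooledMeta(type):
    """Metaclass whose __call__ skips __new__/__init__ for known names."""

    def __init__(cls, clsname, bases, namespace):
        super().__init__(clsname, bases, namespace)
        cls._pool = {}  # one pool per class

    def __call__(cls, name):
        if name not in cls._pool:
            # Normal construction (__new__ then __init__) happens only once.
            cls._pool[name] = super().__call__(name)
        return cls._pool[name]


class PooledTag(metaclass=PooledMeta):
    def __init__(self, name):
        self.name = name
        self.init_calls = getattr(self, "init_calls", 0) + 1


a, b = ReusingTag("django"), ReusingTag("django")
c, d = PooledTag("django"), PooledTag("django")
```

Here a and b are the same object but its __init__ has run twice, while c and d are the same object and its __init__ has run only once.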

Daniel Kluev
All of you gave helpful answers that went in the same direction (factory) but yours is the most concrete. Plus you explained the idiosyncrasies of `__new__` so I've checked your answer as the right one.
Eugen
+1  A: 

Given the OP's latest error msg:

TypeError: __new__() takes exactly 2 arguments (1 given)

it seems that somewhere the class is getting instantiated without the name parameter, i.e. just Tag(). The traceback for that exception should tell you where that "somewhere" is (but we're not shown it, so that's as far as we can go;-).
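
The failure mode can be reproduced in isolation. The sketch below is Python 3, where the message wording differs from the Python 2 error quoted above, and StrictTag is a hypothetical class. The second call shows one plausible trigger, which is an assumption since the traceback is not available: code that calls cls.__new__(cls) directly to create a bare instance, as object-loading machinery sometimes does.

```python
class StrictTag(object):
    """Hypothetical class whose __new__ insists on a name argument."""

    def __new__(cls, name):
        return object.__new__(cls)


errors = []
attempts = (
    lambda: StrictTag(),                   # plain call without a name
    lambda: StrictTag.__new__(StrictTag),  # direct __new__ call, cls only
)
for attempt in attempts:
    try:
        attempt()
    except TypeError as exc:
        errors.append(str(exc))  # both attempts complain about 'name'
```

Giving name a default value, as in the `*args` version from the question, makes both calls succeed.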

That being said, I agree with other answers that a factory function (possibly nicely dressed up as a classmethod -- making factories is one of the best uses of classmethod, after all;-) is the way to go, avoiding the complication that __new__ entails (such as forcing __init__ to find out whether the object's already initialized to avoid re-initializing it!-).
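
To make that complication concrete, here is a rough Python 3 sketch of what the `__new__` route forces on `__init__`. GuardedTag, its dict pool, and the _initialized and aliases attributes are all hypothetical and only illustrate the guard.

```python
class GuardedTag(object):
    """Hypothetical class: __new__ reuses instances, __init__ must guard."""
    _pool = {}

    def __new__(cls, name):
        try:
            return cls._pool[name]
        except KeyError:
            inst = cls._pool[name] = object.__new__(cls)
            return inst

    def __init__(self, name):
        if getattr(self, "_initialized", False):
            return  # set up by an earlier call; leave existing state alone
        self._initialized = True
        self.name = name
        self.aliases = []


first = GuardedTag("django")
first.aliases.append("dj")
second = GuardedTag("django")  # same object, and aliases is not reset
```

Without the guard, the second call would reset aliases to an empty list, which is the kind of silent re-initialization a factory classmethod avoids.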

Alex Martelli