I'm new to SQLAlchemy and relational databases, and I'm trying to set up a model for an annotated lexicon. I want to support an arbitrary number of key-value annotations for the words which can be added or removed at runtime. Since there will be a lot of repetition in the names of the keys, I don't want to use this solution directly, although the code is similar.

My design has word objects and property objects. The words and properties are stored in separate tables with a property_values table that links the two. Here's the code:

from sqlalchemy import Column, Integer, String, Table, create_engine
from sqlalchemy import MetaData, ForeignKey
from sqlalchemy.orm import relation, mapper, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

engine = create_engine('sqlite:///test.db', echo=True)
meta = MetaData(bind=engine)

property_values = Table('property_values', meta,
    Column('word_id', Integer, ForeignKey('words.id')),
    Column('property_id', Integer, ForeignKey('properties.id')),
    Column('value', String(20))
)
words = Table('words', meta,
    Column('id', Integer, primary_key=True),
    Column('name', String(20)),
    Column('freq', Integer)
)
properties = Table('properties', meta,
    Column('id', Integer, primary_key=True),
    Column('name', String(20), nullable=False, unique=True)
)
meta.create_all()

class Word(object):
    def __init__(self, name, freq=1):
        self.name = name
        self.freq = freq

class Property(object):
    def __init__(self, name):
        self.name = name
mapper(Property, properties)  

Now I'd like to be able to do the following:

Session = sessionmaker(bind=engine)
s = Session()
word = Word('foo', 42)
word['bar'] = 'yes' # or word.bar = 'yes' ?
s.add(word)
s.commit()

Ideally this should add 1|foo|42 to the words table, add 1|bar to the properties table, and add 1|1|yes to the property_values table. However, I don't have the right mappings and relations in place to make this happen. I get the sense from reading the documentation at http://www.sqlalchemy.org/docs/05/mappers.html#association-pattern that I want to use an association proxy or something of that sort here, but the syntax is unclear to me. I experimented with this:

mapper(Word, words, properties={
    'properties': relation(Property, secondary=property_values)
    })

but this mapper only fills in the foreign key values, and I need to fill in the other value as well. Any assistance would be greatly appreciated.

+1  A: 

There is a very similar question with a slightly different interface, but that is easy to fix by defining __getitem__, __setitem__ and __delitem__ methods.
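As a rough illustration of that idea, here is a minimal sketch in plain Python. The class name AnnotatedWord and the plain dict stored in props are assumptions made for the example; in a real model, props would be the mapped dictionary collection rather than an ordinary dict.

```python
class AnnotatedWord(object):
    """Hypothetical stand-in for a mapped class with a dict-like interface."""

    def __init__(self, name):
        self.name = name
        # Assumption: a plain dict stands in for a mapped dictionary collection.
        self.props = {}

    def __getitem__(self, key):
        # word['bar'] reads from the underlying collection
        return self.props[key]

    def __setitem__(self, key, value):
        # word['bar'] = 'yes' writes to the underlying collection
        self.props[key] = value

    def __delitem__(self, key):
        # del word['bar'] removes the entry from the underlying collection
        del self.props[key]


word = AnnotatedWord('foo')
word['bar'] = 'yes'
word['baz'] = 'certainly'
del word['bar']
```

The three methods only delegate, so the object can be indexed like a dictionary while the storage details stay in props.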

Denis Otkidach
Thanks for the pointer. I had something similar, but I'd like to maintain the key names as foreign keys because there will be a lot of repetition in the keys and I don't want to have all those strings duplicated in the database. I'm looking at combining your solution with van's solution above [http://stackoverflow.com/questions/2310153/inserting-data-in-many-to-many-relationship-in-sqlalchemy/2310548#2310548].
Brent Ramerth
+1  A: 

Simply use the Dictionary-Based Collections mapping, an out-of-the-box solution to your question. Extract from the link:

from sqlalchemy.orm.collections import column_mapped_collection, attribute_mapped_collection, mapped_collection

mapper(Item, items_table, properties={
    # key by column
    'notes': relation(Note, collection_class=column_mapped_collection(notes_table.c.keyword)),
    # or named attribute
    'notes2': relation(Note, collection_class=attribute_mapped_collection('keyword')),
    # or any callable
    'notes3': relation(Note, collection_class=mapped_collection(lambda entity: entity.a + entity.b))
})

# ...
item = Item()
item.notes['color'] = Note('color', 'blue')
print item.notes['color']

Or try the solution for Inserting data in Many to Many relationship in SQLAlchemy. Obviously you have to replace the list logic with the dict one.
Ask the question's author to post his final code with associationproxy, which he mentioned he used in the end.

van
That's the simplest way, but a lot of my entries will have the same keys, so I want to unique those keys and store them in a separate table. I'm looking at the second solution you posted.
Brent Ramerth
Fair enough. But what stops you from using the dictionary-based collection mapping for a property on your object and, in addition, providing a dict-like interface on the object itself, which would basically be a proxy for (delegate to) that property?
van
I think that's basically what I came up with in my solution (posted below).
Brent Ramerth
A: 

I ended up combining Denis and van's posts together to form the solution:

from sqlalchemy import Column, Integer, String, Table, create_engine
from sqlalchemy import MetaData, ForeignKey
from sqlalchemy.orm import relation, mapper, sessionmaker
from sqlalchemy.orm.collections import attribute_mapped_collection
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.ext.declarative import declarative_base

meta = MetaData()
Base = declarative_base(metadata=meta, name='Base')

class PropertyValue(Base):
    __tablename__ = 'property_values'
    WordID = Column(Integer, ForeignKey('words.id'), primary_key=True)
    PropID = Column(Integer, ForeignKey('properties.id'), primary_key=True)
    Value = Column(String(20))

def _property_for_name(prop_name):
    return s.query(Property).filter_by(name=prop_name).first()

def _create_propval(prop_name, prop_val):
    p = _property_for_name(prop_name)
    if not p:
        p = Property(prop_name)
        s.add(p)
        s.commit()
    return PropertyValue(PropID=p.id, Value=prop_val)

class Word(Base):
    __tablename__ = 'words'
    id = Column(Integer, primary_key=True)
    string = Column(String(20), nullable=False)
    freq = Column(Integer)
    _props = relation(PropertyValue, collection_class=attribute_mapped_collection('PropID'), cascade='all, delete-orphan')
    props = association_proxy('_props', 'Value', creator=_create_propval)

    def __init__(self, string, freq=1):
        self.string = string
        self.freq = freq

    def __getitem__(self, prop):
        p = _property_for_name(prop)
        if p:
            return self.props[p.id]
        else:
            return None

    def __setitem__(self, prop, val):
        self.props[prop] = val

    def __delitem__(self, prop):
        p = _property_for_name(prop)
        if p:
            del self.props[prop]

class Property(Base):
    __tablename__ = 'properties'
    id = Column(Integer, primary_key=True)
    name = Column(String(20), nullable=False, unique=True)

    def __init__(self, name):
        self.name = name

engine = create_engine('sqlite:///test.db', echo=False)
Session = sessionmaker(bind=engine)
s = Session()
meta.create_all(engine)

The test code is as follows:

word = Word('foo', 42)
word['bar'] = "yes"
word['baz'] = "certainly"
s.add(word)

word2 = Word('quux', 20)
word2['bar'] = "nope"
word2['groink'] = "nope"
s.add(word2)
word2['groink'] = "uh-uh"
del word2['bar']

s.commit()

word = s.query(Word).filter_by(string="foo").first()
print word.freq, word['baz']
# prints 42 certainly

The contents of the database tables are:

$ sqlite3 test.db "select * from property_values"
1|2|certainly
1|1|yes
2|3|uh-uh
$ sqlite3 test.db "select * from words"
1|foo|42
2|quux|20
$ sqlite3 test.db "select * from properties"
1|bar
2|baz
3|groink
Brent Ramerth
In _create_propval, do you really need to add/commit the new Property object? Will it not be added/committed automatically when you commit the session later? If so, I would not add/commit there, because you might roll back the session instead of committing.
van
Yeah, that's a sketchy part of the code. I need the Property object to obtain an id so that I can reference it when creating the PropertyValue (because PropertyValues use foreign keys to reference the property names). The Property doesn't get its id until it's committed. Any good way to obtain the id without committing?
Brent Ramerth
For the record, the solution is to replace the s.commit() line in _create_propval with s.merge(p).
Brent Ramerth
A: 

Comment for Brent, above:

You can use session.flush() instead of commit() to get an id on your model instances. flush() will execute the necessary SQL, but will not commit, so you can rollback later if needed.
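To illustrate the idea at the database level, here is a minimal sketch that uses Python's standard sqlite3 module instead of SQLAlchemy (an assumption made only to keep the example self-contained; the table mirrors the properties table above). An INSERT executed inside an open transaction already yields the generated id, and the transaction can still be rolled back afterwards. flush() depends on the same behavior.

```python
import sqlite3

# In-memory database with a table shaped like 'properties' above.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE properties ('
    'id INTEGER PRIMARY KEY, '
    'name VARCHAR(20) NOT NULL UNIQUE)'
)
conn.commit()

# The INSERT runs inside a transaction that has not been committed yet.
cur = conn.execute('INSERT INTO properties (name) VALUES (?)', ('bar',))
new_id = cur.lastrowid  # the generated id is available before any commit

# Because nothing was committed, the insert can still be undone.
conn.rollback()
remaining = conn.execute('SELECT COUNT(*) FROM properties').fetchone()[0]
```

In the session-based code, the equivalent step would be calling s.add(p) followed by s.flush() and then reading p.id, leaving the decision to commit or roll back to the caller.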

Jace