Hi everybody, I have a question regarding SQLAlchemy. I have a database that contains Items, and every Item has multiple Records assigned to it (1:n). Each Record is partially stored in the database, but it also has an associated file (1:1) on the filesystem.

What I want to do is delete the associated file when the Record is removed from the database. So I wrote the following MapperExtension:

from sqlalchemy.orm import MapperExtension

class _StoredRecordEraser(MapperExtension):
    def before_delete(self, mapper, connection, instance):
        # Remove the record's file from the filesystem before the row is deleted.
        instance.erase()
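
The extension is attached to the Record mapper roughly like this (a sketch; the table names and the cascade setting are assumptions, the real setup is in test.py):

from sqlalchemy.orm import mapper, relation

# Hypothetical wiring: the extension goes on the Record mapper so that
# before_delete() fires when a Record row is deleted, and delete-orphan
# cascade makes removing a Record from Item.records delete its row.
mapper(Record, records_table, extension=_StoredRecordEraser())
mapper(Item, items_table, properties={
    'records': relation(Record, cascade='all, delete-orphan'),
})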

The following code creates an experimental setup (full code is here: test.py):

session = Session()

i1 = Item(id='item1')
r11 = Record(id='record11', attr='1')
i1.records.append(r11)
r12 = Record(id='record12', attr='2')
i1.records.append(r12)
session.add(i1)
session.commit()

And finally, my problem... The following code works okay, and the old.erase() method is called:

session = Session()
i1 = session.query(Item).get('item1')
old = i1.records[0]
new = Record(id='record13', attr='3')
i1.records.remove(old)
i1.records.append(new)
session.commit()

But when I change the id of the new Record to record11, which already exists in the database but is not the same record (attr='3'), old.erase() is not called. Does anybody know why?
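
In other words, the only change from the working code above is the id (a sketch of what I mean):

old = i1.records[0]                    # 'record11'
new = Record(id='record11', attr='3')  # same primary key as the removed record
i1.records.remove(old)
i1.records.append(new)
session.commit()                       # before_delete() / old.erase() never fires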

Thanks

Answer (+3):

A delete + insert of two records that ultimately have the same primary key within a single flush is converted into a single UPDATE right now. This is not the best behavior - it really should delete and then insert, so that the various events attached to those operations are triggered as expected (not just mapper extension methods, but database-level defaults too). But the flush() process is hardwired to perform inserts/updates first, then deletes. As a workaround, you can issue a flush() after the remove/delete operation, and then a second flush() for the add/insert.
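
Applied to the question's code, the workaround looks roughly like this (same objects as above):

i1.records.remove(old)
session.flush()    # first flush emits the DELETE; before_delete() fires, old.erase() runs

new = Record(id='record11', attr='3')
i1.records.append(new)
session.commit()   # second flush emits the INSERT for the reused primary key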

As for the flush's current behavior, I've looked into trying to break this out, but it gets very complicated - inserts which depend on deletes would have to execute after the deletes, but updates which depend on inserts would have to execute beforehand. Ultimately, the unitofwork module would have to be rewritten (big time) to consider all inserts/updates/deletes as a single stream of dependent actions, topologically sorted against one another. This would simplify the methods used to execute statements in the correct order, although new systems for synchronizing data between rows based on server-level defaults would have to be devised, and it's possible that complexity would be re-introduced if it turned out the "simpler" method spent too much time naively sorting insert statements that are known at the ORM level not to require any sorting against each other. The topological sort currently works at a more coarse-grained level than that.

zzzeek