I am currently playing around with SQLAlchemy a bit, which is really quite neat.
For testing I created a huge table containing my pictures archive, indexed by SHA1 hashes (to remove duplicates :-)). Which was impressingly fast...
For fun I did the equivalent of a select *
over the resulting SQLite database:
session = Session()
for p in session.query(Picture):
print(p)
I expected to see hashes scrolling by, but instead it just kept scanning the disk. At the same time, memory usage was skyrocketing, reaching 1GB after a few seconds. This seems to come from the identity map feature of SQLAlchemy, which I thought was only keeping weak references.
Can somebody explain this to me? I thought that each Picture p would be collected after the hash is written out!?