I am using ZODB to persist some data that otherwise lives only in memory. If the service holding the data in memory ever crashes, restarting will load the data from ZODB rather than querying hundreds of thousands of rows in a MySQL database.

It seems that every time I save, say, 500K of data to my database file, my .fs file grows by 500K rather than staying at 500K. As an example:

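import transaction
from BTrees.OOBTree import OOBTree
from ZODB import DB, FileStorage

storage     = FileStorage.FileStorage(MY_PATH)
db          = DB(storage)
connection  = db.open()
root        = connection.root()

# Create the entry on first run ("in" works where has_key() is unavailable)
if 'data_db' not in root:
    root['data_db'] = OOBTree()
# Placeholder for the real dictionary holding about 500K of data
mydictionary = {'key': 'some dictionary with 500K of data'}
root['data_db'] = mydictionary
root._p_changed = 1
transaction.commit()
transaction.abort()
connection.close()
db.close()
storage.close()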

I want to continuously overwrite the data in root['data_db'] with the current value of mydictionary. When I print len(root['data_db']) it always prints the right number of items from mydictionary, but every time this code runs (with exactly the same data) the file size increases by the size of the data, in this case 500K.

Am I doing something wrong here?

+1  A: 

When data in ZODB changes, the new version is appended to the end of the file and the old data is left in place. To reduce the file size, you need to manually "pack" the database.

Google came up with this mailing list post.

Matthew Marshall
Is there another storage system (possibly native to Python) that you might recommend, since all I want to do is overwrite the stored data each time? pickle would work for me, but the transactions seem slow when I have a huge set of data (1M+ entries in the dictionary).
sberry2A
Like Mark said, I would consider sqlite.
Matthew Marshall
+1  A: 

Since you asked about another storage system in a comment, you might want to look into SQLite.

Even though SQLite behaves the same in appending data first, it offers the vacuum command to recover unused storage space. From the Python API, you can either use the auto_vacuum pragma to do it automatically, or you can just execute the vacuum command.
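
A minimal sketch of running the vacuum command from Python's built-in sqlite3 module follows. The table name data_db, the row contents, and the helper name sizes_around_vacuum are invented for the example, and it assumes the SQLite build uses the usual default of auto_vacuum being off.

```python
import os
import sqlite3


def sizes_around_vacuum(path):
    """Fill a table, empty it, then VACUUM.

    Returns the file size in bytes (full, emptied, vacuumed).
    Hypothetical helper, for illustration only.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE data_db (key INTEGER PRIMARY KEY, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO data_db (key, value) VALUES (?, ?)",
        ((i, "x" * 100) for i in range(5000)),
    )
    conn.commit()
    full = os.path.getsize(path)

    # Deleting rows frees pages inside the file but does not shrink it.
    conn.execute("DELETE FROM data_db")
    conn.commit()
    emptied = os.path.getsize(path)

    # VACUUM rebuilds the file without the free pages. It must run
    # outside a transaction, hence the commit above.
    conn.execute("VACUUM")
    vacuumed = os.path.getsize(path)

    conn.close()
    return full, emptied, vacuumed
```

For the automatic variant, PRAGMA auto_vacuum has to be set before any tables are created in a new database (or be followed by a VACUUM on an existing one) to take effect.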

Mark Rushakoff
Um. SQLite does not always increase the size of the database file. Emptied pages are reused. It's just that the file won't shrink unless you run the `vacuum` command.
ΤΖΩΤΖΙΟΥ