views: 48

answers: 2
+1  Q: 

Python cached list

Hi,

I have a module which supports the creation of geographic objects using a company-standard interface. After these objects are created, the update_db() method is called and all of the objects are inserted into a database.

It is important to have all objects inserted in one session, in order to keep counters and statistics before updating a production database.

The problem is that sometimes there are just too many objects, and the memory gets full.

Is there a way to create a cached list in Python, in order to handle lists that do not fit into memory?

My general thought was:

class CachedList(object):
    def __init__(self, max_memory_size, directory): pass
    def get_item(self, index): pass
    def set_item(self, index, value): pass
    def del_item(self, index): pass
    def append(self, item): pass

An ordinary list would be created upon initialization. When the list's size exceeds max_memory_size, the list elements are pickled and stored in a file in directory. get_item(), set_item() and del_item() would handle the data still held in memory, or 'swap' it in from disk when it is accessed.
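
Roughly, the swap step I have in mind would look something like this (the names and the chunk format are just placeholders):

import os
import pickle

def swap_out(chunk, directory, chunk_id):
    # Pickle one chunk of the list to a file in `directory`,
    # so the in-memory copy can be dropped afterwards.
    path = os.path.join(directory, 'chunk_%d.pkl' % chunk_id)
    with open(path, 'wb') as f:
        pickle.dump(chunk, f)
    return path

def swap_in(path):
    # Load a previously pickled chunk back into memory.
    with open(path, 'rb') as f:
        return pickle.load(f)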

  1. Is this a good design? Are there any standard alternatives?
  2. How can I force garbage collection after pickling parts of the list?

Thanks,

Adam

+3  A: 

Use shelve. Your keys are the indices to your list.
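
A rough sketch of how that could look (the class name and file name are just examples; the stored objects need to be picklable):

import shelve

class DiskBackedList(object):
    # List-like wrapper: shelve keeps the items on disk,
    # keyed by the stringified list index.
    def __init__(self, filename):
        self._db = shelve.open(filename)   # shelve keys must be strings

    def __len__(self):
        return len(self._db)

    def append(self, item):
        self._db[str(len(self._db))] = item

    def __getitem__(self, index):
        return self._db[str(index)]

    def __setitem__(self, index, item):
        self._db[str(index)] = item

    def close(self):
        self._db.close()

# Usage:
# objects = DiskBackedList('geo_objects.db')
# objects.append(some_geo_object)
# first = objects[0]
# objects.close()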

S.Lott
Precisely, exactly, definitely what I need. Would've given more upvotes if I could.
Adam Matan
+2  A: 

I think your first question is answered. As for the second, forcing garbage collection: call gc.collect(). See http://docs.python.org/library/gc.html.
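
For example (the path and the data are placeholders):

import gc
import pickle

chunk = [{'id': i} for i in range(100000)]   # stands in for part of the big list

with open('/tmp/chunk_0.pkl', 'wb') as f:    # example path only
    pickle.dump(chunk, f)

del chunk      # drop the last reference to the in-memory data
gc.collect()   # run a full collection now; CPython frees most objects as soon as
               # the last reference goes, so this mainly helps with reference cycles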

altie