views:

142

answers:

2

Does anyone know if Python's shelve module uses memory-mapped IO?

Maybe that question is a bit misleading. I realize that shelve uses an underlying dbm-style module to do its dirty work. What are the chances that the underlying module uses mmap?

I'm prototyping a datastore, and while I realize premature optimization is generally frowned upon, this could really help me understand the trade-offs involved in my design.

+3  A: 

I'm not sure what you're trying to learn by asking this question, since you already seem to know the answer: it depends on the actual dbm store being used. Some of them will use mmap -- I expect everything but dumbdbm to use mmap -- but so what? The overhead in shelve is almost certainly not in the mmap-versus-fileIO choice, but in the pickling operation. You can't mmap the dbm file sensibly yourself in either case, as the dbm module may have its own fancy locking (and it may not be a single file anyway, like when it uses bsddb.)

If you're just looking for inspiration for your own datastore, well, don't look at shelve, since all it does is pickle-and-pass-along to another datastore.

Thomas Wouters
+3  A: 

Existing dbm implementations in the Python standard library all use "normal" I/O, not memory mapping. You'll need to code your own dbmish implementation with memory mapping, and integrate it with shelve (directly, or, more productively, through anydbm).

Alex Martelli
I'm pretty sure BerkeleyDB uses `mmap()`, if only so it can break in unexpected places :) I've also seen dbm implementations that use mmap.
Thomas Wouters