I have a 100 MB SQLite database file that I would like to load into memory before running SQL queries. Is it possible to do that in Python?

Thanks

A: 
  1. Get an in-memory database running (standard stuff)
  2. Attach the disk database (file).
  3. Recreate tables / indexes and copy over contents.
  4. Detach the disk database (file)

Here's an example in Tcl (taken from here), which may be useful for getting the general idea:

proc loadDB {dbhandle filename} {

    if {$filename != ""} {
        #attach persistent DB to target DB
        $dbhandle eval "ATTACH DATABASE '$filename' AS loadfrom"
        #copy each table to the target DB
        foreach {tablename} [$dbhandle eval "SELECT name FROM loadfrom.sqlite_master WHERE type = 'table'"] {
            $dbhandle eval "CREATE TABLE '$tablename' AS SELECT * FROM loadfrom.'$tablename'"
        }
        #create indexes in loaded table
        foreach {sql_exp} [$dbhandle eval "SELECT sql FROM loadfrom.sqlite_master WHERE type = 'index'"] {
            $dbhandle eval $sql_exp
        }
        #detach the source DB
        $dbhandle eval {DETACH loadfrom}
    }
}
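
Since the question asks about Python, here is a rough translation of the Tcl procedure above using the standard library sqlite3 module. It is only a sketch: the function name load_db and its parameters are made up for illustration, and it assumes mem_conn has no open transaction when it is called. Like the Tcl version, CREATE TABLE ... AS SELECT copies the data but not constraints such as primary keys or defaults.

```python
import sqlite3


def load_db(mem_conn, filename):
    """Copy every user table and index from the file database into mem_conn.

    load_db, mem_conn and filename are illustrative names, not a library API.
    """
    # Attach the on-disk database under the schema name "loadfrom".
    mem_conn.execute("ATTACH DATABASE ? AS loadfrom", (filename,))
    try:
        # Copy each table; SQLite's internal sqlite_* tables are skipped.
        tables = mem_conn.execute(
            "SELECT name FROM loadfrom.sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        ).fetchall()
        for (name,) in tables:
            # Quote the identifier so unusual table names still work.
            quoted = '"' + name.replace('"', '""') + '"'
            mem_conn.execute(
                f"CREATE TABLE {quoted} AS SELECT * FROM loadfrom.{quoted}"
            )
        # Recreate explicit indexes; automatic indexes have no SQL text.
        indexes = mem_conn.execute(
            "SELECT sql FROM loadfrom.sqlite_master "
            "WHERE type = 'index' AND sql IS NOT NULL"
        ).fetchall()
        for (sql,) in indexes:
            mem_conn.execute(sql)
        # Make sure no transaction is open before detaching.
        mem_conn.commit()
    finally:
        # Detach the source database.
        mem_conn.execute("DETACH DATABASE loadfrom")
```

Usage would be along the lines of mem = sqlite3.connect(":memory:") followed by load_db(mem, "file.db"), where file.db stands in for your own database file.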
ChristopheD
+5  A: 

apsw is an alternative wrapper for SQLite that lets you back up an on-disk database to memory before doing any operations.

From the docs:

###
### Backup to memory
###

# We will copy the disk database into a memory database

memcon=apsw.Connection(":memory:")

# Copy into memory
with memcon.backup("main", connection, "main") as backup:
    backup.step() # copy whole database in one go

# There will be no disk accesses for this query
for row in memcon.cursor().execute("select * from s"):
    pass

connection above is your on-disk db.
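
If you would rather stay with the standard library, the sqlite3 module in Python 3.7 and later exposes the same SQLite backup mechanism as Connection.backup. A minimal sketch, assuming Python 3.7+; the helper name load_into_memory and the path argument are hypothetical:

```python
import sqlite3


def load_into_memory(path):
    """Return an in-memory connection holding a copy of the database at path.

    load_into_memory is an illustrative helper, not part of sqlite3 itself.
    """
    disk_conn = sqlite3.connect(path)
    mem_conn = sqlite3.connect(":memory:")
    try:
        # With the default pages=-1 the whole database is copied in one step.
        disk_conn.backup(mem_conn)
    finally:
        # The disk connection is no longer needed once the copy is done.
        disk_conn.close()
    return mem_conn
```

Because the result is an ordinary sqlite3 connection, row_factory keeps working, for example mem_conn.row_factory = sqlite3.Row.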

Ryan Ginstrom
I like your solution, but there is one problem: I make heavy use of pysqlite's row_factory feature, and apsw does not seem to have it.
relima
This has really solved my problem. My queries are MUCH faster now.
relima
import apsw
mem_db_loader=apsw.Connection(file_sqlite_db)
connection=apsw.Connection(":memory:")
connection.backup("main", mem_db_loader, "main").step()
cursor = connection.cursor()
relima
A: 

Note that you may not need to explicitly load the database into SQLite's memory at all. Simply prime your operating system's disk cache by copying the database file to the null device.

Windows: copy file.db nul:
Unix/Mac:  cp file.db /dev/null

This has the advantage of the operating system taking care of memory management, especially discarding it if something more important comes along.
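
The same priming can be done from Python by reading the file once and throwing the data away. This is only a sketch: the function name prime_file_cache and the 1 MiB chunk size are arbitrary choices, and whether the pages actually stay cached is up to the operating system.

```python
def prime_file_cache(path, chunk_size=1024 * 1024):
    """Read the file once, discard the data, and return the bytes read.

    prime_file_cache is an illustrative helper; the read only gives the OS
    a chance to cache the file, it does not guarantee that it will.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # An empty read means end of file.
                break
            total += len(chunk)
    return total
```

Call it once, for example prime_file_cache("file.db"), before opening the database as usual.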

Roger Binns
It may be just my computer, but this technique didn't really improve my performance. (Win 7 x64, 8 GB RAM.)
relima
It has worked for many other people on the SQLite mailing list in the past, especially right after a machine has booted, because it primes the file system cache. In your case, the file most likely didn't end up in the file system cache. (Some copy tools tell the OS to bypass the cache so that they don't evict existing "good" content from it.)
Roger Binns