views:

46

answers:

3

I use connect() and cursor() for using SQLite

self.connector = sqlite3.connect(self.dbFile)
self.cursor = self.connector.cursor()

And close() for stop using it.

self.cursor.close()

How expensive (in terms of processing time) are they? Is it so expensive that it's necessary to use it only absolutely necessary? Or, is it just OK to use it multiple times in a function?

ADDED

I tested with the following simple code. proc1() uses the code that opens and closes all the time when it runs the query, and proc2() runs only once.

from sqlite import *
import timeit
import math

def proc1():
    db = SQLiteDB("./example.db", False)
    db.getOpenRunClose("SELECT * from Benchmark")
    db.getOpenRunClose("SELECT * from Benchmark")
    db.getOpenRunClose("SELECT * from Benchmark")
    db.getOpenRunClose("SELECT * from Benchmark")
    db.getOpenRunClose("SELECT * from Benchmark")
    db.getOpenRunClose("SELECT * from Benchmark")

def proc2():
    db = SQLiteDB("./example.db")
    res = db.runSQLToGetResult("SELECT * from Benchmark")
    res = db.runSQLToGetResult("SELECT * from Benchmark")
    res = db.runSQLToGetResult("SELECT * from Benchmark")
    res = db.runSQLToGetResult("SELECT * from Benchmark")
    res = db.runSQLToGetResult("SELECT * from Benchmark")
    res = db.runSQLToGetResult("SELECT * from Benchmark")
    db.close()

if __name__ == '__main__':
    t = timeit.Timer(proc1)
    count = 5000
    print t.timeit(count) / count

    t = timeit.Timer(proc2)
    count = 5000
    print t.timeit(count) / count

The result is as follows.

0.00157478599548
0.000539195966721
+3  A: 

Connections are fairly expensive – they correspond to opening the file – but cursors aren't very so use as many as you need[1]. What does cost is transaction starts and especially commits when there's an insert or update (or if you create a table or index, of course) even if you're in auto-commit mode. That's because the database engine has to sync the data to disk before it finishes the commit (required for a durability guarantee) and that's just plain expensive on modern hardware. (Transaction starts cost because they require doing some locking of the DB file, which can have an impact.)

Compilation of statements can also cost a bit; reuse compiled statements if possible. Of course, you should be doing that anyway. Why? It's because you should never put user data in generated SQL; not only does that lead to trouble with SQL injection vulnerabilities, but it also forces the DB engine to recompile the statement every time you run it. Compiled statements are both safer and (probably) faster too.


[1] Of course, it's silly to use more cursors than you need. That's just plain wasting time and effort.

Donal Fellows
+1  A: 

Connect is a relativly expensive operation. Although sqlite connect is incredibly fast and lightwieght compared with most DBs. Its actually just an "fopen" and a few reads from the sqlite master table.

Most queries will use a cursor internally whenther you ask for it or not, its just a handle to refernece a large rowset so an explicit cursor will cost you very little. For handling large result sets cursors are actually more efficient, you can access the first result rows as soon as they are available, you only have a limited number of rows in memory at a time and you can bail out of a long query early if you decide the answer is not what you wanted.

I would also second the recomendation for "prepared" statements. Its better security, and, if you cache the queries intelligently it saves sqlite from constantly parsing the same piece of sql.

James Anderson
+1  A: 

Connect is more than just opening a file. As soon as you run any queries against the database it has to parse out all the SQL in the sqlite_master table. Therefore time to connect is heavily dependent on the complexity of the database. Simple databases can be connected to in a few milliseconds but larger ones will take more. Ours clocks in at around 45ms (over a hundred tables, several hundred triggers).

Sam