views:

121

answers:

1

This post refers to this page for merging SQLite databases.

The sequence is as follows. Let's say I want to merge a.db and b.db. In command line I do the following.

  • sqlite3 a.db
  • attach 'b.db' as toM;
  • begin; <--
  • insert into benchmark select * from toM.benchmark;
  • commit; <--
  • detach database toM;

It works well, but in the referred site, the questioner asks about speeding up, and the answer is to use the 'begin' and 'commit' command.

Then, I came up with the following python code to do the exactly same thing. I abstract the SQLite function calls with SQLiteDB, and one of it's method is runCommand(). I got the same error even though I delete the self.connector.commit().

# run command
def runCommand(self, command):
    self.cursor.execute(command)
    self.connector.commit() # same error even though I delete this line

db = SQLiteDB('a.db')
cmd = "attach \"%s\" as toMerge" % "b.db"
print cmd
db.runCommand(cmd)
cmd = "begin"
db.runCommand(cmd)
cmd = "insert into benchmark select * from toMerge.benchmark"
db.runCommand(cmd)
cmd = "commit"
db.runCommand(cmd)
cmd = "detach database toMerge"
db.runCommand(cmd)

But, I got the following error.

OperationalError: cannot commit - no transaction is active

Even though the error, the result db is well merged. And without the begin/commit, there's no error at all.

  • Why can't I run the begin/commit command?
  • Is it absolutely necessary to run begin/commit to safely merge the db files? The post says that the purpose of begin/commit is for speedup. Then, what's the difference between using and not using the begin/commit command in terms of speedup?
+4  A: 

Apparently, Cursor.execute doesn't support the 'commit' command. It does support the 'begin' command but this is redundant because sqlite3 begins them for you anway:

>>> import sqlite3
>>> conn = sqlite3.connect(':memory:')
>>> cur = conn.cursor()
>>> cur.execute('begin')
<sqlite3.Cursor object at 0x0104B020>
>>> cur.execute('CREATE TABLE test (id INTEGER)')
<sqlite3.Cursor object at 0x0104B020>
>>> cur.execute('INSERT INTO test VALUES (1)')
<sqlite3.Cursor object at 0x0104B020>
>>> cur.execute('commit')

Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    cur.execute('commit')
OperationalError: cannot commit - no transaction is active
>>> 

just use the commit method on your Connection object.

As for your second question, it is not absolutely necessary to call begin/commit when merging the files: just be sure that there is absolutely no disk error, modifications to the db's or people looking at the computer the wrong way while it is happening. So begin/commit is probably a good idea. Of course, if the original db's aren't being modified (I honestly haven't looked) then there is no need for that even. If there is an error, you can just scrap the partial output and start over.

It also provides a speedup because every change doesn't have to be written to disk as it occurs. They can be stored in memory and written in bulk. But as mentioned sqlite3 handles this for you.

Also, it's worth mentioning that

cmd = "attach \"%s\" as toMerge" % "b.db"

is wrong in the sense that it's depracated. If you want to do the wrong thing correctly, it's

cmd = 'attach "{0}" as toMerge'.format("b.db") #why not just one string though?

This is forward compatible with newer versions of python which will make porting code easier.

if you want to do the right thing, it's

cmd = "attach ? as toMerge"
cursor.execute(cmd, ('b.db', ))

This avoids sql injection and is, apparently, slightly faster so it's win-win.

You could modify your runCommand method as follows:

def runCommand(self, sql, params=(), commit=True):
    self.cursor.execute(sql, params)
    if commit:
        self.connector.commit()

now you can not commit after every single command by passing commit=False when you don't need a commit. This preserves the notion of transaction.

aaronasterling
@aaronasterling : Thanks for the answer. I clarified the original question a little bit more, and I got the same error even when I delete the commit part. BTW, it's good to know that I can live without the begin/commit pair.
prosseek
@prosseek it seems that the `execute` method of cursors can't execute the 'commit' statement. This is no big deal because you can just use the `commit` method on the connection object.
aaronasterling