I'm sure I'm missing something pretty obvious, but I can't for the life of me stop my pysqlite scripts crashing out with a "database is locked" error. I have two scripts, one to load data into the database and one to read data out, but both will frequently, and instantly, crash depending on what the other is doing with the database at any given time. I've got the timeout on both scripts set to 30 seconds:

cx = sqlite.connect("database.sql", timeout=30.0)

And I think I can see some evidence of the timeouts, in that I occasionally get what appears to be a timing stamp (e.g. 0.12343827e-06 0.1 - and how do I stop that being printed?) dumped in the middle of my Curses-formatted output screen, but never a delay that gets remotely near the 30-second timeout; one or the other script still keeps crashing again and again because of this. I'm running RHEL 5.4 on a 64-bit 4-CPU HS21 IBM blade, and have heard some mention of multi-threading issues, though I'm not sure whether that's relevant here. Packages in use are sqlite-3.3.6-5 and python-sqlite-1.1.7-1.2.1, and upgrading to newer versions outside of Red Hat's official provisions is not a great option for me. Possible, but not desirable given the environment in general.

I previously had autocommit=1 on in both scripts, but have since disabled it on both, and I am now cx.commit()ing in the inserting script and not committing in the select script. Ultimately, as only one script ever actually makes any modifications, I don't really see why this locking should ever happen. I have noticed that it gets significantly worse as the database grows larger. It was recently at 13 MB with 3 equal-sized tables, which was about 1 day's worth of data. Creating a new file has significantly improved this, which seems understandable, but the timeout still just doesn't seem to be obeyed.
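For reference, the write path is roughly this shape (a simplified sketch; the table and values here are made up, not my real schema):

import sqlite

cx = sqlite.connect("database.sql", timeout=30.0)
cu = cx.cursor()

def store_row():
    # illustrative insert; commit straight away so the write lock is released
    cu.execute("INSERT INTO samples VALUES (1, 2, 3)")
    cx.commit()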

Any pointers very much appreciated.

EDIT: since asking I have been able to restructure my code slightly and use a signal to periodically write between 0 and 150 updates in one transaction every 5 seconds. This has significantly reduced the occurrences of the locking, to less than one an hour as opposed to once every minute or so. I guess I could go further by ensuring the times I write data are offset by a few seconds from when I read data in the other script, but fundamentally I'm working around an issue as I perceive it, making a timeout not required, which still doesn't seem right. Ta.
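The restructured writer now looks roughly like this (sketch only - the pending_rows list and the 5-second alarm show the general idea rather than my exact code):

import signal
import sqlite

cx = sqlite.connect("database.sql", timeout=30.0)
cu = cx.cursor()
pending_rows = []   # statements queued up between flushes

def flush(signum, frame):
    # write the whole batch in a single transaction
    for statement in pending_rows:
        cu.execute(statement)
    cx.commit()
    del pending_rows[:]
    signal.alarm(5)   # re-arm for the next 5-second window

signal.signal(signal.SIGALRM, flush)
signal.alarm(5)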

A: 

SQLite uses database-level locking for every write (update/insert/delete/...). IMHO this lock is held until the transaction ends, and AFAIK it is a single lock shared across threads/processes.

So I'd try explicitly ending both the transaction and the connection in the writing script, explicitly committing even in the reading script, and debugging the concurrency issue from there.
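For example, something along these lines in the reading script (just a sketch of what I mean; the table name is made up):

cu.execute("SELECT * FROM readings")
rows = cu.fetchall()
cx.commit()   # explicitly end the read transaction so no lock lingers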

Almad
You mean close the entire database? I'm fine with a lock being held, but neither script is waiting long enough for the lock to be released.
Chris Phillips
Yes, I mean closing the connection to the database from the side that is doing the writes.
Almad
Well, I'm writing to the database every few seconds, so surely closing it won't really change anything? There's additional work in opening and closing, so wouldn't that possibly make it worse? And if I'm committing the changes before I close, then there's no benefit in closing anyway, as the database would no longer be locked.
Chris Phillips
Sure - I was just interpreting the question as "you're sure there are no race conditions". But as I read it now, your application is relatively write-heavy, and concurrent reads and writes are not good for SQLite, so for this use case I'd change the database.
Almad
A: 

SQLite is simply not optimized for write-heavy workloads, nor does it pretend to be (but it doesn't mind writing quite a lot in one transaction). It sounds to me like you might be getting to the point where you need to switch to another database like MySQL, PostgreSQL, Oracle or DB2. Some of those options are expensive indeed, but for some workloads that's what you need. (Also note that write-heavy workloads tend to be better done with a dedicated database server solution too, despite the fact that that pushes up deployment costs and complexity. Some things just cost.)

Donal Fellows
I'd certainly much rather be using a proper database, but it's just not available to me for various reasons.
Chris Phillips
Well, then be prepared for some things to be slow. It just can't be helped.
Donal Fellows
A: 

In early versions of pysqlite, the timeout parameter to sqlite.connect is apparently interpreted as milliseconds. So your timeout=30.0 should be timeout=30000.
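In other words, something like:

cx = sqlite.connect("database.sql", timeout=30000)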

slowdog