
I have a reasonably large data set and would like to store it in a file rather than an RDBMS.

The main table in the data set is just over 1M rows, 30 columns, and about 600 MB in size as a CSV.

I'm considering SQLite. Is SQLite worth investigating for data sets of this size?

+8  A: 

SQLite will handle that file just fine; make sure to import the records in a single transaction, and hold off creating indexes until everything is imported, so it doesn't spend a lot of time committing and updating indexes row by row.
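
A minimal sketch of such an import using Python's sqlite3 module (the file name, table, and column layout here are invented for illustration):

    import csv
    import sqlite3

    conn = sqlite3.connect("data.db")

    # One transaction around the whole import; committing row by row
    # is what makes naive bulk loads slow.
    with conn:  # commits on success, rolls back on error
        conn.execute("CREATE TABLE IF NOT EXISTS main_table (a TEXT, b TEXT, c REAL)")
        with open("data.csv", newline="") as f:
            reader = csv.reader(f)
            next(reader)  # skip the header row
            conn.executemany("INSERT INTO main_table VALUES (?, ?, ?)", reader)
        # Create the index only after the rows are in.
        conn.execute("CREATE INDEX IF NOT EXISTS idx_a ON main_table (a)")

    conn.close()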

Paul Betts
Thanks for the extra tip, Paul.
Mark Nold
+5  A: 

I investigated SQLite recently for a similar application. The SQLite documentation states that SQLite databases can be terabytes in size, and that the primary limitation of SQLite is concurrency (many users at the same time). Although we didn't go this direction (we have our own binary storage format), I was pretty confident that SQLite could handle files of this size.

Robert Harvey
+2  A: 

SQLite should work just fine for you. I have run a data set of that size even on an embedded device, and SQLite performance was quite reasonable.

As stated, the main bottleneck is concurrency. Aim to design your system so that there is at most one database handle open per database file.
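
For example, a sketch of that single-handle pattern in Python (the table and column names are invented):

    import sqlite3

    # One shared handle for the whole process; every query goes through
    # it instead of opening a new connection per call. In a threaded
    # app you would serialize access to it yourself.
    _conn = sqlite3.connect("data.db", check_same_thread=False)

    def lookup(key):
        cur = _conn.execute("SELECT c FROM main_table WHERE a = ?", (key,))
        return cur.fetchone()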

laalto
+2  A: 

SQLite is fast when you use transactions and don't commit too often. Using prepared statements with parameters speeds things up too, since SQLite doesn't have to reparse each SQL statement when you use parameterized queries. An example: http://stackoverflow.com/questions/904796/how-do-i-get-around-the-problem-in-sqlite-and-c/926251#926251
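
For instance, in Python's sqlite3 module a parameterized bulk insert looks like this (a sketch; the table is hypothetical):

    import sqlite3

    conn = sqlite3.connect("data.db")
    rows = [("alice", 30), ("bob", 25), ("carol", 41)]

    with conn:  # one transaction, one commit
        conn.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, age INTEGER)")
        # The statement is prepared once and reused for every row;
        # only the ? parameters change, so the SQL is never reparsed.
        conn.executemany("INSERT INTO people (name, age) VALUES (?, ?)", rows)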

I store 2 gigabytes in an SQLite db and it works very well. A big advantage of SQLite over a flat file is the ability to index your data.
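
Continuing the sketch above, an index turns lookups that would be full scans over a flat file into direct B-tree lookups:

    with conn:
        conn.execute("CREATE INDEX IF NOT EXISTS idx_name ON people (name)")

    # Uses the index: a B-tree lookup rather than a scan of every row.
    cur = conn.execute("SELECT age FROM people WHERE name = ?", ("alice",))
    print(cur.fetchone())  # -> (30,)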

Theo