I'm looking at solutions to store a massive quantity of information while using as little disk space as possible.

The information structure is very simple and the queries will also be very simple. I've looked at solutions like Apache Cassandra and relational databases, but couldn't find a comparison that mentions disk usage.

Any ideas on this would be great.

+1  A: 

The newest version of Microsoft's SQL Server (2008) supports several levels of compression: row compression and page compression, in addition to backup compression. It might be worth investigating.
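As a rough sketch of what enabling it looks like (the table name here is hypothetical):

```sql
-- Estimate the savings before committing (SQL Server 2008+).
-- 'dbo' / 'Measurements' are hypothetical schema and table names.
EXEC sp_estimate_data_compression_savings
    'dbo', 'Measurements', NULL, NULL, 'PAGE';

-- Enable page-level compression on the existing table.
ALTER TABLE dbo.Measurements
REBUILD WITH (DATA_COMPRESSION = PAGE);
```

Page compression subsumes row compression, so it usually saves the most space, at the cost of some extra CPU on reads and writes.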


BradC
Does compressed data stay in read/write mode? MySQL also has data compression, but all compressed data is read-only.
Hugo Palma
Yes, compressed data in MS SQL is fully read-write.
BradC
PostgreSQL also supports on-the-fly compression: http://www.postgresql.org/docs/current/static/storage-toast.html
janneb
+4  A: 

Just buy a bigger hard drive.

justin.m.chase
Sorry, that doesn't really answer my question. I'm looking for ways to optimize disk usage.
Hugo Palma
lol, that's actually a relevant point. How much data are we really talking about here? 10GB? 100GB? 1TB?
BradC
The goal is to deploy the database on a shared hosting site that has disk space limits. Raising those limits significantly increases the monthly fee, so it's not as simple as buying a new hard drive; it affects the fixed monthly cost of the solution.
Hugo Palma
+2  A: 

Take a look at Oracle Berkeley DB, a very simple, robust key/value database:

"Berkeley DB enables the development of custom data management solutions, without the overhead traditionally associated with such custom projects. Berkeley DB provides a collection of well-proven building-block technologies that can be configured to address any application need from the handheld device to the datacenter, from a local storage solution to a world-wide distributed one, from kilobytes to petabytes."
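To get a feel for the key/value model Berkeley DB offers, here is a minimal sketch using Python's standard-library `dbm` module as a stand-in (it exposes a similar on-disk key/value interface; this is not Berkeley DB itself, and the keys/values are made up):

```python
import dbm
import os
import tempfile

# Hypothetical on-disk store path.
path = os.path.join(tempfile.mkdtemp(), "store")

# "c" = create the database if it doesn't exist; keys and values are bytes.
with dbm.open(path, "c") as db:
    db[b"user:42"] = b"Hugo"   # put
    db[b"user:43"] = b"Brad"

# Reopen read-only and fetch a value back.
with dbm.open(path, "r") as db:
    print(db[b"user:42"].decode())  # -> Hugo
```

Because there is no SQL layer, query planner, or per-row metadata, the on-disk footprint of such stores tends to be close to the raw size of the keys and values, which fits the "simple structure, simple queries" use case.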

oraz
+2  A: 

Redis might be worth a look if you can store your data as key/value pairs.

Sundar
+1  A: 

Speaking of Apache Cassandra: it's a disk space hog. 200 MB of logs resulted in 1.2 GB of files produced by Cassandra, and the keyspace was just 4 columns of 200-character strings.

sha1dy
Thanks for the info.
jpartogi