Hi,

I'm building an application that needs to query a lot of data that is written once and never changed afterwards. Should I use MySQL for that, or something like SimpleDB or BigTable? (I need to write once and read many times.)

Thank you.

Edit: I want to use Heroku, and "big" for me means more than 5MB. "Thousands of rows" take up more than 5MB. That's why I'm wondering whether I should use CouchDB, SimpleDB or MongoDB so that I don't have to pay the $15 that Heroku charges. Any suggestions for getting around this? Thank you all for the comments!

A: 

I think you should use a non-transactional, document-oriented database like MongoDB or CouchDB.

Elias Haileselassie
Can you elaborate on that please?
Tiago
+1  A: 

More important than your choice of database engine is your table structure. You should read up on OLAP database structure. Another consideration is the language you are writing in: make sure the database you want to use has a well-supported API for it. CouchDB would be a good fit, as it has very low overhead due to the lack of relations and transactions.

fredley
Is CouchDB suitable for many reads and no writes? Thanks.
Tiago
Since you're not doing writes during operation, it doesn't matter how efficient it is at doing no writes :-). However, because it does not do transactions or manage relations, there is less overhead than in more complex database systems like MySQL. It's about as simple as it gets, and that's a good thing in this case.
fredley
Please read my edit.
Tiago
Do not pay for a database system to manage a 5MB database (small!). Use CouchDB and be done with it, or frankly use anything; you won't notice the difference. Have a read of this, the quotes in particular: http://en.wikipedia.org/wiki/Program_optimization#When_to_optimize
fredley
Thank you! Is CouchDB fine for everything? Even for storing usernames/passwords? I'll definitely think about it!
Tiago
Yes, it's fine. You should not be storing plaintext passwords, but cryptographic hashes: http://www.aspheute.com/english/20040105.asp
fredley
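The advice above about storing hashes instead of plaintext passwords could look roughly like this. It is only a sketch: the choice of PBKDF2-HMAC-SHA256, the per-user random salt, the iteration count, and the function names `hash_password` and `verify_password` are assumptions for illustration, not something the thread or the linked article prescribes.

```python
import hashlib
import hmac
import os

# Assumption: illustrative iteration count; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). Store both; never store the password itself."""
    salt = os.urandom(16)  # fresh random salt for every user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, digest)
```

The salt and digest are plain bytes, so they can be stored in whatever database ends up being chosen (for a JSON document store they would need to be hex- or base64-encoded first).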
+2  A: 

What does "a lot of data" mean? Thousands, millions, billions of rows? How many columns per row, and of what kind? Will you use many joins or simple selects?

If your tables are simple or you need to use complicated JOINs, I would pick any SQL database you're familiar with.

If your structure is complicated and a document-oriented database would suit your needs, I would pick MongoDB (preferred) or CouchDB.

Edit: According to your comment, thousands of rows is not that much. Use your favourite database and give it as much cache as it needs (read up on how much cache is necessary, or start a new topic). Or use Memcached, but I suggest using the database's own cache, as it's efficient too and painless for you. Good luck, man!
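To illustrate why caching suits write-once data, here is a minimal sketch. It uses an in-process `functools.lru_cache` in front of SQLite rather than Memcached or a database-level cache, purely to keep the example self-contained. The `items` table and the `get_label` function are hypothetical.

```python
import sqlite3
from functools import lru_cache
from typing import Optional

# Hypothetical read-only data set, loaded once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, "alpha"), (2, "beta")])
conn.commit()


@lru_cache(maxsize=1024)
def get_label(item_id: int) -> Optional[str]:
    """Look up a label; repeated calls with the same id skip the database."""
    row = conn.execute(
        "SELECT label FROM items WHERE id = ?", (item_id,)
    ).fetchone()
    return row[0] if row else None
```

Because the rows never change after they are written, cached results never go stale, so no invalidation logic is needed.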

dwich
Thousands of rows. 20 columns per row. Thanks.
Tiago
That's not even close to being big. Seriously. Any database engine will suit your needs, even a free one.
Pierre-Alain Vigeant
Please read my edit.
Tiago
@Tiago: If you want to use http://heroku.com/ for web hosting and they charge $15 per database, then just make your decision: pay the fee or pick another host. It's absolutely normal that services (like hosting) cost money.
dwich
They offer 5MB for FREE and 1GB CouchDB for FREE.
Tiago
@Tiago: You should pick the right technology according to your needs; the budget is not that important, especially in your case. 15 bucks a month is the price of a couple of burgers. Or, if you want me to say it: go for the 1GB FREE CouchDB. If this will make you happy, go for it!
dwich
@dwich - $15 is real money for me (I'm not in the US). Do you think that a 5MB database will handle thousands of rows (~100k)?
Tiago
@Tiago: I'm not in the US either. Simple answer: try it for yourself on a local computer first, test it, check the database size, and then you'll see whether the data fits in 5MB or not. No one here can tell you if your data will fit in 5MB. For your information, if you expect ~100k rows, 5MB works out to about 50 bytes per row. And again: pick the right technology for your needs, then find hosting. Not the reverse.
dwich
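The "test it locally and check the size" suggestion above can be sketched as follows. This assumes SQLite as a stand-in engine and made-up sample values, so the measured size is only a rough proxy for what a hosted database would report. The function names `bytes_per_row` and `measured_size` are hypothetical.

```python
import sqlite3


def bytes_per_row(limit_bytes: int, rows: int) -> float:
    """Storage budget per row if `rows` rows must fit in `limit_bytes`."""
    return limit_bytes / rows


def measured_size(rows: int, columns: int = 20) -> int:
    """Insert `rows` sample rows into an in-memory SQLite table and
    return the database size in bytes (page count times page size)."""
    conn = sqlite3.connect(":memory:")
    col_defs = ", ".join(f"c{i} TEXT" for i in range(columns))
    conn.execute(f"CREATE TABLE sample ({col_defs})")
    placeholders = ", ".join("?" for _ in range(columns))
    row = tuple(f"value{i}" for i in range(columns))  # made-up sample data
    conn.executemany(
        f"INSERT INTO sample VALUES ({placeholders})",
        (row for _ in range(rows)),
    )
    conn.commit()
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    conn.close()
    return page_count * page_size


# 5 MB spread over 100k rows leaves roughly 52 bytes per row.
budget = bytes_per_row(5 * 1024 * 1024, 100_000)
```

Comparing `measured_size(100_000)` against the 5MB limit, with sample values shaped like the real data, gives a first idea of whether the free tier is realistic.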
A: 

For "write once, read many times", a denormalized database (one that does not waste cycles on joins etc.) is a good choice.

So, you should design your tables so that these reads need the minimum number of I/Os and joins. You can do that with any database. It is the structure of your tables that matters.
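As a rough illustration of the difference, the sketch below reads the same record from a normalized layout (which needs a join) and from a denormalized one (a single-table lookup). The table and column names are made up for the example, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Normalized layout: reading an order with its customer name needs a JOIN.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 99.5);
""")
normalized = conn.execute(
    "SELECT o.id, c.name, o.total "
    "FROM orders o JOIN customers c ON c.id = o.customer_id "
    "WHERE o.id = ?",
    (10,),
).fetchone()

# Denormalized layout: the customer name is copied into each row at write
# time, so the read is a single primary-key lookup with no join.
conn.executescript("""
    CREATE TABLE orders_flat (id INTEGER PRIMARY KEY, customer_name TEXT, total REAL);
    INSERT INTO orders_flat VALUES (10, 'Ada', 99.5);
""")
denormalized = conn.execute(
    "SELECT id, customer_name, total FROM orders_flat WHERE id = ?",
    (10,),
).fetchone()
```

The price of the flat layout is duplicated data, which is usually acceptable when rows are written once and never updated.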

AFAIK, SimpleDB and BigTable are distributed databases and offer very good query speeds if your users are distributed geographically (thereby bypassing the network latencies). They will not offer much advantage if the I/O latency is not the bottleneck.

Amoeba
A: 

The amount of data you have is tiny. Any DBMS will cope with a few thousand rows. I suggest you look first at one of the popular SQL DBMSs, such as MySQL, which you already mentioned. You should make the choice based on the functional requirements rather than on the data size.

dportas