views:

307

answers:

6

I have a project which is going to be write-heavy rather than read-heavy. I was wondering if anyone had any suggestions for open source DBMS setups which are quick at writes?

It doesn't necessarily have to be a relational DBMS either; I'm open to suggestions.

+2  A: 

Database systems can be optimized for the environment in which they run, but the most important factor is hardware, especially I/O. Use as many disks as you can, and set up RAID 10 or RAID 0+1, as you don't want to calculate a parity check every time the DBMS writes something to disk.

Novitzky
True, we assume that you are not using RAID. MySQL also has different storage engines: MyISAM is pretty fast but does table-level locking; InnoDB is slower but more stable and does row-level locking.
+1  A: 

MongoDB is supposed to be good at this. Have a look at this post especially.

utku_karatas
+7  A: 

I'm quoting below some parts of the conclusion of NoSQL: If Only It Was That Easy (the article is more about scalability but still contains interesting things that apply to your context):

[...]

The real thing to point out is that if you are being held back from making something super awesome because you can’t choose a database, you are doing it wrong. If you know mysql, just use it. Optimize when you actually need to. Use it like a k/v store, use it like a rdbms, but for god sake, build your killer app! None of this will matter to most apps. Facebook still uses MySQL, a lot. Wikipedia uses MySQL, a lot. FriendFeed uses MySQL, a lot. NoSQL is a great tool, but it’s certainly not going to be your competitive edge, it’s not going to make your app hot, and most of all, your users won’t give a shit about any of this.

What am I going to build my next app on? Probably Postgres. Will I use NoSQL? Maybe. I might also use Hadoop and Hive. I might keep everything in flat files. Maybe I’ll start hacking on Maglev. I’ll use whatever is best for the job. If I need reporting, I won’t be using any NoSQL. If I need caching, I’ll probably use Tokyo Tyrant. If I need ACIDity, I won’t use NoSQL. If I need a ton of counters, I’ll use Redis. If I need transactions, I’ll use Postgres. If I have a ton of a single type of documents, I’ll probably use Mongo. If I need to write 1 billion objects a day, I’d probably use Voldemort. If I need full text search, I’d probably use Solr. If I need full text search of volatile data, I’d probably use Sphinx.

[...]

So if a non-ACID storage system is an option, I'd look at Voldemort. If not, without more specific information, I can't say whether one DBMS is really better than another for write-intensive applications. Actually, I think it's more a matter of design/architecture/tuning, and I tend to agree with the author: 1) use the one you know best, and 2) which one you choose won't matter to most apps.

Pascal Thivent
The question wasn't about scaling.
Agreed. But the article and its conclusion cover much more than just scaling concerns; I'll put some emphasis on those parts.
Pascal Thivent
People confuse ACID with databases. You can have ACID without SQL, or vice versa (e.g. MySQL). NoSQL doesn't have to forgo transactions.
Seun Osewa
+3  A: 

Well, I have seen the commercial DBs get up to 2 GB per minute on not particularly impressive hardware, and the standard open source DBs (MySQL, Postgres, even SQLite) are not far behind.

For any volume of writes that will give a modern DB trouble, there are three things that will affect performance (none of which depends on the particular DB you choose).

One is basic design, particularly partitioning (spreading your DB over several physical disks) and minimising the number of indexes on the tables (for write performance, zero indexes is best!).
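The index cost on writes is easy to see with a throwaway benchmark. This sketch uses Python's built-in sqlite3 with a made-up table and columns (not any of the DBMSes mentioned in this thread), just to illustrate the effect:

```python
import sqlite3
import time

def bulk_insert(with_indexes):
    """Insert 200k rows and return the elapsed time in seconds."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INT, b INT, c INT)")
    if with_indexes:
        # Each index is an extra B-tree that must be maintained per insert.
        for col in ("a", "b", "c"):
            conn.execute(f"CREATE INDEX ix_{col} ON t({col})")
    rows = [(i, i * 2, i * 3) for i in range(200_000)]
    start = time.perf_counter()
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    conn.commit()
    return time.perf_counter() - start

plain = bulk_insert(False)
indexed = bulk_insert(True)
print(f"no indexes: {plain:.2f}s  three indexes: {indexed:.2f}s")
```

On a typical machine the indexed run is several times slower, which is the point of dropping indexes you don't need on write-heavy tables.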

Two is log placement, or, if possible, log avoidance. Logging is the bottleneck in most RDBMSes. Making sure you are logging to dedicated fast disks is one way; another is turning off logging for the table (how varies by RDBMS, but most support it), if you can afford to lose transactions.

Three is hardware -- lots of memory and lots of fast disks to spread out your I/O load.

There are some exotic options out there if this is still not fast enough. Buy a z/OS mainframe and run the venerable IMS/DB with the DEDB (Data Entry Database) feature, which is about four times faster than any other ACID DB. Or buy Oracle's in-memory DB option (which used to be HP's TimesTen).

Another possibility, if you have some decent queuing software available, is to capture the data and immediately place it in a queue. You can then have one or more background processes pulling the data off the queue and doing the actual DB updates in the background.
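The queue-and-background-writer idea can be sketched in a few lines. This is a minimal illustration using Python's stdlib queue and sqlite3 (the table name, batch size, and file name are made up; a real setup would use proper queuing software as the answer suggests):

```python
import os
import queue
import sqlite3
import threading

DB_PATH = "events.db"            # hypothetical target database
if os.path.exists(DB_PATH):
    os.remove(DB_PATH)           # start fresh for the demo

work_q = queue.Queue()
done = threading.Event()

def writer():
    """Drain the queue and write to the DB in batches, off the hot path."""
    conn = sqlite3.connect(DB_PATH)
    conn.execute("CREATE TABLE IF NOT EXISTS events (payload TEXT)")
    while not (done.is_set() and work_q.empty()):
        try:
            batch = [work_q.get(timeout=0.1)]
        except queue.Empty:
            continue
        # Single consumer, so draining with get_nowait() is safe here.
        while not work_q.empty() and len(batch) < 500:
            batch.append(work_q.get_nowait())
        # One transaction per batch amortises the commit (log flush) cost.
        conn.executemany("INSERT INTO events (payload) VALUES (?)", batch)
        conn.commit()
    conn.close()

t = threading.Thread(target=writer)
t.start()
for i in range(1000):
    work_q.put((f"event-{i}",))  # enqueue returns immediately; no DB latency
done.set()
t.join()
```

The producers never block on the database; only the background writer pays the commit cost, and batching lets it amortise that cost across many rows.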

James Anderson
A: 

Define "write heavy": billions of rows per day, or write-intensive compared to read activity?

Even a "write intensive" database peaks at around 15% writes because of indexes, duplicate checking, UPDATE..WHERE clauses, etc.

Unless you really are an edge case (as mentioned in the NoSQL answer above), any DBMS will do, because the limit will be the hardware, not the vendor.

gbn
A: 

Hi ... This is a bit of a mystery question -- if you are doing lots and lots of writes, very few look-ups (reads), and (some) updates ...

Use a fixed-record random-access file (seek() and the like on a POSIX platform) -- a flat file. If you need indexing, simply keep an index from your keys into the flat file to do reads and updates.

The disadvantage is that you need to keep the keys in sync with the file contents on writes and updates. A pretty simple C++ or other OO class can handle that for you, I feel. Why write indexes if you aren't going to use them? And, depending on your actual needs, you can do away with the indexing altogether and just rebuild the index at the end of the day or something!
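A minimal sketch of the fixed-record flat file with an in-memory key index, in Python rather than C++ for brevity (the class name, file name, and 64-byte record size are all made up for illustration):

```python
import os

REC_SIZE = 64  # fixed record size in bytes (arbitrary choice for this sketch)

class FlatStore:
    """Fixed-record flat file; keys live only in an in-memory index."""

    def __init__(self, path):
        self.index = {}               # key -> record number in the file
        self.f = open(path, "w+b")    # truncates: fine for a demo

    def write(self, key, value):
        rec = value.encode().ljust(REC_SIZE, b"\x00")[:REC_SIZE]
        recno = self.index.get(key)
        if recno is None:             # new key: append at end of file
            self.f.seek(0, os.SEEK_END)
            self.index[key] = self.f.tell() // REC_SIZE
        else:                         # existing key: update in place via seek
            self.f.seek(recno * REC_SIZE)
        self.f.write(rec)

    def read(self, key):
        self.f.seek(self.index[key] * REC_SIZE)
        return self.f.read(REC_SIZE).rstrip(b"\x00").decode()

store = FlatStore("data.rec")
store.write("alice", "first record")
store.write("bob", "second record")
store.write("alice", "updated in place")   # overwrites record 0
print(store.read("alice"))
```

Writes are a single seek plus a fixed-size write, which is about as cheap as it gets; the trade-off, as the answer says, is keeping the in-memory index consistent with the file (and persisting it somehow if you need it to survive a restart).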

Cheers, w.

will