When NOT to use Cassandra?

+16 Q:

There has been a lot of talk related to Cassandra lately.

Twitter, Digg, Facebook, etc. all use it.

When does it make sense to:

use Cassandra,
not use Cassandra, and
use an RDBMS instead of Cassandra.

less data! easy architecture!

Thomas 2010-04-14 13:42:05

My understanding is that you would use NoSQL when you just have a single key-value pair. Meaning, your RDBMS table would have just 2 columns (key, value).

JustinT 2010-04-14 14:17:52

Well, Column Families can have multiple columns in Cassandra, correct?

Luke 2010-04-14 15:19:19

Yes, and it also allows SuperColumns.
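
As a rough picture of the data model discussed in these comments, a column family and a super column family can be sketched as nested maps. This is only an illustration with plain Python dicts, not a Cassandra client API; the row keys, column names, and values are made up.

```python
# Illustration only: plain dicts, not a Cassandra API. All names are made up.

# Column family: row key -> {column name: value}. A row can hold many columns.
users = {
    "thomas": {"email": "thomas@example.com", "city": "Berlin"},
}

# Super column family: row key -> {super column name -> {column name: value}}.
addresses = {
    "thomas": {
        "home": {"city": "Berlin", "zip": "10115"},
        "work": {"city": "Hamburg", "zip": "20095"},
    },
}

# Reading one value means walking the nesting: row, super column, column.
home_city = addresses["thomas"]["home"]["city"]
```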

Schildmeijer 2010-04-14 15:57:43

Plus this is just Cassandra. If you're talking about NoSQL, well hell, look at MongoDB's data model: arrays and hashes nested as far as you can fit in a 4MB row. Indexable as well.

Michael 2010-04-15 08:10:57

+7 A:

The general idea of NoSQL is that you should use whichever data store is the best fit for your application. If you have a table of financial data, use SQL. If you have objects that would require complex/slow queries to map to a relational schema, use an object or key/value store.

Of course just about any real world problem you run into is somewhere in between those two extremes and neither solution will be perfect. You need to consider the capabilities of each store and the consequences of using one over the other, which will be very much specific to the problem you are trying to solve.

Tom Clarkson 2010-04-14 22:22:11

What is the advantage of SQL when using financial data?

Paco 2010-04-26 14:25:14

The schema is unlikely to change, it fits well in a table structure, and lost/inconsistent data could cause real problems.

Tom Clarkson 2010-04-27 00:28:41

I don't understand why inconsistent data can cause real problems with banks. Scenario: You have one bank account with $100 left above the limit, and two bank cards. When you try to withdraw money with the two cards at the same time at two different ATMs, you will get $100 twice, and a letter with an extra fee in your mailbox. The bank earns money (the extra fee for being below the limit) by using inconsistent data. It's too hard to connect all ATMs in the world with each other through one large relational database. Can you give an example where inconsistent financial data can be a problem?

Paco 2010-04-27 16:00:50

That stuff is all COBOL and batch processing, and not nearly as well designed/stable as you might think. ATMs do not connect to any sort of unified data store, so they are hardly a suitable example. It's like saying SQL isn't suitable for web apps because you can't give everyone on the internet direct access to your database. Besides, I never said anything about banks. Think of things like orders on an ecommerce site, where you don't have to deal with an organization so conservative that SQL is considered new and untrusted.

Tom Clarkson 2010-04-28 02:26:20

So the only reason is conservatism, no technical reason?

Paco 2010-04-28 08:50:47

You seem to be missing the point. Technically anything is possible, using any set of tools, but that doesn't make it a good idea. For tracking sales, the benefits of SQL outweigh the disadvantages. If you think you can set up a banking system using new technology, good luck to you.

Tom Clarkson 2010-04-28 23:34:54

@Paco: The first ATM reads your balance ($100), and the second ATM does the same. Both ATMs deduct $100 from $100 and write the final balance of $0 back to your account. Result: the bank loses $100.
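
The scenario above is the classic lost update. A minimal sketch, assuming a single stored balance and two read-then-write withdrawals (the function names are hypothetical, and no real database is involved):

```python
def withdraw_interleaved(start_balance, amount):
    """Both ATMs read the balance before either one writes it back."""
    stored = start_balance
    seen_by_atm1 = stored            # ATM 1 reads the balance
    seen_by_atm2 = stored            # ATM 2 reads the same value
    stored = seen_by_atm1 - amount   # ATM 1 writes its result
    stored = seen_by_atm2 - amount   # ATM 2 overwrites it: ATM 1's update is lost
    return stored, 2 * amount        # (recorded balance, cash dispensed)


def withdraw_serialized(start_balance, amount):
    """Each read-modify-write finishes before the next one starts."""
    stored = start_balance
    for _ in range(2):
        seen = stored
        stored = seen - amount
    return stored, 2 * amount        # (recorded balance, cash dispensed)


lost_update = withdraw_interleaved(100, 100)
isolated = withdraw_serialized(100, 100)
```

In the interleaved case the recorded balance is $0 although $200 left the machines; in the serialized case the record shows -$100, so the overdraft is at least visible.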

Seun Osewa 2010-05-01 21:42:23

@Seun Osewa: That would be a stupid bank. A normal bank would ask you to pay back $100 and a ridiculous interest rate for being below the limit and earn some money instead of losing money.

Paco 2010-05-01 23:54:28

@Tom Clarkson: When you cannot name a benefit, there is no benefit.

Paco 2010-05-01 23:55:00

@Paco: The point is, without proper transaction isolation, the normal bank won't even know the account has been overdrawn. They won't even know.

Seun Osewa 2010-05-03 21:40:45

@Seun Osewa: A bank does not use atomic transactions for withdrawing money from an ATM. It would cost too much hardware to connect all ATMs in the world to the same database with atomic transactions.

Paco 2010-05-04 09:38:10

+4 A:

When evaluating distributed data systems, you have to consider the CAP theorem - you can pick two of the following: consistency, availability, and partition tolerance.

Cassandra is an available, partition-tolerant system that supports eventual consistency. For more information see my Visual Guide to NoSQL Systems.

Nathan Hurst 2010-04-20 19:01:38

+2 A:

Cassandra is the answer to a particular problem: What do you do when you have so much data that it does not fit on one server? How do you store all your data on many servers without breaking your bank account or driving your developers insane? Facebook gets 4 terabytes of new compressed data EVERY DAY. And this number will most likely more than double within a year.

If you do not have this much data, or if you have millions to pay for an Enterprise Oracle/DB2 cluster installation and the specialists required to set it up and maintain it, then you are fine with a SQL database.

Vagif Verdi 2010-04-24 19:30:22

+1 A:

Another situation that makes the choice easier is when you want to use aggregate functions like sum, min, max, etc. and complex queries (like in the financial system mentioned above). Then a relational database is probably more convenient than a NoSQL database, since neither is possible on a NoSQL database unless you use a great many inverted indexes. When you do use NoSQL, you would have to do the aggregate functions in code or store the results separately in their own column family, but this makes it all quite complex and reduces the performance that you gained by using NoSQL.
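
A small sketch of the two workarounds just described, using plain Python dicts as stand-ins for column families (this is not a Cassandra API, and the names and amounts are made up):

```python
# Stand-in for a column family: row key -> {column name: value}.
orders = {
    "customer-1": {"order-1": 20.0, "order-2": 35.5},
    "customer-2": {"order-3": 12.0},
}


def total_in_code(column_family):
    """Option 1: read every row and aggregate in the application."""
    return sum(v for row in column_family.values() for v in row.values())


# Option 2: a separate "column family" that stores running totals per customer.
totals = {customer: sum(row.values()) for customer, row in orders.items()}


def record_order(customer, order_id, amount):
    """Every write must also update the stored aggregate."""
    orders.setdefault(customer, {})[order_id] = amount
    totals[customer] = totals.get(customer, 0.0) + amount


record_order("customer-2", "order-4", 8.0)
grand_total = total_in_code(orders)
```

Option 1 scans everything on each query; option 2 makes reads cheap but pushes the bookkeeping into every write path, which is the extra complexity mentioned above.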

ronaldmathies 2010-04-28 04:31:41

I talked with someone in the midst of deploying Cassandra, and it doesn't handle many-to-many relationships well. They are doing a hack job to get through their initial testing. I spoke with a Cassandra consultant about this, and he said he wouldn't recommend it if you had this problem set.

Warren 2010-06-06 22:21:04
