Cassandra hot keyspace structure change | ansaurus

tags:

cassandra

Q:

Hello.

I'm currently running a 12-node Cassandra cluster storing 4TB of data, with a replication factor set to 3. For the needs of an application update, we need to change the configuration of our keyspace, and we'd like to avoid any downtime if possible.

I read on a mailing list that the best way to do it is to:

1. Kill the cassandra process on one server of the cluster
2. Start it again, wait for the commit log to be written on the disk, and kill it again
3. Make the modifications in the storage.xml file
4. Rename or delete files in the data directories according to the changes we made
5. Start cassandra
6. Go to 1 with the next server on the list
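
The loop described above can be sketched as a small driver. This is only an illustration of the one-node-at-a-time ordering: the step labels paraphrase the list, while `rolling_change`, `run_step`, `dry_run` and the node names are hypothetical and would have to be replaced by whatever actually stops, edits and starts each server.

```python
from typing import Callable, List, Tuple

# Step labels paraphrase the procedure; they are descriptions, not real commands.
STEPS = [
    "stop cassandra",
    "start cassandra and wait for the commit log to be written to disk",
    "stop cassandra again",
    "apply the storage.xml changes",
    "rename or delete data files to match the changes",
    "start cassandra",
]


def rolling_change(
    nodes: List[str],
    run_step: Callable[[str, str], bool],
) -> List[Tuple[str, str]]:
    """Run every step on one node before touching the next node.

    run_step(node, step) is a hypothetical callback that performs the step
    and returns True on success. The rollout stops at the first failure so
    that at most one node is left in a half-changed state.
    """
    completed = []
    for node in nodes:
        for step in STEPS:
            if not run_step(node, step):
                raise RuntimeError(f"{step!r} failed on {node}; stopping the rollout")
            completed.append((node, step))
    return completed


def dry_run(node: str, step: str) -> bool:
    """Print the step instead of executing it."""
    print(f"[{node}] {step}")
    return True


if __name__ == "__main__":
    # Hypothetical node names, for illustration only.
    rolling_change(["node01", "node02", "node03"], dry_run)
```

Stopping at the first failed step is a design choice of the sketch, not part of the procedure quoted from the mailing list.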

My questions would be:

1. Did I understand the process correctly?
2. Is there any risk of data corruption?
3. During the process, there will be servers with different versions of the storage.xml file in the same cluster and the same keyspace. Is that a problem?
4. Same question as above, but for the case where we do more than add, rename and remove ColumnFamilies: what if we change the CompareWith parameter, or transform an existing column family into a super one? Or do we need to change the name?

Thank you for your answers. It's the first time I'll do this, and I'm a little bit scared.

A:

Your list looks like the one in http://wiki.apache.org/cassandra/FAQ#modify_cf_config. So it should be accurate...

jbellis 2010-04-02 15:14:34

Yep, thanks. I shouldn't be that worried. But... that doesn't really answer the last question, which is the most important one =)

Pierre 2010-04-04 13:17:39
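
The thread leaves the CompareWith question open. As background only, and not as an answer from the thread, the sketch below shows why the comparator matters for data already written: Cassandra keeps the columns of a row sorted by the column family's comparator, and the same column names can sort differently under two comparators. The two orderings here are stand-ins for a byte-wise and a signed-long comparison; the helper name and sample values are made up for the illustration.

```python
import struct


def as_long_name(value: int) -> bytes:
    """Encode an integer as an 8-byte big-endian column name (illustrative)."""
    return struct.pack(">q", value)


def decode(raw: bytes) -> int:
    """Decode an 8-byte big-endian column name back to a signed integer."""
    return struct.unpack(">q", raw)[0]


# Sample column names, chosen only to make the difference visible.
names = [as_long_name(v) for v in (256, 1, -1)]

# Byte-wise ordering: compares the raw bytes, so -1 (all 0xff) sorts last.
bytes_order = [decode(n) for n in sorted(names)]

# Signed-long ordering: compares the decoded values, so -1 sorts first.
long_order = [decode(n) for n in sorted(names, key=decode)]

print("byte-wise order:  ", bytes_order)   # [1, 256, -1]
print("signed-long order:", long_order)    # [-1, 1, 256]
```

Files written under one ordering are therefore not necessarily in order under the other, which is why the last question deserves a definite answer before changing CompareWith on a column family that already holds data.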
