cassandra

Atomic transactions in key-value stores

Please excuse any mistakes in terminology. In particular, I am using relational database terms. There are a number of persistent key-value stores, including CouchDB and Cassandra, along with plenty of other projects. A typical argument against them is that they do not generally permit atomic transactions across multiple rows or tables...

What is the difference between Cassandra and CouchDB?

I'm looking at both projects and I can't really see the difference. From the Cassandra site: Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store...Cassandra is eventually consistent. Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems. From Couc...

What is the difference between Cassandra and Oracle Coherence?

Assume that Oracle Coherence is free :) Which one do you prefer? What are the architectural and feature differences between Oracle Coherence (Tangosol) and Cassandra? ...

Which is the most suitable key-value store for a person with an RDBMS background?

Is there a distinct winner among all the key-value stores: Cassandra, MongoDB, CouchDB? Do they all follow some central guidelines, or does each have its own say in defining its API? I'm asking this question especially from the perspective of an RDBMS-skilled person who is new to key-value stores. Which one should we follow to bes...

What's The Best Practice In Designing A Cassandra Data Model?

And what are the pitfalls to avoid? Are there any deal breakers for you? E.g., I've heard that exporting/importing Cassandra data is very difficult, which makes me wonder whether that will hinder syncing production data to a development environment. BTW, it's very hard to find good tutorials on Cassandra; the only one I have http://arin.me/...

Storing massive ordered time series data in BigTable derivatives

I am trying to figure out exactly what these newfangled data stores such as BigTable, HBase, and Cassandra really are. I work with massive amounts of stock market data: billions of rows of price/quote data that can add up to hundreds of gigabytes every day (although these text files often compress by at least an order of magnitude). This d...

Cassandra load balancing

I see here that Cassandra does not have automatic load balancing, which becomes relevant when using the ordered partitioner (a common range of values for a group of rows would be stored on relatively few machines, which would then serve most of the queries). http://stackoverflow.com/questions/1502735/whats-the-best-practice-in-...

Cassandra Vs Amazon SimpleDB

I'm working on an application where data size and SQL queries are going to be heavy. I am deciding between Cassandra and Amazon SimpleDB. Can you please suggest which is more suitable in this kind of scenario? Cassandra data indexing seems better than Amazon SimpleDB's, but its queries have fewer options compared to Amazon SimpleDB. Seems ...

Update an existing column value

What happens when a new value for an existing column is added? Will the older value be overwritten by the new value? Or will the older value also be retained and remain retrievable (similar to SimpleDB)? ...

Is Cassandra suitable to use as a primary data store?

I'm evaluating a storage platform for an upcoming project and keep coming back to Cassandra. For this project, losing any amount of data is unacceptable. So far we've used a relational database (Microsoft SQL Server), but the data is so varied and large that it has become an issue to store and query. Is Cassandra robust enough to use as...

NoSQL - which is best for my needs - I am having a mental breakdown

I am building a Reddit clone in Erlang. I am considering using some Erlang web frameworks, but this is not the problem. I am having a problem selecting a database. How it works: I have multiple dedicated reddits. Examples: science, funny, corporate, sport. You could consider them subreddits. Each subreddit has categories. A user c...

Cassandra on Amazon EC2 with Elastic IP addresses

Can I use Cassandra on EC2 instances without Elastic IP addresses? I believe that in that case any instance that goes down would create an issue. If I use Elastic IP addresses for the Cassandra nodes, I have to configure them such that they use the public IP address for internal communication (gossip etc.). But that will increase the netwo...

Row count of a column family in Cassandra

Is there a way to get a row count (key count) of a single column family in Cassandra? get_count can only be used to get the column count. For instance, if I have a column family containing users and want to get the number of users, how could I do it? Each user is its own row. ...

Delayed execution in python for big data

I'm trying to think about how a Python API might look for large datastores like Cassandra. R, Matlab, and NumPy tend to use the "everything is a matrix" formulation and execute each operation separately. This model has proven itself effective for data that can fit in memory. However, one of the benefits of SAS for big data is that it ...

Cassandra atomic reads/writes within a single ColumnFamily

Cassandra's front page http://incubator.apache.org/cassandra/ states that: Cassandra guarantees reads and writes to be atomic within a single ColumnFamily. What exactly does that mean? It sounds like it means that batch_insert() and batch_mutate() of two different rows in the same CF are atomic, and if the operation on one of the r...

Cassandra vs Riak

I am looking for an eventually consistent data store, and it looks like it may be coming down to Riak or Cassandra. Does anyone have experience with or a view on this? ...

How can I create or associate a super column to a column in Perl using Net::Cassandra?

How can I create or associate a super column to a column in Perl using Net::Cassandra? ...

How can I get the key of a column in Cassandra using PHP?

How can I get the keys of the columns having sex = male, using the PHP library from http://wiki.apache.org/cassandra/ClientExamples? For example, my keys are 0, 1, 2: key: 0 { column( name:age, value:24), column( name:sex, value:female) } key: 1 { column( name:age, value:24), column( name:sex, value:female) } key: 2 { column( name...

Cassandra time series data

We are looking at using Cassandra to store a stream of information coming from various sources. One issue we are facing is the best way to query between two dates. For example, we will need to retrieve an object between datetime dt1 and datetime dt2. We are currently considering the creation Unix timestamp as the key pointing to the act...

How does Cassandra rebalance when nodes go down?

Does anyone have experience with Cassandra when nodes go down or are unavailable? I am mostly interested in whether the cluster rebalances and what happens when the nodes come online, or are replaced by new machines. ...