Please excuse any mistakes in terminology. In particular, I am using relational database terms.
There are a number of persistent key-value stores, including CouchDB and Cassandra, along with plenty of other projects.
A typical argument against them is that they do not generally permit atomic transactions across multiple rows or tables...
I'm looking at both projects and I can't really see the difference.
From the Cassandra site:
Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store...Cassandra is eventually consistent. Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems.
from Couc...
Assume that Oracle Coherence is free :)
Which one do you prefer?
What are the architectural and feature differences between Oracle Coherence (Tangosol) and Cassandra?
Best Regards
...
Is there a distinct winner among the key-value stores Cassandra, MongoDB, and CouchDB? Do they all follow some central guidelines, or does each define its own API?
I'm asking this question, especially from the perspective of an RDBMS-skilled person who is new to key-value stores. Which one should we follow to bes...
And what are the pitfalls to avoid? Are there any deal breakers for you? E.g., I've heard that exporting/importing Cassandra data is very difficult, which makes me wonder whether that will hinder syncing production data to a development environment.
BTW, it's very hard to find good tutorials on Cassandra, the only one I have http://arin.me/...
I am trying to figure out exactly what these newfangled data stores such as Bigtable, HBase and Cassandra really are.
I work with massive amounts of stock market data, billions of rows of price/quote data that can add up to 100s of gigabytes every day (although these text files often compress by at least an order of magnitude). This d...
I see here that Cassandra does not have automatic load balancing. This matters when using the ordered partitioner: rows whose keys fall in a common range are stored on relatively few machines, which then serve most of the queries.
http://stackoverflow.com/questions/1502735/whats-the-best-practice-in-...
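The hot-spot concern in this question can be illustrated with a toy model. This is only a sketch, not Cassandra code: the four-node ring, the split points, and the `aapl-` key prefix are hypothetical, and hashing modulo the node count is a simplification of how a hash-based partitioner assigns token ranges.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

# Hypothetical split points for a 4-node ring under an order-preserving scheme:
# node 0 owns keys < "g", node 1 owns "g".."m", node 2 owns "n".."s", node 3 the rest.
SPLITS = ["g", "n", "t"]


def ordered_node(key: str) -> int:
    """Place a key by its sort order (toy stand-in for an ordered partitioner)."""
    return bisect_right(SPLITS, key)


def hashed_node(key: str, nodes: int = 4) -> int:
    """Place a key by a hash of the key (simplified stand-in for a hash partitioner)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % nodes


# Keys sharing a common prefix, e.g. quotes for one ticker symbol.
keys = [f"aapl-{i:05d}" for i in range(1000)]

ordered_load = Counter(ordered_node(k) for k in keys)
hashed_load = Counter(hashed_node(k) for k in keys)

print("ordered:", dict(ordered_load))  # every key lands on the same node
print("hashed: ", dict(hashed_load))   # keys are spread over several nodes
```

With ordered placement, range scans over a prefix stay on one node, which is exactly why that node becomes the hot spot; hashing spreads the load but gives up ordered range scans.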
I'm working on an application where the data size and query load are going to be heavy. I am deciding between Cassandra and Amazon SimpleDB. Can you please suggest which is more suitable in this kind of scenario?
Cassandra's data indexing seems better than Amazon SimpleDB's, but its queries have fewer options compared to Amazon SimpleDB. Seems ...
What happens when a new value for an existing column is added? Will the older value be overwritten by the new value? Or will the older value also be retained so that it can still be retrieved (similar to SimpleDB)?
...
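For context on the overwrite question: Cassandra resolves writes to the same column by timestamp, so a read returns only the value with the newest timestamp. A minimal toy model of that last-write-wins rule follows. The `ToyColumnFamily` class is hypothetical and is not a client API; keeping the existing value on a timestamp tie is an arbitrary choice made for this sketch.

```python
from typing import Dict, Tuple


class ToyColumnFamily:
    """In-memory stand-in: row key -> column name -> (timestamp, value)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, Tuple[int, str]]] = {}

    def insert(self, key: str, column: str, value: str, timestamp: int) -> None:
        row = self._rows.setdefault(key, {})
        current = row.get(column)
        # Keep the write only if it is newer than what is already stored.
        if current is None or timestamp > current[0]:
            row[column] = (timestamp, value)

    def get(self, key: str, column: str) -> str:
        return self._rows[key][column][1]


cf = ToyColumnFamily()
cf.insert("user1", "email", "old@example.com", timestamp=1)
cf.insert("user1", "email", "new@example.com", timestamp=2)
cf.insert("user1", "email", "stale@example.com", timestamp=1)  # older write loses
print(cf.get("user1", "email"))
```

To keep a history of values in such a model, each version would need its own column name (for example, one that includes the timestamp) rather than reusing the same column.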
I'm evaluating a storage platform for an upcoming project and keep coming back to Cassandra. For this project, losing any amount of data is unacceptable. So far we've used a relational database (Microsoft SQL Server), but the data is so varied and large that it has become an issue to store and query.
Is Cassandra robust enough to use as...
I am building a Reddit clone in Erlang. I am considering using some Erlang web frameworks, but this is not the problem.
I am having a problem selecting a database.
How it works:
I have multiple dedicated reddits, for example science, funny, corporate, and sport. You could consider them subreddits. Each subreddit has categories.
A user c...
Can I use Cassandra on EC2 instances without Elastic IP addresses? I believe that in that case any instance that goes down would create an issue.
If I use Elastic IP addresses for the Cassandra nodes, I have to configure them such that they use the public IP address for internal communication (gossip etc.). But that will increase the netwo...
Is there a way to get a row count (key count) of a single column family in Cassandra? get_count can only be used to get the column count.
For instance, if I have a column family containing users and want to get the number of users, how could I do it? Each user is its own row.
...
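One approach often suggested for this kind of question is to page through the row keys and count them on the client. The sketch below shows only the paging logic. `fetch_keys(start, limit)` is a hypothetical callable assumed to return up to `limit` keys in order, starting at `start` inclusive; it is not a real Cassandra client call, and `make_fake_backend` exists only to make the example runnable.

```python
from typing import Callable, Iterable, List


def count_rows(fetch_keys: Callable[[str, int], List[str]], page_size: int = 100) -> int:
    """Count rows by paging over keys; the start key of each page is inclusive."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2")
    total = 0
    start = ""
    first_page = True
    while True:
        page = fetch_keys(start, page_size)
        # After the first page, the first key repeats the previous page's last key.
        total += len(page) if first_page else len(page[1:])
        if len(page) < page_size:
            return total
        start = page[-1]
        first_page = False


def make_fake_backend(keys: Iterable[str]) -> Callable[[str, int], List[str]]:
    """Hypothetical in-memory backend used only to exercise count_rows."""
    ordered = sorted(keys)

    def fetch_keys(start: str, limit: int) -> List[str]:
        return [k for k in ordered if k >= start][:limit]

    return fetch_keys


backend = make_fake_backend(f"user{i:03d}" for i in range(250))
print(count_rows(backend, page_size=100))
```

A full scan like this touches every row, so for large column families a separately maintained counter may be the more practical design.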
I'm trying to think about how a Python API might look for large datastores like Cassandra. R, Matlab, and NumPy tend to use the "everything is a matrix" formulation and execute each operation separately. This model has proven itself effective for data that can fit in memory. However, one of the benefits of SAS for big data is that it ...
Cassandra's front page http://incubator.apache.org/cassandra/ states that:
Cassandra guarantees reads and writes to be atomic within a single ColumnFamily.
What exactly does that mean?
It sounds like it means that batch_insert() and batch_mutate() on two different rows in the same CF are atomic, and if the operation on one of the r...
I am looking for an eventually consistent data store and it looks like it may be coming down to Riak or Cassandra. Has anyone got experience of, or a view on, this?
...
How can I create or associate a super column to a column in Perl using Net::Cassandra?
...
How can I get the keys of the rows having sex = male, using the PHP library from http://wiki.apache.org/cassandra/ClientExamples?
For example my keys are
0,1,2
key: 0
{
column( name:age, value:24),
column( name:sex, value:female)
}
key: 1
{
column( name:age, value:24),
column( name:sex, value:female)
}
key: 2
{
column( name...
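A common pattern for "find keys by column value" is to maintain a second structure that maps each value to the row keys holding it (an inverted index), and to update it on every write. The sketch below uses plain dictionaries rather than any Cassandra client, and the sample rows and their keys are hypothetical, not the data from the question.

```python
from collections import defaultdict
from typing import Dict, Set

# Hypothetical sample rows: row key -> {column name: value}.
rows: Dict[str, Dict[str, str]] = {
    "a": {"age": "24", "sex": "female"},
    "b": {"age": "31", "sex": "male"},
    "c": {"age": "45", "sex": "male"},
}


def build_index(data: Dict[str, Dict[str, str]], column: str) -> Dict[str, Set[str]]:
    """Map each value of `column` to the set of row keys that carry it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for key, columns in data.items():
        if column in columns:
            index[columns[column]].add(key)
    return index


sex_index = build_index(rows, "sex")
print(sorted(sex_index["male"]))
```

In a real deployment the index would live in its own column family, with one row per value and one column per matching key; that layout is an assumption of this sketch, not something stated in the question.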
We are looking at using Cassandra to store a stream of information coming from various sources.
One issue we are facing is the best way to query between two dates.
For example we will need to retrieve an object between datetime dt1 and datetime dt2.
We are currently considering the created unix timestamp as the key pointing to the act...
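One way to think about date-range queries in a store like this is to group events into time buckets (one row per day, say) and keep the entries inside each bucket sorted by timestamp, so a range query reads a few buckets and slices each one. The `ToyTimeline` class below is a hypothetical in-memory sketch of that idea; it assumes integer Unix timestamps and is not a Cassandra API.

```python
from bisect import bisect_left, insort
from collections import defaultdict
from typing import DefaultDict, List, Tuple

SECONDS_PER_DAY = 86400


class ToyTimeline:
    """Day bucket -> list of (timestamp, value) kept in sorted order."""

    def __init__(self) -> None:
        self._buckets: DefaultDict[int, List[Tuple[int, str]]] = defaultdict(list)

    def add(self, ts: int, value: str) -> None:
        insort(self._buckets[ts // SECONDS_PER_DAY], (ts, value))

    def between(self, start: int, end: int) -> List[str]:
        """Return values with start <= timestamp <= end, in time order."""
        out: List[str] = []
        for day in range(start // SECONDS_PER_DAY, end // SECONDS_PER_DAY + 1):
            cols = self._buckets.get(day, [])
            lo = bisect_left(cols, (start,))    # first entry with ts >= start
            hi = bisect_left(cols, (end + 1,))  # first entry with ts > end
            out.extend(value for _, value in cols[lo:hi])
        return out


timeline = ToyTimeline()
timeline.add(100, "a")
timeline.add(SECONDS_PER_DAY + 5, "b")
timeline.add(3 * SECONDS_PER_DAY, "c")
print(timeline.between(50, SECONDS_PER_DAY + 5))
```

Bucketing also avoids using the raw timestamp as the row key, which would make every event its own row and leave range reads dependent on key ordering.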
Does anyone have experience with Cassandra when nodes go down or are unavailable? I am mostly interested in whether the cluster rebalances and what happens when the nodes come online, or are replaced by new machines.
...