nosql

Choosing a distributed shared memory solution

I have a task to build a prototype for a massively scalable distributed shared memory (DSM) app. The prototype would only serve as a proof-of-concept, but I want to spend my time most effectively by picking the components which would be used in the real solution later on. The aim of this solution is to take data input from an external s...

What does "Document-oriented" vs. Key-Value mean when talking about MongoDB vs Cassandra?

What does going with a document based NoSQL option buy you over a KV store, and vice-versa? ...

What is the largest known size of a CouchDB cluster and/or database?

What is the largest known size of a CouchDB cluster and/or database in terms of bytes of storage, #s of documents, and/or #s of nodes? ...

Storing News in a Distributed DB vs RDBMS

Hi all: If I am storing news articles in a DB with different categories such as "Tech", "Finance", and "Health", would a distributed database work well in this system vs. an RDBMS? Each of the news items would have the news articles attached as well as a few other items. I am wondering if querying would be faster, though. Let's say I nev...

Can we use MongoDB with the ORMs we used with relational databases, such as LINQ to SQL, Entity Framework, SubSonic, ...?

I want to know whether, in your experience, it's possible to reuse our existing experience with .NET ORMs when working with a NoSQL database such as MongoDB. If you know of samples that do this, please reference them in your answer. ...

How does Apache Cassandra do aggregate operations?

I'm fairly new to Apache Cassandra and nosql in general. In SQL I can do aggregate operations like: SELECT country, sum(age) / count(*) AS averageAge FROM people GROUP BY country; This is nice because it is calculated within the DB, rather than having to move every row in the 'people' table into the client layer to do the calcul...
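
For comparison, the client-side calculation the question hopes to avoid can be sketched as below. This is a minimal Python sketch, not a Cassandra API: the function name `average_age_by_country` and the sample rows are hypothetical, and it assumes the rows have already been fetched as (country, age) pairs.

```python
from collections import defaultdict

def average_age_by_country(rows):
    """Group (country, age) pairs and return {country: average age}."""
    totals = defaultdict(lambda: [0, 0])  # country -> [sum of ages, row count]
    for country, age in rows:
        totals[country][0] += age
        totals[country][1] += 1
    return {country: s / n for country, (s, n) in totals.items()}

# Hypothetical sample rows standing in for the 'people' table.
people = [("US", 30), ("US", 40), ("FR", 20)]
print(average_age_by_country(people))
```

Every row has to cross the wire before the grouping starts, which is exactly the cost the question is asking about.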

Denormalization of large text?

If I have large articles that need to be stored in a database, each associated with many tables, would a NoSQL option help? Should I copy the 1000-character articles over multiple "buckets", duplicating them each time they are related to a bucket, or should I use a normalized MySQL DB with lots of Memcache? ...

In MongoDB, how can I replicate this simple query using map/reduce in ruby?

Hi, So using the regular MongoDB library in Ruby I have the following query to find average filesize across a set of 5001 documents: avg = 0 total = collection.count() Rails.logger.info "#{total} asset creation stats in the system" collection.find().each {|row| avg += (row["filesize"] * (1/total.to_f)) if row["filesize"]} ...
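
The shape of a map/reduce version of that average can be sketched in plain Python. This is only an illustration of the idea, assuming documents are dictionaries; it does not call MongoDB's map/reduce API, and the names `map_doc`, `reduce_pairs`, and `average_filesize` are hypothetical. Like the loop in the question, it counts every document in the denominator, including those without a `filesize`.

```python
from functools import reduce

def map_doc(doc):
    # Emit a (filesize sum, document count) pair. A missing filesize
    # contributes 0 to the sum, but the document is still counted.
    return (doc.get("filesize") or 0, 1)

def reduce_pairs(a, b):
    # Associative and commutative, so partial results can be
    # combined in any order.
    return (a[0] + b[0], a[1] + b[1])

def average_filesize(docs):
    total_size, count = reduce(reduce_pairs, map(map_doc, docs), (0, 0))
    return total_size / count if count else 0.0

# Hypothetical sample documents.
docs = [{"filesize": 100}, {"filesize": 300}, {"name": "no size"}]
print(average_filesize(docs))
```

Carrying the sum and the count together, and dividing only at the end, is what lets the reduce step run over partial results.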

Cassandra random read speed

We're still evaluating Cassandra for our data store. As a very simple test, I inserted a value for 4 columns into the Keyspace1/Standard1 column family on my local machine amounting to about 100 bytes of data. Then I read it back as fast as I could by row key. I can read it back at 160,000/second. Great. Then I put in a million similar...

Multiple inequality conditions (range queries) in NoSQL

Hi, I have an application where I'd like to use a NoSQL database, but I still want to do range queries over two different properties, for example select all entries between times T1 and T2 where the noiselevel is smaller than X. On the other hand, I would like to use a NoSQL/Key-Value store because my data is very sparse and diverse, a...
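
One common way to approach two inequality conditions is to range-scan an index on one property and filter the other in memory. The sketch below shows that idea in plain Python under stated assumptions: entries are (time, noiselevel) tuples already sorted by time, the time bounds are treated as inclusive, and the function name `query` is hypothetical. It is not tied to any particular NoSQL store.

```python
from bisect import bisect_left, bisect_right

def query(entries, t1, t2, max_noise):
    """Return entries with t1 <= time <= t2 and noiselevel < max_noise.

    entries must be a list of (time, noiselevel) tuples sorted by time.
    """
    times = [t for t, _ in entries]
    lo = bisect_left(times, t1)    # first index with time >= t1
    hi = bisect_right(times, t2)   # first index with time > t2
    # Range scan on time, then filter on the second property.
    return [e for e in entries[lo:hi] if e[1] < max_noise]

# Hypothetical sample data.
entries = [(1, 5), (2, 9), (3, 2), (4, 7), (5, 1)]
print(query(entries, 2, 4, 8))
```

The filter step scans everything inside the time range, so this works best when the indexed property is the more selective of the two.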

What is the way to maintain database indexes in files

I'm writing a key-value store for millions of documents, for study and fun. I added default B-tree indexing on the key, but of course there is no way to load all the indexes into memory. For now the storage has two types of files: data (unordered key-value records) and index (with no efficient design yet for search, insertion, and deletion). In b-tree obje...

Best XML Based Database

I have been assigned to develop a system where we would receive XML from multiple sources (millions of XML documents) and put it in some database. Judging from the XML I would receive, there won't be any concrete structure, even among documents from the same source. For this reason I think I cannot suggest an RDBMS, and I am currently looking at NoSQL ...

Why exactly do we use NoSQL?

Having understood some of the advantages that NoSQL offers (scalability, availability, etc.), I am still not clear why a website would want to use a non-relational database. Can I get some help on this, preferably with an example? ...

python solutions for managing scientific data dependency graph by specification values

I have a scientific data management problem which seems general, but I can't find an existing solution or even a description of it, which I have long puzzled over. I am about to embark on a major rewrite (python) but I thought I'd cast about one last time for existing solutions, so I can scrap my own and get back to the biology, or at l...

Scalability of Using MySQL as a Key/Value Database

I am interested to know the performance impacts of using MySQL as a key-value database vs. say Redis/MongoDB/CouchDB. I have used both Redis and CouchDB in the past so I'm very familiar with their use cases, and know that it's better to store key/value pairs in say NoSQL vs. MySQL. But here's the situation: the bulk of our applicatio...

Securing document-style databases (MongoDb, CouchDb, RavenDb) for client (browser) access

Document databases that support REST-style JSON over HTTP access seem ideal for supporting AJAX-rich applications where the browser is making direct calls to the database, bypassing the traditional web server / application logic components. An example of this might be retrieving user preferences once a user has been authenticated. (BBC H...

JRedisFuture stability

I'm using the synchronous implementation of JRedis, but I'm planning to switch to the asynchronous way to communicate with the redis server. But before that I would like to ask the community whether the JRedisFuture implementation of alphazero's jredis is stable enough for production use or not? Is there anybody out there who is using ...

Select distinct rows from MongoDB

How do you select distinct records in MongoDB? This is a pretty basic db functionality I believe but I can't seem to find this anywhere else. Suppose I have a table as follows

--------------------------
| Name   | Age |
--------------------------
| John   | 12  |
| Ben    | 14  |
| Robert | 14  |...

redis: Handling failover?

Hi everybody, Redis really seems like a great product, with its built-in replication and amazing speed. After testing it out, it definitely feels like the 2010 replacement for memcached. However, with memcached, consistent hashing is normally used to spread the data evenly across the servers in a pool. If one ...

suggest database for storing metadata regarding 200 million images (1 million books) (NoSQL? SQL?)

Friends, we will be undertaking a knowledge preservation project to scan more than 1 million books. We need some suggestions on implementing a database for storing and retrieving metadata, as well as for tracking the scanning status of each object (book). Can you suggest whether we should go with SQL or NoSQL? (The metadata could ...