Why are key-value NoSQL databases faster than traditional relational databases?

+4 Q:

It has been recommended to me that I investigate Key/Value pair data systems to replace a relational database I have been using.

What I don't quite understand is how this improves the efficiency of queries. As I understand it, you throw away a lot of information that would help make queries more efficient by simply turning your structured database into one big long list of keys and values.

Have I missed the point completely?

+6 A:

The advantage of a relational database is the ability to relate and index information. Most key-value systems don't provide that.

What you need to ask yourself is, does switching make sense for my intended use case?

You have kind of missed the point. The point is that you don't have an index beyond the key itself. You don't have a centralized list of records, or any easy way to relate them to one another. What makes NoSQL key-value stores so quick is that you store and retrieve what you need by name. Need that blurb on someone's profile page? Just go fetch it. There's no need to maintain a table with everything in it.

Not everything really needs to be tabular.

There are advantages and disadvantages. Personally, I use a mix of both: SQL for most things, and something along the lines of CouchDB for odds and ends that have no need to clog up an SQL table.

You can liken a key-value system to making an SQL table with two columns, a unique key and a value. This is quite fast. You have no need to do any relations or correlations or collation of data. Just find the value and return it.

You'll find this is also fast in SQL databases. I've used it in place of actual key-value systems.
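
To make the two-column idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `kv`, the column names and the key format are made up for illustration; they are not from any particular system.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)")

def put(key, value):
    # INSERT OR REPLACE gives "set" semantics: the last write wins.
    conn.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))
    conn.commit()

def get(key, default=None):
    # A primary-key lookup: one index probe, no joins, no relations.
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else default

put("profile:alice:blurb", "Likes databases.")
put("profile:alice:blurb", "Likes key-value stores.")
print(get("profile:alice:blurb"))
print(get("profile:bob:blurb", "(none)"))
```

Everything the application needs is addressed by name, so the only index involved is the one on the key column.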

I do not think scientific data is well suited to a nosql implementation.

Xorlev 2010-03-01 06:52:49

+4 A:

The efficiency comes from three main areas:

The database has far fewer functions: there is no concept of a join, and transactional integrity guarantees are weaker or absent. Fewer features mean less work per request, which means faster responses, on the server side at least.
Another design principle is that the data store lives in a cloud of servers so your request may have multiple respondents. These systems also claim the multi-server system improves fault tolerance through replication.
It is fully buzzword compliant, using a bunch of ideas and descriptions that are not wholly invented yet. For example, Amazon is currently giving their services away in order to better understand how people might use them and get some experience to refine the specification.
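
The second point, a store spread over several servers with replication, can be sketched roughly as follows. This is only an illustration under stated assumptions: the server names, the replication factor and the simple hash-modulo placement are all invented here, and production systems often use more elaborate schemes such as consistent hashing.

```python
import hashlib

SERVERS = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical server names
REPLICAS = 2  # assumed replication factor

def owners(key, servers=SERVERS, replicas=REPLICAS):
    """Return the servers holding `key`: one picked by hash, plus its successors."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(replicas)]

# One dict per server stands in for that server's local storage.
storage = {name: {} for name in SERVERS}

def put(key, value):
    # Write the value to every replica responsible for the key.
    for name in owners(key):
        storage[name][key] = value

def get(key, down=()):
    # Any live replica can answer; servers listed in `down` are skipped.
    for name in owners(key):
        if name not in down and key in storage[name]:
            return storage[name][key]
    return None

put("user:42", "some value")
primary = owners("user:42")[0]
print(get("user:42"))                   # answered by the first replica
print(get("user:42", down=(primary,)))  # still answered after one failure
```

Because the key alone determines where the data lives, no server has to know about the whole data set, which is much harder to arrange when queries join across tables.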

To my eye, someone coming to you with the claim that "our new data will be too much for our RDBMS" ought either to have numbers to back that assertion up or to admit they just want to try the new shiny. Is NoSQL meritless? Probably not. Is it going to turn the world upside-down as Java 1.0 was hyped to? Probably not.

There's no harm in investigating new things; just don't bet the farm on them in favor of decades-old, well-established, well-understood technology.

msw 2010-03-01 07:13:31

+2 A:

Here I'm assuming that you want to optimize one particular query, which is simply looking up a record by key. One example of this might be looking up a userinfo record by username. For some systems a query like that has to be incredibly fast and all other queries are unimportant.

The biggest factor in database performance is the number of I/O operations required to read and write data. Most database systems use similar data structures (e.g. B-trees), which can retrieve uncached data in O(log(n)) I/Os. To make updates durable, the data has to be written to disk; most systems do that sequentially, which is the fastest way.
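
As a rough worked example of the O(log(n)) figure, the sketch below counts how many B-tree levels (and so, in the worst case, how many uncached page reads) a lookup needs. The fanout of 100 keys per page is an assumption chosen for round numbers, not a measurement of any real system.

```python
def btree_levels(n_keys, fanout):
    """Smallest number of levels h such that fanout**h >= n_keys."""
    levels, capacity = 1, fanout
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels

# Assumed fanout of 100; each level costs at most one I/O when nothing is cached.
for n in (1_000, 1_000_000, 1_000_000_000):
    print(n, btree_levels(n, fanout=100))
# prints:
# 1000 2
# 1000000 3
# 1000000000 5
```

The same arithmetic applies whether the tree sits inside an RDBMS or a key-value store, so any speed-up has to come from somewhere other than the lookup structure itself.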

So, where can a Key-Value store get efficiencies?

Non-normalized data. Putting all the data in one row means no joins.
Low CPU overhead. A key-value store avoids the CPU cost of query processing/optimization, security checks, constraint checks, etc.
It is easier to run the store in-process (as opposed to a SQL server running as a separate service), which eliminates IPC overhead.
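
The first point can be illustrated with a small sketch that reads the same userinfo record two ways: from two normalized tables joined at query time, and from a single denormalized value fetched by key. The schema, the sample data and the plain dict standing in for a key-value store are all assumptions made for this example.

```python
import json
import sqlite3

# Normalized layout: two tables that must be joined at read time.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, username TEXT UNIQUE);
    CREATE TABLE emails (user_id INTEGER REFERENCES users(id), address TEXT);
    INSERT INTO users  VALUES (1, 'alice');
    INSERT INTO emails VALUES (1, 'alice@example.com'), (1, 'a@example.org');
""")

def userinfo_sql(username):
    # The query has to be parsed, planned and joined before rows come back.
    rows = db.execute(
        "SELECT e.address FROM users u JOIN emails e ON e.user_id = u.id "
        "WHERE u.username = ? ORDER BY e.address",
        (username,),
    ).fetchall()
    return {"username": username, "emails": [r[0] for r in rows]}

# Denormalized layout: the whole record is one value stored under one key.
# A plain dict stands in for an in-process key-value store.
kv = {
    "alice": json.dumps(
        {"username": "alice", "emails": ["a@example.org", "alice@example.com"]}
    )
}

def userinfo_kv(username):
    # One lookup by key, then decode; no join and no query planning.
    return json.loads(kv[username])

print(userinfo_sql("alice") == userinfo_kv("alice"))  # prints True
```

The price of the second layout is that anything other than lookup by username, such as finding a user by email address, means scanning every value or maintaining another key by hand.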

Most RDBMS systems are built on top of something which looks like a key-value store so you could view this as cutting out the middleman.

Laurion Burchall 2010-03-03 08:36:25
