Non-relational databases (NoSQL) for small to medium sized applications

views:

425

answers:

+5 Q:

Non-relational databases (NoSQL) for small to medium sized applications

The benefits of a non-relational database (such as a key-value pair storage) are evident when used in large scale datasets (google, facebook, linkedin). How do you think small to medium sized applications can benefit from using non-relational databases?

+5 A:

IBM Mainframes have had "non-relational" databases since the 60s (hierarchial databases such as IMS + variants). These databases are still in use because they are extremely fast and handle huge scale well.

The point of relational databases was to provide a regular, relatively abstract method for storing and retrieving data in which the tuning can be done relatively independently of the data model (not true for IMS). They were designed rather in reaction to the inability to reorganize hiearchical databases easily. The upside is nice organization; the downside is medium, not high performance.

Google provides scalable storage and MapReduce to handle scale. It isn't relational.

There was a huge push early in the last decade to store data in XML, in essentially hiearchical form because XML is implicitly hierarchical. That was a huge mistake IMHO, because it repeated the inconvenience of heirarchical databases, but had none of the performance. I'm not very surprised this movement seems to have pretty much died.

Most of the practical push to non-relational seems to me to be towards performance and scale. I don't see how this helps "small" applications much.

People have proposed, but not done a lot of practical data management using knowledge-based schemes. Doug Lenat's CYC comes to mind here. The ability of the database to help an application draw non-obvious conclusions strikes me a very interesting for "small" applications that are trying be "smart". But there aren't a lot of these yet.

Ira Baxter 2010-03-01 16:45:29

+1 - a thoughtful contribution

APC 2010-03-01 17:12:42

thank you all, very useful

Victor P 2010-03-05 12:15:22

+2 A:

The sweet spot of using a NoSQL database at that scale is when the database model (key-value, document, etc.) is a good match to the application's needs and the advanced relational functionality is not needed.

At the small end of the spectrum, performance is a non issue because just about everything is fast. Storage engines are a non issue, if you don't need a sophisticated query engine, the lack of SQL support is a non issue.

You are left with how well it fits and how easy it is to use. Honestly though, tooling does become an issue. Relational database tooling is mature, NoSQL tooling is less feature rich and less battle hardened. Too often it is roll-your-own tooling. Definitely consider what tools you'd be giving up and how much you need them.

There is an additional slate of advantages for smaller projects when considering a NoSQL service (like Amazon SimpleDB and Microsoft Azure) as compared to a product. If you only have to pay for what you use and you don't use much, it can be cheaper than running a dedicated server, going all the way down to free for something like the SimpleDB free usage tier.

You also avoid some of the server and database maintenance costs. This can be a big win if you don't have a DBA, or when your DBAs are already over worked. Of course you'll still have admin work to do, but it is significantly reduced, and typically simpler.

Mocky 2010-03-01 17:28:14

+1 A:

When it comes to graph databases (like Neo4j - a project I'm involved in) they excel at scaling to complexity. This means, they provide "better substrates for modeling business domains" (see The State of NoSQL, also by Ben Scofield, too). As I see it, this is very important in small to medium sized apps.

This may be better explained through examples, so here's some links to example apps/domain modeling:

nawroth 2010-03-01 19:47:13

The question perhaps requires a bit more context... assuming a Python environment, consider the tutorial at the y_serial project: http://yserial.sourceforge.net/

NoSQL is not merely adopted for reasons of scalability. Serialization (of any arbitrary Python object) and persistence are very convenient at any scale -- so consider the key-value system as one approach.

code43 2010-03-20 19:41:37

Well one of the problems with a RDBMS is that you need to spend effort mapping your programming languages domain models to the relational schema of your RDBMS. This effort is usually spent configuring your ORM layer.

With NoSQL databases you are not forced to map your objects to a relational model and in most cases your objects are serialized as-is. Because of the lack of an intermediary schema, data migrations and versioning become easier.

Another benefit is scalability and performance. Since most of the time your data is received by 'keys' effectively everything uses and index. Trivial sharding is possible by doing a % (MOD) on the key against the number of your available NoSQL instances providing natural data partitioning which is crucial for sharding.

If you're interested in seeing how developing with a NoSQL differs from a RDBMS, I have a tutorial where I show how to go about designing a simple blog application using Redis.

mythz 2010-09-11 15:46:15

ansaurus

tags:

views:

answers:

Non-relational databases (NoSQL) for small to medium sized applications

related questions