I have read a lot lately about 'NoSQL' databases such as CouchDB, MongoDB etc. Most of the websites I have seen using these are mainly text-based websites such as The New York Times and SourceForge.

I was wondering if you could apply this to websites where payment is a huge issue. I am thinking of the following issues:

  • How well can you secure the data?
  • Do these systems provide an easy backup/restore mechanism?
  • How are transactions (commit/rollback) handled?

I have read the following articles that cover some aspects:

In these posts the aspect of transactions is covered. However, the questions of security and backups are not covered. Can someone shed some light on this subject?

And if possible, does anyone know of some e-commerce websites that have successfully implemented a document-based database?

+3  A: 

I don't think security would be any different on a NoSQL database than on a relational database. In the end, security is an orthogonal question to how data is actually stored. Besides, it's not like you'd allow access to the database from anything but your business-layer servers from a networking standpoint.

As for backups, most NoSQL databases that I know of allow for hot backups, just like a regular database does.

The real question, IMO, is whether you can live with the restrictions that a NoSQL database puts on you - in particular, the general lack of ad-hoc queries. For example, if you ever wanted to know all of the people who ever bought product "X" then you'd have to build into your data access layer a counter for that from day one (or run a very expensive serial lookup of every past transaction). In a regular SQL database, you can just add an index and do a query and you're done (or even, don't add an index if it's a one-off). Or maybe you want to find out all the people who bought product "Y" before the latest version came out (so you can send them a reminder to upgrade or whatever): again, you have to plan that ahead with a NoSQL database, but it's trivial with a relational database.
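To make the ad-hoc query point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `users` and `purchases` tables, the sample rows, and the `buyers_of` helper are all made up for illustration; the point is only that both questions above can be answered after the fact, with an index added later if the query turns out to be worth keeping.

```python
import sqlite3

# In-memory database with a hypothetical schema and sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchases (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        product TEXT,
        purchased_on TEXT  -- ISO dates, so string comparison sorts correctly
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO purchases (user_id, product, purchased_on) VALUES
        (1, 'X', '2010-01-15'),
        (2, 'Y', '2010-02-01'),
        (3, 'Y', '2010-04-20'),
        (1, 'Y', '2010-03-10');
""")

def buyers_of(product, before=None):
    """Names of everyone who bought `product`, optionally before a date."""
    sql = ("SELECT DISTINCT u.name FROM users u "
           "JOIN purchases p ON p.user_id = u.id "
           "WHERE p.product = ?")
    params = [product]
    if before is not None:
        sql += " AND p.purchased_on < ?"
        params.append(before)
    sql += " ORDER BY u.name"
    return [row[0] for row in conn.execute(sql, params)]

# The index is optional and can be added long after the data was written.
conn.execute(
    "CREATE INDEX idx_purchases_product ON purchases (product, purchased_on)"
)

print(buyers_of("X"))                       # everyone who ever bought X
print(buyers_of("Y", before="2010-04-01"))  # bought Y before the new version
```

Neither query had to be anticipated when the tables were designed, which is the property being described here.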

I think it makes sense when you can plan your schema and your usage pattern ahead of time, and where the occasional re-scan of records to add some new field or metric is acceptable. But for an e-commerce website, I think ad-hoc queries are just too valuable a feature to lose. Of course, that's just my opinion, and there's certainly no reason why you couldn't mix-n-match parts of the application between the two databases. I'd personally choose a relational database with memcached in between for added performance, though...

Dean Harding
+1 Thank you for the nice points made. MySQL and memcached have proven their performance and reliability for quite some time now. I think for websites this is still the best solution for now. One question though: what do you mean by ad-hoc queries, and why does NoSQL lack them?
Saif Bechan
MongoDB offers good support for ad-hoc queries and ad-hoc creation of indexes. However, I think that you should use an ACID database for anything that is related to money.
Theo
A number of NoSQL databases do provide "views" or "indexes" which allow for what I would call "semi ad-hoc" queries, but it's still no substitute for the ability to simply say `SELECT * FROM users INNER JOIN purchases on ... HAVING ...`
Dean Harding
@codeka, the indexes in MongoDB are real indexes, not "indexes". MongoDB differs from CouchDB in this respect. It is of course true that you can't join in MongoDB. However, you can index nested data.
Theo
+5  A: 

Handling financial information is one of the areas where SQL really is the right tool for the job. Most of the NoSQL systems were designed to improve scalability by accepting a higher risk of data loss or inconsistency. They also tend to have limited abilities to run reports over all records, since on a typical large website you only need enough data in the index to find and display a single record - the rest can be completely inaccessible until you know the record you are looking for.

When dealing with money, any data inconsistency is a big problem, and if you need more scalability than a single SQL server can give you, you have enough money that you can afford the higher cost of scaling SQL. Also, the ad-hoc reporting available from SQL is something you'd miss if you don't use SQL - pretty much any information you want about sales history is trivial to get from SQL, but potentially requires complex custom code from an object-based store.

Tom Clarkson
+1 I think the technology is just too new for an e-commerce website to rely on. The way it uses just JSON for its storage does give me pause. I hope that in the future this approach can be used as a reliable method for money-driven websites. Until then I will just stick with MySQL with memcached.
Saif Bechan
The newness of the technology isn't the issue - these tools are being used by some of the biggest sites on the net. However, the technology is designed to solve a completely different problem set and will likely never be the most appropriate choice for working with money.
Tom Clarkson
Ok I understand. Thank you for pointing that out
Saif Bechan
A: 
  1. Amazon S3 uses a NoSQL implementation.
Rachel
I'm not sure how non-e-commerce sites using NoSQL is relevant to this question.
Chad Birch
When you buy a book via Amazon, it is stored in Oracle.
Theo
I have looked into this, and it is true. For important operations Oracle is used.
Saif Bechan
+13  A: 

The overhead that makes RDBMSs so slow is guaranteeing atomicity, consistency, isolation, and durability, also known as ACID. Some of these properties are pretty critical for applications that deal with money. You don't want to lose a single order when the lights go out.

NoSQL databases usually sacrifice some or all of the ACID properties in return for severely reduced overhead. For many applications, this is fine -- if a few "diggs" go missing when the lights go out, it's no big deal.
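As a rough illustration of what atomicity buys you, here is a small sketch using Python's built-in `sqlite3` module. The `inventory` and `orders` tables and the `place_order` helper are hypothetical. The order row and the stock decrement are written in one transaction, so if the decrement fails the order row is rolled back with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (
        product TEXT PRIMARY KEY,
        stock INTEGER CHECK (stock >= 0)  -- stock can never go negative
    );
    CREATE TABLE orders (id INTEGER PRIMARY KEY, product TEXT, quantity INTEGER);
    INSERT INTO inventory VALUES ('book', 1);
""")

def place_order(product, quantity):
    """Record an order and decrement stock as one all-or-nothing unit."""
    try:
        # Using the connection as a context manager commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "INSERT INTO orders (product, quantity) VALUES (?, ?)",
                (product, quantity),
            )
            conn.execute(
                "UPDATE inventory SET stock = stock - ? WHERE product = ?",
                (quantity, product),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the decrement; the INSERT is undone too.
        return False

first = place_order("book", 1)   # succeeds, stock drops to 0
second = place_order("book", 1)  # fails, no second order row is left behind
print(first, second)
```

Without the transaction, the second call would leave an order row behind for a book that is not in stock.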

For an ecommerce site, you need to ask yourself what you really need.

  1. Do you really need a level of performance that a RDBMS can't deliver?
  2. Do you need the reliability that an RDBMS provides?

Honestly, the answer to #2 is probably "yes", which rules out most NoSQL solutions. And unless you're dealing with traffic levels comparable to amazon.com's, an RDBMS, even on modest hardware, will probably satisfy your performance needs just fine, especially if you limit yourself to simple queries and index properly. Which makes the answer to #1 "no".

You could however, consider using a RDBMS for transaction data, and a NoSQL database for non-critical data, like product pages, user reviews, etc. But then you'd have twice as much datastore software to install, and any relationships between the data in the two datastores would have to be managed in code -- there'd be no JOINing your NoSQL database against your RDBMS. This would likely result in an unnecessary level of complexity.
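To show what "managed in code" might look like, here is a minimal sketch. A plain Python dict stands in for the document store (it is not any real NoSQL client), `sqlite3` holds the orders, and the SKUs and the `orders_with_titles` helper are made up for illustration.

```python
import sqlite3

# Stand-in for a document store: product documents keyed by SKU.
catalog = {
    "sku-1": {"title": "Blue mug", "reviews": [{"stars": 5, "text": "Sturdy"}]},
    "sku-2": {"title": "Red mug", "reviews": []},
}

# Transactional data lives in the relational database.
orders_db = sqlite3.connect(":memory:")
orders_db.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        sku TEXT NOT NULL,
        quantity INTEGER NOT NULL
    );
    INSERT INTO orders (sku, quantity) VALUES ('sku-1', 2), ('sku-2', 1), ('sku-9', 1);
""")

def orders_with_titles():
    """Join orders to catalog documents by hand, one lookup per row."""
    rows = orders_db.execute("SELECT id, sku, quantity FROM orders ORDER BY id")
    result = []
    for order_id, sku, quantity in rows:
        doc = catalog.get(sku)  # no foreign key protects this reference
        title = doc["title"] if doc else None
        result.append((order_id, title, quantity))
    return result

print(orders_with_titles())
```

Note the order for `sku-9`: nothing stops an order from pointing at a product the other store has never heard of, so the application has to decide what to do about it.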

In the end, if an RDBMS offers features you must have for reliability, and it performs acceptably for the sorts of load you'll be experiencing, an RDBMS is probably the best bet.

Frank Farmer
+1 Indeed, in the end I was considering a mixed solution like the one you describe. But I share your opinion on the unnecessary level of complexity. I will keep your points in mind.
Saif Bechan
Do you have any idea how much extra memory it will cost to have both of them running? Maybe it would be a good idea to just have the catalogue loaded in MongoDB. Leaving aside the complexity, do you see any major benefits to this?
Saif Bechan
+1  A: 

You guys should check this out:

Replication Acknowledgement via getlasterror

MongoDB is on the verge of providing durable writes. I think that is the main issue when people discuss this topic w.r.t. money. The transactional part is less important due to the nested document features.
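A rough sketch of the nested document idea, in plain Python with no database client involved. The field names and the `build_order_document` helper are made up for illustration. The order and its line items live in one document, so saving the order is a single write instead of inserts into several tables.

```python
import json

def build_order_document(order_id, customer, items):
    """Return one self-contained document holding an order and its line items.

    `items` is a list of (sku, quantity, unit_price_cents) tuples.
    """
    lines = [
        {"sku": sku, "quantity": qty, "unit_price_cents": price}
        for sku, qty, price in items
    ]
    total = sum(line["quantity"] * line["unit_price_cents"] for line in lines)
    return {
        "_id": order_id,
        "customer": customer,
        "lines": lines,          # nested, not a separate table
        "total_cents": total,
    }

order = build_order_document(
    "order-1", "alice", [("sku-1", 2, 450), ("sku-2", 1, 1200)]
)
print(json.dumps(order, indent=2))
```

In a store that applies a single-document write atomically, the whole order is either saved or not. Anything that spans two documents, such as decrementing stock held elsewhere, still has no transaction around it.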

Michael Kennedy
+8  A: 

Just posted some thoughts on MongoDB and E-commerce: http://kylebanker.com/blog/2010/04/30/mongodb-and-ecommerce/

Kyle Banker
Thank you for the nice link. This is very informative.
Saif Bechan
+1. Nice article. The main point is that one should not instinctively model the solution as a bunch of RDBMS tables, since that leads to a set of problems that is only solvable by an RDBMS. If you instead think in terms of documents/objects, the ACID-related problem just disappears and it is clear that you can use a NoSQL database for e-commerce as well!
Martin Wickman
A: 

Gilt.com uses Voldemort to handle basket / inventory under huge load. See this presentation from London QCon 2010 on the details - http://www.infoq.com/presentations/Project-Voldemort-at-Gilt-Groupe

I'd also reiterate the fact that "NoSQL" does not mean "No SQL", but "Not Only SQL", and that instead of looking at any technology for a complete rip/replace of any other you should be looking at the best tool for the job. NoSQL data stores don't make very good data warehouses, and probably aren't appropriate for storing user transactions, but they are very good in certain niche areas - see the Gilt Groupe example above.

Another prominent example is the BBC homepage - not transactional, but interesting nonetheless. They use CouchDB to store user preferences. Unfortunately, they appear to have crashed under the load.

Hugo Rodger-Brown