I'd like to evaluate a document DB, probably MongoDB, in an ASP.NET MVC web shop.

A little reasoning at the beginning:

There are about 2 million products.

The product model would be a pretty bad fit for an RDBMS, as there'd be many different kinds of products, each with its own unique attributes.

For example, there'd be books with ISBN, authors, title, pages, etc., as well as DVDs with play time, directors, artists, etc., and quite a few more types.

In the end, I'd have about 9 different product types with a combined column count (counting common columns like title only once) of about 70 to 100, whereas each individual product type has 15 columns at most.

The three commonly used approaches in an RDBMS would be:

An EAV model (a sketch follows this list), which would have pretty bad performance characteristics and would be either impractical or perform even worse if I'd like to display, say, the author of a book in a mixed list of products (think start page, recommended products, etc.).

Ignore the column count and put it all in the product table: although I deal with somewhat bigger databases (row-wise), I don't have any experience with tables of more than 20 columns as far as performance is concerned, but I guess 100 columns would have some implications.

Create a table for each product type: I personally don't like this approach as it complicates everything else.
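
Just to make the EAV concern concrete, here is a minimal sketch of what those entities would look like in C# (all names are made up for illustration):

```csharp
using System.Collections.Generic;

// Hypothetical EAV entities: every type-specific attribute of a product
// becomes its own row in EavAttribute, with the value forced into a string.
public class EavProduct
{
    public int Id { get; set; }
    public string Title { get; set; }     // common column, kept on the product row
    public List<EavAttribute> Attributes { get; set; } = new List<EavAttribute>();
}

public class EavAttribute
{
    public int Id { get; set; }
    public int ProductId { get; set; }
    public string Name { get; set; }      // e.g. "Isbn", "PlayTime"
    public string Value { get; set; }
}
```

Showing the author of a book in a mixed product list then means joining and pivoting attribute rows back into columns, which is exactly where this model gets slow and awkward.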

C# Driver / Classes:

I'd like to use the NoRM driver, and so far I think I'll try to create a product DTO that contains all properties (grouped within detail classes like BookDetails, except for the properties that need to show up in list views, etc.).

In the app I'll use BookBehaviour / DvdBehaviour wrappers around the product DTO that only expose the relevant properties, roughly like the sketch below.
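
Roughly what I have in mind (a sketch only; property names are placeholders):

```csharp
using System.Collections.Generic;

// One document per product; type-specific data lives in nested detail classes,
// while the fields needed on list views stay at the top level.
public class ProductDto
{
    public string Id { get; set; }          // maps to the document's _id
    public string Type { get; set; }        // "Book", "Dvd", ...
    public string Title { get; set; }       // common, shown in list views
    public decimal Price { get; set; }      // common, shown in list views
    public BookDetails Book { get; set; }   // null unless Type == "Book"
    public DvdDetails Dvd { get; set; }     // null unless Type == "Dvd"
}

public class BookDetails
{
    public string Isbn { get; set; }
    public List<string> Authors { get; set; }
    public int Pages { get; set; }
}

public class DvdDetails
{
    public int PlayTimeMinutes { get; set; }
    public List<string> Directors { get; set; }
    public List<string> Artists { get; set; }
}

// Wrapper that only exposes the properties relevant to books.
public class BookBehaviour
{
    private readonly ProductDto _dto;
    public BookBehaviour(ProductDto dto) { _dto = dto; }

    public string Title { get { return _dto.Title; } }
    public string Isbn { get { return _dto.Book.Isbn; } }
    public IEnumerable<string> Authors { get { return _dto.Book.Authors; } }
}
```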

My questions now:

  1. Are my performance concerns with the many columns approach valid?

  2. Did I overlook something and there is a much better way to do it in an RDBMS?

  3. Is MongoDB on Windows stable enough?

  4. Does my approach with different behaviour wrappers make sense?

A: 

Are my performance concerns with the many columns approach valid?

Honestly, I don't think performance is really the key issue here. If you have 2M rows with 100 columns and those columns never change, then SQL Server / MySQL / etc. will actually do just fine. It might suck up a lot of unnecessary space, but the DB will probably perform adequately.

What the DB won't do is accept changes very readily, and I think that's the central concern. Trying to add a column to a central table with 2M rows is basically going to be a nightmare. You can "farm out" the separate fields into separate sub-tables, but that doesn't really solve the problem, it just delays it. EAV is also kind of a nightmare, as you now take your 2M rows and convert them into 20M rows.

Did I overlook something and there is a much better way to do it in an RDBMS?

As long as you're willing to make the usual RDBMS sacrifices, you'll probably find MongoDB to be much faster, easier, and more flexible for basic CRUD.
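
To illustrate what "basic CRUD" looks like against a single products collection, here's a minimal sketch using today's official MongoDB C# driver (MongoDB.Driver) rather than NoRM, reusing the ProductDto shape sketched in the question; the connection string, database name, and sample values are assumptions:

```csharp
using MongoDB.Driver;

var client = new MongoClient("mongodb://localhost:27017");
var db = client.GetDatabase("shop");
var products = db.GetCollection<ProductDto>("products");   // one collection for every product type

// Create: adding a new product type later needs no schema migration.
products.InsertOne(new ProductDto { Id = "book-1", Type = "Book", Title = "Some Title" });

// Read: fetch all books; each document only carries the fields it actually has.
var books = products.Find(p => p.Type == "Book").ToList();

// Update: set a type-specific field without touching the rest of the document.
products.UpdateOne(
    p => p.Id == "book-1",
    Builders<ProductDto>.Update.Set(p => p.Book.Isbn, "978-0000000000"));

// Delete.
products.DeleteOne(p => p.Id == "book-1");
```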

Is MongoDB on Windows stable enough?

I can't speak to that. The forums definitely have people running it on Windows, and 2M records is honestly small potatoes for most people who are using it.

Does my approach with different behaviour wrappers make sense?

Are you suggesting the following?

  • All of the product data lives in one collection.
  • Each product is flagged with a type, so when it's accessed by the code, the appropriate "wrapper class" is loaded.

If so, then I think that's the way to go. That way you can implement business logic (books must have ISBNs) while still making storage really simple.
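
Concretely, the "load the appropriate wrapper" step could be as small as a switch on the type flag; this is a rough sketch reusing the ProductDto and BookBehaviour classes from the question (the factory itself is hypothetical):

```csharp
using System;

// Hypothetical factory: inspects the document's type flag and hands back the
// matching wrapper, so type-specific business rules (books must have an ISBN)
// live in the wrappers rather than in the storage layer.
public static class ProductWrapperFactory
{
    public static object Wrap(ProductDto dto)
    {
        switch (dto.Type)
        {
            case "Book":
                if (dto.Book == null || string.IsNullOrEmpty(dto.Book.Isbn))
                    throw new InvalidOperationException("Books must have an ISBN.");
                return new BookBehaviour(dto);
            // case "Dvd": return new DvdBehaviour(dto);  // ...and so on for the other types
            default:
                throw new ArgumentException("Unknown product type: " + dto.Type);
        }
    }
}
```

In practice you'd probably give the wrappers a small shared interface so list views can stay type-agnostic, but the storage side stays a single, simple collection either way.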

Gates VP