I've been playing with MongoDB recently (it's AMAZINGLY FAST) using the C# driver on GitHub. Everything is working just fine in my little single-threaded console app that I'm testing with. I'm able to add 1,000,000 documents (yes, a million) in under 8 seconds running single-threaded. I only get this performance if I open the connection outside the scope of the for loop. In other words, I'm keeping one connection open across all inserts rather than connecting for each insert. Obviously that's contrived.

I thought I'd crank it up a notch to see how it works with multiple threads. I'm doing this because I need to simulate a website with multiple concurrent requests. I'm spinning up between 15 and 50 threads, still inserting a total of 150,000 documents in all cases. If I just let the threads run, each creating a new connection for each insert operation, the performance grinds to a halt.

Obviously I need to find a way to share, lock, or pool the connection. Therein lies the question. What's the best practice in terms of connecting to MongoDB? Should the connection be kept open for the life of the app (there is substantial latency opening and closing the TCP connection for each operation)?

Does anyone have any real world or production experience with MongoDB, and specifically the underlying connection?

Here is my threading sample using a static connection that's locked for insert operations. Please offer suggestions that would maximize performance and reliability in a web context!

    private static Mongo _mongo;

    private static void RunMongoThreaded()
    {
        _mongo = new Mongo();
        _mongo.Connect();

        var threadFinishEvents = new List<EventWaitHandle>();

        for(var i = 0; i < 50; i++)
        {
            var threadFinish = new EventWaitHandle(false, EventResetMode.ManualReset);
            threadFinishEvents.Add(threadFinish);

            var thread = new Thread(delegate()
                {
                     RunMongoThread();
                     threadFinish.Set();
                });

            thread.Start();
        }

        WaitHandle.WaitAll(threadFinishEvents.ToArray());
        _mongo.Disconnect();
    }

    private static void RunMongoThread()
    {
        for (var i = 0; i < 3000; i++)
        {
            var db = _mongo.getDB("Sample");
            var collection = db.GetCollection("Users");
            var user = GetUser(i);
            var document = new Document();
            document["FirstName"] = user.FirstName;
            document["LastName"] = user.LastName;

            lock (_mongo) // Lock the connection - not ideal for threading, but safe and seemingly fast
            {
                collection.Insert(document);
            }
        }
    }

Thanks!

+2  A: 

The thing to remember about a static connection is that it's shared among all your threads. What you want is one connection per thread.

Joel Coehoorn
You may have missed the part where I stated that one connection per thread is noticeably slow. I don't think that's the best answer for a high traffic website.
Acoustic
For your sample, where you are grouping things, one per thread is the best you can do. A static, shared connection will create deadlocks like you're seeing. Your alternative is to do connection pooling. That's something the SQL Server provider has built in, but for Mongo you'll have to build it yourself, and it's not trivial to get right.
Joel Coehoorn
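To make the "build it yourself" pooling idea above a bit more concrete, here is a minimal sketch of a fixed-size pool. Everything in it is an assumption for illustration: `FakeConnection`, `ConnectionPool<T>`, `Lease`, and `Demo` are hypothetical names, not part of any MongoDB driver, and a real pool would also need to deal with broken connections, timeouts, and reconnects. It only shows the basic rent/return pattern that lets many threads share a few long-lived connections without a global lock.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for a real driver connection (not a MongoDB API).
public sealed class FakeConnection
{
    private int _inserts;
    public int Inserts { get { return _inserts; } }

    // Pretend to send one document to the server.
    public void Insert(string document) { _inserts++; }
}

// Minimal fixed-size pool: every connection is created up front,
// and Rent() blocks until one of them is idle.
public sealed class ConnectionPool<T> : IDisposable where T : class
{
    private readonly BlockingCollection<T> _idle;

    public ConnectionPool(int size, Func<T> factory)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException("size");
        _idle = new BlockingCollection<T>(new ConcurrentBag<T>(), size);
        for (var i = 0; i < size; i++) _idle.Add(factory());
    }

    public int IdleCount { get { return _idle.Count; } }

    // Blocks while all connections are in use.
    public Lease Rent() { return new Lease(this, _idle.Take()); }

    public void Dispose() { _idle.Dispose(); }

    // Returns the connection to the pool when disposed (at most once).
    public sealed class Lease : IDisposable
    {
        private readonly ConnectionPool<T> _owner;
        private int _returned;

        public T Connection { get; private set; }

        internal Lease(ConnectionPool<T> owner, T connection)
        {
            _owner = owner;
            Connection = connection;
        }

        public void Dispose()
        {
            if (Interlocked.Exchange(ref _returned, 1) == 0)
                _owner._idle.Add(Connection);
        }
    }
}

public static class Demo
{
    // Runs threadCount workers against poolSize shared connections and
    // returns the total number of inserts recorded by all connections.
    public static int Run(int poolSize, int threadCount, int insertsPerThread)
    {
        using (var pool = new ConnectionPool<FakeConnection>(poolSize, () => new FakeConnection()))
        {
            var threads = new Thread[threadCount];
            for (var t = 0; t < threadCount; t++)
            {
                threads[t] = new Thread(() =>
                {
                    for (var i = 0; i < insertsPerThread; i++)
                    {
                        // Only the thread holding the lease touches the connection.
                        using (var lease = pool.Rent())
                        {
                            lease.Connection.Insert("doc " + i);
                        }
                    }
                });
                threads[t].Start();
            }
            foreach (var thread in threads) thread.Join();

            // Rent every connection at once to total up the inserts.
            var total = 0;
            var held = new ConnectionPool<FakeConnection>.Lease[poolSize];
            for (var k = 0; k < poolSize; k++)
            {
                held[k] = pool.Rent();
                total += held[k].Connection.Inserts;
            }
            foreach (var h in held) h.Dispose();
            return total;
        }
    }
}
```

Calling `Demo.Run(4, 8, 1000)` has eight threads share four connections. In a web app, the idea would be to rent a connection at the start of a request and dispose the lease at the end, so the cost of opening the TCP connection is paid once per pooled connection rather than once per request.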
A: 

This discussion regarding MongoDB and connection pooling may help.

Eric J.
A: 

Connection pooling should be your answer.

The feature is being developed (please see http://jira.mongodb.org/browse/CSHARP-9 for more detail).

Right now, for a web application, the best practice is to connect at BeginRequest and release the connection at EndRequest. But to me, that operation is too expensive to do on every request without a connection pool. So I decided to keep a global Mongo object and use it as a shared resource for all threads (if you get the latest C# driver from GitHub right now, they have also improved the performance for concurrency a bit).

I don't know the disadvantages of using a global Mongo object, so let's wait for another expert to comment on this.

But I think I can live with it until the connection pool feature has been completed.

ensecoz
A: 

Somewhat tangential, but still of interest, is CSMongo, a C# driver for MongoDB created by the developer of jLinq. Here's a sample:

    //create a database instance
    using (MongoDatabase database = new MongoDatabase(connectionString)) {

        //create a new document to add
        MongoDocument document = new MongoDocument(new {
            name = "Hugo",
            age = 30,
            admin = false
        });

        //create entire objects with anonymous types
        document += new {
            admin = true,
            website = "http://www.hugoware.net",
            settings = new {
                color = "orange",
                highlight = "yellow",
                background = "abstract.jpg"
            }
        };

        //remove fields entirely
        document -= "languages";
        document -= new[] { "website", "settings.highlight" };

        //or even attach other documents
        MongoDocument stuff = new MongoDocument(new {
            computers = new [] {
                "Dell XPS",
                "Sony VAIO",
                "Macbook Pro"
            }
        });
        document += stuff;

        //insert the document immediately
        database.Insert("users", document);

    }

David Robbins