views:

87

answers:

2

Using NHibernate 2.1.2.4000 against SQL Server 2008. The target table has no triggers or extraneous indexes. It is simply:

create table LogEntries (
    Id INT IDENTITY NOT NULL,
   HostName NVARCHAR(32) not null,
   UserName NVARCHAR(64) not null,
   LogName NVARCHAR(512) not null,
   Timestamp DATETIME not null,
   Level INT not null,
   Thread NVARCHAR(64) not null,
   Message NVARCHAR(MAX) not null,
   primary key (Id)
)

My entity mapping is:

<class name="LogEntry" table="LogEntries">
    <id name="Id" unsaved-value="0">
        <generator class="native"/>
    </id>
    <property name="HostName" length="32" not-null="true"/>
    <property name="UserName" length="64" not-null="true"/>
    <property name="LogName" length="512" not-null="true"/>
    <property name="Timestamp" type="utcdatetime" not-null="true"/>
    <property name="Level" not-null="true"/>
    <property name="Thread" length="64" not-null="true"/>
    <property name="Message">
        <column name="Message" sql-type="NVARCHAR(MAX)" not-null="true"/>
    </property>
</class>

Now, consider the following test case:

[Fact]
public void bulk_insert_test()
{
    var batchSize = 100;
    var numberItems = 10000;

    var configuration = new NHibernate.Cfg.Configuration().Configure();
    configuration.SetProperty("connection.connection_string", @"my_conn_string");
    configuration.SetProperty("adonet.batch_size", batchSize.ToString());
    var sessionFactory = configuration.BuildSessionFactory();

    var ts = this.WriteWithNH(sessionFactory, numberItems);
    ////var ts = this.WriteWithBC(sessionFactory, numberItems, batchSize);

    Console.WriteLine("Saving {0} items with batch size {1}: {2}", numberItems, batchSize, ts);
}

public TimeSpan WriteWithNH(ISessionFactory sessionFactory, int numberItems)
{
    using (var session = sessionFactory.OpenStatelessSession())
    using (var transaction = session.BeginTransaction())
    {
        session.Insert(new LogEntry()
        {
            HostName = "host",
            UserName = "user",
            LogName = "log",
            Level = 0,
            Thread = "thread",
            Timestamp = DateTime.UtcNow,
            Message = "Warm up"
        });

        transaction.Commit();
    }

    var sw = Stopwatch.StartNew();

    using (var session = sessionFactory.OpenStatelessSession())
    using (var transaction = session.BeginTransaction())
    {
        for (var i = 0; i < numberItems; ++i)
        {
            session.Insert(new LogEntry()
            {
                HostName = "host",
                UserName = "user",
                LogName = "log",
                Level = 0,
                Thread = "thread",
                Timestamp = DateTime.UtcNow,
                Message = "Message " + i
            });
        }

        transaction.Commit();
    }

    return sw.Elapsed;
}

public TimeSpan WriteWithBC(ISessionFactory sessionFactory, int numberItems, int batchSize)
{
    using (var session = sessionFactory.OpenStatelessSession())
    using (var bulkCopy = new SqlBulkCopy((SqlConnection)session.Connection))
    {
        bulkCopy.BatchSize = batchSize;
        bulkCopy.DestinationTableName = "LogEntries";
        var table = new DataTable("LogEntries");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("HostName", typeof(string));
        table.Columns.Add("UserName", typeof(string));
        table.Columns.Add("LogName", typeof(string));
        table.Columns.Add("Timestamp", typeof(DateTime));
        table.Columns.Add("Level", typeof(int));
        table.Columns.Add("Thread", typeof(string));
        table.Columns.Add("Message", typeof(string));

        var row = table.NewRow();
        row["HostName"] = "host";
        row["UserName"] = "user";
        row["LogName"] = "log";
        row["Timestamp"] = DateTime.UtcNow;
        row["Level"] = 0L;
        row["Thread"] = "thread";
        row["Message"] = "Warm up";
        table.Rows.Add(row);

        bulkCopy.WriteToServer(table);
    }

    var sw = Stopwatch.StartNew();

    using (var session = sessionFactory.OpenStatelessSession())
    using (var bulkCopy = new SqlBulkCopy((SqlConnection)session.Connection))
    {
        bulkCopy.BatchSize = batchSize;
        bulkCopy.DestinationTableName = "LogEntries";
        var table = new DataTable("LogEntries");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("HostName", typeof(string));
        table.Columns.Add("UserName", typeof(string));
        table.Columns.Add("LogName", typeof(string));
        table.Columns.Add("Timestamp", typeof(DateTime));
        table.Columns.Add("Level", typeof(int));
        table.Columns.Add("Thread", typeof(string));
        table.Columns.Add("Message", typeof(string));

        for (var i = 0; i < numberItems; ++i)
        {
            var row = table.NewRow();
            row["HostName"] = "host";
            row["UserName"] = "user";
            row["LogName"] = "log";
            row["Timestamp"] = DateTime.UtcNow;
            row["Level"] = 0;
            row["Thread"] = "thread";
            row["Message"] = "Message " + i;
            table.Rows.Add(row);
        }

        bulkCopy.WriteToServer(table);
    }

    return sw.Elapsed;
}

Here is some sample output when using NHibernate to perform the inserts:

Saving 10000 items with batch size 500: 00:00:12.3064936
Saving 10000 items with batch size 100: 00:00:12.3600981
Saving 10000 items with batch size 1: 00:00:12.8102670

As a point of comparison, you'll see I also implemented a BCP-based solution. Here is some sample output:

Saving 10000 items with batch size 500: 00:00:00.3142613
Saving 10000 items with batch size 100: 00:00:00.6757417
Saving 10000 items with batch size 1: 00:00:26.2509605

Clearly, the BCP solution is miles faster than the NH one. Also evident is that batching is affecting the speed of the BCP solution, but not the NH one. When using NHibernate to do inserts, NHProf shows the following:

alt text

There are only INSERTs, no SELECTs. Interestingly, at no point does NHProf give me this warning.

I have tried specifying adonet.batch_size in both my config file and in code as per the test case above.

Now, I'm not expecting the NH solution to ever reach the speed of the BCP solution, but I'd at least like to know why batching isn't working. If it's good enough with batching enabled, then I may use the NH solution over the BCP just to keep the code base simpler.

Can anyone explain why NH is refusing to honor ADO.NET batching, and what I can do to fix it? All the scattered NH "documentation" I've read states that all you need to do is specify adonet.batch_size and (preferably) use a stateless session, but I'm doing both those things.

Thanks

+2  A: 

Using identity breaks batching.

Explained by Fabio Maulo here

The best alternative is switching to a different generator (I always recommend hilo or guid.comb)

Diego Mijelshon
Thanks Diego, but Fabio's post only talks about using `ISession`, not `IStatelessSession`. As per my question, I am using the latter. There is no need to do the subsequent `SELECT` to obtain the ID, and nor does that happen. Therefore, it should be possible to batch the `INSERT`s.
Kent Boogaart
Also, section 17.6 of the NH documentation (http://www.nhforge.org/doc/nh/en/index.html#performance-batch-updates) lists limitations of batching, and it doesn't mention ID generators at all. Par for the course with NH, I know - but so incredibly frustrating.
Kent Boogaart
Kent, I don't think `IStatelessSession` supports batching at all. Have you tried using a regular `ISession`? In some cases, it performs better, even for batch scenarios.
Diego Mijelshon
@Diego: it does, but I don't blame you for being unsure. I think the crux of the problem is the awful state of NH documentation. It seems that even with a stateless session, NH is still insisting on updating the entity ID, which precludes batching. I can't find a citation, least of all in the NH documentation, but I'm marking yours as answer.
Kent Boogaart
By the way, I meant to point out that I mistaken about saying that the `SELECT` was not happening, in case that wasn't obvious.
Kent Boogaart
+1  A: 

Neither ISession or IStatelessSession will be able to batch inserts if you are using identity as primary key.

When Insert is made Nhibernate will put the correct value in your Id property. But when you are using Identity the only place where this Id can be taken is the data base. Use hilo to make batch inserts.

Sly
+1 That is the general consensus, but I was hoping for a citation. My use of a stateless session to me should be enough to forgo updating the entity ID. Failing that, a means for me to express my disinterest in the entity ID would be useful because then this would enable batching to occur. Perhaps this is a case of wrong tool for the job, but I was hoping to at least trial it.
Kent Boogaart