ansaurus

Question

Answer 1

A:

The first thing I would do is simplify the Authors entity as I don't think you need the Initials, ForeName, and LastName objects as separate entities. I think using plain strings would be more efficient:

public class Author : Entity
{   
    public virtual string Initials { get; set; }
    public virtual string ForeName { get; set; }
    public virtual string LastName { get; set; }
}

Dan 2010-08-11 20:54:26

thanks - this is a remainder from when I tried to use lookuptables for names - where I had an int rather than the actual string as primary key. I then realised that the lookup of the int takes a lot of time - so I paid for insert efficiency with memory.

csetzkorn 2010-08-12 08:11:33

Answer 2

+1 A:

In all honesty I would say that you shouldn't even be using SA/NHibernate for something like this. It's a bulk data import from XML - an ETL tool like SSIS would be a better choice. Even a hand-cranked process on the DB server would work better - step 1, load XML to a table, step 2, do the UPSERT. Incidentally, SQL 2008 introduced the MERGE command for UPSERT operations, which might be of use.

I would also agree with Dan's comment - is it really necessary to treat initials, forename and surname as separate entities? Treating them as simple strings would boost performance. What in your domain model specifies that they are entities in their own right?

If you really must continue using SA/NHibernate, have a read of this: http://www.lostechies.com/blogs/jimmy_bogard/archive/2010/06/24/bulk-processing-with-nhibernate.aspx

The suggestion in Jimmy's blog about batching SELECTs should help quite a lot. If you plan to process a batch of 250 records at once, do all the SELECTs as a single NH command, process all the data, then do all the updates as another single batch (which I believe your use of EntityWithTypedId and the adonet.batch_size config setting will help achieve)

Finally - regarding the statement "which I parse using Linq to XML" - is that really the best way of doing it? I'm guessing that it might be, given the size of your input file, but are you aware of the approach of simply deserializing the XML file into an object graph? SO won't let me post the link to a page describing this, because I haven't earned enough reputation yet - but if you want to read up on it, Google "don't parse that xml" and the first article will explain it.

Hope this helps. Jon

Jon 2010-08-13 06:10:26

I ended up using xsd2code. make sure that, if your xml contains dtd information, use: BlaClassBlaClass = ((BlaClass)(serializer.Deserialize(System.Xml.XmlReader.Create(new XmlTextReader(filepath), new System.Xml.XmlReaderSettings() { ProhibitDtd = false }))));

csetzkorn 2010-08-16 09:49:37

ansaurus

tags:

views:

answers:

NHibernate insert/lookup performance

related questions