I know it's a common question that has been asked several times on SO, but please help me either way. I have to upload data from my local machine to a remote SQL database. The remote database has a single table containing 800,000 records. Locally I have around 121,311 records, of which roughly 75% already exist on the remote database, but we don't know exactly which ones. Each record is identified by a unique code called DCNNumber: if a DCN already exists on the server, the record should be excluded; otherwise it should be inserted.

So what I did is collect all the DCNs from the remote database into an XML file using a DataSet; that XML file alone is 24 MB. From my local text files I parse about 120,000 records into a generic List, and the DCNs from the XML are added to a List<string>.

Then the two lists are compared using if (!lstODCN.Contains(dcnFromXml)) { lstNewDCN.Add(dcnFromXml); }

But this code takes almost an hour to execute and filter the records, so I need a more efficient way to filter such a large data set.

+2  A: 

Load all the results into a HashSet<string> - that will be much faster at checking containment.
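
For example, a minimal sketch of that change (dcnsFromServer and dcnsFromXml are placeholder names for the two sequences of DCN strings):

    // requires: using System.Collections.Generic;
    // Build the set once from the server-side DCNs...
    var existingDCNs = new HashSet<string>(dcnsFromServer);

    var newDCNs = new List<string>();
    foreach (var dcn in dcnsFromXml)
    {
        // ...then each lookup is O(1) on average, instead of the full
        // scan of the list that List<T>.Contains performs.
        if (!existingDCNs.Contains(dcn))
            newDCNs.Add(dcn);
    }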

It's possible that LINQ would also make this simpler, but I'm somewhat confused as to exactly what's going on... I suspect you can just use:

var newDCNs = xmlDCNs.Except(oldDCNs);
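
Except performs a set-based difference internally, so it should perform comparably to the explicit HashSet approach. A minimal usage sketch, assuming xmlDCNs and oldDCNs are both IEnumerable<string>:

    // requires: using System.Linq;
    // Except builds a hash set from oldDCNs, then streams xmlDCNs through it,
    // yielding only the DCNs that are not already on the server.
    List<string> newDCNs = xmlDCNs.Except(oldDCNs).ToList();
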
Jon Skeet
+1  A: 

In addition to Jon's answer: using an XML dataset to transfer the data from the server is probably a bad idea, because XML is a very verbose format. Using a flat-file format + compression would be much more efficient.
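
A minimal sketch of that idea, assuming the server-side DCNs are available as an IEnumerable<string> (the method and file names here are illustrative):

    using System.Collections.Generic;
    using System.IO;
    using System.IO.Compression;

    // Write one DCN per line, gzip-compressed: far smaller than 24 MB of XML.
    static void WriteDcnFile(IEnumerable<string> serverDCNs, string path)
    {
        using (var file = File.Create(path))
        using (var gzip = new GZipStream(file, CompressionMode.Compress))
        using (var writer = new StreamWriter(gzip))
        {
            foreach (var dcn in serverDCNs)
                writer.WriteLine(dcn);
        }
    }

    // Read the file back into a HashSet<string>, ready for fast lookups.
    static HashSet<string> ReadDcnFile(string path)
    {
        var dcns = new HashSet<string>();
        using (var file = File.OpenRead(path))
        using (var gzip = new GZipStream(file, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
                dcns.Add(line);
        }
        return dcns;
    }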

Thomas Levesque