tags:

views:

108

answers:

2

I receive a daily XML file which I use to update a database with the content. The file is always a complete file, i.e. everything is included whether it is changed or not. I am using Linq2Sql to update the database and I was debating whether to check if anything had changed in each record (most will not change) and only update those which did change, or just update each record with the current data.

I feel that I need to hit the database with an update for each record to enable me to weed out the records which are not included in the xml file. I am setting a processed date on each record, then revisiting those not processed to delete them. Then I wondered whether I should just find the corresponding record in the database ad update the object with the current information whether it has changed or not. That led me to taking a closer look at the sql generated for updates. I found that only the data which has changed is set in the update statement to the database, but I found that the WHERE clause includes all of the columns in the record, not just the primary key. This seems very wasteful in terms of data flying around the system and therefore set me wondering why this is the case and whether there is setting for the LinqToSql context to use only the primary key in the clause.

So I have two questions:

  1. Why does LinqToSql where clause include all of the current data, not just the primary key?
  2. Is there a way to configure the context to only use the primary key in the where clause?
+2  A: 

This is optimistic concurrency - it's basically making sure that it doesn't stomp on changes made by anything else. You can tweak the concurrency settings in various ways, although I'm not an expert on it.

The MSDN page for Linq to Sql optimistic concurrency is a good starting point.

If you have a column representing the "version" of the row (e.g. an autoupdated timestamp) you can use just that - or you can just set UpdateCheck=Never on all the columns if you know nothing else will have changed the data.

You haven't really described enough about "your use of the processed date" to answer the third point.

Jon Skeet
Just to clarify what Jon says about a timestamp: L2S will automagically fall back to only the timestamp if you have a timestamp/rowversion column in your table.
KristoferA - Huagati.com
Thanks Jon... Yes a quick look at the docs indicates that the UpdateCheck property on each columnAttribute defines if the column is checked Always (default), Never or When Changed.
Richbits
Thanks Kristofer, that is an interesting work around for a global reduction of data. I'll probably stick with the default for time being, but that and update changes provide some useful flexibility.
Richbits
I have now removed #3 as it is really a different topic, I'll rewrite as a seperate question.
Richbits
+1  A: 

To answer #2, in the dbml designer, set the property "Update Check" equal to "Never" on the column level for each column in the table to avoid the generation of massive where clauses.

Dave
Also without this you can't attach detached entities, the biggest gripe people had with the framework and probably borrowed from the way NH handles its sessions
Chris S