I receive a daily XML file whose content I use to update a database. The file is always complete, i.e. every record is included whether it has changed or not. I am using LINQ to SQL to update the database, and I was debating whether to check each record for changes (most will not have changed) and update only those that did, or simply update every record with the current data.
I feel that I need to hit the database with an update for each record anyway, so that I can weed out the records which are no longer included in the XML file: I set a processed date on each record, then revisit those that were not processed and delete them. So I wondered whether I should just find the corresponding record in the database and update the object with the current information, whether it has changed or not. That led me to take a closer look at the SQL generated for updates. I found that only the columns which have changed are set in the UPDATE statement, but the WHERE clause includes all of the columns in the record, not just the primary key. This seems very wasteful in terms of data flying around the system, so I am wondering why this is the case and whether there is a setting on the LINQ to SQL context to use only the primary key in the WHERE clause.
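For illustration, the generated update has roughly the following shape. The table and column names (`Product`, `Id`, `Name`, `Price`, `ProcessedDate`) are made up for this example and are not my actual schema, and the parameter numbering is only indicative. Here only `Price` has changed:

```sql
-- Hypothetical table and column names, for illustration only.
-- Only the changed column appears in SET...
UPDATE [dbo].[Product]
SET [Price] = @p4
-- ...but WHERE compares every column against the value that was originally loaded.
WHERE ([Id] = @p0)
  AND ([Name] = @p1)
  AND ([Price] = @p2)
  AND ([ProcessedDate] = @p3)
```

What I expected instead was a WHERE clause containing only the `[Id] = @p0` condition.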
So I have two questions:
- Why does the WHERE clause generated by LINQ to SQL include all of the current data, not just the primary key?
- Is there a way to configure the context to use only the primary key in the WHERE clause?