views:

771

answers:

3

I'm wondering if it's worth the server time when updating a record to retrieve the existing record, loop through the fields checking for changes and only putting the changed fields in the update query? (I am using MySQL & PHP.)

The main reason for doing this is to reduce the size of the update query for change log purposes. Normally the query might have 15 fields, but only 2 fields are actually being changed. This query can then also be used for logging as it will only contain the changed fields and therefore is easier to parse.

My concern is the time that it takes to retrieve the existing record.

Or is there a way to retrieve from MySQL which fields it updated?

+1  A: 

The slowest point in your application is always going to be your database access, so if you can speed that up, it’s a good idea. That said it really depends how big your database, and records are, and how big they are likely to grow to, as to whether it’s worth the effort to programmatically check whether the items have updated. If your database is small and access is already quite quick, it may not be worth the time. But if speed can be improved, and it gives you an added benefit for your logging, then go for it.

Sam Cogan
+2  A: 

I think it's worth changing - but probably not worth doing a select before insert.

I only update the fields that have changed, it's part of the operation of my DbEntity class which follows an activerecord pattern. It costs little extra to do this because I hold the current record and original records -simply copying whenever a record is loaded.

Reasons are brevity - not really performance. Also you can check for concurrent modification by adding a where clause on the old value of the updated fields and throw the appropriate error.

In the write/update method:

$s1 = "";

foreach ($this->record as $key => $value)
{
    // only update fields that have been changed
    if ($value != $this->orig_record[$key])
    {
        $s1 .= $comma."`$key`='".mysql_real_escape_string($value)."'";
        $comma = ", ";
    }
}

$query = "UPDATE ".$this->table." SET $s1 where {$this->id_field}='".$this->get_keyfield()."'";
$query .= $this->extra_sql_update;
mysql_query($query);

$ar = mysql_affected_rows();
//
// the number of affected rows is actually those changed by the update operation, which will 
// either be zero, or 1. If the query affects more than one row then we have a problem.
if ($ar < 0 || $ar > 1)
{
    cbf_error("cbf_dbentity: {$this->table} :: only one row (not $ar) must be affected by an insert operation. $query",
      E_USER_ERROR);
}
else
{
    $new_id = $this->get_keyfield();

    GlobalEventBus::notify_all(new AuditLogSQL($this->table, "update", $query));

}

$this->orig_record = Array();

foreach ($this->record as $key => $value)
    $this->orig_record[$key] = $value;


//
// sanity check - ensure that what we have just written is actually there.

$this->load($new_id);

foreach ($this->orig_record as $key => $value)
    if (trim($this->record[$key]) != trim($value) 
        && (!$this->record[$key] == "0" && $value=""))
        cbf_error("cbf_dbentity: {$this->table} :: record differs during write after reload: field $key was \"$value\", after write it is now \"".
              $this->record[$key]."\"",E_USER_ERROR);

In the load method

$this->orig_record = Array();
foreach ($this->record as $key => $value)
    $this->orig_record[$key] = $value;
Richard Harrison
A: 

At the most basic level, if I'm reading your question right, you generally don't want to blindly update the entire record in case another user has already updated parts of that record that you've not actually changed. You would blindly and needlessly revert their updates.

I believe your current algorithm may lead to dirty writes, if you're going to read the current once for update, allow updates to be made in memory, then read the record again to allow you to figure out which fields have been updated. What happens if another user updated that record behind your back,leading your algorithm to believe that you were the one to update that field? But primarily, you shouldn't have to read every record twice to perform a single update.

If your data does not often result in conflicts, you may benefit from reading about optimistic locking, even if you don't choose to implement it.

We've implemented one method here whereby you add an update-timestamp or an incremental update-number column to your table. Then in your sandbox/memory you can keep track of which fields you have modified (oldvalue/newvalue), and you can freely issue your update SQL for that record for those fields, "UPDATE...WHERE UPDATENUM=the-original-number" (or WHERE UPDATETS=the-original-timestamp), making sure that your update SQL also increments the UPDATENUM or UPDATETS, as appropriate. If the records-affected by that update SQL is 0 you know that someone else has already modified that record in the background and you now have a conflict. But at least you didn't overwrite someone else's changes, and you can then reread the new data or have your user resolve the conflict.

JMD