I am using mySQL from their C API, but that shouldn't be relevant. My code must process records from a table that match some criteria, and then update the said records to flag them as processed. The lines in the table are modified/inserted/deleted by another process I don't control. I am afraid in the following, the UPDATE might flag some records erroneously since the set of records matching might have changed between step 1 and step 3.
SELECT * FROM myTable WHERE <CONDITION>; # step 1
<iterate over the selected set of lines. This may take some time.> # step 2
UPDATE myTable SET processed=1 WHERE <CONDITION> # step 3
What's the smart way to ensure that the UPDATE updates all the lines processed, and only them? A transaction doesn't seem to fit the bill as it doesn't provide isolation of that sort: a recently modified record not in the originally selected set might still be targeted by the UPDATE statement. For the same reason, SELECT ... FOR UPDATE doesn't seem to help, though it sounds promising :-)
The only way I can see is to use a temporary table to memorize the set of rows to be processed, doing something like:
CREATE TEMPORARY TABLE workOrder (jobId INT(11));
INSERT INTO workOrder SELECT myID as jobId FROM myTable WHERE <CONDITION>;
SELECT * FROM myTable WHERE myID IN (SELECT * FROM workOrder);
<iterate over the selected set of lines. This may take some time.>
UPDATE myTable SET processed=1 WHERE myID IN (SELECT * FROM workOrder);
DROP TABLE workOrder;
But this seems wasteful and not very efficient.
Is there anything smarter?
Many thanks from a SQL newbie.