I have a huge list of URLs in a MySQL InnoDB table, and worker processes that query MySQL for a batch of URLs to process. Each batch should be marked as being processed immediately, so that other worker processes do not waste resources by starting on the same URLs.
Currently I first do this to get some URLs:
SELECT DISTINCT url FROM urls WHERE task_assigned IS NULL ORDER BY id LIMIT 100
Then, in code, I naively loop through those URLs and mark each one as being processed:
UPDATE urls SET task_assigned = NOW() WHERE url = ? COLLATE utf8_bin
I'm perfectly aware of how silly and inefficient this is. More importantly, there is no guarantee that another worker process won't fetch its own list in the middle of my UPDATEs, and so pick up URLs I have already selected but not yet marked. What is a clean way to do this? Should I wrap it in a transaction, and if so, how?
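To make the race concrete, here is a minimal sketch of the interleaving I am worried about. It is only an illustration under several assumptions: it uses Python's built-in sqlite3 with an in-memory table as a stand-in for MySQL, so NOW() becomes CURRENT_TIMESTAMP and the COLLATE clause and DISTINCT are dropped (the sample URLs are unique), the example.com URLs are made up, the batch size is 4 instead of 100, and the two workers are simulated one after another on a single connection rather than as real processes. The helper names fetch_batch, mark_assigned, and simulate_race are hypothetical.

```python
import sqlite3


def fetch_batch(conn, limit):
    """Step 1: select unassigned URLs (simplified version of the SELECT above)."""
    rows = conn.execute(
        "SELECT url FROM urls WHERE task_assigned IS NULL ORDER BY id LIMIT ?",
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]


def mark_assigned(conn, url):
    """Step 2: mark a single URL as being processed (one UPDATE per URL)."""
    conn.execute(
        "UPDATE urls SET task_assigned = CURRENT_TIMESTAMP WHERE url = ?",
        (url,),
    )


def simulate_race():
    # In-memory stand-in for the real table; the schema is an assumption.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT, task_assigned TEXT)"
    )
    conn.executemany(
        "INSERT INTO urls (url) VALUES (?)",
        [(f"http://example.com/{i}",) for i in range(6)],
    )

    # Worker A fetches its batch and has marked only the first URL so far.
    batch_a = fetch_batch(conn, 4)
    mark_assigned(conn, batch_a[0])

    # Worker B fetches its batch while A is still in the middle of its loop.
    batch_b = fetch_batch(conn, 4)

    # Worker A finishes marking the rest of its batch, too late for B.
    for url in batch_a[1:]:
        mark_assigned(conn, url)

    # URLs that both workers now believe they own.
    overlap = sorted(set(batch_a) & set(batch_b))
    return batch_a, batch_b, overlap


batch_a, batch_b, overlap = simulate_race()
print("claimed by both workers:", overlap)
```

Every URL that worker A had selected but not yet marked also ends up in worker B's batch, which is exactly the duplicated work I want to rule out.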