tags:

views:

80

answers:

2

Hi all,

First off, I want to start by saying I am not an SQL programmer (I'm a C++/Delphi guy), so some of my questions might be really obvious. So pardon my ignorance :o)

I've been charged with writing a script that will update certain tables in a database based on the contents of a CSV file. I have it working it would seem, but I am worried about atomicity for one of the steps: One of the tables contains only one field - an int which must be incremented each time, but from what I can see is not defined as an identity for some reason. I must create a new row in this table, and insert that row's value into another newly-created row in another table.

This is how I did it (as part of a larger script):

DECLARE @uniqueID INT,
@counter INT,
@maxCount INT

SELECT @maxCount = COUNT(*) FROM tempTable
SET @counter = 1

WHILE (@counter <= @maxCount)
BEGIN
SELECT @uniqueID = MAX(id) FROM uniqueIDTable        <----Line 1
INSERT INTO uniqueIDTableVALUES (@uniqueID + 1)      <----Line 2

SELECT @uniqueID = @uniqueID + 1

UPDATE TOP(1) tempTable
SET userID = @uniqueID
WHERE userID IS NULL

SET @counter = @counter + 1
END
GO

First of all, am I correct using a "WHILE" construct? I couldn't find a way to achieve this with a simple UPDATE statement.

Second of all, how can I be sure that no other operation will be carried out on the database between Lines 1 and 2 that would insert a value into the uniqueIDTable before I do? Is there a way to "synchronize" operations in SQL Server Express?

Also, keep in mind that I have no control over the database design.

Thanks a lot!

+1  A: 

You need a transaction to ensure atomicity and you need to move the select and insert into one statement or do the select with an updlock to prevent two people from running the select at the same time, getting the same value and then trying to insert the same value into the table.

Basically

DECLARE @MaxValTable TABLE (MaxID int)

BEGIN TRANSACTION
BEGIN TRY

  INSERT INTO uniqueIDTable VALUES (id) 
  OUTPUT inserted.id INTO @MaxValTable
  SELECT MAX(id) + 1 FROM uniqueIDTable

  UPDATE TOP(1) tempTable
  SET userID = (SELECT MAXid FROM @MaxValTable)
  WHERE userID IS NULL

  COMMIT TRANSACTION
END TRY
BEGIN CATCH
  ROLLBACK TRANSACTION
  RAISERROR 'Error occurred updating tempTable' -- more detail here is good
END CATCH

That said, using an identity would make things far simpler. This is a potential concurrency problem. Is there any way you can change the column to be identity?

Edit: Ensuring that only one connection at a time will be able to insert into the uniqueIDtable. Not going to scale well though.

Edit: Table variable's better than exclusive table lock. If need be, this can be used when inserting users as well.

GilaMonster
INSERT ... SELECT MAX() is not atomic. And the subsequent UPDATE can see other commited inserts as new MAX, resulting in mutiple updates applying the same id.
Remus Rusanu
Thanks for the reply.I don't think I can change the the column - it's not our application - but I'll see.Assuming I could, how would it change things in tis case?
Bourgui
Ok, I'll edit it to be more paranoid, but far less performant.
GilaMonster
If you could change the column to identity you wouldn't need the uniqueIDtable at all. You could just insert into tempTable and SQL would handle the ID column itself.
GilaMonster
Thanks Gila.Performance is not a huge issue in this case, it is for a maintenance, after-hours operation.However, I can't seem to get this working with my while loop. Must be something simple, but as I mentioned I'm out of my element here, and of course this is due yesterday :o)I'll go with Remus's answer below for now. I'll test both in more details over the weekend (hopefully)
Bourgui
Edit: Thinking some more, there's a better way.Bourgui, what exactly do you mean by 'can't get this working'? Errors?
GilaMonster
+1  A: 

You can do the whole 9 yards in one single statement:

WITH cteUsers AS (
  SELECT t.*
    , ROW_NUMBER() OVER (ORDER BY userID) as rn
    , COALESCE(m.id,0) as max_id
  FROM tempTable t WITH(UPDLOCK)
  JOIN (
    SELECT MAX(id) as id
    FROM uniqueIDTable WITH (UPDLOCK)
    ) as m ON 1=1
  WHERE userID IS NULL)
UPDATE cteUsers
  SET userID = rn + max_id
OUTPUT INSERTED.userID
INTO uniqueIDTable (id);

You get the MAX(id), lock the uniqueIDTable, compute sequential userIDs for users with NULL userID by using ROW_NUMBER(), update the tempTable and insert the new ids into uniqueIDTable. All in one operation.

For performance you need and index on uniqueIDTable(id) and index on tempTable(userID).

SQL is all about set oriented operations, WHILE loops are the code smell of SQL.

Remus Rusanu
Thank you Remus. I had a feeling WHILE loops weren't the way to go, but I could not come up with a statement like yours. It seems to work perfectly from the few tests I've done so far.
Bourgui