views:

162

answers:

6

I've inherited a (Microsoft?) SQL database that wasn't very pristine in its original state. There are still some very strange things in it that I'm trying to fix - one of them is inconsistent ID entries.

In the accounts table, each entry has a number called accountID, which is referenced in several other tables (notes, equipment, etc. ). The problem is that the numbers (for some random reason) - range from about -100000 to +2000000 when there are about only 7000 entries.

Is there any good way to re-number them while changing corresponding numbers in the other tables? At my disposal I also have Coldfusion, so any thing that works with SQL and/or that I'll accept.

A: 

You could probably disable the foreign key relationships (if you're able to take it offline temporarily) and then update the primary keys using a script. I've used this update script before to change values, and you could pretty easily wrap this code in a cursor to go through the key values in question, one by one, and update the arbitrary value to an incrementing value you're keeping track of.

Check out the script here: http://vyaskn.tripod.com/sql_server_search_and_replace.htm

If you just have a list of tables that use the primary key, you could set up a series of UPDATE statements that run inside your cursor, and then you wouldn't need to use this script (which can be a little slow).

It's worth asking, though, why these values appear out of wack. Does this database have values added and deleted constantly? Are the primary key values really arbitrary, or do they just appear to be, but they really have meaning? Though I'm all for consolidating, you'd have to ensure that there's no purpose to those values.

rwmnau
The database was designed who knows how - i wasn't there back then - and there are different odd things. At one time, a number called servicecallID was the date, instead of an incrementing number. It went skipped up to 2005xxxx then started incrementing by one again.
Iuvat
A: 

With ColdFusion this shouldn't be a herculean task, but it will be messy and you'll have to be careful. One method you could use would be to script the database and then generate a brand new, blank table schema. Set the accountID as an identity field in the new database.

Then, using ColdFusion, write a query that will pull all of the old account data and insert them into the new database one by one. For each row, let the new database assign a new ID. After each insert, pull the new ID (using either @@IDENTITY or MAX(accountID)) and store the new ID and the old ID together in a temporary table so you know which old IDs belong to which new IDs.

Next, repeat the process with each of the child tables. For each old ID, pull its child entries and re-insert them into the new database using the new IDs. If the primary keys on the child tables are fine, you can insert them as-is or let the server assign new ones if they don't matter.

Assigning new IDs in place by disabling relationships temporarily may work, but you might also run into conflicts if one of the entries is assigned an ID that is already being used by the old data which could cause conflicts.

Justin Scott
I've done a coldfusion page for one of the smaller tables, but it just takes too long for this one, and for some reason i get mismatches. do you have mock up code for a good way?
Iuvat
Never use @@identity to pull the new id, you can get incorrect results and mess up data integrity. Use scope_identity() instead.
HLGEM
+4  A: 

For surrogate keys, they are meant to be meaningless, so unless you actually had a database integrity issue (like there were no foreign key contraints properly defined) or your identity was approaching the maximum for its datatype, I would leave them alone and go after some other low hanging fruit that would have more impact.

Cade Roux
+1 Agreed, leave that kind of thing well alone unless the system is totally buggered or approaching meltdown.
Kev
I"m no expert in databases, and i've already ported the data to tables i've made, so I"m running out of things to do. Fixing those numbers wouldn't hurt with some of the developing of the user gui i'm doing.
Iuvat
But what would "fixing" those numbers give you? When you've got thousands or millions of surrogate keys, you don't get attached to them ;-)
Cade Roux
Yeah, I suppose i'll just use CF to fix the negatives that are bothering my OCPD, then try to convince my semi-technical boss that this is a good way to go.
Iuvat
iuvat, as Cade said, do not get attached to those numbers, you can look at IDs as pointers in C or object references in other programming languages. The value does not matter, the important point is that it references the right object (a proper row in the table)
kristof
+1 to kristof. If these values don't mean anything in a 'business' sense and their sole purpose is to create relationships between data I'd leave alone, permanently.
Kev
A: 

Create a new column in the accounts table for your new ID, and new column in each of your related tables to reference the new ID column.

ALTER TABLE accounts
ADD new_accountID int IDENTITY

ALTER TABLE notes
ADD new_accountID int

ALTER TABLE equipment
ADD new_accountID int

Then you can map the new_accountID column on each of your referencing tables to the accounts table.

UPDATE notes
SET new_accountID = accounts.new_accountID
FROM accounts
INNER JOIN notes ON (notes.accountID = accounts.accountID)

UPDATE equipment
SET new_accountID = accounts.new_accountID
FROM accounts
INNER JOIN equipment ON (equipment.accountID = accounts.accountID)

At this point, each table has both accountID with the old keys, and new_accountID with the new keys. From here it should be pretty straightforward.

  1. Break all of the foreign keys on accountID.
  2. On each table, UPDATE [table] SET accountID = new_accountID.
  3. Re-add the foreign keys for accountID.
  4. Drop new_accountID from all of the tables, as it's no longer needed.
Patrick McElhaney
+1  A: 

I would talk to an accountant (or at least your financial people) before messing in anyway with the numbers in the accounts tables if this is a financial app. The Table of accounts is very critical to how finances are reported. These IDs may have meaning you don't understand. No one puts in a negative id unless they had a reason. I would under no circumstances change that unless I understood why it was negative to begin with. You could truly screw up your tax reporting or some other thing by making an uneeded change.

HLGEM
Trust me, I know this DB back to front, having ported all the info over from a badly designed one it used to be in. Also, I personally wrote the only application that interfaces with it, so I'm pretty sure that making the numbers positive wont hurt anything. It was just the bad original programmer.
Iuvat
+2  A: 

In this instance, it sounds like "why" is a better question than "how". The OP notes that there is a strange problem that needs to be fixed but doesn't say why it is a problem. Is it causing problems? What positive impact would changing these numbers have? Unless you originally programmed the system and understand precisely why the number is in its current state, you are taking quite a risky making changes like this.

robsymonds
Indeed. It is common, for IDentifiers, to be non-contiguous and I do not see why it is a problem.
bortzmeyer