I have a table with a CreateDate datetime column (default getdate()) and no identity column.

I would like to add an identity(1,1) column that reflects the same order as the CreateDate values of the existing records (ORDER BY on either column would give the same results). How can I do that?

I guess that if I create a clustered key on the CreateDate column and then add the identity column it will work (I'm not sure it's guaranteed). Is there a good/better way?

I am interested in SQL Server 2005, but I guess the answer will be the same for SQL Server 2008 and SQL Server 2000.

+1  A: 

As you suspect, it will add them according to the clustered index. Otherwise, you'll have to do it in code from somewhere.

Patrick Karcher
No, it won't. Only an ORDER BY clause will guarantee an order.
gbn
@gbn Every time I've inserted an identity column, the order of the autonumbers matched the clustered index. I've counted on this behavior to get the order I wished. You're saying that this behavior is not guaranteed and might not happen, that I've been lucky?
Patrick Karcher
@Patrick: IDENTITY can leave gaps, it's documented on MSDN: "If an identity column exists for a table with frequent deletions, gaps can occur between identity values."
Remus Rusanu
@gbn That's certainly true, but how does it apply? In his situation there is as yet no Identity column. He wants to know how to add an identity column that (at the moment it's created) follows the order of the CreateDate column (as best it can of course, since CreateDate could have duplicated values). After he does this, the two can/will of course diverge, but this has nothing to do with his question. If he has a clustered index on CreateDate, at the moment he adds a new identity field, it will have an order that follows CreateDate. You're saying this is not the case?
Patrick Karcher
@Remus Rusanu woops, above comment was in response to your comment.
Patrick Karcher
The identity column will be out of sync with the physical position after the first delete. It may also get out of sync after two inserts. 'Identity column' and 'clustered index order' should never be used in the same sentence, except this sentence.
Remus Rusanu
@Remus but that's the future. I don't think he wants a magical identity column that will mirror the date field into the future. It could be that he really, really, does not know what he's doing at all and thinks this is possible. Since I've done the exact same thing he's doing several times, I figured he just wanted some *somewhat reasonable* seed values to start the new field, since he can't fix the past. Your interpretation could be correct I suppose. Also, we seem to disagree pretty seriously about Identity columns and clustered indexes. I think they're like chocolate and peanut butter.
Patrick Karcher
You've read the OP as an immediate operational DBA type decision (must add IDENTITY now, how to?) whereas I read it as a design, developer type decision. I raised the flag about the IDENTITY vs. order mismatch since it is so often (erroneously) assumed. As for the peanut butter and jelly, my objection is not about IDENTITY as clustered key, it is about confusing primary keys and clustered keys. See http://stackoverflow.com/questions/1301165/should-i-design-a-table-with-a-primary-key-of-varchar-or-int/1301536#1301536 for a longer discussion of the topic.
Remus Rusanu
Demo of clustered index with no ORDER BY: http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/05/20/without-order-by-there-is-no-default-sort-order.aspx
gbn
@gbn Great info I guess, as it seems some people think a query will always use the clustered index to order results in the absence of ORDER BY. I'm not one of those people. And it doesn't apply to this question. This question is not about all the queries he's going to run later, but about how to have the autonumbers on a new identity column align with the **existing** CreateDate field values. Not "all future magically", but "existing". I don't believe he's asking for anything else.
Patrick Karcher
@Patrick Karcher: I covered this in my answer (and qualified it too)
gbn
+2  A: 

IDENTITY values are orthogonal to the physical storage order in general. In particular, an identity will not always match a datetime clustered key order, because the datetime resolution of about 3 ms allows multiple rows to share the same datetime. Also, if the original time is bound to the client machine (i.e. mid tier, ASP layer, user machine, etc.), then the time drift between machines will also ensure a difference between insert order (what IDENTITY would give) and storage order.

If you need a row order integer, use ROW_NUMBER() in the projection list. If you need an IDENTITY primary key for ORM purposes, use an IDENTITY column and index it as a non-clustered index.
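A minimal sketch of both options, assuming a table named MyTable (a placeholder) with the CreateDate column from the question; the column name ID and the constraint name PK_MyTable are also made up for illustration. ROW_NUMBER() needs SQL Server 2005 or later, and rows that share the same CreateDate are numbered in no guaranteed order relative to each other.

```sql
-- Option 1: a row-order integer computed at query time (nothing is stored).
-- MyTable is a hypothetical name; CreateDate comes from the question.
SELECT
    ROW_NUMBER() OVER (ORDER BY CreateDate ASC) AS RowNum,
    CreateDate
FROM
    MyTable
ORDER BY
    CreateDate ASC;
GO

-- Option 2: an IDENTITY column used purely as a surrogate key.
-- The values assigned to existing rows carry no ordering guarantee.
ALTER TABLE MyTable
    ADD ID int IDENTITY(1,1) NOT NULL;
GO

-- Index the key as NONCLUSTERED so it does not dictate the storage order.
ALTER TABLE MyTable
    ADD CONSTRAINT PK_MyTable PRIMARY KEY NONCLUSTERED (ID);
GO
```

GO is the batch separator understood by SSMS and sqlcmd, not a T-SQL statement; it is used here so the constraint is created in a separate batch from the column it references.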

Never confuse physical storage requirement (clustered key) with logical modeling requirements (primary key).

Remus Rusanu
If you downvote, explain why, as a basic courtesy.
Remus Rusanu
@Remus Sorry, that was me. I don't believe you answered the question at all. I *agree* with your statements (except that my Identity columns are pretty much always the clustered index), but they are not helpful. He wants to improve his design/clarity/indexing/foreignKeys by adding an autonumber field; I've been in his situation several times. I'm sure he wishes there had been an autonumber to begin with. Rather than random numbers to start out with, he'd like them to at least *somewhat* match the order in which they were added. Better than nothing. How to do that was his question.
Patrick Karcher
Yes, but what will happen *next*, after he adds the identity column? His cherished physical order-identity value mapping will vanish in 5 minutes. My whole post is about how *never* to rely on such a mapping.
Remus Rusanu
@Remus. I just can't imagine that he expected it to. He probably just needs a good foreign key. Or, the previous clustered index was some horrid, huge, "natural key" and now he needs performance for his non-clustered indexes, etc. Or it was a multi-field key, and he's trying to bring some sanity.
Patrick Karcher
+3  A: 

Following on from Remus' theoretical answer... you need to generate a list first with your ideal ordering:

SELECT
    ID, CreateDate
INTO
    MyNewTable
FROM
    (
    SELECT
        CreateDate,
        ROW_NUMBER() OVER (ORDER BY CreateDate ASC) AS ID
    FROM
        MyTable
    ) foo

Then, the best solution is to use SSMS to add the IDENTITY property to MyNewTable. SSMS will generate a script that includes SET IDENTITY_INSERT to preserve the order.
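If you would rather script it by hand, the pattern is roughly the following. This is only a sketch, not the script SSMS actually produces (which uses its own temporary names and wraps the work in a transaction). The table name MyNewTableWithIdentity is hypothetical, and the NOT NULL on CreateDate assumes the existing data contains no NULLs.

```sql
-- Hypothetical target table that carries the IDENTITY property.
CREATE TABLE MyNewTableWithIdentity
(
    ID         int IDENTITY(1,1) NOT NULL,
    CreateDate datetime NOT NULL DEFAULT (GETDATE())  -- assumes no NULLs in the source
);
GO

-- Allow explicit values in the IDENTITY column for this session.
SET IDENTITY_INSERT MyNewTableWithIdentity ON;

-- Copy the pre-numbered rows; the column list is required while IDENTITY_INSERT is ON.
INSERT INTO MyNewTableWithIdentity (ID, CreateDate)
SELECT ID, CreateDate
FROM MyNewTable;

-- Return to automatic numbering for future inserts.
SET IDENTITY_INSERT MyNewTableWithIdentity OFF;
GO
```

After the copy, new rows should continue numbering from the highest ID that was inserted, since SQL Server raises the current identity value when a larger explicit value is inserted.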

Note: IDENTITY columns are just numbers that have no implicit meaning and nothing should be inferred by their alignment with the CreateDate after this exercise...

gbn