I have SQL Server 2005 Standard Service Pack 2 9.00.4053.00 (Intel X86)

The table has close to 30 million rows.

If I do

SELECT GETDATE(), * FROM
<table>

an identical date and time value is returned for every row, including the milliseconds part, even though the query took more than 3 minutes to complete.

I have already read

http://sqlblog.com/blogs/andrew_kelly/archive/2008/02/27/when-getdate-is-not-a-constant.aspx

http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/66507b8b-4a74-44c1-9637-3ab5f75db6a0

One of the links I posted (the marked answer) suggests that prior to SQL 2005 GETDATE was deterministic, although SQL 2000 BOL states that GETDATE is nondeterministic.

If I do an update with millions of rows

UPDATE tableName
SET dateColumn = GETDATE()

I know that what you really want to do is

DECLARE @DT datetime
SET @DT = GETDATE()
UPDATE tableName
SET dateColumn = @DT

I am really confused

What would be the expected behavior?

  1. For the SELECT statement I posted earlier
  2. For the UPDATE statement

If you are updating a date column on a table with 100 million rows, will the date column have an identical date and time in every row, down to the millisecond?

+4  A: 

GetDate() was never deterministic. Deterministic means that it will always return the same result when passed the same parameters.

In common with rand(), it is evaluated once per column, but once evaluated it remains the same for all rows.

It is easier to see this behaviour with rand() than with getdate():

select top 4 rand(), rand()
from sys.objects

Returned

---------------------- ----------------------
0.0566172633850772     0.431111195699363
0.0566172633850772     0.431111195699363
0.0566172633850772     0.431111195699363
0.0566172633850772     0.431111195699363

If you try the following

select top 10 getdate(), getdate()
from sys.objects

and look at the Compute Scalar operator properties in the actual execution plan, you will see that GetDate() is evaluated twice.

NB: It is possible that this behaviour of evaluation per column rather than per query changed after SQL 2000 (I don't know) but that isn't what BOL defines as the meaning of deterministic.

Martin Smith
Martin, thank you...
cshah
Martin, how come NEWID() returns a different result for each row?
cshah
@cshah - Good question and one I don't know the answer to!
Martin Smith
@cshah, @Martin Smith: please see my answer for NEWID
gbn
I would like to mark both of your replies as the answer, but I can't.
cshah
+2  A: 

Following on from Martin Smith's answer, the determinism referred to was a change in UDF behaviour. In SQL Server 2000, you could not use GETDATE in a UDF. You can in SQL Server 2005. See this link too

As Martin Smith said, some functions are evaluated once per column, per query, not once per row. GETDATE is one, RAND is another.

If you do need row-by-row evaluation of GETDATE, then wrap it in a UDF.
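As a minimal sketch of that wrapping, assuming SQL Server 2005 or later: the function name dbo.fn_CurrentDateTime is made up for illustration and does not come from this thread. The scalar UDF is invoked for each row, whereas the bare GETDATE() in the same query is evaluated once.

```sql
-- Hypothetical helper name; any scalar UDF that returns GETDATE() would do.
CREATE FUNCTION dbo.fn_CurrentDateTime()
RETURNS datetime
AS
BEGIN
    -- Allowed inside a UDF from SQL Server 2005 onwards (not in SQL Server 2000).
    RETURN GETDATE();
END;
GO

-- per_row_time comes from the UDF, which is called for each row;
-- per_query_time is the plain GETDATE(), evaluated once for the column.
SELECT TOP 10
    dbo.fn_CurrentDateTime() AS per_row_time,
    GETDATE()                AS per_query_time
FROM sys.objects;
```

Because datetime is only accurate to roughly 3 milliseconds, a short query may still show the same value in every row of per_row_time; the difference is more likely to show up on long-running statements. Calling a scalar UDF per row also adds overhead on large tables.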

Edit:

NEWID is statistically unique. It must be evaluated row by row so you don't have the same value appear in another row. Hence the CHECKSUM(NEWID()) trick to generate row by row random numbers...
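A small sketch of that trick, run against sys.objects as in the earlier examples; the column aliases and the 0 to 99 range are arbitrary choices for illustration. Each NEWID() reference is evaluated for every row, so both the GUID columns and the derived number should differ from row to row.

```sql
SELECT TOP 4
    NEWID() AS id_a,                               -- new GUID for each row
    NEWID() AS id_b,                               -- separate GUID, also per row
    ABS(CHECKSUM(NEWID()) % 100) AS random_0_to_99 -- per-row pseudo-random integer
FROM sys.objects;
```

Taking the modulo before ABS keeps the value in range and avoids the overflow that ABS would raise on the smallest possible int.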

gbn
Thanks, but why does NEWID behave differently? SELECT top 4 newid(), newid() from sys.objects
cshah
Now I understand: if you want to return rows in random order you use ORDER BY NEWID(), so newid() has to be different for each row...
cshah
Would the update behave the same way, i.e. UPDATE tableName SET dateColumn = GETDATE()? OK, it will be the same.. dumb :-(
cshah