+2  A: 

If you want to compare actual performance, you need to create the tables and indexes (and everything else involved). While a temp table will be a much better analog than a table variable, neither is a substitute for an actual permanent table structure if you're seeking performance metrics.

All of that being said, however, you should avoid using uniqueidentifier as a primary key, or, at the very least, use newsequentialid() rather than newid(). A primary key is clustered by default, and a clustered index means that the rows are stored in key order. If an inserted value is out of sequence, SQL Server has to place it on the page where it belongs, and if that page is already full it must split the page to make room.
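To make the cost of out-of-sequence inserts concrete, here is a rough toy model in Python. It is not SQL Server's storage engine: the page capacity, the function names, and the rule "a full page splits in half unless the key goes at the very end" are all simplifying assumptions made for illustration. Random integers stand in for newid() values and an increasing range stands in for newsequentialid().

```python
import bisect
import random

PAGE_CAPACITY = 100  # hypothetical rows per page, chosen only for illustration


def count_page_splits(keys, capacity=PAGE_CAPACITY):
    """Insert unique keys into sorted, fixed-size pages; return the split count."""
    pages = [[]]             # each page keeps its keys in sorted order
    lows = [float("-inf")]   # lows[i] is the lower bound of pages[i]
    splits = 0
    for key in keys:
        i = bisect.bisect_right(lows, key) - 1  # page whose range covers key
        page = pages[i]
        if len(page) < capacity:
            bisect.insort(page, key)
            continue
        if i == len(pages) - 1 and key > page[-1]:
            # Ever-increasing key: start a fresh page at the end, nothing moves.
            pages.append([key])
            lows.append(key)
            continue
        # Out-of-sequence key on a full page: move half the rows to a new page.
        mid = capacity // 2
        right = page[mid:]
        del page[mid:]
        pages.insert(i + 1, right)
        lows.insert(i + 1, right[0])
        splits += 1
        bisect.insort(right if key > right[0] else page, key)
    return splits


def compare(n=5000, seed=1):
    """Return (splits with random keys, splits with sequential keys)."""
    rng = random.Random(seed)
    random_keys = rng.sample(range(10**9), n)  # stands in for newid()
    sequential_keys = range(n)                 # stands in for newsequentialid()
    return count_page_splits(random_keys), count_page_splits(sequential_keys)


if __name__ == "__main__":
    random_splits, sequential_splits = compare()
    print("random keys:", random_splits, "splits")
    print("sequential keys:", sequential_splits, "splits")
```

In this model the sequential keys never trigger a split, while the random keys split pages all over the index. Real numbers depend on row size, fill factor and workload, so treat it as an intuition aid, not a measurement.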

Adam Robinson
Why wouldn't a temp table be the same? Edit: I can think of logging behaviour.
Martin Smith
@Martin: All things being equal, it would, but to get a clear picture of how the table will *actually* perform means that you need all of the associated performance-impacting database objects as well (indexes, statistics, constraints, triggers, etc.).
Adam Robinson
One reason why a table variable would be horrible for this is that you can't create indexes, etc. Temp tables are better, but really, for testing actual tables you should have actual tables set up with exactly the constraints and indexes they will have.
HLGEM
+2  A: 

First of all, never ever cluster on a uniqueidentifier populated by newid(): the random values cause page splits and thus fragmentation. If you have to use a GUID, then do it like this:

create table #test (id uniqueidentifier primary key default newsequentialid())

newsequentialid() won't cause page splits

Still, an int is better as the PK, since all your nonclustered indexes and foreign keys will be smaller and you need less IO to get the same number of rows back.
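A back-of-the-envelope sketch of that size difference, in Python. The only hard facts used are that an int is 4 bytes and a uniqueidentifier is 16 bytes, and that the clustering key is carried in every nonclustered index row. The function name and parameters are made up for illustration, and the estimate ignores row overhead, page fill and compression.

```python
INT_BYTES = 4    # storage size of a SQL Server int
GUID_BYTES = 16  # storage size of a SQL Server uniqueidentifier


def extra_key_bytes(row_count, nonclustered_indexes, referencing_rows=0):
    """Rough extra bytes spent on key values when the PK is a GUID, not an int.

    The key is counted once per row in the table itself, once per row in
    each nonclustered index, and once per row in referencing (FK) tables.
    """
    per_key = GUID_BYTES - INT_BYTES
    copies = row_count * (1 + nonclustered_indexes) + referencing_rows
    return per_key * copies


if __name__ == "__main__":
    # Hypothetical table: one million rows, three nonclustered indexes.
    print(extra_key_bytes(1_000_000, 3), "extra bytes of key data")
```

More bytes per row means fewer rows per page, which is where the extra IO comes from.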

SQLMenace
Yessir, that's the beasty for which I'm looking for definitive perf metrics.
stack
Plus people hate to write queries using uniqueidentifier.
HLGEM
You get used to it. :-\
stack
A: 

I dunno why but I'd like to cite Remus Rusanu [1]:

First of all, you need to run the query repeatedly under each [censored] and average the result, discarding the one with the maximum time. This will eliminate the buffer warm up impact: you want all runs to be on a warm cache, not have one query warm the cache and pay the penalty in comparison.

Next, you need to make sure you measure under a realistic concurrency scenario. If you will have updates/inserts/deletes occurring in real life, then you must add them to your test, since they will tremendously impact the reads under various isolation levels. The last thing you want is to conclude 'serializable reads are fastest, let's use them everywhere' and then watch the system melt down in production because everything is serialized.

1) Running the query on a cold cache is not accurate. Your production queries will not run on a cold cache, so you'll be optimizing an unrealistic scenario, and you aren't measuring the query: you are really measuring the disk read throughput. You need to measure the performance on a warm cache as well, and keep track of both (cold run time, warm run times).

How relevant is the cache for a large query (millions of rows) that under normal circumstances runs only once for particular data? Still very relevant. Even if the data is so large that it never fits in memory and each run has to re-read every page of the table, there is still the caching of non-leaf pages (i.e. hot pages in the table, root or near root), the caching of narrower non-clustered indexes, and the caching of table metadata. Don't think of your table as an ISAM file.
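The measurement procedure quoted above (run repeatedly, throw away the slowest run, average the rest, and still keep the slow number around) could be sketched like this in Python. run_query is a hypothetical callable that executes your query through whatever driver you use. Treating the discarded maximum as the "cold" figure is my assumption, not something the quote states.

```python
import statistics
import time


def summarize(timings):
    """Return (slowest run, mean of the remaining runs)."""
    if len(timings) < 2:
        raise ValueError("need at least two runs to discard one")
    slowest = max(timings)
    kept = list(timings)
    kept.remove(slowest)  # drop exactly one run: the slowest
    return slowest, statistics.mean(kept)


def benchmark(run_query, runs=10):
    """Time run_query() `runs` times and summarize the results."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return summarize(timings)


if __name__ == "__main__":
    # Stand-in workload; replace with a function that runs the real query.
    slowest, warm_mean = benchmark(lambda: sum(range(100_000)), runs=5)
    print(f"slowest run: {slowest:.6f}s, mean of the rest: {warm_mean:.6f}s")
```

You would repeat this for each configuration under test, ideally while a realistic concurrent write workload is running, as the second quoted paragraph points out.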

[1] Why better isolation level means better performance in SQL Server
http://stackoverflow.com/questions/2450808/why-better-isolation-level-means-better-performance-in-sql-server

vgv8