Jimmy Nilsson discusses his COMB GUID concept here. The concept is popular in NHibernate, among other circles, for its supposed insert performance advantage over standard GUIDs, which are typically far more random.

However, in testing, this does not appear to be the case. Am I missing something?

Test case:

I have a table called temp (not a temp table, just a table named "temp") with 585,000 rows in it. I have a new table called Codes, and wish to copy all 585,000 code values from the temp table to the codes table. The test SQL I executed was:

set statistics time on;

-- Run 1: standard random GUIDs from NEWID()
truncate table codes;
DBCC DBREINDEX ('codes', '', 90);

insert into codes (codeid, codevalue)
select newid(), codevalue from temp;

-- Run 2: COMB GUIDs (10 random bytes from NEWID() + 6 bytes from the current datetime)
truncate table codes;
DBCC DBREINDEX ('codes', '', 90);

insert into codes (codeid, codevalue)
select CAST(CAST(NEWID() AS BINARY(10)) + CAST(GETDATE() AS BINARY(6)) AS UNIQUEIDENTIFIER), codevalue from temp;

Performance with standard GUID values:

SQL Server Execution Times: CPU time = 17250 ms, elapsed time = 15735 ms.

(585000 row(s) affected)

Performance with COMB GUID values:

SQL Server Execution Times: CPU time = 17500 ms, elapsed time = 16419 ms.

(585000 row(s) affected)

What am I missing? The COMB GUID values resulted in slightly longer times, presumably because of the additional conversions. I thought the point was to reduce insert time by semi-ordering the GUIDs, using the date for the last 6 bytes, but the performance gain appears non-existent.

+3  A: 

I'd suggest that you're not seeing the ordering benefit because the target table has no PK, so it's only the conversion overhead you're seeing. If it had a PK, the 585k rows would still have to be sorted on insert. How would SQL Server know they're semi-sorted?

Now, if it were 5,850 inserts of 100 rows each, then you might see some benefit, because the new rows would go "at the end" rather than "in the middle", reducing page splits and overhead.
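The batched scenario described above could be tested with something like the following. This is only a rough sketch: it assumes codes has a clustered primary key on codeid (the constraint name PK_codes is made up for illustration), it slices the source rows with a ROW_NUMBER() staging table (#numbered and IX_numbered_rn are hypothetical names), and it reuses the COMB expression from the question unchanged.

```sql
-- Assumption: codes has a clustered PK on codeid, e.g.
--   ALTER TABLE codes ADD CONSTRAINT PK_codes PRIMARY KEY CLUSTERED (codeid);
-- (PK_codes is a hypothetical name; the original table has no PK.)

-- Number the source rows once so they can be copied in slices.
SELECT ROW_NUMBER() OVER (ORDER BY codevalue) AS rn, codevalue
INTO #numbered
FROM temp;

CREATE UNIQUE CLUSTERED INDEX IX_numbered_rn ON #numbered (rn);

DECLARE @batch INT, @start INT, @max INT;
SET @batch = 100;   -- rows per insert statement
SET @start = 1;
SELECT @max = MAX(rn) FROM #numbered;

WHILE @start <= @max
BEGIN
    -- Each statement is a separate insert, so later batches pick up
    -- a later GETDATE() value in the last 6 bytes of the GUID.
    INSERT INTO codes (codeid, codevalue)
    SELECT CAST(CAST(NEWID() AS BINARY(10)) + CAST(GETDATE() AS BINARY(6)) AS UNIQUEIDENTIFIER),
           codevalue
    FROM #numbered
    WHERE rn >= @start AND rn < @start + @batch;

    SET @start = @start + @batch;
END;

DROP TABLE #numbered;
```

Swapping the COMB expression for a plain NEWID() in the same loop gives the comparison run. Batches that complete within the same datetime tick will share the same trailing bytes, so the ordering is only approximate.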

I'd go further and say that the article is dated 2002, and is for SQL 2000, and has been overtaken by real life.

In SQL Server 2005 we have sequential GUIDs (NEWSEQUENTIALID()), which give monotonically increasing GUIDs and solve some of these issues. GUID as PK has been discussed here too; a recent example: INT vs Unique-Identifier for ID field in database with 3rd party links.
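As an illustration of the sequential GUID option, here is a minimal sketch using NEWSEQUENTIALID(), which can only be used in a DEFAULT constraint. The table name codes_seq, the constraint names, and the VARCHAR(50) type for codevalue are assumptions, since the question does not show the table definition.

```sql
-- Hypothetical table: names and the codevalue type are assumptions.
CREATE TABLE codes_seq (
    codeid    UNIQUEIDENTIFIER NOT NULL
              CONSTRAINT DF_codes_seq_codeid DEFAULT NEWSEQUENTIALID()
              CONSTRAINT PK_codes_seq PRIMARY KEY CLUSTERED,
    codevalue VARCHAR(50) NOT NULL
);

-- codeid is omitted so the default generates increasing GUIDs.
INSERT INTO codes_seq (codevalue)
SELECT codevalue FROM temp;
```

Because the key is generated by the server, the client (or ORM) never supplies the GUID itself, which may or may not fit how a given ORM assigns identifiers.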

If an ORM dictates GUID as a PK rather than a natural key or standard int-based surrogate key, that's a severe limitation of the ORM. And a case of the client tail wagging the database dog.

gbn
A: 

Your code for generating new GUIDs isn't correct. For each row, it's creating a very different number (you call NEWID() for each row). You need to keep most of the GUID the same.

Rik Hemsley
A: 

I second that you'll see differences only when you have indexes (PK, FK or other kinds of indexes, clustered or non-clustered) on the GUID column, because the cost of a standard GUID versus a newguid or COMB GUID comes from the high cost of re-ordering the index data every time an insert is performed.

See my question in which I corroborate this with some real life data from both SQL Server and Oracle: StackOverFlow Question
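One way to see the index re-ordering cost, rather than just timing it, might be to check fragmentation after each insert strategy. The sketch below assumes SQL Server 2005 or later and that codes has at least one index on the GUID column; it uses the sys.dm_db_index_physical_stats function.

```sql
-- Variables are used for the arguments so this also works when the
-- database compatibility level does not allow function calls as arguments.
DECLARE @db INT, @obj INT;
SET @db  = DB_ID();
SET @obj = OBJECT_ID('codes');

-- One row per index (and per heap) on the table.
SELECT index_id,
       index_type_desc,
       avg_fragmentation_in_percent,
       page_count
FROM sys.dm_db_index_physical_stats(@db, @obj, NULL, NULL, 'LIMITED');
```

Running this after a random-GUID load and again after a COMB or sequential load should show whether the ordered keys are actually reducing page splits on the index.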

Regards, Massimo

massimogentilini