views:

31

answers:

0

I was pondering GUIDs as primary keys recently, and was reminded of the most egregious misuse of them I've ever encountered:

This database contained a lot of Entity-Detail parent-child relationships, like Receipt, which had LineItems. Most of the Detail tables (LineItem in this case) used GUID primary keys. But instead of being stored using MSSQL's uniqueidentifier type, they were stored as 38-character strings, in the form '{00000000-0000-0000-0000-000000000000}'. Oh, and they were almost always in nvarchar (Unicode) columns, clocking in at 76 bytes a piece (instead of 16 bytes for a uniqueidentifier).

And how often were these fields joined on? In almost every single query in the system. Hundreds of client databases, millions of records fitting this profile. Bad.

The system did not, to the best of my memory, precede SQL Server 7.0, when the uniqueidentifier was introduced. It was just a sheer failure of knowledge / research that led to this problem.

I have two questions:

  • How common, in your experience, is this anti-pattern?

  • It seems obvious that a join on a 76-byte Unicode string would be dramatically slower than a join on a 16-byte binary number, with indexes or without. But can anyone provide an idea of just what a performance hit this might entail? Assume you index the join columns in either scenario.