I am starting a new project with SQL Server and LINQ to SQL. Which data type would be better for surrogate keys in tables: identity or uniqueidentifier?

As I understand it, identity is generated automatically on insert, while a uniqueidentifier (GUID) has to be generated in code.

Are there any significant performance differences? For example, the identity value must be read back after the insert, so there is an extra database round trip after each insert.

Edit:
I found a very detailed answer in another question: Tables with no primary keys. Read the selected answer.

Edit 2:
Regarding the answers about surrogate keys: I prefer surrogate keys to natural keys. That decision is made, so please do not suggest reconsidering the database design. Also, please do not discuss the pros and cons of natural and surrogate keys.

A: 

I use identity and either int or bigint depending on expected table size. I expect that LINQ does a single query that combines both insert and key read.

tvanfosson
A: 

A uniqueidentifier can also be generated automatically, by setting the column's default value in the database to NEWID().

James Curran
+1  A: 

Either one is fine. It depends on your design. UNIQUEIDENTIFIER will work when you need to create the IDs on the client side, which can sometimes be nice when you're building up a list of objects that you'll need to add/remove/modify on the client before sending to the database. It's also better if you have to merge data created on different systems.
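
To illustrate the client-side case, here is a minimal sketch in Python rather than LINQ to SQL. The `Order` and `OrderLine` classes are hypothetical and no database is involved; the point is only that a GUID key lets you wire up parent and child objects before anything is sent to the server.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Order:
    # Hypothetical entity: the key is assigned on the client, before any insert.
    id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class OrderLine:
    # Hypothetical child entity that references its parent by key.
    order_id: uuid.UUID
    product: str
    id: uuid.UUID = field(default_factory=uuid.uuid4)


order = Order()
lines = [OrderLine(order.id, "widget"), OrderLine(order.id, "gadget")]

# The children already carry the parent's key, so the whole graph can be
# edited on the client and inserted later without reading any key back.
for line in lines:
    print(line.order_id == order.id, line.product)
```

With an identity column, the child rows could not be given their foreign key until the parent row had been inserted and its generated value read back.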

INTs are smaller, so they're probably going to be faster, and they're a heck of a lot easier to use when you're debugging something with your data because they're easier to discuss.

So in short, there's no "best" answer here. It's a design decision.

Dave Markle
+1  A: 

I don't think it's a good idea to choose your table's primary key based on the fact that you're using LINQ-to-SQL. In my opinion, it would be much better to use the primary key for its intended purpose - to ensure data integrity by uniquely identifying a particular row. Integer or GUID PKs do not protect you against having duplicate rows in your table (aside from the auto-generated columns, of course). Then again, there's been a big debate raging about the pros and cons of auto-generated keys :)

+2  A: 

I agree with a comment here that you should design your db structure independently from your data access method. Whether it's LINQ2SQL or ADO.NET or NHibernate, you will get the same set of benefits/problems, whether your PK is autoincrement identity or GUID.

I can actually think of only one reason to use a GUID rather than an INT as the PK: a highly distributed application with several databases where data needs to be synchronized with a master server, etc. Even then you could use IDENTITY with different seeds and a shared increment, e.g. IDENTITY(1,2) on one DB server, IDENTITY(2,2) on the next one, etc. (this works only with a fixed number of DB servers, of course).
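
A quick way to convince yourself that the two-server scheme does not collide is to model the value streams. This is only a sketch in Python: the `identity` helper is a made-up stand-in that mimics the sequence an IDENTITY(seed, increment) column would hand out, not anything SQL Server provides.

```python
from itertools import count, islice


def identity(seed, increment):
    """Mimic the value stream of an IDENTITY(seed, increment) column."""
    return count(seed, increment)


# Two servers, increment 2, seeds 1 and 2.
server_a = list(islice(identity(1, 2), 5))
server_b = list(islice(identity(2, 2), 5))

# Keys generated on both servers; empty if the streams never collide.
overlap = set(server_a) & set(server_b)

print(server_a)  # [1, 3, 5, 7, 9]
print(server_b)  # [2, 4, 6, 8, 10]
print(overlap)   # set()
```

With N servers the increment has to be N and the seeds 1..N, which is why adding a server later is the awkward part.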

The main "problem" with a GUID is that it takes more space (16 bytes compared to 4 for INT or 8 for BIGINT), not only as the PK, but also as part of every other index on that table, since each nonclustered index implicitly stores the clustered key, which is the PK by default. Comparing 16 bytes is going to be slower too (how much? It depends, so you should measure it).
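
For a rough sense of scale, here is back-of-the-envelope arithmetic in Python. The row count and number of nonclustered indexes are invented for illustration, and the calculation counts only the key bytes themselves, ignoring row overhead, page fill and compression.

```python
# Assumed, illustrative figures (not from any real system).
ROWS = 10_000_000
NONCLUSTERED_INDEXES = 3

KEY_SIZES = {"int": 4, "bigint": 8, "uniqueidentifier": 16}


def key_bytes(rows, key_size, nonclustered_indexes):
    """Bytes spent on the key: once in the table, once per nonclustered index."""
    return rows * key_size * (1 + nonclustered_indexes)


totals = {
    name: key_bytes(ROWS, size, NONCLUSTERED_INDEXES)
    for name, size in KEY_SIZES.items()
}

for name, total in totals.items():
    print(f"{name:>16}: {total / 1_000_000:.0f} MB")
```

Under these assumptions the key alone costs 160 MB as INT, 320 MB as BIGINT and 640 MB as uniqueidentifier.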

Also, since GUIDs are random, you can get a lot of page splits when inserting rows into a table (though I believe there is a new function in SQL Server 2005 or 2008 that generates sequential GUIDs). This alone can be quite costly in an environment with a lot of inserts.
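
The effect can be sketched with a toy model in Python. It does not simulate SQL Server pages: it keeps a sorted list as a stand-in for a clustered index and counts how many new keys land somewhere before the current end, which is the situation that can force a page split. Ordering GUIDs by their raw bytes is an assumption made for simplicity.

```python
import bisect
import uuid


def mid_inserts(keys):
    """Count keys that land before the current end of a sorted index."""
    index, landed_in_middle = [], 0
    for key in keys:
        pos = bisect.bisect_right(index, key)
        if pos < len(index):
            landed_in_middle += 1
        index.insert(pos, key)
    return landed_in_middle


# Ever-increasing keys, like an identity column: always appended at the end.
sequential = mid_inserts(range(1, 1001))

# Random GUIDs: almost every insert falls somewhere in the middle.
random_guid = mid_inserts(uuid.uuid4().bytes for _ in range(1000))

print(sequential, random_guid)
```

Sequential keys give 0 mid-index inserts, while random GUIDs put nearly all of the 1000 inserts in the middle.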

liggett78
Yes, the new function in SQL 2005 is NEWSEQUENTIALID(), which alleviates just this problem. But 16 bytes is a lot when multiplied by millions (?) of rows...
Dave Markle
Check the article I linked to when I edited the question.
zendar
A: 

OK, I just wrote a blog post about this. I am reposting it below for others' benefit.

The basic gist appears to be comments made on the ADO.NET blog stating that the Entity Framework is the only thing getting major developer time for Visual Studio 2010 and .NET 4.

We have known this

My response is - DUH. We have all known this. Microsoft said publicly back at PDC 2007 that LINQ to SQL was a short-term release for SQL Server because there was no other LINQ story for SQL Server. It only works with SQL Server. You cannot write a LINQ to SQL provider - there is no model for it. It was a one-off technology, not extensible.

Entity Framework = Provider

The Entity Framework is the ONLY way from Microsoft to build a LINQ provider. The Entity Framework has turned out to be quite controversial, but I think that is partly due to the fact that LINQ to SQL has a better programmer experience today. Entity Framework will catch up with and surpass LINQ to SQL because it is the ORM/mapping tool of the future from Microsoft.

I think that Microsoft has a huge investment in Entity Framework with third-party vendors like us, IBM, Oracle, etc. If they want third-party developers and databases to support LINQ, they have to have a model to write to. Entity Framework is that answer.

I saw this coming last year when they released LINQ to SQL with VS 2008 and not the EF. They should never have released a closed implementation of something with so much overlap. I know they wanted a relational database story, but I think they told the wrong one. Now there are a lot of seriously confused developers who come asking us for our LINQ to SQL provider. There isn't one because you can't write one.

Jason Short