views:

983

answers:

8

Hi

We use Guid's for primary key, which you know is clustered by default.

When inserting a new row into a table it is inserted at a random page in the table (because Guid's are random). This has a measurable performance impact because the DB will split data pages all the time (fragmentation). But the main reason I what a sequential Guid is because I want new rows to be inserted as the last row in the table... which will help when debugging.

I could make a clustered index on CreateDate, but our DB is auto generated and in development, we need to do something extra to facilitate this. Also CreateDate is not a good candidate for a clustered index.

Back in the day I used Jimmy Nielsons COMB's, but I was wondering if there is something in the .NET framework for this. In SQL 2005 Microsoft introduced newsequentialid() as an alternative to newid(), so I was hoping that they made a .NET equivalent, because we generate the ID in the code.

PS: Please don't start discussing if this is right or wrong, because GUID's should be unique etc.

A: 

The key problem is knowing what the last value was in a .NET application. SQL Server keeps track of this for you. You will need to hold the last value yourself and use the Guid constructor with a byte array containing the next value. Of course, on a distributed application this probably isn't going to help and you may have to use the randomised Guids. (Not that I see anything wrong with this.)

http://msdn.microsoft.com/en-us/library/90ck37x3.aspx

BlackWasp
I didn't realize what the newsequentialId() is really doing.I don't need them to be 100% sequential. I just need the guid created at one point in time to have a lower sorting than one created later.Jimmy's COMB is using current time to do this and therefore I don't have to keep track of the ids.
Thomas Jespersen
+5  A: 

Perhaps a simple way to determine the order in which rows have been added would be to add an IDENTITY column to the table, avoiding the need to keep your GUIDS in order and hence avoiding the performance hit of maintaining a clustered index on the GUID column.

I can't help but wonder how keeping these rows in order helps you when debugging. Could you expand that a bit?

Ed Guiness
Wouldn't that add an extra column to my table? A column which is of no other use, and not a good choice for a clustered index!Regarding debugging: When I select * form my table I get the rows in random order. If I get the latest row in the bottom, I don't have do do any sorting every time I do so.
Thomas Jespersen
Why the fixation on having a Clustered Index? Please explain how this helps you with debugging. SELECT * FROM TABLE ORDER BY my_ident_column is not so hard, is it?
Ed Guiness
The standard way of handling this is to have an IDENTITY column. Yes, it'll add an extra column to your table, but that shouldn't be a big deal. It takes up 4 to 8 bytes a row depending on your setup. It's actually the perfect choice for a clustered index!
nathaniel
+1  A: 

Unfortunatley, no there isn't a .NET equivalent to newsequentialid(). You could continue using a Comb. I actually have a C# implementation of a Comb somewhere...I'll see if I can dig it up.

Scott Dorman
+1  A: 

By insisting on having your rows "in order" you are subverting the important principle that tables are conceptually an unordered set. Order can be imposed of course, via "ORDER BY" arguments and indexes, but by trying to force your database to perpetually maintain rows in a state that matches your (incorrect) mental model you are going to continue to struggle.

Once more I appeal for you to please explain how keeping rows in a physical order helps you with debugging. It might be that there is an easier way to achieve your debugging goals.

Ed Guiness
an unordered set? yes perhaps, but they are also paged and an order can mean that you need less disk io because your rows are in the same page which they perhaps aren't when using a normal guid. (this is the short explanation)
chrissie1
@chrissie1, sure, and that's why a clustered index can be helpful for performance. But @Thomas hasn't yet explained how this physical ordering helps him with debugging, and this appears to be at the heart of his question.
Ed Guiness
The heart of the problem is performance. I might have overstated the point of debugging. It is just that I like to be able to open a table, scrool to the bottom and se if a row is inserted or not.
Thomas Jespersen
+6  A: 

It should be possible to create a sequential GUID in c# or vb.net using an API call to UuidCreateSequential. The API deceleration (C#) below has been taken from Pinvoke.net where you can also find a full example of how to call the function.

[DllImport("rpcrt4.dll", SetLastError=true)]
static extern int UuidCreateSequential(out Guid guid);

The MSDN article related to the UuidCreateSequential function can be found here which includes the prerequisites for use.

John
I marked this as the correct answer. This is not what I was after though. I wanted a COMB, and I orginally thougt that SequentialId's was the same as comb. That is... the Question my description is not aligned. But it seams that this is the correct answer to the question.
Thomas Jespersen
A: 

I've been lead to believe that random Guids can be beneficial to performance in some use cases. Apparently inserting to random pages can avoid contention that would otherwise occur in the end page when multiple people are trying to insert at the same time.

John's PInvoke suggestions is probably the closest to SQL's version but the UUidCreateSequential docs state that you shouldn't use it to identify an object that it's strictly local to the machine generating the Guid.

I'd be measuring the actual use case performance hit with realistic data in realistic quantities before I investigated sequential Guid generation any further.

Jason Stangroome
I think "avoiding contention" in the last page pales in comparison to the complete and utter disaster this makes of any attempts to index a table using the uuid as the clustered index.
Mel
+1  A: 

Here is the C# code to generate a COMB GUID.

byte[] guidArray = System.Guid.NewGuid().ToByteArray();

            DateTime baseDate = new DateTime(1900, 1, 1);
            DateTime now = DateTime.Now;

            // Get the days and milliseconds which will be used to build the byte string 
            TimeSpan days = new TimeSpan(now.Ticks - baseDate.Ticks);
            TimeSpan msecs = new TimeSpan(now.Ticks - (new DateTime(now.Year, now.Month, now.Day).Ticks));

            // Convert to a byte array 
            // Note that SQL Server is accurate to 1/300th of a millisecond so we divide by 3.333333 
            byte[] daysArray = BitConverter.GetBytes(days.Days);
            byte[] msecsArray = BitConverter.GetBytes((long)(msecs.TotalMilliseconds / 3.333333));

            // Reverse the bytes to match SQL Servers ordering 
            Array.Reverse(daysArray);
            Array.Reverse(msecsArray);

            // Copy the bytes into the guid 
            Array.Copy(daysArray, daysArray.Length - 2, guidArray, guidArray.Length - 6, 2);
            Array.Copy(msecsArray, msecsArray.Length - 4, guidArray, guidArray.Length - 4, 4);

            return new System.Guid(guidArray);
Donny V.
I marked this as the "accepted answer" even though I haven't tested it. But it looks right. If anyone can tell me that this is wrong please, write a comment and I will undo this.PS: It seams to me that many don't se the benefit of GUID's as primary keys. Here is a two:1. When createing objects like Order and OrderLine in the same Unit Of Work, one can set the OrderLine.OrderId before ever talking to the database. Thus mark OrderId as immutable.2. If you have a project where you need to merge or migrate databases guid are great because you don't have to do integer manipulation.
Thomas Jespersen
I use this in my current project ComicsInventory.com and it works like a charm. Also I believe another benefit of using GUIDs is that you can use them in your client side code and not have to worry about someone figuring out the next id.
Donny V.
This does not create a *globally* unique ID because it does not involve the network card. John's answer below is correct.
joshcomley
A: 

About the selected answer. The docs says... The generated Guid will not give you uniqueId between computers if they don't have ehternet access.

If you must know the guid when inserting, couldn't you let Sql-server return a block of sequential guids that you assign to your data before you insert them?

declare @ids table(id uniqueidentifier default NEWSEQUENTIALID(), dummy char(1))

declare @c int set @c = 0; while (@c < 100) begin insert into @ids (dummy) values ('a'); set @c += 1; end

select id from @ids

//Daniel

Daniel