Sequential Guid and fragmentation

+2 Q:

I'm trying to understand why a sequential GUID performs better than a regular GUID.

Is it because, with a regular GUID, the index uses the last byte of the GUID to sort? Since the value is random, it will cause a lot of fragmentation and page splits, because existing data often has to be moved to another page to make room for new rows?

With a sequential GUID, since it is sequential, there will be far fewer page splits and much less fragmentation?

Is my understanding correct?

If anyone can shed more light on the subject, I'd appreciate it very much.

Thank you

EDIT:

Sequential guid = NEWSEQUENTIALID(),

Regular guid = NEWID()

+2 A:

I defer to Kimberly L. Tripp's wisdom on this topic:

But, a GUID that is not sequential - like one that has it's values generated in the client (using .NET) OR generated by the newid() function (in SQL Server) can be a horribly bad choice - primarily because of the fragmentation that it creates in the base table but also because of its size. It's unnecessarily wide (it's 4 times wider than an int-based identity - which can give you 2 billion (really, 4 billion) unique rows). And, if you need more than 2 billion you can always go with a bigint (8-byte int) and get 2^63-1 rows.
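To put numbers on the sizes in that quote, here is a quick arithmetic sketch. The byte widths are the standard ones for a 16-byte GUID, a 4-byte int and an 8-byte bigint; the variable names are just for illustration.

```python
import uuid

# A GUID is 16 bytes; Python's uuid module uses the same width.
GUID_BYTES = len(uuid.uuid4().bytes)
INT_BYTES = 4      # 4-byte signed integer
BIGINT_BYTES = 8   # 8-byte signed integer

width_ratio = GUID_BYTES // INT_BYTES   # how many times wider a GUID is than an int

int_max = 2**31 - 1       # largest positive int: the "2 billion"
int_distinct = 2**32      # all int values, negatives included: the "4 billion"
bigint_max = 2**63 - 1    # largest positive bigint

print(width_ratio)    # 4
print(int_max)        # 2147483647
print(int_distinct)   # 4294967296
print(bigint_max)     # 9223372036854775807
```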

Joe Stefanelli 2010-08-10 14:37:44

With a sequential GUID, since it's sequential, all new index pages will in theory be allocated in order, which improves performance. Am I understanding that correctly? With a regular GUID the records will be all over the place since it's random, so a regular SELECT statement with a range can be slow?

pdiddy 2010-08-10 14:45:40

@pdiddy - well since the "random" GUIDs aren't sequential why would you select for a range of them?

JNK 2010-08-10 14:51:01

You are correct about the sequential vs. random nature; however, the real pain is on INSERT rather than on SELECT.

Joe Stefanelli 2010-08-10 14:53:23

I see, the insert is a real pain because of the page splits and the fragmentation. But a select with joins will also be a pain, right, since it has to do a lot of seeks? I mean joining with tables whose PKs are also regular GUIDs.

pdiddy 2010-08-10 14:57:20

+2 A:

You've pretty much said it all in your question.

With a sequential GUID / primary key, new rows are added together at the end of the table, which makes things nice and easy for SQL Server. In comparison, a random primary key means that new records could be inserted anywhere in the table. The last page of the table is fairly likely to be in the cache (if that's where all of the reads are going), whereas a random page in the middle of the table is fairly unlikely to be cached, meaning additional IO is required.
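As a rough illustration of the caching point, the sketch below counts how many distinct pages a batch of inserts lands on. It is a toy model, not SQL Server's storage engine: the page capacity, the table size and the idea of addressing a page as `position // rows_per_page` are all assumptions made for the example.

```python
import random

ROWS_PER_PAGE = 100      # hypothetical page capacity
EXISTING_ROWS = 100_000  # rows already in the table (hypothetical)
NEW_ROWS = 1_000         # size of the insert batch (hypothetical)


def pages_touched(positions, rows_per_page=ROWS_PER_PAGE):
    """Count the distinct pages that a batch of inserts lands on."""
    return len({pos // rows_per_page for pos in positions})


# Sequential keys: every new row sorts after all existing rows.
sequential_positions = [EXISTING_ROWS + i for i in range(NEW_ROWS)]

# Random keys: each new row sorts at a uniformly random spot among existing rows.
rng = random.Random(0)
random_positions = [rng.randrange(EXISTING_ROWS) for _ in range(NEW_ROWS)]

sequential_pages = pages_touched(sequential_positions)
random_pages = pages_touched(random_positions)
print(sequential_pages, random_pages)
```

In this model the sequential batch fills 10 pages one after another at the end of the table, while the random batch is scattered over several hundred pages, each of which would have to be in the cache or be read from disk.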

On top of that, when inserting rows into the middle of the table there is a chance that the target page doesn't have enough room for the extra row. If so, SQL Server needs to perform additional, expensive IO operations to create room for the record (a page split). The only way to avoid this is to leave gaps scattered among the data so that extra records can be inserted (controlled by the fill factor), which itself causes performance issues because the data is spread over more pages, so more IO is required to access the entire table.
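The page split behaviour can be sketched with a small simulation. This is a simplified model under stated assumptions, not how SQL Server is actually implemented: pages are sorted Python lists with a tiny hypothetical capacity, a full page is split in half when a key lands in the middle, and an insert past the end of the last page just starts a new page. Integers stand in for GUID values: a counter for NEWSEQUENTIALID() and random 128-bit numbers for NEWID().

```python
import bisect
import random

PAGE_CAPACITY = 8  # rows per page; hypothetical, real pages hold far more


def simulate_inserts(keys, capacity=PAGE_CAPACITY):
    """Insert keys into sorted fixed-size pages; return (splits, avg fullness)."""
    pages = [[]]  # each page is a sorted list of keys
    splits = 0
    for key in keys:
        # Locate the page whose key range should hold this key.
        firsts = [p[0] for p in pages[1:]]
        idx = bisect.bisect_right(firsts, key)
        page = pages[idx]
        if len(page) < capacity:
            bisect.insort(page, key)
            continue
        if idx == len(pages) - 1 and key > page[-1]:
            # Appending past the end: start a fresh page, no rows move.
            pages.append([key])
            continue
        # Page is full and the key belongs inside it: split it in half.
        splits += 1
        mid = capacity // 2
        left, right = page[:mid], page[mid:]
        pages[idx:idx + 1] = [left, right]
        bisect.insort(left if key < right[0] else right, key)
    fullness = sum(len(p) for p in pages) / (len(pages) * capacity)
    return splits, fullness


N = 2000
rng = random.Random(42)
sequential_keys = list(range(N))                       # stand-in for NEWSEQUENTIALID()
random_keys = [rng.getrandbits(128) for _ in range(N)]  # stand-in for NEWID()

seq_splits, seq_fullness = simulate_inserts(sequential_keys)
rnd_splits, rnd_fullness = simulate_inserts(random_keys)
print("sequential:", seq_splits, seq_fullness)
print("random:    ", rnd_splits, rnd_fullness)
```

With sequential keys the model never splits a page and every page ends up full. With random keys it splits repeatedly and leaves pages partly empty, so the same rows are spread over more pages.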

Kragen 2010-08-10 14:43:39

Thank you! I just wanted to confirm that I understood it correctly.

pdiddy 2010-08-10 14:46:33
