If I am going to be querying a table by GUID (regardless of the fragmentation problems with GUIDs), would it be faster to have the GUID as the clustered index rather than a non-clustered index or no index at all?

This question is from a read-only standpoint. I'm just curious whether there is a speed difference when searching for the row with a specific GUID: will the search complete faster with or without an index, and with a clustered or a non-clustered index?

Alternatively (and I'm fairly certain of the answer to this one), apply the previous question to int identifiers: will it be faster to search if the table is clustered by that int rather than by some other column in the table?




I know there are many other questions posted on this topic, but I haven't found the specific answer that I'm looking for in any of these:
http://stackoverflow.com/questions/1757222/should-a-sequential-guid-primary-key-column-be-a-clustered-index
http://stackoverflow.com/questions/583001/improving-performance-of-cluster-index-guid-primary-key
http://stackoverflow.com/questions/713430/clustered-primary-key-on-unique-identifier-id-column-in-sql-server
http://stackoverflow.com/questions/967956/uniqueidentifier-with-index
http://stackoverflow.com/questions/277625/should-i-get-rid-of-clustered-indexes-on-guid-columns

Thanks for any help!

+2  A: 

A table will generally query faster with an integer clustered index than with a GUID one. The reason is the size of the data type: an int is 4 bytes and a uniqueidentifier is 16, so more int keys fit on each index page.

If you have already decided to use GUIDs as the key, consider generating them with NEWSEQUENTIALID() instead of NEWID(). This reduces fragmentation in GUID indexes, because each new ID is greater than the previous one (at least until the server restarts), so page splits are much less likely.
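To make the page-split point concrete, here is a toy Python simulation rather than anything SQL Server actually runs. It assumes a leaf level made of fixed-capacity sorted pages, models NEWID() as random 128-bit integers and NEWSEQUENTIALID() as an increasing counter, and uses a made-up capacity of 100 keys per page. The function name and the numbers are hypothetical.

```python
import bisect
import random

PAGE_CAPACITY = 100  # hypothetical number of keys per leaf page


def simulate_inserts(keys, capacity=PAGE_CAPACITY):
    """Insert keys one by one into sorted leaf pages.

    Returns (number of page splits, average page fill between 0 and 1).
    """
    pages = [[]]  # each page is a sorted list of keys
    lows = []     # lows[i] is the smallest key stored on pages[i + 1]
    splits = 0
    for key in keys:
        i = bisect.bisect_right(lows, key)  # page whose key range covers key
        page = pages[i]
        if len(page) < capacity:
            bisect.insort(page, key)
            continue
        if i == len(pages) - 1 and key > page[-1]:
            # Key belongs after everything stored so far: start a new page
            # at the end instead of splitting the full one.
            pages.append([key])
            lows.append(key)
            continue
        # Page is full and the key lands in the middle: split it in half.
        mid = capacity // 2
        right = page[mid:]
        del page[mid:]
        pages.insert(i + 1, right)
        lows.insert(i, right[0])
        splits += 1
        bisect.insort(right if key >= right[0] else page, key)
    fill = sum(len(p) for p in pages) / (len(pages) * capacity)
    return splits, fill


def sequential_keys(n):
    """Stand-in for NEWSEQUENTIALID(): strictly increasing values."""
    return list(range(n))


def random_keys(n, seed=42):
    """Stand-in for NEWID(): random 128-bit values."""
    rng = random.Random(seed)
    return [rng.getrandbits(128) for _ in range(n)]


if __name__ == "__main__":
    for name, keys in (("sequential", sequential_keys(20000)),
                       ("random", random_keys(20000))):
        splits, fill = simulate_inserts(keys)
        print(f"{name}: splits={splits}, average fill={fill:.2f}")
```

In this model the increasing keys never split a page and leave every page full, while the random keys split pages constantly and leave them partly empty. Real fragmentation depends on fill factor, row size, and maintenance, so treat the output as an illustration only.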

To add to that: the key is a natural choice for the clustered index unless another column is a better candidate, for example if you are not using the GUID as the key. Only on a relatively small table is going without an index a reasonable option; otherwise it is nearly always worth having one.

Baaju
+1  A: 

Assuming MS SQL Server. This may or may not apply to other RDBMSs:

Searching by the clustered index will be fastest, although if you're searching for a single row then the difference between that and a non-clustered index will be negligible. When you use a non-clustered index the server first needs to find the right value in the index and then fetch the full record from the table storage. The table storage is the clustered index, so searching by the clustered index eliminates that step (called a Bookmark Lookup, or a Key Lookup in newer versions), but that step is almost imperceptible for a single row.

Clustered indexes tend to provide a bigger advantage for reading when they are on a column that is selected by range (for example, a transaction date column when you want all transactions for the past month). In that case the server can find the start of the range and just read off the data in one quick, sequential sweep.
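A rough way to see why the lookup step hardly matters for one row but adds up over a range is to count page reads. The Python sketch below is a back-of-the-envelope model, not how SQL Server costs a plan: the function names, index depth (3), rows per data page (50), and index entries per page (300) are all assumptions picked for illustration, and it assumes every lookup walks the clustered index from the root.

```python
import math


def clustered_reads(matching_rows, rows_per_page, depth):
    """Page reads for a seek on the clustered index.

    One root-to-leaf traversal, then a sequential sweep over any
    additional leaf pages that hold the matching rows.
    """
    extra_leaf_pages = max(0, math.ceil(matching_rows / rows_per_page) - 1)
    return depth + extra_leaf_pages


def nonclustered_reads(matching_rows, entries_per_page, nc_depth, clustered_depth):
    """Page reads for a seek on a non-clustered index plus lookups.

    Same traversal and sweep on the smaller index, then one lookup into
    the clustered index for every matching row.
    """
    extra_leaf_pages = max(0, math.ceil(matching_rows / entries_per_page) - 1)
    lookups = matching_rows * clustered_depth
    return nc_depth + extra_leaf_pages + lookups


if __name__ == "__main__":
    for rows in (1, 10000):
        c = clustered_reads(rows, rows_per_page=50, depth=3)
        n = nonclustered_reads(rows, entries_per_page=300,
                               nc_depth=3, clustered_depth=3)
        print(f"{rows} row(s): clustered={c} reads, non-clustered={n} reads")
```

Under these assumptions a single-row search costs a handful of reads either way, while a 10,000-row range is far cheaper on the clustered index. In practice the optimizer may simply scan the table instead of doing that many lookups.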

Having a non-clustered index on an INT (all other things being equal) will be slightly faster than using a GUID because the index itself will be smaller (because INTs are much smaller than GUIDs) which means that the server has to traverse fewer pages to find the value that it's looking to get. In the case of a clustered index I don't think that you'll see much of a difference if your row sizes are already large compared to the difference between a GUID and an INT, but I haven't done any testing on that.
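As a rough sanity check on the size argument, the Python sketch below estimates how many pages and levels an index needs for a 4-byte INT key versus a 16-byte GUID key. It assumes 8 KB pages with 8,096 usable bytes and a hypothetical 11 bytes of per-entry overhead, and it ignores fill factor and the extra columns a real index entry carries, so the absolute numbers are only indicative.

```python
import math

USABLE_PAGE_BYTES = 8096  # 8 KB page minus a 96-byte header
ENTRY_OVERHEAD = 11       # hypothetical per-entry overhead in bytes


def index_shape(row_count, key_bytes):
    """Return (leaf_pages, levels) for an index on a key of key_bytes."""
    entries_per_page = USABLE_PAGE_BYTES // (key_bytes + ENTRY_OVERHEAD)
    leaf_pages = math.ceil(row_count / entries_per_page)
    levels = 1
    pages = leaf_pages
    # Add one level at a time until a single root page remains.
    while pages > 1:
        pages = math.ceil(pages / entries_per_page)
        levels += 1
    return leaf_pages, levels


if __name__ == "__main__":
    rows = 10_000_000
    for name, size in (("INT", 4), ("GUID", 16)):
        leaf_pages, levels = index_shape(rows, size)
        print(f"{name}: {leaf_pages} leaf pages, {levels} levels")
```

With these assumptions the GUID index needs noticeably more leaf pages than the INT index for ten million rows, yet both come out at the same number of levels. That fits the "slightly faster" wording above: a single seek touches about the same number of pages, and the INT index mainly wins on total size and cache footprint.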

Tom H.