I am struggling to understand what a clustered index in SQL Server 2005 is. I have read the MSDN article "Clustered Index Structures" (among other things), but I am still unsure whether I understand it correctly.
The (main) question is: what happens if I insert a row (with a "low" key) into a table with a clustered index?
The above-mentioned MSDN article states:
The pages in the data chain and the rows in them are ordered on the value of the clustered index key.
And "Using Clustered Indexes", for example, states:
For example, if a record is added to the table that is close to the beginning of the sequentially ordered list, any records in the table after that record will need to shift to allow the record to be inserted.
Does this mean that if I insert a row with a very "low" key into a table that already contains a gazillion rows, literally all rows are physically shifted on disk? I cannot believe that. This would take ages, no?
Or is it rather (as I suspect) that there are two scenarios, depending on how "full" the first data page is:
- A) If the page has enough free space to accommodate the record, it is placed into the existing data page, and data might be (physically) reordered within that page.
- B) If the page does not have enough free space for the record, a new data page is created (anywhere on the disk!) and "linked" to the front of the leaf level of the B-Tree?
This would then mean that the "physical order" of the data only holds at the "page level" (i.e. within a data page), and that the pages themselves do not have to reside on consecutive blocks on the physical hard drive. The data pages are then just linked together in the correct order.
Or, formulated in an alternative way: if SQL Server needs to read the first N rows of a table that has a clustered index, it can read the data pages in key order (following the links), but these pages are not (necessarily) stored in sequence block-wise on disk (so the disk head has to move "randomly").
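To make my mental model concrete, here is a toy Python sketch of scenarios A and B. It is only an illustration of what I suspect, not how SQL Server is actually implemented: the names `Page`, `ToyClusteredTable` and `PAGE_CAPACITY` are made up, a page holds a handful of keys instead of 8 KB of rows, the right page is found by walking the chain instead of descending a B-Tree, and I am assuming that a full page is split roughly in half when a new page is linked in.

```python
from bisect import insort

PAGE_CAPACITY = 4  # hypothetical: keys per page (a real data page is 8 KB)


class Page:
    def __init__(self, page_id):
        self.page_id = page_id  # stands in for the physical location on disk
        self.keys = []          # kept in sorted order within the page
        self.next = None        # link to the next page in key order


class ToyClusteredTable:
    def __init__(self):
        self.pages = [Page(0)]  # position in this list == "physical" position
        self.first = self.pages[0]

    def _find_page(self, key):
        # Walk the page chain to the last page whose first key is <= key.
        # (A real B-Tree would descend from the root instead.)
        page = self.first
        while page.next is not None and page.next.keys and key >= page.next.keys[0]:
            page = page.next
        return page

    def insert(self, key):
        page = self._find_page(key)
        if len(page.keys) < PAGE_CAPACITY:
            # Scenario A: room on the page, only this page is reordered.
            insort(page.keys, key)
            return
        # Scenario B: page is full. Allocate a new page at the next free
        # "physical" slot, move the upper half of the keys there (assumed
        # split strategy), and link it into the chain after the old page.
        new_page = Page(len(self.pages))
        self.pages.append(new_page)
        half = len(page.keys) // 2
        new_page.keys = page.keys[half:]
        page.keys = page.keys[:half]
        new_page.next = page.next
        page.next = new_page
        target = new_page if key >= new_page.keys[0] else page
        insort(target.keys, key)

    def scan(self):
        # Read pages in logical (key) order by following the links.
        page = self.first
        while page is not None:
            yield page.page_id, list(page.keys)
            page = page.next


table = ToyClusteredTable()
for key in (10, 20, 30, 40, 50, 60, 70, 80, 5, 11, 1):
    table.insert(key)

for page_id, keys in table.scan():
    print(page_id, keys)
```

In this toy model, inserting the "low" key 1 at the end only touches the first page and one newly allocated page. Following the links then visits the pages in the order 0, 3, 1, 2: the keys come back sorted, but the "physical" page positions are not sequential, which is exactly the situation I describe above.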
How close am I? :)