views:

556

answers:

4
create nonclusterd  INDEX index_name 
    ON <object> ( column [ ASC | DESC ] [ ,...n ] ) 
    [ INCLUDE ( column_name [ ,...n ] ) ]
    [ WHERE <filter_predicate> ]

In the above syntax we will specify non-key columns in the INCLUDE,what is the advantage in specifying non-key columns? Please help me.

Thanks in advance

+1  A: 

This will store the data for the column with the index node for fast access on queries that use the index.

Reference:

INCLUDE ( column [ ,... n ] )

Specifies the non-key columns to be added to the leaf level of the nonclustered index. The nonclustered index can be unique or non-unique.

tvanfosson
+2  A: 

When an index is used, it can either lead to the location of the complete record, or all the fields you need can be stored in the index itself (so you don't need to go to complete record.)

In the first case, you have two reads best case, one to read the index, and one to read the complete record.

In the second case, you get everything you need from reading the index.

You can INCLUDE extra columns to be stored with the index if, together with the field(s) comprising the index, you have complete queries satisfied without needing the extra read for the row.

This is not an issue for the CLUSTERED index because reading the index is the same as reading the entire row.

There is an especially big advantage if you are reading several index keys in a sequence (e.g. SELECT ... FROM ... WHERE keys BETWEEN n1 AND n2) because the index reads will likely be stored adjacent to each other and can be read with perhaps only one or two physical sectors; and not needing to find the rest of the records provides more leverage.

le dorfier
Nice explanation of why it doesn't apply to the clustered index
Andomar
A: 

It removes the need to do a key lookup if the query uses the included column. Example.

select ssn, firstname from myusers where ssn='111-11-1111'

An appropriate index would look like this

create index idx_user_ssn nonclustered on myusers(ssn) include(firstname)
Al W
Thank you very much
Anoop
+2  A: 

Let's assume you have an employee table like so:

CREATE TABLE Employee(EmployeeID INT IDENTITY(1,1) PRIMARY KEY,
                      LastName VARCHAR(50),
                      FirstName VARCHAR(50),
                      HireDate DATETIME,
                      Salary DECIMAL)

You would have the primary clustered key on EmployeeID, and possibly a non-clustered key on (LastName,FirstName) in order to be able to find employees by name.

CREATE INDEX NameIndex ON Employee(LastName ASC, FirstName ASC)

Now if you need to find "Joe Murphy" and retrieve his hire date and salary, what happens is an index seek in your name-based non-clustered key (which is good), but then in order to fetch the hire date and salary, SQL Server needs to do a so-called bookmark lookup into the actual table data to get the record for Joe Murphy. This will most likely incur one or several physical disk accesses (which is bad in terms of performance).

HOWEVER: if your name-based non-clustered index also specifies "INCLUDE (HireDate, Salary)":

CREATE INDEX NameIndex ON Employee(LastName ASC, FirstName ASC)
       INCLUDE (HireDate, Salary)

then SQL Server is done once it's looked up Joe Murphy in the non-clustered name index --> all the fields to satisfy your query are in the non-clustered index, so there's no more need to do a disk-intensive bookmark lookup and your queries will be potentially much faster.

The downside of INCLUDE columns are increased disk-space need by non-clustered indexes, since they will have the included columns in their leaf-level nodes. It's a trade-off between speed and size (as usual).

Marc

marc_s
Great explanation ...Thank you very much
Anoop