Login

In SQL Server, row-oriented storage both clustered and nonclustered indexes are organized as B trees.

![enter image description here][1]

([Image Source][2])

The key difference between clustered indexes and non clustered indexes is that the leaf level of the clustered index **is** the table. This has two implications.

1. The rows on the clustered index leaf pages always contain *something* for each of the (non-sparse) columns in the table (either the value or a pointer to the actual value).
2. The clustered index is the primary copy of a table.

Non clustered indexes can also do point 1 by using the `INCLUDE` clause (Since SQL Server 2005) to explicitly include all non-key columns but they are secondary representations and there is always another copy of the data around (the table itself).

CREATE TABLE T
(
A INT,
B INT,
C INT,
D INT
)

CREATE UNIQUE CLUSTERED INDEX ci ON T(A, B)
CREATE UNIQUE NONCLUSTERED INDEX nci ON T(A, B) INCLUDE (C, D)

The two indexes above will be nearly identical. With the upper-level index pages containing values for the key columns `A, B` and the leaf level pages containing `A, B, C, D`

> There can be only one clustered index per table, because the data rows
> themselves can be sorted in only one order.

The above quote from SQL Server books online causes much confusion

In my opinion, it would be much better phrased as.

> There can be only one clustered index per table because the leaf level rows of the clustered index **are** the table rows.

The book's online quote is not incorrect but you should be clear that the "sorting" of both non clustered and clustered indices is logical, not physical. If you read the pages at leaf level by following the linked list and read the rows on the page in slot array order then you will read the index rows in sorted order but physically the pages may not be sorted. The commonly held belief that with a clustered index the rows are always stored physically on the disk in the same order as the index **key** is false.

This would be an absurd implementation. For example, if a row is inserted into the middle of a 4GB table SQL Server does **not** have to copy 2GB of data up in the file to make room for the newly inserted row.

Instead, a page split occurs. Each page at the leaf level of both clustered and non clustered indexes has the address (`File: Page`) of the next and previous page in logical key order. These pages need not be either contiguous or in key order.

e.g. the linked page chain might be `1:2000 <-> 1:157 <-> 1:7053`

When a page split happens a new page is allocated from anywhere in the filegroup (from either a mixed extent, for small tables or a non-empty uniform extent belonging to that object or a newly allocated uniform extent). This might not even be in the same file if the filegroup contains more than one.

The degree to which the logical order and contiguity differ from the idealized physical version is the degree of logical fragmentation.

In a newly created database with a single file, I ran the following.

CREATE TABLE T
(
X TINYINT NOT NULL,
Y CHAR(3000) NULL
);

CREATE CLUSTERED INDEX ix
ON T(X);

GO

--Insert 100 rows with values 1 - 100 in random order
DECLARE @C1 AS CURSOR,
@X AS INT

SET @C1 = CURSOR FAST_FORWARD
FOR SELECT number
FROM master..spt_values
WHERE type = 'P'
AND number BETWEEN 1 AND 100
ORDER BY CRYPT_GEN_RANDOM(4)

OPEN @C1;

FETCH NEXT FROM @C1 INTO @X;

WHILE @@FETCH_STATUS = 0
BEGIN
INSERT INTO T (X)
VALUES (@X);

FETCH NEXT FROM @C1 INTO @X;
END

Then checked the page layout with

SELECT page_id,
X,
geometry::Point(page_id, X, 0).STBuffer(1)
FROM T
CROSS APPLY sys.fn_PhysLocCracker( %% physloc %% )
ORDER BY page_id

The results were all over the place. The first row in key order (with value 1 - highlighted with an arrow below) was on nearly the last physical page.

![enter image description here][4]

Fragmentation can be reduced or removed by rebuilding or reorganizing an index to increase the correlation between logical order and physical order.

After running

ALTER INDEX ix ON T REBUILD;

I got the following

![enter image description here][5]

If the table has no clustered index it is called a heap.

Non clustered indexes can be built on either a heap or a clustered index. They always contain a row locator back to the base table. In the case of a heap, this is a physical row identifier (rid) and consists of three components (File:Page: Slot). In the case of a Clustered index, the row locator is logical (the clustered index key).

For the latter case if the non clustered index already naturally includes the CI key column(s) either as NCI key columns or `INCLUDE`-d columns then nothing is added. Otherwise, the missing CI key column(s) silently gets added to the NCI.

SQL Server always ensures that the key columns are unique for both types of indexes. The mechanism in which this is enforced for indexes not declared as unique differs between the two index types, however.

Clustered indexes get a `uniquifier` added for any rows with key values that duplicate an existing row. This is just an ascending integer.

For non clustered indexes not declared as unique SQL Server silently adds the row locator into the non clustered index key. This applies to all rows, not just those that are actually duplicates.

The clustered vs non clustered nomenclature is also used for column store indexes. The paper [Enhancements to SQL Server Column Stores][6] states

> Although column store data is not really "clustered" on any key, we
> decided to retain the traditional SQL Server convention of referring
> to the primary index as a clustered index.

[1]:

[2]:

[To see links please register here]

[3]:

[4]:

[5]:

[6]:

[To see links please register here]

Clustered Index
---------------

A Clustered Index is basically a tree-organized table. Instead of storing the records in an unsorted Heap table space, the clustered index is actually B+Tree index having the Leaf Nodes, which are ordered by the clusters key column value, store the actual table records, as illustrated by the following diagram.

[![Clustered Index][1]](

[To see links please register here]

)

The Clustered Index is the default table structure in SQL Server and MySQL. While MySQL adds a hidden clusters index even if a table doesn't have a Primary Key, SQL Server always builds a Clustered Index if a table has a Primary Key column. Otherwise, the SQL Server is stored as a Heap Table.

The Clustered Index can speed up queries that filter records by the clustered index key, like the usual CRUD statements. Since the records are located in the Leaf Nodes, there's no additional lookup for extra column values when locating records by their Primary Key values.

For example, when executing the following SQL query on SQL Server:

SELECT PostId, Title
FROM Post
WHERE PostId = ?

You can see that the Execution Plan uses a Clustered Index Seek operation to locate the Leaf Node containing the `Post` record, and there are only two logical reads required to scan the Clustered Index nodes:

|StmtText |
|-------------------------------------------------------------------------------------|
|SELECT PostId, Title FROM Post WHERE PostId = @P0 |
| |--Clustered Index Seek(OBJECT:([high_performance_sql].[dbo].[Post].[PK_Post_Id]), |
| SEEK:([high_performance_sql].[dbo].[Post].[PostID]=[@P0]) ORDERED FORWARD) |

Table 'Post'. Scan count 0, logical reads 2, physical reads 0

Non-Clustered Index
-------------------

Since the Clustered Index is usually built using the Primary Key column values, if you want to speed up queries that use some other column, then you'll have to add a Secondary Non-Clustered Index.

The Secondary Index is going to store the Primary Key value in its Leaf Nodes, as illustrated by the following diagram:

[![Non-Clustered Index][2]](

[To see links please register here]

)

So, if we create a Secondary Index on the `Title` column of the `Post` table:

CREATE INDEX IDX_Post_Title on Post (Title)

And we execute the following SQL query:

SELECT PostId, Title
FROM Post
WHERE Title = ?

We can see that an Index Seek operation is used to locate the Leaf Node in the `IDX_Post_Title` index that can provide the SQL query projection we are interested in:

|StmtText |
|------------------------------------------------------------------------------|
|SELECT PostId, Title FROM Post WHERE Title = @P0 |
| |--Index Seek(OBJECT:([high_performance_sql].[dbo].[Post].[IDX_Post_Title]),|
| SEEK:([high_performance_sql].[dbo].[Post].[Title]=[@P0]) ORDERED FORWARD)|

Table 'Post'. Scan count 1, logical reads 2, physical reads 0

Since the associated `PostId` Primary Key column value is stored in the `IDX_Post_Title` Leaf Node, this query doesn't need an extra lookup to locate the `Post` row in the Clustered Index.

[1]:

[2]:

**Clustered Index**

A clustered index determines the physical order of DATA in a table. For this reason, a table has only one clustered index(Primary key/composite key).

> "**Dictionary**" No need of any other Index, its already Index according to words

**Nonclustered Index**

A non-clustered index is analogous to an index in a Book. The data is stored in one place. The index is stored in another place and the index has pointers to the storage location. this help in the fast search of data. For this reason, a table has more than 1 Nonclustered index.

>"**Biology Book**" at starting there is a separate index to point Chapter location and At the "**END**" there is another Index pointing the common WORDS location

perigordian994

zackeryijsnjs

overstep894