Pages

SQL Index Maintenance - Reorganize and Rebuild Index

SQL Server database indexes are created mainly to optimize performance. Different types of indexes such as clustered index, non-clustered index are put in place to make query process run faster. However over time, DML statements (insert, update, delete) cause information scattered across database hence resulting table or index fragmentation. Fragmentation happen when the data logical order does not match with physical ordering inside the data file. These fragmentation need to be addressed according as it could significantly affect SQL Server performance.

There are two main ways to address fragmentation, REORGANIZE or REBUILD the index. In general, reorganize an index uses lower resources than rebuild by reordering the leaf-level pages to match the logical order. Rebuild an index is to drop the existing index and recreate the index (To be more accurate, it creates a new copy of the index before dropping the old one). During index creation, the index row is order in contiguous pages.

A lot of the time, this task is designed and scheduled in SQL Server maintenance plan to routinely remove fragmentation.

Here are some of the differences between reorganize and rebuild,

REORGANIZE is a single thread process. It is fully logged but does not prevent transaction log from clearing. REORGANIZE is an ONLINE operations which only incur minimal lock, not the holding the lock for entire operation. REORGANIZE does not update statistics. It requires less space.

REBUILD can utilize multiple CPUs for its operation (Running in parallel, limited to MAXDOP setting). It can use minimal logging in bulk-logged and simple recovery model to reduce log growth, but fully logged in full recovery model. REBUILD may incur long term lock (Unless it is in Online mode. However short term lock still incur at the start and the end of the operation). Since REBUILD recreate index which require a full scan, it automatically update the statistics (similar with UPDATE STATISTICS .. WITH FULL SCAN mode). It requires larger space since a new copy of index has to be created before the old one is discarded.

Both operations have its pros and cons in terms of disk space requirements, log file growth, parallelism. In order to defragment efficiently, it is better to use REORGANIZE operation for lower logical fragmentation.  REORGANIZE uses lower resource to reorder the existing fragmentation; Use REBUILD for larger fragmentation as it recreates the index which consume higher resources.

The general guideline provided from Microsoft is to use REORGANIZE if fragmentation is > 10% and <= 30%; and use REBUILD operation for > 30% fragmentation. In both cases, the index should has at least 1000 pages to consider for defragmentation. Keep in mind that this is just the general guidelines. You need to monitor your environment and change these threshold accordingly. These threshold were created by Paul Randal (former Microsoft employee and currently CEO of SQLskills.com), and he mentioned this in his blog post.

To detect fragmentation, you can use dynamic management view sys.dm_db_index_physical_stats introduced in SQL Server 2005 or DBCC SHOWCONTIG introduced in SQL Server 2000. I blogged here with details how to use the DMV to identify fragmentation along with my script.

Once the fragmentation is known, to REORGANIZE signle index on a table,
ALTER INDEX YourIndex
ON YourTable
REORGANIZE;

REORGANIZE also provide a function to compact index contains large object (LOB, data type image, text, ntext, nvarchar(max), varbinary(max) and xml) with LOB_COMPATION option to improve disk space usage. This option is ignored if no LOB column is present.

ALTER INDEX YourIndex
ON YourTable
REORGANIZE
WITH LOB_COMPACTON = ON;

To REBUILD single index on a table,

ALTER INDEX YourIndex
ON YourTable
REBUILD;

There are other options of REBUILD that could enhance/affect your performance. You can set FILLFACTOR during index rebuilding. Fill factor specifies how much free space is make for every index page. 0 and 100 mean 100% filled. It affects the amount of storage size and incorrect setting could contribute to fragmentation.

SQL Server Enterprise also allows use of multiple processor to perform REBUILD operation by using MAXDOP option. This MAXDOP option set number of dedicated cpu for index operation, and it overrides max degree of parallelism set at server configuration.

SQL Server 2008 (Enterprise edition) introduce ONLINE option for REBUILD. It reduce the duration of locks by holding lock very shortly at the beginning and the end of the REBUILD operation. ONLINE operation is fully logged regardless of recovery model. OFFLINE option applies lock throughout the index operation and prevent user to access the table. However, OFFLINE is faster.

Another option is SORT_IN_TEMPDB. The intermediate results are stored and sorted in tempdb to build the index. This reduce the index rebuilding time as the redad and write data process during rebuild operation become more serially. Please notes this option is only benefitial when the tempdb is on different set of physical disks. Sorting in tempdb increase disk space used in tempdb. The default for this option is set to OFF.

ALTER INDEX YourIndex
ON YourTable
REBUILD
WITH 
  FILLFACTOR = 80
  , ONLINE

No comments:

Post a Comment