Monday, April 29, 2013

Got CHECKSUM on Backup? Check Again

If your SQL Server is using backup task created via SQL maintenance plans or some customized backup scripts, and you are expecting it to performs backup with CHECKSUM operation, you may want to review the backup task.

Backup database task from maintenance plans does not have the CHECKSUM option. Someone may argue that if the COMPRESSION is enabled, CHECKSUM is the default behavior for a compress backup.

Here is the excerpt from Book Online (BOL)

CHECKSUM

Specifies that the backup operation will verify each page for checksum and torn page, if enabled and available, and generate a checksum for the entire backup. This is the default behavior for a compressed backup.


Is CHECKSUM option really enabled by default if COMPRESSION option is enabled? Before we conduct the test, let us cover some background of CHECKSUM.

The CHECKSUM option on the backup operation enables a checksum to be generated for the entire backup, and verify each database page for checksum and torn page. This option is highly recommended to allow early detection of page corruption during the backup as well as verifying backup for any corruption when performing database restore (or RESTORE VERIFYONLY).

Please keep in mind that the CHECKSUM option only verify the database page if the particular database page contains checksum or the torn bits. In order for a page to contain this information,

First - The database needs to have its PAGE_VERIFY setting configured with either CHECKSUM or TORN_PAGE_DETECTION enabled.
Second - SQL Server Database Engine calculates checksum or torn bits on a page and stores it on page header only when the page is written to disk.

With these understanding, let's do some test.

First, create the database with its PGE_VERIFY setting configured as CHECKSUM. Create a table and insert an record. Issues CHECKPOINT command to ensure the page is written to the disk.

CREATE DATABASE TEST;
GO

ALTER DATABASE TEST
SET PAGE_VERIFY CHECKSUM;
GO

USE TEST;
GO

CREATE TABLE Table1
(
  ID tinyint IDENTITY(1,1),
  Name varchar(10)
);
GO

INSERT Table1
SELECT 'TEST123';
GO

SELECT ID, Name
FROM Table1;
GO

-- To ensure the data page is written to the disk
CHECKPOINT;
GO

-- Drop the data page in buffer pool so that data will be pulled from the disk on next inquiry.
DBCC DROPCLEANBUFFERS
GO

Second, examine the data page with DBCC IND and DBCC PAGE for the table and the record inserted, and make sure checksum has been generated for the page.

DBCC IND ('TEST', 'Table1');
GO



-- To enable output for DBCC command.
DBCC TRACEON (3604);
GO
DBCC PAGE ('TEST', 1, 230, 3);
GO




m_flagBits = 0x8200 shows that the page has a checksum stored on it.

Next step, turn database off. Use Hex Editor xvi32 to edit 'TEST123' record to 'TEST124' and save the new data to emulate corruption.

USE master;
GO

ALTER DATABASE TEST
SET OFFLINE
WITH ROLLBACK IMMEDIATE;
GO



Now it is time to check if the database engine detect corruption error when the page is read. Turn database on. Select from table encounter corruption error.

ALTER DATABASE TEST
SET ONLINE;
GO
USE TEST;
GO

SELECT ID, Name
FROM Table1;
GO

Error below encountered.

Msg 824, Level 24, State 2, Line 2 SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0xf4caaa80; actual: 0x74caaa83). It occurred during a read of page (1:230) in database ID 5 at offset 0x000000001cc000 in file 'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\TEST.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

Take a backup with compression without explicitly specify CHECKSUM option to see if any error is encountered.

BACKUP DATABASE TEST
TO DISK = 'TEST_backup_compress_nochecksum.bak'
WITH COMPRESSION;
GO

BACKUP DATABASE successfully processed 291 pages in 0.581 seconds (3.901 MB/sec).

NO! Backup operation complete without any error! Inspect the backup header to verify if checksum is generated for the backup.

RESTORE HEADERONLY
FROM DISK = 'TEST_backup_compress_nochecksum.bak';



NO!

Now, backup the database and specify CHECKSUM explicitly with, and also without compression.

BACKUP DATABASE TEST
TO DISK = 'TEST_backup_compress_checksum.bak'
WITH COMPRESSION, CHECKSUM;
GO

And then also,

BACKUP DATABASE TEST
TO DISK = 'TEST_backup_nocompress_checksum.bak'
WITH CHECKSUM;
GO

Msg 3043, Level 16, State 1, Line 1 BACKUP 'TEST' detected an error on page (1:230) in file 'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\TEST.mdf'. Msg 3013, Level 16, State 1, Line 1 BACKUP DATABASE is terminating abnormally. 

Corruption error encountered during backup operation! Perhaps this is the only time I am excited about corruption error.

Observation
Without explicitly specifying CHECKSUM option, backup with or without compression does not generate checksum for the backup, and does not verify page checksum. 

So, bad news for backup task in SQL maintenance plans. If you are expecting checksum to be generated for backup as well as page checksum verification during backup, you may need to use customized backup script that explicitly specify the CHECKSUM option in the backup command.

Google+