Infrastructure at your Service

David Barbarin

SQL Server: DBCC CHECKDB does not detect corruption

During my audits at customer places, it still happens very often to find SQL Server databases with page verification option configured to “none”. I always alert my customers on this configuration point because it can have an impact on the overall integrity of their databases. One of my customer told me that the integrity task of its maintenance will detect the corruption anyway and alert him by email – but is it really the case?

Of course, my response to the customer is that it depends of the corruption type and that a dbcc checkdb is not an absolute guarantee in this case. I like to show this example to my customers to convince them to change the page verify option to another thing than “none”.

Let me show this to you:

use [master];
go
 
– create corrupt db
if DB_ID(‘corrupt’) is not null
       drop database corrupt;
      
create database corrupt;
go
 
– force page verify option to none
alter database corrupt set page_verify none;
go
 
use corrupt;
go
 
– create two tables t1 and t2
create table dbo.t1( id int identity primary key, col1 varchar(50));
create table dbo.t2( id int identity primary key, col1 int, col2 int, col3 varchar(50));
 
– insert some sample data
declare @random int;
declare @i int = 1;
 
while @i
begin
       set @random = 10000/(rand()* 1000);
       if @random > 50 set @random = 50;
 
       insert t1(col1) values (REPLICATE(‘t’, 50));
       insert t2(col1, col2, col3) values (@i, @i + 1, REPLICATE(‘t’, @random));
 
       set @i = @i + 1;
end
 
– add a foreign key on column id to table t2 that references the table t1 on column id
alter table dbo.t2
add constraint FK_t2_id foreign key (id) references dbo.t1(id);
go
 
– create a nonclustered covered index on table t1
create nonclustered index idx_t2_col1
on dbo.t2( col1, col2) include ( col3 );
go

At this point we have two tables named t1 and t2. Table t2 has a foreign key constraint on the id column that references the table t1 on the column with the same name.

Now let’s corrupt a data page in the clustered index on the table t1. First, we will find the first data page in the clustered index of the table t1:

– get the first data page on the t1 table clustered index
dbcc ind(‘corrupt’, ‘dbo.t1′, 1);
go

 

blog_14_-_dbcc_checkb_corrupt_-_1

Then we will find the first row. The first row is stored in slot 0 which is located at offset 0x060.

– Display dump page id = 15
dbcc traceon(3604);
go
dbcc page (‘corrupt’, 1, 15, 3);
go

 

blog_14_-_dbcc_checkb_corrupt_-_2

Now it’s time to corrupt the id column (id = 1) located to the offset 0x4 in the row. That means we have to place to the offset 0x60 + 0x4 to corrupt this column.

We will use the DBCC WRITEPAGE undocumented command to corrupt our page (again, a big thanks to Paul Randal for showing us how to use this command for testing purposes).

– corrupt the concerned page
alter database corrupt set single_user;
go
 
dbcc writepage(‘corrupt’, 1, 15, 100, 1, 0x00, 1)
 
alter database corrupt set multi_user;
go

 

Now if we take a look at the page id=15, we notice that the id column value is now changed from 1 to 0.

blog_14_-_dbcc_checkb_corrupt_-_3

Ok, let’s run a DBCC CHECKDB command:

– perform an integrity check with dbcc checkdb
dbcc checkdb(‘corrupt’) with no_infomsgs, all_errormsgs;
go

 

blog_14_-_dbcc_checkb_corrupt_-_4

As you can see, the dbcc checkdb command does not detect any corruption! Now, let’s run the following statements:

– first query
select t2.col2, t2.col3
from dbo.t2
       join dbo.t1
             on t1.id = t2.id
where t2.col1 = 1

 

blog_14_-_dbcc_checkb_corrupt_-_5

Do you notice that reading the corrupted page does not trigger an error in this case?

– second query
select t2.col2, t2.col3, t1.col1
from dbo.t2
       join dbo.t1
             on t1.id = t2.id
where t2.col1 = 1

 

blog_14_-_dbcc_checkb_corrupt_-_6

As you can notice, adding the t1.col1 column to the query will give it a different result between the both queries. Strange behavior isn’t it? In fact, the two queries above don’t use the same execution plan as the following below:

Query 1:

blog_14_-_dbcc_checkb_corrupt_-_7

Query 2:

blog_14_-_dbcc_checkb_corrupt_-_8

 

In the query 1, due to the foreign key constraint, the query execution engine doesn’t need to join t2 to t1 to retrieve data because we need only data already covered by the idx_t2_col1 index on table t2. However the story is not the same with the query 2. Indeed, we want to retrieve an additional value provided by the col1 column from the table t1. In this case SQL Server has to join t1 and t2 because the covered index idx_t2_col1 cannot provide all the data we need. But remember we had corrupt the id column of the primary key of the table t1 by changing the value from 1 to 0. This is why the query 2 doesn’t display any results.

The main question here is: why dbcc checkdb doesn’t detect the corruption? Well, in this case corruption has occurring directly on the data value and dbcc checkdb doesn’t have a verification mechanism to detect a corruption issue. Having a checksum value stored in the page would help dbcc checkdb operation in this case because it could compare a computed checksum while reading the page with the stored checksum stored on it.

Below the output provided by dbcc checkdb command if checksum page verify option was enabled for the database …

blog_14_-_dbcc_checkb_corrupt_-_9

… or when we ran the query used earlier:

blog_14_-_dbcc_checkb_corrupt_-_10

My conclusion:

Do not hesitate to change your page verify option value when it is configured to “none”.

 

3 Comments

  • Greg says:

    very good article !
    btw Dave, has it any matter if we perform checkpoint before dbcc checkdb as it concerns only hardended changes while dirty pages ( also these corrupted ) are not checked ?

     
  • Hi Greg, thanks for your comment. No, it doesn’t matter because dbcc checkdb will issue an internal checkpoint in order to get an consistent view before processing. I assume here that we use the dbcc checkdb default behavior by using an hidden database snapshot. As you know, checkpoint process will flush dirty pages on disk and next they will be read while the dbcc checkdb process is running

     
  • SQLpro says:

    Belle démo, je crois que je vais te la piquer !
    Mais as tu essayé un DBCC CHECKCONSTRAINTS ou un DBCC CHECK WITH DATAPURITY ?
    A +

     

Leave a Reply


two − 1 =

David Barbarin
David Barbarin

Senior Consultant & Microsoft Technology Leader