dbi Blog

During my audits at customer places, it still happens very often to find SQL Server databases with page verification option configured to “none”. I always alert my customers on this configuration point because it can have an impact on the overall integrity of their databases. One of my customer told me that the integrity task of its maintenance will detect the corruption anyway and alert him by email – but is it really the case?

Of course, my response to the customer is that it depends of the corruption type and that a dbcc checkdb is not an absolute guarantee in this case. I like to show this example to my customers to convince them to change the page verify option to another thing than “none”.

Let me show this to you:

use [master];

— create corrupt db

if DB_ID(‘corrupt’) is not null

drop database corrupt;

create database corrupt;

— force page verify option to none

alter database corrupt set page_verify none;

use corrupt;

— create two tables t1 and t2

create table dbo.t1( id int identity primary key, col1 varchar(50));

create table dbo.t2( id int identity primary key, col1 int, col2 int, col3 varchar(50));

— insert some sample data

declare @random int;

declare @i int = 1;

while @i

begin

set @random = 10000/(rand()* 1000);

if @random > 50 set @random = 50;

insert t1(col1) values (REPLICATE(‘t’, 50));

insert t2(col1, col2, col3) values (@i, @i + 1, REPLICATE(‘t’, @random));

set @i = @i + 1;

end

— add a foreign key on column id to table t2 that references the table t1 on column id

alter table dbo.t2

add constraint FK_t2_id foreign key (id) references dbo.t1(id);

— create a nonclustered covered index on table t1

create nonclustered index idx_t2_col1

on dbo.t2( col1, col2) include ( col3 );

At this point we have two tables named t1 and t2. Table t2 has a foreign key constraint on the id column that references the table t1 on the column with the same name.

Now let’s corrupt a data page in the clustered index on the table t1. First, we will find the first data page in the clustered index of the table t1:

— get the first data page on the t1 table clustered index

dbcc ind(‘corrupt’, ‘dbo.t1’, 1);

blog_14_-_dbcc_checkb_corrupt_-_1

Then we will find the first row. The first row is stored in slot 0 which is located at offset 0x060.

— Display dump page id = 15

dbcc traceon(3604);

dbcc page (‘corrupt’, 1, 15, 3);

blog_14_-_dbcc_checkb_corrupt_-_2

Now it’s time to corrupt the id column (id = 1) located to the offset 0x4 in the row. That means we have to place to the offset 0x60 + 0x4 to corrupt this column.

We will use the DBCC WRITEPAGE undocumented command to corrupt our page (again, a big thanks to Paul Randal for showing us how to use this command for testing purposes).

— corrupt the concerned page

alter database corrupt set single_user;

dbcc writepage(‘corrupt’, 1, 15, 100, 1, 0x00, 1)

alter database corrupt set multi_user;

Now if we take a look at the page id=15, we notice that the id column value is now changed from 1 to 0.

blog_14_-_dbcc_checkb_corrupt_-_3

Ok, let’s run a DBCC CHECKDB command:

— perform an integrity check with dbcc checkdb

dbcc checkdb(‘corrupt’) with no_infomsgs, all_errormsgs;

blog_14_-_dbcc_checkb_corrupt_-_4

As you can see, the dbcc checkdb command does not detect any corruption! Now, let’s run the following statements:

— first query

select t2.col2, t2.col3

from dbo.t2

join dbo.t1

on t1.id = t2.id

where t2.col1 = 1

blog_14_-_dbcc_checkb_corrupt_-_5

Do you notice that reading the corrupted page does not trigger an error in this case?

— second query

select t2.col2, t2.col3, t1.col1

from dbo.t2

join dbo.t1

on t1.id = t2.id

where t2.col1 = 1

As you can notice, adding the t1.col1 column to the query will give it a different result between the both queries. Strange behavior isn’t it? In fact, the two queries above don’t use the same execution plan as the following below:

Query 1:

blog_14_-_dbcc_checkb_corrupt_-_7

Query 2:

blog_14_-_dbcc_checkb_corrupt_-_8

In the query 1, due to the foreign key constraint, the query execution engine doesn’t need to join t2 to t1 to retrieve data because we need only data already covered by the idx_t2_col1 index on table t2. However the story is not the same with the query 2. Indeed, we want to retrieve an additional value provided by the col1 column from the table t1. In this case SQL Server has to join t1 and t2 because the covered index idx_t2_col1 cannot provide all the data we need. But remember we had corrupt the id column of the primary key of the table t1 by changing the value from 1 to 0. This is why the query 2 doesn’t display any results.

The main question here is: why dbcc checkdb doesn’t detect the corruption? Well, in this case corruption has occurring directly on the data value and dbcc checkdb doesn’t have a verification mechanism to detect a corruption issue. Having a checksum value stored in the page would help dbcc checkdb operation in this case because it could compare a computed checksum while reading the page with the stored checksum stored on it.

Below the output provided by dbcc checkdb command if checksum page verify option was enabled for the database …

… or when we ran the query used earlier:

blog_14_-_dbcc_checkb_corrupt_-_10

My conclusion:

Do not hesitate to change your page verify option value when it is configured to “none”.

By David Barbarin

Post Views: 992

SQL Server: DBCC CHECKDB does not detect corruption

My conclusion:

Microsoft Team

Leave a Reply:

Related blog articles

My conclusion:

Microsoft Team

Leave a Reply:

Related blog articles

A first look at openSUSE Leap Micro 6.0

Another file system for Linux: bcachefs (3) – Mirroring/Replicas

Another file system for Linux: bcachefs (2) – multi device file systems

Another file system for Linux: bcachefs (1) – basics