Infrastructure at your Service

David Barbarin

Partitioning – When data movement is not performed as expected

This blog is about an interesting partitioning story and curious data movements during merge operation. I was at my one of my customer uses intensively partitioning for various reasons including archiving and manageability. A couple of days ago, we decided to test the new fresh developed script, which will carry out the automatic archiving stuff against the concerned database in quality environment.

Let’s describe a little bit the context.

We used a range-based data distribution model. Boundary are number-based and are incremented monolithically with identity values.

We defined a partition function with range right values strategy as follows:

CREATE PARTITION FUNCTION pfOrderID (INT) 
AS RANGE RIGHT FOR VALUES(@boundary_archive, @boundary_current)
go

 

The first implementation of the partition scheme consisted in storing all partitioned data inside the same filegroup.

CREATE PARTITION SCHEME psOrderID 
AS PARTITION pfOrderID 
ALL TO ([PRIMARY])
go

Well, the initial configuration was as follows. We applied page compression to the archive partition because it contained cold data which is not supposed to be updated very frequently.

blog 106 - 0 - partitioned table starting point

blog 106 - 1 - partitioned table starting point config

As you may expect, the next step will consist in merging ARCHIVE and CURRENT partitions by using the MERGE command as follows:

ALTER PARTITION FUNCTION pfOrderID() 
MERGE RANGE (@b_archive);

 

Basically, we achieve a merge operation according to Microsoft documentation:

The filegroup that originally held boundary_value is removed from the partition scheme unless it is used by a remaining partition, or is marked with the NEXT USED property.

 

blog 106 - 2 - partitioned table after merge

At this point we may expect SQL Server has moved data from the middle partition to the left partition and according to this other Microsoft pointer

When two partitions are merged, the resultant partition inherits the data compression attribute of the destination partition.

But the results come as a little surprise because I expected to see a page compression value from my archive partition.

blog 106 - 3 - partitioned table after merge config

At this point, my assumption was either compression value doesn’t inherit correctly as mentioned in the Microsoft documentation or data movement is not performed as expected. The latter may be checked very quickly by looking at the corresponding records inside the transaction log file.

SELECT 
	[AllocUnitName],
	Operation,
	COUNT(*) as nb_ops
FROM ::fn_dblog(NULL,NULL)
WHERE [Transaction ID] = (
     SELECT TOP 1 [Transaction ID]
     FROM ::fn_dblog(NULL,NULL)
     WHERE [Xact ID] = 164670)
GROUP BY [AllocUnitName], Operation
ORDER BY nb_ops DESC
GO

I put only the relevant sample here

blog 106 - 5 - log file after merge config

Referring to what I saw above, I noticed data movement was not performed as expected but rather as shown below:

blog 106 - 4 - partitioned table after merge config 2

So, this strange behavior seems to explain why compression state switched from PAGE to NONE in my case. Another strange thing is when we changed the partition scheme to include an archive filegroup, we returned to normality.

CREATE PARTITION SCHEME psOrderID 
AS PARTITION pfOrderID 
TO ([ARCHIVE], [PRIMARY], [PRIMARY])
go

 

I doubled checked the Microsoft documentation to see if it exists one section to figure out the behavior I experienced in this case but without success. After further investigations I found out an interesting blog post from Sunil Agarwal (Microsoft) about partition merging and some performance improvements shipped with SQL Server 2008 R2. In short, I was in the specific context described in the blog post (same filegroup) and merging optimization came into action transparently because number of rows in the archive partition was lower than the one in the current partition at the moment of the MERGE operation.

Let me introduce another relevant information – the number of lines for each partition – to the following picture:

blog 106 - 6 - partitioned table optimization stuff

Just to be clear, this is an interesting and smart mechanism provided by Microsoft but it may be a little bit confusing when you’re not aware of this optimization. In my case, we finally decided with my customer to dedicate an archiving partition to store cold data that will be rarely accessed in this context and keep up the possibility to store cold data on cheaper storage when archiving partition will grow to a critical size. But in other cases, if storing data is the same filegroup is a still relevant scenario, keep in mind this optimization to not experience unexpected behaviors.

Happy partitioning!

 

 

Leave a Reply


9 − = seven

David Barbarin
David Barbarin

Senior Consultant & Microsoft Technology Leader