Infrastructure at your Service

David Barbarin

Case of monitoring snapshot replication from the subscriber side

I don’t work often with SQL Server replication. The main reason is that the number of customers that use replication is pretty low and each time that I have to deal with it, it’s a good opportunity to improve my skills on this area. A couple of months ago I had to face an interesting issue with the snapshot replication (yes, I was lucky … this is not the more complex replication mechanism you have to deal with). My customer had a lot of databases (approximatively 250) to replicate from two datacenters apart from one continent.

The global architecture includes two servers that act as publishers with a total of 250 articles (one table into one database) and a central subscriber that gathers this 250 articles (250 databases with one table into each database). All articles are concerned by push subscriptions and all replication jobs are running on the publishers. This replication process is part of a more wide ETL process which delivers some financial metric reports to the business users.

Before computing the financial metrics we need to ensure that all databases replication is done correctly. Unfortunately, according to the customer security context we are stuck because we may not have access from the different publishers. So the main question that remains is how to ensure that we don’t start the calculation of the financial metrics, if we detect that there exist running database replication processes at the same time from the subscriber side?

After reflexion, I had a solution but it is not perfect. I will explain why later in this blog post. My solution includes two things:

1- Recording the last time the concerned table is created. As reminder snapshot replication will recreate concerned articles on the subscriber.
2- Tracking bulk insert activity issued by the snapshot agent before performing the calculation of the financial metrics

Here my script:

SELECT  
   DB_NAME() as database_name,
   tb.name as table_name,  
   tb.create_date as last_creation_date,
   txt.text as current_statement,
   txt.program_name as interface,
   txt.request_mode,
   txt.request_owner_type,
   txt.request_status  
FROM sys.tables as tb
OUTER APPLY(
   SELECT  
   db_name(r.database_id) as database_name,
   t.text,
   s.program_name,
   l.request_mode,
   l.request_status,
   l.request_owner_type
   FROM sys.dm_exec_requests AS r
   JOIN sys.dm_exec_sessions as s
   on r.session_id = s.session_id
   JOIN sys.dm_tran_locks as l  
   on l.request_session_id = s.session_id
   CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
   WHERE t.text LIKE ‘insert bulk % + tb.name + ”%TABLOCK%’
   AND r.session_id @@SPID
   AND l.resource_type = ‘OBJECT’
   AND l.resource_database_id = r.database_id
       AND l.resource_associated_entity_id = tb.object_id
) AS txt
WHERE tb.name = ‘mytable’

 
I finally created a stored procedure that fetches each concerned database and  execute the above script in the current context of the database. In the context of my customer I implemented an additional step which sends an email on half-day basis. Notice also that I added a custom business rule that detects replication issue if it does not occur before the last 6 hours. You can modify and adjust the script at your convenience.

SET NOCOUNT ON;
 
DECLARE @database_name SYSNAME;
DECLARE @sql NVARCHAR(MAX);
 
– working table : alert_replication_monitoring
TRUNCATE TABLE msdb.dbo.alert_replication_monitoring;
 
– for each concered database we will verify if the t_replixxx table is updated
– from snapshot replication
DECLARE C_DB CURSOR FAST_FORWARD FOR
SELECT
       name
FROM sys.databases
WHERE name LIKE ‘repli_%’;
 
OPEN C_DB;
 
FETCH NEXT FROM C_DB INTO @database_name;
 
WHILE @@FETCH_STATUS = 0
BEGIN
       SET @sql = N’USE ‘ + QUOTENAME(@database_name) +
      
                           IF EXISTS (SELECT 1
                                               FROM sys.tables WHERE name LIKE N”t_repli%”)
                           BEGIN
                                  INSERT INTO msdb.dbo.alert_replication_monitoring
                                  SELECT
                                        DB_NAME() as database_name,
                                        tb.name as table_name,
                                        tb.create_date as last_creation_date,
                                        txt.text as current_statement,
                                        txt.program_name as interface,
                                        txt.request_mode,
                                        txt.request_owner_type,
                                        txt.request_status 
                              FROM sys.tables as tb
                              OUTER APPLY (
                                                            SELECT
                                                                   db_name(r.database_id) as database_name,
                                                                   t.text,
                                                                   s.program_name,
                                                                   l.request_mode,
                                                                   l.request_status,
                                                                   l.request_owner_type
                                                            FROM sys.dm_exec_requests AS r
                                                                   JOIN sys.dm_exec_sessions as s
                                                                          on r.session_id = s.session_id
                                                                   JOIN sys.dm_tran_locks as l      
                                                                          on l.request_session_id = s.session_id
                                                            CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
                                                            WHERE t.text LIKE ”insert bulk %” + tb.name + ”%TABLOCK%”
                                                                   AND r.session_id @@SPID
                                                                          AND l.resource_type = ”OBJECT”
                                                                                 AND l.resource_database_id = r.database_id
                                                                                       AND l.resource_associated_entity_id = tb.object_id
                                                      ) AS txt
                                  WHERE tb.name LIKE ”t_repli%”
                           END’;
      
       EXEC sp_executesql@sql;
      
       FETCH NEXT FROM C_DB INTO @database_name;
END
 
CLOSE C_DB;
DEALLOCATE C_DB;
 
– Check status of each database replication
SELECT *
FROM
(
       SELECT
             database_name AS [database],
             table_name AS [table],
             last_creation_date AS [last synchro],
             DATEDIFF(MI, last_creation_date, CURRENT_TIMESTAMP) AS [time since the last update (minutes)],
             current_statement,
             CASE
                    WHEN DATEDIFF(MI, last_creation_date, CURRENT_TIMESTAMP)
                                  AND current_statement IS NOT NULL THEN ‘1 – Replication in progress.’
                    WHEN DATEDIFF(MI, last_creation_date, CURRENT_TIMESTAMP)
                                  AND current_statement IS NULL THEN ‘2 – Replication done.’
                    WHEN DATEDIFF(MI, last_creation_date, CURRENT_TIMESTAMP) > 360
                                  AND current_statement IS NOT NULL THEN ‘3 – Replication in progress and takes longer than expected (6 hours).’
                    ELSE ‘4 – No replication has occured during the six last hours.’
             END replication_status
       FROM msdb.dbo.alert_replication_monitoring
) AS repli
ORDER BY [database], replication_status DESC;

Here a picture of the script result while the snapshot agent is running…

blog_31_1_replication_status_monitor

… and when there is no activity but we track the last time the table was synchronized.

blog_31_2_replication_status_monitor_21

As I said earlier, this method has a main following caveat:
We may only claim that a replication process is not running at a given time but we may not know if the replication process is done correctly or with errors.
Remember that this is more a workaround than a perfect solution.
Hope it helps! Please feel free to share your comments about this script!

 

Leave a Reply


− 1 = three

David Barbarin
David Barbarin

Senior Consultant & Microsoft Technology Leader