Do you already have a plan to migrate your old SQL Server 2012 availability group infrastructure to Windows Server 2016 and SQL Server 2016? In a previous post, I talked about distributed availability groups and cross-cluster migration scenarios, but this time the game is different because we want to achieve an in-place upgrade of the existing AG infrastructure. This question will probably become a concern next year, and if we take a closer look at the improvements shipped with the new versions of Windows and SQL Server, we will not be disappointed.
Indeed, cluster rolling upgrade is a new feature of Windows Server 2016 that allows us to migrate the WSFC side of the existing database infrastructure smoothly (and almost transparently). On the other side, upgrading highly available replicas from SQL Server 2012 to SQL Server 2016 is also possible without completely reinstalling the availability group infrastructure. Thus, we may benefit from a temporary, mixed infrastructure on both sides to reduce the outage window for our applications. I can think of some customers for whom this could be helpful given their business and their corresponding SLAs.
So let’s demonstrate this kind of scenario with a classic customer availability group infrastructure. Most of the availability groups implemented in my area consist of two replicas, meaning a WSFC with 2 cluster nodes at the lowest level of the global architecture, as shown below:
So the question is how to migrate the above infrastructure from the 2012 version to the 2016 version (both Windows and SQL Server) with low downtime. Well, one solution consists of preparing and adding, as a first step, two temporary extra nodes that run Windows Server 2016 and SQL Server 2012. Unfortunately, we cannot yet directly mix two different versions of SQL Server in an existing availability group. We have to execute an extra step to upgrade, one by one, each replica we want to run on SQL Server 2016. Having two extra nodes allows us to prepare the migration smoothly without impacting the existing highly available infrastructure.
So let’s begin by adding 2 nodes running Windows Server 2016. Basically, we may use either the GUI or PowerShell cmdlets for that.
The initial scenario (2012 version) is as follows:
Let’s add the two extra cluster nodes which run on Windows Server 2016. According to the Microsoft TechNet procedure, the key point is to perform this action exclusively from a Windows Server 2016 node.
The PowerShell cmdlet used is the same as in the previous version, and I executed it for the two additional nodes (WIN20168SQL16 and WIN20169SQL16), which both run on Windows Server 2016. Just remember to exclude the two extra nodes from the quorum vote to avoid impacting the existing configuration.
However, we may notice a new cluster functional level property, as shown below. The value is equal to 8, meaning that the cluster has switched to a temporary / mixed mode because at the moment we have cluster nodes on both the 2012 and 2016 versions.
The transitional infrastructure now includes 4 nodes. The first 2 nodes run on Windows Server 2012 whereas the last 2 nodes run on Windows Server 2016. An availability group runs on top of the first 2 nodes, and two additional replicas are ready to be enrolled in the existing infrastructure.
So now let’s move on to the SQL Server side and add the 2 additional replicas.
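The node list and quorum votes can also be checked from any AlwaysOn-enabled instance of the cluster. The query below is only a minimal sketch based on the sys.dm_hadr_cluster_members DMV. It assumes the two Windows Server 2016 nodes were excluded from the quorum vote as advised above, in which case they should report 0 votes.

```sql
-- Sketch: list WSFC members and their quorum votes as seen by SQL Server.
-- Run on any instance of the cluster with the AlwaysOn feature enabled.
SELECT
    member_name,             -- cluster node (or witness) name
    member_type_desc,        -- CLUSTER_NODE, DISK_WITNESS, FILE_SHARE_WITNESS...
    member_state_desc,       -- UP or DOWN
    number_of_quorum_votes   -- expected to be 0 for the two temporary 2016 nodes
FROM sys.dm_hadr_cluster_members
ORDER BY member_name;
```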
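For readers who prefer T-SQL over the wizard, adding one of the new replicas could look like the sketch below. The availability group and instance names are taken from this post, but the endpoint URL (domain name and port 5022) is purely an assumption for illustration and must be replaced with the real mirroring endpoint of the new instance.

```sql
-- Sketch only (sqlcmd mode): add one new replica in asynchronous mode.
:CONNECT WIN20121SQL16\SQL12
-- On the current primary replica
USE [master]
GO
ALTER AVAILABILITY GROUP [dummygrp]
ADD REPLICA ON N'WIN20168SQL16\SQL12'
WITH (
    -- Hypothetical endpoint URL: adjust the FQDN and port to your environment
    ENDPOINT_URL = N'TCP://WIN20168SQL16.example.local:5022',
    AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
    FAILOVER_MODE = MANUAL
);
GO
:CONNECT WIN20168SQL16\SQL12
-- On the new secondary replica: join the availability group
USE [master]
GO
ALTER AVAILABILITY GROUP [dummygrp] JOIN;
GO
-- The databases still have to be restored WITH NORECOVERY on the new replica
-- and then joined with ALTER DATABASE ... SET HADR AVAILABILITY GROUP = [dummygrp]
```

The same steps would be repeated for WIN20169SQL16\SQL12.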
The initial AG scenario is as follows:
After adding the two replicas in asynchronous mode, we get the following picture:
At this step, we have to upgrade the newly added replicas to SQL Server 2016. One important thing to keep in mind here is that we absolutely have to prevent a failover to an upgraded replica until all the new secondary replicas have been upgraded. Indeed, according to the Microsoft documentation, an upgraded primary replica can no longer ship logs to any secondary replica whose SQL Server instance has not yet been upgraded to the same version.
So in my case, I first upgraded the WIN20169SQL16\SQL12 replica and then the WIN20168SQL16\SQL12 replica, as shown below:
The new transitional infrastructure is shown in the picture below:
The next part of the migration includes a short downtime. None of the previous steps has required an outage so far.
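A quick way to see which version each instance is running at any point of the process is to query SERVERPROPERTY on each replica. This is just a sketch in sqlcmd mode, using the instance names from this post. A ProductVersion starting with 11 means SQL Server 2012 and 13 means SQL Server 2016.

```sql
-- Sketch only (sqlcmd mode): check the build of each new replica.
:CONNECT WIN20168SQL16\SQL12
SELECT
    @@SERVERNAME                     AS instance_name,
    SERVERPROPERTY('ProductVersion') AS product_version, -- 11.x = 2012, 13.x = 2016
    SERVERPROPERTY('ProductLevel')   AS product_level;
GO
:CONNECT WIN20169SQL16\SQL12
SELECT
    @@SERVERNAME                     AS instance_name,
    SERVERPROPERTY('ProductVersion') AS product_version,
    SERVERPROPERTY('ProductLevel')   AS product_level;
GO
```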
Basically the next part of the procedure will include the following steps:
- Choose the 2016 replica that will become the next primary and change its replication mode to synchronous in order to prevent data loss
- Fail over the availability group to this replica (at this step the old 2012 replicas will no longer be synchronized because, as said previously, the new primary replica can no longer ship logs to them)
- Change the replication mode of the second 2016 replica to synchronous to match the initial configuration
- Remove the old 2012 replicas from the availability group
The T-SQL script was as follows in my case:
:CONNECT WIN20121SQL16\SQL12
-- Temporarily change replication to synchronous
-- for the next SQL Server 2016 primary replica
USE [master]
GO
ALTER AVAILABILITY GROUP [dummygrp]
MODIFY REPLICA ON N'WIN20168SQL16\SQL12'
WITH (AVAILABILITY_MODE = SYNCHRONOUS_COMMIT)
GO

:CONNECT WIN20168SQL16\SQL12
-- Initiate failover to the next SQL Server 2016 primary replica
USE [master]
GO
ALTER AVAILABILITY GROUP [dummygrp] FAILOVER;
GO

:CONNECT WIN20168SQL16\SQL12
-- Change replication to synchronous with automatic failover
-- for the second SQL Server 2016 replica
USE [master]
GO
ALTER AVAILABILITY GROUP [dummygrp]
MODIFY REPLICA ON N'WIN20169SQL16\SQL12'
WITH ( AVAILABILITY_MODE = SYNCHRONOUS_COMMIT );
ALTER AVAILABILITY GROUP [dummygrp]
MODIFY REPLICA ON N'WIN20169SQL16\SQL12'
WITH ( FAILOVER_MODE = AUTOMATIC );
GO

:CONNECT WIN20168SQL16\SQL12
-- Remove old replicas (2012)
USE [master]
GO
ALTER AVAILABILITY GROUP [dummygrp]
REMOVE REPLICA ON N'WIN20121SQL16\SQL12';
GO
ALTER AVAILABILITY GROUP [dummygrp]
REMOVE REPLICA ON N'WIN20122SQL16\SQL12';
GO
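Before running the FAILOVER statement of the script above, it is probably worth confirming that the databases on the target replica have reached the SYNCHRONIZED state, otherwise a planned manual failover is not possible without data loss. The query below is a minimal sketch of such a check, meant to be run on the current primary replica. Only the availability group name comes from this post.

```sql
-- Sketch: synchronization state of every database on every replica.
-- Run on the current primary, after switching the target replica to synchronous.
SELECT
    ar.replica_server_name,
    ar.availability_mode_desc,        -- SYNCHRONOUS_COMMIT expected for the failover target
    DB_NAME(drs.database_id)          AS database_name,
    drs.synchronization_state_desc,   -- SYNCHRONIZED expected before a planned failover
    drs.synchronization_health_desc
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id
JOIN sys.availability_groups AS ag
    ON ag.group_id = drs.group_id
WHERE ag.name = N'dummygrp'
ORDER BY ar.replica_server_name, database_name;
```

The same query can be reused after the failover to watch the old 2012 replicas stop synchronizing and, later on, to validate the final configuration.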
Let’s go back to the cluster side. The next step consists of removing the old cluster nodes from the WSFC by using the usual commands such as Remove-ClusterNode.
… And we may finally update the cluster functional level to 9 (2016 version). Just be aware that upgrading the cluster functional level to 2016 makes the process irreversible. So reverting to the initial configuration will simply not be possible, or at least it will require extra steps and a longer downtime as well.
And let’s get the final view of our upgraded availability group dashboard:
The listener stays the same, so the change is almost transparent to applications.
In this blog post we had a glimpse of the new capabilities of both Windows Server 2016 and SQL Server 2016 in terms of rolling upgrade. Of course, reality will probably be a little more complicated once we introduce other parameters such as the customer context, the number of availability groups, the performance impact of adding temporary replicas, external dependencies and so on. But these features seem promising and may be very helpful for future migration scenarios. I’m looking forward to trying them out at customer shops!