To be honest, I am not a Windows Azure specialist. To learn a little more about the subject, I followed a TechEd Europe 2013 session on High Availability and Disaster Recovery on Windows Azure Virtual Machines.

Overview of Windows Azure

Windows Azure is Microsoft’s application platform for the public cloud. You can use this platform in many different ways:

  • build a web application that runs and stores its data on Windows Azure data centers
  • store only your data on Windows Azure and run your application on-premises (outside the public cloud)
  • create VMs for development or test
  • etc.

Windows Azure offers multiple solutions.

Windows Azure Principles

It is economical and usage-based:

  • pay for what you use
  • pay by the minute
  • free use of MSDN software in VMs

It is automated and elastic:

  • you can use PowerShell automation
  • easy to scale out
  • easy to scale up

It is managed, hybrid, and supports AlwaysOn:

  • simple load-balancing possible
  • managed availability
  • easy hybrid (Windows Azure and on-premises)

Infrastructure services on Windows Azure

What does Windows Azure offer in terms of infrastructure services?

  • Experience of IT professionals
    • multiple ways to get started: Management Portal, scripting…
  • Application images available for quick installation
    • SQL Server (2008 R2 Web/Standard/Enterprise, 2012 Express/Web/Standard/Enterprise), SharePoint 2010/2013…
  • Storage Manageability and Mobility
    • you can use your own storage and/or Windows Azure storage
  • High Availability Features
    • Power unit, rack, switch: load balancing between racks containing VMs (availability SLA: 99.95%)
  • Advanced Hybrid Networking
    • Virtual Network Site-to-Site VPN between the on-premises datacenter and Windows Azure
    • Virtual Network Point-to-Site: individual on-premises computers behind a firewall and remote workers connect via the Windows Azure gateway
  • IaaS, PaaS, and Agility
    • Pay by the minute: when the VM stops, payment stops; no rounding up, no minimum
    • MSDN Usage Improvements
      • MSDN products can be used on VMs
      • Single monetary credit instead of multiple
      • Focusing on Test/dev usage
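
To make the pay-by-the-minute rule concrete, here is a minimal Python sketch. The hourly rate is a made-up placeholder and not an actual Windows Azure price, and the function name `vm_cost` is purely illustrative.

```python
from fractions import Fraction


def vm_cost(minutes_running: int, hourly_rate: str) -> Fraction:
    """Cost of a VM billed per minute, with no rounding up and no minimum."""
    per_minute = Fraction(hourly_rate) / 60
    return per_minute * minutes_running


# Hypothetical rate of 0.12 currency units per hour (placeholder, not a real price).
RATE = "0.12"

# 45 minutes are billed as 45 minutes, not rounded up to a full hour.
print(float(vm_cost(45, RATE)))

# A VM that did not run costs nothing: there is no minimum charge.
print(float(vm_cost(0, RATE)))
```

Exact fractions are used instead of floats only to keep the arithmetic free of rounding noise.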

Three “infrastructure as a service” scenarios

There are three main “infrastructure as a service” scenarios for SQL Server high availability and disaster recovery:

  • HA within Azure
    • Availability of SQL Server in an Azure VM
    • Protection from issues impacting SQL Server or the VM
    • Using another SQL Server VM in the same Azure DC
  • DR between On-Premises and Azure
    • Ensure availability of an on-premises SQL Server (physical or virtual)
    • Protection from issues impacting the on-premises DC
    • Using a SQL Server VM in Azure
  • DR across Azure DCs
    • Availability of SQL Server in an Azure VM
    • Protection from issues impacting the Azure DC
    • Using another SQL Server VM in a different Azure DC

SQL Server High Availability with Azure

Why do you need a high availability solution for SQL Server on Azure?

  • Azure’s failure detection covers the VM, not SQL Server
    • SQL Server service could be down or hung
    • Servicing of the guest OS can cause downtime
    • Servicing of SQL Server can cause downtime
  • Azure’s service healing involves restarting the VM on a different host
    • around 12 minutes of downtime each time
  • Azure’s upgrade involves servicing the host OS and restarting the VM on the host
    • around 15 minutes of downtime each time
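
To put these downtime figures in perspective, the following Python sketch converts an availability percentage into a downtime budget. It assumes a 30-day month and reuses the 99.95% figure quoted earlier for load-balanced VMs; comparing a single VM restart against that figure is only an illustration, not a statement about how the SLA is actually measured.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(sla_percent: float, days: int = 30) -> float:
    """Minutes of downtime allowed over `days` days at the given availability level."""
    total_minutes = days * MINUTES_PER_DAY
    return total_minutes * (100.0 - sla_percent) / 100.0


# Assumption: a 30-day month and the 99.95% availability figure.
budget = downtime_budget_minutes(99.95)
print(round(budget, 1))  # about 21.6 minutes per month

# Share of that budget consumed by a single restart of around 12 minutes.
print(round(12 / budget * 100), "% of the monthly budget")
```

Under these assumptions, one service-healing restart already uses more than half of the monthly budget, which is why a second SQL Server VM is needed for high availability.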

Have a look at this example:

b2ap3_thumbnail_Azure1.jpg

There are some limitations in the current version of Windows Azure; for example, mirroring is only possible with one secondary.

b2ap3_thumbnail_Azure2.jpg

Availability Group

You can also use SQL Server technologies such as Availability Groups in Windows Azure. They offer the following advantages:

  • Provides many other capabilities
    • Flexible Failover Policy
    • Automatic Page Repair
    • Backups on Secondaries
    • Improved Manageability
    • FileStream & FileTable support
  • But requires
    • Windows Cluster
      • Though no shared storage
    • Same Windows Domain
      • Needs an Active Directory Domain Controller

For the moment, Availability Group Listeners are not supported by Windows Azure; support is expected within the next couple of months.
In the meantime, it is possible to use Failover Partner as in database mirroring, but only with two replicas.
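
As an illustration of the Failover Partner approach, here is a minimal Python sketch that assembles an ADO.NET-style connection string using the `Failover Partner` keyword. The server names `sqlvm1` and `sqlvm2`, the database name `SalesDb`, and the helper function itself are hypothetical; Windows authentication is assumed because both replicas sit in the same domain.

```python
def mirroring_connection_string(primary: str, partner: str, database: str) -> str:
    """Build a connection string that names both replicas.

    The client tries `primary` first and falls back to `partner`
    if the primary is not available.
    """
    parts = [
        ("Server", primary),
        ("Failover Partner", partner),
        # The database is named explicitly, since the partnership is defined per database.
        ("Database", database),
        # Assumption: Windows authentication within the shared domain.
        ("Integrated Security", "True"),
    ]
    return ";".join(f"{key}={value}" for key, value in parts)


# Hypothetical server and database names, for illustration only.
print(mirroring_connection_string("sqlvm1", "sqlvm2", "SalesDb"))
```

Only one partner can be named this way, which matches the two-replica limit mentioned above.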

How to configure SQL Server Availability Group?

You will have to set up an Active Directory Domain Controller, add the VMs to this domain, and create a Windows Cluster.
Take care: Azure’s DHCP assigns a duplicate IP address to the cluster network name (CNN), which can cause cluster creation to fail. Availability Groups themselves do not use the CNN.
As a workaround, you can use this script.
The rest of the process is the same as on-premises.

Mirroring

If you need Windows Authentication, you will have to set up an Active Directory Domain Controller and add the VMs to this domain. The rest of the process is the same as on-premises.

SQL Server Disaster Recovery between On-Premises and Azure

Why should we need that?

  • An event can cause the on-premises SQL Server to become unavailable temporarily (gateway failure) or permanently (flooding).
  • A disaster recovery site is expensive
    • site rent + maintenance
    • hardware
    • operations (maintenance)

How to do it?

  • Deploy one or more secondary replicas for on-premises SQL Server
    • Replicas continuously synchronize
  • Choose the best region: West US, East US, East Asia, Southeast Asia, North Europe, West Europe
    • Political considerations
    • Latency
  • Low TCO (total cost of ownership)
    • VM and storage

It should look like this:

b2ap3_thumbnail_Azure3.jpg

b2ap3_thumbnail_Azure4.jpg

There are some limitations in terms of supported technologies:

b2ap3_thumbnail_Azure5.jpg

SQL Server Disaster Recovery across Azure Datacenters

Why could it be interesting to use this configuration?

  • If you use multiple disks:
    • Azure’s Geo-Replication doesn’t guarantee write order across disks
    • This can break SQL Server’s recovery requirement (log always more up-to-date than data)
  • If Azure’s DR doesn’t satisfy your requirements:
    • No SLA
    • Based on Azure tests:
      • VM recovery: less than 24 hours
      • Data loss: less than 30 minutes
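
The write-order problem with multiple disks can be sketched in a few lines of Python. This is a simplified model and not how SQL Server or Azure represent their state: log sequence numbers are reduced to plain integers, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReplicatedCopy:
    log_lsn: int   # last log record that reached the replicated log disk
    data_lsn: int  # newest change present on the replicated data disk


def is_recoverable(copy: ReplicatedCopy) -> bool:
    """Recovery requires the log to be at least as up to date as the data."""
    return copy.log_lsn >= copy.data_lsn


# Writes replicated in order: the log is ahead of the data, so recovery can proceed.
in_order = ReplicatedCopy(log_lsn=120, data_lsn=110)

# Writes replicated out of order across disks: the data disk contains a change
# that the replicated log does not know about.
out_of_order = ReplicatedCopy(log_lsn=100, data_lsn=110)

print(is_recoverable(in_order))      # True
print(is_recoverable(out_of_order))  # False
```

With a single disk the order of writes is preserved, so the second situation cannot occur; with several disks replicated independently it can.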

b2ap3_thumbnail_Azure6.jpg

These are the supported technologies:

b2ap3_thumbnail_Azure7.jpg

For the moment, Availability Groups are not supported in this configuration because they require the same Windows Domain. They will, however, be supported later this year.

In the meantime, it is possible to use Database Mirroring or an Availability Group with an on-premises disaster recovery replica.

I have tried to describe different ways to achieve high availability and disaster recovery with Windows Azure, based on a TechEd Europe 2013 session. Hope it helps!