Infrastructure at your Service

David Barbarin

SQL Server availability groups, SQL Browser and Shared Memory considerations

Few weeks ago, my colleagues and me discussed availability groups and network considerations for one of our customers including disabling SQL Browser service and shared memory protocol. The point was disabling both features may lead to unexpected behaviors when creating availability groups.

blog 145 - 0 - AG network banner

Let’s start with the SQL Browser service. It is not uncommon to disable this service at customer shops and to use directly SQL Server listen ports instead. But if you go trough the availability group wizard you will find there a plenty of blockers when actions than require connecting to the secondary replica as adding a database, performing a failover and so on.

Disabling the SQL Browser service doesn’t mean you can not reach out your SQL Server instance by using the named instance format SERVER\INSTANCE. There are some scenarios that’s work perfectly including either connecting from the local server through the shared memory or by using SQL Server aliases. Let’s say my infrastructure includes 2 AG replicas vmtest2012r04\SQL2014 and vmtest2012r05\SQL2014. SQL browser is disabled and shared memory is enabled on each. There are no aliases as well. If you try to connect from the vmtest2012r04\SQL2014 by using named instance format it will work on the local replica (through shared memory) but it won’t work if you try to connect to the remote replica vmtest2012r05\SQL2014. In the latest case, you have will to use SERVER,PORT format as shown below:

C:\Users\dab>sqlcmd -S vmtest2012r204\SQL2014 -Q"SELECT 'OK' AS connection"
connection
---------
OK

(1 rows affected)

C:\Users\dab>sqlcmd -S vmtest2012r205\SQL2014 -Q"SELECT 'OK' AS connection"
Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : SQL Server Network Inte
rfaces: Error Locating Server/Instance Specified [xFFFFFFFF]. .
Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : Login timeout expired.
Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : A network-related or in
stance-specific error has occurred while establishing a connection to SQL Server
. Server is not found or not accessible. Check if instance name is correct and i
f SQL Server is configured to allow remote connections. For more information see
 SQL Server Books Online..

C:\Users\dab>sqlcmd -S vmtest2012r205,1453 -Q"SELECT 'OK' AS connection"
connection
---------
OK

 

But I guess this is not a big surprise for you. This kind of configuration works well with availability group but at the cost of some compromises. Indeed, creating an availability group remains pretty easy and you just have to keep using SERVER,PORT format when the wizard asks for connection information.

blog 145 - 1 - AG wizard - replica

But the game is different for adding a database to the AG or at least any operation that requires to connect to the replicas. In this case the Wizard forces to connect to the second replica by using SERVER\INSTANCE format leading to get stuck at this step.

blog 145 - 2 - AG wizard - add DB

The only way is to go through T-SQL script (or PowerShell command) to change the format to SERVER,PORT. Probably something that may be fixed by Microsoft in the future.

Update 24.10.2018: A correction from Damien Berchotteau: You can override the connection string by using server=server_name,port and continue to go through SSMS.

blog 145 - 3 - AG wizard - connection string

Let’s add now to the equation disabling the shared memory protocol on each replica. I met some customers who disable it to meet their internal best practices because their applications are not intended to connect locally on the same server than their database engine. At the first glance, this is not a bad idea but we may get in trouble with operations performed on availability group architectures. This is a least what we experienced every time we were in this specific context. For instance, if I try to create an availability group, I will face the following timeout error message:

blog 145 - 3 - AG wizard - shared memory disabled

This is a pretty weird issue and to get more details, we have to take a look at the cluster log. Here the interesting sample of messages we may find out:

...2018/10/22-20:42:51.436 ERR   [RES] SQL Server Availability Group <AG2014>: [hadrag] ODBC Error: [08001] [Microsoft][SQL Server Native Client 11.0]SQL Server Network Interfaces: Error Locating Server/Instance Specified [xFFFFFFFF].  (268435455)
...2018/10/22-20:42:51.436 ERR   [RES] SQL Server Availability Group <AG2014>: [hadrag] ODBC Error: [HYT00] [Microsoft][SQL Server Native Client 11.0]Login timeout expired (0)
...2018/10/22-20:42:51.436 ERR   [RES] SQL Server Availability Group <AG2014>: [hadrag] ODBC Error: [08001] [Microsoft][SQL Server Native Client 11.0]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online. (268435455)
...2018/10/22-20:42:51.436 INFO  [RES] SQL Server Availability Group <AG2014>: [hadrag] Could not connect to SQL Server (rc -1)
...2018/10/22-20:42:51.436 INFO  [RES] SQL Server Availability Group <AG2014>: [hadrag] SQLDisconnect returns following information
...2018/10/22-20:42:51.451 ERR   [RES] SQL Server Availability Group <AG2014>: [hadrag] ODBC Error: [08003] [Microsoft][ODBC Driver Manager] Connection not open (0)
...2018/10/22-20:42:51.451 ERR   [RES] SQL Server Availability Group <AG2014>: [hadrag] Failed to connect to SQL Server
...2018/10/22-20:42:51.451 ERR   [RHS] Online for resource AG2014 failed.

 

It seems that the RHS.exe, through the resource associated to my AG, is not able to connect to the SQL Server replica during the initialization phase. According to the above cluster log, the ODBC connection seems to be limited to connect by using INSTANCE\NAME format and as I far as I know there is no interface to change it with the AG cluster resource DLL (thanks Microsoft guys for confirming this point). Therefore, disabling both SQL Browser and shared memory leads to the AG cannot be brought online because a communication channel cannot be established between the primary and the cluster service. My friend MVP Christophe Laporte tried also some funny tests by trying to create custom DSN connections without luck.   So, the simplest way to fix it if you want to keep disable the SQL Browser service is to enable the shared memory on each replica. Another workaround may consist in using SQL aliases but it leads to a static configuration that requires to document well your architecture.

In a nutshell, disabling SQL Browser limits the AG operations that can be done through the GUI. Adding the shared memory to the equation may have a bigger impact to the underlying WSFC infrastructure that you have to be aware of. According to my tests, this behavior seems to be same with versions from SQL2012 to SQL2017 (on Windows) regardless the WSFC version.

Hope this helps!

 

 

 

 

 

Leave a Reply

David Barbarin
David Barbarin

Senior Consultant & Microsoft Technology Leader