Infrastructure at your Service

David Barbarin

Load Balancing with Docker Swarm mode and SQL Server containers

Docker swarm mode provides an easy way to publish ports for services. Indeed, in this mode all nodes participate in an ingress routing mesh and accept connections to any service port published. Thus, all incoming requests are routed to available nodes hosting a service.

That’s a pretty cool feature but it has some drawbacks. Let’s say that a node goes wrong … In this case the application must retry connecting to the next available node and it will likely imply changing the application connection string or using a DNS redirection. A load balancer outside of the Swarm provides a better way to connect to the containers without having to worry about the cluster nodes from an application perspective. Thanks to the ingress routing all of the published services are available through any of the swarm nodes and the load balancer can be set to use the swarm private IP addresses without a concern of which node is hosting what service.

After taking a look at the existing open source projects / solutions in the market, I quickly reduced the scope of my search to only 2 ones: nginx and HAProxy. Each product has pros and cons but I choose HAProxy that fit well with my tests. In fact, I used more specifically the dockercloud-haproxy because it includes an HAProxy docker image that runs with Docker Swarm mode including docker-compose. I’m not sure this project is dead or alive referring to the dockerhub page that states the following: Unfortunately, this project is currently put into maintenance mode. Please do not send any PR for new features. We will still fix bugs if there is any. It exists some other alternatives as proxy.dockerflow but once again the dockercloud-haproxy project remained relevant for what I wanted to test.

Here my docker-compose file:

version: '3.1'
services:
  db: 
    build: .
    image: 127.0.0.1:5000/dbi_linux_sql2017:CU4
    ports: 
      - "1433:1433"
    volumes:
      - /u00/db2:/u00
      - /u01/db2:/u01
      - /u02/db2:/u02
      - /u03/db2:/u03
      - /u98/db2:/u98
    environment:
      - MSSQL_SA_PASSWORD_FILE=/run/secrets/mssql_sa_password
      - ACCEPT_EULA=Y
      - MSSQL_PID=Developer
      - MSSQL_USER=dbi
      - MSSQL_USER_PASSWORD_FILE=/run/secrets/mssql_user_password
      - TZ=Europe/Berlin
      - SERVICE_PORTS=1433
    networks:
      - back-tier
    deploy:
      replicas: 1
      placement:
        constraints: [node.role != manager]
      resources:
        limits:
          cpus: '1'
          memory: 3584M
        reservations:
          cpus: '1'
          memory: 3072M
    secrets:
      - source: mssql_sa_password
        target: mssql_sa_password
      - source: mssql_user_password
        target: mssql_user_password

  proxy:
      image: dockercloud/haproxy
      depends_on:
        - db
      environment:
        - BALANCE=leastconn
        - ADDITIONAL_SERVICES=project_dir:db
        - MODE=tcp
      volumes:
        - /var/run/docker.sock:/var/run/docker.sock
      ports:
        - 80:80
        - 1936:1936
      networks:
        - back-tier
      deploy:
        placement:
          constraints: [node.role == manager]

networks:
  back-tier:
    external:
      name: backend-server
  front-tier:
    external:
      name: frontend-server

secrets: # top level secrets block
  mssql_sa_password:
    external: true
  mssql_user_password:
    external: true

 

It includes 2 services:

  • db (my SQL Server container)
  • proxy (my HAProxy server)

I used some Swarm Mode specific environment variables:

  • SERVICE_PORTS=1433 – to expose the port of my db service (mssql default port)
  • BALANCE=leastconn – because it is recommended with long sessions as LDAP, SQL, TSE etc…
  • ADDITIONAL_SERVICES=project_dir:db – to identify the db service from the proxy service
  • MODE=tcp – mode of load balancing for HAProxy (TCP in my case)

I deployed the stack as following:

$ docker stack deploy -c docker-compose-test.yml mssql
Ignoring unsupported options: build

Creating service mssql_db
Creating service mssql_proxy

 

My service state was as follows:

$ docker service ls --filter name=mssql
ID                  NAME                MODE                REPLICAS            IMAGE                                  PORTS
26zdisl9r64y        mssql_db            replicated          1/1                 127.0.0.1:5000/dbi_linux_sql2017:CU4   *:1433->1433/tcp
nhje3081gwr8        mssql_proxy         replicated          1/1                 dockercloud/haproxy:latest             *:80->80/tcp,*:1936->1936/tcp

 

Concerning the mssql_proxy service we may notice the exposed ports 80 and 1936. The former will be used to connect to the HAProxy for the SQL Server connection redirection and the latter concers the default port to get HAProxy stats information (from a web browser).

blog 132 - 2 - docker swarm node status1

Here an example of my HAProxy output:

blog 132 - 1 - docker swarm HA proxy stats

The db service task (mssql_db.1.o9s2xxxxx) – that corresponds in fact to my service task (or container) is well-identified by the HA proxy.

So, I tried to connect to my SQL Server DB container through the HAProxy:

C:\Users\clustadmin>sqlcmd -S docker1,80 -Usa -PPassw0rd1 -Q"SELECT @@SERVERNAME"

---------------------------------------------------------------------------------
cf7f9d6036f3

 

It worked like a charm!

The next step consisted in simulating a failure of my docker node DOCKER3 …

blog 132 - 3 - docker swarm node status2

… and try again a connection to the SQL Server DB container that has restarted on DOCKER2 node. As expected, the connection kept working and was automatically redirected by the HAProxy to the next available node. Obviously in this case the redirection was not transparent for the application. We have to take to into account the short period of outage that corresponds to the mssql_db container restart duration. By the way, if you take a look at the HAProxy default_service section you may notice that the task id has changed (mssql_db.1.zo8yqxxxxxx) meaning the system has created a new task (or container). However, my mssql_db container host name remained the same in my case.

blog 132 - 4 - docker swarm HA proxy stats2

C:\Users\clustadmin>sqlcmd -S docker1,80 -Usa -PPassw0rd1 -Q"SELECT @@SERVERNAME"

---------------------------------------------------------------------------------
cf7f9d6036f3

 

Another interesting point I had to dig further concerns the HAProxy health check by itself. In my case I used TCP-based health check to ensure the service is alive but it is probably not good enough to check if my SQL Server instance (in other words my application) is working correctly. I spent some times to read articles and documentations and I discovered we may use custom health checks with HAProxy with databases as MySQL, PostgreSQL and Redis. Unfortunately, at the time I’m writing this blog post nothing concerning SQL Server probably because it is pretty new in this area (maybe I get wrong … so please feel free to comment) . I found out some other alternatives as xinetd that may be used to trigger a custom routine (by using sqlcmd command line tool for example) but in my context it makes the game a little bit more complex because I have to include it to my SQL Server docker image. After thinking a little bit about this issue, I realized we already implemented such routine directly inside the SQL Server image itself (HEALTHCHECK section) that uses precisely the sqlcmd command line tool to check the SQL Server connection is working correctly.

/opt/mssql-tools/bin/sqlcmd -S localhost,$MSSQL_TCP_PORT -U sa -P ${SA_PASSWORD} -Q "select 1"

 

By the way, the task status can be viewed through the docker ps command as follows:

$docker ps -f name=mssql --format 'table {{.ID}}\t{{.Names}}\t{{.CreatedAt}}\t{{.Status}}'
CONTAINER ID        NAMES	CREATED AT                       STATUS
9bfe04a97617        mssql_db.1.zc8yqp9llmjdcn3df49izyj72	2018-05-01 23:01:39 +0200 CEST   Up 9 hours (healthy)

 

So, the combination of the docker health check routine and the HAProxy TCP health check seems to be a good solution to address different kind of failure including docker swarm node failures and application failures as well.

This is just the first step about using an HAProxy with SQL Server (redirection capabilities). You may also extend this first scenario with a low cost solution including multiples SQL Server Express containers to address scale-out needs with Reporting queries for instance. I remember Christophe Laporte(b) gave a session about some years ago. I will try to blog about soon!

See you!

 

 

 

 

5 Comments

  • Masoud says:

    hi dear

    thank you for this great post
    now in 2019 I searched for some solutions to scale up ms sql server running on docker containers with Docker Swarm orchestration
    found no tutorial or good content except your blog posts
    but now I can not orchestrate ms sql server via Swarm mode
    it needs High Availability Groups and Failover Cluster enabled in sql servers and also I think it needs to share Storage Volumes between sql containers ¿

    or maybe Im wrong

    I really thank you if you can help me some

    wanna orchestrate and set ha for sql server on Swarm for High Availability for 100-200 thousands concurrent sql connections
    every time 5000 concurrent users running my app sql instance is crashing

    plz help me thank you

    • David Barbarin says:

      Hi Masoud,

      Unfortunately Microsoft doesn’t provide support for Swarm but only for K8s orchestrator.
      Therefore there is no supported way to implement SQL Server with AG / FCI capabilities on Swarm. You may just rely on Swarm HA in this case to protect your container from node failure.

      Regards

  • masoud says:

    thank you for your reply

    so anyway we can not use swarm for sql server HA
    so whats best solution to switch our infra into it :
    between :
    Rancher ( stand alone )
    Kubernetes ( with or without rancher )

    to have Sql server orchestration

    • David Barbarin says:

      Hi Masoud,

      I would say this is the same discussion than Docker Swarm. As far as I know Rancher is not supported by Microsoft for SQL Server and Rancher doesn’t support SQL Server so far if I’m not wrong. Are you looking for using MSSQL containers in production?

  • David Barbarin says:

    But don’t get me wrong .. if relying on Docker Swarm high availability capabilities meets your requirement, Docker Swarm may be a good candidate. It is like using SQL Server VMs that rely on VSphere HA capabilities. Microsoft doesn’t support Swarm but if you face an issue from the container itself (not related to Swarm mode) you will likely get support from Microsoft.

    However if you need HA capabilities of SQL Server, K8s will be the way to go. But be aware AlwaysOn AGs will be supported since SQL Server 2019 (not in GA actually)

    Regards

    Regards

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

David Barbarin
David Barbarin

Principal Consultant & Microsoft Technology Leader