Infrastructure at your Service

In previous blogs, I talked about some basis and presented some possible architectures for Alfresco, I talked about the Clustering setup for the Alfresco Repository, the Alfresco Share and for ActiveMQ. In this one, I will talk about the Front-end layer, but in a very particular setup because it will also act as a Load Balancer. For an Alfresco solution, you can choose the front-end that you prefer and it can just act as a front-end to protect your Alfresco back-end components, to add SSL or whatever. There is no real preferences but you will obviously need to know how to configure it. I posted a blog some years ago for Apache HTTPD as a simple front-end (here) or you can check the Alfresco documentation which now include a section for that as well but there is no official documentation for a Load Balancer setup.

In an Alfresco architecture that includes HA/Clustering you will, at some point, need a Load Balancer. From time to time, you will come across companies that do not already have a Load Balancer available and you might therefore have to provide something to fill this gap. Since you will most probably (should?) already have a front-end to protect Alfresco, why not using it as well as a Load Balancer? In this blog, I choose Apache HTTPD because that’s the front-end I’m usually using and I know it’s working fine as a LB as well.

The architectures that I described in the first blog of this series, there always were a front-end installed on each node with Alfresco Share and there were a LB above that. Here, these two boxes are actually together. There are multiple ways to set that up but I didn’t want to talk about that in my first blog because it’s not really related to Alfresco, it’s above that so it would just have multiplied the possible architectures that I wanted to present and my blog would just have been way too long. There were also no communications between the different front-end nodes because technically speaking, we aren’t going to setup Apache HTTPD as a Cluster, we only need to provide a High Availability solution.

Alright so let’s say that you don’t have a Load Balancer available and you want to use Apache HTTPD as a front-end+LB for a two-node Cluster. There are several solutions so here are two possible ways to do that from an inbound communication point of view that will still provide redundancy:

  • Setup a Round Robin DNS that points to both Apache HTTPD node1 and node2. The DNS will redirect connections to either of the two Apache HTTPD (Active/Active)
  • Setup a Failover DNS with a pretty low TimeToLive (TTL) which will point to a single Apache HTTPD node and redirect all traffic there. If this one isn’t available, it will failover to the second one (Active/Passive)

 

In both cases above, the Apache HTTPD configuration can be exactly the same, it will work. From an outbound communication point of view, Apache HTTPD will talk directly with all the Share nodes behind it. To avoid disconnection and loss of sessions in case an Apache HTTPD is going down, the solution will need to support session stickiness across all Apache HTTPD. With that, all communications coming a single browser will always be redirected to the same backend server which ensures that the sessions are still intact, even if you are losing an Apache HTTPD. I mentioned previously that there won’t be any communications between the different front-ends so this session stickiness must be based on something present inside the session (header or cookie) or inside the URL.

With Apache HTTPD, you can use the Proxy modules to provide both a front-end configuration as well as a Load Balancer but, in this blog, I will use the JK module. The JK module is provided by Apache for communications between Apache HTTPD and Apache Tomcat. It has been designed and optimized for this purpose and it also provides/supports a Load Balancer configuration.

 

I. Apache HTTPD setup for a single back-end node

For this example, I will use the package provided by Ubuntu for a simple installation. You can obviously build it from source to customize it, add your best practices, aso… This has nothing to do with the Clustering setup, it’s a simple front-end configuration for any installation. So let’s install a basic Apache HTTPD:

[[email protected]_n1 ~]$ sudo apt-get install apache2 libapache2-mod-jk
[[email protected]_n1 ~]$ sudo systemctl enable apache2.service
[a[email protected]_n1 ~]$ sudo systemctl daemon-reload
[[email protected]_n1 ~]$ sudo a2enmod rewrite
[[email protected]_n1 ~]$ sudo a2enmod ssl

 

Then to configure it for a single back-end Alfresco node (I’m just showing a minimal configuration again, there is much more to do add security & restrictions around Alfresco and mod_jk):

[[email protected]_n1 ~]$ cat /etc/apache2/sites-available/alfresco-ssl.conf
...
<VirtualHost *:80>
    RewriteRule ^/?(.*) https://%{HTTP_HOST}/$1 [R,L]
</VirtualHost>

<VirtualHost *:443>
    ServerName            dns.domain
    ServerAlias           dns.domain dns
    ServerAdmin           [email protected]
    SSLEngine             on
    SSLProtocol           -all +TLSv1.2
    SSLCipherSuite        EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH:AES2
    SSLHonorCipherOrder   on
    SSLVerifyClient       none
    SSLCertificateFile    /etc/pki/tls/certs/dns.domain.crt
    SSLCertificateKeyFile /etc/pki/tls/private/dns.domain.key

    RewriteRule ^/$ https://%{HTTP_HOST}/share [R,L]

    JkMount /* alfworker
</VirtualHost>
...
[[email protected]_n1 ~]$
[[email protected]_n1 ~]$ cat /etc/libapache2-mod-jk/workers.properties
worker.list=alfworker
worker.alfworker.type=ajp13
worker.alfworker.port=8009
worker.alfworker.host=share_n1.domain
worker.alfworker.lbfactor=1
[[email protected]_n1 ~]$
[[email protected]_n1 ~]$ sudo a2ensite alfresco-ssl
[[email protected]_n1 ~]$ sudo a2dissite 000-default
[[email protected]_n1 ~]$ sudo rm /etc/apache2/sites-enabled/000-default.conf
[[email protected]_n1 ~]$
[[email protected]_n1 ~]$ sudo service apache2 restart

 

That should do it for a single back-end Alfresco node. Again, this was just an example, I wouldn’t recommend using the configuration as is (inside the alfresco-ssl.conf file), there is much more to do for security reasons.

 

II. Adaptation for a Load Balancer configuration

If you want to configure your Apache HTTPD as a Load Balancer, then on top of the standard setup shown above, you just have to modify two things:

  • Modify the JK module configuration to use a Load Balancer
  • Modify the Apache Tomcat configuration to add an identifier for Apache HTTPD to be able to redirect the communication to the correct back-end node (session stickiness). This ID put in the Apache Tomcat configuration will extend the Session’s ID like that: <session_id>.<tomcat_id>

 

So on all the nodes hosting the Apache HTTPD, you should put the exact same configuration:

[[email protected]_n1 ~]$ cat /etc/libapache2-mod-jk/workers.properties
worker.list=alfworker

worker.alfworker.type=lb
worker.alfworker.balance_workers=node1,node2
worker.alfworker.sticky_session=true
worker.alfworker.method=B

worker.node1.type=ajp13
worker.node1.port=8009
worker.node1.host=share_n1.domain
worker.node1.lbfactor=1

worker.node2.type=ajp13
worker.node2.port=8009
worker.node2.host=share_n2.domain
worker.node2.lbfactor=1
[[email protected]_n1 ~]$
[[email protected]_n1 ~]$ sudo service apache2 reload

 

With the above configuration, we keep the same JK Worker (alfworker) but instead of using a ajp13 type, we use a lb type (line 4) which is an encapsulation. The alfworker will use 2 sub-workers named node1 and node2 (line 5), that’s just a generic name. The alfworker will also enable stickiness and use the method B (Busyness), which means that for new sessions, Apache HTTPD to choose to use the worker with the less requests being served, divided by the lbfactor value.

Each sub-worker (node1 and node2) define their type which is ajp13 this time, the port and host it should target (where the Share nodes are located) and the lbfactor. As mentioned above, increasing the lbfactor means that more requests are going to be sent to this worker:

  • For the node2 to serve 100% more requests than the node1 (x2), then set worker.node1.lbfactor=1 and worker.node2.lbfactor=2
  • For the node2 to serve 50% more requests than the node1 (x1.5), then set worker.node1.lbfactor=2 and worker.node2.lbfactor=3

 

The second thing to do is to modify the Apache Tomcat configuration to add a specific ID. On the Share node1:

[[email protected]_n1 ~]$ grep "<Engine" $CATALINA_HOME/conf/server.xml
    <Engine name="Catalina" defaultHost="localhost" jvmRoute="share_n1">
[[email protected]_n1 ~]$

 

On the Share node2:

[[email protected]_n2 ~]$ grep "<Engine" $CATALINA_HOME/conf/server.xml
    <Engine name="Catalina" defaultHost="localhost" jvmRoute="share_n2">
[[email protected]_n2 ~]$

 

The value to be put in the jvmRoute parameter is just a string so it can be anything but it must be unique across all Share nodes so that the Apache HTTPD JK module can find the correct back-end node that it should transfer the requests to.

It’s that simple to configure Apache HTTPD as a Load Balancer in front of Alfresco… To check which back-end server you are currently using, you can use the browser’s utilities and in particular the network recording which will display, in the headers/cookies section, the Session ID which will therefore display the value that you put in the jvmRoute.

 

 

Other posts of this series on Alfresco HA/Clustering:

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Morgan Patou
Morgan Patou

Senior Consultant & Technology Leader ECM