Infrastructure at your Service

In previous blogs, I talked about some basis and presented some possible architectures for Alfresco, I talked about the Clustering setup for the Alfresco Repository, the Alfresco Share and for ActiveMQ. I also setup an Apache HTTPD as a Load Balancer. In this one, I will talk about the last layer that I wanted to present, which is Solr and more particularly Solr6 (Alfresco Search Services) Sharding. I planned on writing a blog related to Solr Sharding Concepts & Methods to explain what it brings concretely but unfortunately, it’s not ready yet. I will try to post it in the next few weeks, if I find the time.

 

I. Solr configuration modes

So, Solr supports/provides three configuration modes:

  • Master-Slave
  • SolrCloud
  • Standalone


Master-Slave
: It’s a first specific configuration mode which is pretty old. In this one, the Master node is the only to index the content and all the Slave nodes will replicate the Master’s index. This is a first step to provide a Clustering solution with Solr, and Alfresco supports it, but this solution has some important drawbacks. For example, and contrary to an ActiveMQ Master-Slave solution, Solr cannot change the Master. Therefore, if you lose your Master, there is no indexing happening anymore and you need to manually change the configuration file on each of the remaining nodes to specify a new Master and target all the remaining Slaves nodes to use the new Master. This isn’t what I will be talking about in this blog.

SolrCloud: It’s another specific configuration mode which is a little bit more recent, introduced in Solr4 I believe. SolrCloud is a true Clustering solution using a ZooKeeper Server. It adds an additional layer on top of a Standalone Solr which is slowing it down a little bit, especially on infrastructures with a huge demand on indexing. But at some points, when you start having dozens of Solr nodes, you need a central place to organize and configure them and that’s what SolrCloud is very good at. This solution provides Fault Tolerance as well as High Availability. I’m not sure if SolrCloud could be used by Alfresco because sure SolrCloud also has Shards and its behaviour is pretty similar to a Standalone Solr but it’s not entirely working in the same way. Maybe it’s possible, however I have never seen it so far. Might be the subject of some testing later… In any cases, using a SolrCloud for Alfresco might not be that useful because it’s really easier to setup a Master-Master Solr mixed with Solr Sharding for pretty much the same benefits. So, I won’t talk about SolrCloud here either.

You guessed it, in this blog, I will only talk about Standalone Solr nodes and only using Shards. Alfresco supports Solr Shards only since the version 5.1. Before that, it wasn’t possible to use this feature, even if Solr4 provided it already. When using the two default cores (the famous “alfresco” & “archive” cores), with all Alfresco versions (all supporting Solr… So since Alfresco 4), it is possible to have a High Available Solr installation by setting up two Solr Standalone nodes and putting a Load Balancer in front of it but in this case, there is no communication between the Solr nodes so, it’s only a HA solution, nothing more.

 

In the architectures that I presented in the first blog of this series, if you remember the schema N°5 (you probably don’t but no worry, I didn’t either), I put a link between the two Solr nodes and I mentioned the following related to this architecture:
“N°5: […]. Between the two Solr nodes, I put a Clustering link, that’s in case you are using Solr Sharding. If you are using the default cores (alfresco and archive), then there is no communication between distinct Solr nodes. If you are using Solr Sharding and if you want a HA architecture, then you will have the same Shards on both Solr nodes and in this case, there will be communications between the Solr nodes, it’s not really a Clustering so to speak, that’s how Solr Sharding is working but I still used the same representation.”

 

II. Solr Shards creation

As mentioned earlier in this blog, there are real Cluster solutions with Solr but in the case of Alfresco, because of the features that Alfresco adds like the Shard Registration, there is no real need to set up complex things like that. Having just a simple Master-Master installation of Solr6 with Sharding is already a very good and strong solution to provide Fault Tolerance, High Availability, Automatic Failover, Performance improvements, aso… So how can that be setup?

First, you will need to install at least two Solr Standalone nodes. You can use exactly the same setup for all nodes and it’s also exactly the same setup to use the default cores or Solr Sharding so just do what you are always doing. For the Tracking, you will need to use the Load Balancer URL so it can target all Repository nodes, if there are several.

If you created the default cores, you can remove them easily:

[[email protected]_n1 ~]$ curl -v "http://localhost:8983/solr/admin/cores?action=removeCore&storeRef=workspace://SpacesStore&coreName=alfresco"
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=removeCore&storeRef=workspace://SpacesStore&coreName=alfresco HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 150
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">524</int></lst>
</response>
* Connection #0 to host localhost left intact
[[email protected]_n1 ~]$
[[email protected]_n1 ~]$ curl -v "http://localhost:8983/solr/admin/cores?action=removeCore&storeRef=archive://SpacesStore&coreName=archive"
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=removeCore&storeRef=archive://SpacesStore&coreName=archive HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 150
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">485</int></lst>
</response>
* Connection #0 to host localhost left intact
[[email protected]_n1 ~]$

 

A status of “0” means that it’s successful.

Once that’s done, you can then simply create the Shards. In this example, I will:

  • use the DB_ID_RANGE method
  • use two Solr nodes
  • for workspace://SpacesStore: create 2 Shards out of a maximum of 10 with a range of 20M
  • for archive://SpacesStore: create 1 Shard out of a maximum of 5 with a range of 50M

Since I will use only two Solr nodes and since I want a High Availability on each of the Shards, I will need to have them all on both nodes. With a simple loop, it’s pretty easy to create all the Shards:

[[email protected]_n1 ~]$ solr_host=localhost
[[email protected]_n1 ~]$ solr_node_id=1
[[email protected]_n1 ~]$ begin_range=0
[[email protected]_n1 ~]$ range=19999999
[[email protected]_n1 ~]$ total_shards=10
[[email protected]_n1 ~]$
[[email protected]_n1 ~]$ for shard_id in `seq 0 1`; do
>   end_range=$((${begin_range} + ${range}))
>   curl -v "http://${solr_host}:8983/solr/admin/cores?action=newCore&storeRef=workspace://SpacesStore&numShards=${total_shards}&numNodes=${total_shards}&nodeInstance=${solr_node_id}&template=rerank&coreName=alfresco&shardIds=${shard_id}&property.shard.method=DB_ID_RANGE&property.shard.range=${begin_range}-${end_range}&property.shard.instance=${shard_id}"
>   echo ""
>   echo "  -->  Range N°${shard_id} created with: ${begin_range}-${end_range}"
>   echo ""
>   sleep 2
>   begin_range=$((${end_range} + 1))
> done

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=newCore&storeRef=workspace://SpacesStore&numShards=10&numNodes=10&nodeInstance=1&template=rerank&coreName=alfresco&shardIds=0&property.shard.method=DB_ID_RANGE&property.shard.range=0-19999999&property.shard.instance=0 HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 182
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">254</int></lst><str name="core">alfresco-0</str>
</response>
* Connection #0 to host localhost left intact

  -->  Range N°0 created with: 0-19999999


*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=newCore&storeRef=workspace://SpacesStore&numShards=10&numNodes=10&nodeInstance=1&template=rerank&coreName=alfresco&shardIds=1&property.shard.method=DB_ID_RANGE&property.shard.range=20000000-39999999&property.shard.instance=1 HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 182
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">228</int></lst><str name="core">alfresco-1</str>
</response>
* Connection #0 to host localhost left intact

  -->  Range N°1 created with: 20000000-39999999

[[email protected]_n1 ~]$
[[email protected]_n1 ~]$ begin_range=0
[[email protected]_n1 ~]$ range=49999999
[[email protected]_n1 ~]$ total_shards=4
[[email protected]_n1 ~]$ for shard_id in `seq 0 0`; do
>   end_range=$((${begin_range} + ${range}))
>   curl -v "http://${solr_host}:8983/solr/admin/cores?action=newCore&storeRef=archive://SpacesStore&numShards=${total_shards}&numNodes=${total_shards}&nodeInstance=${solr_node_id}&template=rerank&coreName=archive&shardIds=${shard_id}&property.shard.method=DB_ID_RANGE&property.shard.range=${begin_range}-${end_range}&property.shard.instance=${shard_id}"
>   echo ""
>   echo "  -->  Range N°${shard_id} created with: ${begin_range}-${end_range}"
>   echo ""
>   sleep 2
>   begin_range=$((${end_range} + 1))
> done

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=newCore&storeRef=archive://SpacesStore&numShards=4&numNodes=4&nodeInstance=1&template=rerank&coreName=archive&shardIds=0&property.shard.method=DB_ID_RANGE&property.shard.range=0-49999999&property.shard.instance=0 HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 181
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">231</int></lst><str name="core">archive-0</str>
</response>
* Connection #0 to host localhost left intact

-->  Range N°0 created with: 0-49999999

[[email protected]_n1 ~]$

 

On the Solr node2, to create the same Shards (another Instance of each Shard) and therefore provide the expected setup, just re-execute the same commands but replacing solr_node_id=1 with solr_node_id=2. That’s all there is to do on Solr side, just creating the Shards is sufficient. On the Alfresco side, configure the Shards registration to use the Dynamic mode:

[[email protected]_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
# Solr Sharding
solr.useDynamicShardRegistration=true
search.solrShardRegistry.purgeOnInit=true
search.solrShardRegistry.shardInstanceTimeoutInSeconds=60
search.solrShardRegistry.maxAllowedReplicaTxCountDifference=500
...
[[email protected]_n1 ~]$

 

After a quick restart, all the Shard’s Instances will register themselves to Alfresco and you should see that each Shard has its two Shard’s Instances. Thanks to the constant Tracking, Alfresco knows which Shard’s Instances are healthy (up-to-date) and which ones aren’t (either lagging behind or completely silent). When performing searches, Alfresco will make a request to any of the healthy Shard’s Instances. Solr will be aware of the healthy Shard’s Instances as well and it will start the distribution of the search request to all the Shards for the parallel query. This is the communication between the Solr nodes that I mentioned earlier: it’s not really Clustering but rather query distribution between all the healthy Shard’s Instances.

 

 

Other posts of this series on Alfresco HA/Clustering:

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Morgan Patou
Morgan Patou

Senior Consultant & Technology Leader ECM