At the beginning of the year, during a migration from a Documentum 7.3 environment on VMs to Documentum 16.4 on Kubernetes, a customer had an issue where their custom facets weren’t showing up in D2 after a full reindex. Since xPlore had been upgraded as well (from xPlore 1.5 to 16.4, from VM to K8s), a full reindex was executed at the end of the migration so that all the documents would be indexed. In this case, several million documents were indexed and it took a few days. Unfortunately, at the end of the full reindex, the customer saw that the facets weren’t working…
Why is that exactly? Well, when configuring custom facets, you need to add a subpath configuration for the facet computing, and that is a schema change inside the index. Each and every schema change requires, at the very least, an online rebuild of the index so that the change is propagated to each and every node of the index. Unless you perform this online rebuild, the xPlore index schema will NOT be refreshed and the indexing of documents will therefore keep using the old schema. In case you are wondering what the “online rebuild” I’m talking about is, it’s the action behind the “Rebuild Index” button that you can find in the Dsearch Admin UI under “Home >> Data Management >> <DOMAIN_NAME> (usually Repo name) >> <COLLECTION_NAME> (e.g.: default or Node1_CPS1 or Node4_CPS2 …)”.
This action will not index any new content. Instead, it will create a new index based on the refreshed schema and then copy all the nodes from the current index to the new one. At the end, it will replace the current index with the new one, and this can be done online without downtime. This button was initially present for both Data collections (where your documents are) and ApplicationInfo collections (ACLs, Groups). However, in recent versions of xPlore (at least since 16.4), the feature has been removed for the ApplicationInfo collections.
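To make the schema change a bit more concrete, here is a minimal Python sketch that lists which subpaths of a configuration look facet-enabled. It rests on assumptions: that subpaths are declared as `sub-path` elements and that the ones usable for facets carry `returning-contents="true"`. The sample XML is a simplified, hypothetical fragment (not the full layout of a real indexserverconfig.xml), and `my_custom_attr` is an invented attribute name, so please check it against your own file.

```python
import xml.etree.ElementTree as ET
from typing import List

# Simplified, hypothetical fragment: a real indexserverconfig.xml is much
# larger and its exact layout may differ from this sample.
SAMPLE_CONFIG = """
<index-server-configuration>
  <category name="dftxml">
    <indexes>
      <index name="dmftdoc">
        <sub-path path="dmftmetadata//object_name" type="string"
                  full-text-search="true" value-comparison="true"
                  returning-contents="false"/>
        <sub-path path="dmftmetadata//my_custom_attr" type="string"
                  full-text-search="true" value-comparison="true"
                  returning-contents="true"/>
      </index>
    </indexes>
  </category>
</index-server-configuration>
"""


def facet_subpaths(config_xml: str) -> List[str]:
    """Return the 'path' of every sub-path flagged returning-contents="true"."""
    root = ET.fromstring(config_xml.strip())
    # iter() walks the whole tree, so the nesting level does not matter here.
    return [
        sub_path.get("path")
        for sub_path in root.iter("sub-path")
        if sub_path.get("returning-contents") == "true"
    ]


if __name__ == "__main__":
    for path in facet_subpaths(SAMPLE_CONFIG):
        print(path)
```

Running something like this against a copy of the configuration before a migration is a cheap way to confirm that the facet subpaths are really declared. It says nothing about whether the index itself already uses that schema, which is exactly what the online rebuild is for.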
So, what is the minimum required to configure custom facets? The answer is that it depends… :). Here are some examples:
- If xPlore has never been started, the index doesn’t exist yet and therefore configuring the facets inside the indexserverconfig.xml file will take effect immediately at the first startup. In this case, an online rebuild isn’t even needed. However, it might not always be easy to modify the indexserverconfig.xml file before xPlore even starts; it depends on how you are deploying the components…
- If xPlore has been started at least once but indexing hasn’t started yet (0 content inside the Data collections), then you can just log in to the Dsearch Admin UI and perform the online rebuild on the empty collections. This will be almost instantaneous, so you will most probably not even see it happen. After that:
  - If this is a new environment, make sure the IndexAgent is started in normal mode so that it processes incoming indexing requests, and that’s it.
  - If this is an existing environment, you will need to execute a full reindex using the method of your choice (IndexAgent full reindex action, select queries, or the ids.txt file).
- If xPlore has been started at least once and the indexing has been completed, then you will need to perform the online rebuild as well. However, this time, it will probably take quite some time because, as I mentioned earlier, it needs to copy all the indexed nodes to a new index. This process is normally faster than a full reindex because it only involves xPlore internal communications, because it only duplicates the existing index (with the schema change applied) and because there is no exchange with the Content Server. Once the online rebuild has been performed, the facets should be available.
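The cases above can be summarized as a small decision helper. This is only a sketch of the reasoning in this post: the function name, its parameters and the step labels are made up for illustration and don't correspond to any xPlore or Documentum API.

```python
from typing import List

# Hypothetical step labels, used only for this illustration.
EDIT_CONFIG = "configure facets in indexserverconfig.xml before the first startup"
ONLINE_REBUILD = "online rebuild (Rebuild Index) on the Data collections"
START_INDEXAGENT = "start the IndexAgent in normal mode"
FULL_REINDEX = "full reindex"


def facet_rollout_steps(xplore_started: bool,
                        indexed_docs: int,
                        existing_environment: bool) -> List[str]:
    """Return the ordered actions needed for custom facets to work."""
    if not xplore_started:
        # No index exists yet: the schema is picked up at the first startup.
        return [EDIT_CONFIG]

    # Once xPlore has started, a schema change always needs an online rebuild.
    steps = [ONLINE_REBUILD]

    if indexed_docs == 0:
        # Empty collections: the rebuild is almost instantaneous, then index.
        steps.append(FULL_REINDEX if existing_environment else START_INDEXAGENT)

    # If documents are already indexed, the rebuild copies them to the new
    # index, so no additional reindex is needed.
    return steps


if __name__ == "__main__":
    # Migration case: xPlore started, nothing indexed yet, existing repository.
    print(facet_rollout_steps(True, 0, True))
```

The point of the helper is the ordering: when the collections are still empty, the rebuild comes first and costs almost nothing, and the expensive full reindex only happens once.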
Even if an online rebuild is faster than a full reindex, depending on the size of the index, it might still take from hours to days to complete. It is therefore quite important to plan this properly in advance in case of migration or upgrade, so that you can start with an online rebuild on an empty index (which is done almost instantaneously) and then perform the needed full reindex afterwards, instead of the opposite. This might save you several days of pain with your users and considerably reduce the load on the Dsearch/CPS.
This behavior wasn’t really well documented before. I had some exchanges with OpenText on this topic and they created KB15765485 based on these exchanges and on what is described in this blog. I’m not sure if that is really better now, but at least there is a little bit more information.