Everybody familiar with Documentum knows that just starting the JBoss/WildFly hosting an IndexAgent isn’t really enough to have the indexing working: the IndexAgent must be started from the UI (or via DA or via the job or via iapi or automatically via the Repository startup or …). Starting the IA in “normal mode” is usually something that takes a few seconds. I faced a few times an IA that apparently didn’t want to start: whenever the request was submitted, it would just try but never succeed. In this blog, I will try to explain why it happens and what can be done to restore it.

When an IndexAgent start, it will do a few things like setup the filters/exclusions, it will check all the parameters configured and finally it will communicate with the Repository to do cleanup. The step that is most probably causing this “issue” is the last one. What happen is that whenever the IndexAgent is running, it will consumes documents for indexing. During this process, it will mark some of the items in the dmi_queue_item table as taken into account. However, if the xPlore Server is stopped during the processing of these items, it might not be fully completed and therefore, there are still tasks in progress that were cancelled. To avoid non-indexed documents, the very first task of the IndexAgent, even before it is marked as started in normal mode, is therefore to reinitialize the status of these items by putting them back into the queue to process. The IndexAgent will never be marked as running if this doesn’t complete and this is what happen whenever you are facing this issue about an IndexAgent just stuck in the start process.

To see the details of the start process of an IndexAgent, you can just look into its log file whenever you submit the request. This is an example of a “working” startup:

2020-11-13 14:29:29,765 INFO FtIndexAgent [http--0.0.0.0-9202-3]DM_INDEX_AGENT_START
2020-11-13 14:29:29,808 INFO Context [http--0.0.0.0-9202-3]Filter cabinets_to_exclude value: Temp, System, Resources,
2020-11-13 14:29:29,808 INFO Context [http--0.0.0.0-9202-3]Filter types_to_exclude value: dmi_expr_code, dmc_jar, dm_method, dm_activity, dmc_module, dmc_aspect_type, dm_registered, dm_validation_descriptor, dm_location, dmc_java_library, dm_public_key_certificate, dm_client_registration, dm_procedure, dmc_dar, dm_process, dmc_tcf_activity_template, dm_ftwatermark, dmc_wfsd_type_info, dm_menu_system, dm_plugin, dm_script, dmc_preset_package, dm_acs_config, dm_business_pro, dm_client_rights, dm_cont_transfer_config, dm_cryptographic_key, dm_docbase_config, dm_esign_template, dm_format_preferences, dm_ftengine_config, dm_ftfilter_config, dm_ftindex_agent_config, dm_jms_config, dm_job, dm_mount_point, dm_outputdevice, dm_server_config, dm_xml_application, dm_xml_config, dm_ftquery_subscription, dm_smart_list,
2020-11-13 14:29:29,808 INFO Context [http--0.0.0.0-9202-3]Filter folders_to_exclude value: /Temp/Jobs, /System/Sysadmin/Reports, /System/Sysadmin/Jobs,
2020-11-13 14:29:29,811 INFO AgentInfo [http--0.0.0.0-9202-3]Start
Documentum Index Agent 1.5.0170.0173
Java Version                    1.7.0_72
DFC Version                     7.2.0170.0165
DMCL Version                    7.2.0170.0165
Docbase (Repo01)                7.2.0160.0297  Linux64.Oracle

Start Configuration Information
 Instance
  indexagent_instance_name(AgentInstanceName)=xplore_server01_9200_IndexAgent
  docbase_name(DocbaseName)=Repo01
  docbase_user(DocbaseUser)=
  docbase_domain(DocbaseDomain)=
  runaway_item_timeout(RunawayItemTimeout)=600000
  runaway_thread_timeout(RunawayThreadTimeout)=600000
  parameter_list(InstanceOptionalParams)
 Status
  frequency(StatusFrequency)=5000
  history_size(StatusHistorySize)=20
 Connectors
  class_name(ClassName)=com.documentum.server.impl.fulltext.indexagent.connector.DocbaseNormalModeConnector
  parameter_list(Options)
   parameter=save_queue_items, value=false
   parameter=queue_user, value=dm_fulltext_index_user
   parameter=wait_time, value=60000
   parameter=batch_size, value=1000
  class_name(ClassName)=com.documentum.server.impl.fulltext.indexagent.connector.FileConnector
  parameter_list(Options)
   parameter=wait_time, value=2000
   parameter=batch_size, value=100
   parameter=file_name, value=ids.txt
 Exporter
  queue_size(PrepQSize)=250
  queue_low_percent(PrepQLowPercentage)=90
  wait_time(PrepWaitTime)=100
  thread_count(PrepWorkers)=2
  shutdown_timeout(PrepShutdownTimeout)=60000
  runaway_timeout(RunawayItemTimeout)=600000
  all_filestores_local(areAll_filestores_local)=false
  local_content_area(LocalContentArea)=/data/primary/Indexagent_Repo01/export
  local_filestore_map(LocalFileStoreMap)
  local_content_remote_mount(LocalContentRemoteMount)=null
  content_clean_interval(ContentCleanInterval)=2000000
  keep_dftxml(KeepDftxml)=false
  parameter_list(PrepOptionalParameters)=
   parameter=contentSizeLimit, value=367001600
 Indexer
  queue_size(IndexQSize)=500
  queue_low_percent(IndexQLowPercentage)=90
  queue_size(CallbackQSize)=200
  queue_low_percent(CallbackQLowPercentage)=90
  wait_time(IndexWaitTime)=100
  thread_count(IndexWorkers)=1
  shutdown_timeout(IndexShutdownTimeout)=60000
  runaway_timeout(IndexRunawayTimeout)60000
  partition_config
   default_partition collection_name(DefaultCollection)=null
  partitions(PartitionMap)
 Indexer Plugin Config
  class_name(IndexerClassName)=com.documentum.server.impl.fulltext.indexagent.plugins.enterprisesearch.DSearchFTPlugin
  parameter_list(IndexerParams)
   parameter=dsearch_qrserver_host, value=lb_xplore_server.domain.com
   parameter=query_plugin_mapping_file, value=/app/dctm/server/fulltext/dsearch/dm_AttributeMapping.xml
   parameter=max_tries, value=2
   parameter=max_pending_requests, value=10000
   parameter=load_balancer_enabled, value=true
   parameter=dsearch_qrserver_protocol, value=HTTPS
   parameter=dsearch_qrygen_mode, value=both
   parameter=security_mode, value=BROWSE
   parameter=max_requests_in_batch, value=10
   parameter=dsearch_qrserver_port, value=9302
   parameter=dsearch_config_port, value=9302
   parameter=dsearch_config_host, value=xplore_server01.domain.com
   parameter=max_batch_wait_msec, value=1000
   parameter=dsearch_qrserver_target, value=/dsearch/IndexServerServlet
   parameter=dsearch_domain, value=Repo01
   parameter=group_attributes_exclude_list, value=i_all_users_names
End Configuration Information

2020-11-13 14:29:29,828 INFO ObjectFilter [http--0.0.0.0-9202-3][DM_INDEX_AGENT_CUSTOM_FILTER_INFO] running DQL query: select primary_class from dmc_module where any a_interfaces = 'com.documentum.fc.indexagent.IDfCustomIndexFilter'
2020-11-13 14:29:29,833 INFO ObjectFilter [http--0.0.0.0-9202-3][DM_INDEX_AGENT_CUSTOM_FILTER_INFO] instantiated filter: com.documentum.services.message.impl.type.MailMessageChildFilter
2020-11-13 14:29:29,834 INFO ObjectFilter [http--0.0.0.0-9202-3][DM_INDEX_AGENT_CUSTOM_FILTER_INFO] instantiated filter: com.documentum.services.message.impl.type.MailMessageChildFilter
2020-11-13 14:29:29,834 INFO ObjectFilter [http--0.0.0.0-9202-3][DM_INDEX_AGENT_CUSTOM_FILTER_INFO] instantiated filter: com.documentum.server.impl.fulltext.indexagent.filter.defaultCabinetFilterAction
2020-11-13 14:29:29,834 INFO ObjectFilter [http--0.0.0.0-9202-3][DM_INDEX_AGENT_CUSTOM_FILTER_INFO] instantiated filter: com.documentum.server.impl.fulltext.indexagent.filter.defaultFolderFilterAction
2020-11-13 14:29:29,834 INFO ObjectFilter [http--0.0.0.0-9202-3][DM_INDEX_AGENT_CUSTOM_FILTER_INFO] instantiated filter: com.documentum.server.impl.fulltext.indexagent.filter.defaultTypeFilterAction
2020-11-13 14:29:29,869 INFO defaultFilters [http--0.0.0.0-9202-3]Populated cabinet cache for filter CabinetsToExclude with count 3
2020-11-13 14:29:30,462 INFO defaultFilters [http--0.0.0.0-9202-3]Populated folder id cache for filter FoldersToExclude with count 140
2020-11-13 14:29:30,488 INFO DocbaseNormalModeConnector [http--0.0.0.0-9202-3][DM_INDEX_AGENT_QUERY_BEGIN] update dmi_queue_item objects set task_state = ' ', set sign_off_user = ' ', set dequeued_by = ' ', set message = ' ' where name = 'dm_fulltext_index_user' and task_state = 'acquired' and sign_off_user = 'xplore_server01_9200_IndexAgent'
2020-11-13 14:29:30,488 INFO DocbaseNormalModeConnector [http--0.0.0.0-9202-3][DM_INDEX_AGENT_QUERY_UPDATE_COUNT] 0
2020-11-13 14:29:30,489 INFO ESSIndexer [http--0.0.0.0-9202-3][DM_INDEX_AGENT_PLUGIN] DSS Server host: xplore_server01.domain.com
2020-11-13 14:29:30,489 INFO ESSIndexer [http--0.0.0.0-9202-3][DM_INDEX_AGENT_PLUGIN] DSS Server protocol: HTTPS
2020-11-13 14:29:30,489 INFO ESSIndexer [http--0.0.0.0-9202-3][DM_INDEX_AGENT_PLUGIN] DSS Server port: 9302
2020-11-13 14:29:30,489 INFO ESSIndexer [http--0.0.0.0-9202-3][DM_INDEX_AGENT_PLUGIN] DSS Server domain: Repo01
2020-11-13 14:29:30,502 INFO ESSIndexer [http--0.0.0.0-9202-3][DM_INDEX_AGENT_PLUGIN] Index Server Status: normal

 

When this issue occurs, the lines 92 and above will not appear. As you can see, the DQL query executed is actually recorded in the log as well as the number of items updated. The “issue” is that if there are too many items that would match the WHERE clause (acquired items), this query could take hours to complete (if at all) and therefore, it would appear as if the start isn’t working. Because of how DQL works, this kind of query on thousands of objects or more will be very DB intensive and that would introduce a big performance hit.

How is it possible to end-up with hundreds of thousand or even millions of acquired items you may think? Well each time it happened to me, it was in relation to some huge batches or jobs running that would update millions of items or during big migrations/imports of objects. As you know, the events that have been registered in the dmi_registry table will trigger the creation of a new entry in the dmi_queue_item table. Therefore, whenever you are importing a lot of documents for example, it is highly recommended to carefully manage the index table because it can cause huge performance issues since it is used a lot inside Documentum for various purposes. This is especially true whenever Lifecycles are in the picture because then processes (like ApplyD2Config) will generate a lot of dm_save events per documents and therefore duplicates in the table. I won’t go into these details in this blog but in short, you can chose to remove the events from the dmi_registry during the import and put them back afterwards, manually indexing the imported documents at the end or do manual cleanups of the dmi_queue_item table during the process. Unfortunately, if you aren’t aware that a huge migration takes places for example, then the situation can quickly become complicated with millions and millions of items. Last time I saw something similar happening, it was an import started “in secret” before the weekend and filling the dmi_queue_item table. The IndexAgent was initially started and therefore it processed them but it wasn’t fast enough. On the Monday morning, we had the pleasant surprise to see around 6 million of acquired items and 9 more million of awaiting….

I think (to be confirmed) the behavior changed in more recent versions but this environment was using xPlore 1.5 and here, the IndexAgent might pull batches of documents for processing, even if there are still already a lot in process. The xPlore Servers (a Federation) weren’t sleeping at all since they actually processed millions of items already but there were just too many to handle and unfortunately, the IA kind of entered a dead end where updating the dmi_queue_item table would just be too long for the processing to be effective again. I didn’t try to restart the IndexAgent because I knew it would never complete but I thought this might make an interesting blog post. There is probably a KB on the OpenText site describing that since it is rather well known.

As you might expect, triggering a DQL query supposed to update 6 million rows on a table that contains at the very least three times that isn’t gonna happen. So what can be done then to restore the system performance and to allow the IndexAgent to restart properly? DQL isn’t very good for processing of huge batches and therefore, your best bet would be to go to the Database directly to avoid the Documentum overhead. Instead of executing one single SQL command to update the 6 million of items, you should also split it in smaller batches by adding a WHERE clause on the date for example. That would help tremendously and that’s not something that the IndexAgent can do by itself because it has no idea of when things started to go south… So then, which kind of command should be executed? In this case, I wouldn’t recommend to do what the IndexAgent is doing. If you are simply resetting the status from acquired to awaiting, sure the IndexAgent will be able to start but it will still have 6+9 million items awaiting for processing and therefore, you still have bad performance and you have a pretty high probability that the number of acquired will rise again… Therefore, the only reasonable choice is to export all distinct items from the dmi_queue_item table and then clean/remove all FT items. With some luck, you might have 5 or 10 duplicates for each document so instead of indexing 15 million, it would just be 1 or 2 million (distinct).

An example of SQL command to cleanup all the items on a 1 hour timeframe would be for Oracle (I would suggest to make sure the IA isn’t running when messing with the table):

DELETE dmi_queue_item_s
WHERE name='dm_fulltext_index_user'
  AND delete_flag=0
  AND date_sent>=to_date('2020-06-28 22:00:00','YYYY-MM-DD HH24:MI:SS')
  AND date_sent<to_date('2020-06-28 23:00:00','YYYY-MM-DD HH24:MI:SS');
commit;

 

This cleanup can be done online without issue, just make sure you take an export of all distinct item_id to re-index afterwards, otherwise you will have to execute the FT Integrity utility to find the missing documents in the index. With parallel execution on several DB sessions, the cleanup can actually be done rather quickly and then it’s just background processing for the index via the ids.txt for example.