You have finally completed that migration and need to restart all the Documentum processes. So, you shut down the docbroker and move on to the repositories, but then you receive an error message saying that they are not reachable any more. Or conversely, you want to start all the Documentum processes and you start the repositories first and the docbrokers later. Next, you want to connect to one repository and you receive the same error message. Of course, you finally remember: since the docbroker is a prerequisite for the repositories, it must be started first and shut down last, but it is too late now. How to get out of this annoying situation? You could just (re)start the docbroker and wait for the repositories’ next checkpoint, at most 5 minutes by default. If this is not acceptable, at first sight there is no other solution than to “kill -9” the repositories’ processes, start the docbroker and only then the repositories. Let’s see if we can find a better way. Spoiler alert: to end this insufferable suspense, I must say up front that there is no other way, sorry, but there are a few ways to alleviate this inconvenience.
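If waiting for the next checkpoint is acceptable, the wait itself can be scripted instead of being spent staring at the clock. The following is only a minimal sketch: wait_until and repo_is_projected are hypothetical helper names, and the host, port and repository used in the example call are placeholders. It assumes that dmqdocbroker's getdocbasemap output mentions the repository's name once the content server has projected to the docbroker again.

```bash
# wait_until TIMEOUT INTERVAL CMD [ARGS...]
# Re-runs CMD every INTERVAL seconds until it succeeds or TIMEOUT seconds have elapsed.
# Returns 0 as soon as CMD succeeds, 1 on timeout.
wait_until() {
   local timeout=$1 interval=$2
   shift 2
   local deadline=$(( SECONDS + timeout ))
   until "$@"; do
      (( SECONDS >= deadline )) && return 1
      sleep "$interval"
   done
   return 0
}

# Hypothetical probe: succeeds once the docbroker lists the repository in its docbase map.
repo_is_projected() {
   local host=$1 port=$2 docbase=$3
   dmqdocbroker -t "$host" -p "$port" -c getdocbasemap 2> /dev/null | grep -qw "$docbase"
}

# Example (placeholders): wait up to 6 minutes, polling every 15 seconds.
# wait_until 360 15 repo_is_projected "$(hostname)" 1489 dmtest01 && echo "dmtest01 is reachable again"
```

The 6-minute limit in the example is simply the default 5-minute checkpoint interval plus some margin; adjust it if the interval was changed in the server.ini.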

A quick clarification

Let’s first clarify a point of terminology: there is a difference between docbases/repositories and content servers. A docbase encompasses the actual content along with its persistent data and technical information, whereas a content server is the set of running processes that give access to and manage one docbase. It is very similar to Oracle’s databases and instances, where one database can be served by several instances, providing parallelism and high availability. A docbase can be served by more than one content server, generally spread over different machines, each with its own set of dm_start_docbase and dm_shutdown_docbase scripts and its own server.ini. A docbase knows how many content servers use it because each of them has its own dm_server_config object. If there is just one content server, the terms docbase and content server can be used interchangeably, but when there are several content servers for the same docbase, “stopping the docbase” really means “stopping one particular content server”, and this is the meaning that will be used in the rest of the article. If the docbase has more than one content server, just extend the presented manipulations to each of them.
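To see how many content servers serve a given docbase, one can simply list its dm_server_config objects. Here is a small sketch, to be used while the docbase is still reachable of course; list_content_servers is a hypothetical helper name, and the account and password are to be supplied by the caller. It assumes that the r_host_name attribute holds the host each content server runs on.

```bash
# list_content_servers DOCBASE USER PASSWORD
# Prints the name and host of each dm_server_config object of the repository,
# i.e. one row per content server serving it.
list_content_servers() {
   local docbase=$1 user=$2 password=$3
   idql "$docbase" -U"$user" -P"$password" <<'EoQ'
select object_name, r_host_name from dm_server_config order by object_name
go
quit
EoQ
}

# Example (placeholders):
# list_content_servers dmtest01 dmadmin xx
```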

Connecting to the repositories without a docbroker

If one could connect to a repository without a running docbroker, the situation that triggered this article, things would be much easier. In the ancient, simpler times, the dmcl.ini parameters below could help working around an unavailable docbroker:

[DOCBROKER_DEBUG]
docbase_id = <id of docbase as specified in its server.ini file>
host =  <host's name the docbase server is running on>
port = <docbase's port as specified in /etc/services>
service = <docbase's service name as specified in /etc/services>

and they used to work.
After the switch to the dfc.properties file, those parameters were renamed as follows:

dfc.docbroker.debug.docbase_id=<id of docbase as specified in its server.ini file>
dfc.docbroker.debug.host=<host's name the docbase server is running on>
dfc.docbroker.debug.port=<docbase's port as specified in /etc/services>
dfc.docbroker.debug.service=<docbase's service name as specified in /etc/services>

Unfortunately, they don’t work any more. Actually, although they are still documented in dfcfull.properties, they have not been implemented and, according to OTX, never will be. Moreover, they will be removed in the future. Too bad, that would have been such a cheap way to extricate oneself from an uncomfortable situation.
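Since that shortcut is gone, a client depends entirely on the docbrokers declared in its dfc.properties file, so it helps to know which ones those are before pinging them. The sketch below extracts them; list_docbrokers is a hypothetical helper name, the location of dfc.properties varies between installations and is therefore passed as a parameter, and falling back to port 1489 when no port entry is present is an assumption.

```bash
# list_docbrokers FILE
# Prints "index host port" for each dfc.docbroker.host[N] entry found in FILE,
# falling back to port 1489 (assumed default) when no dfc.docbroker.port[N] is set.
list_docbrokers() {
   local file=$1
   awk -F= '
      /^[ \t]*dfc\.docbroker\.(host|port)\[[0-9]+\][ \t]*=/ {
         key = $1; val = $2
         gsub(/[ \t\r]/, "", key)
         gsub(/^[ \t]+|[ \t\r]+$/, "", val)
         idx = key; sub(/^.*\[/, "", idx); sub(/\].*$/, "", idx)
         if (key ~ /host/) host[idx] = val; else port[idx] = val
      }
      END {
         for (i in host) print i, host[i], (i in port ? port[i] : 1489)
      }' "$file" | sort -n
}

# Example (the path is a placeholder):
# list_docbrokers /path/to/dfc.properties
```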

Preventing the situation

The best solution is obviously to prevent it from happening. This can easily be achieved by using a central script for stopping and starting the Documentum stack and, while we are at it, inquiring its status.
Documentum already provides such a script, e.g. see the article Linux scripts for automatic startup and shutdown of Documentum Content Server. Here is another, more sophisticated implementation:

#!/bin/bash
#
# See Usage() function below for explanations; 
# cec - dbi-services - April 2019
#

general_status=0

Usage() {
   cat <<EoU
Usage:
    start-stop.sh [(help) | start | stop | status] [(all) | docbases | docbrokers | docbase=<name>{,<name>} | docbroker=<name>{,<name>} | method_server]
 E.g.:
    display this help screen:
       start-stop.sh
    start all:
       start-stop.sh start [all]
    stop all:
       start-stop.sh stop [all]
    status all:
       start-stop.sh status [all]
    start docbroker01:
       start-stop.sh start docbroker=docbroker01
    start docbases global_registry and dmtest01:
       start-stop.sh start docbase=global_registry,dmtest01
    start all the docbases:
       start-stop.sh start docbases
    start all the docbrokers:
       start-stop.sh start docbrokers
EoU
}

start_docbroker() {
   docbroker=$1
   echo "starting up docbroker $docbroker ..."
   ./dm_launch_${docbroker}
}

start_all_docbrokers() {
   echo "starting the docbrokers ..."
   DOCBROKERS=`ls -1 dm_launch_* 2>/dev/null | cut -f3 -d_`
   nb_items=0
   for docbroker in $DOCBROKERS; do
      start_docbroker $docbroker
      (( nb_items++ ))
   done
   echo "$nb_items docbrokers started"

}

start_docbase() {
   docbase=$1
   echo "starting $docbase"
   ./dm_start_${docbase}
}

start_all_docbases() {
   echo "starting the repositories ..."
   DOCBASES=`ls -1 config 2>/dev/null `
   nb_items=0
   for docbase in $DOCBASES; do
      start_docbase $docbase
      (( nb_items++ ))
   done
   echo "$nb_items repositories started"
}

start_method_server() {
   echo "starting the method server ..."
   cd ${DOCUMENTUM}/${JBOSS}/server
   # redirect stdout first, then stderr to the same file, so that both end up in the log;
   nohup ${DOCUMENTUM}/${JBOSS}/server/startMethodServer.sh > /tmp/nohup.log 2>&1 &
   echo "method server started"
}

start_all() {
   echo "starting all the documentum processes ..."
   start_all_docbrokers
   start_all_docbases
   start_method_server
}

status_docbroker() {
   docbroker_name=$1
   # the launch scripts live in ${DOCUMENTUM}/dba, like everywhere else in this script;
   docbroker_host=$(grep "^host=" ${DOCUMENTUM}/dba/dm_launch_${docbroker_name} | cut -d= -f2)
   docbroker_port=$(grep "dmdocbroker -port " ${DOCUMENTUM}/dba/dm_launch_${docbroker_name} | cut -d' ' -f3)
   dmqdocbroker -t $docbroker_host -p $docbroker_port -c ping 2> /dev/null 1> /dev/null
   local_status=$?
   if [ $local_status -eq 0 ]; then
      echo "$(date +"%Y/%m/%d %H:%M:%S"): successfully pinged docbroker $docbroker_name listening on port $docbroker_port on host $docbroker_host"
   else
      echo "$(date +"%Y/%m/%d %H:%M:%S"): docbroker $docbroker_name listening on port $docbroker_port on host $docbroker_host is unhealthy"
      general_status=1
   fi
   echo "status for docbroker $docbroker_name:$docbroker_port: $local_status, i.e. $(if [[ $local_status -eq 0 ]]; then echo OK; else echo NOK;fi)"
}

status_all_docbrokers() {
   # one dm_launch_<docbroker> script per docbroker;
   for f in dm_launch_*; do
      [[ -e $f ]] || continue
      docbroker_name=`echo $f | cut -f3 -d_`
      docbroker_port=`grep "./dmdocbroker" $f | cut -f3 -d' '`
      status_docbroker $docbroker_name $docbroker_port
   done
   echo "general status for all docbrokers: $general_status, i.e. $(if [[ $general_status -eq 0 ]]; then echo OK; else echo NOK;fi)"
}

status_docbase() {
   docbase=$1
   timeout --preserve-status 30s idql $docbase -Udmadmin -Pxx 2> /dev/null 1> /dev/null <<eoq
     quit
eoq
   local_status=$?
   if [[ $local_status -eq 0 ]]; then
      echo "$(date +"%Y/%m/%d %H:%M:%S"): successful connection to repository $docbase"
   else
      echo "$(date +"%Y/%m/%d %H:%M:%S"): repository $docbase is unhealthy"
      general_status=1
   fi
   echo "status for docbase $docbase: $local_status, i.e. $(if [[ $local_status -eq 0 ]]; then echo OK; else echo NOK;fi)"
}

status_all_docbases() {
   DOCBASES=`ls -1 config 2>/dev/null `
   for docbase in $DOCBASES; do
      status_docbase $docbase
   done
   echo "general status for all docbases: $general_status, i.e. $(if [[ $general_status -eq 0 ]]; then echo OK; else echo NOK;fi)"
}

status_method_server() {
   # check the method server;
   curl --silent --fail -k http://${HOSTNAME}:9080/DmMethods/servlet/DoMethod > /dev/null 2>&1
   local_status=$?
   if [ $local_status -eq 0 ]; then
      echo "$(date +"%Y/%m/%d %H:%M:%S"): method server successfully contacted"
   else
      echo "$(date +"%Y/%m/%d %H:%M:%S"): method server is unhealthy"
      general_status=1
   fi
   echo "status for method_server: $local_status, i.e. $(if [[ $local_status -eq 0 ]]; then echo OK; else echo NOK;fi)"
}

status_all() {
   status_all_docbrokers
   status_all_docbases
   status_method_server
   echo "General status: $general_status, i.e. $(if [[ $general_status -eq 0 ]]; then echo OK; else echo NOK;fi)"
}

stop_docbase() {
   docbase=$1
   echo "stopping $docbase"
   ./dm_shutdown_${docbase}
   echo "docbase $docbase stopped"
}

stop_all_docbases() {
   echo "stopping the repositories ..."
   DOCBASES=`ls -1 config 2>/dev/null `
   nb_items=0
   for docbase in $DOCBASES; do
      stop_docbase $docbase
      (( nb_items++ ))
   done
   echo "$nb_items repositories stopped"
}

stop_docbroker() {
   docbroker=$1
   echo "stopping docbroker $docbroker ..."
   ./dm_stop_${docbroker}
   echo "docbroker $docbroker stopped"
}

stop_all_docbrokers() {
   echo "stopping the docbrokers ..."
   DOCBROKERS=`ls -1 dm_stop_* 2>/dev/null | cut -f3 -d_`
   nb_items=0
   for docbroker in $DOCBROKERS; do
      stop_docbroker $docbroker
      (( nb_items++ ))
   done
   echo "$nb_items docbrokers stopped"
}

stop_method_server() {
   echo "stopping the method server ..."
   ${DOCUMENTUM}/${JBOSS}/server/stopMethodServer.sh
   echo "method server stopped"
}

stop_all() {
   echo "stopping all the documentum processes ..."
   stop_all_docbases
   stop_method_server
   stop_all_docbrokers
   echo "all documentum processes stopped"
   ps -ajxf | egrep '(PPID|doc|java)' | grep -v grep | sort -n -k2,2
}

# -----------
# main;
# -----------
   [[ -f ${DM_HOME}/bin/dm_set_server_env.sh ]] && . ${DM_HOME}/bin/dm_set_server_env.sh
   cd ${DOCUMENTUM}/dba
   if [[ $# -eq 0 ]]; then
      Usage
      exit 0
   else
      while [[ $# -ge 1 ]]; do
         case $1 in
            help)
               Usage
               exit 0
            ;;
            start|stop|status)
               cmd=$1
               shift
               if [[ -z $1 || $1 = "all" ]]; then
                  ${cmd}_all
               elif [[ $1 = "docbases" ]]; then
                  ${cmd}_all_docbases
               elif [[ $1 = "docbrokers" ]]; then
                  ${cmd}_all_docbrokers
               elif [[ ${1%%=*} = "docbase" ]]; then
                  docbases=`echo ${1##*=} | gawk '{gsub(/,/, " "); print}'`
                  for docbase in $docbases; do
                     ${cmd}_docbase $docbase
                  done
               elif [[ ${1%%=*} = "docbroker" ]]; then
                  docbrokers=`echo ${1##*=} | gawk '{gsub(/,/, " "); print}'`
                  for docbroker in $docbrokers; do
                     ${cmd}_docbroker $docbroker
                  done
               elif [[ $1 = "method_server" ]]; then
                  ${cmd}_method_server
               fi
               exit $general_status
            ;;
            *)
               echo "syntax error"
               Usage
               exit 1
            ;;
         esac
         shift
      done
   fi

See the Usage() function at the top of the script for its usage.
Note the timeout command in status_docbase() when attempting to connect to a docbase to check its status; see the article Adding a timeout in monitoring probes for an explanation.
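To illustrate what timeout buys us in such a probe, here is a small stand-alone sketch that needs no Documentum at all; probe is a hypothetical helper name. Without --preserve-status, GNU timeout exits with status 124 when the command had to be interrupted, which lets a hang be told apart from an ordinary failure. With --preserve-status, as used in status_docbase(), it returns the exit status of the command itself instead.

```bash
# probe LIMIT CMD [ARGS...]
# Runs CMD under timeout and reports whether it succeeded, failed or hung.
# Returns the exit status reported by timeout.
probe() {
   local limit=$1
   shift
   local rc=0
   timeout "$limit" "$@" > /dev/null 2>&1 || rc=$?
   case $rc in
      0)   echo "OK" ;;
      124) echo "TIMEOUT" ;;
      *)   echo "NOK ($rc)" ;;
   esac
   return $rc
}

# Examples:
# probe 5 true        # quick success
# probe 1 sleep 3     # hangs longer than the limit and gets interrupted
# probe 5 false       # quick failure
```

In a monitoring context, both a time-out and a failure mean that the component is unhealthy, which is why the script only tests the status against 0.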
We couldn’t help but add the option to address each component individually, or a few of them, in addition to all of them at once. So, the script lets us stop, start and inquire the status of one particular docbroker, docbase or method server, a list of docbrokers or docbases, or everything at once.
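The script splits the comma-separated lists with gawk, but if gawk is not installed, bash can do it on its own. The following is just an alternative sketch, not what the script does; split_list is a hypothetical helper name.

```bash
# split_list "a,b,c"
# Prints the items of a comma-separated list, one per line, skipping empty ones.
split_list() {
   local item items
   IFS=, read -ra items <<< "$1"
   for item in "${items[@]}"; do
      [[ -n $item ]] && echo "$item"
   done
   return 0
}

# Example: extract the names from an argument such as docbase=global_registry,dmtest01
# arg="docbase=global_registry,dmtest01"
# for docbase in $(split_list "${arg#*=}"); do echo "would process $docbase"; done
```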
After a maintenance task, to stop all the Documentum processes, the command below could be used:

$ start-stop.sh stop all

Similarly, to start everything:

$ start-stop.sh start all

Thus, the proper order is guaranteed and human error is prevented. By standardizing on such a script and using it as shown, the aforementioned problem won’t occur anymore.
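The order is only guaranteed when everything is started or stopped at once, though; nothing prevents someone from starting a single docbase while no docbroker is up. A possible safeguard is sketched below. The names docbroker_alive and guarded_start are hypothetical, and it relies, like status_docbroker(), on the exit status of dmqdocbroker's ping command.

```bash
# docbroker_alive HOST PORT
# Succeeds if the docbroker listening on HOST:PORT answers a ping.
docbroker_alive() {
   dmqdocbroker -t "$1" -p "$2" -c ping > /dev/null 2>&1
}

# guarded_start HOST PORT CMD [ARGS...]
# Runs CMD only if the docbroker at HOST:PORT is reachable, otherwise refuses.
guarded_start() {
   local host=$1 port=$2
   shift 2
   if ! docbroker_alive "$host" "$port"; then
      echo "docbroker $host:$port is not reachable, refusing to run: $*" >&2
      return 1
   fi
   "$@"
}

# Example (placeholders): start a docbase only if its docbroker is up.
# guarded_start "$(hostname)" 1489 ./dm_start_dmtest01
```

The same idea could be applied the other way around, i.e. refusing to stop a docbroker as long as a content server still answers, at the cost of one extra connection attempt per docbase.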

That is fine, but if we didn’t use the script and find ourselves in the situation where no docbroker is running and we must shut down the repositories, is there a way to do it easily and cleanly? Well, easily, certainly, but cleanly, no. Please continue reading in Part II.