
Cesare Cervini

Tracking Logs Inside a Documentum Container (part I)

Containers running under Docker can have their stdout observed from the outside through the “docker logs” command; here is an excerpt of its usage:

docker logs --help
Usage:	docker logs [OPTIONS] CONTAINER

Fetch the logs of a container

      --details        Show extra details provided to logs
  -f, --follow         Follow log output
      --since string   Show logs since timestamp (e.g. 2013-01-02T13:23:37) or relative (e.g. 42m
                       for 42 minutes)
      --tail string    Number of lines to show from the end of the logs (default "all")
  -t, --timestamps     Show timestamps
      --until string   Show logs before a timestamp (e.g. 2013-01-02T13:23:37) or relative (e.g.
                       42m for 42 minutes)


Example of output:

docker logs --follow --timestamps container05bis
2019-07-10T03:50:38.624862914Z ==> /app/dctm/dba/log/docbroker.container05bis.1489.log <==
2019-07-10T03:50:38.624888183Z OpenText Documentum Connection Broker (version 16.4.0000.0248 Linux64)
2019-07-10T03:50:38.624893936Z Copyright (c) 2018. OpenText Corporation
2019-07-10T03:50:38.624898034Z 2019-07-10T05:50:38.519721 [DM_DOCBROKER_I_START]info: "Docbroker has started. Process id: 35"
2019-07-10T03:50:38.624902047Z 2019-07-10T05:50:38.521502 [DM_DOCBROKER_I_REGISTERED_PORT]info: "The Docbroker registered using port (1489)."
2019-07-10T03:50:38.624906087Z 2019-07-10T05:50:38.521544 [DM_DOCBROKER_I_LISTENING]info: "The Docbroker is listening on network address: (INET_ADDR: family: 2, port: 1489, host: container05bis (, 0721a8c0))"
2019-07-10T03:50:38.624915369Z ==> /app/dctm/dba/log/dmtest05bis.log <==
2019-07-10T03:50:38.625050474Z ==> /app/dctm/dba/log/dmtest05bis/agentexec/agentexec.log <==
2019-07-10T03:50:38.625055299Z Wed Jul 10 03:33:48 2019 [INFORMATION] [LAUNCHER 4251] Detected during program initialization: Version: 16.4.0000.0248 Linux64
2019-07-10T03:50:38.625059065Z Wed Jul 10 03:34:18 2019 [INFORMATION] [LAUNCHER 4442] Detected during program initialization: Version: 16.4.0000.0248 Linux64
2019-07-10T03:50:38.625071866Z Wed Jul 10 03:34:48 2019 [INFORMATION] [LAUNCHER 4504] Detected during program initialization: Version: 16.4.0000.0248 Linux64
2019-07-10T03:50:38.625075268Z Wed Jul 10 03:36:18 2019 [INFORMATION] [LAUNCHER 4891] Detected during program initialization: Version: 16.4.0000.0248 Linux64
2019-07-10T03:50:38.625078688Z Wed Jul 10 03:36:49 2019 [INFORMATION] [LAUNCHER 4971] Detected during program initialization: Version: 16.4.0000.0248 Linux64
2019-07-10T03:50:38.625082182Z Wed Jul 10 03:48:18 2019 [INFORMATION] [LAUNCHER 6916] Detected during program initialization: Version: 16.4.0000.0248 Linux64
2019-07-10T03:50:38.625101275Z ==> /app/dctm/wildfly9.0.1/server/DctmServer_MethodServer/logs/ServerApps.log <==
2019-07-10T03:50:38.625105098Z at io.undertow.servlet.core.DeploymentManagerImpl.start(
2019-07-10T03:50:38.625108575Z at org.wildfly.extension.undertow.deployment.UndertowDeploymentService.startContext(
2019-07-10T03:50:38.625112342Z at org.wildfly.extension.undertow.deployment.UndertowDeploymentService$
2019-07-10T03:50:38.625116110Z at java.util.concurrent.Executors$
2019-07-10T03:50:38.625120084Z at
2019-07-10T03:50:38.625123672Z at java.util.concurrent.ThreadPoolExecutor.runWorker(
2019-07-10T03:50:38.625127341Z at java.util.concurrent.ThreadPoolExecutor$
2019-07-10T03:50:38.625131122Z at
2019-07-10T03:50:38.625134828Z at
2019-07-10T03:50:38.625139133Z 05:34:58,050 INFO [ServerService Thread Pool -- 96] com.documentum.cs.otds.DfSessionHandler - DFC Client Successfully initialized
2019-07-10T03:50:38.625146939Z ==> /app/dctm/wildfly9.0.1/server/DctmServer_MethodServer/logs/ServerApps_trace.log <==
2019-07-10T03:50:38.625154528Z ==> /app/dctm/wildfly9.0.1/server/DctmServer_MethodServer/logs/dmotdsrest.log <==
2019-07-10T03:50:38.625159563Z 03:33:16,505 [ServerService Thread Pool -- 66] DFC Client Successfully initialized
2019-07-10T03:50:38.625163445Z 05:34:58,050 [ServerService Thread Pool -- 96] DFC Client Successfully initialized
2019-07-10T03:50:38.625955045Z Setting up watches. Beware: since -r was given, this may take a while!
2019-07-10T03:50:38.627044196Z Watches established.
2019-07-10T03:50:38.934668467Z ==> /app/dctm/dba/log/dmtest05bis.log <==
2019-07-10T03:50:38.934673076Z Wed Jul 10 05:50:38 2019[DM_STARTUP_W_DOCBASE_OWNER_NOT_FOUND] *** warning *** : The database user, dmtest05bisc as specified by your server.ini is not a valid user as determined using the system password check api. This will likely severly impair the operation of your docbase.
2019-07-10T03:50:39.001821414Z OpenText Documentum Content Server (version 16.4.0000.0248 Linux64.Oracle)
2019-07-10T03:50:39.001825146Z Copyright (c) 2018. OpenText Corporation
2019-07-10T03:50:39.001828983Z All rights reserved.
2019-07-10T03:50:39.005318448Z 2019-07-10T05:50:39.005068 193[193] 0000000000000000 [DM_SERVER_I_START_SERVER]info: "Docbase dmtest05bis attempting to open"
2019-07-10T03:50:39.005342472Z 2019-07-10T05:50:39.005172 193[193] 0000000000000000 [DM_SERVER_I_START_KEY_STORAGE_MODE]info: "Docbase dmtest05bis is using database for cryptographic key storage"

If the output is too abundant, the above command can be narrowed to a time window by adding --since and --until restrictions, e.g.:

docker logs --timestamps --follow --since 5m <container>

Thus, everything sent to the container’s stdout can be aggregated and viewed very simply from one place, yielding a cheap console log. In particular, Documentum containers could expose their well-known log files to the outside world, e.g. the docbroker log, the content server logs and the method server logs. To this effect, it would be enough to just start a “tail -F” on those files from within the entrypoint as illustrated below:

tail -F ${DOCUMENTUM}/dba/log/docbroker.log ${DOCUMENTUM}/dba/log/dmtest.log ${DOCUMENTUM}/wildfly9.0.1/server/DctmServer_MethodServer/logs/ServerApps.log ...

The -F option guarantees that the logs continue being followed even after a possible rotation.
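This behaviour can be checked outside of any container. The following is only a minimal sketch, assuming GNU tail; the file names are throw-away ones created in a temporary directory and the rotation is simulated with a plain mv:

```shell
#!/bin/bash
# Sketch: show that "tail -F" keeps following a log by name across a rotation.
# All the files below are throw-away ones in a temporary directory.
workdir=$(mktemp -d)
log="$workdir/demo.log"
out="$workdir/followed.txt"

echo "line 1, before rotation" > "$log"

# -F is shorthand for --follow=name --retry; -n +1 prints the file from its first line;
tail -n +1 -F "$log" > "$out" 2>/dev/null &
tail_pid=$!
sleep 1

# simulate a rotation: rename the current log and start a fresh one under the same name;
mv "$log" "$log.1"
echo "line 2, after rotation" > "$log"
# give tail some time to notice the new file;
sleep 3

kill "$tail_pid"
wait "$tail_pid" 2>/dev/null

# both lines are expected in the captured output;
followed=$(cat "$out")
echo "$followed"
rm -rf "$workdir"
```

With a plain “tail -f”, which follows the file descriptor rather than the name, the line written after the rotation would not be expected to show up.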
Admittedly, the aggregated output can be a bit hard to read because the logs are interlaced, i.e. lines or blocks of lines from different logs are displayed sorted by the time they were produced and not by their origin. Actually, this is a benefit because it makes it easier to find correlations between distinct, apparently unrelated events.
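If the interlacing gets in the way, each line can be tagged with the log it comes from by relying on the “==> file <==” headers that tail prints whenever the source changes. The filter below is only a sketch: the function name tag_by_origin is made up for the illustration, it expects the raw tail output (i.e. without the timestamps added by “docker logs --timestamps”), and it drops empty lines, including genuine ones from the logs:

```shell
#!/bin/bash
# Sketch: prefix each line of a multi-file "tail -F" stream with the base name of its source log.
# Plain awk suffices here; gawk works too.
tag_by_origin() {
   awk '
      # a header such as "==> /path/to/file.log <==" announces a change of source file;
      /^==> .* <==$/ {
         origin = $0
         sub(/^==> /, "", origin)
         sub(/ <==$/, "", origin)
         # keep the base name only;
         sub(/^.*\//, "", origin)
         next
      }
      # tail separates the blocks with an empty line, drop those;
      /^$/ { next }
      { print "[" origin "] " $0 }
   '
}

# usage with a simulated tail output;
printf '%s\n' \
   '==> /app/dctm/dba/log/docbroker.log <==' \
   'Docbroker has started' \
   '' \
   '==> /app/dctm/dba/log/dmtest.log <==' \
   'Docbase dmtest attempting to open' | tag_by_origin
# prints:
# [docbroker.log] Docbroker has started
# [dmtest.log] Docbase dmtest attempting to open
```

In an entrypoint, it would be appended to the tail command, e.g. tail -F file1 file2 | tag_by_origin.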
Viewing a particular log is still possible from outside the container, e.g.:

docker exec <container_name> /bin/bash -c "tail -f \${DOCUMENTUM}/dba/log/dmtest05bis.log"

provided the tail command exists in the container, which is not a given as there is a definite tendency to make images as stripped-down as possible.
As those files are statically known (i.e. at build time), such a command could be defined as early as in the buildfile and invoked in its entrypoint script.
Unfortunately, the content server logs are not very verbose and the most useful messages are directed to session or ad hoc logs. The session logs are dynamically created for each new session with the session id as their name, which makes them unpredictable. Since those names are only known at run-time, the above buildfile’s “tail -F” command cannot include them and consequently they are not displayed by the “docker logs” command. The same applies to on-demand trace files with variable names, e.g. with a timestamp suffix.
So, is there a way to follow those dynamic session logs (or any dynamically named files for that matter) anyway? An obvious way is to use a file or directory watcher to be notified of any created or modified file. If a watcher process running inside the container could wait for such conditions and signal any occurrence thereof, a listener process, also running inside the container, could receive the notification and dynamically fork a tail -F command to follow that file. Externally, “docker logs” would continue displaying whatever is sent to the container’s stdout, including the newly discovered files.
Under Linux, we can use inotifywait as the file watcher. Let’s see how to set it up.

Installing inotifywait

Under a Debian Linux derivative such as Ubuntu, inotifywait can be easily installed from the inotify-tools package through the usual command:

sudo apt-get install inotify-tools

Under a Red Hat Linux derivative such as CentOS, a two-step method is to first grab the rpm package from its on-line repository and then install it; the latest release as of this writing is 3.14-9:

  curl -o inotify-tools-3.14-9.el7.x86_64.rpm
  sudo rpm -Uvh inotify-tools-3.14-9.el7.x86_64.rpm 

In order to verify that the installation was successful, just try to launch “inotifywait”:

$ inotifywait
No files specified to watch!

Good, the command exists; let’s now test it.
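Since stripped-down images may lack more than one of the tools used in this article, the same check can be generalized. The function below is only a sketch (missing_tools is a made-up name) that could be called at the beginning of an entrypoint; the list of commands reflects what the script presented later relies on:

```shell
#!/bin/bash
# Sketch: report which of the given commands are missing from the current PATH.
missing_tools() {
   local tool missing=""
   for tool in "$@"; do
      command -v "$tool" > /dev/null 2>&1 || missing="$missing $tool"
   done
   # print the missing commands, space-separated, without the leading space;
   echo "${missing# }"
}

missing=$(missing_tools inotifywait gawk tail pgrep)
if [ -n "$missing" ]; then
   echo "missing commands: $missing" >&2
else
   echo "all required commands are available"
fi
```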

Testing inotifywait

The command inotifywait has the following invocation syntax:

inotifywait [-hcmrq] [-e <event> ] [-t <seconds> ] [--format <fmt> ] [--timefmt <fmt> ] <file> [ ... ]

The man page explains each of the parameters very well, so please refer to it for details.
As the whole point is to detect new or changed files whose name we don’t know at image build time, but whose location is known (e.g. a path such as ${DOCUMENTUM}/dba/log), we will be watching directories. In such a case, one parameter will be a list of directories to recursively watch for any new or modified files.
That command was designed to output to stdout any event whose occurrence it was configured to wait for.
The default output format is the following:

watched_filename EVENT_NAMES event_filename

one line per event.
EVENT_NAMES is the event that occurred, in our case mainly one of CREATE or MODIFY as requested through the -e command-line parameter.
As we watch directories, watched_filename is the name of the watched directory where the event occurred and event_filename is the created/modified file.
Here is an example of use as a test with custom formatting for a better presentation:

$ inotifywait --quiet --monitor --event create,modify,attrib --recursive --timefmt "%Y/%m/%d-%H:%M:%S" --format "%T %e %w%f" /tmp &

The command will recursively (option --recursive) watch the /tmp directory for any created or modified file (option --event ...) or any change of file attributes such as the timestamp or permissions. It will keep listening indefinitely instead of exiting after the first event (option --monitor), run in the background (trailing &), and issue events to stdout prefixed with a properly formatted timestamp (options --timefmt and --format).
With inotifywait running in the background, now create a dummy file in /tmp:

$ touch /tmp/xx
# the following lines get displayed:
2019/12/31-17:43:45 CREATE /tmp/xx
2019/12/31-17:43:45 ATTRIB /tmp/xx

Actually, the /tmp directory is a busy directory and a lot of file activity occurs in there very quickly:

2019/12/31-17:43:52 CREATE /tmp/hsperfdata_dmadmin/471
2019/12/31-17:43:52 MODIFY /tmp/hsperfdata_dmadmin/471
2019/12/31-17:43:52 MODIFY /tmp/hsperfdata_dmadmin/471
2019/12/31-17:44:06 MODIFY /tmp/nohup.log
2019/12/31-17:44:06 MODIFY /tmp/nohup.log
2019/12/31-17:44:06 MODIFY /tmp/nohup.log
2019/12/31-17:45:06 MODIFY /tmp/nohup.log
2019/12/31-17:45:06 MODIFY /tmp/nohup.log
2019/12/31-17:45:06 MODIFY /tmp/nohup.log
2019/12/31-17:45:22 CREATE /tmp/hsperfdata_dmadmin/510
2019/12/31-17:45:22 MODIFY /tmp/hsperfdata_dmadmin/510
2019/12/31-17:45:22 MODIFY /tmp/hsperfdata_dmadmin/510
2019/12/31-17:45:50 CREATE /tmp/runc-process785366568
2019/12/31-17:45:50 MODIFY /tmp/runc-process785366568
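Such a busy listing can be condensed into a count of events per file. The filter below is only a sketch (summarize_events is a made-up name); it assumes the space-separated “%T %e %w%f” format used above and file names without spaces:

```shell
#!/bin/bash
# Sketch: count the events per file in an inotifywait listing formatted as "%T %e %w%f".
# Plain awk suffices here; gawk works too.
summarize_events() {
   awk '{ count[$2 " " $3]++ }
        END { for (key in count) print count[key], key }' | LC_ALL=C sort -k3,3 -k2,2
}

# usage with a few lines taken from the listing above;
printf '%s\n' \
   '2019/12/31-17:43:52 CREATE /tmp/hsperfdata_dmadmin/471' \
   '2019/12/31-17:43:52 MODIFY /tmp/hsperfdata_dmadmin/471' \
   '2019/12/31-17:43:52 MODIFY /tmp/hsperfdata_dmadmin/471' \
   '2019/12/31-17:44:06 MODIFY /tmp/nohup.log' \
   '2019/12/31-17:44:06 MODIFY /tmp/nohup.log' \
   '2019/12/31-17:44:06 MODIFY /tmp/nohup.log' | summarize_events
# prints:
# 1 CREATE /tmp/hsperfdata_dmadmin/471
# 2 MODIFY /tmp/hsperfdata_dmadmin/471
# 3 MODIFY /tmp/nohup.log
```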

Let’s see if the directory is really watched recursively:

$ mkdir -p /tmp/dir1/dir2
> 2019/12/31-17:49:51 CREATE,ISDIR /tmp/dir1
> 2019/12/31-17:49:51 CREATE,ISDIR /tmp/dir1/dir2

We notice that the directory creation is trapped too.

$ touch /tmp/dir1/dir2/xx
> 2019/12/31-17:49:59 CREATE /tmp/dir1/dir2/xx
> 2019/12/31-17:49:59 ATTRIB /tmp/dir1/dir2/xx

It works as advertised. Moreover, two events are raised here, CREATE for the file creation as it didn’t previously exist, and ATTRIB for the change of timestamp. Let’s verify this:

$ touch /tmp/dir1/dir2/xx
> 2019/12/31-17:50:01 ATTRIB /tmp/dir1/dir2/xx
$ touch /tmp/dir1/dir2/xx
> 2019/12/31-17:50:03 ATTRIB /tmp/dir1/dir2/xx

Let’s see if a change of attribute is also noticed:

$ chmod -r /tmp/dir1/dir2/xx
2019/12/31-18:03:50 ATTRIB /tmp/dir1/dir2/xx
$ chmod +r /tmp/dir1/dir2/xx
2019/12/31-18:03:55 ATTRIB /tmp/dir1/dir2/xx

It is, fine.

Using inotify in a [containerized] Documentum installation

inotifywait will be used with the syntax shown above against the Documentum log directory ${DOCUMENTUM}/dba/log. Its output will be piped into a gawk script that will spawn “tail -F” commands when needed and keep a list of such processes so they can be killed after a timeout of inactivity, i.e. when they are following a file that did not get updated within a given time interval (let’s jokingly name this value “tail time to live”, or TTTL). This is to prevent a potentially unlimited number of such processes from hogging the system’s resources; not that they consume many CPU cycles, but it is pointless to leave hundreds of idle processes sitting in memory. So the script will start a tail command on a file upon its CREATE event, or upon a MODIFY event if the file is not already followed, and clean up existing idle tail commands. As said before, the code below could be included in a container’s entrypoint to make it run constantly in the background.


export watcher_workdir=/tmp/watcher
export tail_on_off=${watcher_workdir}/tail_on_off
export heartbeat_file=${watcher_workdir}/heartbeat_file
export pause_duration=3

# comma-separated list of files watched but excluded from tailing;
export excluded_files="$tail_on_off,$heartbeat_file"

# make private copies of inotifywait and tail so they can be easily identified and killed from the list of running processes;
export private_inotify=~/dctm-inotifywait
export private_tail=~/dctm-tail

follow_logs() {
   # tail time to live, maximum duration in minutes an unmodified tailed file will stay tailed before the tail is killed, i.e. TTTL ;-);
   # the value below is only an example, adjust it as needed;
   tail_timeout=1

   $private_inotify --quiet --monitor --event create,modify,attrib --recursive --timefmt "%Y/%m/%d-%H:%M:%S" --format "%T|%e|%w%f" ${DOCUMENTUM}/dba/log $watcher_workdir | gawk -v tail_timeout=$((tail_timeout * 60)) -v excluded_files="$excluded_files" -v tail_on_off=$tail_on_off -v heartbeat_file=$heartbeat_file -v env_private_tail=private_tail -v FS="|" -v Apo="'" 'BEGIN {
      # get the dedicated tail command from the environment;
      private_tail = ENVIRON[env_private_tail]

      # get current time;
      cmd_now = "date +\"%Y/%m/%d-%H:%M:%S\""
      # get the time of the next check, i.e. now + the timeout delay;
      cmd_future = "date +\"%Y/%m/%d-%H:%M:%S\" --date=\"" tail_timeout  " seconds\""
      cmd_future | getline next_check_date; close(cmd_future)
      # find the running private tail commands;
      # use the same FS defined in the outer gawk scope to guarantee that $0 can be split smoothly;
      cmd_tail = "pgrep -fa " private_tail " | gawk " Apo "{print $1 \"" FS "\" $NF}" Apo
      # get the most recent time a running idle tail command is allowed to live;
      cmd_oldest = "date +\"%Y/%m/%d-%H:%M:%S\" --date=\"" tail_timeout  " seconds ago\""
      # command to get the modification date of a tailed file;
      cmd_file_stamp = "date +\"%Y/%m/%d-%H:%M:%S\" --reference="
      # files can be excluded from tailing if specified in parameter excluded_files;
      # convert the comma-separated list of excluded files to an associative array, which is easier to search as it is indexed by file names;
      nb_files = split(excluded_files, tmp_excluded_files, ",")
      for (;nb_files > 0; nb_files--)
         tab_excluded_files[tmp_excluded_files[nb_files]] = 0

      bMust_tail = 1
   }
   {
      # skip directories (they have a trailing /);
      if (match($3, /\/$/)) next

      # check if logs must be tailed;
      if (tail_on_off == $3 && ("CREATE" == $2 || "MODIFY" == $2)) {
         if ((getline bMust_tail < tail_on_off) >  0)
            system("rm " tail_on_off " 2>/dev/null")
         # close the file so that it is read afresh the next time it is written;
         close(tail_on_off)
      } else if (bMust_tail && !($3 in current_tails)) {
         # tailing is on and the file is not followed yet;
         if ("CREATE" == $2 && !($3 in tab_excluded_files)) {
            # newly created file not in exclusion list: tail it !
            system(private_tail " --follow=name --retry " $3 " 2>/dev/null &")
            # ... and keep track of it;
            current_tails[$3] = 0
         } else if ("MODIFY" == $2 && !($3 in tab_excluded_files)) {
            # modified file not in exclusion list nor already tailed: tail it !
            system(private_tail " --follow=name --retry " $3 " 2>/dev/null &")
            # ... and keep track of it;
            current_tails[$3] = 0
         }
      }

      # clean up logic starts here;
      # only upon a heartbeat event;
      if ("ATTRIB" == $2 && $3 == heartbeat_file) {
         # time to check ?
         cmd_now | getline now; close(cmd_now)
         if (now >= next_check_date) {
            # maximum time to live for idle tails;
            cmd_oldest | getline oldest_date; close(cmd_oldest)
            # loop through all the running tail commands;
            while ((cmd_tail | getline) > 0) {
               # spurious entry, most likely the shell spawned for this very pipeline since its command line contains the pattern too;
               # ignore it explicitly even though it would be ignored anyway;
               if ("$NF}" Apo == $2) continue
               tail_pid = $1
               tailed_file = $2
               # is it one of the watched files ?
               if (tailed_file in current_tails) {
                  # get the current tailed file last modification date;
                  (cmd_file_stamp tailed_file " 2>/dev/null") | getline last_modif_date; close(cmd_file_stamp tailed_file " 2>/dev/null")
                  # tailed file not updated during time to live period ?
                  if (last_modif_date <= oldest_date) {
                     # yes, kill the tailing process;
                     system("kill -9 " tail_pid " 2> /dev/null")
                     # ... and update the list of tailed files;
                     delete current_tails[tailed_file]
                  }
                  else current_tails[tailed_file] = 1
               }
               # else it should not happen because private tail commands are only issued from here and get tracked;
            }
            # close the command so that it is executed again at the next check;
            close(cmd_tail)

            # resynchronize internal list with actual tailed files;
            for (f in current_tails)
               if (0 == current_tails[f])
                  # tracked file not tailed any more (this should not happen because private tail commands are killed from here only), untrack it;
                  delete current_tails[f]
               else
                  # tracked file was checked and is still alive;
                  current_tails[f] = 0
            # compute the next check time;
            cmd_future | getline next_check_date; close(cmd_future)
         }
      }
   }'
}

# -----------
# main;
# -----------

# create an inotifywait alias so it is easily identified to be stopped later;
ln -s /usr/bin/inotifywait $private_inotify 2> /dev/null
ln -s /usr/bin/tail $private_tail           2> /dev/null

# create the watcher's workdir;
mkdir -p $watcher_workdir 2> /dev/null

# enable following new or modified logs;
echo 1 > $tail_on_off
# start the watcher;
follow_logs &

while true; do
   # send heartbeat to inotify;
   touch $heartbeat_file
   # do something else here;
   # ...
   sleep $pause_duration
done

Admittedly, the fact that gawk relies a lot on external commands and pipes (because of its lack of libraries, although this can be remedied by extending it, as described in other articles on this blog) somewhat obscures the statements’ purpose. Also, all the logic is contained in the not-so-obvious implicit loop that gawk executes for each input line, i.e. each time an event is notified. Anyway, as usual, the code is only provided as an illustration.
At the top of the script, a “technical” sub-directory is defined (variable watcher_workdir); it is created later in the main section. It will contain the heartbeat file (file heartbeat_file, see the section “Keeping the gawk script busy” below for more details) and a file to switch tailing on and off (file tail_on_off). This directory is watched too (see the call to inotifywait in follow_logs()).
Next, private versions of the commands inotifywait and tail are defined (variables private_inotify and private_tail) and created later in the main section as symlinks; this is to facilitate their identification in the list of running processes in order to kill them if needed.
The watcher itself is encapsulated in the function follow_logs(), which is launched in the background from the main section.
At the end of the BEGIN block, the boolean bMust_tail is initialized; it gets updated upon a CREATE or MODIFY event on the file $tail_on_off; after it has been read into bMust_tail, the file is removed. By writing 0 or a positive number in it, aggregation of new logs is respectively disabled or enabled:

# disable following new logs;
# from within a container or in a traditional installation:
$ echo 0 > $tail_on_off
# or, for a container, supposing $tail_on_off is defined in the container's current user's environment:
$ docker exec container_name /bin/bash -c "echo 0 > \$tail_on_off"
# enable following new logs;
# from within a container or in a traditional installation:
$ echo 1 > $tail_on_off
# or, for a container, supposing $tail_on_off is defined in the container's current user's environment:
$ docker exec container_name /bin/bash -c "echo 1 > \$tail_on_off"

Currently running tail commands are not impacted.
Note how the $ in $tail_on_off is escaped so it is not consumed by the Docker host’s shell and is passed as-is to the container.
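The effect of the escaping can be observed without Docker. In the sketch below, a child bash started with a different value of the variable stands in for the container; the two paths are placeholders chosen for the illustration:

```shell
#!/bin/bash
# Sketch: the same quoting rule, demonstrated with a child bash standing in for the container.
export tail_on_off=/host/tail_on_off

# unescaped: the calling shell expands the variable before the child starts;
unescaped=$(env tail_on_off=/container/tail_on_off bash -c "echo $tail_on_off")
# escaped: the literal text $tail_on_off reaches the child, which expands it itself;
escaped=$(env tail_on_off=/container/tail_on_off bash -c "echo \$tail_on_off")

echo "unescaped: $unescaped"
echo "escaped:   $escaped"
# prints:
# unescaped: /host/tail_on_off
# escaped:   /container/tail_on_off
```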
In the main gawk rule, private tail commands are started by the two system() calls when new files have appeared in the watched directories, or when files that are not already tailed have been modified.
Upon a heartbeat event, a clean-up of idle (private) tail commands is performed and the internal associative array that keeps track of them is refreshed to make sure it reflects the currently running tails.
Finally, at the end of the main section, the well-known container entrypoint’s never-ending loop is entered.
The above script can also be run stand-alone in a classic, non-containerized installation.
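The idle test at the heart of the clean-up can also be sketched in isolation. The script above compares formatted date strings; the variant below uses seconds since the epoch instead, which is a simplification and not the script’s method. The function name is_idle is made up, and GNU date and touch are assumed:

```shell
#!/bin/bash
# Sketch: succeed when a file has not been modified for at least the given number of seconds.
is_idle() {
   local file=$1 ttl=$2 now last
   now=$(date +%s)
   # modification time of the file in seconds since the epoch (GNU date);
   last=$(date +%s --reference="$file" 2>/dev/null) || return 2
   [ $((now - last)) -ge "$ttl" ]
}

# usage with a throw-away file;
f=$(mktemp)
# pretend the file was last written 10 minutes ago;
touch -d "10 minutes ago" "$f"
is_idle "$f" 60 && echo "idle for more than 60 seconds"
# a fresh write makes it active again;
touch "$f"
is_idle "$f" 60 || echo "recently modified"
rm -f "$f"
```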

Keeping the gawk script busy

A problem to solve is that the gawk script runs as a coroutine to inotifywait, i.e. it is synchronized with its output. If there is none, because no events were detected, the script blocks waiting for input, so no cleaning of idle tail commands gets done. This is not such a big deal because no new tail commands are started either, so the status quo is maintained. However, an easy work-around is possible: in order to force the script to enter the next iteration and give it a chance to perform the cleaning, we introduce a heartbeat with one watched file, e.g.:

while true; do
   # send heartbeat to inotify;
   touch $heartbeat_file
   # do something else here;
   # ...
   sleep $pause_duration
done

This explains the presence of the heartbeat file’s directory, $watcher_workdir, in the list of inotifywait’s targets (see the call to inotifywait in the code above).
Now, because its timestamp is updated, at least one event is always raised periodically, an ATTRIB on the heartbeat file $heartbeat_file, as shown before. Although ATTRIB events are listened to, no action is generally taken upon them, except when they occur on $heartbeat_file, in which case their sole purpose is to trigger the script’s execution, more precisely, the cleanup of inactive tail commands.
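This dispatching of events can be illustrated on a canned event stream, without inotifywait. The filter below is only a simplified sketch of the idea, not the script’s actual logic: classify_events is a made-up name, /tmp/demo/session.log is a hypothetical file, and the input uses the “%T|%e|%w%f” format of the script:

```shell
#!/bin/bash
# Sketch: ATTRIB events only matter when they occur on the heartbeat file.
# Plain awk suffices here; gawk works too.
classify_events() {
   awk -F'|' -v heartbeat_file="$1" '
      # a heartbeat is the opportunity to check for idle tails;
      $2 == "ATTRIB" && $3 == heartbeat_file { print "cleanup check at " $1; next }
      # any other ATTRIB event is ignored;
      $2 == "ATTRIB" { next }
      # CREATE and MODIFY events designate files to follow;
      { print "candidate for tailing: " $3 }
   '
}

printf '%s\n' \
   '2019/12/31-17:50:01|CREATE|/tmp/demo/session.log' \
   '2019/12/31-17:50:02|ATTRIB|/tmp/demo/session.log' \
   '2019/12/31-17:50:03|ATTRIB|/tmp/watcher/heartbeat_file' | classify_events /tmp/watcher/heartbeat_file
# prints:
# candidate for tailing: /tmp/demo/session.log
# cleanup check at 2019/12/31-17:50:03
```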

Let’s test it now.
