Cesare Cervini

A password() function for dmgawk

A few days ago, as I was preparing a dmawk script for a presentation, I stumbled upon another unexpected error.
The script was attempting to connect to a docbase by providing a docbase name, a user name and a password. But before that, it tested whether a password was provided as a command-line parameter (I know, this is not very secure but it was for demonstration purposes only); if not, it prompted for one using dmawk’s built-in password() function. The full command was:

echo "select count(*) from dm_user" | dmawk -v docbase=dmtest -v username=dmadmin -f select.awk

with select.awk narrowed down to:

cat select.awk
BEGIN {
   passwd = password("please, enter password: ")
   print "password was:", passwd
}

The problem was that when something was piped into the script, it no longer prompted for a password. Without piping, it prompted as expected:

echo "select count(*) from dm_user" | dmawk73 -f ./getpasswd.dmawk
password was:
exiting ...

==> not prompted for password;

dmawk73 -f ./getpasswd.dmawk
please, enter password:
password was: Supercalifragilisticexpialidocious!
exiting ...

==> prompted for password;
Here, the alias dmawk73 points to the dmawk of Content Server v7.3, my current version of Documentum Content Server.
Note that the variant below did not work either:

cat query_file 
select
   count(*)
from
dm_user

dmawk73 -f ./getpasswd.dmawk < query_file
password was:
exiting ...

==> not prompted for password;
This suggests that what trips up dmawk’s password() function is the presence of input on stdin, whether it comes from a pipe or from a redirection.
Did they change (a politically correct way to say “break”) something in this version relative to a previous one? To be sure, I tried the same tiny script with dmawk from an ancient 5.3 installation I keep around for such puzzling occasions, and guess what? Nothing weird here, it worked as expected:

dmadmin@dmclient:~/getpasswd$ echo "select count(*) from dm_user" | dmawk53 -f ./getpasswd.dmawk
please, enter password:
password was: Supercalifragilisticexpialidocious
exiting ...

where the alias dmawk53 points to the content server v5.3’s dmawk.
A strace on dmawk53 shows that the device /dev/tty is read for input:

open("/dev/tty", O_RDWR|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 4
...
write(4, "please, enter password: ", 24) = 24
read(4, "kdk\n", 4096) = 4
...
close(4) = 0
...
write(1, "password was: kdk\n", 18) = 18
write(1, "exiting ...\n", 12) = 12

Evidently, something changed in the way the terminal is read, and the built-in password() function now gets disrupted when something is fed into stdin first.
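
To make the stdin vs. /dev/tty distinction concrete, here is a minimal Python sketch. It only illustrates the general Unix mechanism and says nothing about dmawk's internals, which I have not seen. The helper name describe_input() is made up for this example; it reports what a given file descriptor is connected to.

```python
import os
import stat

def describe_input(fd):
    """Tell what kind of object the file descriptor fd is connected to."""
    if os.isatty(fd):
        return "terminal"
    mode = os.fstat(fd).st_mode
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"

# describe_input(0) inspects stdin: a pipe after `echo ... | script`,
# a regular file after `script < query_file`, a terminal otherwise.
```

In the first two cases, a prompt that wants the keyboard cannot rely on file descriptor 0 and has to open /dev/tty itself, which is what the strace of dmawk53 shows.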
So how to work around this new pesky issue? Let’s see a few solutions. To be clear, I assume from the beginning that security is not a major concern here. Solutions 4, 5 and 6, however, are on the same security level as dmawk’s password() since they restore this function.

1. Give up piping into dmawk

This means that it will not be possible to chain the awk script to a previous command through a pipe. If this is acceptable, why not? dmawk’s input will have to come from a file, e.g.:

cat query_file 
select
   count(*)
from
dm_user

cat getpasswd_no_pipe.dmawk 
BEGIN {
   while ((getline < query_file) > 0)
      query = query "\n" $0
   close(query_file)
   print "query string is:", query
   pp = password("please, enter password: ")
   print "password was:", pp

   exit

}
END {
  print "exiting ..."
}

Execution:
dmadmin@dmclient:~/getpasswd$ dmawk73 -f getpasswd_no_pipe.dmawk -v query_file=query_file
query string is: 
select
   count(*)
from
dm_user

please, enter password: 
password was: Et tu, Brute?
exiting ...

If security matters and command concatenation is not needed, the above may be an acceptable work-around.

2. Using an environment variable

If security is not significant, the password could be passed in an environment variable, e.g.:

cat getpasswd_env.dmawk
BEGIN {
   cmd = "echo $password"
   cmd | getline pp
   close(cmd)
   print "password was:", pp
}
END {
  print "exiting ..."
}

Execution:

export password=Supercalifragilisticexpialidocious!
echo "select count(*) from dm_user" | dmawk73 -f ./getpasswd_env.dmawk
password was: Supercalifragilisticexpialidocious!
exiting ...

Here, it is mandatory to use the export statement because dmawk launches a sub-process to read the variable, and a sub-process only inherits the exported variables of its parent.
Unlike dmawk, gawk can map the process’ environment into the built-in associative array ENVIRON, which makes accessing $password more elegant and also faster as no sub-process gets spawned:

cat ./getpasswd_env.awk
BEGIN {
   print "password was:", ENVIRON["password"]
   exit
}
END {
   print "exiting..."
}

Execution:

echo "select count(*) from dm_user" | gawk -f ./getpasswd_env.awk
password was: Supercalifragilisticexpialidocious!
exiting...
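
For comparison, Python's counterpart of ENVIRON is os.environ. The sketch below is my own illustration of why the export matters: a child shell only sees what was placed in its environment. The helper name password_seen_by_child() is hypothetical, and the two dictionaries stand for a shell where the variable was exported and one where it was not.

```python
import subprocess

def password_seen_by_child(env):
    """Return what $password expands to in a /bin/sh child started with env."""
    result = subprocess.run(
        ["/bin/sh", "-c", 'printf "%s" "$password"'],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout

# the variable is part of the child's environment, as after `export password=...`
exported = {"PATH": "/usr/bin:/bin", "password": "Supercalifragilisticexpialidocious!"}
# the variable was only set in the parent shell and never exported
not_exported = {"PATH": "/usr/bin:/bin"}

print(repr(password_seen_by_child(exported)))      # 'Supercalifragilisticexpialidocious!'
print(repr(password_seen_by_child(not_exported)))  # ''
```

Within the Python process itself, os.environ.get("password") gives the value directly, without spawning anything, just like ENVIRON in gawk.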

A little digression here while on the subject of environment variables: it’s a little-known fact that the tools iapi and idql support 3 handy but rarely used environment variables: DM_DOCBASE_NAME, DM_USER_NAME and DM_PASSWORD. If those are set, all together or individually, the above utilities can be launched with the corresponding options -ENV_CONNECT_DOCBASE_NAME, -ENV_CONNECT_USER_NAME and -ENV_CONNECT_PASSWORD, and the corresponding parameters can be omitted. E.g.:

export DM_DOCBASE_NAME=dmtest
export DM_USER_NAME=kermit
export DM_PASSWORD=conehead
idql -ENV_CONNECT_DOCBASE_NAME -ENV_CONNECT_USER_NAME -ENV_CONNECT_PASSWORD <<EoQ
   select count(*) from dm_user
   go
   quit
EoQ
Connected to Documentum Server running Release 7.3.0000.0214  Linux64.Oracle
1> 2> count(*)              
----------------------
                    61
(1 row affected)
1> Bye

However, there is no prompt for missing parameters or unset variables and, quite surprisingly, the command fails silently in such cases.
Nonetheless, the point here is that we could standardize on these variable names and use them with awk, e.g. (dm)awk would pull out those parameters from the environment as follows:

echo "select count(*) from dm_user" | dmawk73 'BEGIN {
   cmd = "echo $DM_DOCBASE_NAME $DM_USER_NAME $DM_PASSWORD"
   # getline reads the whole line into one variable, so split it afterwards
   cmd | getline line
   close(cmd)
   split(line, parm, " ")
   print parm[1], parm[2], (parm[3] ? parm[3] : "N/A")
}'
dmtest kermit conehead

whereas gawk could choose to access those environment variables through the built-in ENVIRON associative array:

echo "select count(*) from dm_user" | gawk 'BEGIN { print ENVIRON["DM_DOCBASE_NAME"], ENVIRON["DM_USER_NAME"], ENVIRON["DM_PASSWORD"] ? ENVIRON["DM_PASSWORD"] : "N/A"}'
dmtest kermit conehead

which can be more readable in some cases since its indexes are explicitly named vs. positional.
See section 5 below to know what dmawk and gawk have in common regarding Documentum.
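
Since iapi and idql fail silently when one of those variables is unset, a script could check them by name before going any further. Here is a minimal sketch of such a pre-flight check in Python; the function name connect_params() and the choice to treat an empty value as missing are assumptions of mine, not something the Documentum tools do.

```python
import os

CONNECT_VARS = ("DM_DOCBASE_NAME", "DM_USER_NAME", "DM_PASSWORD")

def connect_params(env=None):
    """Return (params, missing): the variables that are set and the names that are not."""
    if env is None:
        env = os.environ
    # an empty value is considered missing, same as an unset variable
    params = {name: env[name] for name in CONNECT_VARS if env.get(name)}
    missing = [name for name in CONNECT_VARS if name not in params]
    return params, missing

params, missing = connect_params({"DM_DOCBASE_NAME": "dmtest", "DM_USER_NAME": "kermit"})
print(params)   # {'DM_DOCBASE_NAME': 'dmtest', 'DM_USER_NAME': 'kermit'}
print(missing)  # ['DM_PASSWORD']
```

As with ENVIRON, the lookup is by name, so an unset variable cannot shift the others out of position the way it could with a space-separated echo.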

3. Reading the password from a file

Here too, let’s admit that security is not important so a cleartext password could be read from a text file as follows:

cat getpasswd_from_file.awk
# Usage:
#    dmawk -v password_file=... -f getpasswd_from_file.awk
BEGIN {
   if (!password_file) {
      print "missing password_file parameter"
      exit
   }
   getline pp < password_file
   close(password_file)
   print "password was:", pp
}

Execution:

cat password_file
Supercalifragilisticexpialidocious!

echo "select count(*) from dm_user" | dmawk -f getpasswd_from_file.awk  -v password_file=password_file
password was: Supercalifragilisticexpialidocious!

No surprise here.
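
If a clear-text password file is used anyway, it costs little to at least refuse a file that other users can read. The following Python sketch shows the idea; the function name read_password_file() and the owner-only rule (similar in spirit to what ssh demands of private keys) are my own assumptions, not a Documentum requirement.

```python
import os
import stat

def read_password_file(path):
    """Return the first line of path, refusing files accessible by group or others."""
    mode = os.stat(path).st_mode
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(path + " must be accessible by its owner only (chmod 600)")
    with open(path, encoding="utf-8") as f:
        # only the first line is used, as in the awk script above
        return f.readline().rstrip("\n")
```

With a password_file protected by chmod 600, read_password_file("password_file") returns its first line; with the usual 644 permissions, it raises PermissionError instead.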

4. Access bash’s read -s command

The bash shell has the built-in command read, which takes the -s option to prevent the typed characters from being echoed on the screen. Unfortunately, while bash is most of the time the login shell, it is not always the subshell invoked when awk spawns a command, which it does when executing things like “cmd | getline”. Actually, it is /bin/sh that is invoked as a subshell under Linux; on the Ubuntu 16.04 and 18.04 I’m using here, it is a symlink to /bin/dash, a much smaller and supposedly faster shell than bash, whereas under CentOS /usr/bin/sh is symlinked to /usr/bin/bash. So, how to force bash as a subshell?
I could not find any system setting to configure the choice of the subshell. Changing the /bin/sh symlink to make it point to /bin/bash obviously works, but it is a system-wide change and is not recommended because of possible compatibility issues.
The solution is to explicitly tell the subshell to make bash execute the read. But this is not enough: we also need to explicitly tell read to get its input from /dev/tty, otherwise it gets mixed up with any piped or redirected input. Here is a solution:

cat getpasswd_tty.dmawk
BEGIN {
   pp = getpassword("please, enter password: ")
   print "\npassword was:", pp
   exit
}
END {
  print "exiting ..."
}
function getpassword(prompt     , cmd, passwd) {
   cmd = "/bin/bash -c 'read -s -p \"" prompt "\" passwd < /dev/tty; echo $passwd'"
   cmd | getline passwd
   close(cmd)
   return passwd
}

Execution:
echo "select count(*) from dm_user" | dmawk -f  getpasswd_tty.dmawk 
please, enter password: password: 
password was: AreYo7Kidd8ngM3?
exiting ...

The assignment to cmd in getpassword() invokes bash from whatever subshell is launched by dmawk, and asks it to execute the read built-in without echo, with the given prompt, and with its input coming directly from the device /dev/tty.
In the function’s declaration, note the extra formal parameters cmd and passwd; since the function is called without any actual value for those, they behave as local variables; this is a common idiom in awk, where all other variables are global and come into existence as soon as they are referenced.
Under CentOS, where /bin/sh already points to bash, the cmd assignment can be slightly simplified:

   cmd = "read -s -p \"" prompt "\" passwd < /dev/tty; echo $passwd"

This work-around is the easiest and closest to the original built-in password() function.
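
One fragile point of getpassword() above is that the prompt is pasted into the shell command, so a prompt containing quotes would break it. A possible variation, sketched here in Python with made-up names (READ_SCRIPT, bash_read_command), hands the prompt to bash as a positional parameter instead. The -r option of read, which keeps backslashes verbatim, is my own addition and not something dmawk's password() is known to do.

```python
import subprocess

# $1 is the prompt; the password is read from the terminal, not from stdin
READ_SCRIPT = 'read -r -s -p "$1" passwd < /dev/tty; printf "%s" "$passwd"'

def bash_read_command(prompt):
    """Build the argv that makes bash prompt on /dev/tty without echo."""
    # "getpassword" fills $0 and the prompt becomes $1, so it needs no escaping
    return ["/bin/bash", "-c", READ_SCRIPT, "getpassword", prompt]

def getpassword(prompt):
    """Run the command; needs a controlling terminal, whatever stdin is."""
    done = subprocess.run(bash_read_command(prompt),
                          stdout=subprocess.PIPE, text=True, check=True)
    return done.stdout
```

Calling getpassword("please, enter password: ") from a script whose stdin is a pipe should still prompt on the terminal, for the same reason as the awk version: the read is redirected from /dev/tty.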

5. Implement password() in dmgawk

Those who have read my blog here know that we now have a much more powerful implementation of awk in our toolbox, GNU gawk, which we can extend to suit our needs. The above blog describes how to extend gawk with connectivity to Documentum docbases; I jokingly named the resulting awk dmgawk. As glibc includes the getpass() function just for this purpose, why not use the same approach and add to dmgawk a sensible password() function around C’s getpass() that works as before? Let’s put our money where our mouth is and implement this function in dmgawk. In fairness, it should be noted that getpass() is marked as obsolete, so this alternative should be considered a temporary work-around.
I won’t copy here all the steps from the above blog though; here are only the distinctive ones.
The interface’s source:

cat ~/dmgawk/gawk-4.2.1/extension/password.c
/*
 * password.c - Builtin function that provides an interface to the getpass() function;
 * see dmapp.h for description of functions;
 *
 * C. Cervini
 * dbi-services.com
 * 7/2018
 */
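#ifdef HAVE_CONFIG_H
#include <config.h>
#endif

/* system headers needed by gawkapi.h and by the calls made below */
#include <stdio.h>
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

#include "gawkapi.h"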

#include "gettext.h"
#define _(msgid)  gettext(msgid)
#define N_(msgid) msgid

static const gawk_api_t *api;   /* for convenience macros to work */
static awk_ext_id_t ext_id;
static const char *ext_version = "password extension: version 1.0";
static awk_bool_t (*init_func)(void) = NULL;

int plugin_is_GPL_compatible;

/*  do_password: prompt on the terminal and return the typed password */
static awk_value_t *
do_password(int nargs, awk_value_t *result, struct awk_ext_func *unused) {
   awk_value_t prompt;
   char *passwd;

   assert(result != NULL);

   /* the prompt is optional; default to an empty one */
   if (get_argument(0, AWK_STRING, &prompt)) {
      passwd = getpass(prompt.str_value.str);
   }
   else passwd = getpass("");

   /* getpass() returns NULL on failure; never hand NULL over to strlen() */
   if (passwd == NULL)
      passwd = "";
   return make_const_string(passwd, strlen(passwd), result);
}

/*
these are the exported functions along with their max and min numbers of arguments, in that order;
let's make the prompt parameter optional, as in dmawk;
*/
static awk_ext_func_t func_table[] = {
        { "password", do_password, 1, 0, awk_false, NULL },
};

/* define the dl_load function using the boilerplate macro */

dl_load_func(func_table, password, "")

Compilation steps:

cd ~/dmgawk/gawk-4.2.1/extension
vi Makefile.am
append the new library to the pkgextension_LTLIBRARIES list:
pkgextension_LTLIBRARIES =      \
        filefuncs.la    \
        fnmatch.la      \
        fork.la         \
        inplace.la      \
        intdiv.la       \
        ordchr.la       \
        readdir.la      \
        readfile.la     \
        revoutput.la    \
        revtwoway.la    \
        rwarray.la      \
        time.la         \
        dctm.la         \
        password.la

later:
dctm_la_SOURCES       = dctm.c
dctm_la_LDFLAGS       = $(MY_MODULE_FLAGS)
dctm_la_LIBADD        = $(MY_LIBS)

password_la_SOURCES  = password.c
password_la_LDFLAGS  = $(MY_MODULE_FLAGS)
password_la_LIBADD   = $(MY_LIBS)

run the make command:
make

go one level up and run the make command again:
make

At this point, the new gawk is ready for use. Let’s test it:

cat getpasswd.awk
@load "password"

BEGIN {
   passwd = password("please, enter password: ")
   print "password was:", passwd
}

END {
   print "exiting..."
}

Execution:

echo "select count(*) from dm_user" | AWKLIBPATH=~/dmgawk/gawk-4.2.1/extension/.libs ~/dmgawk/gawk-4.2.1/gawk -f ./getpasswd.awk 
please, enter password: 
password was: precipitevolissimevolmente
exiting...

If all is good, install the new extension system-wide as follows:

cd ~/dmgawk/gawk-4.2.1
sudo make install

make an alias to the new gawk:
alias dmgawk=/usr/local/bin/gawk
The usage is simplified now:
echo "select count(*) from dm_user" | dmgawk -f ./getpasswd.awk
please, enter password: 
password was: humptydumpty
exiting...

dmgawk looks more and more like a valuable substitute for dmawk. What gets broken in dmawk can be fixed by dmgawk.

6. And in Python?

Those who use Python for their Documentum administration tasks, extended with the Documentum connectivity as proposed in my blog here, are even luckier because Python has a library for just about everything but the kitchen sink, and an equivalent of C’s getpass(), appropriately named getpass(), already exists in the standard library, see here. Therefore, there is no need to write one using e.g. ctypes. Here is how to call Python’s getpass():

cat getpasswd.py 
#!/usr/bin/python

import getpass

passwd = getpass.getpass(prompt = "Please, enter password: ")
print("The password is: " + passwd)

Execution:
echo "select count(*) from dm_user" | ./getpasswd.py
Please, enter password: 
The password is: Did the quick brown fox jump over the lazy dog ?

No muss, no fuss here.
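
To come back to the logic of the original script (use the password given on the command line if there is one, prompt for it otherwise), here is how it could look in Python. This is only a sketch; the function name resolve_password() and its ask parameter, which lets a test replace the real prompt, are my own inventions.

```python
import getpass

def resolve_password(cli_value=None, prompt="Please, enter password: ", ask=getpass.getpass):
    """Return the password given on the command line, or prompt for it if absent."""
    if cli_value:
        return cli_value
    # getpass.getpass() talks to the terminal, so piped input does not disturb it
    return ask(prompt)
```

A script would typically call resolve_password(args.password) after parsing its command line, where args.password stands for whatever option the script defines.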

Conclusion

It’s quite interesting to see how basic things that we take for granted get broken from one Documentum release to another. On the bright side though, those little frustrations give us the opportunity to look for work-arounds, and write blogs about them ;-). I am eager to find the next dysfunction and pretty confident that Documentum will not disappoint me in this respect.

 
