Infrastructure at your Service

The PostgreSQL psql utility provides some really nice features. One of these features is SQL interpolation which allows us to do interesting things, e.g. reading files and analyze the results directly in the database. This post will show how to use this by reading and analyzing sar files of a linux server.

Read More
Usually linux hosts store sar statistics in the /var/log/sa directory:

ls /var/log/sa
 sa02 sa03 sa04 sa05 sa06 sa07 sa08 sa09 sa10 sa11 sa12 sa13 sa14 sa15 sa16 sa18 sa19 sa21 sa25 sa28 sa29 sa30

This files can later be analyzed by providing the file name as an argument to sar:

sar -f /var/log/sa/sa07  | head -8
Linux 3.8.13-68.1.2.el7uek.x86_64 (oel7.lcsys.ch) 	05/07/2015 	_x86_64_	(1 CPU)

09:49:30 AM       LINUX RESTART

09:50:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
10:00:01 AM     all      0.03      0.00      0.11      0.13      0.00     99.73
10:10:01 AM     all      0.48      0.00      0.38      1.93      0.00     97.21
10:20:01 AM     all      0.05      0.00      0.15      0.08      0.00     99.73

 

For the scope of this post let’s say there is a requirement to store the iowait statistics for a long time so that the values may be analyzed at any point in time. The perfect place to store data that needs to be analyzed/queried is the database 🙂
Now let’s come back to SQL interpolation. psql allows us to store the result of an OS command into a variable, e.g:

postgres=# \set blubb `ls -la`

The variable then can be queried with standard sql:

postgres=# select :'blubb';
                                          ?column?                                           
---------------------------------------------------------------------------------------------
 total 2521056                                                                              +
 drwx------.  5 postgres postgres       4096 Jun 15 10:54 .                                 +
 drwxr-xr-x.  4 root     root             37 May 28 11:10 ..                                +
 -rw-rw-r--.  1 postgres postgres        218 May  7 16:37 1mio_copy_index_after_hstore.sql  +
 -rw-rw-r--.  1 postgres postgres        217 May  7 16:36 1mio_copy_index_before_hstore.sql +
 -rw-rw-r--.  1 postgres postgres        149 May  7 16:07 1mio_copy_no_indexes_hstore.sql   +
 -rw-rw-r--.  1 postgres postgres        144 May  8 16:20 1mio_copy_no_indexes_jsonb.sql    +

… …

If the result of any command can be stored into a variable in psql we are able to do something like this:

postgres=# \set iowait `sar -f /var/log/sa/sa07`
postgres=# select :'iowait';
                                            ?column?                                             
-------------------------------------------------------------------------------------------------
 Linux 3.8.13-68.1.2.el7uek.x86_64 (oel7.lcsys.ch)       05/07/2015      _x86_64_        (1 CPU)+
                                                                                                +
 09:49:30 AM       LINUX RESTART                                                                +
                                                                                                +
 09:50:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle                +
 10:00:01 AM     all      0.03      0.00      0.11      0.13      0.00     99.73                +
 10:10:01 AM     all      0.48      0.00      0.38      1.93      0.00     97.21                +

As we are only interested in the iowait statistics let’s get rid of everything we don’t need:

postgres=# \set iowait `sar -f /var/log/sa/sa07 | egrep -v "^$|Average|Linux|LINUX|CPU" | awk -F " " '{print $7}'`
postgres=# select :'iowait';
 ?column? 
----------
 0.13    +
 1.93    +
 0.08    +
 0.15    +
 0.06    +

Looks much better. But if we insert this into a table the result would be one varchar column which is not very good for data analysis. Let’s do some more transformation and delete all new lines and replace all spaces with commas:

postgres=# \set iowait `sar -f /var/log/sa/sa07 | egrep -v "^$|Average|Linux|LINUX|CPU" | awk -F " " '{print $7}' | sed ':a;N;$!ba;s/\n/,/g'`
postgres=# select :'iowait';
                                                                                                        ?c
Column?                                                                                                    
    
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----
 0.13,1.93,0.08,0.15,0.06,0.79,6.15,0.01,0.00,7.20,5.89,0.46,0.05,0.01,0.00,0.00,0.00,0.00,0.00,0.02,0.00,
0.08,0.11,0.04,0.02,0.00,0.00,0.00,0.00,1.05,0.05,0.12,0.00,0.00,0.07,0.67,0.04,0.87,1.14,0.10,0.12,0.03,0
.00
(1 row)

We now have all the values on one line separated by comma. A perfect starting point for an array:

postgres=# select string_to_array(:'iowait',',');
                                                                                                     string_to_array                                                       
                                               
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------
 {0.13,1.93,0.08,0.15,0.06,0.79,6.15,0.01,0.00,7.20,5.89,0.46,0.05,0.01,0.00,0.00,0.00,0.00,0.00,0.02,0.00,0.08,0.11,0.04,0.02,0.00,0.00,0.00,0.00,1.05,0.05,0.12,0.00,0.00
,0.07,0.67,0.04,0.87,1.14,0.10,0.12,0.03,0.00}
(1 row)

Now we have an array of strings. Still not the best thing for analyzing numbers, so let’s do another cast:

postgres=# select string_to_array(:'iowait',',')::numeric[];
                                                                                                     string_to_array                                                       
                                               
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------
 {0.13,1.93,0.08,0.15,0.06,0.79,6.15,0.01,0.00,7.20,5.89,0.46,0.05,0.01,0.00,0.00,0.00,0.00,0.00,0.02,0.00,0.08,0.11,0.04,0.02,0.00,0.00,0.00,0.00,1.05,0.05,0.12,0.00,0.00
,0.07,0.67,0.04,0.87,1.14,0.10,0.12,0.03,0.00}
(1 row)

This is something we can work with. No we need a table to store this:

postgres=# create table iowaits ( day date, iowaits numeric[] );
CREATE TABLE
postgres=# insert into iowaits (day,iowaits) 
                  values ( current_date, string_to_array(:'iowait',',')::numeric[] );
INSERT 0 1

Perfect. Let’s do some analyzing:

postgres=# with bb as ( select unnest(iowaits) vals, day   from iowaits where day = current_date ) select max(vals) from bb;
 max  
------
 7.20
(1 row)

Sure, for a real analysis the data model needs some more investigation. For showing how easy it is to work with files or results of OS commands in PostgreSQL this should be enough 🙂

One Comment

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Daniel Westermann
Daniel Westermann

Principal Consultant & Technology Leader Open Infrastructure