ORA-12801 and ORA-12154 with OracleRestart / GRID / RAC on Oracle Database 12c

Symptom/Analysis:

Using Oracle 12c in a RAC environment, you may encounter the following errors:

ORA-12801: error signaled in parallel query server P000, instance rac2.localdomain:DBRAC2 (2)
ORA-12154: TNS:could not resolve the connect identifier specified

In this article, we will present you an issue that is inspired from a real case:

olivier@olivier-ThinkPad-T550 ~ $ sqlplus sys/***@DBRAC1 as sysdba
SQL*Plus: Release 12.1.0.2.0 Production on Mon Jan 9 15:03:42 2017
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options
SQL> 
select count(*) from DBA_OBJECTS@DBLINK_ONE_MYHOME_TNS, SCOTT.OBJECTS;
exit
ERROR at line 1:
ORA-12801: error signaled in parallel query server P000, instance rac2.localdomain:DBRAC2 (2)
ORA-12154: TNS:could not resolve the connect identifier specified

The following queries return answers with no error:

SQL> select count(*) from DBA_OBJECTS@DBLINK_ONE_MYHOME_TNS;
COUNT(*)
----------
20342
SQL> select count(*) from SCOTT.OBJECTS;
COUNT(*)
----------
90951

Strange !
select * from A is working.
select * from B is working.
select * from A,B is not working.

Let’s check if TNS_ADMIN Oracle environment variable is set in the session:
We will use the dbms_system.get_env function

dbms_system.get_env()
SQL> 
set serveroutput on
DECLARE
RetVal VARCHAR2(4000);
BEGIN
dbms_system.get_env('TNS_ADMIN', RetVal);
dbms_output.put_line('TNS_ADMIN: '||RetVal);
END;
/
SQL> 2 3 4 5 6 7
TNS_ADMIN: /u01/app/MyNetworkAdmin/
PL/SQL procedure successfully completed.
TNS_ADMIN is set correctly.

The dblink is working and is correctly created:

dba_db_links
SQL> SELECT owner, db_link, username, host FROM   dba_db_links ORDER BY owner, db_link;
OWNER	DB_LINK 		  USERNAME   HOST
------- ------------------------- ---------- --------------------
PUBLIC	DBLINK_ONE_MYHOME_TNS	  SYSTEM     DBONE-MYHOME-TNS
DBMS_METADATA.GET_DDL
SQL> SELECT DBMS_METADATA.GET_DDL('DB_LINK','DBLINK_ONE_MYHOME_TNS','PUBLIC') ddl from dual;
DDL
-------------------------------------------------------------------------
CREATE PUBLIC DATABASE LINK "DBLINK_ONE_MYHOME_TNS" 
   CONNECT TO "SYSTEM" IDENTIFIED BY VALUES ':1' USING 'DBONE-MYHOME-TNS'
/u01/app/MyNetworkAdmin/tnsnames.ora
DBONE-MYHOME-TNS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = X.X.X.X)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SID = ONE)
    )
  )

So what ?
Let’s print the SQL plan:

SQL> explain plan for
2 select count(*) from DBA_OBJECTS@DBLINK_ONE_MYHOME_TNS,SCOTT.OBJECTS;
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------
Plan hash value: 1869185832
-------------------------------------------------------------------------------------------------------------
| Id  | Operation		 | Name        | Rows  | Cost (%CPU)| Time     | TQ/Ins |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	 |	       |     1 | 19591	 (2)| 00:00:01 |	|      |	    |
|   1 |  SORT AGGREGATE 	 |	       |     1 |	    |	       |	|      |	    |
|   2 |   PX COORDINATOR	 |	       |       |	    |	       |	|      |	    |
|   3 |    PX SEND QC (RANDOM)	 | :TQ10001    |     1 |	    |	       |  Q1,01 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE	 |	       |     1 |	    |	       |  Q1,01 | PCWP |	    |
|   5 |      MERGE JOIN CARTESIAN|	       |  1810M| 19591	 (2)| 00:00:01 |  Q1,01 | PCWP |	    |
|   6 |       BUFFER SORT	 |	       |       |	    |	       |  Q1,01 | PCWC |	    |
|   7 |        PX RECEIVE	 |	       | 19911 |    65	 (0)| 00:00:01 |  Q1,01 | PCWP |	    |
|   8 | 	PX SEND BROADCAST| :TQ10000    | 19911 |    65	 (0)| 00:00:01 |	| S->P | BROADCAST  |
|   9 | 	 REMOTE 	 | DBA_OBJECTS | 19911 |    65	 (0)| 00:00:01 | DBLIN~ | R->S |	       |
|  10 |       BUFFER SORT	 |	       | 90951 | 19526	 (2)| 00:00:01 |  Q1,01 | PCWP |	    |
|  11 |        PX BLOCK ITERATOR |	       | 90951 |    15	 (0)| 00:00:01 |  Q1,01 | PCWC |	    |
|  12 | 	TABLE ACCESS FULL| OBJECTS     | 90951 |    15	 (0)| 00:00:01 |  Q1,01 | PCWP |	    |
-------------------------------------------------------------------------------------------------------------
Remote SQL Information (identified by operation id):
----------------------------------------------------
9 - SELECT 0 FROM "DBA_OBJECTS" "DBA_OBJECTS" (accessing 'DBLINK_ONE_MYHOME_TNS' )
Note
-----
- Degree of Parallelism is 4 because of table property
29 rows selected.
 

Parallelism is enabled for the query.
Let’s disable it !

SQL> ALTER SESSION disable parallel query;

Let’s run our query again:

SQL> select count(*) from DBA_OBJECTS@DBLINK_ONE_MYHOME_TNS,SCOTT.OBJECTS;
  COUNT(*)
----------
1850125242

It is working now, but without using parallelism features.

Problem:

In fact, the problem comes from the environment variable TNS_ADMIN that is not (or not correctly) set for the PX servers process:

When parallelism is enabled, the PX servers are doing all the work. (as shown in our parallel plans)
On the server, PX server processes can be easily identified on the OS level. On Linux they are the processes ora_p***:
Source: http://www.oracle.com/technetwork/database/bi-datawarehousing/twp-parallel-execution-fundamentals-133639.pdf

[oracle@rac1 admin]$ ps -ef | grep ora_p0
oracle 25803 1 0 11:21 ? 00:00:00 ora_p000_DBRAC1
oracle 25805 1 0 11:21 ? 00:00:00 ora_p001_DBRAC1
oracle 25807 1 0 11:21 ? 00:00:00 ora_p002_DBRAC1
oracle 25809 1 0 11:21 ? 00:00:00 ora_p003_DBRAC1
oracle 28021 1 0 14:25 ? 00:00:00 ora_p004_DBRAC1
oracle 28023 1 0 14:25 ? 00:00:00 ora_p005_DBRAC1
oracle 28025 1 0 14:25 ? 00:00:00 ora_p006_DBRAC1
oracle 28027 1 0 14:25 ? 00:00:00 ora_p007_DBRAC1
oracle 28029 1 0 14:25 ? 00:00:00 ora_p008_DBRAC1
oracle 28031 1 0 14:25 ? 00:00:00 ora_p009_DBRAC1
oracle 28033 1 0 14:25 ? 00:00:00 ora_p00a_DBRAC1
oracle 28035 1 0 14:25 ? 00:00:00 ora_p00b_DBRAC1
oracle 28037 1 0 14:25 ? 00:00:00 ora_p00c_DBRAC1
oracle 28039 1 0 14:25 ? 00:00:00 ora_p00d_DBRAC1
oracle 28041 1 0 14:25 ? 00:00:00 ora_p00e_DBRAC1
oracle 28047 1 0 14:25 ? 00:00:00 ora_p00f_DBRAC1

The file /proc/25803/environ contains the Oracle environment variables set for the P000 Process:
Command “sudo strings /proc/25803/environ | grep TNS_” give no result.
Source: https://blogs.oracle.com/myoraclediary/entry/how_to_check_environment_variables

If we check the environ file of the listener processes, we can see that the TNS_ADMIN is correctly set.

[oracle@rac1 ~]$ ps -ef | grep lsn
oracle 2731 12705 0 14:49 pts/0 00:00:00 grep --color=auto lsn
oracle 4176 1 0 08:57 ? 00:00:00 /u01/app/grid/bin/tnslsnr MGMTLSNR -no_crs_notify -inherit
oracle 4309 1 0 08:57 ? 00:00:00 /u01/app/grid/bin/tnslsnr LISTENER_SCAN2 -no_crs_notify -inherit
oracle 4320 1 0 08:57 ? 00:00:00 /u01/app/grid/bin/tnslsnr LISTENER_SCAN3 -no_crs_notify -inherit
oracle 9059 1 0 12:01 ? 00:00:00 /u01/app/grid/bin/tnslsnr LISTENER -no_crs_notify -inherit
[oracle@rac1 ~]$ sudo strings /proc/9059/environ | grep TNS_
TNS_ADMIN=/u01/app/MyNetworkAdmin

This small environ.sh script for the lazy one can list the Oracle environment variables set for all the PX Server process and pmon:

environ.sh
#!/bin/bash
mypspmon='/tmp/pspmon'
myprocess1='pmon_DB'
myprocess2='ora_p00'
ps -ef | grep $myprocess1 | awk '{print $2}' > $mypspmon
ps -ef | grep $myprocess2 | awk '{print $2}' >> $mypspmon
while read ligne;
do
myenvironfile="/proc/${ligne#* }/environ"
if [ -e "$myenvironfile" ]
then
strings $myenvironfile
fi
done < $mypspmon

“sudo ./environ.sh | grep TNS” give no result
The TNS_ADMIN Oracle environment variable is not set for the PX server processes that are spawn to handle the parallel queries.

Solution:

  1. Be sure to set the TNS_ADMIN with srvctl !!
  2. Source: https://docs.oracle.com/cd/B19306_01/rac.102/b14197/srvctladmin.htm#i1010191

    [oracle@rac1 ~]$ srvctl setenv listener -l LISTENER -t TNS_ADMIN='/u01/app/MyNetworkAdmin/'
    [oracle@rac1 ~]$ srvctl setenv database -d DBRAC -t TNS_ADMIN='/u01/app/MyNetworkAdmin/'
    

    Let’s check if the variable are correctly set

    [oracle@rac1 ~]$ srvctl getenv listener -l LISTENER -t TNS_ADMIN
    
    LISTENER:
    TNS_ADMIN=/u01/app/MyNetworkAdmin/
    
    [oracle@rac1 ~]$ srvctl getenv database -d DBRAC -t TNS_ADMIN
    
    DBRAC:
    TNS_ADMIN=/u01/app/MyNetworkAdmin/
    

    TNS_ADMIN seems to be correctly set but we still receive ORA-12801 ORA-12154 errors.
    Moreover “sudo ./environ.sh | grep TNS” still gives no result !
    You need to restart database to set up thoroughly the TNS_ADMIN environment variable database !!

    [oracle@rac1 admin]$ srvctl stop listener -l LISTENER
    [oracle@rac1 admin]$ srvctl start listener -l LISTENER
    [oracle@rac1 admin]$ srvctl stop database -d DBRAC
    [oracle@rac1 admin]$ srvctl start database -d DBRAC
    
    [oracle@rac1 admin]$ srvctl status listener -l LISTENER
    
    Listener LISTENER is enabled
    Listener LISTENER is running on node(s): rac1,rac2
    
    [oracle@rac1 admin]$ srvctl status database -d DBRAC
    
    Instance DBRAC1 is running on node rac1
    Instance DBRAC2 is running on node rac2
    

    Now our “sudo ./environ.sh | grep TNS” commands list the TNS_ADMIN environment variable used by our pmon and px server processes.

    TNS_ADMIN=/u01/app/MyNetworkAdmin/
    TNS_ADMIN=/u01/app/MyNetworkAdmin/
    TNS_ADMIN=/u01/app/MyNetworkAdmin/
    TNS_ADMIN=/u01/app/MyNetworkAdmin/
    [...]
    TNS_ADMIN=/u01/app/MyNetworkAdmin/
    

    And our query is working using parallelism features.

  3. Other solutions exist:
  4. One of them would consist in modifying USR_ORA_ENV cluster resources attribute values with crsctl

    [oracle@rac1 admin]$ crsctl modify resource ora.dbrac.db -attr "USR_ORA_ENV=TNS_ADMIN=/u01/app/MyNetworkAdmin/"
    
    CRS-4995: The command 'Modify resource' is invalid in crsctl. Use srvctl for this command.
    

    In 12c, just add the flag -unsupported ad the end to avoid CRS-4995 errors.
    [oracle@rac1 admin]$ crsctl modify resource ora.dbrac.db -attr “USR_ORA_ENV=TNS_ADMIN=/u01/app/MyNetworkAdmin/” -unsupported

Comments:

  1. This issue is inspired from a real case. You can easily reproduce the issue
  2. SCOTT.OBJECTS table was created like that.

    SCOTT.OBJECTS
    
    create table SCOTT.OBJECTS as select * from DBA_OBJECTS;
    alter table scott parallel 32;
    

    Add an entry in your tnsnames.ora and create a dblink
    You can remove TNS_ADMIN environment database with “srvctl unsetenv database -d database -t TNS_ADMIN” command. And restart.

  3. On the server, on the OS level, if ORACLE_HOME is set and TNS_ADMIN is not set then $ORACLE_HOME/network/admin is used to locate tnsname.ora
  4. So it can be interesting to use symlink (ln -s) in $ORACLE_HOME/network/admin to point and use the same tnsnames.ora file when using sqlplus from the server.

    [oracle@rac1 admin]$ pwd
    
    /u01/app/oracle/product/12.1.0.2/db_1/network/admin
    
    [oracle@rac1 admin]$ ll
    
    total 0
    lrwxrwxrwx 1 oracle oinstall 36 Jan 10 09:53 listener.ora -> /u01/app/MyNetworkAdmin/listener.ora
    lrwxrwxrwx 1 oracle oinstall 34 Jan 10 09:53 sqlnet.ora -> /u01/app/MyNetworkAdmin/sqlnet.ora
    lrwxrwxrwx 1 oracle oinstall 36 Jan 10 09:53 tnsnames.ora -> /u01/app/MyNetworkAdmin/tnsnames.ora
    

But remember, although the TNS_ADMIN environment is set on the OS level when starting the instance, you need to set the TNS_ADMIN with srvctl or crsctl before !!
If not, you may encounter ORA-12154 errors.
And not only for PDML !!