Tuesday 21 April 2020

RAC [INS-06003] Failed to setup passwordless SSH connectivity with the following node(s)



  • Scenario Preview: We recently hit this SSH error during installation. In my experience it occurs only on openSUSE 12 servers.

  • OpenSSH_6.7: ERROR [INS-06003] Failed to setup passwordless SSH connectivity During Grid Infrastructure Install (Doc ID 2111092.1)

  • Fix (Doc ID 2111092.1): add the legacy key-exchange algorithms to /etc/ssh/sshd_config on both nodes.

  • srv1:/u01/grid/sshsetup # vi /etc/ssh/sshd_config (both nodes)

KexAlgorithms curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1,diffie-hellman-group-exchange-sha1,diffie-hellman-group1-sha1
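
To confirm the new algorithms are in effect, one quick sanity check (not part of the MOS note; srv2 here stands in for the second node) is to list the key-exchange algorithms the local ssh build supports and then force a legacy one against the peer:

srv1:/u01/grid/sshsetup # ssh -Q kex                                                # list supported key-exchange algorithms
srv1:/u01/grid/sshsetup # ssh -o KexAlgorithms=diffie-hellman-group1-sha1 srv2 date # should log in and print the date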


  • srv1:/u01/grid/sshsetup # service sshd restart (both nodes)
  • srv1:/u01/grid/sshsetup # service sshd status

sshd.service - OpenSSH Daemon
   Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2020-04-21 11:28:18 IST; 20s ago
  Process: 6035 ExecStartPre=/usr/sbin/sshd -t $SSHD_OPTS (code=exited, status=0/SUCCESS)
  Process: 6032 ExecStartPre=/usr/sbin/sshd-gen-keys-start (code=exited, status=0/SUCCESS)
 Main PID: 6039 (sshd)
    Tasks: 1
   CGroup: /system.slice/sshd.service
           └─6039 /usr/sbin/sshd -D

Apr 21 11:28:18 srv1 systemd[1]: Stopped OpenSSH Daemon.
Apr 21 11:28:18 srv1 systemd[1]: Starting OpenSSH Daemon...
Apr 21 11:28:18 srv1 sshd-gen-keys-start[6032]: Checking for missing server keys in /etc/ssh
Apr 21 11:28:18 srv1 sshd[6039]: Server listening on 0.0.0.0 port 22.
Apr 21 11:28:18 srv1 sshd[6039]: Server listening on :: port 22.
Apr 21 11:28:18 srv1 systemd[1]: Started OpenSSH Daemon.
srv1:/u01/grid/sshsetup #

Wednesday 15 April 2020

RAC Patch Failed Scenario: CRS-6706: Oracle Clusterware Release patch level ('966527961') does not match Software patch level ('2919480821'). Oracle Clusterware cannot be started.


Scenario Preview: We recently got an error on a two-node 12c Grid Infrastructure TEST cluster while trying to create an ACFS volume.


  • Error: ORA-15032: not all alterations performed || ORA-15477: cannot communicate with the volume driver


  • While applying the patch we got ACFS-9459: ADVM/ACFS is not supported on this OS version: '4.12.14-120-default SP5'. After searching the traces, we found this is a known bug, per Doc ID 2205623.1:
  • ACFS-9459: ADVM/ACFS is not supported on this OS version: '3.10.0-514.el7.x86_64' (Doc ID 2205623.1)


  • We found this error is due to bug 25078431, so we applied the corresponding patch forcefully (two quick checks relevant to these errors are sketched after this list).


  • The patch applied successfully, but Clusterware did not start. We had taken tar-level backups of the grid home, database home, and oraInventory, so we extracted the grid and database binaries from the backups, relinked them (Doc ID 1536057.1), and started the services; the cluster services came up.
  • Node 1 remained active the whole time: we patched only node 2, so services kept running on node 1.
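
Two quick checks relevant to the errors above (not from the original runbook; these tools ship in the grid home, and on this 12.1.0.2 home they generally work even while the stack is down):

suse2:~ # /u01/app/12.1.0/grid/bin/acfsdriverstate supported      # does ADVM/ACFS support the running kernel? ACFS-9459 means it does not
suse2:~ # /u01/app/12.1.0/grid/bin/crsctl query crs releasepatch  # Clusterware release patch level
suse2:~ # /u01/app/12.1.0/grid/bin/crsctl query crs softwarepatch # local software patch level; a mismatch between the two is what CRS-6706 reports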


Please find the example scenario below:

suse2:/u01 # export ORACLE_HOME=/u01/app/12.1.0/grid
suse2:/u01 # export PATH=$ORACLE_HOME/OPatch:$PATH:$ORACLE_HOME/bin
suse2:/u01 # export PATH=$ORACLE_HOME/perl/bin:$PATH

suse2:/u01/25078431 # opatchauto apply -analyze

OPatchauto session is initiated at Fri Apr 10 16:36:02 2020

System initialization log file is /u01/app/12.1.0/grid/cfgtoollogs/opatchautodb/systemconfig2020-04-10_04-36-09PM.log.


Session log file is /u01/app/12.1.0/grid/cfgtoollogs/opatchauto/opatchauto2020-04-10_04-42-21PM.log
The id for this session is 8Q6I

Executing OPatch prereq operations to verify patch applicability on home /u01/app/12.1.0/grid

Executing OPatch prereq operations to verify patch applicability on home /u01/app/oracle/product/12.1.0/dbhome_1
Patch applicability verified successfully on home /u01/app/oracle/product/12.1.0/dbhome_1

Patch applicability verified successfully on home /u01/app/12.1.0/grid


Verifying SQL patch applicability on home /u01/app/oracle/product/12.1.0/dbhome_1
No step execution required.........

OPatchAuto successful.

--------------------------------Summary--------------------------------

Analysis for applying patches has completed successfully:

Host:suse2
RAC Home:/u01/app/oracle/product/12.1.0/dbhome_1
Version:12.1.0.2.0


==Following patches were SKIPPED:

Patch: /u01/25078431/25078431
Reason: This patch is not applicable to this specified target type - "rac_database"


Host:suse2
CRS Home:/u01/app/12.1.0/grid
Version:12.1.0.2.0


==Following patches were SUCCESSFULLY analyzed to be applied:

Patch: /u01/25078431/25078431
Log: /u01/app/12.1.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-04-10_16-42-36PM_1.log



OPatchauto session completed at Fri Apr 10 16:42:42 2020
Time taken to complete the session 6 minutes, 41 seconds
suse2:/u01/25078431 #


suse2:/u01/25078431/25078431 # su grid
grid@suse2:/u01/25078431/25078431> export ORACLE_HOME=/u01/app/12.1.0/grid
grid@suse2:/u01/25078431/25078431> export PATH=$ORACLE_HOME/OPatch:$PATH:$ORACLE_HOME/bin
grid@suse2:/u01/25078431/25078431> export PATH=$ORACLE_HOME/perl/bin:$PATH
grid@suse2:/u01/25078431/25078431>
grid@suse2:/u01/25078431/25078431> /u01/app/12.1.0/grid/OPatch/opatch apply -oh /u01/app/12.1.0/grid -local /u01/25078431/25078431
Oracle Interim Patch Installer version 12.2.0.1.17
Copyright (c) 2020, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/12.1.0/grid
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/12.1.0/grid/oraInst.loc
OPatch version    : 12.2.0.1.17
OUI version       : 12.1.0.2.0
Log file location : /u01/app/12.1.0/grid/cfgtoollogs/opatch/opatch2020-04-10_16-48-57PM_1.log

Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   25078431

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.

Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/u01/app/12.1.0/grid')


Is the local system ready for patching? [y|n]
y
User Responded with: Y
Backing up files...
Applying interim patch '25078431' to OH '/u01/app/12.1.0/grid'

Patching component oracle.usm, 12.1.0.2.0...
Patch 25078431 successfully applied.
Sub-set patch [24007012] has become inactive due to the application of a super-set patch [25078431].
Please refer to Doc ID 2161861.1 for any possible further required actions.
Log file location: /u01/app/12.1.0/grid/cfgtoollogs/opatch/opatch2020-04-10_16-48-57PM_1.log

OPatch succeeded.
grid@suse2:/u01/25078431/25078431>



suse2:/u01/25078431/25078431 # . oraenv
ORACLE_SID = [+ASM2] ?
The Oracle base remains unchanged with value /u01/app/grid
suse2:/u01/25078431/25078431 # crsctl start crs
CRS-6706: Oracle Clusterware Release patch level ('966527961') does not match Software patch level ('2919480821'). Oracle Clusterware cannot be started.
CRS-4000: Command Start failed, or completed with errors.

suse2:/u01/25078431/25078431 # ps -ef |grep pmon
root      9224  9126  0 16:56 pts/0    00:00:00 grep --color=auto pmon
suse2:/u01/25078431/25078431 # cd /u01/app/grid/
suse2:/u01/app/grid # ll


suse2:/u01/app/12.1.0/grid/crs/install # ./rootcrs.sh -patch
Using configuration parameter file: /u01/app/12.1.0/grid/crs/install/crsconfig_params
2020/04/10 17:21:56 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector.

2020/04/10 17:24:16 CLSRSC-4003: Successfully patched Oracle Trace File Analyzer (TFA) Collector.

suse2:/u01/app/12.1.0/grid/crs/install # ./roothas.pl -postpatch
Using configuration parameter file: ./crsconfig_params
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.

suse2:/u01/app/12.1.0/grid/crs/install # ps -ef |grep d.bin
root       632     1  0 21:16 ?        00:00:00 /usr/lib/wicked/bin/wickedd-auto4 --systemd --foreground
root       634     1  0 21:16 ?        00:00:00 /usr/lib/wicked/bin/wickedd-dhcp6 --systemd --foreground
root       636     1  0 21:16 ?        00:00:00 /usr/lib/wicked/bin/wickedd-dhcp4 --systemd --foreground
root      1754     1  1 21:16 ?        00:00:09 /u01/app/12.1.0/grid/bin/ohasd.bin reboot
root      1905     1  0 21:17 ?        00:00:03 /u01/app/12.1.0/grid/bin/orarootagent.bin
grid      1972     1  0 21:17 ?        00:00:01 /u01/app/12.1.0/grid/bin/oraagent.bin
grid      1984     1  0 21:17 ?        00:00:01 /u01/app/12.1.0/grid/bin/mdnsd.bin
grid      1987     1  0 21:17 ?        00:00:02 /u01/app/12.1.0/grid/bin/evmd.bin
grid      2001     1  0 21:17 ?        00:00:01 /u01/app/12.1.0/grid/bin/gpnpd.bin
grid      2025     1  0 21:17 ?        00:00:02 /u01/app/12.1.0/grid/bin/gipcd.bin
grid      2042  1987  0 21:17 ?        00:00:00 /u01/app/12.1.0/grid/bin/evmlogger.bin -o /u01/app/12.1.0/grid/log/[HOSTNAME]/evmd/evmlogger.info -l /u01/app/12.1.0/grid/log/[HOSTNAME]/evmd/evmlogger.log
root      2162     1  0 21:17 ?        00:00:01 /u01/app/12.1.0/grid/bin/cssdmonitor
root      2180     1  0 21:17 ?        00:00:01 /u01/app/12.1.0/grid/bin/cssdagent
grid      2193     1  0 21:17 ?        00:00:03 /u01/app/12.1.0/grid/bin/ocssd.bin
root      2272     1  0 21:17 ?        00:00:03 /u01/app/12.1.0/grid/bin/octssd.bin reboot
root      6734  5388  0 21:28 pts/0    00:00:00 grep --color=auto d.bin

suse2:/u01/app/12.1.0/grid/crs/install # crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager


suse2:/u01/app/12.1.0/grid/crs/install # crsctl stop crs

NOTE: In the CRS alert log we found that the ctssd/ocssd time differed between the two nodes, so we corrected the date and time and restarted the services, but that was not the solution. Further troubleshooting, and a reboot, still left us unable to start the Clusterware services on node 2. (A quick cross-node clock check is sketched below.)
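
For reference, a quick way to compare clocks across the nodes (hostnames per this environment; cluvfy ships in the grid home and is normally run as the grid owner):

suse2:~ # date; ssh suse1 date                    # eyeball the offset between the two nodes
suse2:~ # /u01/app/12.1.0/grid/bin/cluvfy comp clocksync -n all -verbose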

suse2:/u01/app/12.1.0/grid/crs/install # ./roothas.pl -postpatch
Using configuration parameter file: ./crsconfig_params
CLSU-00105: operating system interface has reported an internal failure
CLSU-00103: error location: canexec2
CLSU-00104: additional error information: no exe permission, file [/u01/app/12.1.0/grid/bin/ohasd]



CRS alert.log:


2020-04-10 18:06:26.735 [OCSSD(2193)]CRS-1652: Starting clean up of CRSD resources.
2020-04-10 18:06:26.746 [OCSSD(2193)]CRS-1653: The clean up of the CRSD resources failed.
Fri Apr 10 18:06:26 2020
Errors in file /u01/app/grid/diag/crs/suse2/crs/trace/ocssd.trc  (incident=9):
CRS-8503 [] [] [] [] [] [] [] [] [] [] [] []
Incident details in: /u01/app/grid/diag/crs/suse2/crs/incident/incdir_9/ocssd_i9.trc

2020-04-10 18:06:27.014 [OCSSD(2193)]CRS-8503: Oracle Clusterware OCSSD process with operating system process ID 2193 experienced fatal signal or exception code 6
Sweep [inc][9]: completed
2020-04-10 18:06:27.387 [ORAROOTAGENT(1905)]CRS-5013: Agent "ORAROOTAGENT" failed to start process "/u01/app/12.1.0/grid/bin/octssd" for action "start": details at "(:CLSN00008:)" in "/u01/app/grid/diag/crs/suse2/crs/trace/ohasd_orarootagent_root.trc"
2020-04-10 18:06:27.400 [OHASD(1754)]CRS-2878: Failed to restart resource 'ora.ctssd'
2020-04-14 22:26:52.525 [CLSECHO(3762)]CRS-10001: 14-Apr-20 22:26 ACFS-9459: ADVM/ACFS is not supported on this OS version: '4.12.14-120-default SP5'
2020-04-14 22:26:52.617 [CLSECHO(3765)]CRS-10001: 14-Apr-20 22:26 ACFS-9201: Not Supported
2020-04-14 22:26:53.120 [CLSCFG(3767)]CRS-1810: Node-specific configuration for node suse2 in Oracle Local Registry was patched to patch level 966527961.
2020-04-14 22:34:14.217 [CLSECHO(8009)]CRS-10001: 14-Apr-20 22:34 ACFS-9459: ADVM/ACFS is not supported on this OS version: '4.12.14-120-default SP5'
2020-04-14 22:34:14.280 [CLSECHO(8011)]CRS-10001: 14-Apr-20 22:34 ACFS-9201: Not Supported
2020-04-14 22:34:14.354 [CLSCFG(8013)]CRS-1810: Node-specific configuration for node suse2 in Oracle Local Registry was patched to patch level 966527961.


2020-04-10 18:24:16.122 [CLSECHO(9279)]CRS-10001: 10-Apr-20 18:24 ACFS-9459: ADVM/ACFS is not supported on this OS version: '4.12.14-120-default SP5'
2020-04-10 18:24:16.133 [CLSECHO(9281)]CRS-10001: 10-Apr-20 18:24 ACFS-9201: Not Supported
2020-04-10 18:24:16.236 [CLSCFG(9283)]CRS-1810: Node-specific configuration for node suse2 in Oracle Local Registry was patched to patch level 966527961.

2020-04-10 19:35:48.037 [OHASD(25258)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 25258
2020-04-10 19:35:48.037 [OHASD(25258)]CRS-0715: Oracle High Availability Service has timed out waiting for init.ohasd to be started.
2020-04-10 19:39:10.788 [OHASD(26174)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 26174
2020-04-10 19:39:10.788 [OHASD(26174)]CRS-0715: Oracle High Availability Service has timed out waiting for init.ohasd to be started.
2020-04-10 19:41:18.982 [OHASD(26687)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 26687
2020-04-10 19:41:18.982 [OHASD(26687)]CRS-0715: Oracle High Availability Service has timed out waiting for init.ohasd to be started.


(We tried to roll back the patch, but with no luck.)

suse2:/u01/25078431/25078431 # . oraenv
ORACLE_SID = [+ASM2] ?
The Oracle base remains unchanged with value /u01/app/grid
suse2:/u01/25078431/25078431 # export ORACLE_HOME=/u01/app/12.1.0/grid
suse2:/u01/25078431/25078431 # export PATH=$ORACLE_HOME/OPatch:$PATH:$ORACLE_HOME/bin
suse2:/u01/25078431/25078431 # export PATH=$ORACLE_HOME/perl/bin:$PATH
suse2:/u01/25078431/25078431 # opatchauto rollback /u01/25078431/25078431 -oh /u01/app/12.1.0/grid

OPatchauto session is initiated at Mon Apr 20 13:36:29 2020

System initialization log file is /u01/app/12.1.0/grid/cfgtoollogs/opatchautodb/systemconfig2020-04-20_01-36-35PM.log.

Session log file is /u01/app/12.1.0/grid/cfgtoollogs/opatchauto/opatchauto2020-04-20_01-39-31PM.log
The id for this session is VZM8
OPATCHAUTO-72132: Grid is not running on the local host.
OPATCHAUTO-72132: Cannot start a new apply or rollback session when the local grid is not running.
OPATCHAUTO-72132: Please start grid service on the local host to start patching.
OPatchAuto failed.

OPatchauto session completed at Mon Apr 20 13:39:41 2020
Time taken to complete the session 3 minutes, 12 seconds

 opatchauto failed with error code 42
suse2:/u01/25078431/25078431 # cd /u01/app/12.1.0/grid/
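
For reference, opatchauto refuses to start an apply or rollback session while the local grid stack is down; a local rollback with plain opatch can sometimes be attempted instead (it would not have helped here, since the home's binaries themselves were damaged). A minimal sketch, assuming the same environment as above:

grid@suse2:~> export ORACLE_HOME=/u01/app/12.1.0/grid
grid@suse2:~> $ORACLE_HOME/OPatch/opatch rollback -id 25078431 -local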




########################    Solution   ##############


Extract the node 2 grid and database binaries from the tar backups:


  • tar zvxf grid.tar.gz


  • tar zvxf 12.1.0.tar.gz


  • Perform the relink (Doc ID 1536057.1); a minimal sketch follows this list
  • Reboot
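
A minimal sketch of the relink sequence per Doc ID 1536057.1, assuming the 12.1 homes used in this environment (check the note for the exact steps for your version before running):

# As root: unlock the grid home so its binaries can be relinked
/u01/app/12.1.0/grid/crs/install/rootcrs.sh -unlock

# As the grid owner: relink all grid binaries
export ORACLE_HOME=/u01/app/12.1.0/grid
$ORACLE_HOME/bin/relink all

# As root: restore the root-owned pieces and lock the home again
/u01/app/12.1.0/grid/rdbms/install/rootadd_rdbms.sh
/u01/app/12.1.0/grid/crs/install/rootcrs.sh -lock

# As oracle: relink the database home the same way
export ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1
$ORACLE_HOME/bin/relink all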

################# STATUS AFTER Reboot ################

suse2:~ # date -s "13 APR 2020 20:23:10"
Mon Apr 13 20:23:10 IST 2020
suse2:~ # ps -ef |grep pmon
grid      2615     1  0 19:59 ?        00:00:00 asm_pmon_+ASM2
oracle    3252     1  0 20:00 ?        00:00:00 ora_pmon_rac2
root     11213  9425  0 20:23 pts/0    00:00:00 grep --color=auto pmon
suse2:~ # crsctl status resource -t
If 'crsctl' is not a typo you can use command-not-found to lookup the package that contains it, like this:
    cnf crsctl
suse2:~ # . oraenv
ORACLE_SID = [root] ? +ASM2
The Oracle base has been set to /u01/app/grid
suse2:~ # crsctl status resource -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ACFS.dg
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
ora.CRS.dg
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
ora.DATA.dg
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
ora.FRA.dg
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
ora.asm
               ONLINE  ONLINE       suse1                    Started,STABLE
               ONLINE  ONLINE       suse2                    Started,STABLE
ora.net1.network
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
ora.ons
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       suse2                    STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       suse1                    STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       suse1                    STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       suse1                    169.254.74.130 192.1
                                                             68.10.5,STABLE
ora.cvu
      1        ONLINE  ONLINE       suse1                    STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       suse1                    Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       suse1                    STABLE
ora.rac.db
      1        ONLINE  ONLINE       suse1                    Open,STABLE
      2        ONLINE  ONLINE       suse2                    Open,STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       suse2                    STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       suse1                    STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       suse1                    STABLE
ora.suse1.vip
      1        ONLINE  ONLINE       suse1                    STABLE
ora.suse2.vip
      1        ONLINE  ONLINE       suse2                    STABLE
--------------------------------------------------------------------------------
suse2:~ # crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'suse2'
CRS-2673: Attempting to stop 'ora.crsd' on 'suse2'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'suse2'
CRS-2673: Attempting to stop 'ora.CRS.dg' on 'suse2'
CRS-2673: Attempting to stop 'ora.rac.db' on 'suse2'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'suse2'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'suse2'
CRS-2677: Stop of 'ora.CRS.dg' on 'suse2' succeeded
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'suse2' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'suse2'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'suse2' succeeded
CRS-2673: Attempting to stop 'ora.suse2.vip' on 'suse2'
CRS-2677: Stop of 'ora.scan1.vip' on 'suse2' succeeded
CRS-2672: Attempting to start 'ora.scan1.vip' on 'suse1'
CRS-2677: Stop of 'ora.rac.db' on 'suse2' succeeded
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'suse2'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'suse2'
CRS-2677: Stop of 'ora.suse2.vip' on 'suse2' succeeded
CRS-2672: Attempting to start 'ora.suse2.vip' on 'suse1'
CRS-2677: Stop of 'ora.DATA.dg' on 'suse2' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'suse2' succeeded
CRS-2673: Attempting to stop 'ora.ACFS.dg' on 'suse2'
CRS-2677: Stop of 'ora.ACFS.dg' on 'suse2' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'suse2'
CRS-2677: Stop of 'ora.asm' on 'suse2' succeeded
CRS-2676: Start of 'ora.scan1.vip' on 'suse1' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'suse1'
CRS-2676: Start of 'ora.suse2.vip' on 'suse1' succeeded
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'suse1' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'suse2'
CRS-5014: Agent "ORAAGENT" timed out starting process "/u01/app/12.1.0/grid/opmn/bin/onsctli" for action "stop": details at "(:CLSN00009:)" in "/u01/app/grid/diag/crs/suse2/crs/trace/crsd_oraagent_grid.trc"
CRS-2675: Stop of 'ora.ons' on 'suse2' failed
CRS-2679: Attempting to clean 'ora.ons' on 'suse2'
CRS-2681: Clean of 'ora.ons' on 'suse2' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'suse2'
CRS-2677: Stop of 'ora.net1.network' on 'suse2' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'suse2' has completed
CRS-2677: Stop of 'ora.crsd' on 'suse2' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'suse2'
CRS-2673: Attempting to stop 'ora.evmd' on 'suse2'
CRS-2673: Attempting to stop 'ora.storage' on 'suse2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'suse2'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'suse2'
CRS-2677: Stop of 'ora.storage' on 'suse2' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'suse2'
CRS-2677: Stop of 'ora.ctssd' on 'suse2' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'suse2' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'suse2' succeeded
CRS-2677: Stop of 'ora.evmd' on 'suse2' succeeded
CRS-2677: Stop of 'ora.asm' on 'suse2' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'suse2'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'suse2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'suse2'
CRS-2677: Stop of 'ora.cssd' on 'suse2' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'suse2'
CRS-2677: Stop of 'ora.crf' on 'suse2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'suse2'
CRS-2677: Stop of 'ora.gipcd' on 'suse2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'suse2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
suse2:~ #
suse2:~ #

suse2:~ # crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

######################################################################


suse2:/u01/app/12.1.0 # crsctl status resource -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ACFS.dg
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
ora.CRS.dg
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
ora.DATA.dg
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
ora.FRA.dg
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
ora.asm
               ONLINE  ONLINE       suse1                    Started,STABLE
               ONLINE  ONLINE       suse2                    Started,STABLE
ora.net1.network
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
ora.ons
               ONLINE  ONLINE       suse1                    STABLE
               ONLINE  ONLINE       suse2                    STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       suse2                    STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       suse1                    STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       suse1                    STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       suse1                    169.254.74.130 192.1
                                                             68.10.5,STABLE
ora.cvu
      1        ONLINE  ONLINE       suse1                    STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       suse1                    Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       suse1                    STABLE
ora.rac.db
      1        ONLINE  ONLINE       suse1                    Open,STABLE
      2        ONLINE  ONLINE       suse2                    Open,STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       suse2                    STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       suse1                    STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       suse1                    STABLE
ora.suse1.vip
      1        ONLINE  ONLINE       suse1                    STABLE
ora.suse2.vip
      1        ONLINE  ONLINE       suse2                    STABLE
--------------------------------------------------------------------------------
suse2:/u01/app/12.1.0 #


##################################################################

suse2:~ # su oracle
oracle@suse2:/root> cd
oracle@suse2:~> . oraenv
ORACLE_SID = [oracle] ? rac
The Oracle base has been set to /u01/app/oracle

oracle@suse2:~> sqlplus sys/xxx@rac as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Mon Apr 13 20:25:36 2020

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> !date
Mon Apr 13 20:25:54 IST 2020

SQL> select name,open_mode,database_Role from v$database;

NAME      OPEN_MODE            DATABASE_ROLE
--------- -------------------- ----------------
RAC       READ WRITE           PRIMARY



SQL> select INSTANCE_NUMBER,INSTANCE_NAME,HOST_NAME,STATUS from v$instance;

INSTANCE_NUMBER INSTANCE_NAME
--------------- ----------------
HOST_NAME                                                        STATUS
---------------------------------------------------------------- ------------
              2 rac2
suse2                                                            OPEN


SQL>





RAC - CRS-2403: The Cluster Time Synchronization Service on host 02 is in observer mode.

Scenario: We have a two-node 12c (12.1.0.2.0) Grid Infrastructure cluster, with time synchronization managed by the Clusterware CTSS (ctssd) service.

At some point our Linux team configured NTP time synchronization on the two nodes against an NTP server, without any downtime. Because of this, CTSS moved to observer mode, switching over to clock synchronization checks using NTP, again with no downtime. The motive of this post is just to share the error details.


CRS alert log below:
/u01/app/grid/diag/crs/02/crs/trace/alert.log


2020-04-13 12:58:49.279 [OCTSSD(4195)]CRS-2403: The Cluster Time Synchronization Service on host 02 is in observer mode.
2020-04-13 13:00:09.296 [OCTSSD(4195)]CRS-2412: The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /u01/app/grid/diag/crs/02/crs/trace/octssd.trc.
2020-04-13 13:30:10.200 [OCTSSD(4195)]CRS-2412: The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /u01/app/grid/diag/crs/02/crs/trace/octssd.trc.

2020-04-13 12:58:47.221 [OCTSSD(1477)]CRS-2403: The Cluster Time Synchronization Service on host 01 is in observer mode.
PRVF-5507 : NTP daemon or service is not running on any node but NTP configuration file exists on the following node(s):
02,01
PRVF-5415 : Check to see if NTP daemon or service is running failed

grid@01:~> . oraenv
ORACLE_SID = [+ASM2] ?
The Oracle base remains unchanged with value /u01/app/grid
grid@01:~> crsctl check ctss
CRS-4700: The Cluster Time Synchronization Service is in Observer mode.

grid@01:~> ssh sgdcplm02
Last login: Tue Apr 14 09:14:25 2020 from 10.4.6.138
grid@02:~> crsctl check ctss
CRS-4700: The Cluster Time Synchronization Service is in Observer mode.
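
For reference (not something we did here): to bring CTSS back to active mode, the usual approach is to stop NTP and remove its configuration on every node; CTSS treats the mere presence of /etc/ntp.conf as vendor time-sync software and will switch back on its own. A minimal sketch, assuming a systemd-managed ntpd as on these hosts:

# On every node, as root:
systemctl stop ntpd
systemctl disable ntpd
mv /etc/ntp.conf /etc/ntp.conf.bak    # CTSS checks for this file, not just the daemon

# CTSS should move from observer to active mode on its own; verify with:
crsctl check ctss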


##############################################################################

grid@01:/u01/app/12.1.0/grid/bin> cluvfy comp clocksync -n all -verbose

Verifying Clock Synchronization across the cluster nodes

Checking if Clusterware is installed on all nodes...
Oracle Clusterware is installed on all nodes.

Checking if CTSS Resource is running on all nodes...
Check: CTSS Resource running on all nodes
  Node Name                             Status
  ------------------------------------  ------------------------
  01                             passed
  02                             passed
CTSS resource check passed

Querying CTSS for time offset on all nodes...
Query of CTSS for time offset passed

Check CTSS state started...
Check: CTSS state
  Node Name                             State
  ------------------------------------  ------------------------
  02                             Observer
  01                             Observer
CTSS is in Observer state. Switching over to clock synchronization checks using NTP


Starting Clock synchronization checks using Network Time Protocol(NTP)...

Checking existence of NTP configuration file "/etc/ntp.conf" across nodes
  Node Name                             File exists?
  ------------------------------------  ------------------------
  02                             yes
  01                             yes
The NTP configuration file "/etc/ntp.conf" is available on all nodes
NTP configuration file "/etc/ntp.conf" existence check passed

Checking daemon liveness...

Check: Liveness for "ntpd"
  Node Name                             Running?
  ------------------------------------  ------------------------
  02                             no
  01                             no
PRVF-7590 : "ntpd" is not running on node "02"
PRVF-7590 : "ntpd" is not running on node "01"
PRVG-1024 : The NTP Daemon or Service was not running on any of the cluster nodes.
PRVF-5415 : Check to see if NTP daemon or service is running failed
Result: Clock synchronization check using Network Time Protocol(NTP) failed


PRVF-9652 : Cluster Time Synchronization Services check failed

Verification of Clock Synchronization across the cluster nodes was unsuccessful on all the specified nodes.
grid@01:/u01/app/12.1.0/grid/bin> cat /etc/ntp.conf

############################################################################

octssd.trc after the change to NTP:

grid@01:/u01/app/grid/diag/crs/01/crs/trace> tail -1000f octssd.trc

2020-04-14 10:27:54.403648 :    CTSS:3791980288: sclsctss_gvss3: NTP active, forcing observer mode
2020-04-14 10:27:54.403654 :    CTSS:3791980288: ctss_check_vendor_sw: Vendor time sync software is detected. status [2].
2020-04-14 10:28:06.943670 :    CTSS:3819742976: ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xee], offset[0 ms]}, length=[8].
2020-04-14 10:28:09.554924 :GIPCHTHR:3817641728:  gipchaWorkerWork: workerThread heart beat, time interval since last heartBeat 31000loopCount 39
2020-04-14 10:28:19.557075 :GIPCHTHR:3815540480:  gipchaDaemonWork: DaemonThread heart beat, time interval since last heartBeat 31010loopCount 39
2020-04-14 10:28:24.406977 :    CTSS:3791980288: sclsctss_ivsr1: default config file found


old_octssd.trc (before the NTP change):

2020-04-12 00:39:02.478503 :    CTSS:3791980288: sclsctss_ivsr2: default pid file not found
2020-04-12 00:39:02.478517 :    CTSS:3791980288: sclsctss_ivsr2: default pid file not found
2020-04-12 00:39:02.478521 :    CTSS:3791980288: ctss_check_vendor_sw: Vendor time sync software is not detected. status [1].
2020-04-12 00:39:07.378919 :    CTSS:3819742976: ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xcc], offset[0 ms]}, length=[8].
2020-04-12 00:39:19.984439 :GIPCHTHR:3817641728:  gipchaWorkerWork: workerThread heart beat, time interval since last heartBeat 31010loopCount 39
2020-04-12 00:39:31.991080 :GIPCHTHR:3815540480:  gipchaDaemonWork: DaemonThread heart beat, time interval since last heartBeat 30010loopCount 37
2020-04-12 00:39:32.481366 :    CTSS:3791980288: sclsctss_ivsr2: default pid file not found
2020-04-12 00:39:32.481379 :    CTSS:3791980288: sclsctss_ivsr2: default pid file not found
2020-04-12 00:39:32.481382 :    CTSS:3791980288: ctss_check_vendor_sw: Vendor time sync software is not detected. status [1].
2020-04-12 00:39:37.383441 :    CTSS:3819742976: ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xcc], offset[0 ms]}, length=[8].
2020-04-12 00:39:50.988957 :GIPCHTHR:3817641728:  gipchaWorkerWork: workerThread heart beat, time interval since last heartBeat 31000loopCount 39
2020-04-12 00:40:01.992865 :GIPCHTHR:3815540480:  gipchaDaemonWork: DaemonThread heart beat, time interval since last heartBeat 30010loopCount 37
2020-04-12 00:40:02.485423 :    CTSS:3791980288: sclsctss_ivsr2: default pid file not found
2020-04-12 00:40:02.485450 :    CTSS:3791980288: sclsctss_ivsr2: default pid file not found


grid@02:~>crsctl status resource -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.BACKUP.dg
               ONLINE  ONLINE       01                STABLE
               ONLINE  ONLINE       02                STABLE
ora.CRS.dg
               ONLINE  ONLINE       01                STABLE
               ONLINE  ONLINE       02                STABLE
ora.DATA.dg
               ONLINE  ONLINE       01                STABLE
               ONLINE  ONLINE       02                STABLE
ora.FRA1.dg
               ONLINE  ONLINE       01                STABLE
               ONLINE  ONLINE       02                STABLE
ora.FRA2.dg
               ONLINE  ONLINE       01                STABLE
               ONLINE  ONLINE       02                STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       01                STABLE
               ONLINE  ONLINE       02                STABLE
ora.asm
               ONLINE  ONLINE       01                Started,STABLE
               ONLINE  ONLINE       02                Started,STABLE
ora.net1.network
               ONLINE  ONLINE       01                STABLE
               ONLINE  ONLINE       02                STABLE
ora.ons
               ONLINE  ONLINE       01                STABLE
               ONLINE  ONLINE       02                STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       02                STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       01                STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       01                STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       01                169.254.238.75 192.1
                                                             68.0.63,STABLE
ora.cvu
      1        ONLINE  ONLINE       01                STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       01                Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       01                STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       02                STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       01                STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       01                STABLE
ora.01.vip
      1        ONLINE  ONLINE       01                STABLE
ora.02.vip
      1        ONLINE  ONLINE       02                STABLE
ora.rac.acrac.svc
      1        ONLINE  ONLINE       01                STABLE
ora.rac.db
      1        ONLINE  ONLINE       02                Open,STABLE
      2        ONLINE  ONLINE       01                Open,STABLE
ora.rac.pretaf.svc
      1        ONLINE  ONLINE       02                STABLE
ora.rac.pretaf_preconnect.svc
      1        ONLINE  ONLINE       01                STABLE
ora.rac.staf.svc
      1        ONLINE  ONLINE       01                STABLE
--------------------------------------------------------------------------------