Ergem PEKER 's Oracle Blog: Oracle RAC

Wednesday, October 19, 2011

Oracle Restart hands on

Starting with Oracle Database 11g Release 2, a new feature called Oracle Restart comes as part of the Grid Infrastructure installation. It seems Oracle decided to use RAC-style commands such as crsctl, crs_stat, and srvctl for managing the processes of single instance databases as well. This standardization is handy for me, as I am already used to managing RAC databases day to day.

After upgrading one of the development databases in our data center, along with its ASM instance, from 10.2.0.4 to 11.2.0.2, I decided to spend some time playing with this new functionality.

As in RAC installations, the status of the resources can be checked with the crs_stat -t command. Understandably, there are no vip, ons, or gsd resources here, as this is not a RAC database.

[oracle@rhel6]:/oracle > crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora...._ASM.dg ora....up.type ONLINE    OFFLINE               
ora....ER.lsnr ora....er.type OFFLINE   OFFLINE               
ora.asm        ora.asm.type   OFFLINE   OFFLINE               
ora.cssd       ora.cssd.type  ONLINE    ONLINE    rhel6   
ora.diskmon    ora....on.type ONLINE    ONLINE    rhel6 

[oracle@rhel6]:/oracle > crs_stat
NAME=ora.DG_DB_ASM.dg
TYPE=ora.diskgroup.type
TARGET=ONLINE
STATE=OFFLINE

NAME=ora.LISTENER.lsnr
TYPE=ora.listener.type
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.asm
TYPE=ora.asm.type
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.cssd
TYPE=ora.cssd.type
TARGET=ONLINE
STATE=ONLINE on rhel6

NAME=ora.diskmon
TYPE=ora.diskmon.type
TARGET=ONLINE
STATE=ONLINE on rhel6
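
Because crs_stat -t truncates long resource names, the long-form output above is easier to process in a script. The following is a minimal sketch, not an Oracle tool: it assumes the NAME=/TYPE=/TARGET=/STATE= layout shown above, with a blank line between resources, and the function names are my own.

```python
def parse_crs_stat(text):
    """Parse long-form `crs_stat` output into a list of dicts, one per resource."""
    resources = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current resource block.
            if current:
                resources.append(current)
                current = {}
            continue
        if "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    if current:
        resources.append(current)
    return resources


def not_at_target(resources):
    """Return the names of resources whose STATE does not match their TARGET."""
    names = []
    for res in resources:
        # STATE reads "ONLINE on rhel6" when running, so compare the first word only.
        state = res.get("STATE", "").split(" ")[0]
        if state != res.get("TARGET"):
            names.append(res["NAME"])
    return names


# Two of the resources from the listing above.
SAMPLE = """\
NAME=ora.DG_DB_ASM.dg
TYPE=ora.diskgroup.type
TARGET=ONLINE
STATE=OFFLINE

NAME=ora.cssd
TYPE=ora.cssd.type
TARGET=ONLINE
STATE=ONLINE on rhel6
"""

if __name__ == "__main__":
    print(not_at_target(parse_crs_stat(SAMPLE)))  # ['ora.DG_DB_ASM.dg']
```

In practice the text would come from running crs_stat and capturing its standard output.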

The HAS stack (the single instance counterpart of CRS) is again controlled by crsctl, as in RAC installations. You can use the check, start, and stop options to manage it as usual. A small note: the CRS processes of a RAC installation are not installed for single instance installations. Instead there is HAS, which stands for "High Availability Services" and covers the cssd and diskmon processes.

[oracle@rhel6]:/oracle > crsctl check has
CRS-4638: Oracle High Availability Services is online
[oracle@rhel6]:/oracle > crsctl check css
CRS-4529: Cluster Synchronization Services is online
[oracle@rhel6]:/oracle > crsctl check resource ora.cssd

[oracle@rhel6]:/oracle > crsctl stop has
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rhel6'
CRS-2673: Attempting to stop 'ora.cssd' on 'rhel6'
CRS-2677: Stop of 'ora.cssd' on 'rhel6' succeeded
CRS-2673: Attempting to stop 'ora.diskmon' on 'rhel6'
CRS-2677: Stop of 'ora.diskmon' on 'rhel6' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rhel6' has completed
CRS-4133: Oracle High Availability Services has been stopped.

[oracle@rhel6]:/oracle > crs_stat -t
CRS-0184: Cannot communicate with the CRS daemon.

After my upgrade, Oracle Restart was not able to manage the upgraded database. Using srvctl, I added the database resource to the repository so that I could manage the database with the srvctl command line tool. One of the nicest options is "-a": supplying the disk groups the database depends on makes Oracle Restart start ASM and mount those disk groups before starting up the database.

[oracle@rhel6]:/oracle > crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora...._ASM.dg ora....up.type ONLINE    ONLINE    rhel6   
ora....ER.lsnr ora....er.type ONLINE    ONLINE    rhel6   
ora.asm        ora.asm.type   ONLINE    ONLINE    rhel6   
ora.cssd       ora.cssd.type  ONLINE    ONLINE    rhel6   
ora.diskmon    ora....on.type ONLINE    ONLINE    rhel6   

[oracle@rhel6]:/oracle > srvctl add database -h          

Adds a database configuration to be managed by Oracle Restart.

Usage: srvctl add database -d db_unique_name -o oracle_home
  [-m domain_name] 
  [-p spfile] 
  [-r {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY | SNAPSHOT_STANDBY}] 
  [-s start_options] 
  [-t stop_options] 
  [-n db_name] 
  [-y {AUTOMATIC | MANUAL}] 
  [-a "diskgroup_list"]
-d db_unique_name      Unique name for the database
-o oracle_home         ORACLE_HOME path
-m domain              Domain for database. Must be set if database has DB_DOMAIN set.
-p spfile              Server parameter file path
-r role                Role of the database (primary, physical_standby, logical_standby, snapshot_standby)
-s start_options       Startup options for the database. Examples of startup options are open, mount, or nomount.
-t stop_options        Stop options for the database. Examples of shutdown options are normal, transactional, immediate, or abort.
-n db_name             Database name (DB_NAME), if different from the unique name given by the -d option
-y dbpolicy            Management policy for the database (AUTOMATIC or MANUAL)
-a "diskgroup_list"    Comma separated list of disk groups
-h                     Print usage

[oracle@rhel6]:/oracle > srvctl add database -d ORCLT -o /oracle/orahome1
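
The command above registers the database without the "-a" option. As a small sketch of how the same registration could carry the disk group dependency, the helper below builds the argument list using only the -d, -o, and -a flags from the usage text. The helper name is my own, and DG_DB_ASM is taken from the crs_stat listing; whether it is the right disk group for ORCLT is an assumption.

```python
import shlex


def srvctl_add_database(db_unique_name, oracle_home, diskgroups=()):
    """Build the argv for `srvctl add database`, optionally with -a."""
    argv = ["srvctl", "add", "database", "-d", db_unique_name, "-o", oracle_home]
    if diskgroups:
        # -a takes a comma separated list of disk groups.
        argv += ["-a", ",".join(diskgroups)]
    return argv


if __name__ == "__main__":
    argv = srvctl_add_database("ORCLT", "/oracle/orahome1", ["DG_DB_ASM"])
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Returning a list rather than a string keeps the arguments safe to pass to subprocess.run without a shell.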

[oracle@rhel6]:/oracle > crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora...._ASM.dg ora....up.type ONLINE    ONLINE    rhel6   
ora....ER.lsnr ora....er.type ONLINE    ONLINE    rhel6   
ora.asm        ora.asm.type   ONLINE    ONLINE    rhel6   
ora.ORCLT.db   ora....se.type OFFLINE   OFFLINE               
ora.cssd       ora.cssd.type  ONLINE    ONLINE    rhel6   
ora.diskmon    ora....on.type ONLINE    ONLINE    rhel6   

[oracle@rhel6]:/oracle > srvctl start database -d ORCLT
[oracle@rhel6]:/oracle > ps -ef | grep smon
oracle  9109530        1   0 15:46:08      -  0:00 ora_smon_ORCLT
oracle 11075806        1   0 15:43:50      -  0:00 asm_smon_+ASM

[oracle@rhel6]:/oracle > srvctl status database -d ORCLT
Database is running.

[oracle@rhel6]:/oracle > crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora...._ASM.dg ora....up.type ONLINE    ONLINE    rhel6   
ora....ER.lsnr ora....er.type ONLINE    ONLINE    rhel6   
ora.asm        ora.asm.type   ONLINE    ONLINE    rhel6   
ora.ORCLT.db   ora....se.type ONLINE    ONLINE    rhel6   
ora.cssd       ora.cssd.type  ONLINE    ONLINE    rhel6   
ora.diskmon    ora....on.type ONLINE    ONLINE    rhel6   

[oracle@rhel6]:/oracle > srvctl stop database -d ORCLT

[oracle@rhel6]:/oracle > crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora...._ASM.dg ora....up.type ONLINE    ONLINE    rhel6   
ora....ER.lsnr ora....er.type ONLINE    ONLINE    rhel6   
ora.asm        ora.asm.type   OFFLINE   ONLINE    rhel6   
ora.ORCLT.db   ora....se.type OFFLINE   OFFLINE               
ora.cssd       ora.cssd.type  ONLINE    ONLINE    rhel6   
ora.diskmon    ora....on.type ONLINE    ONLINE    rhel6

The second handy feature is the "enable" and "disable" commands of srvctl, which configure whether the related object is started automatically on host restart and restarted on process failure.

[oracle@rhel6]:/oracle > srvctl enable -h 

The SRVCTL enable command enables the named object so that it can run under 
  Oracle Restart for automatic startup, failover, or restart.

Usage: srvctl enable database -d db_unique_name
Usage: srvctl enable service -d db_unique_name -s "service_name_list"
Usage: srvctl enable asm
Usage: srvctl enable listener [-l lsnr_name]
Usage: srvctl enable diskgroup -g dg_name
Usage: srvctl enable ons [-v]
Usage: srvctl enable eons [-v]

Shutting down everything nicely with Oracle Restart.

[oracle@rhel6]:/oracle > srvctl stop database -d ORCLT
[oracle@rhel6]:/oracle > srvctl stop diskgroup -g DG_DB_ASM
[oracle@rhel6]:/oracle > srvctl stop asm
[oracle@rhel6]:/oracle > srvctl stop listener

[oracle@rhel6]:/oracle > crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora...._ASM.dg ora....up.type OFFLINE   OFFLINE               
ora....ER.lsnr ora....er.type OFFLINE   OFFLINE               
ora.asm        ora.asm.type   OFFLINE   OFFLINE               
ora.ORCLT.db   ora....se.type OFFLINE   OFFLINE               
ora.cssd       ora.cssd.type  ONLINE    ONLINE    rhel6   
ora.diskmon    ora....on.type ONLINE    ONLINE    rhel6
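
The shutdown order used above (database, disk group, ASM, listener) can be written down once and reused. This is only a sketch: the function names are hypothetical, and the start order is simply assumed to be the reverse of the stop order shown in the transcript.

```python
def stop_commands(db_unique_name, diskgroup):
    """srvctl stop commands in the order used in the transcript above."""
    return [
        ["srvctl", "stop", "database", "-d", db_unique_name],
        ["srvctl", "stop", "diskgroup", "-g", diskgroup],
        ["srvctl", "stop", "asm"],
        ["srvctl", "stop", "listener"],
    ]


def start_commands(db_unique_name, diskgroup):
    """Assumed start order: the stop order reversed, with 'stop' swapped for 'start'."""
    stops = stop_commands(db_unique_name, diskgroup)
    return [[cmd[0], "start"] + cmd[2:] for cmd in reversed(stops)]


if __name__ == "__main__":
    for cmd in start_commands("ORCLT", "DG_DB_ASM"):
        print(" ".join(cmd))
```

If the database was registered with its disk group dependencies, starting the database alone should bring up ASM and the disk groups, so the explicit sequence is mostly useful for the shutdown side.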

resources:
http://download.oracle.com/docs/cd/E14072_01/server.112/e10595/restart001.htm
$ srvctl -h

Thursday, December 30, 2010

Error in checking condition of instance on node

After rebooting both RAC nodes, srvctl started to complain about the condition of the instance on the second node of my cluster.

[oracle@EPRHEL6 admin]$ srvctl status database -d orcl
Instance ORCL1 is running on node eprhel5
PRKO-2015 : Error in checking condition of instance on node: eprhel6

[oracle@EPRHEL6 admin]$ sqlplus system/password@ORCL2

SQL*Plus: Release 10.2.0.1.0 - Production on Mon Dec 27 00:03:11 2010

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

ERROR:
ORA-12514: TNS:listener does not currently know of service requested in connect
descriptor


Enter user-name:

srvctl also complained when I tried to start the instance on the second node, so I decided to start the instance manually using sqlplus.

[oracle@EPRHEL6 admin]$ sqlplus "/ as sysdba"


SQL*Plus: Release 10.2.0.1.0 - Production on Mon Dec 27 00:03:24 2010

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options

SQL> startup;
Oracle instance started.

Total System Global Area 599785472 bytes
Fixed Size     2022600 bytes
Variable Size   188744504 bytes
Database Buffers  402653184 bytes
Redo Buffers     6365184 bytes
Database mounted.
Database opened.
SQL> alter system register;

System altered.

SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options

[oracle@EPRHEL6 admin]$ sqlplus system/password@ORCL2

SQL*Plus: Release 10.2.0.1.0 - Production on Mon Dec 27 00:04:18 2010

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options

SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options

It seems there is no problem with the instance itself. Sqlplus simply connects to the instance ORCL2. The problem must be in the way srvctl communicates with the instance.

[oracle@EPRHEL6 admin]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....L1.inst application    ONLINE    ONLINE    eprhel5     
ora....L2.inst application    ONLINE    UNKNOWN   eprhel6     
ora.ORCL.db    application    ONLINE    ONLINE    eprhel5     
ora....SM1.asm application    ONLINE    ONLINE    eprhel5     
ora....L5.lsnr application    ONLINE    ONLINE    eprhel5     
ora....el5.gsd application    ONLINE    ONLINE    eprhel5     
ora....el5.ons application    ONLINE    ONLINE    eprhel5     
ora....el5.vip application    ONLINE    ONLINE    eprhel5     
ora....SM2.asm application    ONLINE    ONLINE    eprhel6     
ora....L5.lsnr application    OFFLINE   OFFLINE               
ora....L6.lsnr application    ONLINE    ONLINE    eprhel6     
ora....el6.gsd application    ONLINE    ONLINE    eprhel6     
ora....el6.ons application    ONLINE    ONLINE    eprhel6     
ora....el6.vip application    ONLINE    ONLINE    eprhel6     
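
In a long listing like this one, the UNKNOWN instance is easy to miss. A rough sketch of a filter for the tabular crs_stat -t output follows. It assumes the five whitespace-separated columns shown above, with Host empty for offline resources; the names are my own.

```python
from collections import namedtuple

Resource = namedtuple("Resource", "name type target state host")


def parse_crs_stat_t(text):
    """Parse `crs_stat -t` output into Resource tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the dashed rule (one field), and the header row.
        if len(fields) < 4 or fields[0] == "Name":
            continue
        host = fields[4] if len(fields) > 4 else None
        rows.append(Resource(fields[0], fields[1], fields[2], fields[3], host))
    return rows


def needs_attention(rows):
    """Resources whose State differs from their Target, e.g. ONLINE vs UNKNOWN."""
    return [row for row in rows if row.state != row.target]


# A few rows from the listing above.
SAMPLE = """\
Name           Type           Target    State     Host
------------------------------------------------------------
ora....L1.inst application    ONLINE    ONLINE    eprhel5
ora....L2.inst application    ONLINE    UNKNOWN   eprhel6
ora....L5.lsnr application    OFFLINE   OFFLINE
"""

if __name__ == "__main__":
    for row in needs_attention(parse_crs_stat_t(SAMPLE)):
        print(row.name, row.target, row.state, row.host)
```

The OFFLINE/OFFLINE listener is deliberately not flagged, since its state matches its target; spotting a resource that should not exist at all still needs a human eye.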

[oracle@EPRHEL6 admin]$ srvctl start listener -n EPRHEL6

[oracle@EPRHEL6 admin]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....L1.inst application    ONLINE    ONLINE    eprhel5     
ora....L2.inst application    ONLINE    UNKNOWN   eprhel6     
ora.ORCL.db    application    ONLINE    ONLINE    eprhel5     
ora....SM1.asm application    ONLINE    ONLINE    eprhel5     
ora....L5.lsnr application    ONLINE    ONLINE    eprhel5     
ora....el5.gsd application    ONLINE    ONLINE    eprhel5     
ora....el5.ons application    ONLINE    ONLINE    eprhel5     
ora....el5.vip application    ONLINE    ONLINE    eprhel5     
ora....SM2.asm application    ONLINE    ONLINE    eprhel6     
ora....L5.lsnr application    OFFLINE   OFFLINE               
ora....L6.lsnr application    ONLINE    ONLINE    eprhel6     
ora....el6.gsd application    ONLINE    ONLINE    eprhel6     
ora....el6.ons application    ONLINE    ONLINE    eprhel6     
ora....el6.vip application    ONLINE    ONLINE    eprhel6  

[oracle@EPRHEL6 admin]$ sqlplus system/password@ORCL1

SQL*Plus: Release 10.2.0.1.0 - Production on Mon Dec 27 00:04:35 2010

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options

SQL> show parameter listener;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
local_listener                       string
remote_listener                      string      LISTENERS_ORCL
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options

I think there is a problem with the listener configuration or with the listener resource itself, but everything seems fine except that OFFLINE resource. After searching Google a little, I found a solution that points to the listener configuration. I decided to recreate the listeners with netca. I will first delete the listener named LISTENER from both the ASM and DB homes using netca and then recreate it using only the DB home. Maybe this resolves the problem.

My action plan is to first stop all ASM and DB instances, then manually remove that OFFLINE listener resource, which is very confusing, then remove all listener configuration from the cluster with netca and recreate it using the DB home. Here we go.

[oracle@EPRHEL6 db]$ lsnrctl status

LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 27-DEC-2010 00:22:16

Copyright (c) 1991, 2005, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_EPRHEL6
Version                   TNSLSNR for Linux: Version 10.2.0.1.0 - Production
Start Date                27-DEC-2010 00:02:31
Uptime                    0 days 0 hr. 19 min. 44 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /oracle/product/asm/network/admin/listener.ora
Listener Log File         /oracle/product/asm/network/log/listener_eprhel6.log
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.28.4.226)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.28.4.246)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
Services Summary...
Service "+ASM" has 1 instance(s).
Instance "+ASM2", status BLOCKED, has 1 handler(s) for this service...
Service "+ASM_XPT" has 1 instance(s).
Instance "+ASM2", status BLOCKED, has 1 handler(s) for this service...
Service "ORCL" has 2 instance(s).
Instance "ORCL1", status READY, has 1 handler(s) for this service...
Instance "ORCL2", status READY, has 2 handler(s) for this service...
Service "ORCLXDB" has 2 instance(s).
Instance "ORCL1", status READY, has 1 handler(s) for this service...
Instance "ORCL2", status READY, has 1 handler(s) for this service...
Service "ORCL_XPT" has 2 instance(s).
Instance "ORCL1", status READY, has 1 handler(s) for this service...
Instance "ORCL2", status READY, has 2 handler(s) for this service...
The command completed successfully

[oracle@EPRHEL6 db]$ srvctl stop database -d orcl
[oracle@EPRHEL6 db]$ srvctl stop asm -n EPRHEL5
[oracle@EPRHEL6 db]$ srvctl stop asm -n EPRHEL6
[oracle@EPRHEL6 db]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....L1.inst application    OFFLINE   OFFLINE               
ora....L2.inst application    OFFLINE   OFFLINE               
ora.ORCL.db    application    OFFLINE   OFFLINE               
ora....SM1.asm application    OFFLINE   OFFLINE               
ora....el5.gsd application    ONLINE    ONLINE    eprhel5     
ora....el5.ons application    ONLINE    ONLINE    eprhel5     
ora....el5.vip application    ONLINE    ONLINE    eprhel5     
ora....SM2.asm application    OFFLINE   OFFLINE               
ora....L5.lsnr application    OFFLINE   OFFLINE               
ora....el6.gsd application    ONLINE    ONLINE    eprhel6     
ora....el6.ons application    ONLINE    ONLINE    eprhel6     
ora....el6.vip application    ONLINE    ONLINE    eprhel6     

[oracle@EPRHEL6 db]$ crs_getperm ora.eprhel6.LISTENER_EPRHEL5.lsnr
Name: ora.eprhel6.LISTENER_EPRHEL5.lsnr
owner:oracle:rwx,pgrp:dba:rwx,other::r--,
[oracle@EPRHEL6 db]$ crs_unregister ora.eprhel6.LISTENER_EPRHEL5.lsnr
[oracle@EPRHEL6 db]$ crs_profile -delete ora.eprhel6.LISTENER_EPRHEL5.lsnr
CRS-0170: The resource 'ora.eprhel6.LISTENER_EPRHEL5.lsnr' doesn't exist.

[oracle@EPRHEL6 db]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....L1.inst application    ONLINE    ONLINE    eprhel5     
ora....L2.inst application    ONLINE    ONLINE    eprhel6     
ora.ORCL.db    application    ONLINE    ONLINE    eprhel5     
ora....SM1.asm application    ONLINE    ONLINE    eprhel5     
ora....L5.lsnr application    ONLINE    ONLINE    eprhel5     
ora....el5.gsd application    ONLINE    ONLINE    eprhel5     
ora....el5.ons application    ONLINE    ONLINE    eprhel5     
ora....el5.vip application    ONLINE    ONLINE    eprhel5     
ora....SM2.asm application    ONLINE    ONLINE    eprhel6     
ora....L6.lsnr application    ONLINE    ONLINE    eprhel6     
ora....el6.gsd application    ONLINE    ONLINE    eprhel6     
ora....el6.ons application    ONLINE    ONLINE    eprhel6     
ora....el6.vip application    ONLINE    ONLINE    eprhel6     
[oracle@EPRHEL6 db]$ srvctl status database -d orcl
Instance ORCL1 is running on node eprhel5
Instance ORCL2 is running on node eprhel6
[oracle@EPRHEL6 db]$

It seems the problem is solved.

Tuesday, December 28, 2010

Relocating CRS Resource

I have installed a one-node RAC 10gR2 on RHEL 5.5 for test purposes (my 10gR2 RAC on RHEL5.5 vmware installation notes). After successfully adding the second node to the cluster, I realized that the new node's vip resource was running on the first node. I had seen this problem before on a Solaris system but had not had time to write about it.

[root@EPRHEL6]# crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy

[root@EPRHEL6]# crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....L1.inst application    ONLINE    ONLINE    eprhel5     
ora.ORCL.db    application    ONLINE    ONLINE    eprhel5     
ora....SM1.asm application    ONLINE    ONLINE    eprhel5     
ora....L5.lsnr application    ONLINE    ONLINE    eprhel5     
ora....el5.gsd application    ONLINE    ONLINE    eprhel5     
ora....el5.ons application    ONLINE    ONLINE    eprhel5     
ora....el5.vip application    ONLINE    ONLINE    eprhel5     
ora....el6.gsd application    ONLINE    ONLINE    eprhel6     
ora....el6.ons application    ONLINE    ONLINE    eprhel6     
ora....el6.vip application    ONLINE    ONLINE    eprhel5     

[root@EPRHEL6]# ping eprhel6-vip
PING eprhel6-vip (172.28.4.226) 56(84) bytes of data.
64 bytes from eprhel6-vip (172.28.4.226): icmp_seq=1 ttl=64 time=2.28 ms
64 bytes from eprhel6-vip (172.28.4.226): icmp_seq=2 ttl=64 time=1.03 ms
64 bytes from eprhel6-vip (172.28.4.226): icmp_seq=3 ttl=64 time=0.131 ms

[root@EPRHEL6]# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:0C:29:DE:D8:FD  
inet addr:172.28.4.246  Bcast:172.28.4.255  Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fede:d8fd/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:36798 errors:0 dropped:0 overruns:0 frame:0
TX packets:13478 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000 
RX bytes:21057458 (20.0 MiB)  TX bytes:10660215 (10.1 MiB)

eth1      Link encap:Ethernet  HWaddr 00:0C:29:DE:D8:07  
BROADCAST MULTICAST  MTU:1500  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000 
RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

[root@EPRHEL5]# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:0C:29:B7:92:45  
inet addr:172.28.4.245  Bcast:172.28.4.255  Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:feb7:9245/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:9074707 errors:0 dropped:0 overruns:0 frame:0
TX packets:1212938 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000 
RX bytes:1173926429 (1.0 GiB)  TX bytes:1041963477 (993.6 MiB)

[root@EPRHEL5]# ifconfig eth0:1
eth0:1    Link encap:Ethernet  HWaddr 00:0C:29:B7:92:45  
inet addr:172.28.4.225  Bcast:172.28.4.255  Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

[root@EPRHEL5]# ifconfig eth0:2
eth0:2    Link encap:Ethernet  HWaddr 00:0C:29:B7:92:45  
inet addr:172.28.4.226  Bcast:172.28.4.255  Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

I suppose this is because of the network settings of the newly added node: somehow CRS could not assign the VIP address to the NIC. crs_relocate may work here.

[root@EPRHEL5]# crs_relocate ora.eprhel6.vip
Attempting to stop `ora.eprhel6.vip` on member `eprhel5`
Stop of `ora.eprhel6.vip` on member `eprhel5` succeeded.
Attempting to start `ora.eprhel6.vip` on member `eprhel6`
Start of `ora.eprhel6.vip` on member `eprhel6` succeeded.
[root@EPRHEL5]# crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....L1.inst application    ONLINE    ONLINE    eprhel5     
ora.ORCL.db    application    ONLINE    ONLINE    eprhel5     
ora....SM1.asm application    ONLINE    ONLINE    eprhel5     
ora....L5.lsnr application    ONLINE    ONLINE    eprhel5     
ora....el5.gsd application    ONLINE    ONLINE    eprhel5     
ora....el5.ons application    ONLINE    ONLINE    eprhel5     
ora....el5.vip application    ONLINE    ONLINE    eprhel5     
ora....el6.gsd application    ONLINE    ONLINE    eprhel6     
ora....el6.ons application    ONLINE    ONLINE    eprhel6     
ora....el6.vip application    ONLINE    ONLINE    eprhel6  
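
The check done by eye in these listings (is each node's VIP running on its own node?) can be sketched in a few lines. This assumes full resource names of the form ora.<node>.vip, as used in the crs_relocate command above, rather than the truncated names printed by crs_stat -t; the function name and the input mapping are my own.

```python
import re

# Matches full VIP resource names such as "ora.eprhel6.vip".
VIP_NAME = re.compile(r"^ora\.(?P<node>[^.]+)\.vip$")


def misplaced_vips(placements):
    """Given {resource name: host it runs on}, return VIPs not on their home node."""
    misplaced = {}
    for name, host in placements.items():
        match = VIP_NAME.match(name)
        # Host names are compared case-insensitively (EPRHEL6 vs eprhel6).
        if match and host.lower() != match.group("node").lower():
            misplaced[name] = host
    return misplaced


if __name__ == "__main__":
    before = {
        "ora.eprhel5.vip": "eprhel5",
        "ora.eprhel6.vip": "eprhel5",  # the situation before crs_relocate
        "ora.ORCL.db": "eprhel5",
    }
    print(misplaced_vips(before))  # {'ora.eprhel6.vip': 'eprhel5'}
```

A VIP sitting on another node is normal after a node failure, so a hit here is a prompt to look closer, not proof of misconfiguration.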

Now ifconfig on my new node should show the VIP address.

[root@EPRHEL6 network-scripts]# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:0C:29:DE:D8:FD  
inet addr:172.28.4.246  Bcast:172.28.4.255  Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fede:d8fd/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:36798 errors:0 dropped:0 overruns:0 frame:0
TX packets:13478 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000 
RX bytes:21057458 (20.0 MiB)  TX bytes:10660215 (10.1 MiB)

eth0:1    Link encap:Ethernet  HWaddr 00:0C:29:DE:D8:FD  
inet addr:172.28.4.226  Bcast:172.28.4.255  Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

Monday, February 23, 2009

Problem with adding a new node to RAC after ocr and voting disk change

After replacing the OCR and voting disks used during installation with different raw devices, I later faced a problem while adding a new node to my existing two-node cluster.

After the first steps of adding a new node to the cluster (cloning my CRS_HOME, the cluster home of Oracle RAC, and running all the scripts), the last step in the documentation was to run root.sh in the new node's CRS_HOME. But it failed with an unexpected error:


root@radbank04 # ./root.sh
"/dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0" does not exist. Create it before proceeding.
Make sure that this file is shared across cluster nodes.
1

I am pretty sure the raw device definitions are OK on the new node. I decided to examine "root.sh" to see what it runs in the background. "root.sh" consists of two other scripts, "rootinstall" and "rootconfig".


root@radbank04 # more root.sh
#!/bin/sh
/oracle/product/crs10g/install/rootinstall
/oracle/product/crs10g/install/rootconfig

I found the following part at the very beginning of the rootconfig script.


SILENT=false
ORA_CRS_HOME=/oracle/product/crs10g
CRS_ORACLE_OWNER=oracle
CRS_DBA_GROUP=oinstall
CRS_VNDR_CLUSTER=false
CRS_OCR_LOCATIONS=/dev/rdsk/c6t600601607D731F00BAE716BB2CE9DD11d0s0,/dev/rdsk/c6t600601607D731F006496A6CD2CE9DD11d0s0
CRS_CLUSTER_NAME=crs
CRS_HOST_NAME_LIST=radbank02,1,radbank03,2
CRS_NODE_NAME_LIST=radbank02,1,radbank03,2
CRS_PRIVATE_NAME_LIST=radbank02-priv1,1,radbank03-priv1,2
CRS_LANGUAGE_ID='AMERICAN_AMERICA.WE8ISO8859P1'
CRS_VOTING_DISKS=/dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0
CRS_NODELIST=radbank02,radbank03
CRS_NODEVIPS='radbank02/radbank02-vip/255.255.255.0/bge0,radbank03/radbank03-vip/255.255.255.0/bge0'

Somehow (my second "somehow" in the 10gR2 RAC environment) the CRS records were still showing the old OCR and voting disks. The new disks I had switched to in my existing RAC environment were totally different. I decided to modify the rootconfig file, replacing the old OCR and voting disk definitions with the new ones as follows, and to rerun rootconfig. I do not know if this action is recommended by Oracle, but I think it is the only option I have.


SILENT=false
ORA_CRS_HOME=/oracle/product/crs10g
CRS_ORACLE_OWNER=oracle
CRS_DBA_GROUP=oinstall
CRS_VNDR_CLUSTER=false
#CRS_OCR_LOCATIONS=/dev/rdsk/c6t600601607D731F00BAE716BB2CE9DD11d0s0,/dev/rdsk/c6t600601607D731F006496A6CD2CE9DD11d0s0
CRS_OCR_LOCATIONS=/dev/rdsk/c6t600601607D731F00AE6F711488FEDD11d0s0,/dev/rdsk/c6t600601607D731F00AF6F711488FEDD11d0s0
CRS_CLUSTER_NAME=crs
CRS_HOST_NAME_LIST=radbank02,1,radbank03,2
CRS_NODE_NAME_LIST=radbank02,1,radbank03,2
CRS_PRIVATE_NAME_LIST=radbank02-priv1,1,radbank03-priv1,2
CRS_LANGUAGE_ID='AMERICAN_AMERICA.WE8ISO8859P1'
#CRS_VOTING_DISKS=/dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0
CRS_VOTING_DISKS=/dev/rdsk/c6t600601607D731F00A8833C0377FEDD11d0s0,/dev/rdsk/c6t600601607D731F00A6833C0377FEDD11d0s0,/dev/rdsk/c6t600601607D731F00A7833C0377FEDD11d0s0
CRS_NODELIST=radbank02,radbank03
CRS_NODEVIPS='radbank02/radbank02-vip/255.255.255.0/bge0,radbank03/radbank03-vip/255.255.255.0/bge0'
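
Before running root.sh on a cloned home, the device lists in rootconfig could be compared with the devices the cluster really uses. The sketch below assumes the plain KEY=VALUE lines shown above. The function names are hypothetical, and the list of current devices is supplied by hand (for example copied from the output of ocrcheck and crsctl query css votedisk on an existing node).

```python
def parse_shell_assignments(text):
    """Collect simple KEY=VALUE lines from a shell script, skipping comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key] = value.strip("'\"")
    return values


def stale_devices(script_text, key, current_devices):
    """Devices listed under `key` in the script that the cluster no longer uses."""
    configured = parse_shell_assignments(script_text).get(key, "").split(",")
    return [dev for dev in configured if dev and dev not in current_devices]


# The original voting disk line from rootconfig, and the new voting disks.
ROOTCONFIG = """\
CRS_CLUSTER_NAME=crs
CRS_VOTING_DISKS=/dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0
"""
CURRENT_VOTING = [
    "/dev/rdsk/c6t600601607D731F00A8833C0377FEDD11d0s0",
    "/dev/rdsk/c6t600601607D731F00A6833C0377FEDD11d0s0",
    "/dev/rdsk/c6t600601607D731F00A7833C0377FEDD11d0s0",
]

if __name__ == "__main__":
    for dev in stale_devices(ROOTCONFIG, "CRS_VOTING_DISKS", CURRENT_VOTING):
        print("stale voting disk in rootconfig:", dev)
```

The same call with CRS_OCR_LOCATIONS and the current OCR devices covers the other edited line.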

root@radbank04 # /oracle/product/crs10g/install/rootconfig

Checking to see if Oracle CRS stack is already configured
OCR LOCATIONS =  /dev/rdsk/c6t600601607D731F00AE6F711488FEDD11d0s0,/dev/rdsk/c6t600601607D731F00AF6F711488FEDD11d0s0
OCR backup directory '/oracle/product/crs10g/cdata/crs' does not exist. Creating now
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/oracle/product' is not owned by root
WARNING: directory '/oracle' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :   
node 1: radbank02 radbank02-priv1 radbank02
node 2: radbank03 radbank03-priv1 radbank03
clscfg: Arguments check out successfully.

NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        radbank02
        radbank03
        radbank04
CSS is active on all nodes.

root@radbank04 # ./crs_stat -t

Name           Type           Target    State     Host        
------------------------------------------------------------
ora.monsc.db   application    ONLINE    ONLINE    radbank03   
ora....c1.inst application    ONLINE    ONLINE    radbank02   
ora....c2.inst application    ONLINE    ONLINE    radbank03   
ora.montc.db   application    ONLINE    ONLINE    radbank02   
ora....ntc1.cs application    ONLINE    ONLINE    radbank02   
ora....c1.inst application    ONLINE    ONLINE    radbank02   
ora....tc1.srv application    ONLINE    ONLINE    radbank02   
ora....ntc2.cs application    ONLINE    ONLINE    radbank03   
ora....c2.inst application    ONLINE    ONLINE    radbank03   
ora....tc2.srv application    ONLINE    ONLINE    radbank03   
ora....ntc3.cs application    ONLINE    ONLINE    radbank03   
ora....tc1.srv application    ONLINE    ONLINE    radbank02   
ora....tc2.srv application    ONLINE    ONLINE    radbank03   
ora....SM1.asm application    ONLINE    ONLINE    radbank02   
ora....02.lsnr application    ONLINE    ONLINE    radbank02   
ora....02.lsnr application    ONLINE    ONLINE    radbank02   
ora....k02.gsd application    ONLINE    ONLINE    radbank02   
ora....k02.ons application    ONLINE    ONLINE    radbank02   
ora....k02.vip application    ONLINE    ONLINE    radbank02   
ora....SM2.asm application    ONLINE    ONLINE    radbank03   
ora....03.lsnr application    ONLINE    ONLINE    radbank03   
ora....03.lsnr application    ONLINE    ONLINE    radbank03   
ora....k03.gsd application    ONLINE    ONLINE    radbank03   
ora....k03.ons application    ONLINE    ONLINE    radbank03   
ora....k03.vip application    ONLINE    ONLINE    radbank03   
ora....k04.gsd application    ONLINE    ONLINE    radbank04   
ora....k04.ons application    ONLINE    ONLINE    radbank04   
ora....k04.vip application    ONLINE    ONLINE    radbank04

My new node is now part of my existing cluster. Now it is time to add the ORACLE_HOME (database installation) and the instances with their services.

Resources:

Oracle® Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide 10g Release 2 (10.2)

Add/Remove ASM disks and asm_power_limit

Before going live in production, our storage team decided to reconfigure their storage and RAID sets for better performance. So I gave the raw devices back from ASM and of course used the opportunity to test the asm_power_limit parameter.

The following listing shows how to list ASM disks, add disks to and remove disks from a disk group, and watch the rebalance operation through the v$asm_operation view.

Raising the init parameter "asm_power_limit" makes ASM rebalance operations significantly faster. The difference can be seen clearly by setting the parameter to 10 (instead of the default value 1) and then repeating the same operations that trigger a rebalance. Of course this was not a live system, so we could let ASM use all the available IO for rebalancing. On production systems, such an aggressive setting can cause unwanted IO performance degradation for the application.
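
Besides the instance-wide parameter, the power can also be given for a single rebalance. The sketch below only builds the SQL text; it does not connect to ASM. The 0 to 11 range check is an assumption that matches the 10g release used here (later releases may accept larger values), and the helper name is my own.

```python
def rebalance_sql(diskgroup, power):
    """Build an ALTER DISKGROUP ... REBALANCE POWER statement."""
    # Assumed valid range for 10g; check the documentation of your release.
    if not 0 <= power <= 11:
        raise ValueError("power must be between 0 and 11")
    return f"ALTER DISKGROUP {diskgroup} REBALANCE POWER {power};"


# Query to watch the progress of a running rebalance.
MONITOR_SQL = (
    "SELECT group_number, operation, state, power, est_minutes "
    "FROM v$asm_operation;"
)

if __name__ == "__main__":
    print(rebalance_sql("DATA", 10))
    print(MONITOR_SQL)
```

Setting the power per statement leaves asm_power_limit at its gentle default for any rebalance that starts unattended.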


SQL> select group_number,state,name,total_mb from v$asm_disk;

GROUP_NUMBER STATE    NAME                   TOTAL_MB
------------ -------- -------------------- ----------
0 NORMAL                               924
0 NORMAL                               924
0 NORMAL                               924
0 NORMAL                               924
1 NORMAL   DATA_0009                614300
1 NORMAL   DATA_0008                614290
1 NORMAL   DATA_0002                614290
1 NORMAL   DATA_0001                614290
1 NORMAL   DATA_0000                614290
1 NORMAL   DATA_0007                614290
1 NORMAL   DATA_0006                614290
1 NORMAL   DATA_0005                614290
1 NORMAL   DATA_0004                614290
1 NORMAL   DATA_0003                614290

14 rows selected.

SQL> alter diskgroup DATA drop disk DATA_0000;

Diskgroup altered.

SQL> alter diskgroup DATA drop disk DATA_0001;

Diskgroup altered.

SQL> select group_number,state,name,total_mb,label,path from v$asm_disk;

GROUP_NUMBER STATE    NAME              TOTAL_MB LABEL PATH
------------ -------- --------------- ---------- ----- --------------------------------------------------
0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A8833C0377FEDD11d0s0
0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A7833C0377FEDD11d0s0
0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A6833C0377FEDD11d0s0
0 NORMAL                          924       /dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0
1 NORMAL   DATA_0009           614300       /dev/rdsk/c6t600601607D731F00D994060263E8DD11d0s0
1 NORMAL   DATA_0008           614290       /dev/rdsk/c6t600601607D731F00D894060263E8DD11d0s0
1 NORMAL   DATA_0002           614290       /dev/rdsk/c6t600601607D731F006A37A4E0A0E7DD11d0s0
1 DROPPING DATA_0001           614290       /dev/rdsk/c6t600601607D731F006937A4E0A0E7DD11d0s0
1 DROPPING DATA_0000           614290       /dev/rdsk/c6t600601607D731F006837A4E0A0E7DD11d0s0
1 NORMAL   DATA_0007           614290       /dev/rdsk/c6t600601607D731F007608DB3DA0E7DD11d0s0
1 NORMAL   DATA_0006           614290       /dev/rdsk/c6t600601607D731F007508DB3DA0E7DD11d0s0
1 NORMAL   DATA_0005           614290       /dev/rdsk/c6t600601607D731F007408DB3DA0E7DD11d0s0
1 NORMAL   DATA_0004           614290       /dev/rdsk/c6t600601607D731F007308DB3DA0E7DD11d0s0
1 NORMAL   DATA_0003           614290       /dev/rdsk/c6t600601607D731F007208DB3DA0E7DD11d0s0


SQL> select * from v$asm_operation;

OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES
----- ---- ---------- ---------- ---------- ---------- ---------- -----------
REBAL RUN          10         10       7198      12996        260          22
REBAL RUN          10         10       7679      12979        242          21
REBAL RUN          10         10       9286      12930        201          18
REBAL RUN          10         10      11647      12899        237           5

SQL> select * from v$asm_operation;

no rows selected

SQL> select group_number,state,name,total_mb,label,path from v$asm_disk;

GROUP_NUMBER STATE    NAME              TOTAL_MB LABEL PATH
------------ -------- --------------- ---------- ----- --------------------------------------------------
0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A8833C0377FEDD11d0s0
0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A7833C0377FEDD11d0s0
0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A6833C0377FEDD11d0s0
0 NORMAL                       614290       /dev/rdsk/c6t600601607D731F006937A4E0A0E7DD11d0s0
0 NORMAL                       614290       /dev/rdsk/c6t600601607D731F006837A4E0A0E7DD11d0s0
0 NORMAL                          924       /dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0
1 NORMAL   DATA_0009           614300       /dev/rdsk/c6t600601607D731F00D994060263E8DD11d0s0
1 NORMAL   DATA_0008           614290       /dev/rdsk/c6t600601607D731F00D894060263E8DD11d0s0
1 NORMAL   DATA_0002           614290       /dev/rdsk/c6t600601607D731F006A37A4E0A0E7DD11d0s0
1 NORMAL   DATA_0007           614290       /dev/rdsk/c6t600601607D731F007608DB3DA0E7DD11d0s0
1 NORMAL   DATA_0006           614290       /dev/rdsk/c6t600601607D731F007508DB3DA0E7DD11d0s0
1 NORMAL   DATA_0005           614290       /dev/rdsk/c6t600601607D731F007408DB3DA0E7DD11d0s0
1 NORMAL   DATA_0004           614290       /dev/rdsk/c6t600601607D731F007308DB3DA0E7DD11d0s0
1 NORMAL   DATA_0003           614290       /dev/rdsk/c6t600601607D731F007208DB3DA0E7DD11d0s0


SQL> alter diskgroup DATA add disk '/dev/rdsk/c6t600601607D731F006837A4E0A0E7DD11d0s0';

Diskgroup altered.

SQL> select group_number,state,name,total_mb,label,path from v$asm_disk;

GROUP_NUMBER STATE    NAME              TOTAL_MB LABEL PATH
------------ -------- --------------- ---------- ----- --------------------------------------------------
0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A8833C0377FEDD11d0s0
0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A7833C0377FEDD11d0s0
0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A6833C0377FEDD11d0s0
0 NORMAL                       614290       /dev/rdsk/c6t600601607D731F006937A4E0A0E7DD11d0s0
0 NORMAL                          924       /dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0
1 NORMAL   DATA_0009           614300       /dev/rdsk/c6t600601607D731F00D994060263E8DD11d0s0
1 NORMAL   DATA_0008           614290       /dev/rdsk/c6t600601607D731F00D894060263E8DD11d0s0
1 NORMAL   DATA_0002           614290       /dev/rdsk/c6t600601607D731F006A37A4E0A0E7DD11d0s0
1 NORMAL   DATA_0007           614290       /dev/rdsk/c6t600601607D731F007608DB3DA0E7DD11d0s0
1 NORMAL   DATA_0006           614290       /dev/rdsk/c6t600601607D731F007508DB3DA0E7DD11d0s0
1 NORMAL   DATA_0005           614290       /dev/rdsk/c6t600601607D731F007408DB3DA0E7DD11d0s0
1 NORMAL   DATA_0004           614290       /dev/rdsk/c6t600601607D731F007308DB3DA0E7DD11d0s0
1 NORMAL   DATA_0003           614290       /dev/rdsk/c6t600601607D731F007208DB3DA0E7DD11d0s0
1 NORMAL   DATA_0000           614290       /dev/rdsk/c6t600601607D731F006837A4E0A0E7DD11d0s0


SQL> show parameter asm_power_limit;

NAME                            TYPE        VALUE
------------------------------- ----------- -------------------------
asm_power_limit                 integer     1

SQL> alter system set asm_power_limit=10;

System altered.

SQL> show parameter asm_power_limit;

NAME                            TYPE        VALUE
------------------------------- ----------- -------------------------
asm_power_limit                 integer     10

SQL>
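
The EST_MINUTES column in the v$asm_operation output above seems to follow from the other columns: the remaining work (EST_WORK minus SOFAR) divided by EST_RATE, rounded down. The small Python sketch below only illustrates that arithmetic with the four rows from the listing. The function name est_minutes is made up for this example, and the formula is inferred from these numbers, not quoted from the Oracle documentation.

```python
# Rows copied from the v$asm_operation listing above:
# (SOFAR, EST_WORK, EST_RATE, EST_MINUTES as reported by ASM)
rows = [
    (7198, 12996, 260, 22),
    (7679, 12979, 242, 21),
    (9286, 12930, 201, 18),
    (11647, 12899, 237, 5),
]


def est_minutes(sofar, est_work, est_rate):
    """Remaining work divided by the current rate, rounded down (assumed formula)."""
    return (est_work - sofar) // est_rate


for sofar, est_work, est_rate, reported in rows:
    # Print the computed estimate next to the value ASM reported.
    print(sofar, est_minutes(sofar, est_work, est_rate), reported)
```

For these four samples the computed value matches the reported EST_MINUTES, which makes the view a handy way to judge how long a rebalance at a given power level still has to run.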

Tuesday, August 12, 2008

srvctl problem after 10.2.0.4 patch

The new migration project of the database and system group in our company is from Oracle Database 9.2.0.5 RAC on Linux AS3 32-bit to Oracle Database RAC 10.2.0.4 on Linux ES4 Update 6 64-bit. After successfully installing the clusterware and the database at 10.2.0.1, I decided to upgrade both to the latest patchset, 10.2.0.4. After installing the patchset I realised that srvctl could not start the database with the command srvctl start database -d PQ, although I could still start the instances individually with sqlplus. The weird thing is that srvctl worked with the "status" and "stop" options. When I examined the CRS logs I found the following errors.


#cat /oracle/product/crs/log/pqvrtsrv1/crsd/crsd.log
2008-07-16 17:52:37.840: [  CRSAPP][1535326560]0StartResource error for ora.PQ.db error code = 1
2008-07-16 17:52:37.892: [  CRSRES][1535326560]0Start of `ora.PQ.db` on member `pqvrtsrv1` failed.

This single error message was not enough for me to start taking action, since it carries too little information to draw a conclusion. I decided to look at the database's alert.log to see if I could find any clue about the problem, but the last entries written to the alert.log were the database's shutdown messages. This may itself be a clue: it means srvctl could not even reach the database to start it, so the srvctl settings were the next thing to examine.


$ srvctl config database -d PQ -a
pqvrtsrv1 PQ1 /oracle/product/db10g
pqvrtsrv2 PQ2 /oracle/product/db10g
DB_NAME: PQ
ORACLE_HOME: /oracle/product/db10g
SPFILE: /pqdata1/PQ/PQ/spfilePQ.ora
DOMAIN: null
DB_ROLE: null
START_OPTIONS: null
POLICY:  AUTOMATIC
ENABLE FLAG: DB ENABLED

Now I can see the problem. The SPFILE configuration is wrong for this database. I clearly remember changing the spfile location to /pqdata1/PQ/spfilePQ.ora, but somehow the setting got mixed up after the patch was installed. I corrected the SPFILE configuration and the problem was solved.


#srvctl modify database -d PQ -p /pqdata1/PQ/spfilePQ.ora
#srvctl start instance -d PQ -i PQ2
#srvctl status database -d PQ
Instance PQ1 is running on node pqvrtsrv1
Instance PQ2 is running on node pqvrtsrv2
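
A check like this can also be scripted. The Python sketch below parses the text printed by srvctl config database -d PQ -a (pasted in as a string, not fetched from a live cluster) and compares the SPFILE entry with the path I expected. The helper name parse_srvctl_config and the variable names are only illustrative.

```python
# Output of "srvctl config database -d PQ -a" as shown above, pasted as text.
CONFIG_OUTPUT = """\
pqvrtsrv1 PQ1 /oracle/product/db10g
pqvrtsrv2 PQ2 /oracle/product/db10g
DB_NAME: PQ
ORACLE_HOME: /oracle/product/db10g
SPFILE: /pqdata1/PQ/PQ/spfilePQ.ora
DOMAIN: null
DB_ROLE: null
START_OPTIONS: null
POLICY:  AUTOMATIC
ENABLE FLAG: DB ENABLED
"""


def parse_srvctl_config(text):
    """Collect the 'KEY: value' lines into a dict; lines without a colon are skipped."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


settings = parse_srvctl_config(CONFIG_OUTPUT)
expected_spfile = "/pqdata1/PQ/spfilePQ.ora"  # the location I had configured

if settings["SPFILE"] != expected_spfile:
    print("SPFILE mismatch:", settings["SPFILE"], "expected", expected_spfile)
```

Running a comparison like this before and after a patchset would show right away whether the registered configuration has drifted.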

Resources:

Server Control Utility Reference

Tuesday, July 29, 2008

Problem with CRS startup after disk migration

Our Unix and storage group had a project to migrate from the old storage subsystem to a brand new one. The action plan was simple:

1- shut down the database with

srvctl stop database -d

2- shut down the crs with

crsctl stop crs

3- move the data from the old disks to the new ones
4- rename the new disks with the original names
5- start the crs with

crsctl start crs

6- start up the database with

srvctl start database -d

But right after step 4, crs was unable to start and I was called in the middle of the night. When I tried to start crs with crsctl start crs, I got errors indicating that crs was unable to reach the OCR disk. I checked the OCR with ocrcheck and looked at its log file in $CRS_HOME/log//client/


[root@raquality00 client]$ pwd
/oracle/product/10.2.0/crs/log/raquality00/client
[root@raquality00 client]# cat ocrcheck_13416.log
Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle.  All rights reserved.
2008-06-18 02:13:39.487: [OCRCHECK][2538462912]ocrcheck starts...
2008-06-18 02:13:39.488: [  OCROSD][2538462912]utstoragetype: /oradata1/orcfile.ora is on FS type 1952539503. Not supported.
2008-06-18 02:13:39.488: [  OCROSD][2538462912]utopen:6'': OCR location /oradata1/orcfile.ora configured is not valid storage type. Return code [37].
2008-06-18 02:13:39.488: [  OCRRAW][2538462912]proprinit: Could not open raw device 
2008-06-18 02:13:39.488: [ default][2538462912]a_init:7!: Backend init unsuccessful : [37]
2008-06-18 02:13:39.488: [OCRCHECK][2538462912]Failed to initialize OCR context: [PROC-37: Oracle Cluster Registry does not support the storage type configured]
2008-06-18 02:13:39.488: [OCRCHECK][2538462912]Failed to initialize ocrchek2
2008-06-18 02:13:39.488: [OCRCHECK][2538462912]Exiting [status=failed]...

The log file says there is a problem with the OCR file, orcfile.ora. I was not sure whether that filename was correct, so to be certain I examined the install logs in the $CRS_HOME/install/ directory.


[oracle@raquality00 install]$ pwd
/oracle/product/10.2.0/crs/install
[oracle@raquality00 install]$ cat paramfile.crs
ORA_CRS_HOME=/oracle/product/10.2.0/crs
CRS_ORACLE_OWNER=oracle
CRS_DBA_GROUP=oinstall
CRS_VNDR_CLUSTER=false
CRS_OCR_LOCATIONS=/oradata1/orcfile.ora
CRS_CLUSTER_NAME=raqcrs
CRS_HOST_NAME_LIST=raquality00,1,raquality01,2
CRS_NODE_NAME_LIST=raquality00,1,raquality01,2
CRS_PRIVATE_NAME_LIST=raquality00-priv,1,raquality01-priv,2
CRS_LANGUAGE_ID='AMERICAN_AMERICA.WE8ISO8859P1'
CRS_VOTING_DISKS=/oradata1/votingfile.ora
CRS_NODELIST=raquality00,raquality01
CRS_NODEVIPS='raquality00/raquality00-vip/255.255.255.0/eth0,raquality01/raquality01-vip/255.255.255.0/eth0'

Let's see whether someone changed the location of the OCR file after installation by inspecting /etc/oracle/ocr.loc.


[oracle@raquality00 oracle]$ pwd
/etc/oracle
[oracle@raquality00 oracle]$ cat ocr.loc
ocrconfig_loc=/oradata1/orcfile.ora
local_only=FALSE
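
The comparison I did by eye here can be written as a small Python sketch. It reads both files as plain KEY=value text (pasted in below, shortened to the relevant lines) and checks that the OCR location recorded at install time is the same one CRS is using now. The helper parse_key_values is a made-up name for this example.

```python
# Relevant lines from $CRS_HOME/install/paramfile.crs (install-time settings).
PARAMFILE = """\
ORA_CRS_HOME=/oracle/product/10.2.0/crs
CRS_OCR_LOCATIONS=/oradata1/orcfile.ora
CRS_VOTING_DISKS=/oradata1/votingfile.ora
"""

# Content of /etc/oracle/ocr.loc (current setting).
OCR_LOC = """\
ocrconfig_loc=/oradata1/orcfile.ora
local_only=FALSE
"""


def parse_key_values(text):
    """Parse simple KEY=value lines, ignoring blank lines and # comments."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            result[key.strip()] = value.strip()
    return result


installed = parse_key_values(PARAMFILE)["CRS_OCR_LOCATIONS"]
current = parse_key_values(OCR_LOC)["ocrconfig_loc"]
print("OCR location unchanged since install:", installed == current)
```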

The OCR location and the filename seem correct. Now we know it hasn't been changed since the installation. It seems the OCR file may be corrupted. Checking the backups of the OCR is useful at this point, as I may need to restore from a backup. The command I ran was ocrconfig -showbackup. The output shows that things are getting worse.


[root@raquality00 client]# cat ocrconfig_9347.log
Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle.  All rights reserved.
2008-06-18 02:04:20.350: [ OCRCONF][2538462912]ocrconfig starts...
2008-06-18 02:04:20.353: [  OCROSD][2538462912]utstoragetype: /oradata1/orcfile.ora is on FS type 1952539503. Not supported.
2008-06-18 02:04:20.353: [  OCROSD][2538462912]utopen:6'': OCR location /oradata1/orcfile.ora configured is not valid storage type. Return code [37].
2008-06-18 02:04:20.353: [  OCRRAW][2538462912]proprcow: problem reading the bootblock
2008-06-18 02:04:20.353: [ OCRCONF][2538462912]Failure in overwriting OCR configuration on disk
2008-06-18 02:04:20.353: [ OCRCONF][2538462912]Exiting [status=failed]...

Now I am starting to have doubts. Maybe there is a problem with the filesystems. I decided to check the filesystem types in fstab, but they seem OK.


[root@raquality00 raqcrs]# cat /etc/fstab
# This file is edited by fstab-sync - see 'man fstab-sync' for details
LABEL=/1                /                       ext3    defaults        1 1
LABEL=/boot1            /boot                   ext3    defaults        1 2
#LABEL=/oradata          /oradata1              ocfs2   _netdev,datavolume,nointr       0 0
LABEL=/oradata1_new     /oradata1               ocfs2   _netdev,datavolume,nointr       0 0
#LABEL=/oradata2         /oradata2              ocfs2   _netdev,datavolume,nointr       0 0
LABEL=/oradata2_new     /oradata2               ocfs2   _netdev,datavolume,nointr       0 0
#LABEL=/oradata3         /oradata3              ocfs2   _netdev,datavolume,nointr       0 0
#LABEL=/oradata4         /oradata4              ocfs2   _netdev,datavolume,nointr       0 0
#LABEL=/oradata5         /oradata5              ocfs2   _netdev,datavolume,nointr       0 0
#LABEL=/oradata6         /oradata6              ocfs2   _netdev,datavolume,nointr       0 0
#LABEL=/oradata7         /oradata7              ocfs2   _netdev,datavolume,nointr       0 0
#LABEL=/oradata8         /oradata8              ocfs2   _netdev,datavolume,nointr       0 0
#LABEL=/oradata9         /oradata9              ocfs2   _netdev,datavolume,nointr       0 0
LABEL=/oradata9_new     /oradata9               ocfs2   _netdev,datavolume,nointr       0 0
LABEL=/oradata10_new    /oradata10              ocfs2   _netdev,datavolume,nointr       0 0
#LABEL=/oradata11        /oradata11             ocfs2   _netdev,datavolume,nointr       0 0
LABEL=/oradata11_new    /oradata11              ocfs2   _netdev,datavolume,nointr       0 0
#LABEL=/oradata12        /oradata12             ocfs2   _netdev,datavolume,nointr       0 0
LABEL=/oradata12_new    /oradata12              ocfs2   _netdev,datavolume,nointr       0 0
#/dev/mapper/vgra-lvra01        /ra01                   ext3    defaults        1 2
/dev/vgra00/lvra01      /ra01_new               ext3    defaults        1 2
none                    /dev/pts                devpts  gid=5,mode=620  0 0
none                    /dev/shm                tmpfs   defaults        0 0
none                    /proc                   proc    defaults        0 0
none                    /sys                    sysfs   defaults        0 0
LABEL=SWAP-sda2         swap                    swap    defaults        0 0
/dev/sdb1               swap                    swap    defaults        0 0
#/dev/vgora/lvora       /oracle                 ext3    defaults        1 2
/dev/vgra00/lvoracle    /oracle                 ext3    defaults        1 2
/dev/hda                /media/cdrom            auto    pamconsole,exec,noauto,managed 0 0
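
One more cross-check is possible with the number in the "utstoragetype" log line. The Python sketch below assumes that "FS type 1952539503" is the filesystem magic number returned by statfs, and compares it with the value Linux defines for OCFS2 (OCFS2_SUPER_MAGIC, 0x7461636f in linux/magic.h). It also looks up the active fstab entry for /oradata1, using only the two relevant lines from the listing. The function name active_fs_type is just for illustration.

```python
FS_TYPE_FROM_LOG = 1952539503  # from the utstoragetype line in the ocrcheck log

# Assumption: the log prints the statfs f_type value. Linux defines
# OCFS2_SUPER_MAGIC as 0x7461636f in linux/magic.h.
OCFS2_SUPER_MAGIC = 0x7461636F

# The two fstab lines for /oradata1 from the listing above.
FSTAB_LINES = """\
#LABEL=/oradata          /oradata1              ocfs2   _netdev,datavolume,nointr       0 0
LABEL=/oradata1_new     /oradata1               ocfs2   _netdev,datavolume,nointr       0 0
"""


def active_fs_type(fstab_text, mount_point):
    """Return the filesystem type of the first uncommented entry for mount_point."""
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            return fields[2]
    return None


print(hex(FS_TYPE_FROM_LOG))
print(FS_TYPE_FROM_LOG == OCFS2_SUPER_MAGIC)
print(active_fs_type(FSTAB_LINES, "/oradata1"))
```

If the assumption holds, the log line is at least consistent with /oradata1 being an ocfs2 mount, which agrees with fstab. It does not by itself explain why the OCR tools rejected the storage type.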

When I rechecked the error logs, I realised that there are other error messages after the invalid storage type line. OCR complains about reading the bootblock and overwriting the file. There may be a process holding the file.


[oracle@raquality00 install]$ ps -ef | grep crs
root     10239  7791  0 Jun18 ?        00:37:59 /oracle/product/10.2.0/crs/bin/crsd.bin reboot

Now it is clearer: our admins probably forgot to stop the crs service by executing

/etc/init.d/init.crs stop

after stopping crs with

crsctl stop crs

so the data migration was done while the crs service was still running. I recommended a reboot, but stopping the service might also have been enough. After the reboot crs started successfully and the databases are now up and running.


[oracle@raquality00 install]$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy

[oracle@raquality00 install]$ srvctl status database -d RADB
Instance RADB1 is running on node raquality00
Instance RADB2 is running on node raquality01
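
A lesson from this case is to confirm that no CRS daemon is left running before the disks are touched. The Python sketch below scans ps -ef output (pasted in as text) for clusterware daemons. Only crsd.bin appears in my listing above; ocssd.bin and evmd.bin are assumed names for the CSS and EVM daemons, and the function name running_crs_daemons is made up for this example.

```python
# Output of "ps -ef | grep crs" from the listing above, pasted as text.
PS_OUTPUT = """\
root     10239  7791  0 Jun18 ?        00:37:59 /oracle/product/10.2.0/crs/bin/crsd.bin reboot
"""

# crsd.bin is taken from the listing; the other two names are assumptions.
DAEMONS = ("crsd.bin", "ocssd.bin", "evmd.bin")


def running_crs_daemons(ps_text):
    """Return (pid, executable) for every ps -ef line that runs a known daemon."""
    found = []
    for line in ps_text.splitlines():
        # ps -ef columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        executable = fields[7].split()[0].rsplit("/", 1)[-1]
        if executable in DAEMONS:
            found.append((int(fields[1]), executable))
    return found


leftovers = running_crs_daemons(PS_OUTPUT)
if leftovers:
    print("CRS daemons still running, do not move the disks yet:", leftovers)
```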
