Monday, February 23, 2009

Problem adding a new node to RAC after an OCR and voting disk change

After replacing the OCR and voting disks with raw devices different from the ones used at installation, I later ran into a problem while adding a new node to my existing two-node cluster.

After the first steps of adding a new node to the cluster (cloning my CRS_HOME, the cluster home of Oracle RAC, and running all the scripts), the last step in the documentation was to run root.sh in the new node's CRS_HOME. It failed with an unexpected error:


root@radbank04 # ./root.sh
"/dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0" does not exist. Create it before proceeding.
Make sure that this file is shared across cluster nodes.
1

I was pretty sure the raw device definitions were fine on the new node, so I decided to examine root.sh to see what it runs behind the scenes. It turned out to call two other scripts, rootinstall and rootconfig.

root@radbank04 # more root.sh
#!/bin/sh
/oracle/product/crs10g/install/rootinstall
/oracle/product/crs10g/install/rootconfig
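
The claim that the raw devices are fine on the new node can be checked by testing each path directly. The sketch below is only an illustration: check_devices is a hypothetical helper, not part of the Oracle scripts, and it assumes the devices show up as character devices (as /dev/rdsk entries do on Solaris).

```shell
#!/bin/sh
# check_devices LIST
# Hypothetical helper: LIST is a comma-separated list of device paths.
# Prints one line per path and returns non-zero if any path is not a
# character device on this node.
check_devices() {
    rc=0
    old_ifs=$IFS
    IFS=,
    for dev in $1; do
        if [ -c "$dev" ]; then
            echo "OK      $dev"
        else
            echo "MISSING $dev"
            rc=1
        fi
    done
    IFS=$old_ifs
    return $rc
}
```

Called as check_devices "/dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0" on the new node, it reports whether the path from the error message is visible there. Ownership and permissions would still need a separate look, for example with ls -lL.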

I found the following at the very beginning of the rootconfig script.

SILENT=false
ORA_CRS_HOME=/oracle/product/crs10g
CRS_ORACLE_OWNER=oracle
CRS_DBA_GROUP=oinstall
CRS_VNDR_CLUSTER=false
CRS_OCR_LOCATIONS=/dev/rdsk/c6t600601607D731F00BAE716BB2CE9DD11d0s0,/dev/rdsk/c6t600601607D731F006496A6CD2CE9DD11d0s0
CRS_CLUSTER_NAME=crs
CRS_HOST_NAME_LIST=radbank02,1,radbank03,2
CRS_NODE_NAME_LIST=radbank02,1,radbank03,2
CRS_PRIVATE_NAME_LIST=radbank02-priv1,1,radbank03-priv1,2
CRS_LANGUAGE_ID='AMERICAN_AMERICA.WE8ISO8859P1'
CRS_VOTING_DISKS=/dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0
CRS_NODELIST=radbank02,radbank03
CRS_NODEVIPS='radbank02/radbank02-vip/255.255.255.0/bge0,radbank03/radbank03-vip/255.255.255.0/bge0'
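
The device paths recorded in these variables can be pulled out of the file and set beside what the running cluster reports (on an existing node, ocrcheck shows the OCR locations and crsctl query css votedisk shows the voting disks). A minimal sketch, assuming the file keeps its plain KEY=value layout; rootconfig_value and split_list are hypothetical helper names.

```shell
#!/bin/sh
# rootconfig_value FILE KEY
# Prints the value assigned to KEY in a rootconfig-style file of KEY=value
# lines. Lines commented out with # are ignored. If KEY is assigned more
# than once, the last assignment is printed, which is the one that would
# be in effect when the script runs.
rootconfig_value() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# split_list LIST
# Prints each comma-separated entry of LIST on its own line, which makes
# the device paths easier to compare by eye or with diff.
split_list() {
    echo "$1" | tr ',' '\n'
}
```

For example, split_list "$(rootconfig_value /oracle/product/crs10g/install/rootconfig CRS_VOTING_DISKS)" lists the voting disks that rootconfig is about to look for.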

Somehow (my second "somehow" in the 10gR2 RAC environment) the CRS records were still showing the old OCR and voting disks. The new disks I had switched to in my existing RAC environment were completely different. I decided to modify the rootconfig file, replacing the old OCR and voting disk definitions with the new ones as follows, and then rerun rootconfig. I don't know whether Oracle recommends this, but I think it was the only option I had.

SILENT=false
ORA_CRS_HOME=/oracle/product/crs10g
CRS_ORACLE_OWNER=oracle
CRS_DBA_GROUP=oinstall
CRS_VNDR_CLUSTER=false
#CRS_OCR_LOCATIONS=/dev/rdsk/c6t600601607D731F00BAE716BB2CE9DD11d0s0,/dev/rdsk/c6t600601607D731F006496A6CD2CE9DD11d0s0
CRS_OCR_LOCATIONS=/dev/rdsk/c6t600601607D731F00AE6F711488FEDD11d0s0,/dev/rdsk/c6t600601607D731F00AF6F711488FEDD11d0s0
CRS_CLUSTER_NAME=crs
CRS_HOST_NAME_LIST=radbank02,1,radbank03,2
CRS_NODE_NAME_LIST=radbank02,1,radbank03,2
CRS_PRIVATE_NAME_LIST=radbank02-priv1,1,radbank03-priv1,2
CRS_LANGUAGE_ID='AMERICAN_AMERICA.WE8ISO8859P1'
#CRS_VOTING_DISKS=/dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0
CRS_VOTING_DISKS=/dev/rdsk/c6t600601607D731F00A8833C0377FEDD11d0s0,/dev/rdsk/c6t600601607D731F00A6833C0377FEDD11d0s0,/dev/rdsk/c6t600601607D731F00A7833C0377FEDD11d0s0
CRS_NODELIST=radbank02,radbank03
CRS_NODEVIPS='radbank02/radbank02-vip/255.255.255.0/bge0,radbank03/radbank03-vip/255.255.255.0/bge0'
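
The same kind of edit (comment out the old line, add the new value right below it) can be scripted so that a copy of the untouched file is kept. This is a sketch under assumptions, not an Oracle-provided procedure: set_rootconfig_value is a hypothetical helper, and it relies on an awk that supports -v (on Solaris that means nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk).

```shell
#!/bin/sh
# set_rootconfig_value FILE KEY VALUE
# Hypothetical helper: comments out the active KEY= line in FILE and adds
# a new KEY=VALUE line right after it. The first time it runs it saves the
# original file as FILE.orig; later runs leave that copy alone.
set_rootconfig_value() {
    file=$1 key=$2 value=$3
    [ -f "$file.orig" ] || cp -p "$file" "$file.orig" || return 1
    tmp="$file.tmp.$$"
    awk -v key="$key" -v value="$value" '
        # Only an uncommented line that starts with KEY= is rewritten.
        index($0, key "=") == 1 { print "#" $0; print key "=" value; next }
        { print }
    ' "$file" > "$tmp" &&
    cat "$tmp" > "$file" &&   # overwrite in place to keep owner and mode
    rm -f "$tmp"
}
```

A diff between rootconfig.orig and rootconfig afterwards shows exactly what was changed before the script is rerun.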

root@radbank04 # /oracle/product/crs10g/install/rootconfig

Checking to see if Oracle CRS stack is already configured
OCR LOCATIONS = /dev/rdsk/c6t600601607D731F00AE6F711488FEDD11d0s0,/dev/rdsk/c6t600601607D731F00AF6F711488FEDD11d0s0
OCR backup directory '/oracle/product/crs10g/cdata/crs' does not exist. Creating now
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/oracle/product' is not owned by root
WARNING: directory '/oracle' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: radbank02 radbank02-priv1 radbank02
node 2: radbank03 radbank03-priv1 radbank03
clscfg: Arguments check out successfully.

NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
radbank02
radbank03
radbank04
CSS is active on all nodes.

root@radbank04 # ./crs_stat -t

Name           Type           Target    State     Host
------------------------------------------------------------
ora.monsc.db   application    ONLINE    ONLINE    radbank03
ora....c1.inst application    ONLINE    ONLINE    radbank02
ora....c2.inst application    ONLINE    ONLINE    radbank03
ora.montc.db   application    ONLINE    ONLINE    radbank02
ora....ntc1.cs application    ONLINE    ONLINE    radbank02
ora....c1.inst application    ONLINE    ONLINE    radbank02
ora....tc1.srv application    ONLINE    ONLINE    radbank02
ora....ntc2.cs application    ONLINE    ONLINE    radbank03
ora....c2.inst application    ONLINE    ONLINE    radbank03
ora....tc2.srv application    ONLINE    ONLINE    radbank03
ora....ntc3.cs application    ONLINE    ONLINE    radbank03
ora....tc1.srv application    ONLINE    ONLINE    radbank02
ora....tc2.srv application    ONLINE    ONLINE    radbank03
ora....SM1.asm application    ONLINE    ONLINE    radbank02
ora....02.lsnr application    ONLINE    ONLINE    radbank02
ora....02.lsnr application    ONLINE    ONLINE    radbank02
ora....k02.gsd application    ONLINE    ONLINE    radbank02
ora....k02.ons application    ONLINE    ONLINE    radbank02
ora....k02.vip application    ONLINE    ONLINE    radbank02
ora....SM2.asm application    ONLINE    ONLINE    radbank03
ora....03.lsnr application    ONLINE    ONLINE    radbank03
ora....03.lsnr application    ONLINE    ONLINE    radbank03
ora....k03.gsd application    ONLINE    ONLINE    radbank03
ora....k03.ons application    ONLINE    ONLINE    radbank03
ora....k03.vip application    ONLINE    ONLINE    radbank03
ora....k04.gsd application    ONLINE    ONLINE    radbank04
ora....k04.ons application    ONLINE    ONLINE    radbank04
ora....k04.vip application    ONLINE    ONLINE    radbank04
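
With this many resources it is easy to miss one that did not come up. The sketch below filters crs_stat -t output read from standard input. Both function names are hypothetical, and the parsing assumes the five-column layout shown above, with the Host column left empty for a resource that is not running.

```shell
#!/bin/sh
# not_online
# Reads `crs_stat -t` output on stdin and prints name, target and state of
# every resource whose Target or State is not ONLINE. The header and the
# dashed line are skipped because their second field is not "application".
not_online() {
    awk '$2 == "application" && ($3 != "ONLINE" || $4 != "ONLINE") { print $1, $3, $4 }'
}

# resources_on HOST
# Reads `crs_stat -t` output on stdin and prints the names of the
# resources currently hosted on HOST.
resources_on() {
    awk -v host="$1" '$2 == "application" && NF == 5 && $5 == host { print $1 }'
}
```

For the listing above, ./crs_stat -t | not_online would print nothing, and ./crs_stat -t | resources_on radbank04 would list the gsd, ons and vip resources of the new node.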


My new node is now part of the existing cluster. Next comes adding the ORACLE_HOME (the database installation) and the instances with their services.

Resources:

Oracle® Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide 10g Release 2 (10.2)

Add/Remove ASM disks and asm_power_limit

Before going live in production, our storage team decided to reconfigure their storage and RAID sets for better performance. So I gave the raw devices back from ASM and, of course, used the opportunity to test the asm_power_limit parameter.

The following listing shows how to list ASM disks, add and remove disks from a disk group, and watch the rebalance operation through the v$asm_operation view.

Raising the init parameter asm_power_limit makes ASM rebalance operations significantly faster. The difference is obvious when you set the parameter to 10 (instead of the default value 1) and then repeat the same operations that need a rebalance. Of course this was not an online system, so we could let ASM use up all the I/O for rebalancing. On a production system, such an aggressive setting can cause unwanted I/O performance degradation for the application.
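
The power can also be given for a single operation with the REBALANCE POWER clause of ALTER DISKGROUP, instead of changing the instance-wide parameter. The sketch below only builds the statement text. drop_disks_sql is a hypothetical helper, the 0 to 11 range check reflects the 10g limits, and dropping several disks in one statement is assumed to let ASM do the work in one rebalance rather than one per disk.

```shell
#!/bin/sh
# drop_disks_sql DISKGROUP POWER DISK [DISK...]
# Hypothetical helper: prints an ALTER DISKGROUP statement that drops all
# the named disks at once and sets the rebalance power for that operation.
# Nothing is executed; the output is meant to be pasted into SQL*Plus.
drop_disks_sql() {
    [ $# -ge 3 ] || { echo "usage: drop_disks_sql DISKGROUP POWER DISK..." >&2; return 1; }
    dg=$1 power=$2
    shift 2
    case $power in
        [0-9]|10|11) ;;   # valid range for asm_power_limit in 10g
        *) echo "power must be between 0 and 11" >&2; return 1 ;;
    esac
    disks=$1
    shift
    for d in "$@"; do
        disks="$disks, $d"
    done
    echo "alter diskgroup $dg drop disk $disks rebalance power $power;"
}
```

For the two disks dropped in the listing below, drop_disks_sql DATA 10 DATA_0000 DATA_0001 prints: alter diskgroup DATA drop disk DATA_0000, DATA_0001 rebalance power 10;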


SQL> select group_number,state,name,total_mb from v$asm_disk;

GROUP_NUMBER STATE    NAME                   TOTAL_MB
------------ -------- -------------------- ----------
           0 NORMAL                               924
           0 NORMAL                               924
           0 NORMAL                               924
           0 NORMAL                               924
           1 NORMAL   DATA_0009                614300
           1 NORMAL   DATA_0008                614290
           1 NORMAL   DATA_0002                614290
           1 NORMAL   DATA_0001                614290
           1 NORMAL   DATA_0000                614290
           1 NORMAL   DATA_0007                614290
           1 NORMAL   DATA_0006                614290
           1 NORMAL   DATA_0005                614290
           1 NORMAL   DATA_0004                614290
           1 NORMAL   DATA_0003                614290

14 rows selected.

SQL> alter diskgroup DATA drop disk DATA_0000;

Diskgroup altered.

SQL> alter diskgroup DATA drop disk DATA_0001;

Diskgroup altered.

SQL> select group_number,state,name,total_mb,label,path from v$asm_disk;

GROUP_NUMBER STATE    NAME              TOTAL_MB LABEL PATH
------------ -------- --------------- ---------- ----- --------------------------------------------------
           0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A8833C0377FEDD11d0s0
           0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A7833C0377FEDD11d0s0
           0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A6833C0377FEDD11d0s0
           0 NORMAL                          924       /dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0
           1 NORMAL   DATA_0009           614300       /dev/rdsk/c6t600601607D731F00D994060263E8DD11d0s0
           1 NORMAL   DATA_0008           614290       /dev/rdsk/c6t600601607D731F00D894060263E8DD11d0s0
           1 NORMAL   DATA_0002           614290       /dev/rdsk/c6t600601607D731F006A37A4E0A0E7DD11d0s0
           1 DROPPING DATA_0001           614290       /dev/rdsk/c6t600601607D731F006937A4E0A0E7DD11d0s0
           1 DROPPING DATA_0000           614290       /dev/rdsk/c6t600601607D731F006837A4E0A0E7DD11d0s0
           1 NORMAL   DATA_0007           614290       /dev/rdsk/c6t600601607D731F007608DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0006           614290       /dev/rdsk/c6t600601607D731F007508DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0005           614290       /dev/rdsk/c6t600601607D731F007408DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0004           614290       /dev/rdsk/c6t600601607D731F007308DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0003           614290       /dev/rdsk/c6t600601607D731F007208DB3DA0E7DD11d0s0


SQL> select * from v$asm_operation;

OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES
----- ---- ---------- ---------- ---------- ---------- ---------- -----------
REBAL RUN          10         10       7198      12996        260          22
REBAL RUN          10         10       7679      12979        242          21
REBAL RUN          10         10       9286      12930        201          18
REBAL RUN          10         10      11647      12899        237           5
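
The EST_MINUTES column in these rows appears to be the remaining work divided by the current rate, rounded down. The small sketch below reproduces that arithmetic; est_minutes is a hypothetical helper, and the formula is an assumption that happens to match all four rows above rather than something taken from the Oracle documentation.

```shell
#!/bin/sh
# est_minutes SOFAR EST_WORK EST_RATE
# Prints (EST_WORK - SOFAR) / EST_RATE using integer division, i.e. the
# number of whole minutes left if the rebalance keeps its current rate.
est_minutes() {
    if [ "$3" -le 0 ]; then
        echo "EST_RATE must be positive" >&2
        return 1
    fi
    echo $(( ($2 - $1) / $3 ))
}

est_minutes 7198 12996 260    # prints 22
est_minutes 7679 12979 242    # prints 21
est_minutes 9286 12930 201    # prints 18
est_minutes 11647 12899 237   # prints 5
```

Because EST_WORK and EST_RATE are themselves re-estimated while the rebalance runs, the figure is a moving estimate and not a countdown.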

SQL> select * from v$asm_operation;

no rows selected

SQL> select group_number,state,name,total_mb,label,path from v$asm_disk;

GROUP_NUMBER STATE    NAME              TOTAL_MB LABEL PATH
------------ -------- --------------- ---------- ----- --------------------------------------------------
           0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A8833C0377FEDD11d0s0
           0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A7833C0377FEDD11d0s0
           0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A6833C0377FEDD11d0s0
           0 NORMAL                       614290       /dev/rdsk/c6t600601607D731F006937A4E0A0E7DD11d0s0
           0 NORMAL                       614290       /dev/rdsk/c6t600601607D731F006837A4E0A0E7DD11d0s0
           0 NORMAL                          924       /dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0
           1 NORMAL   DATA_0009           614300       /dev/rdsk/c6t600601607D731F00D994060263E8DD11d0s0
           1 NORMAL   DATA_0008           614290       /dev/rdsk/c6t600601607D731F00D894060263E8DD11d0s0
           1 NORMAL   DATA_0002           614290       /dev/rdsk/c6t600601607D731F006A37A4E0A0E7DD11d0s0
           1 NORMAL   DATA_0007           614290       /dev/rdsk/c6t600601607D731F007608DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0006           614290       /dev/rdsk/c6t600601607D731F007508DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0005           614290       /dev/rdsk/c6t600601607D731F007408DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0004           614290       /dev/rdsk/c6t600601607D731F007308DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0003           614290       /dev/rdsk/c6t600601607D731F007208DB3DA0E7DD11d0s0


SQL> alter diskgroup DATA add disk '/dev/rdsk/c6t600601607D731F006837A4E0A0E7DD11d0s0';

Diskgroup altered.

SQL> select group_number,state,name,total_mb,label,path from v$asm_disk;

GROUP_NUMBER STATE    NAME              TOTAL_MB LABEL PATH
------------ -------- --------------- ---------- ----- --------------------------------------------------
           0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A8833C0377FEDD11d0s0
           0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A7833C0377FEDD11d0s0
           0 NORMAL                          924       /dev/rdsk/c6t600601607D731F00A6833C0377FEDD11d0s0
           0 NORMAL                       614290       /dev/rdsk/c6t600601607D731F006937A4E0A0E7DD11d0s0
           0 NORMAL                          924       /dev/rdsk/c6t600601607D731F006596A6CD2CE9DD11d0s0
           1 NORMAL   DATA_0009           614300       /dev/rdsk/c6t600601607D731F00D994060263E8DD11d0s0
           1 NORMAL   DATA_0008           614290       /dev/rdsk/c6t600601607D731F00D894060263E8DD11d0s0
           1 NORMAL   DATA_0002           614290       /dev/rdsk/c6t600601607D731F006A37A4E0A0E7DD11d0s0
           1 NORMAL   DATA_0007           614290       /dev/rdsk/c6t600601607D731F007608DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0006           614290       /dev/rdsk/c6t600601607D731F007508DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0005           614290       /dev/rdsk/c6t600601607D731F007408DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0004           614290       /dev/rdsk/c6t600601607D731F007308DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0003           614290       /dev/rdsk/c6t600601607D731F007208DB3DA0E7DD11d0s0
           1 NORMAL   DATA_0000           614290       /dev/rdsk/c6t600601607D731F006837A4E0A0E7DD11d0s0


SQL> show parameter asm_power_limit;

NAME                            TYPE        VALUE
------------------------------- ----------- -------------------------
asm_power_limit                 integer     1

SQL> alter system set asm_power_limit=10;

System altered.

SQL> show parameter asm_power_limit;

NAME                            TYPE        VALUE
------------------------------- ----------- -------------------------
asm_power_limit                 integer     10

SQL>