11gR2 – having a second OCR file on a separate diskgroup

 

In Oracle Database versions prior to 11gR2 RAC, the OUI had a screen where we could specify multiple OCR and/or voting disks. Starting with 11gR2, because the OCR and voting disks are stored on ASM storage, there is no option to specify multiple OCR files or, for that matter, multiple voting disks. Do we still need multiple copies of the OCR and voting disks? With these files stored on ASM, we can take advantage of the ASM mirroring options: normal redundancy (two copies of the file), high redundancy (three copies of the file), or external redundancy (where ASM does not manage redundant copies of these files; instead, redundancy is maintained/protected at the storage level by mirroring disks).
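As a quick sketch of what those options look like (the diskgroup names and disk paths below are placeholders for illustration, not part of this setup), the redundancy level is fixed when the diskgroup is created:

```sql
-- Normal redundancy: ASM keeps two copies of each file
-- (requires at least two failure groups).
CREATE DISKGROUP GRID_NORM NORMAL REDUNDANCY
  FAILGROUP fg1 DISK '<disk path 1>'
  FAILGROUP fg2 DISK '<disk path 2>';

-- High redundancy: three copies (requires at least three failure groups).
CREATE DISKGROUP GRID_HIGH HIGH REDUNDANCY
  FAILGROUP fg1 DISK '<disk path 1>'
  FAILGROUP fg2 DISK '<disk path 2>'
  FAILGROUP fg3 DISK '<disk path 3>';

-- External redundancy: ASM keeps a single copy; mirroring, if any,
-- is the storage array's responsibility.
CREATE DISKGROUP GRID_EXT EXTERNAL REDUNDANCY DISK '<disk path>';
```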

If normal redundancy or high redundancy was selected when creating the disk group that stores the OCR/voting disks, the files are automatically mirrored accordingly.

Now what if we do not want to use ASM mirroring, but would like the clusterware to maintain two or more copies of the OCR file on physically different diskgroups? Oracle supports this option, not during Grid Infrastructure installation but after the installation is complete, using the ocrconfig utility. Let's go through one such scenario.

OCR is a critical component of the RAC architecture. From the point the clusterware starts (on server start/reboot), the OCR is consulted for the placement of all resources running on the cluster, including the database, listener, ASM, database services, and so on. The OCR contains all the rules for high availability of these resources, and the clusterware uses these definitions to place resources when a server or instance crashes. In 11gR2 the OCR also contains additional information such as server pool definitions.

When the clusterware starts, it determines the location of the OCR file from the /etc/oracle/ocr.loc file. The following steps add a second OCR location on a separate diskgroup:

1. Request that the storage administrators create a new LUN of the same size as the LUN that currently hosts the OCR and voting disks for the 11gR2 cluster.

2. Connect to the ASM instance on one of the database servers, create a new disk group, and mount it on all instances in the cluster using the following syntax:

CREATE DISKGROUP PRD_GRID2 EXTERNAL REDUNDANCY DISK '<disk path>';

ALTER DISKGROUP PRD_GRID2 MOUNT;

3. Once the diskgroup has been mounted, we can configure it for OCR. This configuration requires root privileges.

Connect to the server as root and execute the ocrconfig command from the $GRID_HOME/bin directory:

[root@prddb3 bin]# ./ocrconfig -add +PRD_GRID2

Note: if the disk group does not have the required compatibility attributes, you can get the following error. I ran into this error while configuring OCR on the PRD_GRID2 diskgroup.

PROT-30: The Oracle Cluster Registry location to be added is not accessible.

An ocrconfig log file in the GRID_HOME/log/prddb3/client directory also reports the following errors:

[root@prddb3 bin]# cat /app/grid/product/11.2.0/log/prddb3/client/ocrconfig_4000.log
Oracle Database 11g Clusterware Release 11.2.0.1.0 - Production Copyright 1996, 2009 Oracle. All rights reserved.
2010-07-10 20:38:18.472: [ OCRCONF][2833243664]ocrconfig starts...
2010-07-10 20:38:23.140: [  OCRCLI][2833243664]proac_replace_dev:[+PRD_GRID2]: Failed. Retval [8]
2010-07-10 20:38:23.140: [  OCRAPI][2833243664]procr_replace_dev: failed to replace device (8)
2010-07-10 20:38:23.140: [ OCRCONF][2833243664]The new OCR device [+PRD_GRID2] cannot be opened
2010-07-10 20:38:23.140: [ OCRCONF][2833243664]Exiting [status=failed]... 

What does this error mean? By default, when you create a diskgroup in 11gR2 RAC from the command line using SQL*Plus, the compatibility attribute of the diskgroup is set to 10.1. This issue does not occur if you create the diskgroup through ASMCA.

[oracle@prddb3]$ sqlplus / as sysasm

SQL> SELECT NAME, COMPATIBILITY, DATABASE_COMPATIBILITY, VOTING_FILES FROM v$asm_diskgroup;

NAME         COMPATIBILITY        DATABASE_COMPATIBILI VOT
------------ -------------------- -------------------- ---
PRD_DATA     11.2.0.0.0           11.2.0.0.0           N
PRD_FRA      11.2.0.0.0           11.2.0.0.0           N
PRD_GRID1    11.2.0.0.0           11.2.0.0.0           N
PRD_GRID2    10.1.0.0.0           10.1.0.0.0           N

Note: The ASM compatibility and the database (RDBMS) compatibility default to 10.1. These need to be changed to 11.2 in order for the clusterware to recognize this as an 11gR2 ASM configuration.

4. Change the compatibility of the new diskgroup to 11.2 as follows:

ALTER DISKGROUP PRD_GRID2 SET ATTRIBUTE 'COMPATIBLE.ASM'='11.2';

ALTER DISKGROUP PRD_GRID2 SET ATTRIBUTE 'COMPATIBLE.RDBMS'='11.2';

5. These commands change the compatibility levels for the PRD_GRID2 diskgroup; the new values can be verified using the following query:

SQL> COL NAME FORMAT A12
SQL> COL COMPATIBILITY FORMAT A20
SQL> COL DATABASE_COMPATIBILITY FORMAT A20

SQL> SELECT NAME,COMPATIBILITY,DATABASE_COMPATIBILITY from v$asm_diskgroup; 

NAME         COMPATIBILITY        DATABASE_COMPATIBILI
------------ -------------------- --------------------
PRD_DATA     11.2.0.0.0           11.2.0.0.0
PRD_FRA      11.2.0.0.0           11.2.0.0.0
PRD_GRID1    11.2.0.0.0           11.2.0.0.0
PRD_GRID2    11.2.0.0.0           11.2.0.0.0 

6. Once the settings have been verified, attempt again to configure the diskgroup for OCR. This configuration requires root privileges. Connect to the server as root and execute the ocrconfig command from the $GRID_HOME/bin directory:

[root@prddb3 bin]# ./ocrconfig -add +PRD_GRID2

7. Verify that the new OCR location has been added and that the /etc/oracle/ocr.loc file has been updated with the new information:

[root@prddb3 bin]# cat /etc/oracle/ocr.loc
#Device/file  getting replaced by device +PRD_GRID2
ocrconfig_loc=+PRD_GRID1
ocrmirrorconfig_loc=+PRD_GRID2
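Since ocr.loc is a plain key=value file, the configured locations can be pulled out with standard tools. A minimal sketch, using a scratch copy of the contents shown above (the real file lives at /etc/oracle/ocr.loc and typically needs root to read):

```shell
# Recreate the ocr.loc contents shown above in a scratch file.
cat > /tmp/ocr.loc <<'EOF'
#Device/file  getting replaced by device +PRD_GRID2
ocrconfig_loc=+PRD_GRID1
ocrmirrorconfig_loc=+PRD_GRID2
EOF

# List the configured OCR locations, skipping the comment line.
grep -v '^#' /tmp/ocr.loc | cut -d= -f2
```

On a live system, the same one-liner against /etc/oracle/ocr.loc shows which diskgroups the clusterware will consult at startup.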

8. The ocrcheck command now reflects two OCR locations:

[root@prddb3 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3972
         Available space (kbytes) :     258148
         ID                       : 1288159793
         Device/File Name         : +PRD_GRID1
                                    Device/File integrity check succeeded
         Device/File Name         : +PRD_GRID2
                                    Device/File integrity check succeeded 

                                    Device/File not configured 

                                    Device/File not configured 

                                    Device/File not configured 

         Cluster registry integrity check succeeded 

         Logical corruption check succeeded 

9. Apart from the standard utilities used to verify the integrity of the OCR file, the $GRID_HOME/log/prddb3/client directory also contains logs confirming that the change was successful:

[root@prddb3 bin]# cat /app/grid/product/11.2.0/log/prddb3/client/ocrconfig_25560.log
Oracle Database 11g Clusterware Release 11.2.0.1.0 - Production Copyright 1996, 2009 Oracle. All rights reserved.
2010-07-10 21:01:00.652: [ OCRCONF][4224605712]ocrconfig starts...
2010-07-10 21:01:13.593: [ OCRCONF][4224605712]Successfully replaced OCR and set block 0
2010-07-10 21:01:13.593: [ OCRCONF][4224605712]Exiting [status=success]... 

10. The clusterware alert log also has an entry indicating the successful addition of the OCR location:

/app/grid/product/11.2.0/log/prddb3/alertprddb3.log
[crsd(27512)]CRS-1007:The OCR/OCR mirror location was replaced by +PRD_GRID2.

11. Let's check that the physical files can be found on the ASM storage. Set ORACLE_SID to the ASM instance on the server; using asmcmd, the following can be found:

ASMCMD> pwd
+PRD_GRID2/PRDCW/OCRFILE
ASMCMD> ls -lt
Type     Redund  Striped  Time             Sys  Name
OCRFILE  UNPROT  COARSE   JUL 10 21:00:00  Y    REGISTRY.255.715795989 

ASMCMD> cd +PRD_GRID1/PRDCW/OCRFILE
ASMCMD> ls -lt
Type     Redund  Striped  Time             Sys  Name
OCRFILE  UNPROT  COARSE   JUL 10 21:00:00  Y    REGISTRY.255.724021255 

About Murali Vallath
Murali Vallath has over 20 years of experience designing and developing databases. He provides independent Oracle consulting services focusing on the design and performance tuning of Oracle databases through Summersky Enterprises (www.summersky.biz). Vallath has completed over 100 successful small, medium, and terabyte-sized RAC implementations (Oracle 9i, Oracle 10g, and Oracle 11gR2) for reputed corporate firms. He is a regular speaker at industry conferences and user groups, including Oracle OpenWorld, UKOUG, and IOUG, on RAC and Oracle RDBMS performance tuning topics. Publications: author of 'Oracle Real Application Clusters' (Digital Press) and 'Oracle 10g RAC, Grid, Services & Clustering' (Digital Press); co-author of 'Automatic Storage Management' (Oracle Press).

16 Responses to 11gR2 – having a second OCR file on a separate diskgroup

  1. Saroj Mohapatra says:

    I am curious as to how we could take a similar approach for the voting disks.

1. 11gR2 allows 3 VDs if the ASM diskgroup has normal redundancy. Now if you would like the mirrors to be stored across two storage devices, it's tough, since one storage has to keep 2 voting disks. If that storage fails, since we'd lose the majority of the VDs (2 out of 3), the cluster would go down. It did go down in my test (after about 1 hour 45 minutes) on the loss of the storage that held 2 VDs.

2. If we use an ASM diskgroup with external redundancy, 11gR2 creates one VD. Now, how do we mirror this VD across two storage devices? I have to try add css votedisk with -force, but what do you think of this option?

    Appreciate your response.

    • Hi Saroj

Have you tried option 1 that you mentioned for setting up the VDs? Normal redundancy and high redundancy definitions are different for diskgroups used by the Grid Infrastructure versus diskgroups for the database. I think we need three LUNs for normal redundancy and five disks for high redundancy when creating diskgroups for the Grid Infrastructure.

I am working on another topic on this very subject.. please stand by.

    • Ashish says:

      Hi,
I am trying to add a votedisk on an 11gR2 ASM RAC and it failed.
This is the command I executed: sudo /u01/app/11.2.0/grid/bin/crsctl add votedisk +DATAGRP02
As mentioned in the Oracle doc, #./crsctl add votedisk path_name, it seems
to expect the full path to the device location.
In my case: #./crsctl add votedisk /asmdisks/disk9
So far I have not found any documentation on what the exact syntax would be for what Murali's test case showed. Any support would be appreciated. Thanks,
Saroj,
You may remember me, Ashish Desai, Daylight Saving Patch at Dow Jones.

      -Ashish

  2. Saroj Mohapatra says:

    Murali,
Have you tested the viability of the OCR mirror in the event of corruption of the first OCR diskgroup? I zeroed out the first OCR LUN and the whole cluster crashed. The creation of the mirror is fine, but it doesn't help in the event of the first OCR disk failing. Just wanted to send you this update.

    Thanks.
    Saroj

    • Hi Saroj

Could it be because you did this manually? The clusterware looks up the OCR diskgroups in the /etc/oracle/ocr.loc file.
      [root@prddb3 bin]# cat /etc/oracle/ocr.loc
      #Device/file getting replaced by device +PRD_GRID2
      ocrconfig_loc=+PRD_GRID1
      ocrmirrorconfig_loc=+PRD_GRID2

I think if this corruption had been detected by the clusterware, it would update the ocr.loc file like this:
      #Device/file getting replaced by device +PRD_GRID2
      # ocrconfig_loc=+PRD_GRID1 being deleted
      ocrmirrorconfig_loc=+PRD_GRID2

      I have a discussion on this topic here (not with ASM diskgroups but with raw devices in 10gR2) http://mvallath.wordpress.com/2010/06/02/ocr-repair-yet-another-scenario/
      —————-
The reason the cluster crashed with a dd on the diskgroup is that the dd also wiped out the VD that resides on that diskgroup. OCR corruption does not mean that the entire diskgroup is bad, just that the OCR file is bad.
      ——————

  3. Sarang says:

    Hi Saroj/Murali,
We have a similar issue while adding a voting disk. Grid Infrastructure created a single voting disk, as we used external redundancy (the disk is XP storage with dual paths). Now we have created a separate disk group with normal redundancy and are trying to move the voting disk to the new disk group, but we are getting the following error. Have you added a voting disk successfully to a normal redundancy disk group?

    $ crsctl query css votedisk
    ## STATE File Universal Id File Name Disk group
    — —– —————– ——— ———
    1. ONLINE 931c5c07b9154f93bfb338c5ee156e3d (/dev/rdisk/disk727) [REP_DATA]
    Located 1 voting disk(s).

    $ crsctl replace votedisk +OCR_VOT1

    Failed to create voting files on disk group OCR_VOT1.
    Change to configuration failed, but was successfully rolled back.
    CRS-4000: Command Replace failed, or completed with errors.

Add did not work either, but this was expected:

    $ crsctl add css votedisk ‘+OCR_VOT1’
    CRS-4671: This command is not supported for ASM diskgroups.
    CRS-4000: Command Add failed, or completed with errors.

  4. Pingback: 11gR2 – OCR /Voting Disk Redundancy…. POC – Method I « Summersky RAC Notebook

  5. Sarang says:

Issue resolved.

  6. Ashish says:

    Hi Sarang,

Can you share how you resolved it? I am in the same situation as you were.
    Thanks,
    -Ashish

  7. DanyC says:

    Hi Murali,

I’m having the same issue with my 11gR2 two-node RAC described below by Sarang.

    “11gR2 allows 3 VD if the ASM diskgroup has normal redundancy. Now if you like mirrors to be stored across two storage devices, it’s tough since one storage has to keep 2 voting disks. If that storage fails, since we’d lose majority of the VD (2 out of 3), the cluster would go down”

    Do i have to store 2 voting disks on 1st storage and 3 voting disks on 2nd storage?

    Appreciate if you can reply.

    Thx,
    Dani

    • Hi

There are two methods of creating mirrored OCR copies. The easy approach is to use redundancy at the ASM level: when you create the OCR and voting disk, the clusterware will automatically maintain redundant copies of both. Under this option there is only one diskgroup, created with the NORMAL REDUNDANCY clause, and ocrcheck will show only one diskgroup; however, if you check the files on this diskgroup using ASMCMD, you will see multiple OCR and voting disk files.

The second method is to add a second diskgroup as an additional OCR location. This is also possible and, similar to 10gR2, Oracle will keep redundant copies of the OCR file: if you have two diskgroups, you have two copies. Please note this is NOT ASM mirroring; it is two external redundancy diskgroups used to maintain two identical copies of the OCR file. The drawback of this approach is that only copies of the OCR file are maintained by the clusterware; voting disk copies are not. Here you add the second diskgroup using the ocrconfig command, and ocrcheck will show the diskgroups like this...
      ——————————————————–
      $GRID_HOME/bin/ocrcheck
      Status of Oracle Cluster Registry is as follows :
      Version : 3
      Total space (kbytes) : 262120
      Used space (kbytes) : 3368
      Available space (kbytes) : 258752
      ID : 1766525077
      Device/File Name : +DEV_GRID1
      Device/File integrity check succeeded
      Device/File Name : +DEV_GRID2
      Device/File integrity check succeeded
      Device/File not configured
      Device/File not configured
      Device/File not configured
      Cluster registry integrity check succeeded
      Logical corruption check bypassed due to non-privileged user
      ————————————————–

  8. DanyC says:

    Hi,

Although I’m using the ASM redundancy level (normal redundancy), it doesn’t work well, since I have 2 storages where 2 VDs are stored on the 1st one and 1 VD on the 2nd. The problem is that if I lose the 1st storage I lose 2 VDs, and CRS needs half + 1 of my VDs to be able to start; otherwise I get this error:

    CRS-1656:The CSS daemon is terminating due to a fatal error

    So do you have any other ideas?

    Thanks,
    Dani

  9. Asif Momen says:

    There’s a typo. It should be ‘COMPATIBLE.ASM’ & ‘COMPATIBLE.RDBMS’ instead of ‘COMPATIBILITY.ASM’ & ‘COMPATIBILITY.RDBMS’ respectively in the following commands.

    ALTER DISKGROUP PRD_GRID2 SET ATTRIBUTE ‘COMPATIBILITY.ASM’=’11.2’;
    ALTER DISKGROUP PRD_GRID2 SET ATTRIBUTE ‘COMPATIBILITY.RDBMS’=’11.2’;

  10. Samer says:

    Thank you Murali… it is a helpful note…
