Managing your Oracle RAC Voting Disks

The voting disk records node membership information. A node must be able to access more than half of the voting disks at any time. To avoid simultaneous loss of multiple voting disks, each voting disk should be on a storage device that does not share any components (controller, interconnect, and so on) with the storage devices used for the other voting disks. If a node cannot access the minimum required number of voting disks, it is evicted (removed) from the cluster.
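The majority rule above is simple arithmetic. As a minimal sketch (the function names are illustrative and not part of any Oracle tool), the following shell functions compute how many voting disks a node must reach and how many can be lost for a given configuration:

```shell
#!/bin/sh
# Illustrative helpers only; these names are not Oracle utilities.

# Smallest number of voting disks a node must access: strictly more than half.
min_accessible_votedisks() {
    echo $(( $1 / 2 + 1 ))
}

# Number of voting disks that can become inaccessible while a node
# still sees a majority.
tolerated_votedisk_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for total in 1 3 5; do
    echo "$total voting disk(s): need $(min_accessible_votedisks "$total"), can lose $(tolerated_votedisk_failures "$total")"
done
```

With 3 voting disks a node needs 2 and can lose 1; with 4 it needs 3 and can still lose only 1, which is why an odd number of voting disks is the usual choice.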

Proper management of voting disks therefore makes recovery much easier when a disaster strikes.

Given below are the most common tasks associated with Voting Disk management in an Oracle RAC environment.

1. Obtaining Voting Disk Information

$ crsctl query css votedisk

This lists all the voting disks currently in use by your cluster.

2. Adding and Removing Voting Disks

To add or remove a voting disk, first shut down Oracle Clusterware on all nodes, then run the following commands as the root user, where path is the fully qualified path of the voting disk to add or remove.

To add a voting disk: # crsctl add css votedisk path

To remove a voting disk: # crsctl delete css votedisk path

Caution: If you use the -force option to add or remove a voting disk while the Oracle Clusterware stack is active, you can corrupt your cluster configuration.

Note: If your cluster is down, you can use the -force option with either of these commands to modify the voting disk configuration without interacting with active Oracle Clusterware daemons.

3. Backing up Voting Disks

As a best practice, back up the voting disks at the following times:

· After installation

· After adding nodes to or deleting nodes from the cluster

· After performing voting disk add or delete operations

To make a backup copy of the voting disk, use the dd command. Perform this operation on every voting disk as needed, where voting_disk_name is the name of the active voting disk and backup_file_name is the name of the file to which you want to back up the voting disk contents:

$ dd if=voting_disk_name of=backup_file_name

If your voting disk is stored on a raw device, use the device name in place of voting_disk_name. For example:

$ dd if=/dev/sdd1 of=/tmp/vd1_.dmp

When you use the dd command for making backups of the voting disk, the backup can be performed while the Cluster Ready Services (CRS) process is active; you do not need to stop the crsd.bin process before taking a backup of the voting disk.
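Because the dd command has to be repeated for every voting disk, a small wrapper can help. The following is a sketch under assumptions, not an Oracle-supplied script: the function name backup_votedisk, the timestamped file naming scheme, and the 4096-byte block size are all choices made here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: copy one voting disk to a timestamped backup file.
# Usage: backup_votedisk <voting_disk_name> <backup_directory>
backup_votedisk() {
    src=$1
    dest_dir=$2

    # Refuse to run if the source is unreadable or the target directory is missing.
    [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
    [ -d "$dest_dir" ] || { echo "no such directory: $dest_dir" >&2; return 1; }

    dest="$dest_dir/$(basename "$src")_$(date +%Y%m%d_%H%M%S).dmp"

    # Same dd invocation as in the article, with an explicit block size
    # (the block size is an assumption, not a requirement).
    dd if="$src" of="$dest" bs=4096 2>/dev/null || return 1

    # Treat an empty backup file as a failure.
    [ -s "$dest" ] || return 1

    # Print the backup file name so the caller can record it.
    echo "$dest"
}

# Example (paths are placeholders):
#   backup_votedisk /dev/sdd1 /tmp
```

Calling the function once per voting disk path gives one timestamped backup file per disk, which makes it easier to tell which backup was taken after which cluster change.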

4. Recovering Voting Disks

If a voting disk is damaged and no longer usable by Oracle Clusterware, you can recover it if you have a backup file. Run the following command to recover a voting disk, where backup_file_name is the name of the voting disk backup file and voting_disk_name is the name of the active voting disk:

$ dd if=backup_file_name of=voting_disk_name

Needless to say, keep your voting disk backups safe.

RMAN ‘PREVIEW’ and ‘VALIDATE’: Ensuring we have proper backups to recover a database

The ability to foresee a disaster helps us plan for it, and the ability to foresee our preparedness for a disaster can literally make the difference between survival and extinction.

RMAN’s PREVIEW and VALIDATE commands, if used on a regular basis (daily), are immensely useful for telling in advance whether a recovery will actually succeed, which gives us time to fix likely causes of failure.

PREVIEW

Description: The PREVIEW option of the restore command helps you identify all the backups required for the specified restore command (see the examples in the usage section). PREVIEW displays a list of all available backups needed for a restore operation. If used with the SUMMARY option, it produces a summary report for the restore operation.

Usage:

RMAN> restore database preview;

RMAN> restore database from tag FULL_BKP preview;

RMAN> restore datafile 1, 2 preview;

RMAN> restore archivelog all preview summary;

RMAN> restore archivelog from time 'sysdate - 1/24' preview summary;

RMAN> restore archivelog from scn 25 preview summary;

VALIDATE

Description: VALIDATE can be used as an option on the restore command or as a command by itself (usage given below). The purpose of RMAN validation is to check for physical (structural) block corruption and missing backup sets. By default, ‘validate’ checks for structural corruption only, but it can also identify logical corruption if you specify the CHECK LOGICAL clause on the RESTORE or VALIDATE command.

Usage:

RMAN> restore database validate check logical;

RMAN> restore database validate;

RMAN> restore database from tag FULL_BKP validate;

RMAN> restore datafile 1 validate;

RMAN> restore archivelog all validate;

RMAN> restore controlfile validate;

RMAN> restore tablespace users validate;

As a command by itself:

RMAN> validate backupset 112 check logical;

Use the list backup; command at the RMAN prompt to obtain the backup set key (112 above).

RMAN> validate database check logical;

RMAN> validate database;
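To run such checks on a schedule, the commands can be placed in a command file and passed to the RMAN client. The following is a sketch under assumptions: the function names and file paths are invented for this example, operating system authentication (target /) is assumed to work, and the three commands chosen are just one possible subset of those listed above.

```shell
#!/bin/sh
# Hypothetical wrapper for a daily restore check; names and paths are examples.

# Write a few of the RMAN commands from the usage lists to a command file.
write_rman_check() {
    cat > "$1" <<'EOF'
restore database preview;
restore database validate check logical;
restore controlfile validate;
EOF
}

# Run the command file. Assumes the rman client is on PATH and that
# "target /" (operating system authentication) connects to the database.
run_rman_check() {
    cmdfile=$1
    logfile=$2
    write_rman_check "$cmdfile" || return 1
    rman target / cmdfile="$cmdfile" log="$logfile"
}

# Example, e.g. from cron (paths are placeholders):
#   run_rman_check /tmp/restore_check.rman /tmp/restore_check.log || echo "restore check failed"
```

Keeping the commands in a separate file means the list can be adjusted (for example, to add a tablespace or archivelog check) without touching the wrapper itself.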

All relevant information will be appended to this article as and when it is encountered. Have fun.

Recovering from Lost Online Redo Logs in NOARCHIVELOG mode

The title above suggests a situation in which data loss is inevitable. However, if you are running your database in NOARCHIVELOG mode, it is more or less understood that transactional data is not what runs your business. You are instead more bent on keeping your database alive for as long as possible, with all the (static) data already written to the datafiles intact. Here is what you can do.

Scenarios:

a. You have lost one or all of your online redo logs.

b. One or all of your redo logs are corrupt.

Example:

SQL> startup

ORACLE instance started.

Total System Global Area 293601280 bytes

Fixed Size 1248600 bytes

Variable Size 79692456 bytes

Database Buffers 205520896 bytes

Redo Buffers 7139328 bytes

Database mounted.

ORA-00313: open failed for members of log group 3 of thread 1

ORA-00312: online log 3 thread 1: 'E:\ORACLE10.2\PRODUCT\10.2.0\ORADATA\TESTDB2\REDO03.LOG'

Recovery Procedure:

1. Start up the database in MOUNT mode:

SQL> startup mount

ORACLE instance started.

Total System Global Area 293601280 bytes

Fixed Size 1248600 bytes

Variable Size 79692456 bytes

Database Buffers 205520896 bytes

Redo Buffers 7139328 bytes

Database mounted.

2. Fake a database recovery:

This is required before the database can be opened with the RESETLOGS option.

SQL> recover database until cancel;

Media recovery complete.

3. Open the database with the RESETLOGS option:

This recreates any missing or corrupted online redo logs.

SQL> alter database open resetlogs;

Database altered.

You will, however, lose any transaction data that existed only in the lost redo logs and had not yet been written to the datafiles. Hence it is always a better bet to run your database in ARCHIVELOG mode.