Tuesday, July 31, 2007

Standby Database Using RMAN

I learn something about Oracle everyday and in the last few days the learnings have been rather painful. I have been trying to create a physical standby database on version 10.1.0.4 on HP-UX. The target is a two node RAC and the standby is a single instance database. the reason for the standby was to move to an ASM storage on a differerent server.

I will try to produce a whole cookbook on the effort; but in the meantime let me explain the pain points.

Like everyone else, I scanned the books, articles, MetaLink, the internet, the socks drawer and broom closet; but couldn't find a good write up on how to perform this task in real life.

For instance, most of the recipes suggested taking a cold backup, something that was impossible since the target is a production database.

Others recipes suggested taking a special backup of the database using RMAN with a special clause - WITH CONTROLFILECOPY AS STANDBY (or something like that; I forgot th exact syntax). Well, duh! the backup is already taken and I can't re-exeute this "special" backup. So down with Option 2.

There was no clear document on how to do it. So, here was my first attempt in a nutshell:

(1) Mount the RMAN backup filesystems on the standby server using NFS
(2) On the standby server, create the pfile
(3) start the instance (nomount)
(4) add the service to the listener
(5) add the service to the tnsnames.ora
(6) on the primary, add the tns entry for the standby
(7) On the primary issue the command

RMAN> backup current controlfile for standby format '/backup/c.ctl';

(8) create the rman file:

connect target /
connect auxiliary sys/manager@odssby
connect catalog rman/rman@catdb
run {
set until scn = 71196399221;
duplicate target database for standby
dorecover;
}

(9) and now comes the golden moment. I run it:

$ rman @r1.rman

and it fails after some time with the error:

RMAN 6024 no backup or copy of the controlfile found to restore

Well, this is wierd. The controlfile has been created on the primary as a standby controlfile. This is also recorded in the catalog. So, what's the deal with the "no backup of controlfile"?

Some Metalink articles even suggested something like

run {
restore controfile from '/backup/c.ctl';
replicate controlfile from ....
}

this could not have worked for me, since I am not creating a new database; I'm creating a standby one.

So, I delvedinto some research. I will spae you the details and cut straight to the chase. The issue was with the time when the controlfile backup was taken. In my case, here is what I did:

Time -> ----+----T1---------T2----------
Arc Log Controlfile
Backup Backup
Taken Created

This cuased the recovery to start at a point earlier to T1, where there was no controlfile backup for standby. Naturally the restore failed.

The solution: simple. Just took another backup of archived logs after time T2. Once the backups were succefully created on the disk, I ran the rman script again and voila! everything works.

The real issue is how RMAN displays messages. Instead of producing a relevant message that points to the issue, it produces a rather cryptic message that gives the impression that some controlfile backup is not present. Well, perhaps in 12g, becuase I know that is not available in 11g.

13 comments:

Sreejith said...

Hi Arup,
As part of doing a dataguard creation from RMAN backup without shutting down the production backup,I am doing a testing with a database named - TEST. The proposed standby database name is TESTSBY which resides in another server with DIFFERENT directory structure.
I am seeing similar errors as you mentioned in the blog.Can you please help me out on this ?
Here is the steps that I followed and the error.

Step 1: Take an RMAN backup of the TEST database.
rman target SYS/sree123@TEST CATALOG rman/password@rman
RMAN> backup database plus archivelog;

RMAN> backup database plus archivelog;
Step 2: create a standby controlfile.

RMAN> backup current controlfile for standby;

Starting backup at 17-APR-09
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including standby control file in backupset
channel ORA_DISK_1: starting piece 1 at 17-APR-09
channel ORA_DISK_1: finished piece 1 at 17-APR-09
piece handle=/backups/oracle/TEST/TEST_1_684413087 tag=TAG20090417T104447 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
Finished backup at 17-APR-09


RMAN> SQL 'ALTER SYSTEM ARCHIVE LOG CURRENT';

RMAN> BACKUP ARCHIVELOG ALL NOT BACKED UP 1 TIMES;
Starting backup at 17-APR-09
current log archived
using channel ORA_DISK_1
skipping archive log file /oracle/archive_logs/TEST/1_75_684327646.dbf; already backed up 1 time(s)
skipping archive log file /oracle/archive_logs/TEST/1_76_684327646.dbf; already backed up 1 time(s)
skipping archive log file /oracle/archive_logs/TEST/1_77_684327646.dbf; already backed up 1 time(s)
skipping archive log file /oracle/archive_logs/TEST/1_78_684327646.dbf; already backed up 1 time(s)
channel ORA_DISK_1: starting archive log backupset
channel ORA_DISK_1: specifying archive log(s) in backup set
input archive log thread=1 sequence=79 recid=79 stamp=684413147
input archive log thread=1 sequence=80 recid=80 stamp=684413158
channel ORA_DISK_1: starting piece 1 at 17-APR-09
channel ORA_DISK_1: finished piece 1 at 17-APR-09
piece handle=/backups/oracle/TEST/TEST_1_684413160 tag=TAG20090417T104600 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
Finished backup at 17-APR-09

Step 3: Perform a few log switches in the Primary and record the last log sequence number.

SQL> alter system switch logfile;
SQL> alter system switch logfile;

Step 4: Build standby database directory structures in the destination server.

mkdir /oracle/oradata/TESTSBY
mkdir /logs/oracle/TESTSBY
mkdir /logs/oracle/TESTSBY/udump
mkdir /logs/oracle/TESTSBY/bdump
mkdir /logs/oracle/TESTSBY/cdump
mkdir /oracle/archive_logs/TESTSBY

Step 5: Create pfile from spfile in TEST and transfer the pfile to destination server.

$ . oraenv
ORACLE_SID = [WEBCTLVE] ? TESTSBY
ORACLE_HOME = [/oracle] ? /oracle/10g
$ cd $ORACLE_HOME/dbs
$ scp oracle@diver:/oracle/10g/dbs/initTEST.ora .


Make all necessary changes to the pfile for standby database.

background_dump_dest='/logs/oracle/TESTSBY/bdump'
control_files='/oracle/oradata/TESTSBY/ctrl_01.ctl','/oracle/oradata/TESTSBY/ctrl_02.ctl'
core_dump_dest='/logs/oracle/TESTSBY/cdump'
DB_UNIQUE_NAME= TESTSBY
db_name='TEST' no change for this parameter
instance_name='TESTSBY'

#log_archive_dest_1='' comment this
#log_archive_dest='/oracle/archive_logs/TEST' comment this

log_archive_format='%t_%s_%r.dbf' no change for this parameter

log_archive_dest_2='SERVICE=TEST VALID_FOR=(ONLINE_LOGFILES, ALL_ROLES) DB_UNIQUE_NAME=TEST'

log_archive_dest_state_2='DEFER'
log_archive_dest_1='LOCATION=/oracle/archive_logs/TESTSBY '
remote_archive_enable='true'
fal_server='TEST'
fal_client='TESTSBY'
log_file_name_convert=('/oracle/oradata/TEST/', '/oracle/oradata/TESTSBY/')
db_file_name_convert=('/oracle/oradata/TEST/', '/oracle/oradata/TESTSBY/')


Step 6: Configure network files.

Listener.ora in standby server has to be updated as shown below:

(SID_DESC =
(GLOBAL_DBNAME = TESTSBY)
(ORACLE_HOME = /oracle/10g)
(SID_NAME = TESTSBY)
)

tnsnames.ora on primary node should have an entry like this:

TESTSBY =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = xxx.xx.uk)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = TESTSBY)
)
)

LSNRCTL>RELOAD

Step 6: Create the password file for standby database.
cd $ORACLE_HOME/dbs
orapwd file=orapwTESTSBY password=sree123 entries=10

Step 7: Startup the standby instance in NOMOUNT stage.


sqlplus “/ as sysdba”
SQL> startup nomount

$ sqlplus "/ as sysdba"

SQL*Plus: Release 10.2.0.3.0 - Production on Thu Apr 16 14:52:12 2009

Copyright (c) 1982, 2006, Oracle. All Rights Reserved.

Connected to an idle instance.

SQL> startup nomount
ORA-32004: obsolete and/or deprecated parameter(s) specified
ORACLE instance started.

Total System Global Area 188743680 bytes
Fixed Size 2028784 bytes
Variable Size 100666128 bytes
Database Buffers 79691776 bytes
Redo Buffers 6356992 bytes

Step 7: Duplicate the database by connecting to target,auxiliary and catalog.

$ rman target SYS/sree123@TEST CATALOG rman/password@rman auxiliary /

Recovery Manager: Release 10.2.0.3.0 - Production on Fri Apr 17 10:47:25 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

connected to target database: TEST (DBID=1982474078)
connected to recovery catalog database
connected to auxiliary database: TEST (not mounted)

RMAN> DUPLICATE TARGET DATABASE FOR STANDBY DORECOVER;

Starting Duplicate Db at 17-APR-09
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: sid=485 devtype=DISK

contents of Memory Script:
{
set until scn 207869;
restore clone standby controlfile;
sql clone 'alter database mount standby database';
}
executing Memory Script

executing command: SET until clause

Starting restore at 17-APR-09
using channel ORA_AUX_DISK_1

channel ORA_AUX_DISK_1: starting datafile backupset restore
channel ORA_AUX_DISK_1: restoring control file
channel ORA_AUX_DISK_1: reading from backup piece /backups/oracle/TEST/TEST_1_684413087
ORA-19870: error reading backup piece /backups/oracle/TEST/TEST_1_684413087
ORA-19505: failed to identify file "/backups/oracle/TEST/TEST_1_684413087"
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
failover to previous backup

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 04/17/2009 10:47:47
RMAN-03015: error occurred in stored script Memory Script
RMAN-06026: some targets not found - aborting restore
RMAN-06024: no backup or copy of the control file found to restore

Arup said...

Sreejith,

Your error message is:

ORA-19505: failed to identify file "/backups/oracle/TEST/TEST_1_684413087"

This indicates that the backup copies (and archived logs, perhaps as well) have not been copied to the standby server. I don't see in your procedure where you peform that task. Thagt may be the error.

Sreejith said...

Hi Arup,
I could fix the error by copying the Backup files to Standby server.
I was under the assumption that the file copy and restore will be handled by RMAN as I am connecting to target database, the catalog and the auxiliary database from RMAN prompt.I could understand that only the restore is handled by RMAN and not the Copy.
Thank you very much for the feedback.
Sreejith

ninad said...
This comment has been removed by the author.
ninad said...

Hi Sreejith,

This is ninad here. Attempt to create standby database might have gone failed because the path may not be accessible by standby server. So, possible remedies:

1. Create backup on shared disk accessible by both servers.

2. Copy backup to standby server on same directory structure as on primary and catalog that backup in RMAN.

thanks,
Ninad Shah

ninad said...

Hi Arup,

I have even created standby database with same steps, but upon creation subsequent operation are not being carried out successfully.

e.g. Log switching, Archive shipping, managed recovery process starting.

Any corrective measures for same??

Awaiting kindly response for same.

Thanks & Regds,
Ninad shah

vinay singh said...
This comment has been removed by the author.
vinay singh said...

Hi Arup,
sorry but i didn't exactly understood the T1 -T2 thing.Do I need to take the backup of archivelogs after taking the database
database backup? I tried that too.

I am trying to create a single node standby(on filesystem) for a 2 node cluster database(which is using ASM).
Below are the steps i am doing,but I am getting the same

RMAN-06024: no backup or copy of the control file found to restore error.

DB version is 11.1.0.6 .Host is unbreakable Linux 32-bit.
steps:
AT primary
export ORACLE_SID=racdb1
rman target / catalog rman@racdb (racdb is my rac database and catalog too)
backup current controlfile for standby format '/oracle/dumpdr/racdb_stby_01.ctl';
backup database format '/oracle/dumpdr/racdb_%s' plus archive log format '/oracle/dumpdr/arch_%s';
exit

cd /oracle/dumpdr
scp * oracle@rac3:/oracle/dumpdr/

rman target / catalog rman@racdb auxiliary sys@racdr (racdr is tns for standby db)
duplicate target for standby nofilenamecheck dorecover;
(Tried allocating auxiliary channel manually too. )
--------------------------

AT standby:

Created the pfile and started the database in nomount state.


I am using db_file_name_convert and log_file_name_convert in standby pfile.
Am i missing something? Please help.

Regards,
Vinay Singh

tarakan said...

looks nice. but have you ever heard about another way of repair access 2007 database, provided by appropriate data recovery services?

Mutlu said...

As far as I know, you have to have Enterprise Edition for managed Standby Database. Otherwise, you need to manually copy archivelogs and apply them.

vinay singh said...

Hi ,
I am using the enterprise edition and i have been able to create managed standby normally for my other non-rac databases.this is only for this case,i am getting the error.
i suspect this might be because i am using the rac db as a catalog too.

Benjamin Bell said...

Good job now i m doing work on oracle and this help me in different steps when i face any issue i visit this blog and my confusion release.thank keep it up.

Hari Thanneru said...

hai specify nofilenamecheck at the end of dulicate command.this is just my assumption.because standby is in differnt location

Translate