Tuesday, July 31, 2007

Standby Database Using RMAN

I learn something about Oracle everyday and in the last few days the learnings have been rather painful. I have been trying to create a physical standby database on version 10.1.0.4 on HP-UX. The target is a two node RAC and the standby is a single instance database. the reason for the standby was to move to an ASM storage on a differerent server.

I will try to produce a whole cookbook on the effort; but in the meantime let me explain the pain points.

Like everyone else, I scanned the books, articles, MetaLink, the internet, the socks drawer and broom closet; but couldn't find a good write up on how to perform this task in real life.

For instance, most of the recipes suggested taking a cold backup, something that was impossible since the target is a production database.

Others recipes suggested taking a special backup of the database using RMAN with a special clause - WITH CONTROLFILECOPY AS STANDBY (or something like that; I forgot th exact syntax). Well, duh! the backup is already taken and I can't re-exeute this "special" backup. So down with Option 2.

There was no clear document on how to do it. So, here was my first attempt in a nutshell:

(1) Mount the RMAN backup filesystems on the standby server using NFS
(2) On the standby server, create the pfile
(3) start the instance (nomount)
(4) add the service to the listener
(5) add the service to the tnsnames.ora
(6) on the primary, add the tns entry for the standby
(7) On the primary issue the command

RMAN> backup current controlfile for standby format '/backup/c.ctl';

(8) create the rman file:

connect target /
connect auxiliary sys/manager@odssby
connect catalog rman/rman@catdb
run {
set until scn = 71196399221;
duplicate target database for standby
dorecover;
}

(9) and now comes the golden moment. I run it:

$ rman @r1.rman

and it fails after some time with the error:

RMAN 6024 no backup or copy of the controlfile found to restore

Well, this is wierd. The controlfile has been created on the primary as a standby controlfile. This is also recorded in the catalog. So, what's the deal with the "no backup of controlfile"?

Some Metalink articles even suggested something like

run {
restore controfile from '/backup/c.ctl';
replicate controlfile from ....
}

this could not have worked for me, since I am not creating a new database; I'm creating a standby one.

So, I delvedinto some research. I will spae you the details and cut straight to the chase. The issue was with the time when the controlfile backup was taken. In my case, here is what I did:

Time -> ----+----T1---------T2----------
Arc Log Controlfile
Backup Backup
Taken Created

This cuased the recovery to start at a point earlier to T1, where there was no controlfile backup for standby. Naturally the restore failed.

The solution: simple. Just took another backup of archived logs after time T2. Once the backups were succefully created on the disk, I ran the rman script again and voila! everything works.

The real issue is how RMAN displays messages. Instead of producing a relevant message that points to the issue, it produces a rather cryptic message that gives the impression that some controlfile backup is not present. Well, perhaps in 12g, becuase I know that is not available in 11g.
Post a Comment

Translate