Tuesday, November 09, 2010

Tool to Present a Consolidated View of the Status of RMAN Backups

A very important area of database administration is backups which are essential to the availability of the databases. If you use RMAN to backup databases (as you should be doing), how will you ensure that all the database backups have successfully completed? One way of making sure that occurs is by going into each server and checking the backup log – a task not just tedious but often impractical considering the large number of databases we have. Is there an easy way – via a consolidated output of all databases?

The Tool

Yes, there is an easy way – by checking the catalog, assuming of course that you are using a catalog, as you should be. I developed this tool to extract the information from the catalog for each database. Since the repository may be different for each database being backed up, I used a view to union all the RC_* views from all the repositories on a common schema which I named RMAN_COMMON. Then I wrote a PL/SQL program to pull the report presenting the information on all databases together. Since my objective was to have a consolidated view of all the backups, whether they succeeded or not, I selected from multiple views to provide an indented output in case of failures to identify specifically which component failed. I will illustrate with an output shown below. The tool can be customized for number of days it reports. Here I have chosen for the last 2 days.

SQL> conn rman_common/rman_common@catalogdb
Connected.
SQL> @status

********  RMAN Database Backup Report *********

DB Name  Start Time         End Time           Status   Time Tak Output S Type
-------- ------------------ ------------------ -------- -------- -------- -------
DEMO1    2010-11-08 00:25:1 2010-11-08 00:50:4 COMPLETE 00:25:27    17.99 DISK
DEMO1    2010-11-09 00:25:1 2010-11-09 00:50:3 COMPLETE 00:25:16    17.99 DISK
ECR      2010-11-07 23:12:0 2010-11-07 23:14:2 COMPLETE 00:02:13     5.55 DISK
ECR      2010-11-08 23:12:0 2010-11-08 23:14:3 COMPLETE 00:02:21     5.58 DISK
EMREP    2010-11-08 01:00:2 2010-11-08 01:02:5 COMPLETE 00:02:35   326.07 DISK
EMREP    2010-11-09 01:00:2 2010-11-09 01:02:5 COMPLETE 00:02:28   353.71 DISK
PROPRD   2010-11-07 23:00:1 2010-11-08 00:04:0 COMPLETE 01:03:49    50.48 DISK
PROPRD   2010-11-08 00:04:2 2010-11-08 11:47:1 COMPLETE 11:42:42    27.59
PROPRD   2010-11-08 10:35:3 2010-11-08 11:20:4 COMPLETE 00:45:12    30.00 DISK
PROPRD   2010-11-08 11:28:5 2010-11-08 12:21:2 COMPLETE 00:52:33    30.00 DISK
PROPRD   2010-11-08 12:23:5 2010-11-08 12:38:5 COMPLETE 00:15:00    10.00 DISK
PROPRD   2010-11-08 12:43:3 2010-11-08 12:43:4 COMPLETE 00:00:07   192.00 DISK
PROPRD   2010-11-08 12:46:1 2010-11-08 12:46:2 COMPLETE 00:00:07   224.00 DISK
PROPRD   2010-11-08 12:48:1 2010-11-08 13:14:0 COMPLETE 00:25:50    20.00 DISK
PROPRD   2010-11-08 13:37:3 2010-11-08 13:58:4 COMPLETE 00:21:11    15.00 DISK
PROPRD   2010-11-08 14:00:2 2010-11-08 14:13:5 COMPLETE 00:13:30    10.00 DISK
PROPRD   2010-11-08 14:29:0 2010-11-08 14:29:0 FAILED   00:00:01     0.00
. Operation            Input                Status
. -------------------- -------------------- --------------------
. .                    .                    FAILED
.  Level  Status   Operation            Object Type
.  ------ -------- -------------------- --------------------
.  >      COMPLETE RMAN                 .
.  ->     FAILED   BACKUP               DATAFILE INCR
PROPRD   2010-11-08 23:00:2 2010-11-09 00:21:0 COMPLETE 01:20:43    66.75 DISK
PROPRD   2010-11-09 00:21:2 2010-11-09 14:07:3 COMPLETE 13:46:09    40.25
LIGPRD1  2010-11-08 03:00:0 2010-11-08 03:00:4 COMPLETE 00:00:37     1.74 DISK
LIGPRD1  2010-11-08 03:00:5 2010-11-08 03:04:1 COMPLETE 00:03:12     1.49
LIGPRD1  2010-11-09 03:00:0 2010-11-09 03:00:5 COMPLETE 00:00:45     2.59 DISK
LIGPRD1  2010-11-09 03:01:0 2010-11-09 03:05:1 COMPLETE 00:04:05     1.37
LIGPRD2  2010-11-08 02:00:0 2010-11-08 02:00:3 COMPLETE 00:00:29     1.04 DISK
LIGPRD2  2010-11-08 02:00:4 2010-11-08 02:02:3 COMPLETE 00:01:52     1.34
LIGPRD2  2010-11-09 02:00:0 2010-11-09 02:00:4 COMPLETE 00:00:35     1.67 DISK
LIGPRD2  2010-11-09 02:00:5 2010-11-09 02:03:2 COMPLETE 00:02:26     1.38
SW1      2010-11-08 00:05:0 2010-11-08 00:06:1 COMPLETE 00:01:06   519.17 DISK
SW1      2010-11-09 00:05:0 2010-11-09 00:08:0 COMPLETE 00:03:04     2.01 DISK

PL/SQL procedure successfully completed.

Here you can see the databases in the catalog – PROPRD, LIGPRD1, LIGPRD2 and SW1. The columns – “Start Time”, “End Time” and “Time Taken” – are self-explanatory. The “Output Size” shows the size of the backupset produced. The “Status” column shows the status of the job – the key to this report. If it shows “COMPLETE”, then all was well in the job. If it shows “FAILED” then lines below show what actually failed. For instance you can see on 8th Nov, incremental backup of one datafile of PROPRD failed. That one definitely needs investigating.  You got all that important information in just one report. As you add all the databases into the same catalog, your reports will be more complete and expansive.

Construction

Now that you saw the result of the tool, let’s see the code behind it. First I created a user – RMAN_COMMON:

SQL> grant create session, unlimited tablespace, create view to rman_common identified by rman_common;

We need just three views from the repositories; so this user needs to be granted select privileges on those only. As SYS user, grant these:

grant select on rman_PROPRD.rc_rman_backup_job_details to rman_common;
grant select on rman_LIGPRD11.rc_rman_backup_job_details to rman_common;
grant select on rman_LIGPRD21.rc_rman_backup_job_details to rman_common;
grant select on rman_11g.rc_rman_backup_job_details to rman_common;
--
grant select on rman_PROPRD.rc_rman_backup_subjob_details to rman_common;
grant select on rman_LIGPRD11.rc_rman_backup_subjob_details to rman_common;
grant select on rman_LIGPRD21.rc_rman_backup_subjob_details to rman_common;
grant select on rman_11g.rc_rman_backup_subjob_details to rman_common;
--
grant select on rman_PROPRD.rc_rman_status to rman_common;
grant select on rman_LIGPRD11.rc_rman_status to rman_common;
grant select on rman_LIGPRD21.rc_rman_status to rman_common;
grant select on rman_11g.rc_rman_status to rman_common;

Of course, you will need to customize the above script to add more repositories as you add them.

Next, you will create the views by selecting from the corresponding views from individual repositories. Instead of creating a view with all the columns, I chose only a few columns. This will help us in combining the views from 10g and 11g where the columns could be different.


conn rman_common/rman_common
--
create view rc_rman_backup_job_details
as
select
        db_name,
        start_time,
        end_time,
        status,
        time_taken_display,
        output_bytes_display,
        output_device_type,
        session_key,
        session_recid,
        session_stamp
from rman_PROPRD.rc_rman_backup_job_details
union all
select
        db_name,
        start_time,
        end_time,
        status,
        time_taken_display,
        output_bytes_display,
        output_device_type,
        session_key,
        session_recid,
        session_stamp
from rman_LIGPRD11.rc_rman_backup_job_details
union all
select
        db_name,
        start_time,
        end_time,
        status,
        time_taken_display,
        output_bytes_display,
        output_device_type,
        session_key,
        session_recid,
        session_stamp
from rman_LIGPRD21.rc_rman_backup_job_details
union all
select
        db_name,
        start_time,
        end_time,
        status,
        time_taken_display,
        output_bytes_display,
        output_device_type,
        session_key,
        session_recid,
        session_stamp
from rman_11g.rc_rman_backup_job_details
/
create view rc_rman_backup_subjob_details
as
select
        operation,
        input_type,
        status,
        session_stamp
from rman_PROPRD.rc_rman_backup_subjob_details
union all
select
        operation,
        input_type,
        status,
        session_stamp
from rman_LIGPRD11.rc_rman_backup_subjob_details
union all
select
        operation,
        input_type,
        status,
        input_type,
        status,
        session_stamp
from rman_LIGPRD21.rc_rman_backup_subjob_details
union all
select
        operation,
        input_type,
        status,
        session_stamp
from rman_11g.rc_rman_backup_subjob_details
/
create view rc_rman_status
as
select
        row_level,
        operation,
        object_type,
        status,
        session_key,
        session_recid
from rman_PROPRD.rc_rman_status
union all
select
        row_level,
        operation,
        object_type,
        status,
        session_key,
        session_recid
from rman_LIGPRD11.rc_rman_status
union all
select
        row_level,
        operation,
        object_type,
        status,
        session_key,
        session_recid
        session_key,
        session_recid
from rman_LIGPRD21.rc_rman_status
union all
select
        row_level,
        operation,
        object_type,
        status,
        session_key,
        session_recid
from rman_11g.rc_rman_status
/

Now that the views are in place, you can create the actual tool. Here is the PL/SQL code:

/*---------------------------------------------------------------------------
Name            : RMAN Job Summary Report
Version         : 2.0
Released        : Nov 9th, 2010
Author          : Arup Nanda.
Description     : Creates a report of all backup jobs in a specified number
                  of days. If anything failed, it also shows the sub-job details.
                  Tested on 11g; but will work on 10gR2 as well.
                  Connect to RMAN repository and execute.
                  For number of days, change the l_days constant.
---------------------------------------------------------------------------*/
prompt
prompt ********  RMAN Database Backup Report *********
prompt
set serveroutput on size unlimited
declare
        l_days                  constant number := 2;
        l_place                 varchar2(400);
        l_db_name               rc_rman_backup_job_details.db_name%type;
        l_start_time            varchar2(40);
        l_end_time              varchar2(40);
        l_status                rc_rman_backup_job_details.status%type;
        l_time_taken_display    rc_rman_backup_job_details.time_taken_display%type;
        l_output_bytes_display  rc_rman_backup_job_details.output_bytes_display%type;
        l_output_device_type    rc_rman_backup_job_details.output_device_type%type;
        l_session_key           rc_rman_backup_job_details.session_key%type;
        l_session_recid         rc_rman_backup_job_details.session_recid%type;
        l_session_stamp         rc_rman_backup_job_details.session_stamp%type;

        l_operation             rc_rman_backup_subjob_details.operation%type;
        l_input_type            rc_rman_backup_subjob_details.input_type%type;

        l_command_level         varchar2(9);
        l_object_type           rc_rman_status.object_type%type;
        cursor bjd_cur
        (
                p_days in number
        )
        is
        select
                bjd.db_name,
                to_char(bjd.start_time, 'yyyy-mm-dd hh24:mi:ss'),
                to_char(bjd.end_time, 'yyyy-mm-dd hh24:mi:ss'),
                bjd.status,
                bjd.time_taken_display,
                bjd.output_bytes_display,
                bjd.output_device_type,
                bjd.session_key,
                bjd.session_recid,
                bjd.session_stamp
        from rc_rman_backup_job_details bjd
        where end_time > sysdate - p_days
        order by  bjd.db_name, bjd.start_time;

begin
        l_place := 'Place 100';
        dbms_output.put_line(
                rpad('DB Name',8)
                ||' '||
                rpad('Start Time',18)
                ||' '||
                rpad('End Time',18)
                ||' '||
                rpad('Status',8)
                ||' '||
                rpad('Time Taken',8)
                ||' '||
                rpad('Output Size',8)
                ||' '||
                rpad('Type',8)
        );
        dbms_output.put_line(
                rpad('-',8,'-')
                ||' '||
                rpad('-',18,'-')
                ||' '||
                rpad('-',18,'-')
                ||' '||
                rpad('-',8,'-')
                ||' '||
                rpad('-',8,'-')
                ||' '||
                rpad('-',8,'-')
                ||' '||
                rpad('-',8,'-')
        );
        open bjd_cur (l_days);
        loop
                fetch bjd_cur
                into
                        l_db_name,
                        l_start_time,
                        l_end_time,
                        l_status,
                        l_time_taken_display,
                        l_output_bytes_display,
                        l_output_device_type,
                        l_session_key,
                        l_session_recid,
                        l_session_stamp
                ;
                exit when bjd_cur%notfound;
                dbms_output.put_line(
                        rpad(l_db_name ,8)
                        ||' '||
                        rpad(l_start_time ,18)
                        ||' '||
                        rpad(l_end_time ,18)
                        ||' '||
                        rpad(l_status ,8)
                        ||' '||
                        rpad(l_time_taken_display ,8)
                        ||' '||
                        rpad(l_output_bytes_display ,8)
                        ||' '||
                        rpad(l_output_device_type,8)
                );
                --
                --
                l_place := 'Place 300';
                if (l_status != 'COMPLETED') then
                        for bsjd_rec in (
                        select
                                operation,
                                input_type,
                                status
                        from rc_rman_backup_subjob_details
                        where session_stamp = l_session_stamp
                        ) loop
                        l_place := 'Place 400';
                        dbms_output.put_line(
                                '.'
                                ||' '||
                                rpad('Operation',20)
                                ||' '||
                                rpad('Input',20)
                                ||' '||
                                rpad('Status',20)
                        );
                        dbms_output.put_line(
                                '.'
                                ||' '||
                                rpad('-',20,'-')
                                ||' '||
                                rpad('-',20,'-')
                                ||' '||
                                rpad('-',20,'-')
                        );
                        dbms_output.put_line(
                                '.'
                                ||' '||
                                rpad(nvl(l_operation,'.') ,20)
                                ||' '||
                                rpad(nvl(l_input_type,'.') ,20)
                                ||' '||
                                rpad(nvl(l_status,'.') ,20)
                        );
                        end loop;
                        --
                        l_place := 'Place 500';
                        dbms_output.put_line(
                                '.  '||
                                rpad('Level' ,6)
                                ||' '||
                                rpad('Status' ,8)
                                ||' '||
                                rpad('Operation' ,20)
                                ||' '||
                                rpad('Object Type' ,20)
                        );
                        dbms_output.put_line(
                                '.  '||
                                rpad('-' ,6,'-')
                                ||' '||
                                rpad('-' ,8,'-')
                                ||' '||
                                rpad('-' ,20,'-')
                                ||' '||
                                rpad('-' ,20,'-')
                        );
                        for status_rec in (
                                select
                                        rpad('-', row_level, '-')||'>' command_level,
                                        operation,
                                        object_type,
                                        status
                                from rc_rman_status
                                where session_key = l_session_key
                                order by row_level, session_recid
                        ) loop
                                l_place := 'Place 600';
                                dbms_output.put_line(
                                        '.  '||
                                        rpad(nvl(status_rec.command_level,'.') ,6)
                                        ||' '||
                                        rpad(nvl(status_rec.status,'.') ,8)
                                        ||' '||
                                        rpad(nvl(status_rec.operation,'.') ,20)
                                        ||' '||
                                        rpad(nvl(status_rec.object_type,'.') ,20)
                                );
                        end loop;
                end if;
                end loop;
exception
        when OTHERS then
                dbms_output.put_line(l_place);
                raise;
end;
/

This is the PL/SQL code in the file status.sql.  You can adjust the value of the constant l_days to generate the report for as many days as you want. When you run the script it shows the status of the RMAN jobs in last 2 days.

Hope you find this useful. As always, any feedback will be highly appreciated.
Post a Comment

Translate