Pages

Wednesday, November 23, 2016

A Million Thanks to _You_

Tomorrow is Thanksgiving holiday in the United States. Millions of Americans will travel this weekend to be with family, away from the family, visit somewhere fun, or for some unfortunate few, be stuck at work or stuck in the traffic. Amid loads of turkey, stuffing and the oft-present alcohol, it's easy to forge the concept behind this holiday: it's about giving thanks. It's a day to remember, acknowledge and celebrate all those who made us who we are today, and offer our thanks.


I take this moment to show my immense gratitude to all those who have inspired and encouraged me to speak, write and teach. I have always been a mentor and loved to teach and write. I first presented in public exactly 14 years ago in Richmond, VA. My first published article came out later that year in New York Oracle User Group publication TechJournal. And my first book was published a year later. Since then I have had the opportunity to speak at 422 sessions in 24 counties, deliver 35 training seminars, write 517 published articles and 6 books. And, I could do that just for the support and inspiration provided by the readers and attendees who I had the pleasure of delivering the sessions and writings.


Here is a map of the places I presented. It's not complete; I put what I could remember. https://drive.google.com/open?id=1EenMw3RKyKc2AxUMuNCj81lTywo&usp=sharing


I am truly honored to have you read my work or listen to me. For this I am thankful to you, dear reader, attendee or the visitor to my blog.


Happy Thanksgiving!

Friday, November 11, 2016

Sangam16 Session: Resolving Latch Contention

Many thanks to those who came to my session at Sangam, the annual conference of All India Oracle Users Group. You can download the presentation materials here:


Presentation here
Scripts here


As always, I will greatly appreciate to hear your thoughts on the session.. Please reach out to me via a comment on this post, or on the following:


Email: arup@proligence.com
Twitter: @ArupNanda
Facebook: ArupKNanda
Google+: +ArupNanda

OTN Ace Directors APAC Tour at Gold Coast


Thank you all for coming to my sessions at the Glod Coast conference of Oracle Technology Network +Oracle Community Network ACE Directors Tour in APAC. Please feel free to download the materials presented.






1. Secure your database in 1 day Presentation and Scripts
2. Prevent Bad SQLs in the Cloud Presentation
3. Becoming a Multitenant DBA Presentation






As always, I will appreciate to hear from you how you "really" felt about them--the good , the bad and the ugly. You can post a comment on the blog right here, or reach me via:






Email: arup@proligence.com
Twitter: @ArupNanda
Google Plus: +Arup Nanda
Facebook.com/ArupKNanda

Sangam16 Session: Preventing SQL and PL/SQL Injection Attacks

Thank you all those who came to my session at Sangam 16, the conference of All India Oracle User Group, where I presented as a part of the OTN ACE Directors APAC Tour.


You can download the materials here. Slides and Scripts used in the demo


As always, I would highly appreciate your feedback, be it here as a comment, or via social media or email.


Email: arup@proligence.com
Twitter: @ArupNanda
Facebook.com/ArupKNanda
Google+: +ArupNanda

Monday, October 31, 2016

OTN ACE Director APAC Tour in Sydney Sessions

Thank you all for coming to my sessions at the Sydney conference of Oracle Technology Network +Oracle Community Network ACE Directors Tour in APAC. Please feel free to download the materials presented.




1. Secure your database in 1 day Presentation and Scripts
2. Prevent Bad SQLs in the Cloud Presentation
3. Becoming a Multitenant DBA Presentation




As always, I will appreciate to hear from you how you "really" felt about them--the good , the bad and the ugly. You can post a comment on the blog right here, or reach me via:




Email: arup@proligence.com
Twitter: @ArupNanda
Google Plus: +Arup Nanda
Facebook.com/ArupKNanda

Wednesday, October 26, 2016

OTN APAC Tour Wellington, New Zeland Sessions

Thank you all for coming to my sessions. I understand your time was important and I appreciate your taking the time to spend the time listening to me.






Here are the materials for my sessions:






1. Cross Platform Database Migration Using RMAN Slides
2. The Art and Craft of Tracing Slides, Scripts
3. Big Data for Oracle Professionals Slides






Like always, I would love to hear form you, not just that you liked something; but something that you felt you were not happy with. Email me at arup@proligence.com, or tweet me at @ArupNanda or Post a comment at Facebook.com/ArupKNanda

OTN APAC Tour Auckland, New Zealand Sessions

Thank you all for coming to my sessions. I understand your time was important and I appreciate your taking the time to spend the time listening to me.












Here are the materials for my sessions:












1. Cross Platform Database Migration Using RMAN Slides
2. The Art and Craft of Tracing Slides, Scripts
3. Big Data for Oracle Professionals Slides












Like always, I would love to hear form you, not just that you liked something; but something that you felt you were not happy with. Email me at arup@proligence.com, or tweet me at @ArupNanda or Post a comment at Facebook.com/ArupKNanda

Thursday, September 22, 2016

Slides for OOW16 Session Preventing Bad SQLs

Thank you for coming to my session Preventing Bad SQLs at Oracle Open World despite being at lunchtime on the last day.

Congratulations to the winners of my book.

Download slides here http://bit.ly/2cGJqR7

Friday, February 05, 2016

No Data Loss without Sync Data Guard

A few months ago I wrote about an exciting product that allows you to achieve no data loss in a disaster even without using the maximum protection Data Guard that relies on super expensive synchronous networks. Recently I had a chance to actually work on that system and replicate the scenarios. Here I explain how that actually works, with commands and actual outputs.

Background

First, you may want to refresh your memory on the concept of that product in the earlier blog http://arup.blogspot.com/2015/04/no-data-loss-without-synchronous-network.html The product is Phoenix system from Axxana. To reiterate, I am not an employee of that company nor in its payroll; I just want to make others aware of this innovative out of the box solution to a problem we are all familiar with. I usually don't write about specific products unless I feel strongly about its effectiveness to the user community.

Let's examine the issue we are facing:
  1. In a Data Guard environment, some amount of redo information may be left behind at the primary site.
  2. If the primary site meets with a disaster where nothing is left, this data is lost for ever.
  3. If this data loss is not acceptable then you have to run the Data Guard in maximum protection mode which guarantees the updates occurring on the primary to be reflected on the standby database immediately. That way the data is updated at standby site and is not affected by the loss of the primary site.
  4. However, this also means that the network has to be super fast and super reliable to ensure the updates at the standby site. Unfortunately this also means the network is super expensive. And this could also become technically challenging for networks across very large distances.
  5. This is why many corporations that can't afford the expensive network infrastructure do not implement the maximum protection Data Guard and merely accept the data loss as an unavoidable problem.
Well, until now. Axxana's product addresses the issue in a non-traditional manner. The only thing that stands between the full recovery on the standby site in case of a disaster and an expensive network is the redo information that has not been transmitted yet to the standby server. If this information were available to the standby host, you would have been able to recover it completely. The Phoenix system from Axxana is a solid state drive enclosed inside a case protected from common disaster elements such as flood, fire, bomb, etc. You put this special box at the primary site and create one member of the redo log group on its storage, as shown in the figure below. This box has a protected port of network, etc. that allows you to access its contents even if the normal ports are all destroyed by the disaster. It even has a cellular modem that allows it to be connected remotely. In fact the Axxana software can pull the necessary data from this special box remotely from the standby host and apply to the standby database. And the best thing about it is that it is the plain vanilla Data Guard you are familiar with; not any special magic that goes under the hood.
Axxana Blackbox
Fig 1. Axxana Blackbox Internals



Let's see how it works with a hands-on approach inside a lab. I have two hosts:



Host
Database
Primary
e02lora adb
Standby
e02rora1 adbsb


The storage is ASM; not a filesystem. Replication in a filesystem is relatively simple; so I deliberately chose it to see if any ASM nuances coming up. The Data Guard is configured with maximum performance, so there is no need for an expensive fast low latency network infrastructure. The setup has an Axxana Phoenix system which consists of three components:
  1. Blackbox - the "special" protected storage system.
  2. Collector - a small server at the primary site that collects the changes to the files under the Axxana protection. In this case, they are controlfile, redo logs and archived logs. I don't need to place any database files there because they are useless for recovery. Keeping this small set of files also makes it possible to put the protected files of many databases--not just one--inside a single blackbox.
  3. Recoverer - a small server at the standby site that receives the changes from the blackbox and keeps the files up to date when disaster occurs.

Setting up

First I ensured that the managed recovery process is running and the standby database is in MOUNTED mode.

Then I created a test schema in the production database.
e02lora$ sqlplus / as sysdba
SQL> grant create session, create table, create procedure, unlimited tablespace to arup identified by arup;
Then I created some objects and data inside this schema.
SQL> connect arup/arup
SQL> create table T (c1 number, c2 timestamp);
SQL> create index IN_T ON T (C1);
SQL> create or replace procedure INS_T (p_c1 in number, p_commit in number) as
2 begin
3 for i in 1..p_c1 loop
4 insert into T values (i,current_timestamp);
5 if mod(i,p_commit)=0 then
6 commit;
7 end if;
8 end loop;
9 end;
10 /

SQL> create or replace procedure DEL_T (p_commit in number) as
2 begin
3 delete T where mod(C1, p_commit)=5;
4 commit;
5 end;
6 /
SQL> create or replace procedure UPD_T (p_commit number) as
2 begin
3 update T set C1 =-1 where mod(C1, p_commit)=3;
4 commit;
5 end;
6 /

SQL> exec INS_T (400000, 5000);

SQL> commit;
After creating the test data, I switched a few log files and waited a bit to make sure the changes are replicated to the standby.

Simulating a Failure


Now to simulate a disconnected network. I didn't have any special tools with me. So, I did the next best thing: I disabled the ethernet interface.

e02rora$ su -
Password:
e02rora$ ifdown eth0

At this point the standby host will not receive the updates. All the changes made to the primary database will be located at the primary site only. I canceled the recovery process first.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

I made a few changes to the primary data.

SQL> connect arup/arup
SQL> exec DEL_T (500);
SQL> exec UPD_T (500);

SQL> conn / as sysdba
SQL> alter system switch logfile;

SQL> conn arup/arup
SQL> select count(*) from T

COUNT(1)
----------
399200

The table has only 399,200 records. Since the Managed Recovery Process is stopped these final changes will not be replicated to the standby. I can confirm that by opening the standby in read only mode and checking the actual tables.


On the standby host:

SQL> conn / as sysdba
SQL> ALTER DATABASE OPEN READ ONLY;

Database altered.

SQL> SELECT COUNT (1) FROM ARUP.T;

 COUNT(1)
----------
400000


The standby has all 400,000 rows, compared to 399,200 rows in primary. The discrepancy is due to the unpropagated changes to the standby. At this point I simulate a failure in the primary by killing the pmon process.

e02lora$ ps -aef|grep pmon
UID PID PPID TTY STIME COMMAND
oracle 10184 10564 pty0 16:26:17 pmon
e02lora$ kill -9 10184


The primary database is down. The data in the redo logs is lost as well since they have not been propagated to the standby system yet.

Performing Recovery

Now that we have simulated a disaster, let's see how to recover from it. Remember, since the Data Guard configuration is Maximum Performance, the database can only be recovered up to the most recent log entry. With Axxana software, however, there is additional data that can be pulled from the Blackbox (the storage unit at the primary site that would not be destroyed).  But how do you--the DBA--know what files are available at the standby site, which are still left at the primary site, and, most important, how to get those into the standby host? It gets even more complicated since the Blackbox is just a storage device; you have to mount the volumes, and mount the ASM disks, etc. These may be fun when creating a brand new database; definitely not so when you are under the gun to recover the database and bring your business online.



No worries. This is where the Axxana software comes into rescue. I spun up the provided shell script at the standby site. This script contacts the blackbox at the primary, pulls the needed data and completes the creation of necessary files at the standby site. Once the files are at the standby site, all you have to do is to perform the typical managed standby database recovery to complete the recovery process. The best part of all? The script even gives you a step by step instruction sheet along with specific files names which you can copy and paste when Oracle prompts for it. Here is how I call the script and the resultant output.

root@e02rora1 AxxRecovery# ./AxxRecovery.sh
Logging to '/home/oracle/AxxRecovery/logs/axxana.debug.log'
Calculating Optimization Parameters [done]
Attempting to connect to Axxana's Recoverer [done]

Perform Failover?

Warning!
This operation terminates the Axxana protection for all DBs.

1) OK
2) Cancel & Exit

At this point the script pauses and asks me for confirmation. I enter "1" and the script continues:


Requesting Axxana's Recoverer to 'FAILOVER' [done]

Recoverer communicates with the Black Box (ENTER_FAILOVER_START) [done]
Recoverer communicates with the Black Box (ENTER_FAILOVER_BBX) [done]
Recoverer communicates with the Black Box (ENTER_FAILOVER_CONNECT) [done]

Recoverer is in FAILOVER mode

Transferring required logs files of the database 'adb'.

The script shows me the names of the files along with their respective transfer status and the %age progress.

FILE NAME | Size in MBytes | Stage | Progress
==================================|=================|==============|===================
ADB-group_3.258.902686913 | 50 (0) | Completed | 100%
ADB-thread_1_seq_111.266.902852033| 46 (46) | Completed | 100%
ADB-group_2.257.902686911 | 50 (50) | Completed | 100%
ADB-1454516139530A12435456XT_112.C| 11 (11) | Completed | 100%
ADB-group_1.256.902686901 | 50 (50) | Completed | 100%
==================================|=================|==============|===================
Total: | 208 (159) | 5 of 5 files recovered

All required log files exist at '/home/oracle/AxxRecovery/axx_recovery_files'

Preparing user recovery scripts ...

You can safely perform the Oracle RPO=0 recovery process

Please follow the RPO=0 recovery instructions that are described in the file '/home/oracle/AxxRecovery/axx_required_files/recovery_instructions.txt'


As I mentioned, the script creates a detailed step-by-step instruction sheet to be followed for the standby recovery. I am actually glad that it does not perform a recovery automatically. That is one process you want to watch and proceed with caution. You probably have only one chance at it; and rushing through it may force you to take missteps. But at the same time you want to think as less as possible under those stressful conditions. So, the detailed instruction sheet comes handy. The last line shows the location of the instructions files. Here is how the file looks like in my case, after removing some banner items:

|---------------------------------------------------------------------------------------------------------------|
| RECOVERY INSTRUCTIONS (03/02/2016 16:26:19) |
|===============================================================================================================|
===============================================================================================================

Step 1) Output setup
====================

Please log into the standby database (as sys user)
and then run the following commands at the SQL prompt:

SQL>
SET SERVEROUTPUT ON
SET LINES 999
SET PAGES 0

Step 2) Recovery using archive log files
========================================

Please run the following commands at the SQL prompt:
* (You can safely ignore the following ORA Error 'Managed Standby Recovery not active')

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

SQL> RECOVER STANDBY DATABASE UNTIL CANCEL;

-- when requested, feed the following file(s):

+DATA/adbsb/archivelog/2016_02_03/thread_1_seq_110.382.902852027
/home/oracle/AxxRecovery/axx_recovery_files/E02LORA1/BBX/ADB/ARCHIVELOG/2016_02_03/thread_1_seq_111.266.902852033

-- finally enter 'Cancel'

SQL> CANCEL

Step 3) Switching to primary control file
=========================================
Please run the following commands at the SQL prompt:

SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP NOMOUNT
SQL> ALTER SYSTEM SET CONTROL_FILES='/home/oracle/AxxRecovery/axx_required_files/1454516139530A12435456XT_112.C' SCOPE=SPFILE;
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP MOUNT

Step 4) Renaming data and redo log file names
=============================================
Please run the following SQL statement

SQL> @/home/oracle/AxxRecovery/axx_required_files/logAndDateFileRename.sql

Step 5) Recovery using primary control file
===========================================
Please run the following command at the SQL prompt:

SQL> RECOVER DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE
-- when requested, feed the following file(s):

/home/oracle/AxxRecovery/axx_recovery_files/E02LORA1/BBX/ADB/ONLINELOG/group_1.256.902686901
/home/oracle/AxxRecovery/axx_recovery_files/E02LORA1/BBX/ADB/ONLINELOG/group_2.257.902686911
/home/oracle/AxxRecovery/axx_recovery_files/E02LORA1/BBX/ADB/ONLINELOG/group_3.258.902686913

-- You should now see a message saying 'Media Recovery complete'

Step 6) Open the Standby Database
=====================================================
Please run the following commands at the SQL prompt:

SQL> ALTER DATABASE OPEN RESETLOGS;

Well, it's dumbed down enough for those stressful moments associated with a standby database recovery, down to even the set pagesize commands in SQL*Plus--quite easy to forget in stressful situations. Note it shows the time it got generated at the very beginning, non-US date format as dd/mm/yyyy. Anyway, I followed the instructions step by step. Step 4 requires some attention. It shows the how to change the names of the redo logs and data files after the switchover.

For the sake of brevity I don't want to show the entire output. Here is an excerpt from the tail end of the activity:

...
... output truncated ...
...
Specify log: {=suggested | filename | AUTO | CANCEL}
/home/oracle/AxxRecovery/axx_recovery_files/E02LORA1/BBX/ADB/ONLINELOG/group_2.257.902686911
ORA-00279: change 5172890 generated at 02/03/2016 16:16:56 needed for thread 1
ORA-00289: suggestion : +DATA
ORA-00280: change 5172890 for thread 1 is in sequence #114
ORA-00278: log file '/home/oracle/AxxRecovery/axx_recovery_files/E02LORA1/BBX/ADB/ONLINELOG/group_2.257.902686911' no longer needed for this recovery

Specify log: {=suggested | filename | AUTO | CANCEL}
/home/oracle/AxxRecovery/axx_recovery_files/E02LORA1/BBX/ADB/ONLINELOG/group_3.258.902686913
Log applied.
Media recovery complete.

SQL> ALTER DATABASE OPEN RESETLOGS;

Database altered.

SQL> SELECT COUNT (1) FROM ARUP.T;

COUNT(1)
----------
399200


Voila! The output shows "Media recovery complete". The count is 399,200--the same as the number in the production database. The recovery process got those last changes. I accomplished my objective without a synchronous replication.

Summary

Recovery without data loss has always been dependent on a high throughput, low latency, ultra-reliable network. Unfortunately the cost of this infrastructure often precludes the use of the no data loss setups in organizations. The Axxana solution is innovative in the sense that it addresses the issue with a completely out of the box solution slashing costs dramatically, in fact so dramatically that most corporations will be able accomplish no data loss in Oracle Data Guard setups. At the heart of the setup is the "blackbox" which is just solid state storage with protection to withstand water, fire and bombing. When a disaster obliterates the primary site, this blockbox likely survives and through its cellular modem allows its content to be pulled to the standby site to reconstruct the important files that have not been replicated yet. Once those files are retrieved, it's quite simple to recover the standby database completely. In this blog you saw an example of how I did exactly that as a user.

Hope this helps in making some decisions about no data loss setup in Data Guard with respect to cost.