Oracle 11g New Features part 2



Oracle Database Fault Diagnosability Infrastructure.


Every DBA knows pretty well (at least should know) the trace directory structure used to generate the alert_sid.log and all the others traces (BDUMP,CDUMP, UDUMP).


Now with Oracle 11g they’ve created a much more complex structure to trace errors, in first place the directory structure is called ADR(Automatic Diagnostic Repository), and when a critical error occurs it is assigned an incident number and all the trace files (diagnostic data), will be tagged with this number, so database bugs, metadata corruption and process errors will be in the ADR now.


The purpose of the ADR is basically speeding up diagnosing and solving issues as well facilitate the Oracle support.


The ADR has several new technologies that will help out on that:


1 - Automatic capture of diagnostic data upon first failure.

When an error occurs generally we used to have just one trace about it, now Oracle collects information from several processes involved on the issue and store them on the ADR, the data can be accessed and interpreted by a command line tool(ADRCI) or by the Enterprise Manager DBconsole.


2 – Standardized trace formats.

The trace format was standardized for all the database components, making easy to understand and diagnose the issue.


3 – Health checks

When detecting a critical error Oracle can run some health checks to go deeper on the error, and all this data will be added to the ADR. The DBA can manually execute health checks on a regular basis or when required by Oracle Support.


4 – Incident packaging service (IPS) and incident packages

Now we have all our related traces tagged with the incident number, so we can use the IPS to create a package of traces and zip it to send to Oracle support, also you can view and edit this package after you have created it.


5 – Data Recovery Advisor (RMAN utility)

There is an interaction between the Data Recovery Advisor and the health checks, and then you can see data corruption problems and also the advice in how to correct it.


6 – SQL Test Case Builder

One of the biggest issues when dealing with SQL issues is to gather all the information related with the performance problem, now it is going to collect all the information necessary and this can be uploaded to Oracle support where they should be able to reproduce in details your problem.



7 – Incident Flood Control

When we are facing some problems bugs perhaps we can have a flood of traces, with the ADR there an Incident Flood Control that will prevent it to fill up because of this kind of problems.


An Incident key is created based on a problem occurrence.


The ADR is created based on the parameter DIAGNOSTIC_DEST, and the parameters *_DUMP_DEST are deprecated.


ADR Structure:



To check your structure:


SELECT * FROM V$DIAG_INFO


INST_ID NAME VALUE

---------- ------------------------ ------------------------------------------------------------

1 Diag Enabled TRUE

1 ADR Base /oracle

1 ADR Home /oracle/diag/rdbms/test/test

1 Diag Trace /oracle/diag/rdbms/test/test/trace

1 Diag Alert /oracle/diag/rdbms/test/test/alert

1 Diag Incident /oracle/diag/rdbms/test/test/incident

1 Diag Cdump /oracle/diag/rdbms/test/test/cdump

1 Health Monitor /oracle/diag/rdbms/test/test/hm

1 Default Trace File /oracle/diag/rdbms/test/test/trace/test_ora_12295.trc

1 Active Problem Count 4

1 Active Incident Count 11


11 rows selected.


The Alert Log file


There is a new Alert Log file that is on the ALERT folder, it is a XML file, it can be read using the ADRCI tool or the Enterprise manager, but for backward compatibility Oracle still keeps the old text version on the TRACE folder.


new Alert Log:

/oracle/diag/rdbms/test/test/alert/log.xml


<msg time='2007-08-17T13:37:01.521+01:00' org_id='oracle' comp_id='rdbms'

client_id='' type='UNKNOWN' level='16'

module='sqlplus@test01 (TNS V1-V3)' pid='29691'>

<txt>Active process 28072 user &apos;oracle&apos; program &apos;oracle@test01&apos;

</txt>

</msg>

<msg time='2007-08-17T13:37:01.521+01:00' org_id='oracle' comp_id='rdbms'

client_id='' type='UNKNOWN' level='16'

module='sqlplus@test01 (TNS V1-V3)' pid='29691'>

<txt>Active process 30002 u

[oracle@test01 alert]$ head log.xml

<msg time='2007-08-10T15:25:16.833+01:00' org_id='oracle' comp_id='rdbms'

msg_id='opistr_real:871:3971575317' type='NOTIFICATION' group='startup'

level='16' pid='10143' version='1'>

<txt>Starting ORACLE instance (normal)

</txt>

</msg>

<msg time='2007-08-10T15:25:16.966+01:00' org_id='oracle' comp_id='rdbms'

msg_id='ksunfy:13399:2937430291' type='NOTIFICATION' group='startup'

level='16' pid='10143'>

<txt>LICENSE_MAX_SESSION = 0


Now the generated trace files *.trc can come together with another file *.trm that is a map to help on the trace navigation.


test_m000_12578.trm

test_m000_12578.trc


When an incident is created all the dump traces will be created on its own directory based on the INCIDENT folder, like this:


/oracle/diag/rdbms/test/test/incident

incdir_12021

incdir_12101

incdir_12349

incdir_12350

incdir_12351

incdir_12352

incdir_13205

incdir_1411

incdir_1467

incdir_217

incdir_218


/oracle/diag/rdbms/test/test/incident/incdir_218

test_ora_18892_i218.trc

test_ora_18892_i218.trm


That is the Incident view on the Enterprise Manager:



It is pretty fast and quick to dig up on the traces of an incident using the Enterprise Manager, another way is using the ADRCI tool.


ADRCI is a very powerful tool that enables you to view all the ADR and Health reports, also you can create the packages and zip them with it with IPS, and like SQL*PLUS you can run scripts or run it in background.


To start it just run adrci, and state help to see the list of commands:


adrci


ADRCI: Release 11.1.0.6.0 - Beta on Thu Aug 23 12:33:37 2007


Copyright (c) 1982, 2007, Oracle. All rights reserved.


ADR base = "/oracle"

adrci> help


HELP [topic]

Available Topics:

CREATE REPORT

ECHO

EXIT

HELP

HOST

IPS

PURGE

RUN

SET BASE

SET BROWSER

SET CONTROL

SET ECHO

SET EDITOR

SET HOMES | HOME | HOMEPATH

SET TERMOUT

SHOW ALERT

SHOW BASE

SHOW CONTROL

SHOW HM_RUN

SHOW HOMES | HOME | HOMEPATH

SHOW INCDIR

SHOW INCIDENT

SHOW PROBLEM

SHOW REPORT

SHOW TRACEFILE

SPOOL


There are other commands intended to be used directly by Oracle, type

"HELP EXTENDED" to see the list


adrci>



Checking the last 3 entries in the alert log:


adrci> show alert -tail 3

2007-08-22 17:22:54.584000 +01:00

Completed: ALTER DATABASE OPEN

2007-08-22 17:27:39.553000 +01:00

Starting background process CJQ0

CJQ0 started with pid=30, OS id=26876

2007-08-22 22:00:00.200000 +01:00

Setting Resource Manager plan SCHEDULER[0x2C0B]:DEFAULT_MAINTENANCE_PLAN via scheduler window

Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter

2007-08-22 22:00:06.578000 +01:00

Wed Aug 22 22:00:06 2007

Logminer Bld: Lockdown Complete. DB_TXN_SCN is UnwindToSCN (LockdownSCN) is 17789108

2007-08-22 22:00:24.585000 +01:00

Begin automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK"

2007-08-22 22:01:31.584000 +01:00

End automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK"

2007-08-22 22:01:52.155000 +01:00

Thread 1 advanced to log sequence 1761

Current log# 3 seq# 1761 mem# 0: /oracle/oradata/test/redo03.log

2007-08-23 00:00:35.913000 +01:00

Thread 1 advanced to log sequence 1762

Current log# 1 seq# 1762 mem# 0: /oracle/oradata/test/redo01.log

2007-08-23 02:00:00.118000 +01:00

Clearing Resource Manager plan via parameter

2007-08-23 10:19:09.405000 +01:00

Thread 1 advanced to log sequence 1763

Current log# 2 seq# 1763 mem# 0: /oracle/oradata/test/redo02.log


Checking the incidents:


adrci> show incident


ADR Home = /oracle/diag/rdbms/test/test:

*************************************************************************

INCIDENT_ID PROBLEM_KEY CREATE_TIME

-------------------- ----------------------------------------------------------- ----------------------------------------

13205 ORA 603 2007-08-22 13:11:36.611960 +01:00

12352 ORA 1578 2007-08-22 14:01:30.585383 +01:00

12351 ORA 600 [krhpfh_03-1208] 2007-08-22 13:11:30.674642 +01:00

12350 ORA 600 [krhpfh_03-1208] 2007-08-22 13:11:28.362289 +01:00

12349 ORA 600 [krhpfh_03-1208] 2007-08-22 13:11:24.058978 +01:00

12101 ORA 600 [krhpfh_03-1208] 2007-08-22 13:11:45.031386 +01:00

12021 ORA 600 [krhpfh_03-1208] 2007-08-22 13:11:39.667875 +01:00

1467 ORA 4031 2007-08-11 07:51:07.343231 +01:00

1411 ORA 4031 2007-08-11 07:51:29.534688 +01:00

218 ORA 4031 2007-08-10 16:45:21.005578 +01:00

217 ORA 4031 2007-08-10 16:30:27.821633 +01:00

11 rows fetched



As you can see there are a lot of thing you can do with this tool, as well you can use the Enterprise Manager, this tool is well documented, so don’t worry when you start to use it.

http://download.oracle.com/docs/cd/B28359_01/server.111/b28319/adrci.htm#BGBBADAB



Health Check Monitor:


It will generate reports of their diagnoses and also recommendations for solving the issues, all will be stored in the ADR.


It can be used in 2 ways:


Reactive – It can run automatically in response to a critical error.

Manual – The DBA manually can use the package DBMS_HM or the DBConsole, Oracle support can ask you to run this checker when working on a SR.


These are the types of Health Checks:


DB Structure Integrity Check – This check is going to verify the whole structure of the database, so if it is in NOMOUNT state only the controlfiles will be checked, when it is in MOUNT or OPEN, then all files are checked.


Data Block Integrity Check - As the name says it is going to check the disk image block for corruption such as checksum failures, head/tail mismatch and logical inconsistencies within the block. You can find information about corrupted block on V$DATABASE_BLOCK_CORRUPTION.


Redo Integrity Check – This one will be looking for the redo logs contents also the archive logs if the database is in Archive log mode.


Undo Segment Integrity Check – This check look for logical undo corruptions that may occur during rollback operations, if one is found it will use PMON and SMON to try to recover the corrupted transaction, if the recovery fails it will store the information on V$CORRUPT_XID_LIST. Most undo corruptions can be resolved by forcing a commit.


Transaction Integrity Check – The same as the Undo Segment, but this one will look for a specific transaction.

Dictionary Integrity Check – This check verify the core dictionary, objects such as tab$, clu$, fet$, uet$, seg$, undo$, ts$, file$, obj$, ind$, icol$, col$, user$, con$, cdef$, ccol$, bootstrap$, objauth$, ugroup$, tsq$, syn$, view$, typed_view$, superobj$, seq$, lob$, coltype$, subcoltype$, ntab$, refcon$, opqtype$, dependency$, access$, viewcon$, icoldep$, dual$, sysauth$, objpriv$, defrole$, and ecol$.

This check will verify relationship between these tables to avoid orphan records for example.

Running it manually:


First of all you can check what kinds of check you can perform, using the view V$HM_CHECK:


SQL> SELECT NAME FROM V$HM_CHECK WHERE INTERNAL_CHECK='N';


NAME

----------------------------------------------------------------

DB Structure Integrity Check

Data Block Integrity Check

Redo Integrity Check

Transaction Integrity Check

Undo Segment Integrity Check

Dictionary Integrity Check


6 rows selected.



Then use the package dbms_hm:


SQL> begin

2 dbms_hm.run_check('Dictionary Integrity Check', 'test_hm_1');

3 end;

4 /


PL/SQL procedure successfully completed.


To view the finds report we can use:


Enterprise Manager

DBMS_HM package

ADRCI tool

RMAN

VIEWS – V$HM_RUN, V$HM_FINDING


Let’s check with the DBMS package:

SET LONG 100000

SET LONGCHUNKSIZE 1000

SET PAGESIZE 1000

SET LINESIZE 512

SELECT DBMS_HM.GET_RUN_REPORT('test_hm_1') FROM DUAL;


SQL> SQL> SQL> SQL>

DBMS_HM.GET_RUN_REPORT('TEST_HM_1')


Basic Run Information

Run Name : test_hm_1

Run Id : 842

Check Name : Dictionary Integrity Check

Mode : MANUAL

Status : COMPLETED

Start Time : 2007-08-23 13:42:10.660476 +01:00

End Time : 2007-08-23 13:42:13.838216 +01:00

Error Encountered : 0

Source Incident Id : 0

Number of Incidents Created : 0


Input Paramters for the Run

TABLE_NAME=ALL_CORE_TABLES

CHECK_MASK=ALL


Run Findings And Recommendations

Finding

Finding Name : Dictionary Inconsistency

Finding ID : 843

Type : FAILURE

Status : OPEN

Priority : CRITICAL

Message : SQL dictionary health check: file$ pk 42 on object FILE$

failed

Message : Damaged rowid is AAAAARAABAAAAByAAC - description: No further

damage description available

Finding

Finding Name : Dictionary Inconsistency

Finding ID : 846

Type : FAILURE

Status : OPEN

Priority : CRITICAL

Message : SQL dictionary health check: dependency$.dobj# fk 126 on

object DEPENDENCY$ failed

Message : Damaged rowid is AAAABnAABAAAOoHABJ - description: No further

damage description available


In my case it found 2 inconsistencies in my dictionary. Just for record I didn’t force this error.


Now using rman to list and get the recommendations to solve the issue:


RMAN> list failure;


using target database control file instead of recovery catalog

List of Database Failures

=========================


Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

846 CRITICAL OPEN 23-AUG-07 SQL dictionary health check: dependency$.dobj# fk 126 on object DEPENDENCY$ failed

843 CRITICAL OPEN 23-AUG-07 SQL dictionary health check: file$ pk 42 on object FILE$ failed


RMAN> ADVISE FAILURE;


List of Database Failures

=========================


Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

846 CRITICAL OPEN 23-AUG-07 SQL dictionary health check: dependency$.dobj# fk 126 on object DEPENDENCY$ failed

843 CRITICAL OPEN 23-AUG-07 SQL dictionary health check: file$ pk 42 on object FILE$ failed


Mandatory Manual Actions

========================

1. Please contact Oracle Support Services to resolve failure 846: SQL dictionary health check: dependency$.dobj# fk 126 on object DEPENDENCY$ failed

2. Please contact Oracle Support Services to resolve failure 843: SQL dictionary health check: file$ pk 42 on object FILE$ failed


Optional Manual Actions

=======================

no manual actions available


Automated Repair Options

========================

no automatic repair options available


Both cases are orphan objects in the dictionary, sometimes is quite simple to solve that, but in any case don’t touch in the dictionary without the written Oracle Support advise, otherwise you will not be able to blame them if something goes bad, hehe.


But just for tests purposes take a look in what I did to solve this issue:


Checking the mentioned rowed:


SQL> select file# from file$ where rowid = 'AAAAARAABAAAAByAAC';


FILE#

----------

3


SQL> select D_OBJ# from dependency$ where rowid='AAAABnAABAAAOoHABJ';


D_OBJ#

----------

63944


Checking the correlated views to verify it is a orphan record:


SQL> select file_name from dba_data_files where file_id=3;


no rows selected


SQL> select object_name from dba_objects where object_id=63944;


no rows selected


Deleting from the dictionary:


SQL> delete from file$ where file#=3;


1 row deleted.


SQL> delete from dependency$ where D_OBJ#=63944;


1 row deleted.


SQL> commit;


Commit complete.


Flushing the shared pool to clean up the dictionary cache:


SQL> alter system flush shared_pool;


System altered.


Running the health check again:


SQL> begin

dbms_hm.run_check('Dictionary Integrity Check', 'test_hm_2');

end;

/


SET LONG 100000

SET LONGCHUNKSIZE 1000

SET PAGESIZE 1000

SET LINESIZE 512

SELECT DBMS_HM.GET_RUN_REPORT('test_hm_2') FROM DUAL;


SQL> SQL> SQL> SQL>

DBMS_HM.GET_RUN_REPORT('TEST_HM_2')


Basic Run Information

Run Name : test_hm_2

Run Id : 872

Check Name : Dictionary Integrity Check

Mode : MANUAL

Status : COMPLETED

Start Time : 2007-08-23 14:01:15.988291 +01:00

End Time : 2007-08-23 14:01:17.582556 +01:00

Error Encountered : 0

Source Incident Id : 0

Number of Incidents Created : 0


Input Paramters for the Run

TABLE_NAME=ALL_CORE_TABLES

CHECK_MASK=ALL


Run Findings And Recommendations



Backup and Recovery New Features:

Below are the new features described on the Oracle documentation, for the most interesting features I’ve done some examples:

Data Recovery Advisor is a built-in tool to automatically diagnose data failures and recommend repairs. You can repair failures manually or request that they be repaired automatically. Data Recovery Advisor supports the LIST FAILURE, CHANGE FAILURE, ADVISE FAILURE, and REPAIR FAILURE commands. This Advisor is based on the Health Check Monitor.

In previous releases, RMAN could not back up transportable tablespaces until they were made read/write at the destination database. Now RMAN can back up transportable tablespaces when they are not read/write and restore these backups.

Enterprise Manager includes an interface for Data Recovery Advisor.

RMAN can back up a single file in parallel by dividing the work among multiple channels. Each channel backs up one file section. You create a multisection backup by specifying SECTION SIZE on the BACKUP command. Restoring a multisection backup in parallel is automatic and requires no option.

You can parallelize validations of a file with VALIDATE ... SECTION SIZE.

The BACKUP command does not back up undo that is not needed for recovery of a backup. Undo is not needed if it was generated for a transaction that has already committed. This undo can represent the majority of undo in the database.

This undo optimization is built-in RMAN behavior and can not be disabled.

When performing block media recovery, RMAN automatically searches the flashback logs, if they are available, for the required blocks before searching backups. Using blocks from the flashback logs can significantly improve block media recovery performance.

Several database components and utilities, including RMAN, can now detect a corrupt block and record it in V$DATABASE_BLOCK_CORRUPTION. When instance recovery detects a corrupt block, it records it in this view automatically. Oracle Database automatically updates this view when block corruptions are detected or repaired. The VALIDATE command is enhanced with many new options such as VALIDATE ... BLOCK and VALIDATE DATABASE.

In addition to the existing BZIP2 algorithm for binary compression of backups, RMAN also supports the ZLIB algorithm. ZLIB runs faster than BZIP2, but produces larger files. You can use the CONFIGURE COMPRESSION ALGORITHM command to choose between BZIP2 (default) and ZLIB for RMAN backups.

You can create RMAN command files and stored scripts that accept user input at runtime. Thus, backup scripts can use RMAN substitution variables for tags, filenames, restore point names, and so on.

Media recovery of databases on symmetric multiprocessing (SMP) computers is now faster. The performance improvements include the following:

No configuration is necessary, although you can use new parallel recovery wait events for tuning if the default apply rate is not satisfactory.

When backing up archived redo log files located in the flash recovery area, RMAN can fail over to archiving destinations outside the recovery area. RMAN can use an intact copy of an archived log in an alternative location to continue writing backups when a log in the recovery area is missing or corrupted.

When you CONFIGURE an archived log deletion policy, the configuration applies to all archiving destinations, including the flash recovery area. Both BACKUP ... DELETE INPUT and DELETE ... ARCHIVELOG obey this configuration, as does the flash recovery area. You can also CONFIGURE an archived redo log deletion policy so that logs are eligible for deletion only after being applied to or transferred to standby database destinations. You can set the policy for mandatory standby destinations only, or for any standby destinations.

E.g.:

CONFIGURE ARCHIVELOG DELETION POLICY TO BACKED UP 2 TIMES TO DISK;

So in this case, the delete will only work for the archives that were backup at least 2 times.

You can create a long-term or archival backup with BACKUP ... KEEP that retains only the archived log files needed to make the backup consistent.

Together with this concept it brings the idea of restore points, that is basically a database point in time recover that you can created to be guaranteed using restore points.

E.g.:


RMAN> backup database format '/oracle/keep%u'

2> tag before_patch_10203

3> keep until time 'sysdate+3'

4> restore point bef_patch_10203;


Starting backup at 28-AUG-07

current log archived


using channel ORA_DISK_1

backup will be obsolete on date 31-AUG-07

archived logs required to recover from this backup will be backed up

channel ORA_DISK_1: starting compressed full datafile backup set

channel ORA_DISK_1: specifying datafile(s) in backup set

.

piece handle=/oracle/keep2piqh08t tag=BEFORE_PATCH_10203 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01

Finished backup at 28-AUG-07


RMAN> list restore point all;


SCN RSP Time Type Time Name

---------------- --------- ---------- --------- ----

18443737 28-AUG-07 BEF_PATCH_10203


Then if you decide to restore to this restore point you just need to use:


RMAN> restore database to restore point 'BEF_PATCH_10203';

RMAN> recover database to restore point 'BEF_PATCH_10203';



We could perform that before, using other methods, but this way I think you can let the process more clear and easy to execute.

The owner of a recovery catalog can GRANT or REVOKE access to a subset of the catalog to other database users in the same recovery catalog database. This subset is called a virtual private catalog. You can also use the IMPORT CATALOG command to merge one recovery catalog (or metadata for specific databases in the catalog) into another recovery catalog.

The tests for the below topis will be on the part 3 document.

You can now set persistent RMAN configurations for a primary or physical standby database when RMAN is not connected as TARGET to the database. RMAN works seamlessly on all databases in the Data Guard environment, enabling you to use backups made on one database for restore and recovery on another database. The same recovery catalog can manage metadata for all primary and standby databases.

You can use the DUPLICATE command to create a duplicate database or physical standby database over the network without a need for pre-existing database backups. This form of duplication is called active database duplication.

You can enable block change tracking on a physical standby database. When you back up the standby database, RMAN can use the block change tracking file to quickly identify the blocks that changed since the last incremental backup.

The Oracle VSS writer is integrated with applications that use the Volume Shadow Copy Service (VSS) infrastructure on Windows. You can use VSS-enabled software and storage systems to back up and restore an Oracle database. A key benefit is the ability to make a shadow copy of an open database.

You can enable the DB_LOST_WRITE_PROTECT initialization parameter to detect a lost write during managed recovery of a standby database or media recovery of a primary database. Lost write detection is disabled by default.

You can reverse a transaction. Oracle Database determines the dependencies between transactions and in effect creates a compensating transaction that reverses the unwanted changes. The database rewinds to a state as if the transaction, and any transactions that could be dependent on it, never occurred.

A flashback data archive enables the database to automatically track and store all transactional changes to a table for the duration of its lifetime. Thus, you do not need to build this functionality into database applications.

A flashback data archive is especially useful for compliance, audit reports, data analysis, and DSS (Decision Support Systems). You can use some of the logical flashback features with a flashback data archive to access data from far in the past.


Data Recovery Advisor


This advisory was created to automatically detect and provide repair options that will be executed following the user request.


According to Oracle the Recovery Advisor can detects and repair data failures before any Oracle process signals an error.


We can use it both with GUI interface that is the Enterprise Manager or the RMAN command line.


The RMAN commands to monitor and use the advisor are:

LIST FAILURE, ADVISE FAILURE, REPAIR FAILURE, and CHANGE FAILURE.


Before start my tests, I did a normal backup using BACKUP DATABASE command:


RMAN> backup database;


Starting backup at 22-AUG-07

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=120 device type=DISK

channel ORA_DISK_1: starting full datafile backup set

channel ORA_DISK_1: specifying datafile(s) in backup set

input datafile file number=00001 name=/oracle/oradata/test/system01.dbf

input datafile file number=00002 name=/oracle/oradata/test/sysaux01.dbf

input datafile file number=00004 name=/oracle/oradata/test/users01.dbf

input datafile file number=00006 name=/oracle/oradata/test/sqlmag01.dbf

input datafile file number=00007 name=/oracle/oradata/test/undotbs201.dbf

input datafile file number=00005 name=/oracle/oradata/test/test_block01.dbf

channel ORA_DISK_1: starting piece 1 at 22-AUG-07

channel ORA_DISK_1: finished piece 1 at 22-AUG-07

piece handle=/oracle/flash_recovery_area/TEST/backupset/2007_08_22/o1_mf_nnndf_TAG20070822T123733_3dr80cqp_.bkp tag=TAG20070822T123733 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:45

channel ORA_DISK_1: starting full datafile backup set

channel ORA_DISK_1: specifying datafile(s) in backup set

including current control file in backup set

including current SPFILE in backup set

channel ORA_DISK_1: starting piece 1 at 22-AUG-07

channel ORA_DISK_1: finished piece 1 at 22-AUG-07

piece handle=/oracle/flash_recovery_area/TEST/backupset/2007_08_22/o1_mf_ncsnf_TAG20070822T123733_3dr820wn_.bkp tag=TAG20070822T123733 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01

Finished backup at 22-AUG-07


Then I’m going to corrupt a datafile using the VI editor, I just edit it in the middle and deleted a row:


Vi test_block01.dbf


After waiting for a long time, also issuing some redo logs switch and checkpoints manually Oracle didn’t discovered anything:


RMAN> list failure;


no failures found that match specification


So I’ve used the VALIDATE command:


RMAN> validate database;


Starting validate at 22-AUG-07

using channel ORA_DISK_1

channel ORA_DISK_1: starting validation of datafile

channel ORA_DISK_1: specifying datafile(s) for validation

input datafile file number=00001 name=/oracle/oradata/test/system01.dbf

input datafile file number=00002 name=/oracle/oradata/test/sysaux01.dbf

input datafile file number=00004 name=/oracle/oradata/test/users01.dbf

input datafile file number=00006 name=/oracle/oradata/test/sqlmag01.dbf

input datafile file number=00007 name=/oracle/oradata/test/undotbs201.dbf

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03009: failure of validate command on ORA_DISK_1 channel at 08/22/2007 13:49:48

ORA-01122: database file 5 failed verification check

ORA-01110: data file 5: '/oracle/oradata/test/test_block01.dbf'

ORA-01565: error in identifying file '/oracle/oradata/test/test_block01.dbf'

ORA-27046: file size is not a multiple of logical block size

Additional information: 1


Then:


RMAN> list failure;


List of Database Failures

=========================


Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

262 HIGH OPEN 22-AUG-07 One or more non-system datafiles are corrupt


Now the advisory got the corrupted datafile, and to check out it suggestions we use ADVISE FAILURE:


RMAN> advise failure;


List of Database Failures

=========================


Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

262 HIGH OPEN 22-AUG-07 One or more non-system datafiles are corrupt


analyzing automatic repair options; this may take some time

using channel ORA_DISK_1

analyzing automatic repair options complete


Mandatory Manual Actions

========================

no manual actions available


Optional Manual Actions

=======================

no manual actions available


Automated Repair Options

========================

Option Repair Description

------ ------------------

1 Restore and recover datafile 5

Strategy: The repair includes complete media recovery with no data loss

Repair script: /oracle/diag/rdbms/test/test/hm/reco_3072078661.hm


It creates a script that can be executed manually or through the command REPAIR FAILURE, check the content of the file:


cat /oracle/diag/rdbms/test/test/hm/reco_3072078661.hm


# restore and recover datafile

sql 'alter database datafile 5 offline';

restore datafile 5;

recover datafile 5;

sql 'alter database datafile 5 online';


For this case of corruption the solutions it is really this one, the datafile is completed corrupted so a complete restore of it is necessary.


RMAN> repair failure;


Strategy: The repair includes complete media recovery with no data loss

Repair script: /oracle/diag/rdbms/test/test/hm/reco_3072078661.hm


contents of repair script:

# restore and recover datafile

sql 'alter database datafile 5 offline';

restore datafile 5;

recover datafile 5;

sql 'alter database datafile 5 online';


Do you really want to execute the above repair (enter YES or NO)? yes

executing repair script


sql statement: alter database datafile 5 offline


Starting restore at 22-AUG-07

using channel ORA_DISK_1


channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00005 to /oracle/oradata/test/test_block01.dbf

channel ORA_DISK_1: reading from backup piece /oracle/flash_recovery_area/TEST/backupset/2007_08_22/o1_mf_nnndf_TAG20070822T123733_3dr80cqp_.bkp

channel ORA_DISK_1: piece handle=/oracle/flash_recovery_area/TEST/backupset/2007_08_22/o1_mf_nnndf_TAG20070822T123733_3dr80cqp_.bkp tag=TAG20070822T123733

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

Finished restore at 22-AUG-07


Starting recover at 22-AUG-07

using channel ORA_DISK_1


starting media recovery

media recovery complete, elapsed time: 00:00:01


Finished recover at 22-AUG-07


sql statement: alter database datafile 5 online


repair failure complete


RMAN> list failure;


no failures found that match specification


Now I’m going to do another test, deleting one of my redo log files and after some switchs I got this on the list failure:


RMAN> list failure;


List of Database Failures

=========================


Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

478 CRITICAL OPEN 22-AUG-07 Redo log group 1 is unavailable

481 HIGH OPEN 22-AUG-07 Redo log file /oracle/oradata/test/redo01.log is missing


RMAN> advise failure;


List of Database Failures

=========================


Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

478 CRITICAL OPEN 22-AUG-07 Redo log group 1 is unavailable

481 HIGH OPEN 22-AUG-07 Redo log file /oracle/oradata/test/redo01.log is missing


analyzing automatic repair options; this may take some time

using channel ORA_DISK_1

analyzing automatic repair options complete


Mandatory Manual Actions

========================

no manual actions available


Optional Manual Actions

=======================

1. If file /oracle/oradata/test/redo01.log was unintentionally renamed or moved, restore it


Automated Repair Options

========================

Option Repair Description

------ ------------------

1 Clear unarchived redo log group 1

Strategy: The repair includes complete media recovery with no data loss

Repair script: /oracle/diag/rdbms/test/test/hm/reco_3529151602.hm



Actually it creates 2 scripts check below:


cat /oracle/diag/rdbms/test/test/hm/reco_3529151602.hm

# clear unarchived log group

sql "begin sys.dbms_ir.execsqlscript(filename => ''/oracle/diag/rdbms/test/test/hm/reco_2584336171.hm'' ); end;";


cat /oracle/diag/rdbms/test/test/hm/reco_2584336171.hm

begin

/*Clear the unarchived Log Group*/

execute immediate 'ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 1';

end;



Here is the repair:


RMAN> repair failure;


Strategy: The repair includes complete media recovery with no data loss

Repair script: /oracle/diag/rdbms/test/test/hm/reco_3529151602.hm


contents of repair script:

# clear unarchived log group

sql "begin sys.dbms_ir.execsqlscript(filename => ''/oracle/diag/rdbms/test/test/hm/reco_2584336171.hm'' ); end;";


Do you really want to execute the above repair (enter YES or NO)? yes

executing repair script


sql statement: begin sys.dbms_ir.execsqlscript(filename => ''/oracle/diag/rdbms/test/test/hm/reco_2584336171.hm'' ); end;

repair failure complete


And to finish this item I’m going to test a block recover also I will use this to show a little bit about ORACLE BBED – Block Editor, but this is an ORACLE INTERNAL TOOL, so if you use it Oracle will not support if you screw up your database, this is just for test:


To install it and to get some tips look at my blog http://dbabrain.blogspot.com/


SQL> create table obj

2 as

3 select object_id, object_name from dba_objects

4 where rownum < 599;


Table created.


Backup the database now:


SQL> select header_file, header_block, blocks

2 from dba_segments

3 where segment_name='OBJ';


HEADER_FILE HEADER_BLOCK BLOCKS

----------- ------------ ----------

4 11 8


SQL> select dbms_rowid.rowid_block_number(rowid) block, COUNT(1) Rows_per_block

from obj

GROUP BY dbms_rowid.rowid_block_number(rowid) ; 2 3


BLOCK ROWS_PER_BLOCK

---------- --------------

12 299

16 299



SQL> select file_name from dba_data_files where file_id=4;


FILE_NAME

---------------------------------------------------------------------------

/oracle/oradata/test/users01.dbf


I’m going to corrupt the block: 16


BBED: Release 2.0.0.0.0 - Limited Production on Fri Aug 24 09:53:45 2007


Copyright (c) 1982, 2007, Oracle. All rights reserved.


************* !!! For Oracle Internal Use only !!! ***************


BBED> set filename '/oracle/oradata/test/users01.dbf'

FILENAME /oracle/oradata/test/users01.dbf


BBED> set listfile '/oracle/bbed.log'

LISTFILE /oracle/bbed.log


BBED> set mode edit

MODE Edit


BBED> show

FILE# 0

BLOCK# 1

OFFSET 0

DBA 0x00000000 (0 0,1)

FILENAME /oracle/oradata/test/users01.dbf

BIFILE bifile.bbd

LISTFILE /oracle/bbed.log

BLOCKSIZE 8192

MODE Edit

EDIT Unrecoverable

IBASE Dec

OBASE Dec

WIDTH 80

COUNT 512

LOGFILE log.bbd

SPOOL No


Corrupting the blocking copying another one over it:


BBED> set offset 0

OFFSET 0


BBED> copy dba 4,10 to dba 4,16

File: /oracle/oradata/test/users01.dbf (4)

Block: 16 Offsets: 0 to 511 Dba:0x01000010

------------------------------------------------------------------------

21a20000 0a000001 e1c21201 00000204 ff990000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 0b000001 01000000 01000000 00000000 00000000



Bouncing the database, then try to select the obj table:


SQL> conn sqlmag/sqlmag

Connected.

SQL> select count(1) from obj;

select count(1) from obj

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 4, block # 16)

ORA-01110: data file 4: '/oracle/oradata/test/users01.dbf'


Now we go to RMAN and let’s see what it knows:


RMAN> list failure;


using target database control file instead of recovery catalog

List of Database Failures

=========================


Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

1616 HIGH OPEN 24-AUG-07 Datafile 4: '/oracle/oradata/test/users01.dbf' contains one or more corrupt blocks



RMAN> advise failure;


List of Database Failures

=========================


Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

1616 HIGH OPEN 24-AUG-07 Datafile 4: '/oracle/oradata/test/users01.dbf' contains one or more corrupt blocks


analyzing automatic repair options; this may take some time

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=135 device type=DISK

analyzing automatic repair options complete


Mandatory Manual Actions

========================

no manual actions available


Optional Manual Actions

=======================

no manual actions available


Automated Repair Options

========================

Option Repair Description

------ ------------------

1 Perform block media recovery of block 16 in file 4

Strategy: The repair includes complete media recovery with no data loss

Repair script: /oracle/diag/rdbms/test/test/hm/reco_1747659741.hm


cat /oracle/diag/rdbms/test/test/hm/reco_1747659741.hm

# block media recovery

recover datafile 4 block 16;


RMAN> repair failure;


Strategy: The repair includes complete media recovery with no data loss

Repair script: /oracle/diag/rdbms/test/test/hm/reco_1747659741.hm


contents of repair script:

# block media recovery

recover datafile 4 block 16;


Do you really want to execute the above repair (enter YES or NO)? yes

executing repair script


Starting recover at 24-AUG-07

using channel ORA_DISK_1

searching flashback logs for block images until SCN 18006876

finished flashback log search, restored 1 blocks


starting media recovery

media recovery complete, elapsed time: 00:00:03


Finished recover at 24-AUG-07

repair failure complete



SQL> select count(1) from obj;


COUNT(1)

----------

598


So this advisor seems to be pretty good mainly for those DBA’s that are starting to use the RMAN, and don’t forget BBED is a very dangerous tool, don’t play with it in a production database never.


There are a lot of others things you can do such as change the priority of the failures, look the details of a specific failure or repair a specific failure, but I think you got the idea with these examples, let’s move forward.


Multisection Backup


Originally RMAN running in parallel means different channel backing up different files at the same time now you can have different channels backing up pieces of the same file at the same time.


RMAN> CONFIGURE DEVICE TYPE DISK PARALLELISM 4 BACKUP TYPE TO BACKUPSET;


new RMAN configuration parameters:

CONFIGURE DEVICE TYPE DISK PARALLELISM 4 BACKUP TYPE TO BACKUPSET;

new RMAN configuration parameters are successfully stored


RMAN> backup SECTION SIZE 50m tablespace users;


Starting backup at 23-AUG-07

using channel ORA_DISK_1

using channel ORA_DISK_2

using channel ORA_DISK_3

using channel ORA_DISK_4

channel ORA_DISK_1: starting full datafile backup set

channel ORA_DISK_1: specifying datafile(s) in backup set

input datafile file number=00004 name=/oracle/oradata/test/users01.dbf

backing up blocks 1 through 6400

channel ORA_DISK_1: starting piece 1 at 23-AUG-07

channel ORA_DISK_2: starting full datafile backup set

channel ORA_DISK_2: specifying datafile(s) in backup set

input datafile file number=00004 name=/oracle/oradata/test/users01.dbf

backing up blocks 6401 through 12800

channel ORA_DISK_2: starting piece 2 at 23-AUG-07

channel ORA_DISK_3: starting full datafile backup set

channel ORA_DISK_3: specifying datafile(s) in backup set

input datafile file number=00004 name=/oracle/oradata/test/users01.dbf

backing up blocks 12801 through 19200

channel ORA_DISK_3: starting piece 3 at 23-AUG-07

channel ORA_DISK_1: finished piece 1 at 23-AUG-07

piece handle=/oracle/flash_recovery_area/TEST/backupset/2007_08_23/o1_mf_nnndf_TAG20070823T153529_3dv6ssy7_.bkp tag=TAG20070823T153529 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:14

channel ORA_DISK_2: finished piece 2 at 23-AUG-07

piece handle=/oracle/flash_recovery_area/TEST/backupset/2007_08_23/o1_mf_nnndf_TAG20070823T153529_3dv6t07o_.bkp tag=TAG20070823T153529 comment=NONE

channel ORA_DISK_2: backup set complete, elapsed time: 00:00:07

channel ORA_DISK_3: finished piece 3 at 23-AUG-07

piece handle=/oracle/flash_recovery_area/TEST/backupset/2007_08_23/o1_mf_nnndf_TAG20070823T153529_3dv6t5dz_.bkp tag=TAG20070823T153529 comment=NONE

channel ORA_DISK_3: backup set complete, elapsed time: 00:00:01

Finished backup at 23-AUG-07


As you can see it created in parallel 3 files that were enough to handle the section size I’ve specified.

This is very useful when we have to deal with big datafiles.


Faster backup compression


A normal backup without compression has 1.07gb and took 44 seconds to complete:


BS Key Type LV Size Device Type Elapsed Time Completion Time

------- ---- -- ---------- ----------- ------------ ---------------

19 Full 1.07G DISK 00:00:44 24-AUG-07


With the default compression algorithm BZIP2 it has 193mb and took 1 minute and 23 seconds to complete:


RMAN> CONFIGURE DEVICE TYPE DISK BACKUP TYPE TO COMPRESSED BACKUPSET ;


BS Key Type LV Size Device Type Elapsed Time Completion Time

------- ---- -- ---------- ----------- ------------ ---------------

21 Full 193.70M DISK 00:01:23 24-AUG-07


Now the new option of algotith ZLIB, it is a little bigger 212mb and took 57 seconds:


RMAN> CONFIGURE COMPRESSION ALGORITHM 'ZLIB';


BS Key Type LV Size Device Type Elapsed Time Completion Time

------- ---- -- ---------- ----------- ------------ ---------------

25 Full 212.40M DISK 00:00:57 24-AUG-07


On this test database it seems to be very good, but it is too small to say, so only in a production one we will be able to say how god it is.


Improved scripting with RMAN substitution variables


Likewise sql*plus now we can use substitution variables inside rman scripts, and with the shell scripts variables our scripts can be written much easier now, let’s check an example:


Rman.cmd


run{

backup database tag full_&1;

crosscheck backup;

delete noprompt expired backup;

delete noprompt obsolete;

}


Rman_shell.sh

rman target / @'/home/oracle/rman.cmd' using $1




./rman_shell.sh prod_28082007


Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 11:29:13 2007


Copyright (c) 1982, 2007, Oracle. All rights reserved.


connected to target database: TEST (DBID=1928403553)


RMAN> run{

2> backup database tag full_prod_28082007;

3> crosscheck backup;

4> delete noprompt expired backup;

5> delete noprompt obsolete;

6> }

7>

8>

Starting backup at 28-AUG-07

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=127 device type=DISK


The substation variable is the same as sql*plus ‘&’, and to specify the variables to be replaces there is the new clause ‘USING’.


Recovery catalog enhancements


When you have a big company with many DBA’s that take care of a set of databases using RMAN with only one centralized Recovery Catalog, that can result in big mistakes because every dba can accidentally interfere on an database that doesn’t belongs to them, now Oracle created the Virtual Private Catalog, that means you can create different user for each database on the catalog and they will only have access to it’s data.


First I’m going to create the catalog:


SQL> create tablespace rman_catalog

2 datafile '/oracle/oradata/test/rman_catalog01.dbf' size 100m;


Tablespace created.


SQL> create user rman_central identified by rman_central

2 default tablespace rman_catalog;


User created.


SQL> grant RECOVERY_CATALOG_OWNER to rman_central;


Grant succeeded.


SQL> ALTER USER RMAN_CENTRAL QUOTA UNLIMITED ON RMAN_CATALOG;


User altered.


rman


Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 13:54:16 2007


Copyright (c) 1982, 2007, Oracle. All rights reserved.


RMAN> connect catalog rman_central/rman_central@rmancat


connected to recovery catalog database


RMAN> CREATE CATALOG TABLESPACE RMAN_CATALOG;


recovery catalog created



Now we create the catalog owner for the database db1:


SQL> create user rman_db1 identified by rman_db1

2 default tablespace rman_catalog;


User created.


SQL> grant RECOVERY_CATALOG_OWNER to rman_db1;


Grant succeeded.


SQL> alter user rman_db1 quota unlimited on rman_catalog;


User altered.


Granting the user the right to register databases, also I could register the database with the main user RMAN_CENTRAL and then just give the privilege to use it instead of register.

To register:

grant register database to user_name;

To use the virtual catalog for a registered database:

Grant catalog for database_name to user_name;


rman


Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 14:04:08 2007


Copyright (c) 1982, 2007, Oracle. All rights reserved.


RMAN> connect catalog rman_central/rman_central@rmancat


connected to recovery catalog database


RMAN> grant register database to rman_db1;


Grant succeeded.


Creating the Virtual Catalog:


rman


Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 14:10:38 2007


Copyright (c) 1982, 2007, Oracle. All rights reserved.


RMAN> connect catalog rman_db1/rman_db1@rmancat


connected to recovery catalog database


RMAN> create virtual catalog;


found eligible base catalog owned by RMAN_CENTRAL

created virtual catalog against base catalog owned by RMAN_CENTRAL


Registering the database in the virtual catalog:


rman target / catalog rman_db1/rman_db1@rmancat


Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 14:12:32 2007


Copyright (c) 1982, 2007, Oracle. All rights reserved.


connected to target database: TEST (DBID=1928403553)

connected to recovery catalog database


RMAN> register database;


database registered in recovery catalog

starting full resync of recovery catalog

full resync complete


RMAN> report schema;


Report of database schema for database with db_unique_name TEST


List of Permanent Datafiles

===========================

File Size(MB) Tablespace RB segs Datafile Name

---- -------- -------------------- ------- ------------------------

1 710 SYSTEM YES /oracle/oradata/test/system01.dbf

2 708 SYSAUX NO /oracle/oradata/test/sysaux01.dbf

3 10 SQLMAG NO /oracle/oradata/test/sqlmag01.dbf

4 150 USERS NO /oracle/oradata/test/users01.dbf

5 0 TEST_BLOCK NO /oracle/oradata/test/test_block01.dbf

6 100 RMAN_CATALOG NO /oracle/oradata/test/rman_catalog01.dbf

7 100 UNDOTBS2 YES /oracle/oradata/test/undotbs201.dbf


List of Temporary Files

=======================

File Size(MB) Tablespace Maxsize(MB) Tempfile Name

---- -------- -------------------- ----------- --------------------

1 39 TEMP 32767 /oracle/oradata/test/temp01.dbf


Creating another user on the catalog and checking what it can see:


SQL> create user rman_db2 identified by rman_db2

2 default tablespace rman_catalog;


User created.


SQL> grant RECOVERY_CATALOG_OWNER to rman_db1;


Grant succeeded.


SQL> grant RECOVERY_CATALOG_OWNER to rman_db2;


Grant succeeded.


SQL> alter user rman_db2 quota unlimited on rman_catalog;


User altered.


rman


Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 14:14:16 2007


Copyright (c) 1982, 2007, Oracle. All rights reserved.


RMAN> connect catalog rman_central/rman_central@rmancat


connected to recovery catalog database


RMAN> GRANT REGISTER DATABASE TO rman_db2;


Grant succeeded.


RMAN> exit



Recovery Manager complete.

[oracle@test01 ~]$ rman target / catalog rman_db2/rman_db2@rmancat


Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 14:16:29 2007


Copyright (c) 1982, 2007, Oracle. All rights reserved.


connected to target database: TEST (DBID=1928403553)

connected to recovery catalog database


RMAN> report schema;


RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure of report command at 08/28/2007 14:16:34

RMAN-06428: recovery catalog is not installed


RMAN>


As you can see even connect on the same catalog only the rman_db1 user and the rman_central are able to check information about TEST database.




Rodrigo Righetti

Rodrigo.righetti@gmail.com

http://dbabrain.blogspot.com