Oracle 11g New Features part 2
Oracle Database Fault Diagnosability Infrastructure.
Every DBA knows pretty well (at least should know) the trace directory structure used to generate the alert_sid.log and all the others traces (BDUMP,CDUMP, UDUMP).
Now with Oracle 11g they’ve created a much more complex structure to trace errors, in first place the directory structure is called ADR(Automatic Diagnostic Repository), and when a critical error occurs it is assigned an incident number and all the trace files (diagnostic data), will be tagged with this number, so database bugs, metadata corruption and process errors will be in the ADR now.
The purpose of the ADR is basically speeding up diagnosing and solving issues as well facilitate the Oracle support.
The ADR has several new technologies that will help out on that:
1 - Automatic capture of diagnostic data upon first failure.
When an error occurs generally we used to have just one trace about it, now Oracle collects information from several processes involved on the issue and store them on the ADR, the data can be accessed and interpreted by a command line tool(ADRCI) or by the Enterprise Manager DBconsole.
2 – Standardized trace formats.
The trace format was standardized for all the database components, making easy to understand and diagnose the issue.
3 – Health checks
When detecting a critical error Oracle can run some health checks to go deeper on the error, and all this data will be added to the ADR. The DBA can manually execute health checks on a regular basis or when required by Oracle Support.
4 – Incident packaging service (IPS) and incident packages
Now we have all our related traces tagged with the incident number, so we can use the IPS to create a package of traces and zip it to send to Oracle support, also you can view and edit this package after you have created it.
5 – Data Recovery Advisor (RMAN utility)
There is an interaction between the Data Recovery Advisor and the health checks, and then you can see data corruption problems and also the advice in how to correct it.
6 – SQL Test Case Builder
One of the biggest issues when dealing with SQL issues is to gather all the information related with the performance problem, now it is going to collect all the information necessary and this can be uploaded to Oracle support where they should be able to reproduce in details your problem.
7 – Incident Flood Control
When we are facing some problems bugs perhaps we can have a flood of traces, with the ADR there an Incident Flood Control that will prevent it to fill up because of this kind of problems.
An Incident key is created based on a problem occurrence.
The ADR is created based on the parameter DIAGNOSTIC_DEST, and the parameters *_DUMP_DEST are deprecated.
ADR Structure:
To check your structure:
SELECT * FROM V$DIAG_INFO
INST_ID NAME VALUE
---------- ------------------------ ------------------------------------------------------------
1 Diag Enabled TRUE
1 ADR Base /oracle
1 ADR Home /oracle/diag/rdbms/test/test
1 Diag Trace /oracle/diag/rdbms/test/test/trace
1 Diag Alert /oracle/diag/rdbms/test/test/alert
1 Diag Incident /oracle/diag/rdbms/test/test/incident
1 Diag Cdump /oracle/diag/rdbms/test/test/cdump
1 Health Monitor /oracle/diag/rdbms/test/test/hm
1 Default Trace File /oracle/diag/rdbms/test/test/trace/test_ora_12295.trc
1 Active Problem Count 4
1 Active Incident Count 11
11 rows selected.
The Alert Log file
There is a new Alert Log file that is on the ALERT folder, it is a XML file, it can be read using the ADRCI tool or the Enterprise manager, but for backward compatibility Oracle still keeps the old text version on the TRACE folder.
new Alert Log:
/oracle/diag/rdbms/test/test/alert/log.xml
<msg time='2007-08-17T13:37:01.521+01:00' org_id='oracle' comp_id='rdbms'
client_id='' type='UNKNOWN' level='16'
module='sqlplus@test01 (TNS V1-V3)' pid='29691'>
<txt>Active process 28072 user 'oracle' program 'oracle@test01'
</txt>
</msg>
<msg time='2007-08-17T13:37:01.521+01:00' org_id='oracle' comp_id='rdbms'
client_id='' type='UNKNOWN' level='16'
module='sqlplus@test01 (TNS V1-V3)' pid='29691'>
<txt>Active process 30002 u
[oracle@test01 alert]$ head log.xml
<msg time='2007-08-10T15:25:16.833+01:00' org_id='oracle' comp_id='rdbms'
msg_id='opistr_real:871:3971575317' type='NOTIFICATION' group='startup'
level='16' pid='10143' version='1'>
<txt>Starting ORACLE instance (normal)
</txt>
</msg>
<msg time='2007-08-10T15:25:16.966+01:00' org_id='oracle' comp_id='rdbms'
msg_id='ksunfy:13399:2937430291' type='NOTIFICATION' group='startup'
level='16' pid='10143'>
<txt>LICENSE_MAX_SESSION = 0
Now the generated trace files *.trc can come together with another file *.trm that is a map to help on the trace navigation.
test_m000_12578.trm
test_m000_12578.trc
When an incident is created all the dump traces will be created on its own directory based on the INCIDENT folder, like this:
/oracle/diag/rdbms/test/test/incident
incdir_12021
incdir_12101
incdir_12349
incdir_12350
incdir_12351
incdir_12352
incdir_13205
incdir_1411
incdir_1467
incdir_217
incdir_218
/oracle/diag/rdbms/test/test/incident/incdir_218
test_ora_18892_i218.trc
test_ora_18892_i218.trm
That is the Incident view on the Enterprise Manager:
It is pretty fast and quick to dig up on the traces of an incident using the Enterprise Manager, another way is using the ADRCI tool.
ADRCI is a very powerful tool that enables you to view all the ADR and Health reports, also you can create the packages and zip them with it with IPS, and like SQL*PLUS you can run scripts or run it in background.
To start it just run adrci, and state help to see the list of commands:
adrci
ADRCI: Release 11.1.0.6.0 - Beta on Thu Aug 23 12:33:37 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.
ADR base = "/oracle"
adrci> help
HELP [topic]
Available Topics:
CREATE REPORT
ECHO
EXIT
HELP
HOST
IPS
PURGE
RUN
SET BASE
SET BROWSER
SET CONTROL
SET ECHO
SET EDITOR
SET HOMES | HOME | HOMEPATH
SET TERMOUT
SHOW ALERT
SHOW BASE
SHOW CONTROL
SHOW HM_RUN
SHOW HOMES | HOME | HOMEPATH
SHOW INCDIR
SHOW INCIDENT
SHOW PROBLEM
SHOW REPORT
SHOW TRACEFILE
SPOOL
There are other commands intended to be used directly by Oracle, type
"HELP EXTENDED" to see the list
adrci>
Checking the last 3 entries in the alert log:
adrci> show alert -tail 3
2007-08-22 17:22:54.584000 +01:00
Completed: ALTER DATABASE OPEN
2007-08-22 17:27:39.553000 +01:00
Starting background process CJQ0
CJQ0 started with pid=30, OS id=26876
2007-08-22 22:00:00.200000 +01:00
Setting Resource Manager plan SCHEDULER[0x2C0B]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
2007-08-22 22:00:06.578000 +01:00
Wed Aug 22 22:00:06 2007
Logminer Bld: Lockdown Complete. DB_TXN_SCN is UnwindToSCN (LockdownSCN) is 17789108
2007-08-22 22:00:24.585000 +01:00
Begin automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK"
2007-08-22 22:01:31.584000 +01:00
End automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK"
2007-08-22 22:01:52.155000 +01:00
Thread 1 advanced to log sequence 1761
Current log# 3 seq# 1761 mem# 0: /oracle/oradata/test/redo03.log
2007-08-23 00:00:35.913000 +01:00
Thread 1 advanced to log sequence 1762
Current log# 1 seq# 1762 mem# 0: /oracle/oradata/test/redo01.log
2007-08-23 02:00:00.118000 +01:00
Clearing Resource Manager plan via parameter
2007-08-23 10:19:09.405000 +01:00
Thread 1 advanced to log sequence 1763
Current log# 2 seq# 1763 mem# 0: /oracle/oradata/test/redo02.log
Checking the incidents:
adrci> show incident
ADR Home = /oracle/diag/rdbms/test/test:
*************************************************************************
INCIDENT_ID PROBLEM_KEY CREATE_TIME
-------------------- ----------------------------------------------------------- ----------------------------------------
13205 ORA 603 2007-08-22 13:11:36.611960 +01:00
12352 ORA 1578 2007-08-22 14:01:30.585383 +01:00
12351 ORA 600 [krhpfh_03-1208] 2007-08-22 13:11:30.674642 +01:00
12350 ORA 600 [krhpfh_03-1208] 2007-08-22 13:11:28.362289 +01:00
12349 ORA 600 [krhpfh_03-1208] 2007-08-22 13:11:24.058978 +01:00
12101 ORA 600 [krhpfh_03-1208] 2007-08-22 13:11:45.031386 +01:00
12021 ORA 600 [krhpfh_03-1208] 2007-08-22 13:11:39.667875 +01:00
1467 ORA 4031 2007-08-11 07:51:07.343231 +01:00
1411 ORA 4031 2007-08-11 07:51:29.534688 +01:00
218 ORA 4031 2007-08-10 16:45:21.005578 +01:00
217 ORA 4031 2007-08-10 16:30:27.821633 +01:00
11 rows fetched
As you can see there are a lot of thing you can do with this tool, as well you can use the Enterprise Manager, this tool is well documented, so don’t worry when you start to use it.
http://download.oracle.com/docs/cd/B28359_01/server.111/b28319/adrci.htm#BGBBADAB
Health Check Monitor:
It will generate reports of their diagnoses and also recommendations for solving the issues, all will be stored in the ADR.
It can be used in 2 ways:
Reactive – It can run automatically in response to a critical error.
Manual – The DBA manually can use the package DBMS_HM or the DBConsole, Oracle support can ask you to run this checker when working on a SR.
These are the types of Health Checks:
DB Structure Integrity Check – This check is going to verify the whole structure of the database, so if it is in NOMOUNT state only the controlfiles will be checked, when it is in MOUNT or OPEN, then all files are checked.
Data Block Integrity Check - As the name says it is going to check the disk image block for corruption such as checksum failures, head/tail mismatch and logical inconsistencies within the block. You can find information about corrupted block on V$DATABASE_BLOCK_CORRUPTION.
Redo Integrity Check – This one will be looking for the redo logs contents also the archive logs if the database is in Archive log mode.
Undo Segment Integrity Check – This check look for logical undo corruptions that may occur during rollback operations, if one is found it will use PMON and SMON to try to recover the corrupted transaction, if the recovery fails it will store the information on V$CORRUPT_XID_LIST. Most undo corruptions can be resolved by forcing a commit.
Transaction Integrity Check – The same as the Undo Segment, but this one will look for a specific transaction.
Dictionary Integrity Check – This check verify the core dictionary, objects such as tab$, clu$, fet$, uet$, seg$, undo$, ts$, file$, obj$, ind$, icol$, col$, user$, con$, cdef$, ccol$, bootstrap$, objauth$, ugroup$, tsq$, syn$, view$, typed_view$, superobj$, seq$, lob$, coltype$, subcoltype$, ntab$, refcon$, opqtype$, dependency$, access$, viewcon$, icoldep$, dual$, sysauth$, objpriv$, defrole$, and ecol$.
This check will verify relationship between these tables to avoid orphan records for example.
Running it manually:
First of all you can check what kinds of check you can perform, using the view V$HM_CHECK:
SQL> SELECT NAME FROM V$HM_CHECK WHERE INTERNAL_CHECK='N';
NAME
----------------------------------------------------------------
DB Structure Integrity Check
Data Block Integrity Check
Redo Integrity Check
Transaction Integrity Check
Undo Segment Integrity Check
Dictionary Integrity Check
6 rows selected.
Then use the package dbms_hm:
SQL> begin
2 dbms_hm.run_check('Dictionary Integrity Check', 'test_hm_1');
3 end;
4 /
PL/SQL procedure successfully completed.
To view the finds report we can use:
Enterprise Manager
DBMS_HM package
ADRCI tool
RMAN
VIEWS – V$HM_RUN, V$HM_FINDING
Let’s check with the DBMS package:
SET LONG 100000
SET LONGCHUNKSIZE 1000
SET PAGESIZE 1000
SET LINESIZE 512
SELECT DBMS_HM.GET_RUN_REPORT('test_hm_1') FROM DUAL;
SQL> SQL> SQL> SQL>
DBMS_HM.GET_RUN_REPORT('TEST_HM_1')
Basic Run Information
Run Name : test_hm_1
Run Id : 842
Check Name : Dictionary Integrity Check
Mode : MANUAL
Status : COMPLETED
Start Time : 2007-08-23 13:42:10.660476 +01:00
End Time : 2007-08-23 13:42:13.838216 +01:00
Error Encountered : 0
Source Incident Id : 0
Number of Incidents Created : 0
Input Paramters for the Run
TABLE_NAME=ALL_CORE_TABLES
CHECK_MASK=ALL
Run Findings And Recommendations
Finding
Finding Name : Dictionary Inconsistency
Finding ID : 843
Type : FAILURE
Status : OPEN
Priority : CRITICAL
Message : SQL dictionary health check: file$ pk 42 on object FILE$
failed
Message : Damaged rowid is AAAAARAABAAAAByAAC - description: No further
damage description available
Finding
Finding Name : Dictionary Inconsistency
Finding ID : 846
Type : FAILURE
Status : OPEN
Priority : CRITICAL
Message : SQL dictionary health check: dependency$.dobj# fk 126 on
object DEPENDENCY$ failed
Message : Damaged rowid is AAAABnAABAAAOoHABJ - description: No further
damage description available
In my case it found 2 inconsistencies in my dictionary. Just for record I didn’t force this error.
Now using rman to list and get the recommendations to solve the issue:
RMAN> list failure;
using target database control file instead of recovery catalog
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
846 CRITICAL OPEN 23-AUG-07 SQL dictionary health check: dependency$.dobj# fk 126 on object DEPENDENCY$ failed
843 CRITICAL OPEN 23-AUG-07 SQL dictionary health check: file$ pk 42 on object FILE$ failed
RMAN> ADVISE FAILURE;
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
846 CRITICAL OPEN 23-AUG-07 SQL dictionary health check: dependency$.dobj# fk 126 on object DEPENDENCY$ failed
843 CRITICAL OPEN 23-AUG-07 SQL dictionary health check: file$ pk 42 on object FILE$ failed
Mandatory Manual Actions
========================
1. Please contact Oracle Support Services to resolve failure 846: SQL dictionary health check: dependency$.dobj# fk 126 on object DEPENDENCY$ failed
2. Please contact Oracle Support Services to resolve failure 843: SQL dictionary health check: file$ pk 42 on object FILE$ failed
Optional Manual Actions
=======================
no manual actions available
Automated Repair Options
========================
no automatic repair options available
Both cases are orphan objects in the dictionary, sometimes is quite simple to solve that, but in any case don’t touch in the dictionary without the written Oracle Support advise, otherwise you will not be able to blame them if something goes bad, hehe.
But just for tests purposes take a look in what I did to solve this issue:
Checking the mentioned rowed:
SQL> select file# from file$ where rowid = 'AAAAARAABAAAAByAAC';
FILE#
----------
3
SQL> select D_OBJ# from dependency$ where rowid='AAAABnAABAAAOoHABJ';
D_OBJ#
----------
63944
Checking the correlated views to verify it is a orphan record:
SQL> select file_name from dba_data_files where file_id=3;
no rows selected
SQL> select object_name from dba_objects where object_id=63944;
no rows selected
Deleting from the dictionary:
SQL> delete from file$ where file#=3;
1 row deleted.
SQL> delete from dependency$ where D_OBJ#=63944;
1 row deleted.
SQL> commit;
Commit complete.
Flushing the shared pool to clean up the dictionary cache:
SQL> alter system flush shared_pool;
System altered.
Running the health check again:
SQL> begin
dbms_hm.run_check('Dictionary Integrity Check', 'test_hm_2');
end;
/
SET LONG 100000
SET LONGCHUNKSIZE 1000
SET PAGESIZE 1000
SET LINESIZE 512
SELECT DBMS_HM.GET_RUN_REPORT('test_hm_2') FROM DUAL;
SQL> SQL> SQL> SQL>
DBMS_HM.GET_RUN_REPORT('TEST_HM_2')
Basic Run Information
Run Name : test_hm_2
Run Id : 872
Check Name : Dictionary Integrity Check
Mode : MANUAL
Status : COMPLETED
Start Time : 2007-08-23 14:01:15.988291 +01:00
End Time : 2007-08-23 14:01:17.582556 +01:00
Error Encountered : 0
Source Incident Id : 0
Number of Incidents Created : 0
Input Paramters for the Run
TABLE_NAME=ALL_CORE_TABLES
CHECK_MASK=ALL
Run Findings And Recommendations
Backup and Recovery New Features:
Below are the new features described on the Oracle documentation, for the most interesting features I’ve done some examples:
Data Recovery Advisor
Data Recovery Advisor is a built-in tool to automatically diagnose data failures and recommend repairs. You can repair failures manually or request that they be repaired automatically. Data Recovery Advisor supports the LIST FAILURE, CHANGE FAILURE, ADVISE FAILURE, and REPAIR FAILURE commands. This Advisor is based on the Health Check Monitor.
Backup of read-only transportable tablespaces
In previous releases, RMAN could not back up transportable tablespaces until they were made read/write at the destination database. Now RMAN can back up transportable tablespaces when they are not read/write and restore these backups.
Backup and recovery enhancements in Oracle Enterprise Manager
Enterprise Manager includes an interface for Data Recovery Advisor.
Multisection backups
RMAN can back up a single file in parallel by dividing the work among multiple channels. Each channel backs up one file section. You create a multisection backup by specifying SECTION SIZE on the BACKUP command. Restoring a multisection backup in parallel is automatic and requires no option.
You can parallelize validations of a file with VALIDATE ... SECTION SIZE.
Undo optimization
The BACKUP command does not back up undo that is not needed for recovery of a backup. Undo is not needed if it was generated for a transaction that has already committed. This undo can represent the majority of undo in the database.
This undo optimization is built-in RMAN behavior and can not be disabled.
Improved block media recovery performance
When performing block media recovery, RMAN automatically searches the flashback logs, if they are available, for the required blocks before searching backups. Using blocks from the flashback logs can significantly improve block media recovery performance.
Improved block corruption detection
Several database components and utilities, including RMAN, can now detect a corrupt block and record it in V$DATABASE_BLOCK_CORRUPTION. When instance recovery detects a corrupt block, it records it in this view automatically. Oracle Database automatically updates this view when block corruptions are detected or repaired. The VALIDATE command is enhanced with many new options such as VALIDATE ... BLOCK and VALIDATE DATABASE.
Faster backup compression
In addition to the existing BZIP2 algorithm for binary compression of backups, RMAN also supports the ZLIB algorithm. ZLIB runs faster than BZIP2, but produces larger files. You can use the CONFIGURE COMPRESSION ALGORITHM command to choose between BZIP2 (default) and ZLIB for RMAN backups.
Improved scripting with RMAN substitution variables
You can create RMAN command files and stored scripts that accept user input at runtime. Thus, backup scripts can use RMAN substitution variables for tags, filenames, restore point names, and so on.
Improved media recovery performance for databases on SMP systems
Media recovery of databases on symmetric multiprocessing (SMP) computers is now faster. The performance improvements include the following:
More parallelism
More efficient asynchronous redo read, parse, and apply
Fewer synchronization points in the parallel apply algorithm
The media recovery checkpoint at a redo log boundary no longer blocks the apply of the next log
No configuration is necessary, although you can use new parallel recovery wait events for tuning if the default apply rate is not satisfactory.
Backup failover for archived redo logs in the flash recovery area
When backing up archived redo log files located in the flash recovery area, RMAN can fail over to archiving destinations outside the recovery area. RMAN can use an intact copy of an archived log in an alternative location to continue writing backups when a log in the recovery area is missing or corrupted.
Archived log deletion policy enhancements
When you CONFIGURE an archived log deletion policy, the configuration applies to all archiving destinations, including the flash recovery area. Both BACKUP ... DELETE INPUT and DELETE ... ARCHIVELOG obey this configuration, as does the flash recovery area. You can also CONFIGURE an archived redo log deletion policy so that logs are eligible for deletion only after being applied to or transferred to standby database destinations. You can set the policy for mandatory standby destinations only, or for any standby destinations.
E.g.:
CONFIGURE ARCHIVELOG DELETION POLICY TO BACKED UP 2 TIMES TO DISK;
So in this case, the delete will only work for the archives that were backup at least 2 times.
Improved handling of long-term backups
You can create a long-term or archival backup with BACKUP ... KEEP that retains only the archived log files needed to make the backup consistent.
Together with this concept it brings the idea of restore points, that is basically a database point in time recover that you can created to be guaranteed using restore points.
E.g.:
RMAN> backup database format '/oracle/keep%u'
2> tag before_patch_10203
3> keep until time 'sysdate+3'
4> restore point bef_patch_10203;
Starting backup at 28-AUG-07
current log archived
using channel ORA_DISK_1
backup will be obsolete on date 31-AUG-07
archived logs required to recover from this backup will be backed up
channel ORA_DISK_1: starting compressed full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
….
piece handle=/oracle/keep2piqh08t tag=BEFORE_PATCH_10203 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 28-AUG-07
RMAN> list restore point all;
SCN RSP Time Type Time Name
---------------- --------- ---------- --------- ----
18443737 28-AUG-07 BEF_PATCH_10203
Then if you decide to restore to this restore point you just need to use:
RMAN> restore database to restore point 'BEF_PATCH_10203';
RMAN> recover database to restore point 'BEF_PATCH_10203';
We could perform that before, using other methods, but this way I think you can let the process more clear and easy to execute.
Recovery catalog enhancements
The owner of a recovery catalog can GRANT or REVOKE access to a subset of the catalog to other database users in the same recovery catalog database. This subset is called a virtual private catalog. You can also use the IMPORT CATALOG command to merge one recovery catalog (or metadata for specific databases in the catalog) into another recovery catalog.
The tests for the below topis will be on the part 3 document.
Improved integration with Data Guard
You can now set persistent RMAN configurations for a primary or physical standby database when RMAN is not connected as TARGET to the database. RMAN works seamlessly on all databases in the Data Guard environment, enabling you to use backups made on one database for restore and recovery on another database. The same recovery catalog can manage metadata for all primary and standby databases.
Network-enabled database duplication without backups
You can use the DUPLICATE command to create a duplicate database or physical standby database over the network without a need for pre-existing database backups. This form of duplication is called active database duplication.
Block change tracking support for standby databases
You can enable block change tracking on a physical standby database. When you back up the standby database, RMAN can use the block change tracking file to quickly identify the blocks that changed since the last incremental backup.
Integration with VSS-enabled applications
The Oracle VSS writer is integrated with applications that use the Volume Shadow Copy Service (VSS) infrastructure on Windows. You can use VSS-enabled software and storage systems to back up and restore an Oracle database. A key benefit is the ability to make a shadow copy of an open database.
Lost write detection
You can enable the DB_LOST_WRITE_PROTECT initialization parameter to detect a lost write during managed recovery of a standby database or media recovery of a primary database. Lost write detection is disabled by default.
Oracle Flashback Transaction Backout
You can reverse a transaction. Oracle Database determines the dependencies between transactions and in effect creates a compensating transaction that reverses the unwanted changes. The database rewinds to a state as if the transaction, and any transactions that could be dependent on it, never occurred.
Flashback data archive
A flashback data archive enables the database to automatically track and store all transactional changes to a table for the duration of its lifetime. Thus, you do not need to build this functionality into database applications.
A flashback data archive is especially useful for compliance, audit reports, data analysis, and DSS (Decision Support Systems). You can use some of the logical flashback features with a flashback data archive to access data from far in the past.
Data Recovery Advisor
This advisory was created to automatically detect and provide repair options that will be executed following the user request.
According to Oracle the Recovery Advisor can detects and repair data failures before any Oracle process signals an error.
We can use it both with GUI interface that is the Enterprise Manager or the RMAN command line.
The RMAN commands to monitor and use the advisor are:
LIST FAILURE, ADVISE FAILURE, REPAIR FAILURE, and CHANGE FAILURE.
Before start my tests, I did a normal backup using BACKUP DATABASE command:
RMAN> backup database;
Starting backup at 22-AUG-07
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=120 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/oracle/oradata/test/system01.dbf
input datafile file number=00002 name=/oracle/oradata/test/sysaux01.dbf
input datafile file number=00004 name=/oracle/oradata/test/users01.dbf
input datafile file number=00006 name=/oracle/oradata/test/sqlmag01.dbf
input datafile file number=00007 name=/oracle/oradata/test/undotbs201.dbf
input datafile file number=00005 name=/oracle/oradata/test/test_block01.dbf
channel ORA_DISK_1: starting piece 1 at 22-AUG-07
channel ORA_DISK_1: finished piece 1 at 22-AUG-07
piece handle=/oracle/flash_recovery_area/TEST/backupset/2007_08_22/o1_mf_nnndf_TAG20070822T123733_3dr80cqp_.bkp tag=TAG20070822T123733 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:45
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 22-AUG-07
channel ORA_DISK_1: finished piece 1 at 22-AUG-07
piece handle=/oracle/flash_recovery_area/TEST/backupset/2007_08_22/o1_mf_ncsnf_TAG20070822T123733_3dr820wn_.bkp tag=TAG20070822T123733 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 22-AUG-07
Then I’m going to corrupt a datafile using the VI editor, I just edit it in the middle and deleted a row:
Vi test_block01.dbf
After waiting for a long time, also issuing some redo logs switch and checkpoints manually Oracle didn’t discovered anything:
RMAN> list failure;
no failures found that match specification
So I’ve used the VALIDATE command:
RMAN> validate database;
Starting validate at 22-AUG-07
using channel ORA_DISK_1
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00001 name=/oracle/oradata/test/system01.dbf
input datafile file number=00002 name=/oracle/oradata/test/sysaux01.dbf
input datafile file number=00004 name=/oracle/oradata/test/users01.dbf
input datafile file number=00006 name=/oracle/oradata/test/sqlmag01.dbf
input datafile file number=00007 name=/oracle/oradata/test/undotbs201.dbf
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of validate command on ORA_DISK_1 channel at 08/22/2007 13:49:48
ORA-01122: database file 5 failed verification check
ORA-01110: data file 5: '/oracle/oradata/test/test_block01.dbf'
ORA-01565: error in identifying file '/oracle/oradata/test/test_block01.dbf'
ORA-27046: file size is not a multiple of logical block size
Additional information: 1
Then:
RMAN> list failure;
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
262 HIGH OPEN 22-AUG-07 One or more non-system datafiles are corrupt
Now the advisory got the corrupted datafile, and to check out it suggestions we use ADVISE FAILURE:
RMAN> advise failure;
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
262 HIGH OPEN 22-AUG-07 One or more non-system datafiles are corrupt
analyzing automatic repair options; this may take some time
using channel ORA_DISK_1
analyzing automatic repair options complete
Mandatory Manual Actions
========================
no manual actions available
Optional Manual Actions
=======================
no manual actions available
Automated Repair Options
========================
Option Repair Description
------ ------------------
1 Restore and recover datafile 5
Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/diag/rdbms/test/test/hm/reco_3072078661.hm
It creates a script that can be executed manually or through the command REPAIR FAILURE, check the content of the file:
cat /oracle/diag/rdbms/test/test/hm/reco_3072078661.hm
# restore and recover datafile
sql 'alter database datafile 5 offline';
restore datafile 5;
recover datafile 5;
sql 'alter database datafile 5 online';
For this case of corruption the solutions it is really this one, the datafile is completed corrupted so a complete restore of it is necessary.
RMAN> repair failure;
Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/diag/rdbms/test/test/hm/reco_3072078661.hm
contents of repair script:
# restore and recover datafile
sql 'alter database datafile 5 offline';
restore datafile 5;
recover datafile 5;
sql 'alter database datafile 5 online';
Do you really want to execute the above repair (enter YES or NO)? yes
executing repair script
sql statement: alter database datafile 5 offline
Starting restore at 22-AUG-07
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00005 to /oracle/oradata/test/test_block01.dbf
channel ORA_DISK_1: reading from backup piece /oracle/flash_recovery_area/TEST/backupset/2007_08_22/o1_mf_nnndf_TAG20070822T123733_3dr80cqp_.bkp
channel ORA_DISK_1: piece handle=/oracle/flash_recovery_area/TEST/backupset/2007_08_22/o1_mf_nnndf_TAG20070822T123733_3dr80cqp_.bkp tag=TAG20070822T123733
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 22-AUG-07
Starting recover at 22-AUG-07
using channel ORA_DISK_1
starting media recovery
media recovery complete, elapsed time: 00:00:01
Finished recover at 22-AUG-07
sql statement: alter database datafile 5 online
repair failure complete
RMAN> list failure;
no failures found that match specification
Now I’m going to do another test, deleting one of my redo log files and after some switchs I got this on the list failure:
RMAN> list failure;
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
478 CRITICAL OPEN 22-AUG-07 Redo log group 1 is unavailable
481 HIGH OPEN 22-AUG-07 Redo log file /oracle/oradata/test/redo01.log is missing
RMAN> advise failure;
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
478 CRITICAL OPEN 22-AUG-07 Redo log group 1 is unavailable
481 HIGH OPEN 22-AUG-07 Redo log file /oracle/oradata/test/redo01.log is missing
analyzing automatic repair options; this may take some time
using channel ORA_DISK_1
analyzing automatic repair options complete
Mandatory Manual Actions
========================
no manual actions available
Optional Manual Actions
=======================
1. If file /oracle/oradata/test/redo01.log was unintentionally renamed or moved, restore it
Automated Repair Options
========================
Option Repair Description
------ ------------------
1 Clear unarchived redo log group 1
Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/diag/rdbms/test/test/hm/reco_3529151602.hm
Actually it creates 2 scripts check below:
cat /oracle/diag/rdbms/test/test/hm/reco_3529151602.hm
# clear unarchived log group
sql "begin sys.dbms_ir.execsqlscript(filename => ''/oracle/diag/rdbms/test/test/hm/reco_2584336171.hm'' ); end;";
cat /oracle/diag/rdbms/test/test/hm/reco_2584336171.hm
begin
/*Clear the unarchived Log Group*/
execute immediate 'ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 1';
end;
Here is the repair:
RMAN> repair failure;
Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/diag/rdbms/test/test/hm/reco_3529151602.hm
contents of repair script:
# clear unarchived log group
sql "begin sys.dbms_ir.execsqlscript(filename => ''/oracle/diag/rdbms/test/test/hm/reco_2584336171.hm'' ); end;";
Do you really want to execute the above repair (enter YES or NO)? yes
executing repair script
sql statement: begin sys.dbms_ir.execsqlscript(filename => ''/oracle/diag/rdbms/test/test/hm/reco_2584336171.hm'' ); end;
repair failure complete
And to finish this item I’m going to test a block recover also I will use this to show a little bit about ORACLE BBED – Block Editor, but this is an ORACLE INTERNAL TOOL, so if you use it Oracle will not support if you screw up your database, this is just for test:
To install it and to get some tips look at my blog http://dbabrain.blogspot.com/
SQL> create table obj
2 as
3 select object_id, object_name from dba_objects
4 where rownum < 599;
Table created.
Backup the database now:
SQL> select header_file, header_block, blocks
2 from dba_segments
3 where segment_name='OBJ';
HEADER_FILE HEADER_BLOCK BLOCKS
----------- ------------ ----------
4 11 8
SQL> select dbms_rowid.rowid_block_number(rowid) block, COUNT(1) Rows_per_block
from obj
GROUP BY dbms_rowid.rowid_block_number(rowid) ; 2 3
BLOCK ROWS_PER_BLOCK
---------- --------------
12 299
16 299
SQL> select file_name from dba_data_files where file_id=4;
FILE_NAME
---------------------------------------------------------------------------
/oracle/oradata/test/users01.dbf
I’m going to corrupt the block: 16
BBED: Release 2.0.0.0.0 - Limited Production on Fri Aug 24 09:53:45 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/oracle/oradata/test/users01.dbf'
FILENAME /oracle/oradata/test/users01.dbf
BBED> set listfile '/oracle/bbed.log'
LISTFILE /oracle/bbed.log
BBED> set mode edit
MODE Edit
BBED> show
FILE# 0
BLOCK# 1
OFFSET 0
DBA 0x00000000 (0 0,1)
FILENAME /oracle/oradata/test/users01.dbf
BIFILE bifile.bbd
LISTFILE /oracle/bbed.log
BLOCKSIZE 8192
MODE Edit
EDIT Unrecoverable
IBASE Dec
OBASE Dec
WIDTH 80
COUNT 512
LOGFILE log.bbd
SPOOL No
Corrupting the blocking copying another one over it:
BBED> set offset 0
OFFSET 0
BBED> copy dba 4,10 to dba 4,16
File: /oracle/oradata/test/users01.dbf (4)
Block: 16 Offsets: 0 to 511 Dba:0x01000010
------------------------------------------------------------------------
21a20000 0a000001 e1c21201 00000204 ff990000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 0b000001 01000000 01000000 00000000 00000000
Bouncing the database, then try to select the obj table:
SQL> conn sqlmag/sqlmag
Connected.
SQL> select count(1) from obj;
select count(1) from obj
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 16)
ORA-01110: data file 4: '/oracle/oradata/test/users01.dbf'
Now we go to RMAN and let’s see what it knows:
RMAN> list failure;
using target database control file instead of recovery catalog
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
1616 HIGH OPEN 24-AUG-07 Datafile 4: '/oracle/oradata/test/users01.dbf' contains one or more corrupt blocks
RMAN> advise failure;
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
1616 HIGH OPEN 24-AUG-07 Datafile 4: '/oracle/oradata/test/users01.dbf' contains one or more corrupt blocks
analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=135 device type=DISK
analyzing automatic repair options complete
Mandatory Manual Actions
========================
no manual actions available
Optional Manual Actions
=======================
no manual actions available
Automated Repair Options
========================
Option Repair Description
------ ------------------
1 Perform block media recovery of block 16 in file 4
Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/diag/rdbms/test/test/hm/reco_1747659741.hm
cat /oracle/diag/rdbms/test/test/hm/reco_1747659741.hm
# block media recovery
recover datafile 4 block 16;
RMAN> repair failure;
Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/diag/rdbms/test/test/hm/reco_1747659741.hm
contents of repair script:
# block media recovery
recover datafile 4 block 16;
Do you really want to execute the above repair (enter YES or NO)? yes
executing repair script
Starting recover at 24-AUG-07
using channel ORA_DISK_1
searching flashback logs for block images until SCN 18006876
finished flashback log search, restored 1 blocks
starting media recovery
media recovery complete, elapsed time: 00:00:03
Finished recover at 24-AUG-07
repair failure complete
SQL> select count(1) from obj;
COUNT(1)
----------
598
So this advisor seems to be pretty good mainly for those DBA’s that are starting to use the RMAN, and don’t forget BBED is a very dangerous tool, don’t play with it in a production database never.
There are a lot of others things you can do such as change the priority of the failures, look the details of a specific failure or repair a specific failure, but I think you got the idea with these examples, let’s move forward.
Multisection Backup
Originally RMAN running in parallel means different channel backing up different files at the same time now you can have different channels backing up pieces of the same file at the same time.
RMAN> CONFIGURE DEVICE TYPE DISK PARALLELISM 4 BACKUP TYPE TO BACKUPSET;
new RMAN configuration parameters:
CONFIGURE DEVICE TYPE DISK PARALLELISM 4 BACKUP TYPE TO BACKUPSET;
new RMAN configuration parameters are successfully stored
RMAN> backup SECTION SIZE 50m tablespace users;
Starting backup at 23-AUG-07
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00004 name=/oracle/oradata/test/users01.dbf
backing up blocks 1 through 6400
channel ORA_DISK_1: starting piece 1 at 23-AUG-07
channel ORA_DISK_2: starting full datafile backup set
channel ORA_DISK_2: specifying datafile(s) in backup set
input datafile file number=00004 name=/oracle/oradata/test/users01.dbf
backing up blocks 6401 through 12800
channel ORA_DISK_2: starting piece 2 at 23-AUG-07
channel ORA_DISK_3: starting full datafile backup set
channel ORA_DISK_3: specifying datafile(s) in backup set
input datafile file number=00004 name=/oracle/oradata/test/users01.dbf
backing up blocks 12801 through 19200
channel ORA_DISK_3: starting piece 3 at 23-AUG-07
channel ORA_DISK_1: finished piece 1 at 23-AUG-07
piece handle=/oracle/flash_recovery_area/TEST/backupset/2007_08_23/o1_mf_nnndf_TAG20070823T153529_3dv6ssy7_.bkp tag=TAG20070823T153529 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:14
channel ORA_DISK_2: finished piece 2 at 23-AUG-07
piece handle=/oracle/flash_recovery_area/TEST/backupset/2007_08_23/o1_mf_nnndf_TAG20070823T153529_3dv6t07o_.bkp tag=TAG20070823T153529 comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:07
channel ORA_DISK_3: finished piece 3 at 23-AUG-07
piece handle=/oracle/flash_recovery_area/TEST/backupset/2007_08_23/o1_mf_nnndf_TAG20070823T153529_3dv6t5dz_.bkp tag=TAG20070823T153529 comment=NONE
channel ORA_DISK_3: backup set complete, elapsed time: 00:00:01
Finished backup at 23-AUG-07
As you can see it created in parallel 3 files that were enough to handle the section size I’ve specified.
This is very useful when we have to deal with big datafiles.
Faster backup compression
A normal backup without compression has 1.07gb and took 44 seconds to complete:
BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
19 Full 1.07G DISK 00:00:44 24-AUG-07
With the default compression algorithm BZIP2 it has 193mb and took 1 minute and 23 seconds to complete:
RMAN> CONFIGURE DEVICE TYPE DISK BACKUP TYPE TO COMPRESSED BACKUPSET ;
BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
21 Full 193.70M DISK 00:01:23 24-AUG-07
Now the new option of algotith ZLIB, it is a little bigger 212mb and took 57 seconds:
RMAN> CONFIGURE COMPRESSION ALGORITHM 'ZLIB';
BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
25 Full 212.40M DISK 00:00:57 24-AUG-07
On this test database it seems to be very good, but it is too small to say, so only in a production one we will be able to say how god it is.
Improved scripting with RMAN substitution variables
Likewise sql*plus now we can use substitution variables inside rman scripts, and with the shell scripts variables our scripts can be written much easier now, let’s check an example:
Rman.cmd
run{
backup database tag full_&1;
crosscheck backup;
delete noprompt expired backup;
delete noprompt obsolete;
}
Rman_shell.sh
rman target / @'/home/oracle/rman.cmd' using $1
./rman_shell.sh prod_28082007
Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 11:29:13 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database: TEST (DBID=1928403553)
RMAN> run{
2> backup database tag full_prod_28082007;
3> crosscheck backup;
4> delete noprompt expired backup;
5> delete noprompt obsolete;
6> }
7>
8>
Starting backup at 28-AUG-07
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=127 device type=DISK
The substation variable is the same as sql*plus ‘&’, and to specify the variables to be replaces there is the new clause ‘USING’.
Recovery catalog enhancements
When you have a big company with many DBA’s that take care of a set of databases using RMAN with only one centralized Recovery Catalog, that can result in big mistakes because every dba can accidentally interfere on an database that doesn’t belongs to them, now Oracle created the Virtual Private Catalog, that means you can create different user for each database on the catalog and they will only have access to it’s data.
First I’m going to create the catalog:
SQL> create tablespace rman_catalog
2 datafile '/oracle/oradata/test/rman_catalog01.dbf' size 100m;
Tablespace created.
SQL> create user rman_central identified by rman_central
2 default tablespace rman_catalog;
User created.
SQL> grant RECOVERY_CATALOG_OWNER to rman_central;
Grant succeeded.
SQL> ALTER USER RMAN_CENTRAL QUOTA UNLIMITED ON RMAN_CATALOG;
User altered.
rman
Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 13:54:16 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.
RMAN> connect catalog rman_central/rman_central@rmancat
connected to recovery catalog database
RMAN> CREATE CATALOG TABLESPACE RMAN_CATALOG;
recovery catalog created
Now we create the catalog owner for the database db1:
SQL> create user rman_db1 identified by rman_db1
2 default tablespace rman_catalog;
User created.
SQL> grant RECOVERY_CATALOG_OWNER to rman_db1;
Grant succeeded.
SQL> alter user rman_db1 quota unlimited on rman_catalog;
User altered.
Granting the user the right to register databases, also I could register the database with the main user RMAN_CENTRAL and then just give the privilege to use it instead of register.
To register:
grant register database to user_name;
To use the virtual catalog for a registered database:
Grant catalog for database_name to user_name;
rman
Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 14:04:08 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.
RMAN> connect catalog rman_central/rman_central@rmancat
connected to recovery catalog database
RMAN> grant register database to rman_db1;
Grant succeeded.
Creating the Virtual Catalog:
rman
Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 14:10:38 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.
RMAN> connect catalog rman_db1/rman_db1@rmancat
connected to recovery catalog database
RMAN> create virtual catalog;
found eligible base catalog owned by RMAN_CENTRAL
created virtual catalog against base catalog owned by RMAN_CENTRAL
Registering the database in the virtual catalog:
rman target / catalog rman_db1/rman_db1@rmancat
Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 14:12:32 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database: TEST (DBID=1928403553)
connected to recovery catalog database
RMAN> register database;
database registered in recovery catalog
starting full resync of recovery catalog
full resync complete
RMAN> report schema;
Report of database schema for database with db_unique_name TEST
List of Permanent Datafiles
===========================
File Size(MB) Tablespace RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1 710 SYSTEM YES /oracle/oradata/test/system01.dbf
2 708 SYSAUX NO /oracle/oradata/test/sysaux01.dbf
3 10 SQLMAG NO /oracle/oradata/test/sqlmag01.dbf
4 150 USERS NO /oracle/oradata/test/users01.dbf
5 0 TEST_BLOCK NO /oracle/oradata/test/test_block01.dbf
6 100 RMAN_CATALOG NO /oracle/oradata/test/rman_catalog01.dbf
7 100 UNDOTBS2 YES /oracle/oradata/test/undotbs201.dbf
List of Temporary Files
=======================
File Size(MB) Tablespace Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1 39 TEMP 32767 /oracle/oradata/test/temp01.dbf
Creating another user on the catalog and checking what it can see:
SQL> create user rman_db2 identified by rman_db2
2 default tablespace rman_catalog;
User created.
SQL> grant RECOVERY_CATALOG_OWNER to rman_db1;
Grant succeeded.
SQL> grant RECOVERY_CATALOG_OWNER to rman_db2;
Grant succeeded.
SQL> alter user rman_db2 quota unlimited on rman_catalog;
User altered.
rman
Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 14:14:16 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.
RMAN> connect catalog rman_central/rman_central@rmancat
connected to recovery catalog database
RMAN> GRANT REGISTER DATABASE TO rman_db2;
Grant succeeded.
RMAN> exit
Recovery Manager complete.
[oracle@test01 ~]$ rman target / catalog rman_db2/rman_db2@rmancat
Recovery Manager: Release 11.1.0.6.0 - Production on Tue Aug 28 14:16:29 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database: TEST (DBID=1928403553)
connected to recovery catalog database
RMAN> report schema;
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of report command at 08/28/2007 14:16:34
RMAN-06428: recovery catalog is not installed
RMAN>
As you can see even connect on the same catalog only the rman_db1 user and the rman_central are able to check information about TEST database.
Rodrigo Righetti
http://dbabrain.blogspot.com