Oracle使用RMAN Advisor恢复current redo丢失故障

本文主要介绍通过RMAN Advisor恢复current redo丢失的故障。
环境说明:
DB:Oracle 11.2.0.4.0OS:Red Hat Enterprise Linux Server release 7.5 (Maipo)
准备环境:
查看归档
SQL> archive log list;Database log mode No Archive ModeAutomatic archival DisabledArchive destination USE_DB_RECOVERY_FILE_DESTOldest online log sequence 12Current log sequence 14
启动归档
[root@cjc-db-01 ~]# mkdir arch[root@cjc-db-01 ~]# chown oracle:oinstall archsqlplus as sysdbaalter system set log_archive_dest_1='location=/arch';alter system set log_archive_format = "cjc_%t_%s_%r.arc" scope=spfile;shutdown immediate;startup mount;alter database archivelog;alter database open;archive log list;alter system switch logfile;
创建测试数据
SQL> conn cjc/***SQL>create table t1(id number,time varchar2(100));insert into t1 values (1, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));insert into t1 values (2, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));insert into t1 values (3, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));commit;col time for a25select * from t1;ID TIME---------- ------------------------- 1 2024-03-28 09:24:31 2 2024-03-28 09:24:39 3 2024-03-28 09:24:45
执行rman全备份
rman target run{allocate channel c1 type disk;allocate channel c2 type disk;backup incremental level = 0 format '/back/rman/rman_level0_%d_%T_%U.bak' database;sql 'alter system archive log current';backup archivelog all format '/back/rman/rman_arch_%d_%T_%U.bak';backup current controlfile format '/back/rman/rman_con_%d_%T_%U.bak';release channel c1;release channel c2;}
备份后插入新数据
insert into t1 values (4, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));insert into t1 values (5, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));commit;
查看redo信息
SQL> col member for a50select a.GROUP#,a.STATUS,b.member from v$log a,v$logfile b where a.group#=b.group#;SQL> GROUP# STATUS MEMBER---------- ---------------- -------------------------------------------------- 3 CURRENT oracle/app/oracle/oradata/cjc/redo03.log 2 INACTIVE oracle/app/oracle/oradata/cjc/redo02.log 1 INACTIVE oracle/app/oracle/oradata/cjc/redo01.log
重命名current redo logfile,模拟故障
[oracle@cjc-db-01 cjc]$ cd oracle/app/oracle/oradata/cjc[oracle@cjc-db-01 cjc]$ mv redo03.log redo03.log.1
继续插入数据,仍可以正常提交事物
insert into t1 values (6, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));insert into t1 values (7, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));commit;
手动杀掉pmon进程,模拟异常宕机
[oracle@cjc-db-01 cjc]$ ps -ef|grep pmonoracle 17623 1 0 12:27 ? 00:00:00 ora_pmon_cjcoracle 18287 11168 0 12:35 pts/3 00:00:00 grep --color=auto pmon[oracle@cjc-db-01 cjc]$ kill -9 17623
重新启动数据库,启动失败
SQL> startupORACLE instance started.Total System Global Area 2071076864 bytesFixed Size 2254784 bytesVariable Size 536873024 bytesDatabase Buffers 1526726656 bytesRedo Buffers 5222400 bytesDatabase mounted.ORA-00313: open failed for members of log group 3 of thread 1ORA-00312: online log 3 thread 1: '/oracle/app/oracle/oradata/cjc/redo03.log'ORA-27037: unable to obtain file statusLinux-x86_64 Error: 2: No such file or directoryAdditional information: 3
查看丢失的是current redo logfile
SQL> col member for a50select a.GROUP#,a.STATUS,b.member from v$log a,v$logfile b where a.group#=b.group#;SQL> GROUP# STATUS MEMBER---------- ---------------- -------------------------------------------------- 1 INACTIVE oracle/app/oracle/oradata/cjc/redo01.log 3 CURRENT oracle/app/oracle/oradata/cjc/redo03.log 2 INACTIVE /oracle/app/oracle/oradata/cjc/redo02.log
GROUP# 3丢失,理论上最多能恢复到 1187045,也就是GROUP# 2的NEXT_CHANGE#
SQL> select GROUP#,SEQUENCE#,FIRST_CHANGE#,NEXT_CHANGE# from v$log order by 1; GROUP# SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#---------- ---------- ------------- ------------ 1 7 1186324 1187037 2 8 1187037 1187045 3 9 1187045 2.8147E+14
使用RMAN Advisor工具进行恢复
查看故障,检测到Redo log group 3丢失
RMAN> list failure;using target database control file instead of recovery catalogList of Database Failures=========================Failure ID Priority Status Time Detected Summary---------- -------- --------- ------------- -------1483 CRITICAL OPEN 28-MAR-24 Redo log group 3 is unavailable1486 HIGH OPEN 28-MAR-24 Redo log file /oracle/app/oracle/oradata/cjc/redo03.log is missing
查看建议,生成Repair script
RMAN> advise failure;List of Database Failures=========================Failure ID Priority Status Time Detected Summary---------- -------- --------- ------------- -------1483 CRITICAL OPEN 28-MAR-24 Redo log group 3 is unavailable1486 HIGH OPEN 28-MAR-24 Redo log file /oracle/app/oracle/oradata/cjc/redo03.log is missinganalyzing automatic repair options; this may take some timeallocated channel: ORA_DISK_1channel ORA_DISK_1: SID=20 device type=DISKanalyzing automatic repair options completeMandatory Manual Actions========================no manual actions availableOptional Manual Actions=======================1. If file /oracle/app/oracle/oradata/cjc/redo03.log was unintentionally renamed or moved, restore itAutomated Repair Options========================Option Repair Description------ ------------------1 Perform incomplete database recovery to SCN 1187045 Strategy: The repair includes point-in-time recovery with some data loss Repair script: /oracle/app/oracle/diag/rdbms/cjc/cjc/hm/reco_3624337172.hm
查看Repair script,需要执行不完全恢复,scn 1187045,和之前查的一致
[oracle@cjc-db-01 cjc]$ cat /oracle/app/oracle/diag/rdbms/cjc/cjc/hm/reco_3624337172.hm # database point-in-time recovery restore database until scn 1187045; recover database until scn 1187045; alter database open resetlogs;
根据建议执行恢复
可以手动恢复,也可以执行repair failure进行恢复
RMAN> restore database until scn 1187045;RMAN> recover database until scn 1187045;RMAN> alter database open resetlogs;
查询数据
SQL> col time for a50SQL> select * from cjc.t1;ID TIME---------- ----------------- 1 2024-03-28 12:30:14 2 2024-03-28 12:30:14 3 2024-03-28 12:30:14
丢失了全备后新增的4条数据。
###chenjuchao 20240328###