首页 » ORACLE 9i-23ai » Troubleshooting RMAN-08137: WARNING: archived log not deleted

Troubleshooting RMAN-08137: WARNING: archived log not deleted

RMAN-08137: WARNING: archived log not deleted, needed for standby or upstream capture process

RMAN不能再正常删除归档日志,它们都将返回RMAN-08137,这导致归档日志文件系统被填满, 最终导致数据库实例hang, 等待online redolog归档。常见于有dataguard或基于redo的逻辑同步环境、集成模式OGG。

When RMAN is used to delete Archived Redo logs, it can report the RMAN-08137 warning under the following conditions;

1, V$ARCHIVED_LOG.NEXT_CHANGE# is GREATER THAN the required SCN
2,V$ARCHIVED_LOG.APPLIED column is NO

原因

1, DATAGUARD standby

rman ‘delete input’ or ‘delete archivelog’ 会检查v$archived_log.APPLIED列,在primary db APPLIED=’NO’的archivelog和后面的archivelog都不允许删除,为了确保standby端已应用不再需要。也有一些情况存在primary v$archived_log.APPLIED未更新,如果确认已都传到standby或已应用,依然primary v$archived_log.APPLIED=’NO’,可以使用delete force先项或OS ,asmcmd 等直接删除后crosscheck更新rman元数据。

The following query of v$archived_log in the primary, you show at least one archivelog with APPLIED=NO:

alter session set nls_date_format='dd-mon-rr hh24:mi:ss';
select recid, dest_id, thread#, sequence#, first_time, completion_time, creator, registrar, archived, applied, deleted, status from v$archived_log where standby_dest='YES' and status='A';

2, some Capture process

当创建Stream/Streams Replication Capture  或CDC 或 register GoldenGate EXTRACT时,Oracle GoldenGate就像Oracle Streams一样,具有防止存档日志在处理之前被删除的机制。当使用集成模式OGG,REGISTER EXTRACT 的 DATABASE 版本还为集成数据提取过程创建了一些内部 Streams 进程。会添加到DBA_CAPTURE,其中DBA_CAPTURE包括列REQUIRED_CHECKPOINT_SCN 和SOURCE_RESETLOGS_SCN。创建V$ DATABASE.MIN_REQUIRED_CAPTURE_CHANGE# 列后,min_required_capture_change# 是系统中所有捕获的所有必需检查点 scns 的最小值。”Capture” 进程还将更新它,然后每6小时更新一次(默认情况下, parameter “_CKPT_RETENTION_CHECK_FREQ”)那是相当长的时间。要修改此行为,必须在OGG 配置文件中进行以下设置:

TRANLOGOPTIONS INTEGRATEDPARAMS(_CKPT_RETENTION_CHECK_FREQ xxx)

xxx 是 MIN_REQUIRED_CAPTURE_CHANGE# 更新之间的秒数。默认值为 21600。MOS 注释 421176.1 中也提到了这一点。

可以使用以下查询读取_CKPT_RETENTION_CHECK_FREQ的当前设置:

SELECT * FROM DBA_CAPTURE_PARAMETERS WHERE PARAMETER = '_CKPT_RETENTION_CHECK_FREQ';

每次更新 SCN 时,都应在 Oracle 警报日志中看到一条注释:

Mon Jul 17 20:05:45 2017
GoldenGate Capture C 1: FIRST scn changed.
scn: 0x0000.00a7cc73

源数据库归档日志必须保留在系统上,直到 Streams 结束使用它们。 这意味着来自由REQUIRED_CHECKPOINT_SCN(在 10g DBA_CAPTURE 视图中)或 MIN_REQUIRED_CAPTURE_CHANGE#(在 11g v$DATABASE 视图中)指示的 SCN 的所有日志文件必须对捕获进程可用。当指示RMAN删除archivelog日志时,它将检查V$ARCHIVED_LOG.NEXT_CHANGE#和V$ARCHIVED_LOG.APPLIED,以查看Oracle“捕获”进程是否需要它。如果DBA_CAPTURE中没有记录,则数据库没有定义任何Capture进程。

RMAN cannot be used to remove the shipped archives. This because, when database is open, RMAN checks dba_capture table:

SELECT min(required_checkpoint_scn)
FROM dba_capture
WHERE capture_type = 'LOCAL';

RMAN 将在此配置中使用的唯一 scn 是 required_checkpoint_scn,它通过调用 dbms_rcvman.getRequiredSCN() 获取它。 如果不执行此过程,则 RMAN 可能会删除所需的归档重做日志。 修复是强制 RMAN 检查:

V$DATABASE.min_required_capture_change# 在应用删除策略之前调用 getRequiredSCN()。

然而,重要的是要注意 V$DATABASE.min_required_capture_change# 不是与来自 dba_capture 的 REQUIRED_CHECKPOINT_SCN 实时同步的。默认情况下,该值每六小时同步一次。 因此,当 RMAN 检查 V$DATABASE.min_required_capture_change# 值时,Extract 不需要的归档日志文件在一段时间内仍将被视为“必需”。 为了确保他们经常同步,我们建议客户使用下面的参数进行提取

TRANLOGOPTIONS INTEGRATEDPARAMS(_CKPT_RETENTION_CHECK_FREQ  NN)

您可以通过查询 V$ARCHIVED_LOG 来检查特定归档日志文件中涵盖的 LSN 号:

THREAD# NUMBER 重做线程号
SEQUENCE# NUMBER 重做日志序列号
FIRST_CHANGE# NUMBER 存档日志中的第一个更改#
NEXT_CHANGE# NUMBER 下一个日志中的第一个更改

如果 V$DATABASE.min_required_capture_change# < NEXT_CHANGE#,那么这个归档日志文件被认为是 GoldenGate 所需要的。

其他查询

/* Running Captures */
select CAPTURE_NAME, STARTUP_TIME, STATE from v$streams_capture;

/* CAPTURE Processes and their SCNs */
SELECT CAPTURE_NAME, CAPTURE_TYPE, STATUS, REQUIRED_CHECKPOINT_SCN, scn_to_timestamp(REQUIRED_CHECKPOINT_SCN) as timestamp FROM DBA_CAPTURE;

/* Minimum SCN required by CAPTURE processing */
select min(required_checkpoint_scn), min(source_resetlogs_scn) from dba_capture where capture_type='LOCAL';

/* Archived Redo Log Next Change Number */
select distinct THREAD#, SEQUENCE#, to_char(NEXT_TIME,'DD/MM/YYYY HH24:MI:SS') NEXT_TIME, APPLIED, NEXT_CHANGE#
from V$ARCHIVED_LOG
where NEXT_CHANGE# > ( select min(required_checkpoint_scn) from dba_capture where capture_type='LOCAL')
order by THREAD#, SEQUENCE#;

SELECT MIN_REQUIRED_CAPTURE_CHANGE#, scn_to_timestamp(MIN_REQUIRED_CAPTURE_CHANGE#) as timestamp FROM V$DATABASE;

 

Database is in MOUNT stage

When the database is in MOUNT stage, view DBA_CAPTURE is not available.
Therefore, RMAN will compare V$ARCHIVED_LOG.NEXT_CHANGE# with V$DATABASE.MIN_REQUIRED_CAPTURE_CHANGE#.

If MIN_REQUIRED_CAPTURE_CHANGE# is NULL, then there is no Oracle Capture, or the creation of the Capture process was incomplete, or it cannot be loaded from the ControlFile.

If there is no Oracle Capture, but RMAN is still reporting this particular RMAN-08137 warning, then the problem is with the Standby Database setup.

Database is OPEN

The required SCN is determined by first querying V$DATABASE.MIN_REQUIRED_CAPTURE_CHANGE#.
If MIN_REQUIRED_CAPTURE_CHANGE# is null, then DBA_CAPTURE.REQUIRED_CHECKPOINT_SCN is queried.
RMAN will then compare V$ARCHIVED_LOG.NEXT_CHANGE# to the required SCN it found.

If there are no records in DBA_CAPTURE, and MIN_REQUIRED_CAPTURE_CHANGE# is null, but RMAN is still reporting this particular RMAN-08137 warning, then the problem is with the Standby Database setup, because there is no Oracle Capture.

 

Case:

-- Dataguard NO GAP

RMAN> DELETE ARCHIVELOG xxx;

archived log file name=/archdest/arch_2_182500_920590168.arc thread=2 sequence=182500
RMAN-08137: WARNING: archived log not deleted, needed for standby or upstream capture process
archived log file name=/archdest/arch_2_182501_920590168.arc thread=2 sequence=182501
RMAN-08137: WARNING: archived log not deleted, needed for standby or upstream capture process
archived log file name=/archdest/arch_2_182502_920590168.arc thread=2 sequence=182502
RMAN-08137: WARNING: archived log not deleted, needed for standby or upstream capture process

--  configure archivelog deletion policy to shipped to all standby;
RMAN> configure archivelog deletion policy  clear;
-- try again not work.

SQL> select * from DBA_CAPTURE;

CAPTURE_NAME                   QUEUE_NAME                     QUEUE_OWNER                    RULE_SET_NAME                  RULE_SET_OWNER                 CAPTURE_USER                    START_SCN STATUS   CAPTURED_SCN APPLIED_SCN USE  FIRST_SCN SOURCE_DATABASE                                                                                                                  SOURCE_DBID SOURCE_RESETLOGS_SCN SOURCE_RESETLOGS_TIME LOGMINER_ID NEGATIVE_RULE_SET_NAME         NEGATIVE_RULE_SET_OWNER        MAX_CHECKPOINT_SCN REQUIRED_CHECKPOINT_SCN LOGFILE_ STATUS_CHANGE_TIM ERROR_NUMBER
------------------------------ ------------------------------ ------------------------------ ------------------------------ ------------------------------ ------------------------------ ---------- -------- ------------ ----------- --- ---------- -------------------------------------------------------------------------------------------------------------------------------- ----------- -------------------- --------------------- ----------- ------------------------------ ------------------------------ ------------------ ----------------------- -------- ----------------- ------------
ERROR_MESSAGE
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
VERSION                                                          CAPTURE_TY LAST_ENQUEUED_SCN CHECKPOINT_RETENTION_TIME START_TIME                                                                  PURPOSE
---------------------------------------------------------------- ---------- ----------------- ------------------------- --------------------------------------------------------------------------- -------------------
OGG$CAP_EXT_F                  OGG$Q_EXT_F                    GGADMIN                        RULESET$_3                     GGADMIN                        GGADMIN                        1.6615E+13 DISABLED                          NO  1.6615E+13 TBCSC.HEBEI.MOBILE.COM                                                                                                                     0                    0                     0           1                                                                                0                       0 IMPLICIT 20191218 00:17:24

                                                                 LOCAL                                               60 18-DEC-19 12.16.51.000000 AM                                                GoldenGate Capture

-- 这存在一条一年前曾经有人尝试注册的capture, 未使用,也未取消。

GGSCI (racdb2) 3> dblogin USERID ggadmin, PASSWORD xxxxxxxxxx
Successfully logged into database.

GGSCI (racdb2 as ggadmin@racdb2) 4> 

GGSCI (racdb2 as ggadmin@racdb2) 4> UNREGISTER EXTRACT EXT_F  DATABASE;

2020-04-27 11:39:04  INFO    OGG-01750  Successfully unregistered EXTRACT EXT_F from database.


SQL> select * from DBA_CAPTURE;
no rows selected

-- RMAN delete work fine.
打赏

对不起,这篇文章暂时关闭评论。