Troubleshooting an 11.2.0.4 CRS stop failure: ora.asm resource in "UNKNOWN" state
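A few days ago a friend's CRS stack could not be stopped. The RAC version was 11.2.0.4, and the shutdown hung at the ASM stop phase. Stopping the resource manually did not work, and the -f option failed as well. It later turned out that a known bug was involved, so here is a brief record.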
crsctl stop crs
The command failed with CRS-2675: Stop of 'ora.asm' on xx failed.
CRS-5022: Stop of resource 'ora.crsd' failed: current state is "UNKNOWN"
crsctl stop crs -f failed in the same way. Checking the CRS resource status
showed ora.asm in the UNKNOWN state.
# oraagent_grid trace
2018-07-25 07:27:23.706: [ora.DGSYS.dg][2585]{1:7667:25576} [check] DgpAgent::initOcrDgpSet exit }
2018-07-25 07:27:23.707: [ora.DGSYS.dg][2585]{1:7667:25576} [check] DgpAgent::inUseByOcr - OCR is on diskgroup DGSYS
2018-07-25 07:27:23.707: [ora.DGSYS.dg][2585]{1:7667:25576} [check] DgpAgent::runCheck: OCR dg returning OFFLINE
2018-07-25 07:27:23.707: [ AGFW][2057]{1:7667:25576} ora.DGSYS.dg kfora1 1 state changed from: STOPPING to: PLANNED_OFFLINE
2018-07-25 07:27:23.708: [ AGFW][2057]{1:7667:25576} Agent sending last reply for: RESOURCE_STOP[ora.DGSYS.dg kfora1 1] ID 4099:5086977
2018-07-25 07:27:24.677: [ora.DGDATA.dg][1046]{1:7667:25576} [stop] DgpAgent::stop reason is dependency, original reason is system
2018-07-25 07:27:24.682: [ora.DGDATA.dg][1046]{1:7667:25576} [stop] ORA-15032: not all alterations performed
ORA-15001: diskgroup "DGDATA" does not exist or is not mounted
2018-07-25 07:27:24.682: [ora.DGDATA.dg][1046]{1:7667:25576} [stop] DgpAgent::stop: error 15032
2018-07-25 07:27:24.682: [ora.DGDATA.dg][1046]{1:7667:25576} [stop] DgpAgent::stop: ignoring err diskgroup is dismounted
2018-07-25 07:27:24.682: [ora.DGDATA.dg][1046]{1:7667:25576} [stop] DgpAgent::stop reason is dependency, original reason is system
2018-07-25 07:27:24.683: [ora.DGDATA.dg][1046]{1:7667:25576} [stop] DgpAgent::getConnxn connected
2018-07-25 07:27:24.684: [ora.DGDATA.dg][1046]{1:7667:25576} [stop] ORA-15032: not all alterations performed
ORA-15001: diskgroup "DGDATA" does not exist or is not mounted
2018-07-25 07:27:24.685: [ora.DGDATA.dg][1046]{1:7667:25576} [stop] DgpAgent::stop: error 15032
2018-07-25 07:27:24.685: [ora.DGDATA.dg][1046]{1:7667:25576} [stop] DgpAgent::stop: ignoring err diskgroup is dismounted
2018-07-25 07:27:24.685: [ora.DGDATA.dg][1046]{1:7667:25576} [stop] DgpAgent::stop reason is dependency, original reason is system
2018-07-25 07:27:24.686: [ora.DGDATA.dg][1046]{1:7667:25576} [stop] DgpAgent::getConnxn connected
2018-07-25 07:27:24.687: [ora.DGDATA.dg][1046]{1:7667:25576} [stop] ORA-15032: not all alterations performed
ORA-15001: diskgroup "DGDATA" does not exist or is not mounted
# ASM alert log
SQL> ALTER DISKGROUP DGSYS DISMOUNT /* asm agent *//* {1:7667:25576} */
Wed Jul 25 07:27:23 2018
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 2
Wed Jul 25 07:27:23 2018
ORA-15032: not all alterations performed
ORA-15027: active use of diskgroup "DGSYS" precludes its dismount
ERROR: ALTER DISKGROUP DGSYS DISMOUNT /* asm agent *//* {1:7667:25576} */
Wed Jul 25 07:27:23 2018
SQL> ALTER DISKGROUP DGDATA DISMOUNT /* asm agent *//* {1:7667:25576} */
NOTE: cache dismounting (clean) group 1/0x93932725 (DGDATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 1835436, image: oracle@kfora1 (TNS V1-V3)
Wed Jul 25 07:27:23 2018
NOTE: LGWR doing clean dismount of group 1 (DGDATA)
NOTE: LGWR closing thread 2 of diskgroup 1 (DGDATA) at ABA 37.6567
NOTE: LGWR released thread recovery enqueue
Wed Jul 25 07:27:24 2018
kjbdomdet send to inst 2
detach from dom 1, sending detach message to inst 2
Wed Jul 25 07:27:24 2018
NOTE: detached from domain 1
NOTE: cache dismounted group 1/0x93932725 (DGDATA)
Wed Jul 25 07:27:24 2018
GMON dismounting group 1 at 8 for pid 29, osid 1835436
NOTE: Disk DGDATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DGDATA_0001 in mode 0x7f marked for de-assignment
NOTE: Disk DGDATA_0002 in mode 0x7f marked for de-assignment
NOTE: Disk DGDATA_0003 in mode 0x7f marked for de-assignment
SUCCESS: diskgroup DGDATA was dismounted
NOTE: cache deleting context for group DGDATA 1/0x93932725
SUCCESS: ALTER DISKGROUP DGDATA DISMOUNT /* asm agent *//* {1:7667:25576} */
SQL> ALTER DISKGROUP DGDATA DISMOUNT /* asm agent *//* {1:7667:25576} */
ORA-15032: not all alterations performed
ORA-15001: diskgroup "DGDATA" does not exist or is not mounted
ERROR: ALTER DISKGROUP DGDATA DISMOUNT /* asm agent *//* {1:7667:25576} */
SQL> ALTER DISKGROUP DGDATA DISMOUNT /* asm agent *//* {1:7667:25576} */
ORA-15032: not all alterations performed
ORA-15001: diskgroup "DGDATA" does not exist or is not mounted
ERROR: ALTER DISKGROUP DGDATA DISMOUNT /* asm agent *//* {1:7667:25576} */
SQL> ALTER DISKGROUP DGDATA DISMOUNT /* asm agent *//* {1:7667:25576} */
ORA-15032: not all alterations performed
ORA-15001: diskgroup "DGDATA" does not exist or is not mounted
ERROR: ALTER DISKGROUP DGDATA DISMOUNT /* asm agent *//* {1:7667:25576} */
SQL> ALTER DISKGROUP DGDATA DISMOUNT /* asm agent *//* {1:7667:25576} */
ORA-15032: not all alterations performed
ORA-15001: diskgroup "DGDATA" does not exist or is not mounted
ERROR: ALTER DISKGROUP DGDATA DISMOUNT /* asm agent *//* {1:7667:25576} */
SQL> ALTER DISKGROUP DGDATA DISMOUNT /* asm agent *//* {1:7667:25576} */
ORA-15032: not all alterations performed
ORA-15001: diskgroup "DGDATA" does not exist or is not mounted
ERROR: ALTER DISKGROUP DGDATA DISMOUNT /* asm agent *//* {1:7667:25576} */
Note:
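The ASM alert log shows six attempts to dismount the DGDATA disk group. The first attempt succeeded; the next five failed because the disk group was already dismounted, which is why ORA-15001 was reported.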
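To make that count easy to reproduce on another alert log, here is a minimal Python sketch that tallies the SUCCESS and ERROR result lines for each ALTER DISKGROUP ... DISMOUNT statement. The function name `count_dismount_outcomes`, the `TAG` constant, and the condensed `SAMPLE` text are illustrative assumptions; the sample keeps only the result lines from the log above rather than the full alert log.

```python
import re
from collections import Counter, defaultdict

# Matches the result lines ASM writes after each statement, for example:
#   SUCCESS: ALTER DISKGROUP DGDATA DISMOUNT /* asm agent */...
#   ERROR: ALTER DISKGROUP DGDATA DISMOUNT /* asm agent */...
# "SUCCESS: diskgroup DGDATA was dismounted" is deliberately not matched,
# so each statement is counted exactly once.
RESULT_RE = re.compile(r"\b(SUCCESS|ERROR): ALTER DISKGROUP (\w+) DISMOUNT")


def count_dismount_outcomes(log_text):
    """Return {diskgroup: Counter({'SUCCESS': n, 'ERROR': m})}."""
    outcomes = defaultdict(Counter)
    for status, diskgroup in RESULT_RE.findall(log_text):
        outcomes[diskgroup][status] += 1
    return outcomes


# Condensed sample (assumption: only the result lines are kept).
TAG = "/* asm agent *//* {1:7667:25576} */"
SAMPLE = "\n".join(
    [
        f"ERROR: ALTER DISKGROUP DGSYS DISMOUNT {TAG}",
        "SUCCESS: diskgroup DGDATA was dismounted",
        f"SUCCESS: ALTER DISKGROUP DGDATA DISMOUNT {TAG}",
    ]
    + [f"ERROR: ALTER DISKGROUP DGDATA DISMOUNT {TAG}"] * 5
)

if __name__ == "__main__":
    for dg, counts in sorted(count_dismount_outcomes(SAMPLE).items()):
        print(dg, "success:", counts["SUCCESS"], "error:", counts["ERROR"])
```

Because the pattern does not depend on line breaks, it should also work on an alert log excerpt that was pasted as a single long line.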
Cause:
The environment was running 11.2.0.4.4. MOS lists a bug affecting this version, which is fixed in 11.2.0.4.8.
The issue was investigated in:
Bug 17816316 – ORA-15001 IN ASM ALERT LOG WHEN STOPPING CRS IN 11.2.0.4 RAC
Which is closed as duplicate of:
Bug 16798862 – FAIL TO START DB SERVICE BECAUSE OF PLS-553
The fix is included in GI PSU 11.2.0.4.8. Apply the latest GI PSU to fix the problem, or ignore the messages if the patch can't be applied immediately.
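As a small illustration of that version check, the Python sketch below compares a GI PSU level against the fix level 11.2.0.4.8 named above. The helper names `parse_version` and `needs_gi_psu` are hypothetical, and how the installed PSU level is obtained is outside the sketch; it only assumes the level is available as a dotted string such as 11.2.0.4.4.

```python
FIXED_IN = "11.2.0.4.8"  # fix level quoted in the MOS note above


def parse_version(version):
    """Turn '11.2.0.4.4' into (11, 2, 0, 4, 4) so fields compare numerically."""
    return tuple(int(part) for part in version.strip().split("."))


def needs_gi_psu(current, fixed_in=FIXED_IN):
    """Return True if `current` is an 11.2.0.4.x level below the fix level.

    Assumption: the check only makes sense within the same base release,
    so any other base release raises ValueError instead of guessing.
    """
    cur, fix = parse_version(current), parse_version(fixed_in)
    if cur[:4] != fix[:4]:
        raise ValueError(f"{current} is not on base release {fixed_in[:8]}")
    return cur < fix


if __name__ == "__main__":
    print(needs_gi_psu("11.2.0.4.4"))
```

Comparing integer tuples rather than raw strings matters once the last field reaches two digits: as strings, "11.2.0.4.10" would sort before "11.2.0.4.8".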
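Killing the process was one option to try, but in the end the customer rebooted the OS, and everything started normally afterwards.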