Troubleshooting Oracle: CRS (ASM resource) fails to start on one node with “CRS-5019” error after OS reboot
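A customer runs a 3-node Oracle RAC. The host of node 3 had to be shut down for maintenance, and after the reboot CRS would not start on it, while the other two nodes, node 1 and node 2, kept running normally. During startup on node 3 the ora.asm resource hung, waiting on "enq: DD - contention". This post is a brief record of the analysis steps.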
GI alert log
2024-08-10 05:25:26.261: [cssd(62600)]CRS-1707:Lease acquisition for node rac3 number 2 completed
2024-08-10 05:25:27.553: [cssd(62600)]CRS-1605:CSSD voting file is online: /dev/mapper/grid03; details in /u01/product/grid/log/rac3/cssd/ocssd.log.
2024-08-10 05:25:27.559: [cssd(62600)]CRS-1605:CSSD voting file is online: /dev/mapper/grid02; details in /u01/product/grid/log/rac3/cssd/ocssd.log.
2024-08-10 05:25:27.568: [cssd(62600)]CRS-1605:CSSD voting file is online: /dev/mapper/grid01; details in /u01/product/grid/log/rac3/cssd/ocssd.log.
2024-08-10 05:25:32.109: [cssd(62600)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac3 rac1 rac2 .
...
[ohasd(62261)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2024-08-10 05:25:40.252: [ctssd(63018)]CRS-2408:The clock on host rac01 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
[client(63124)]CRS-10001:10-Aug-24 05:25 ACFS-9391: Checking for existing ADVM/ACFS installation.
[client(63129)]CRS-10001:10-Aug-24 05:25 ACFS-9392: Validating ADVM/ACFS installation files for operating system.
[client(63131)]CRS-10001:10-Aug-24 05:25 ACFS-9393: Verifying ASM Administrator setup.
[client(63134)]CRS-10001:10-Aug-24 05:25 ACFS-9308: Loading installed ADVM/ACFS drivers.
[client(63137)]CRS-10001:10-Aug-24 05:25 ACFS-9154: Loading 'oracleoks.ko' driver.
[client(63169)]CRS-10001:10-Aug-24 05:25 ACFS-9154: Loading 'oracleadvm.ko' driver.
[client(63211)]CRS-10001:10-Aug-24 05:25 ACFS-9154: Loading 'oracleacfs.ko' driver.
[client(63316)]CRS-10001:10-Aug-24 05:25 ACFS-9327: Verifying ADVM/ACFS devices.
[client(63324)]CRS-10001:10-Aug-24 05:25 ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
[client(63328)]CRS-10001:10-Aug-24 05:25 ACFS-9156: Detecting control device '/dev/ofsctl'.
[client(63333)]CRS-10001:10-Aug-24 05:25 ACFS-9322: completed
2024-08-10 05:35:43.011: [/u01/product/grid/bin/oraagent.bin(62477)]CRS-5818:Aborted command 'start' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:0:2} in /u01/product/grid/log/rac3/agent/ohasd/oraagent_grid//oraagent_grid.log.
2024-08-10 05:35:45.015: [ohasd(62261)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.asm'. Details at (:CRSPE00111:) {0:0:2} in /u01/product/grid/log/rac01/ohasd/ohasd.log.
2024-08-10 05:35:45.111: [/u01/product/grid/bin/oraagent.bin(62477)]CRS-5019:All OCR locations are on ASM disk groups [OCR_VOT], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/product/grid/log/rac3/agent/ohasd/oraagent_grid//oraagent_grid.log".
2024-08-10 05:35:45.314: [ohasd(62261)]CRS-2807:Resource 'ora.crsd' failed to start automatically.
2024-08-10 05:35:46.299: [/u01/product/grid/bin/oraagent.bin(62477)]CRS-5019:All OCR locations are on ASM disk groups [OCR_VOT], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/product/grid/log/rac3/agent/ohasd/oraagent_grid//oraagent_grid.log".
2024-08-10 05:36:16.310: [/u01/product/grid/bin/oraagent.bin(62477)]CRS-5019:All OCR locations are on ASM disk groups [OCR_VOT], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/product/grid/log/rac3/agent/ohasd/oraagent_grid//oraagent_grid.log".
2024-08-10 05:36:46.325:
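Note:
CRS-5019 means that all OCR locations are on ASM disk groups (here OCR_VOT) and none of those disk groups is mounted. It shows up only after the 'start' of ora.asm was aborted (CRS-5818) and timed out (CRS-2757) at 05:35, about ten minutes after the stack came up, so ora.crsd could not start either (CRS-2807).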
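To see at a glance which CRS messages dominate a long GI alert log, a tally of the message codes can help. The following is only a minimal sketch; the function name crs_code_summary is made up for illustration, and it assumes a grep that supports -o and -E (GNU or BSD).

```shell
# Hypothetical helper: tally CRS-nnnn message codes found in a GI alert log.
# Reads the file(s) given as arguments, or stdin when none are given.
# Output: one line per code, "count code", most frequent first.
crs_code_summary() {
    grep -oE 'CRS-[0-9]+' "$@" | sort | uniq -c | sort -k1,1nr -k2,2
}

# Example (the alert log path is left to the reader, it is site specific):
#   crs_code_summary /path/to/alert_node.log | head
```

In a log like the one above, a repeating CRS-5019 at the top of the list points straight at the unmounted OCR disk group.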
CRS stack status
$ crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE INTERMEDIATE rac3 OCR not started
ora.cluster_interconnect.haip
1 ONLINE ONLINE rac3
ora.crf
1 ONLINE ONLINE rac3
ora.crsd
1 ONLINE OFFLINE
ora.cssd
1 ONLINE ONLINE rac3
ora.cssdmonitor
1 ONLINE ONLINE rac3
ora.ctssd
1 ONLINE ONLINE rac3 ACTIVE:0
ora.diskmon
1 OFFLINE OFFLINE
ora.evmd
1 ONLINE INTERMEDIATE rac3
ora.gipcd
1 ONLINE ONLINE rac3
ora.gpnpd
1 ONLINE ONLINE rac3
ora.mdnsd
1 ONLINE ONLINE rac3
Note:
ora.asm is stuck in INTERMEDIATE with the state detail “OCR not started”, and ora.crsd is OFFLINE.
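The same check can be done mechanically by filtering the crsctl output for resources whose target is ONLINE but whose state is not. This is a rough sketch, assuming the two-line layout shown above (resource name on one line, instance details on the next); the function name not_online_resources is hypothetical.

```shell
# Hypothetical helper: list resources whose TARGET is ONLINE but whose STATE is not.
# Expects `crsctl stat res -t -init` style output on stdin.
not_online_resources() {
    awk '
        # A line starting with "ora." carries the resource name.
        /^ora\./ { name = $1; next }
        # The following line carries: instance-id TARGET STATE [SERVER] [STATE_DETAILS]
        name != "" && $1 ~ /^[0-9]+$/ {
            if ($2 == "ONLINE" && $3 != "ONLINE") {
                line = name " " $3
                for (i = 4; i <= NF; i++) line = line " " $i
                print line
            }
        }
    '
}

# Example:
#   crsctl stat res -t -init | not_online_resources
```

Resources whose target is OFFLINE, such as ora.diskmon here, are deliberately skipped, because they are not expected to be running.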
OLR check
$ ocrcheck -local
Status of Oracle Local Registry is as follows :
Version :
Total space (kbytes) : 262120
Used space (kbytes) : 2676
Available space (kbytes) : 259444
Device/File Name : /u01/product/grid/cdata/rac01.olr
Device/File integrity check succeeded
Local registry integrity check succeeded
Logical corruption check: succeeded
ASM alert log
NOTE: cache opening disk 5 of grp 2: OCR_VOT_0005 path:/dev/mapper/grid03
NOTE: F1X0 found on disk 5 au 196 fcn 0.84109
NOTE: cache mounting (not first) normal redundancy group 2/0x67A3B07C (OCR_VOT)
kjbdomatt send to inst 1
kjbdomatt send to inst 3
Sat Aug 10 05:26:08 2024
NOTE: attached to recovery domain 2
Sat Aug 10 05:26:08 2024
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Sat Aug 10 05:26:08 2024
NOTE: LGWR attempting to mount thread 1 for diskgroup 2 (OCR_VOT)
NOTE: LGWR found thread 1 closed at ABA 22.878
NOTE: LGWR mounted thread 1 for diskgroup 2 (OCR_VOT)
NOTE: LGWR opening thread 1 at fcn 0.106863 ABA 23.879
NOTE: cache mounting group 2/0x67A3B07C (OCR_VOT) succeeded
NOTE: cache ending mount (success) of group OCR_VOT number=2 incarn=0x67a3b07c
Sat Aug 10 05:26:08 2024
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
Sat Aug 10 05:48:41 2024
NOTE: [crsctl.bin@rac3 (TNS V1-V3) 74992] opening OCR file
Note:
The ASM alert log shows no errors; the mount of disk group OCR_VOT is reported as successful.
votedisk check
# crsctl query css votedisk
Note:
Works fine; all voting disks are online.
Restart ASM manually
$ sqlplus / as sysasm
startup
-- hang
Check ASM on all the other nodes
-- node 2 (running node)
$ asmcmd lsdg
$ ocrcheck
-- hang
$ sqlplus / as sysasm
-- check the wait events of the active sessions
The active sessions are waiting on "enq: DD - contention", and the final blocking session is the gpnpd process on node 1.
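The post does not show the query that was used, so the following is only one possible way to look for the waiters and their final blocker. It is a sketch that assumes an 11.2 or later ASM instance (where gv$session has the FINAL_BLOCKING_INSTANCE and FINAL_BLOCKING_SESSION columns); the function name dd_waiters_sql is hypothetical, and a gv$ query may itself be slow while the cluster is in this state.

```shell
# Hypothetical helper: print SQL that lists sessions waiting on the DD enqueue
# and the sessions reported as their final blockers.
dd_waiters_sql() {
    cat <<'EOF'
set linesize 200 pagesize 100
-- sessions stuck on the DD enqueue and who ultimately blocks them
select inst_id, sid, serial#, program, seconds_in_wait,
       final_blocking_instance, final_blocking_session
  from gv$session
 where event = 'enq: DD - contention'
 order by seconds_in_wait desc;
-- details of the final blockers (program name and client OS process id)
select inst_id, sid, serial#, program, process
  from gv$session
 where (inst_id, sid) in
       (select final_blocking_instance, final_blocking_session
          from gv$session
         where event = 'enq: DD - contention');
exit
EOF
}

# Example, run as the grid owner on a node where ASM is up:
#   dd_waiters_sql | sqlplus -S / as sysasm
```

In this case the program column of the blocker would be expected to point at gpnpd on node 1.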
I ran into this event before and wrote about it in an earlier blog post, 《Troubleshooting query v$asm_disk v$asm_diskgroup hang》.
Solution
Kill gpnpd.bin (the blocking process identified above); after that everything returned to normal.
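The exact commands are not given in the post, so this is only a sketch of how the process could be located and killed on the blocking node (node 1 in this case). The function name gpnpd_pid is hypothetical, and the use of the default signal (SIGTERM) is an assumption; the post only says the process was killed.

```shell
# Hypothetical helper: pick the PID of gpnpd.bin out of `ps -eo pid,comm` style output.
gpnpd_pid() {
    awk '$2 == "gpnpd.bin" { print $1 }'
}

# Example, run on the node that hosts the blocking gpnpd:
#   pid=$(ps -eo pid,comm | gpnpd_pid)
#   [ -n "$pid" ] && kill "$pid"      # default signal is an assumption
#   crsctl stat res -t -init          # then confirm ora.gpnpd is ONLINE again
```

gpnpd is a resource managed by ohasd, so it is expected to be restarted automatically after the kill; checking the stack status afterwards confirms that.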