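Oracle 11g ASM alert log frequently prints "Attempting voting file refresh on diskgroup xx"
In an Oracle 11g R2 RAC environment, file system usage was found to be very high because the ASM alert log was continuously printing "Attempting voting file refresh on diskgroup xx". The RBAL process appears to retry endlessly because of a PST (Partnership and Status Table) problem. Its memory keeps growing, errors such as ORA-7445 may eventually be raised, and in the end the ASM instance crashes. This post briefly records how to handle it.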
ASM Alert log
Wed May 15 17:16:31 2024
NOTE: Attempting voting file refresh on diskgroup OCRVOTE
NOTE: Refresh completed on diskgroup OCRVOTE . Found 3 voting file(s).
NOTE: Voting file relocation is required in diskgroup OCRVOTE
NOTE: Attempting voting file relocation on diskgroup OCRVOTE
NOTE: Successful voting file relocation on diskgroup OCRVOTE
NOTE: Attempting voting file refresh on diskgroup OCRVOTE
NOTE: Refresh completed on diskgroup OCRVOTE
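To gauge how often the refresh loop fires, the messages can be counted per diskgroup. The following is a minimal sketch, not part of the original troubleshooting; the function name `count_vf_refresh` is hypothetical and the alert log path in the usage comment is an assumption that depends on your diag directory.

```shell
#!/bin/sh
# Count "Attempting voting file refresh" messages per diskgroup.
# $1: path to an ASM alert log (hypothetical helper, not an Oracle tool).
count_vf_refresh() {
    awk '/NOTE: Attempting voting file refresh on diskgroup/ { n[$NF]++ }
         END { for (dg in n) print dg, n[dg] }' "$1" | sort
}

# Usage (path is an assumption, adjust to your environment):
#   count_vf_refresh /grid/app/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
```

A count that keeps climbing between two runs a few minutes apart suggests the loop is still active.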
RBAL Trace file
2024-05-15 12:11:48.014: [ CSSCLNT]clsssVoteDiskFormat: call clsscfgfmtbegin with leasedata 0000000000000000, size 0
2024-05-15 12:11:48.014: [ CSSCLNT]clsssVoteDiskFormat: succ-ly format the Voting Disk
src:9ffffffffd55ff30:c000000061bd1950:3: /dev/rdisk/disk500:7cd1ebdbd0f94fa5bfa9cd93375b2f2c:
PST-old:9ffffffffd55fe60:c000000061bd1250:47: /dev/rdisk/disk505:4703259dd3fa4f9abfa41dbf5bae835c:
PST-old:9ffffffffd55ff30:c000000061bd1950:47: /dev/rdisk/disk500:7cd1ebdbd0f94fa5bfa9cd93375b2f2c:
Reg-old:9ffffffffd55fec8:c000000061bd15d0:7: /dev/rdisk/disk506:6eea4d1c63e74f53bfcb35f30531b5e5:
Check the RBAL process memory
# Linux
pmap -p xxx
# AIX
svmon -P xxx -O segment=category
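To confirm that RBAL memory is actually growing, its resident set size can be sampled repeatedly. The sketch below is an illustration only, assuming Linux `ps` (where `rss` is reported in KB) and the usual `asm_rbal_<SID>` background process naming; the function name `rbal_rss` is hypothetical.

```shell
#!/bin/sh
# Filter `ps -eo pid,rss,args` output (read from stdin) down to ASM RBAL
# processes and print "<pid> <rss_kb>" for each one.
# Assumes the ASM background process is named asm_rbal_<SID>.
rbal_rss() {
    awk '$3 ~ /^asm_rbal_/ { print $1, $2 }'
}

# Usage on Linux, sampling once a minute:
#   while true; do date; ps -eo pid,rss,args | rbal_rss; sleep 60; done
```

Reading from stdin keeps the filter independent of the platform's `ps` flags, so only the command feeding it needs to change on other systems.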
The instance eventually crashes with ORA-07445 [lpmloadpkg()+160]
*** 2024-05-15 12:15:00.779
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x1A60] [PC:0x40000000112205E0, lpmloadpkg()+160] [flags: 0x0, count: 1] [CPU: 2]
Incident 200785 created, dump file: /grid/app/diag/asm/+asm/+ASM1/incident/incdir_200785/+ASM1_rbal_3847_i200785.trc
ORA-07445: exception encountered: core dump [lpmloadpkg()+160] [SIGSEGV] [ADDR:0x1A60] [PC:0x40000000112205E0] [Address not mapped to object] []
According to the MOS note "ASM Alert Logs Show Continuously: Attempting Voting File Relocation" (Doc ID 1457886.1), this behavior is associated with bug 13609187 and bug 13904435.
Solution
1. Move the PST
alter diskgroup GRID drop disk [disk_name] rebalance power 0;
alter diskgroup GRID undrop disks;
- or -
2. Manually rebalance the diskgroup
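Both options reduce to a couple of SQL statements run in the ASM instance. The sketch below only prints those statements for a given diskgroup so they can be reviewed before execution; the function names `pst_move_sql` and `rebalance_sql` and the disk name `OCRVOTE_0000` in the usage comment are hypothetical, and the statement text simply follows the commands shown above.

```shell
#!/bin/sh
# Option 1: print the drop/undrop pair that forces the PST to move.
# $1: diskgroup name, $2: disk name (both supplied by the caller).
pst_move_sql() {
    printf 'alter diskgroup %s drop disk %s rebalance power 0;\n' "$1" "$2"
    printf 'alter diskgroup %s undrop disks;\n' "$1"
}

# Option 2: print a manual rebalance statement.
# $1: diskgroup name, $2: rebalance power.
rebalance_sql() {
    printf 'alter diskgroup %s rebalance power %s;\n' "$1" "$2"
}

# Usage (disk name is a placeholder; run as the grid owner on one node):
#   pst_move_sql OCRVOTE OCRVOTE_0000 | sqlplus -S / as sysasm
#   rebalance_sql OCRVOTE 5 | sqlplus -S / as sysasm
```

Printing first and piping to `sqlplus` second makes it easy to double-check the diskgroup and disk names, which matters because a mistyped name in a drop statement targets the wrong disk.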
If you are not comfortable running the steps above directly, you can first migrate the OCR and voting disks to another diskgroup, as follows.
-- Replace OCR
[root@11g-node2 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[root@11g-node2 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3108
         Available space (kbytes) :     259012
         ID                       :  299475515
         Device/File Name         :   +OCRVOTE
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
[root@11g-node2 ~]# su - grid
[grid@11g-node2 ~]$ asmcmd lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  4194304     51200     9588                0            9588              0             N  DATA/
MOUNTED  EXTERN  N         512   4096  4194304     51200    43004                0           43004              0             N  FRA/
MOUNTED  NORMAL  N         512   4096  4194304     15360    14320             5120            4600              0             Y  OCRVOTE/
[grid@11g-node2 ~]$
[root@11g-node2 ~]# ocrconfig -add +data
[root@11g-node2 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3108
         Available space (kbytes) :     259012
         ID                       :  299475515
         Device/File Name         :   +OCRVOTE
                                    Device/File integrity check succeeded
         Device/File Name         :      +data
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
[root@11g-node2 ~]# ocrconfig -delete +OCRVOTE
[root@11g-node2 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3108
         Available space (kbytes) :     259012
         ID                       :  299475515
         Device/File Name         :      +data
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded

-- Replace voting disks
[root@11g-node2 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   23b9e942beda4f52bf4d4dded62fca50 (/dev/sdd) [OCRVOTE]
 2. ONLINE   e0451324eb724f65bfc1238c2ed85e5c (/dev/sdc) [OCRVOTE]
 3. ONLINE   bb094d8715424f50bfe2d34fba6f7f48 (/dev/sdb) [OCRVOTE]
Located 3 voting disk(s).
[root@11g-node2 ~]# crsctl replace votedisk +data
Successful addition of voting disk 9b91c6cec30a4fb5bf248d83adaf750e.
Successful deletion of voting disk 23b9e942beda4f52bf4d4dded62fca50.
Successful deletion of voting disk e0451324eb724f65bfc1238c2ed85e5c.
Successful deletion of voting disk bb094d8715424f50bfe2d34fba6f7f48.
Successfully replaced voting disk group with +data.
CRS-4266: Voting file(s) successfully replaced
[root@11g-node2 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   9b91c6cec30a4fb5bf248d83adaf750e (/dev/sdf) [DATA]
Located 1 voting disk(s).
[root@11g-node2 ~]# su - grid
[grid@11g-node2 ~]$ asmcmd
ASMCMD>
ASMCMD> lsdg -g
Inst_ID  State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
      2  MOUNTED  EXTERN  N         512   4096  4194304     51200     9288                0            9288              0             Y  DATA/
      1  MOUNTED  EXTERN  N         512   4096  4194304     51200     9288                0            9288              0             Y  DATA/
      2  MOUNTED  EXTERN  N         512   4096  4194304     51200    43004                0           43004              0             N  FRA/
      1  MOUNTED  EXTERN  N         512   4096  4194304     51200    43004                0           43004              0             N  FRA/
      2  MOUNTED  NORMAL  N         512   4096  4194304     15360    14416             5120            4648              0             N  OCRVOTE/
      1  MOUNTED  NORMAL  N         512   4096  4194304     15360    14416             5120            4648              0             N  OCRVOTE/

##### Manually rebalance the diskgroup
ASMCMD> rebal --power 5 ocr
Rebal on progress.
ASMCMD> lsop
Group_Name Dsk_Num State Power EST_WORK EST_RATE EST_TIME
OCRVOTE REBAL REAP 5 9
ASMCMD> lsop
Group_Name Dsk_Num State Power EST_WORK EST_RATE EST_TIME

# Move back to the original diskgroup OCRVOTE
[root@11g-node2 ~]# crsctl replace votedisk +OCRVOTE
Successful addition of voting disk 11e7fa5187684fcebfea09a9383fa244.
Successful addition of voting disk b021f532e0164ff3bf874b6a3147ff3b.
Successful addition of voting disk c7a28692b8934f1fbf4e86ec65341b3b.
Successful deletion of voting disk 9b91c6cec30a4fb5bf248d83adaf750e.
Successfully replaced voting disk group with +OCRVOTE.
CRS-4266: Voting file(s) successfully replaced
[root@11g-node2 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   11e7fa5187684fcebfea09a9383fa244 (/dev/sdd) [OCRVOTE]
 2. ONLINE   b021f532e0164ff3bf874b6a3147ff3b (/dev/sdc) [OCRVOTE]
 3. ONLINE   c7a28692b8934f1fbf4e86ec65341b3b (/dev/sdb) [OCRVOTE]
Located 3 voting disk(s).
[root@11g-node2 ~]# ocrconfig -add +OCRVOTE
[root@11g-node2 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3108
         Available space (kbytes) :     259012
         ID                       :  299475515
         Device/File Name         :      +data
                                    Device/File integrity check succeeded
         Device/File Name         :   +OCRVOTE
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
[root@11g-node2 ~]# ocrconfig -delete +data
[root@11g-node2 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3108
         Available space (kbytes) :     259012
         ID                       :  299475515
         Device/File Name         :   +OCRVOTE
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
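After each `crsctl replace votedisk`, it is worth confirming where the voting files ended up and how many are online. The sketch below summarizes `crsctl query css votedisk` output per diskgroup; the function name `votedisk_summary` is hypothetical, and the parsing assumes the column layout shown in the session above.

```shell
#!/bin/sh
# Read `crsctl query css votedisk` output from stdin and print
# "<diskgroup> <online_count>" for each diskgroup holding voting files.
# Assumes the last field of each row is the diskgroup in square brackets.
votedisk_summary() {
    awk '$2 == "ONLINE" {
             dg = $NF
             gsub(/\[|\]/, "", dg)   # strip the surrounding brackets
             n[dg]++
         }
         END { for (dg in n) print dg, n[dg] }' | sort
}

# Usage (as root on a cluster node):
#   crsctl query css votedisk | votedisk_summary
```

For the session above, the expected summary would be `OCRVOTE 3` before the move, `DATA 1` while the files sit in the external-redundancy diskgroup, and `OCRVOTE 3` again after moving back.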
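Note that if you run the rebalance after the OCR and voting disks have been moved out of OCRVOTE, the alert log shows the following:
SUCCESS: rebalance completed for group 3/0x6af84273 (OCRVOTE)
NOTE: Attempting voting file refresh on diskgroup OCRVOTE
NOTE: Refresh completed on diskgroup OCRVOTE. No voting file found.
This message can be safely ignored. It appears because the diskgroup contains no voting disks at that point, and it is also tracked as non-published bug 14279847.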